23 Oct 2008 20:55:56 UTC
There have been some problems with the web server lately which are hard to track down. However, this morning I found we were being crawled fairly heavily by a googlebot. I thought I had taken care of that ages ago with a proper robots.txt file! I then realized the bot was scanning all the beta result pages, and I only had "disallow" lines for the main project - so I had to add extra lines for the beta-specific web pages. This may help.
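The fix described above amounts to a few extra lines in robots.txt. A minimal sketch of the idea - the paths here are made up, since the post doesn't give the actual beta URLs:

    User-agent: *
    # lines that were already in place for the main project's pages
    Disallow: /results.php
    # newly added lines covering the beta-specific result pages
    # (illustrative paths, not the real SETI@home beta URLs)
    Disallow: /beta/results.php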
So we've been getting these frequent, scary, but ultimately harmless kernel warnings on bruno, our upload server. Research by Jeff showed a quick kernel upgrade would fix that. We brought the kernel in yesterday and rebooted this morning to pick it up. The new kernel was fine, but it exposed a set of other mysterious problems, mostly centered on our upload storage partition (which is a software RAID). There were lots of confusing/misleading fsck freakouts, mounting failures, disk label conflicts, etc., but eventually we were able to convince the system everything was okay - though not before a series of long, boring reboots.
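For anyone curious, the recovery dance on a misbehaving software RAID partition usually looks something like the sketch below - device names and mount points are placeholders, and these aren't necessarily the exact commands used on bruno:

    # check whether the kernel assembled the software RAID at boot
    cat /proc/mdstat
    mdadm --detail /dev/md0      # /dev/md0 is a placeholder device

    # read-only filesystem check (makes no changes) before
    # trusting the partition again
    fsck -n /dev/md0

    # once it looks sane, mount it back where the uploads live
    mount /dev/md0 /uploads      # placeholder mount point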
Speaking of RAID, I still haven't put the new spare into bambi. It's late enough in the week not to mess around with any hardware, especially after dealing with the above. Plus, the particular RAID array in question is still one drive away from degradation (no big deal) and two drives away from failure. And it's only a replica of the science database - the primary is in good shape and is backed up weekly. So there's no need to panic - we'll get the drive in there early next week.
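If bambi's array happens to be a Linux software RAID (an assumption - the post doesn't say what kind of array it is), adding the spare next week is roughly a one-liner; device names below are placeholders:

    # add the new disk as a hot spare; md pulls it in
    # automatically if a member drive fails
    mdadm --add /dev/md1 /dev/sdf1

    # then keep an eye on the array status
    watch cat /proc/mdstat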
Speaking of the science database, I'm finding our signal tables (pulse, triplet, spike, gaussian) are now large enough that Informix is automatically guessing that, for certain "expensive" queries, the indexes aren't worth using, and is reverting to sequential scans which take forever. This has to be addressed sooner rather than later.
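A sketch of how this kind of optimizer problem is typically investigated and nudged in Informix - the spike table is real, but the column, cutoff value, and index name below are illustrative:

    -- see which access plan the optimizer picked
    SET EXPLAIN ON;
    SELECT COUNT(*) FROM spike WHERE power > 1e6;

    -- refresh data-distribution statistics so the optimizer can
    -- estimate index selectivity on the big signal tables
    UPDATE STATISTICS HIGH FOR TABLE spike;

    -- or force the index with an optimizer directive
    -- (spike_power_idx is a hypothetical index name)
    SELECT {+INDEX(spike spike_power_idx)} COUNT(*)
      FROM spike WHERE power > 1e6;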
- Matt
Summary: heavy crawling by the googlebot may be the main cause of the recent web problems - it was scanning all of the beta project's result pages, while the site's robots.txt only blocked crawling of the main project. The server was also patched and updated; the process ran into quite a few problems, but in the end the server was brought back to normal operation.