Thread starter: BiscuiT

SETI@home 2008 Technical News

OP | Posted 2008-11-27 09:15:17

26 Nov 2008 21:30:53 UTC

Oops. My web configuration changes yesterday afternoon seemed to work at first (I checked the logs, tested it myself, etc.) but something bad got exercised, probably at the next web log rotation (which quickly stops/starts the web server) which then made it impossible for people to see the home page for a couple hours. Instead they got a broken link to our subversion page (an interface to our freely available source code). My bad. I fixed this as soon as I noticed it later in the evening.

Later on we had some weird behavior on the scheduling server (anakin) where it ran out of memory due to too many httpd/cgi processes running. It actually recovered on its own around midnight, then got choked up again. Nothing really changed as far as our configuration or our executables go, so we restarted it again this morning with the "ceiling" process limit values lower than before. However, I noticed the fastcgi processes were growing the longer they stuck around. A memory leak perhaps? Dave pointed out we have been doing client logging the past couple of weeks (which we usually don't do). Maybe that part of the code contains a leak - he's checking. Maybe that, combined with the short period of mysql query logging slowing everything down, caused the scheduler fastcgi processes to bloat. Not sure exactly, but we turned client logging off, and I added another flag to the fastcgis to force them to exit from time to time regardless of error, just to make sure they don't bloat for too long and eat up RAM. I also finally bit the bullet and figured out our broken/wonky web log rotation system given all the above and fixed all that (I think).
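
(For the curious, the bloat check amounts to watching the resident memory of the long-lived scheduler fastcgi processes - a leak shows up as steadily climbing numbers. Here's a rough sketch; the "fcgi" name match is illustrative, not our exact setup.)

    #!/usr/bin/env python
    # Rough sketch: report resident memory of long-lived FastCGI processes
    # so a slow leak shows up as steadily growing RSS over time.
    # The "fcgi" name match below is illustrative, not our actual binary name.
    import os

    def rss_kb(pid):
        """Return resident set size in kB from /proc/<pid>/status, or None."""
        try:
            with open("/proc/%d/status" % pid) as f:
                for line in f:
                    if line.startswith("VmRSS:"):
                        return int(line.split()[1])
        except (IOError, ValueError):
            return None
        return None

    def main():
        for entry in os.listdir("/proc"):
            if not entry.isdigit():
                continue
            pid = int(entry)
            try:
                with open("/proc/%d/cmdline" % pid) as f:
                    cmd = f.read().replace("\0", " ").strip()
            except IOError:
                continue
            if "fcgi" not in cmd:
                continue
            rss = rss_kb(pid)
            if rss is not None:
                print("pid %6d  rss %8d kB  %s" % (pid, rss, cmd))

    if __name__ == "__main__":
        main()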

Obviously I didn't get dinged with jury duty this time around, though last night the automated reporting instructions hotline told me to call again today at 11am for further instructions. So I did, but then the service kept saying it was "unavailable at this time." You know, I tried. Anyway.. Happy day of turkey. Actually I think we're having goose this year. Jeff and I will both be around and checking in from time to time (as usual).

- Matt

OP | Posted 2008-12-2 09:41:14

1 Dec 2008 21:29:48 UTC

Welcome back from the holiday weekend, those who actually had a holiday weekend. Things were more or less calm around here. However thanks to our predictable nemesis autofs some things got a little murky yesterday. The mysql replica lost contact with the master - a regular occurrence - but we didn't get the warnings as mail was hung on a dead mount. Now that the replica has fallen behind (though it is catching up) the stats/server pages are a bit behind as well. This will clean itself up in due time. A few hours perhaps.
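
(The recurring lesson: any alerting or mail job that touches an automounted path should fail fast instead of hanging. A minimal sketch of that idea, with a placeholder path:)

    #!/usr/bin/env python
    # Probe an automounted path with a timeout so a monitoring or mail job
    # fails fast instead of hanging forever on a dead NFS mount.
    # The path below is a placeholder, not our actual mount point.
    import subprocess

    def mount_alive(path, timeout=10):
        """True if stat'ing the path returns within `timeout` seconds."""
        try:
            rc = subprocess.call(["stat", path],
                                 stdout=subprocess.DEVNULL,
                                 stderr=subprocess.DEVNULL,
                                 timeout=timeout)
            return rc == 0
        except subprocess.TimeoutExpired:
            return False

    if __name__ == "__main__":
        if not mount_alive("/home/monitoring"):
            print("mount looks dead; skipping the checks that depend on it")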

Otherwise work/data seems to be flowing normally, or normal enough. Dave incorporated some new scheduler logic (not sure what offhand) that is being tested in beta, probably rolled out to the public tomorrow. I'm bouncing around between data management, radar blanking code, and OS upgrade projects today.

- Matt

Posted 2008-12-2 15:22:17
My English isn't good enough.

OP | Posted 2008-12-3 17:10:40

2 Dec 2008 23:27:39 UTC

Typical Tuesday outage day today (for database maintenance), and currently we're in the midst of a smooth recovery from that, more or less. Things sometimes seem weirder on the server status page than they actually are, as the replica database (where we collect the stats) is too far behind the master. Sometime soon I'll add some stats to show this, hopefully reducing confusion (and fix the broken XML stuff while I'm at it).

Major improvements during the outage: Jeff put in some freshly compiled servers that went into beta last week, Bob rebuilt an index that has been missing from the result table for some time (used for occasional statistics Eric checks by hand), and I changed the data selection priority so it matches between the Astropulse and Multibeam splitters (so they chew on the same files at the same time, making it easier to determine which is splitting faster).

I've also been busy with other sysadmin-y tasks: moving accounts around (still), kicking one of our internal diagnostic cronjobs that has been hanging on stale lock files in /var/lib/rpm, data pipeline management (including shipping empty drives to Arecibo), and messing around with FC10.
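
(The stale rpm lock problem is a classic; the cleanup kick amounts to something like the sketch below, with made-up thresholds - not our exact cronjob.)

    #!/usr/bin/env python
    # Sketch: clear stale Berkeley DB lock files under /var/lib/rpm when no
    # rpm process is running. The one-hour threshold is illustrative.
    import glob
    import os
    import subprocess
    import time

    MAX_AGE = 3600  # only treat lock files older than an hour as stale

    def rpm_running():
        # pgrep exits 0 if any matching process exists
        return subprocess.call(["pgrep", "-x", "rpm"],
                               stdout=subprocess.DEVNULL) == 0

    if __name__ == "__main__":
        now = time.time()
        for lock in glob.glob("/var/lib/rpm/__db.*"):
            if now - os.path.getmtime(lock) > MAX_AGE and not rpm_running():
                print("removing stale lock %s" % lock)
                os.unlink(lock)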

- Matt

OP | Posted 2008-12-4 10:27:16

3 Dec 2008 23:24:42 UTC

Ah, Wednesday. It's usually the day when Jeff and I swap our "focus." Early in the week I'm aimed at hardware/sysadmin and he's deep in software development, and then later in the week we switch. This is an attempt to make sure we both get some programming time while the other person takes the helm. He's mostly working on the NTPCkr, and me on radar blanking stuff. Both projects are slow going.

There are a lot of chores we both manage. Maintaining the raw data pipeline eats up an astonishing amount of time, so we swap those duties as well. Simply "walking the beat" - chasing down alerts, fixing hung processes and broken services - could easily eat up a whole day, every day, if we're not careful. Today I spent a huge chunk of time moving home accounts off the old server onto the new one (and cleaning up a bunch of old garbage in the process). I also lost an hour with Jeff trying to figure out why his subversion repository was out of sync in such a way that he couldn't check changes in. I did get a moment to get the latest version of the software radar blanking signal generator to compile - and I just started a test run.
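
(The account moves themselves are mostly just careful rsync runs, roughly like this sketch - the host and path names are placeholders, not our real servers.)

    #!/usr/bin/env python
    # Sketch: copy home directories from the old file server to the new one,
    # preserving permissions, ownership (by numeric id), and hard links.
    import subprocess
    import sys

    OLD = "oldserver:/export/home/"   # placeholder source
    NEW = "/export/home/"             # placeholder destination

    def migrate(dry_run=True):
        cmd = ["rsync", "-aH", "--numeric-ids", "--delete"]
        if dry_run:
            cmd.append("--dry-run")
        cmd += [OLD, NEW]
        return subprocess.call(cmd)

    if __name__ == "__main__":
        # pass --for-real to actually copy; default is a dry run
        sys.exit(migrate(dry_run="--for-real" not in sys.argv))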

- Matt

OP | Posted 2008-12-6 11:29:50

5 Dec 2008 23:12:26 UTC

Happy Friday! I don't really have much to add to the proceedings.. today was a lot like Wednesday when last I was here at the lab. Time spent on more filesystem shell games, compiling/running code, and working with Josh to figure out some weird discrepancies between beta/public Astropulse results.

I should point out I added a couple more stats to the server status page: mysql queries per second, along with the number of seconds the replica is behind the master. Maybe this will help clarify when things go awry, though I know sometimes more information obscures the pertinent stuff.
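
(For the curious, both numbers boil down to a couple of standard mysql queries - queries per second from the master's global status, and Seconds_Behind_Master from the replica. A rough sketch, with placeholder host names:)

    #!/usr/bin/env python
    # Sketch of gathering the two new status-page numbers via the mysql CLI.
    # Host names are placeholders; credentials handling is omitted.
    import subprocess

    def mysql_rows(host, query):
        out = subprocess.check_output(
            ["mysql", "-h", host, "-B", "-e", query]).decode()
        lines = out.strip().splitlines()
        if not lines:
            return []
        header = lines[0].split("\t")
        return [dict(zip(header, l.split("\t"))) for l in lines[1:]]

    def queries_per_second(host):
        status = {r["Variable_name"]: r["Value"]
                  for r in mysql_rows(host, "SHOW GLOBAL STATUS")}
        return float(status["Questions"]) / float(status["Uptime"])

    def replica_lag(host):
        rows = mysql_rows(host, "SHOW SLAVE STATUS")
        if not rows:
            return None
        value = rows[0]["Seconds_Behind_Master"]
        return None if value == "NULL" else int(value)

    if __name__ == "__main__":
        print("master queries/sec: %.1f" % queries_per_second("master-host"))
        print("replica lag (sec): %s" % replica_lag("replica-host"))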

I foresee a couple of dams breaking in the very near future, resulting in massive server closet updates/upgrades including, but not limited to: shutting down the incredibly solid (but physically large and logically small) NetApp rack to be replaced by a 3U system with twice the storage, thus making room to (finally) put vader and sidious in the closet, along with several UPSes and another CPU server, clarke, which has been waiting too long to be employed. Sometimes these things have to happen serially. Ducks in a row and all that.

- Matt

OP | Posted 2008-12-9 09:50:15

9 Dec 2008 0:45:19 UTC

Happy Monday, folks. Things were sort of okay over the weekend. The replica mysql database got stuck on Sunday - the usual drill - I logged in and quickly restarted it. The science database, however, also choked. This happened on Friday. Jeff's been doing some NTPCkr testing that would have gone all through the weekend except the excess I/O ate up all the informix threads, thus causing the splitters/assimilators to slow down and run out of work to send. Luckily I caught this before bedtime that night and broke that dam. Jeff's looking into why that happened.

In good news, Overland Storage (formerly Snap Appliance, or Adaptec) donated 10 Terabytes of NAS storage in the form of a new "head" and two expansion units. One of the expansion units we'll try to get on our current workunit storage server ASAP (so we stop running out of room to split new work), and from the rest we'll build a new temporary (possibly permanent) raw data reserve so we can do the big shell game and convert all the science database devices from RAID5 to RAID10. Thanks, Overland!

- Matt

OP | Posted 2008-12-10 18:17:26

10 Dec 2008 0:31:47 UTC

Tuesday outage day (mysql database backup/maintenance). Today Bob took care of the final step of the "single vs. multi-dimensional indexes" exercise. That is, he dropped all the multi-dimensional indexes on the result table in the main project on the master database and we crossed our fingers. Looks like mysql is neatly, or smartly, parsing queries and merging single indexes as needed just fine. The whole point was to reduce the number of indexes we need, and thus keep a slightly smaller footprint in memory, which in turn helps performance.
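
(The sanity check before dropping each composite index is basically an EXPLAIN on a representative query, to confirm the optimizer really does merge the remaining single-column indexes. A sketch - the column, database, and host names are invented for illustration:)

    #!/usr/bin/env python
    # Sketch: run EXPLAIN before dropping a multi-column index and look for
    # an index_merge plan over the single-column indexes.
    import subprocess

    EXPLAIN = ("EXPLAIN SELECT id FROM result "
               "WHERE server_state = 4 AND appid = 2\\G")

    def plan(host, db):
        return subprocess.check_output(
            ["mysql", "-h", host, db, "-e", EXPLAIN]).decode()

    if __name__ == "__main__":
        output = plan("master-host", "boinc_db")
        print(output)
        if "index_merge" in output:
            print("optimizer merges the single-column indexes; the composite "
                  "index looks safe to drop")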

The raw data pipeline has been a major headache, if only because our hot-swap enclosures have been giving us grief. Jeff and I determined one of them is flat out broken, so that reduces our current maximum throughput by half until we get it replaced. This isn't a disaster, as we pretty much never reach half of our maximum throughput anyway, but still a slight inconvenience as we have to more rigorously schedule drive swaps.

Gearing up for the donation drive, I discovered our mass mail server lost its DNS entry for some reason. The lab DNS master replaced it, but not before I had turned sendmail on an hour earlier and started my tests, thus causing all kinds of circular bounces that clogged the entire lab's mail queue with literally thousands of e-mails (maybe tens of thousands). It's still draining as I type this. Don't blame me - I didn't remove that DNS entry.

We're another step closer to removing that NetApp box. In fact, it's out of the automounter maps, everything on it is sym-linked elsewhere or chmod'ed to 0, and I scoured all the other servers to remove sym-links to it. Part of this project meant resurrecting server "clarke" (donated many months ago) to be a CPU server (or otherwise internal use) as it will soon have room in the closet. It had a stale configuration at this point which needed refreshing.
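
(Scouring for leftover links is easily scripted - something like the sketch below, where the old mount prefix is hypothetical.)

    #!/usr/bin/env python
    # Sketch: walk a tree and list symlinks whose targets still point at the
    # retired NetApp mount. The prefix and default root are placeholders.
    import os
    import sys

    OLD_PREFIX = "/net/netapp"   # hypothetical automount path for the old box

    def find_stale_links(root):
        for dirpath, dirnames, filenames in os.walk(root):
            for name in dirnames + filenames:
                path = os.path.join(dirpath, name)
                if os.path.islink(path) and \
                   os.readlink(path).startswith(OLD_PREFIX):
                    yield path, os.readlink(path)

    if __name__ == "__main__":
        root = sys.argv[1] if len(sys.argv) > 1 else "/home"
        for link, target in find_stale_links(root):
            print("%s -> %s" % (link, target))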

No news on the Overland boxes - though one question was: why not combine them into one big box? Well, we have two separate needs: workunit storage and raw data storage. The former we already have, and it works great - we just need more room - so we'll plug in one of the new expansions and get that room. The latter we don't really have, and we'd like to keep it on separate volumes (you read the raw data and write out workunits, so you don't want the I/O to compete as it would on shared drives). Also.. part of the deal is we're going to continue helping them beta test their latest OS, which they have on the second head unit they gave us. So in a sense we're obliged to have two separate entities - the raw data on the beta test head/expansion and the workunits on the known-reliable head and additional expansion. Other question: form factor - the heads are about 2U and the expansions are about 3U. We have 2 of the former and 3 of the latter now. We'll have room for them eventually. I will update closet photos when we do the next major move (next week, I hope?).

- Matt

OP | Posted 2008-12-11 10:45:26

11 Dec 2008 0:21:20 UTC

During the wee hours this morning our upload server (bruno) froze. We are still unsure why, but recovery was a comedy of errors. Jeff was already about to power cycle it (having little other choice given the unresponsive console) when I got in around 8am. After rebooting, bruno failed to mount its result storage drives due to some kind of mdadm mismanagement. This forced us into a read-only please-fsck-your-drives mode. The drives, outside of pointless resyncing due to the hard power cycle, were fine - they didn't need to be fsck'ed. Still, since root (/) was read-only we couldn't edit /etc/fstab to prevent this from happening again upon every reboot.

So I tried to get it into a real single user mode to make such an edit - all I wanted to do was comment out that one mount line. However, thus started a series of about 8 consecutive reboots, each taking about five minutes, and all wastes of time due to a typo or an unresponsive kvm. I ultimately gave up and booted from DVD in "rescue mode" where I could finally make the fstab edit. Finally all was well with the mount (which I did on the command line), but then I had all kinds of network errors with the system. More tweaks, more reboots... Long story short, this server is being held together with figurative duct tape at the moment. We'll get it all sorted out later.
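
(The edit itself is trivial once the disk is actually writable - something like this sketch, run from the rescue environment; the mount point is a placeholder.)

    #!/usr/bin/env python
    # Sketch: comment out the fstab entry for one mount point so the next
    # boot doesn't stall on it. The mount point below is a placeholder.
    BAD_MOUNT = "/disks/results"

    def comment_out(fstab="/etc/fstab"):
        with open(fstab) as f:
            lines = f.readlines()
        with open(fstab, "w") as f:
            for line in lines:
                fields = line.split()
                if (len(fields) >= 2 and fields[1] == BAD_MOUNT
                        and not line.lstrip().startswith("#")):
                    f.write("#" + line)
                else:
                    f.write(line)

    if __name__ == "__main__":
        comment_out()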

Jeff and I also worked together to get the remaining pieces of the "donation drive" in place, such as it is. I'm sending out test e-mails now, and will probably start sending in earnest on Friday. Please send all questions/comments about our fundraising efforts to the principal investigators (Dan, Dave, Eric). I am simply implementing the technical aspects of this endeavor, though I would like to point out we finally updated the text on the plans page.

By the way.. did anybody notice this?

- Matt

OP | Posted 2008-12-16 11:24:19
16 Dec 2008 0:10:30 UTC
Happy Monday, one and all.

So let's see... things are progressing in a generally positive direction. Our conversion from multi- to single-dimensional indexes on the result table in the BOINC/mysql database seems to have been a success, though I'm still not sure if it's helping all that much just yet. In any case, we may continue doing the same on other tables. We might get the whole database, indexes and all, fitting entirely in memory. We don't need to (we're doing just fine with whatever level of paging is currently happening), but it'd still be nice. In any case, at least we proved that we don't need to create extra unwieldy multi-dimensional indexes to do specific merges - mysql 5.x and up will figure out how to do the merges on its own.

Jeff and I plan to do some big steps towards moving things in and out of the server closet tomorrow. I'll try real hard to remember to bring a camera. If all goes well we'll at least have (a) more free rack space, (b) more available power, and (c) more workunit storage on-line (one less bottleneck to worry about!).

Thanks to those who've been beta testing the CUDA version of the SETI@home client. Sorry if I confused people by vaguely mentioning this in my last missive. Once this is formally released I'm sure we're going to exercise new and old bottlenecks, but it will be a huge step in the world of volunteer computing. We may run out of work more often. Depending on your perspective this may be seen as a "good problem."

And we did finally get the donation mass e-mail rolling out late last week. I really appreciate the generosity of the SETI@home community, especially in these dark economic times.

- Matt

OP | Posted 2008-12-17 18:54:23

16 Dec 2008 23:43:25 UTC

First and foremost, it's snowing outside. This doesn't happen very often around here.

So today was an outage day - with one unexpected surprise: a visit from Court, systems administrator extraordinaire here in our lab a couple years back. Nice to see him again and catch up.

The standard outage stuff was, well, standard. Allow me to remind our new readers: Weekly we "compress" the mysql databases (which bloat from continual inserts/deletes all week, much like disk fragmentation) and back them up. These databases contain all the user/host/team info, and who is working on which workunits - basically all the generic volunteer computing stuff. The science is all kept in a separate database (using an Informix engine) on a different server altogether. The latter doesn't suffer from the same bloat, so we can do simple no-frills backups to disk while the database is live, without much ado. In theory we could do the mysql dumps live as well, but we choose to take things down to ensure the master/replica databases are in sync, and allow us some regular downtime to take care of pending server tasks. For example...
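
(In outline, the weekly drill is just a dump plus a table optimize; a rough sketch with placeholder paths and table names follows.)

    #!/usr/bin/env python
    # Sketch of the weekly maintenance: dump the BOINC database to disk, then
    # optimize the churny tables to reclaim space. Names and paths are
    # placeholders, not our exact setup.
    import subprocess
    import time

    DB = "boinc_db"
    TABLES = ["result", "workunit", "user", "host", "team"]

    def backup_and_compact():
        dump = "/backups/%s_%s.sql" % (DB, time.strftime("%Y%m%d"))
        with open(dump, "w") as out:
            subprocess.check_call(["mysqldump", DB], stdout=out)
        subprocess.check_call(
            ["mysql", DB, "-e", "OPTIMIZE TABLE " + ", ".join(TABLES)])

    if __name__ == "__main__":
        backup_and_compact()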

Today we finally turned off the old Network Appliance - a NAS server which worked fast and wonderfully, but (a) was only 3 Terabytes raw storage, (b) took up one third of our server closet, and (c) the individual disks have been failing at an increasing rate. We moved all of its functionality elsewhere already, so it was time to say goodbye. Jeff and I tore it apart shelf by shelf. Any sadness was lost in the joy of now having a completely empty rack full of completely useful shelves (we've had ridiculous problems finding racks/shelves that matched in the past). It's kind of funny the most useful part of that system at this point was its racks/shelves. We put all the recently donated Overland Storage servers into this now-empty rack (containing 10 Terabytes worth of storage), as well as anakin (the scheduling server), and there's still room for a lot more stuff. We still have to configure/employ all this new storage, but it's all plugged in and on line at least.

Recovery from the outage is usually painful. Today seems a little worse. Part of that is our work-to-send queue is at zero and the splitters are waiting for some space to free up before creating new work. I also think server "bruno" is having result storage issues slowing things down (people are connecting okay, once they push through the usual traffic jam). We might need to reconfigure/rebuild that RAID array sooner rather than later.

I brought the mini video camera to make a quick video tour of our server closet, but the noise of all the fans is so loud it's basically worthless. I did take some low-quality still photos though - I'll get those up on the web someday.

- Matt

OP | Posted 2008-12-18 09:57:09

17 Dec 2008 23:50:51 UTC

So it's official: you can now run SETI@home on your NVIDIA GPU. Of course they're still working out the kinks, and it has yet to be seen what effects (immediate and long term) this will have on our servers and known bottlenecks. Such things are quite unpredictable, given the dizzyingly long list of variables.

In order to keep our bandwidth from going bonkers due to all the new client downloads, we employ the use of Coral Cache. This is all well and good, except that some ISPs out there firewall http redirects, which means a tiny subset of users cannot download these new clients. This is unfortunate, as we have no choice because we can't handle the new client downloads ourselves. So these few users will suffer a bit until we can remove such caching.
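
(For anyone unfamiliar with Coral: as I understand it, the caching works by rewriting the download hostname, roughly like the sketch below. The exact convention is from memory, so treat it as illustrative rather than gospel.)

    #!/usr/bin/env python
    # Sketch: rewrite a download URL into its (assumed) Coral-cached form by
    # appending .nyud.net to the hostname. The example URL is made up.
    from urllib.parse import urlsplit, urlunsplit

    def coralize(url):
        parts = urlsplit(url)
        return urlunsplit(parts._replace(netloc=parts.hostname + ".nyud.net"))

    if __name__ == "__main__":
        print(coralize("http://boinc.berkeley.edu/dl/boinc_client.exe"))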

Our replica server never did recover from the outage yesterday, causing stats of various kinds to be jammed for the past day or so. This morning we found scary log messages and we couldn't even shut mysql down gracefully, so we had to kill the process and reboot the machine. It's been in really slow recovery mode all day. When finished there's a good chance it'll be out of sync from the master and will have to be rebuilt from scratch anyway. Sigh. In the meantime, I'm pointing all queries at the master, which is loading it down a bit and causing us some minor grief (running out of work to send, for example).

- Matt


Although CUDA has graduated from beta, it still has rough edges, and we haven't yet seen what kind of effect it will have on our servers and known bottlenecks.

If you run into any problems while running it, please report them back here, or report them directly on the official forums.

OP | Posted 2008-12-19 08:34:49

18 Dec 2008 22:41:17 UTC

Moving onward and upward. More and more people are switching over to the GPU version of SETI@home and Dave (and others) are tackling bugs/issues as they arise. As predicted we're hitting various bottlenecks. For starters, increased workunit creation (and current general pipeline management since we have full raw data drives that need to be emptied ASAP) has consumed various i/o resources, filled up the workunit storage, etc. On this front I'm getting around to employing some of the new drives donated by Overland Storage. The first RAID1 mirror is syncing up - may take a while before that's done and we can concatenate it to the current array. Might not be usable until next week.
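
(Assuming the workunit filesystem sits on LVM - an assumption on my part - the expansion boils down to a handful of steps like the sketch below. The device, volume group, and logical volume names are placeholders, not our actual commands.)

    #!/usr/bin/env python
    # Sketch: build a RAID1 mirror from two new drives, add it to the volume
    # group backing workunit storage, and grow the logical volume/filesystem.
    import subprocess

    STEPS = [
        ["mdadm", "--create", "/dev/md10", "--level=1", "--raid-devices=2",
         "/dev/sdx", "/dev/sdy"],
        ["pvcreate", "/dev/md10"],
        ["vgextend", "wu_vg", "/dev/md10"],
        # -r grows the filesystem along with the logical volume
        ["lvextend", "-r", "-l", "+100%FREE", "/dev/wu_vg/workunits"],
    ]

    if __name__ == "__main__":
        for step in STEPS:
            print("running: " + " ".join(step))
            subprocess.check_call(step)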

Also, as many are complaining about on the forums, the upload server is blocked up pretty bad. This is strictly due to our 100Mbit limit, and there's really not much we can do about it at the moment. We're simply going to let this percolate and see if things clear up on their own (they may as I'm about to post this). Given the current state of wildly changing parameters it's not worth our time to fully understand specific issues until we get a better feel for what's going on. Nevertheless, I am working on using server "clarke" to configure/exercise bigger/faster result storage to put on bruno (the struggling upload server) perhaps next week.

As for the mysql replica, it did finally finish its garbage cleanup around midnight last night, but then couldn't start the engine because the pid file location was unreachable (?!). Bob restarted the server again, which initiated another round of garbage cleanup. Sigh. That finished this morning, and with the pid file business corrected in the meantime it started up without much ado - it still has 1.5 days of backlogged queries to chew on, though.

- Matt

OP | Posted 2008-12-23 10:21:13

22 Dec 2008 23:32:27 UTC

Okay, well, it's not like we didn't see difficulties coming with the release of a client that could potentially improve our processing by 10x. But it hasn't been all that bad, either. Due to various reasons, mostly excessive i/o, the assimilator queue swelled, which caused the workunit storage to reach maximum capacity, which in turn constrained the splitters. This is still the case, more or less - though I am working to increase the workunit storage which will help break one of our dams. I already employed some of the Overland Storage for raw data images, which will eventually break another dam or two. There's still our network bandwidth limits, though... We're just crossing bridges as we get there.

In any case, I did add a new photo album of our server closet for the nerds in our audience.

Schedules will be erratic for the holidays, as you can imagine.

- Matt

The GPU version has brought quite a few problems, mainly excessive i/o: the assimilator queue grew, which pushed the workunit storage to its maximum, which in turn constrained the splitters. Workunit storage is being increased, which will hopefully sort this out. There's also the bandwidth limit..

Some new photos of the server equipment have been added: http://setiathome.berkeley.edu/s ... m=closet_12_22_2008

OP | Posted 2008-12-24 08:37:09

23 Dec 2008 23:00:32 UTC

Today we had our weekly outage for mysql database backup, maintenance, etc. This week we are recreating the replica database from scratch using the dump from the master. This is to ensure that last week's crash didn't leave any secret lingering corruption. That's all happening now as I type this, and the project is revving back up to speed.

Had a conference call with our Overland Storage connections to clean up a couple of cosmetic issues with their new beta server. That's been working well and is already half full of raw data. Once the splitters start acting on those files the other raw data storage server will breathe a major sigh of relief. I was also set to (finally) bump up the workunit storage space yesterday using their new expansion unit - but waited for their confirmation of the procedure today, lest I do anything silly and blow away millions of workunit files by accident. The good news is that I increased this storage by almost a terabyte today, with more to come. We have officially broken that dam.

I also noticed this morning that the high load on bruno (the upload server) may be partially due to an old, old cronjob that checks the "last upload" time and alerts us accordingly. This process was mounting the upload directories over NFS and doing long directory listings, etc., which might have been slowing down that filesystem in general from time to time. I cleaned all that up - we'll see if it has any positive effect.
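
(A cheaper check is to stat the upload directories rather than list them - a sketch, with placeholder paths and threshold:)

    #!/usr/bin/env python
    # Sketch: alert if none of the upload directories has been modified
    # recently, without doing long directory listings over NFS.
    import glob
    import os
    import time

    UPLOAD_DIRS = glob.glob("/disks/uploads/*")   # placeholder fan-out dirs
    MAX_QUIET = 15 * 60                           # alert after 15 idle minutes

    def newest_mtime(paths):
        return max((os.path.getmtime(p) for p in paths), default=0)

    if __name__ == "__main__":
        idle = time.time() - newest_mtime(UPLOAD_DIRS)
        if idle > MAX_QUIET:
            print("WARNING: no upload directory touched in %.0f minutes"
                  % (idle / 60))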

Jeff's been hard at work on the NTPCkr. It's actually chewing on the beta database now in test mode. We did find that an "order by" clause in the code was causing the informix database engine to lock out all other queries. This may have been the problem we've been experiencing at random over the past months. Maybe informix needs more scratch space to do these sorts, and it locks the database in some kind of internal management panic if it can't find enough. Something to add to the list of "things to address in the new year."

- Matt