Original poster: vmzy

[Standalone platform] [Life sciences] Folding@Home

Original poster | Posted 2010-2-22 20:15:11
February 20, 2010
Shortage on "small" WU's
Here's a heads up for donors running with clients configured for small WUs (v6 clients) or normal WUs (in pre-6 clients); note that this does not affect "big WU" client configs for v6 or earlier.  It looks like we're running low on those over the weekend.  I hope to have this resolved by Monday, but likely not tomorrow (Sunday).  One workaround is to configure your client for medium-sized WUs:

Acceptable size of work assignment and work result packets (bigger units
may have large memory demands) -- 'small' is <5MB, 'normal' is <10MB, and
'big' is >10MB (small/normal/big) [normal]?

This option states a preference for the size of work units downloaded and uploaded to the project servers. Bigger units will also have bigger memory requirements. If you run on a slower broadband or dialup internet connection, small is the recommended setting to ease your bandwidth usage.

Please see our installation guides if you're not familiar with these settings.  In general, the larger the setting here, the less likely we'll run out of WUs, since we'll assign small WUs to big WU clients if we run out of big WUs to give out, but of course won't send big WUs to small WU clients.
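Summary:
Small WUs are running short, and new ones will be uploaded as soon as possible. If you have enough bandwidth, consider switching your WU size setting back to normal.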
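
The size preference and the fallback rule in this announcement can be sketched in a few lines of Python. This is only an illustration, not the project's assignment code: the names `classify_wu` and `can_assign` are hypothetical, the thresholds come from the configuration prompt quoted above, and treating a packet of exactly 10MB as "big" is an assumption, since the prompt leaves that case open.

```python
# Size classes in increasing order, as named in the client prompt.
SIZE_ORDER = ["small", "normal", "big"]


def classify_wu(size_mb: float) -> str:
    """Map a packet size in MB to the class named in the client prompt."""
    if size_mb < 5:
        return "small"
    if size_mb < 10:
        return "normal"
    # Assumption: exactly 10MB counts as "big" (the prompt only says >10MB).
    return "big"


def can_assign(client_pref: str, wu_size_mb: float) -> bool:
    """A client may receive WUs of its preferred class or any smaller class,
    but never a WU larger than its preference."""
    wu_rank = SIZE_ORDER.index(classify_wu(wu_size_mb))
    return wu_rank <= SIZE_ORDER.index(client_pref)


if __name__ == "__main__":
    for pref in SIZE_ORDER:
        eligible = [mb for mb in (3.0, 7.0, 12.0) if can_assign(pref, mb)]
        print(pref, "client can take WUs of size (MB):", eligible)
```

Under this model a client set to "big" is eligible for every WU, which matches the post's point that larger settings are less likely to run dry.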

Original poster | Posted 2010-2-23 10:18:56
February 22, 2010
Further bug fixes to the v5 WS
Joe has been pounding on the v5 WS trying to shake it out from the recent disaster with problems returning NVIDIA GPU WUs.  The upshot of all of this is that the v5 server code was pushed hard in many ways and several issues have now been found.  Joe is testing the fixes, and we're hopeful that, beyond the initial good news we had a few days ago, several additional issues may now be fixed.

It's too early to tell since we're still testing, but I'm optimistic. This only affects particular servers (vsp07b, vspg10a, vsp11a) and the vsp09a CS.
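Summary:
To fully resolve the recent server problem (NVIDIA GPU clients unable to upload results), we are upgrading and testing the code on some of the servers.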

Original poster | Posted 2010-2-26 09:50:39
February 25, 2010
v5 Work server issues looking pretty good
It looks like the WS bug fix has helped resolve the NVIDIA GPU work server issues.  We are keeping a close eye on things, but the situation has been stable so far.
Summary:
The server problem appears to be solved.

Original poster | Posted 2010-3-12 10:49:40
March 11, 2010
Update on server status & WU shortage
There have been some questions about the server status for Folding@home.  The problem from the donor perspective is that the lack of WUs looks very similar to the servers being down -- the client reports it can't connect to the server.  The servers have been up and in good shape since about Feb 25 (see the blog post on that day -- and that was just for the limited case of NVIDIA GPU servers, not FAH-wide).  However, we have had a WU shortage now and again over the last week or so, which donors have mistaken as server reliability issues.

To fix this, we have been working to greatly increase the number of WUs, both for classic and SMP clients.  There are a very large number of classic WUs that are coming out in new cores: the Protomol B4 core and the new Gromacs A4 core.  B4 has rolled out and A4 is coming out in a week or so (maybe sooner).  We also have new A3 SMP WUs in the pipeline.  There have been some science issues that we have been working out on them.

Right now we are a bit too close, which leads to shortages for certain types of clients.  My goal is to have way more WUs than we need so there are never any donor delays in getting WUs and we are pretty close to that, once the last few issues get worked out.
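Summary:
The servers have been fairly stable lately; only the NVIDIA GPU servers had some minor trouble.
The most pressing problem right now is still the WU shortage. However, a large batch of new B4 single-core WUs and A3 SMP WUs has been prepared, and the new A4 core will also enter public testing within a week.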

Posted 2010-3-13 01:13:23
When will a core that works on Fermi be released...

Posted 2010-3-22 01:04:40
WU shortage...
What is Stanford up to? It seems they haven't published any results in a long time.

Original poster | Posted 2010-3-22 09:49:00
Fri Mar 19, 2010 11:58 pm
updated SMP2 core (A3): v2.17
We have released an updated version of the A3 core used to run SMP2 work units. The current release is now 2.17. Right now one of our SMP2 servers is requiring the new core; your client should auto-download the core next time you get work from that server.

Please do *not* update your cores mid-WU. The checkpoint format is different, and your work unit will start over from the beginning.

Thanks for your support of the project!
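Summary:
The SMP2 A3 core has been updated to version 2.17.
The client will update to the new core automatically when a new WU starts.
Because the checkpoint file format is incompatible, updating the core manually will cause the WU to restart from the beginning.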

March 21, 2010
New award for Folding@home team
Folding@home researcher Greg Bowman was awarded the 2010 Kuhn Paradigm Shift Award from the American Chemical Society (ACS) for his talk on two paradigm shifts resulting from Folding@home: 1) the new methods that Folding@home uses to simulate protein folding, misfolding, etc, and 2) the results themselves, which suggest a significant change in protein folding theory.

Original poster | Posted 2010-3-28 10:41:17
March 27, 2010
ATI GPU server going off line for maintenance today, may be some WU shortages
We will likely be taking an ATI GPU server (vspg2v2) down today for maintenance.  We are working to get more jobs on its sister server (vspg3v2) to avoid WU shortages, but we're giving a heads up for donors just in case this doesn't time out well.
Summary:
The ATI GPU server (vspg2v2) will undergo routine maintenance today, and ATI GPU WUs may temporarily run short while it is down.

Original poster | Posted 2010-4-5 22:34:48
April 02, 2010
We're working on vspg5v2. It will likely be down for a day or so.
Summary:
The vspg5v2 server will be down for roughly a day while we work on it.

Original poster | Posted 2010-4-6 10:37:38
Mon Apr 05, 2010 6:42 pm
New protomol projects 10009, 10012-17, 10019-20
Projects 10009, 10012, 10013, 10014, 10015, 10016, 10017, 10019, 10020 using the new ProtoMol core have just entered advanced methods.

WARNING
We're using core version 23. A known issue is that killing and restarting the client forces the simulation to restart. We're working to address this problem.

These are all WW domain mutants. We will eventually have about 30 mutants running as separate projects. The goal is to gain further insight into the effect these mutations have on folding. Hopefully the simulations will match the predictions in the referenced paper.

points: 126
timeout: 4.6 days
deadline: 34.5 days

Reference:
"An experimental survey of the transition between two-state and downhill protein folding scenarios"
Feng Liu, Deguo Du, Amelia A. Fuller, Jennifer E. Davoren, Peter Wipf, Jeffery W. Kelly, and Martin Gruebele
PNAS February 19, 2008 vol. 105 no. 7 2369-2374
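Summary:
A new batch of ProtoMol projects (10009, 10012-17, 10019-20) has entered public testing; you will only have a chance of receiving these WUs if you run the single-core client with the advmethods option.
The projects are all WW domain mutants, with about 30 mutants planned in total, and the aim is to study how these mutations affect folding.
The new WUs are computed with version 23 of the new B4 core.
Known issue: checkpoints are not honored, so restarting the client makes the simulation start over.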
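
The timeout and deadline figures quoted for these projects are given in fractional days, so it can help to turn them into calendar times. A minimal sketch follows; the function name `due_dates` is hypothetical, and counting both limits from the moment the WU is downloaded is an assumption, since the announcement does not say when the clock starts.

```python
from datetime import datetime, timedelta
from typing import Tuple

# Values quoted in the project announcement above.
TIMEOUT_DAYS = 4.6
DEADLINE_DAYS = 34.5


def due_dates(downloaded_at: datetime) -> Tuple[datetime, datetime]:
    """Return (timeout, deadline) timestamps for a WU.

    Assumption: both limits are counted from the download time.
    """
    timeout = downloaded_at + timedelta(days=TIMEOUT_DAYS)
    deadline = downloaded_at + timedelta(days=DEADLINE_DAYS)
    return timeout, deadline


if __name__ == "__main__":
    start = datetime(2010, 4, 6, 10, 0)  # hypothetical download time
    timeout, deadline = due_dates(start)
    print("timeout :", timeout.strftime("%Y-%m-%d %H:%M"))
    print("deadline:", deadline.strftime("%Y-%m-%d %H:%M"))
```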

Original poster | Posted 2010-4-15 09:36:20
April 14, 2010
Support for GTX 4xx hardware
We have been working behind the scenes to optimize the Folding@home GPU client for the new NVIDIA GTX 4xx hardware.  So far, it's been going well with us hitting some strong performance numbers.  We are internally testing this and hope to soon (weeks) release this for outside beta testing.  Please note that GTX 4xx support will require a new client and also requires some changes to our cluster backend software.
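Summary:
A new client for Fermi is in development and going well; public beta testing should start within a few weeks.
Note: Fermi support will need a new dedicated client as well as changes on the server side.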

Original poster | Posted 2010-4-19 10:08:18
Sat Apr 17, 2010 10:17 pm
Shortage of WUs
FAH has been pretty much out of work to be done for quite a few days now. (The word "few" depends on a number of different factors, including which client you run, which OS you run, what hardware you have, etc.)

This is not the usual situation, but it does happen. By policy, FAH does not distribute work unless the completion of that work is scientifically important so they do not replenish the servers with WUs that have already been successfully processed by someone else.

New projects are on their way, but there is quite a bit of preliminary work to be done before a new project is ready to start distributing WUs. In the meantime, there are a few things that you can try, but there's a pretty good chance that nothing will work except to be patient.

If you get a message about an invalid address (0.0.0.0) and your client was working previously, that's your problem. Unless your OS is an old version which prohibits you from upgrading to the latest client, do that first.

If you're getting the messages
+ No appropriate work server was available; will try again in a bit.
+ Couldn't get work instructions.
you've got the same problem except it's a more recent client.
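
To check whether a client is hitting this shortage rather than some other fault, its log can be searched for the symptoms quoted above. The following is a rough sketch only: the function name `find_shortage_lines` and the sample log lines (including their timestamp format) are hypothetical, and only the two message strings and the 0.0.0.0 address come from this post.

```python
from typing import List

# Message texts quoted in the post above.
SHORTAGE_MESSAGES = (
    "No appropriate work server was available; will try again in a bit.",
    "Couldn't get work instructions.",
)
INVALID_ADDRESS = "0.0.0.0"


def find_shortage_lines(log_text: str) -> List[str]:
    """Return the log lines that show one of the shortage symptoms."""
    hits = []
    for line in log_text.splitlines():
        if INVALID_ADDRESS in line or any(msg in line for msg in SHORTAGE_MESSAGES):
            hits.append(line.strip())
    return hits


if __name__ == "__main__":
    # Hypothetical log excerpt, made up for illustration.
    sample = (
        "[10:00:01] Attempting to get work packet\n"
        "[10:00:02] + No appropriate work server was available; will try again in a bit.\n"
        "[10:00:02] + Couldn't get work instructions.\n"
    )
    for hit in find_shortage_lines(sample):
        print(hit)
```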

If you have the latest client and you're not using "advmethods" consider adding it. That gives you access to a wider selection of WUs, but they also might be less stable, so this may not be something that you want to try. In rare instances, removing that option helps. In either case, remember that you may enable it either as a specific option in the Advanced configuration ("Set -advmethods flag always?"), by adding it to the command line, itself, or by entering it in the Advanced Configuration under "Additional client parameters? " and to disable it, you must stop using all three of those options.
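
The point about the three places is easy to get wrong, so here is a small sketch of the logic: the option stays active as long as any one of them still sets it. The dictionary keys and the function name are hypothetical labels for the three places listed above; this does not read any real client configuration.

```python
from typing import Dict, List


def places_enabling_advmethods(settings: Dict[str, bool]) -> List[str]:
    """List the places that still turn the option on.

    The option is disabled only when this list is empty.
    """
    return [place for place, enabled in settings.items() if enabled]


if __name__ == "__main__":
    # Hypothetical state: removed from the command line, still set elsewhere.
    settings = {
        "advanced configuration flag": True,
        "command line": False,
        "additional client parameters": False,
    }
    remaining = places_enabling_advmethods(settings)
    print("still enabled by:", remaining if remaining else "nothing (disabled)")
```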

In the configuration settings, if you are configured to accept SMALL or NORMAL WUs, you can reconfigure for larger WUs.

In the configuration settings, the default setting for Max Ram is obtained from your actual hardware. If you have restricted it to a smaller value, consider increasing it.

Some new projects are being released but not enough to fill everybody's machine yet so the recovery will take time.
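Summary:
FAH has been seriously short of work for some time now.
Many new projects are in preparation but still being tested, so they cannot go onto the servers yet; for now, all anyone can do is wait patiently. As a matter of policy, FAH would rather go without than hand out filler: WUs that have not been rigorously tested will not be put on the servers.
If the AS assigns you the invalid address 0.0.0.0, something is wrong with your client or system: upgrade to the latest client and check your firewall settings.
If you see the following messages:
+ No appropriate work server was available; will try again in a bit.
+ Couldn't get work instructions.
you should likewise check your client and firewall.
If your system is fine, try adding the advmethods parameter, which gives you a chance of receiving test WUs.
If your WU size setting is SMALL or NORMAL, raise it so that you may receive more WUs; note that larger WUs are more demanding to compute.
If you have manually limited memory usage (by default the client uses all physical memory), raise that limit.
Some new WUs will trickle online, but supply still falls short of demand, so please be patient.

Translator's note: the shortage looks set to last a while, so it is worth preparing to join a backup project. I personally recommend WCG.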

Posted 2010-4-19 23:42:36
Great, now there's no FAH left to run......

Original poster | Posted 2010-4-21 10:02:36
April 20, 2010
Progress on WU shortages
We had some WU shortages over the weekend, but for the most part handled the biggest demands.  However, we still have shortages for certain types of WUs, especially for pre-v6 clients.

We are working to add more A3 WUs as well as more staff members to prepare A3 WU projects.

Several new classic WUs came on line.   Also, once the protomol core (B4) gets more broad code applicability to older hardware (eg pre SSE), those WUs will be able to roll out more broadly as well.
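Summary:
Progress on the WU shortage
WUs ran short over the weekend, and WUs for older clients (versions below 6) are especially scarce.
We plan to increase the supply of A3 (SMP2) WUs.
New WUs for the classic (single-core) client have been uploaded. Once B4 resolves its compatibility with older hardware (pre-SSE), those WUs will be released in large numbers as well.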

Tue Apr 20, 2010 3:07 am
A number of new projects have been released for some clients, so the shortage condition is improving, but I wouldn't call the issue solved yet. (It's certainly not solved if you happen to be running one of the client types that is still experiencing shortages.)

Last year, FAH had WU shortages because the servers couldn't handle all of the load. The servers were upgraded and that's no longer the issue. Today it's more a question of taxing the scientists.

FAH only issues WUs from projects that have scientific value. Stanford will not intentionally reissue work that has already been completed successfully, but they do start new projects from time to time. Before a new project can be started, there's a significant amount of research that needs to be done to make sure the project is likely to produce useful results. Then the project must be configured and evaluated so they know it's going to be a successful project. Only then will a project be released for processing.

(Of course once a project yields results, the scientist has to analyze the data and either devise a new project based on what has been learned or prepare a journal paper to publish the results.) In many cases, the scientists doing this pre- and post-work are the same ones managing other aspects of FAH. (It's relatively recent that FAH hired a professional programming firm, and resources are still rather limited.)

I don't know how many new projects are being prepared or for which clients or when more will be on-line, but it's a never-ending process. All you can do is be patient if your type of client is not receiving new assignments.
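Summary:
A batch of new WUs has been uploaded, so the shortage should be easing, but it has not been fundamentally solved (certain WU types are still short).
Last year we also had a shortage, but that one was caused by server problems; the servers have since been upgraded and that issue is gone. This time the shortage is purely a lack of WUs.
Many scientists wear several hats and are stretched thin, so new projects are released slowly (they have to analyze the results of the previous project while also preparing WUs for the next). Programming work has been outsourced to a professional software firm, but that is a drop in the bucket, and the scientists' research workload remains heavy.
I don't know when the shortage will be fully resolved either; for now all anyone can do is wait patiently.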

Posted 2010-4-21 14:15:32
A dedicated client and dedicated servers just for Fermi...

That's quite the special treatment...
中国分布式计算总站 ( 沪ICP备05042587号 )
