[Project] SciLINC (Alpha test)

Julian_Yuen · Posted 2007-7-4 10:13:20

http://www.scilinc.org/
http://www.scilinc.org/SciLINC/
Registration is not open for now.

Project status: http://www.scilinc.org/SciLINC/server_status.php
======================

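Originally posted by Julian_Yuen on 2007-7-4 11:01
When the SciLINC project began, it had four basic goals. Put briefly, they were:

1. Give the public better access to nationally significant scientific literature.
2. Make digitized materials more useful by creating a web repository of scanned literature, keywords, and online resources, with tools for searching and analysis.
3. Create a teaching tool for learning about plants. While the screensaver application indexes keywords, the participant's computer will display information about plants in the United States and around the world. The displayed information will describe each plant name or term currently being indexed on the participant's computer, and will include descriptive data, images, maps, and external links related to the term.
4. Provide a model for bringing public-resource computing into the library community.

[ Last edited by Julian_Yuen on 2007-7-8 14:27 ]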

zglloo · Posted 2007-7-4 10:44:14

Is this a new project?

Julian_Yuen · Posted 2007-7-4 10:49:31

Yeah. I'm going to find some content to post here, so I'm reserving this post for now.

News (old)

March 22, 2007
The SciLINC project went on-line today. We expect to be processing work within a few days.
The SciLINC project is online as of today. We expect to start processing work within the next few days.

March 29, 2007
The SciLINC server sent out its first set of work units and received its first results today, running on Linux. Expect a Windows version shortly.
The SciLINC server sent out its first WUs today and received its first results, from a Linux host. A Windows version is expected soon.

May 14, 2007
We have moved SciLINC from our intranet to the Internet. The server is now available via www.scilinc.org.
We moved SciLINC from our intranet onto the Internet. The server is at www.scilinc.org.

May 18, 2007
We are having problems with the server. The scheduler, feeder and associated processes are down, as you can see on the server status page.
Our server has run into some trouble.

May 21, 2007
The server is back online; see "Problems changing a BOINC server's name" for details.
The server is online again.

June 1, 2007
Results from SciLINC are now being passed to Botanicus.
SciLINC results are now being sent to Botanicus.

June 15, 2007
The process that gathers new work for SciLINC got a little carried away and filled the drive with everything it could find in the vaults. Images, PDFs, everything. This has been corrected and the server should now be running correctly again.

8:00 June 18, 2007
We have experienced a database corruption. Most of the server processes are down.
Updates will be posted here as they are available. At this time we expect to have things back on-line within a day or two.
Thank you for your continued support and understanding.

8:26 June 18, 2007
AlphaLaser and other users on the BOINCstats forums and the BOINC forums have brought to our attention an issue where the large number of files being served up for each workunit is causing high CPU loads in the core BOINC client.
Augustine found a partial workaround to this. It is discussed on BOINCstats as well.
After the database is back on-line we will be looking into correcting this issue.

11:05 June 18, 2007
New account creation has been temporarily disabled until we can iron out some of the issues that we are having.
If you really must have an account or want to help us test fixes to the CPU loading problem, send us an email at [email protected] (subject: [Account-Request]) requesting one, and after things are back up we will try to respond.

17:30 June 18, 2007
We have finished our examination and remote analysis of the main SciLINC server. This led us to believe that it was the victim of electrical problems. It was later confirmed that there were some transient electrical issues on Saturday in the building where the machine is housed.
Binary junk was found written across a number of log files on this server. The affected files belong to various SciLINC processes, MySQL, Apache and the Linux kernel itself, and the junk was written in all of them around 7:02 PM local time on Saturday the 16th. That's 00:02 UTC on Sunday the 17th. This coincides with a "disturbance in the force" that knocked a number of other machines offline over the weekend as well.
These things should not happen, but they do. There are plans to move the machine to a better location in the next couple of weeks.
At this point we are filing bug reports, as requested in the log files, since MySQL is unable to recover from the transaction logs that are present. As soon as this is done we will be attempting recovery of the database and then work on getting the project back on-line.
We will keep you posted and thank you for your continued support.
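
For anyone lining up log timestamps from that outage, here is a small sketch of the local-to-UTC conversion quoted above. The UTC-5 offset is an assumption inferred from the two times in the news item; the post itself never names the server's time zone.

```python
from datetime import datetime, timedelta, timezone

# Assumed offset: the post only gives "7:02 PM local" and "00:02 UTC",
# which are five hours apart. The zone itself is not stated.
LOCAL = timezone(timedelta(hours=-5))

local_time = datetime(2007, 6, 16, 19, 2, tzinfo=LOCAL)
utc_time = local_time.astimezone(timezone.utc)

print("local:", local_time.isoformat())
print("UTC:  ", utc_time.isoformat())
```

The conversion crosses midnight, which is why the local Saturday timestamp shows up as Sunday in UTC.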

23:30 June 19, 2007
The database is back on-line. We have disabled the sending of new work units and will be monitoring the system to see what happens.
Fortunately only one table of one database was corrupted. Unfortunately it was the SciLINC results table. This will probably result in no credit being granted for the results that are pending being reported and we apologize for that.
More information will be posted tomorrow.

[ Last edited by Julian_Yuen on 2007-7-4 14:07 ]

zglloo · Posted 2007-7-4 10:57:03

Create an educational tool for learning about plant life. While the screensaver application is indexing keywords, the participant's computer will display information about plant life within the United States and around the world. The information displayed will describe each plant name or term currently being indexed on the participant's computer, and will include descriptive data, images, maps, and the annotated outlinks for that term.

"When development of the SciLINC project began it had four primary goals." Looks to me like it has something to do with plants!

Julian_Yuen · Posted 2007-7-4 11:01:53

2:07 June 20, 2007 Forums Online
The forums are now online. Please use the Question and Answers section for reporting any bugs or problems. There is also a special message board for general discussion.

We will be checking them in the morning. For now... sleep.

The forums are online now. If you have a problem, post it there. We will check them tomorrow; for now, time to sleep.

11:38 June 20, 2007 User Profile Image Uploading Fixed
User profile images should now upload properly. Thank you Zain Upton and Paul@home for bringing this to our attention.

The details are in a message on our Question and Answer forums.

User profile images should now upload correctly.

11:20 June 21, 2007 Brief Outage Resolved
We experienced a short outage this morning. MySQL was accidentally shut down for about 40 minutes. Everything should be back to normal now.

We were down for a little while this morning. MySQL was accidentally shut down for about 40 minutes. Everything should be back to normal now.

[ Last edited by Julian_Yuen on 2007-7-4 13:59 ]

Julian_Yuen · Posted 2007-7-4 11:03:17

10:39 June 22, 2007 SciLINC Update
When development of the SciLINC project began it had four primary goals. Edited for brevity, they were:

1. Increase public access to nationally significant scientific literature.
2. Enhance the usefulness of digitized materials by creating a Web repository of scanned literature, keywords, and online resources with tools for searching and analysis.
3. Create an educational tool for learning about plant life. While the screensaver application is indexing keywords, the participant's computer will display information about plant life within the United States and around the world. The information displayed will describe each plant name or term currently being indexed on the participant's computer, and will include descriptive data, images, maps, and the annotated outlinks for that term.
4. Provide a model for adopting public-resource computing applications within the library community.

Botanicus is doing a wonderful job of meeting goals 1 and 2, including processing data generated by SciLINC. The project has certainly also met goal 4.

We have learned much about grid-based, distributed, public-resource computing applications and the BOINC architecture. There are thoughts and plans for analyses down the road that will be much more computationally intensive than the original SciLINC analysis and we look forward in time to bringing these projects to you.

While the amount of data that SciLINC has to analyze will increase greatly in the days ahead it does not appear that increasing the volume of information is going to improve the user experience of running the SciLINC client.

It has been suggested that we repackage our data into single files instead of uploading and downloading 50 files per workunit as we currently do. This suggestion has been heeded and implemented. We had planned on doing it before SciLINC was rolled out but scheduling prevented it and the community discovered the project before we were ready to announce it. We expect that testing will show the repackaging lessens the load placed upon the core BOINC client software. But, it does not change the amount of data being transferred.
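
The repackaging idea can be sketched roughly as follows. This is only an illustration using Python's standard `zipfile` module; the post does not say which archive format or tooling SciLINC actually used, and the file names and `pack_workunit` / `unpack_workunit` helpers are made up for the example.

```python
import io
import zipfile


def pack_workunit(files: dict[str, bytes]) -> bytes:
    """Bundle many small input files into one archive held in memory."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for name, payload in files.items():
            archive.writestr(name, payload)
    return buffer.getvalue()


def unpack_workunit(blob: bytes) -> dict[str, bytes]:
    """Recover the individual files on the client side."""
    with zipfile.ZipFile(io.BytesIO(blob)) as archive:
        return {name: archive.read(name) for name in archive.namelist()}


# 50 hypothetical page files, matching the per-workunit file count in the post.
pages = {f"page_{i:02d}.txt": f"scanned text of page {i}".encode() for i in range(50)}
bundle = pack_workunit(pages)
restored = unpack_workunit(bundle)
print(len(pages), "files packed into 1 archive of", len(bundle), "bytes")
```

The client then tracks one transfer per workunit instead of 50, which is the load the post is worried about. Since the inputs were already shipped compressed, the byte count is not expected to shrink much, as the post notes.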

The truth is that the workunits fly by so rapidly that implementing goal 3 never became realistic.

When development of SciLINC began, the project lead's understanding was that from a technological and economic standpoint it makes sense to use public-resource computing in place of an internal grid computing architecture whenever less than a gigabyte of data is required per cpu-day of computation. Using the BOINC framework to transfer the data to clients, SciLINC meets this volume-of-computation guideline.

However, our brief experience with the dedicated BOINC community over the last couple of weeks has shown that, to the community, these numbers may differ somewhat. In its original form SciLINC would have needed to transfer roughly 250 MiB of compressed data in order to occupy a modern CPU for a day. This would expand to nearly 660 MiB of input data. Then the client would need to upload about 44 MiB of results, which would compress to 17 MiB. These numbers have only grown as SciLINC has been improved and made more efficient.

This is not acceptable to the average BOINC user.
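
To put the per-CPU-day figures next to the one-gigabyte guideline, here is a quick tally. The sizes are the ones quoted in the announcement; reading "a gigabyte" as 1 GiB and counting only the compressed download plus the compressed upload as "transferred" are assumptions made for this sketch.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3

# Figures quoted in the announcement, per CPU-day of work.
download_compressed = 250 * MIB
input_expanded = 660 * MIB
results_raw = 44 * MIB
upload_compressed = 17 * MIB

# Assumption: only compressed data crosses the network.
transferred = download_compressed + upload_compressed
guideline = 1 * GIB  # assumed reading of "a gigabyte"

print(f"transferred per CPU-day: {transferred / MIB:.0f} MiB")
print(f"share of the guideline:  {transferred / guideline:.1%}")
print(f"input expansion ratio:   {input_expanded / download_compressed:.2f}x")
print(f"result compression:      {results_raw / upload_compressed:.2f}x")
```

On these numbers the project sits well inside the guideline, which fits the announcement's point that the guideline and the community's tolerance are two different things.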

Looking at the numbers from the perspective of someone on dial-up, if they set SciLINC to only 1% of their BOINC time, this would be roughly 15 minutes out of a day. For this 15 minutes they would have needed to download around 2.5MiB of data. This may not be a huge issue for broadband users, but if someone is on dial-up (as we have learned many BOINC fans still are) the transfer time would exceed the computation time.
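
The dial-up arithmetic can be checked with a few lines. The 1% share and the data sizes come from the announcement; the link speeds are hypothetical sustained throughputs picked for illustration, and using one rate for both download and upload is a simplification.

```python
MIB = 1024 ** 2

share = 0.01                         # 1% of BOINC time, as in the post
compute_seconds = share * 24 * 3600  # about 14.4 minutes of computation
download_bytes = share * 250 * MIB   # about 2.5 MiB, as in the post
upload_bytes = share * 17 * MIB      # compressed results, scaled the same way


def transfer_seconds(num_bytes: float, bits_per_second: float) -> float:
    """Time to move num_bytes over a link with the given sustained throughput."""
    return num_bytes * 8 / bits_per_second


total_bytes = download_bytes + upload_bytes
# Throughput at which transfer time equals computation time.
break_even_bps = total_bytes * 8 / compute_seconds

# Hypothetical sustained dial-up throughputs; none of these come from the post.
for bps in (56_000, 33_600, 28_800, 14_400):
    minutes = transfer_seconds(total_bytes, bps) / 60
    print(f"{bps / 1000:>5.1f} kbit/s: {minutes:5.1f} min transfer "
          f"vs {compute_seconds / 60:.1f} min compute")

print(f"break-even throughput: {break_even_bps / 1000:.1f} kbit/s")
```

Under these assumptions the transfer outlasts the computation only when sustained throughput drops below roughly 26 kbit/s. The sketch ignores protocol overhead, retries, and anything else sharing the line.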

So, where are we now?

Even if the transfer:credit ratios were acceptable to the community, we do not have enough data to realistically occupy hundreds or thousands of BOINC enthusiasts for a lengthy period of time. As we have already seen on various community boards, a relatively small amount of credit is earned for a comparatively large load on their system resources. Any computational and transport related improvements that have been tested have only resulted in more data needing to be transferred.

As stated above, we are investigating the possibility of performing much more computationally intensive analyses in the months ahead. It is expected that these will be a much better fit for a BOINC project than the current task of text-indexing and taxonomic analysis which has a relatively low mathematical complexity.

Because of this it has been decided that for now all SciLINC computation will be performed internally. When we have something with a better credit-reward ratio (and nicer screensaver) it will be made available to the community.

Thank you again for your interest and support. We look forward to working with you in the future.

The SciLINC Team

This has been cross-posted to the forums for discussion and feedback.

*********************************************************************

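When the SciLINC project began, it had four basic goals. Put briefly, they were:

1. Give the public better access to nationally significant scientific literature.
2. Make digitized materials more useful by creating a web repository of scanned literature, keywords, and online resources, with tools for searching and analysis.
3. Create a teaching tool for learning about plants. While the screensaver application indexes keywords, the participant's computer will display information about plants in the United States and around the world. The displayed information will describe each plant name or term currently being indexed on the participant's computer, and will include descriptive data, images, maps, and external links related to the term.
4. Provide a model for bringing public-resource computing into the library community.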

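Botanicus is doing very good work toward goals 1 and 2, including processing the data generated by SciLINC. The project has certainly also met goal 4.

We have learned a lot about grid-based, distributed, public-resource computing applications and about the BOINC architecture. There are ideas and plans for later analyses that will be far more computationally intensive than the original SciLINC analysis, and we hope to bring these projects to you in time.

Although the amount of data SciLINC has to analyze is about to grow greatly, nothing suggests that a larger volume of information will improve the experience of running the SciLINC client.

It was suggested that we repackage our data into a single file, replacing the current approach of uploading and downloading 50 files per WU. We took the suggestion on board and implemented it. We had planned to do this before SciLINC was rolled out, but scheduling prevented it, and the project was discovered by the community before we were ready to announce it. We expect testing to show that the repackaging reduces the load on the core BOINC client software. However, it does not change the amount of data being transferred.

The fact is that WUs go by so quickly that implementing goal 3 never became realistic.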

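When development of SciLINC began, the project lead's understanding was that, from a technological and economic standpoint, it makes sense to use public-resource computing in place of an internal grid computing architecture whenever less than one GB of data is needed per CPU-day of computation. Using the BOINC framework to transfer data to clients, SciLINC meets this volume-of-computation guideline.

However, our brief experience with the BOINC community over the past couple of weeks has shown that, for the community, these numbers are somewhat different. In its original form SciLINC would have needed to transfer about 250 MiB of compressed data to keep a modern CPU busy for a day. This would expand to nearly 660 MiB of input data. The client would then need to upload 44 MiB of results (compressed to 17 MiB). These numbers have only grown as SciLINC has been improved and made more efficient.

This is not acceptable to the average BOINC user.

Take the point of view of someone on dial-up: if they set SciLINC to only 1% of their BOINC time, that is about 15 minutes a day. For those 15 minutes they would need to download about 2.5 MiB of data. This is not a big problem for broadband users, but on dial-up (which we have learned many BOINC fans still use) the transfer time would exceed the computation time.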

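So, where are we now?

Even if the transfer:credit ratio were acceptable to the community, we do not have enough data to realistically occupy hundreds or thousands of BOINC enthusiasts for a long period. We have also seen on various discussion boards that relatively little credit is earned for a comparatively heavy load on system resources. Every computation-related and transfer-related improvement that has been tested has only resulted in more data needing to be transferred.

As stated above, we are investigating the possibility of running much more computationally intensive analyses in the months ahead. We expect these to be a much better fit for a BOINC project than the current text-indexing and taxonomic analysis task, which has relatively low mathematical complexity.

We have therefore decided that, for now, all SciLINC computation will be performed internally. When we have something with a better credit-reward ratio (and a nicer screensaver), we will make it available to the community.

Thank you again for your interest and support. We look forward to working with you in the future.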

The SciLINC Team

[ Last edited by Julian_Yuen on 2007-7-4 16:42 ]

Julian_Yuen · Posted 2007-7-4 11:03:58

00:01 June 27, 2007 Internal Testing, Scheduler Unavailable
Some internal tests are being run. So, the scheduler is temporarily unavailable to those on the Internet. This may cause some access denied messages to show up in your BOINC manager.

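Some internal tests are under way, so the scheduler is temporarily unavailable to users on the Internet. This may cause some access-denied messages to appear in your BOINC Manager.

[ Last edited by Julian_Yuen on 2007-7-4 14:03 ]

zglloo · Posted 2007-7-4 11:12:29

Is there nowhere to register right now?

Julian_Yuen · Posted 2007-7-4 11:24:08

Right, it doesn't seem to be open yet.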


News

Comes with a messy, clumsy translation of my own.
