It is an exciting time in the lab now as we are recovering from the craziness of CASP. While Bin and Rhiju are taking an incredibly well-deserved vacation, the new HIV vaccine design project is starting to come into full swing. We now have computationally designed amino acid sequences for 15 potential vaccine candidates, and we will start the process of making them next Tuesday; the first step is to synthesize genes which encode these proteins. We have also designed a whole series of novel enzymes which catalyze a wide variety of reactions, and are starting the gene synthesis process for these as well.
I'm particularly interested now in designing enzymes which destroy organophosphate compounds, which are the key ingredients in many pesticides and nerve agents. On Rosetta@home, we are carrying out calculations in which we resample regions of the landscape found to be low energy in initial sets of runs, and we hope these will lead to significant improvements in our ability to find global minima.
I'm very sorry about some of the not nice things being passed about on the message boards, and I'm also sorry that my efforts to calm things down haven't helped, so I will be doing my communicating with the project solely through this thread for the next week. You are all making great contributions, and I ask people who have been annoyed by what has been said on one side or the other to try to think about the big picture and what we are all trying to accomplish together.
We have officially switched over to a new crediting system that grants credit based on the number of structures produced by your computer. Under the new system, the amount of credit awarded per structure for a particular work unit is determined by the average amount of credit claimed per structure using the standard BOINC credit metric over all Rosetta@home runs of that work unit to date. For each work unit type, we keep track of the total amount of claimed credits and structures from valid results returned by hosts, and we use these running totals to determine the amount of credit to award per structure. So if your computer returns 2 structures, the amount of credit awarded would be 2 * total_claimed_credit / total_structures where total_claimed_credit and total_structures are the sum of the claimed credits and structures from valid results returned by all hosts prior to your returned result for that particular work unit type, respectively. The first returned result will be awarded the claimed credit, the second returned result will get the average claimed credit per structure between the two multiplied by the number of structures returned by the result, the third returned result will get the average claimed credit per structure between the three multiplied by the number of structures returned by the result, and so forth. Over the same time frame, a faster computer will produce more structures than a slower computer and thus will be awarded more credit per unit of CPU time.
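The running-average rule above can be sketched in a few lines of Python. This is only an illustration, not the actual Rosetta@home server code: the names `CreditLedger`, `award`, and the work unit label are hypothetical, and the sketch assumes the running totals include the result being scored, which is what the first/second/third result walk-through implies.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class RunningTotals:
    """Running sums over valid results for one work unit type."""
    claimed_credit: float = 0.0
    structures: int = 0


class CreditLedger:
    """Hypothetical bookkeeping for per-structure credit (illustration only)."""

    def __init__(self):
        # One pair of running totals per work unit type.
        self._totals = defaultdict(RunningTotals)

    def award(self, work_unit_type, claimed_credit, structures):
        """Return the credit granted for one valid result.

        Assumption: the totals are updated with this result before the
        average is taken, so the first result is granted exactly what
        it claimed.
        """
        if structures <= 0:
            raise ValueError("a valid result must contain at least one structure")
        totals = self._totals[work_unit_type]
        totals.claimed_credit += claimed_credit
        totals.structures += structures
        credit_per_structure = totals.claimed_credit / totals.structures
        return structures * credit_per_structure


ledger = CreditLedger()
# First result: claims 10 credits for 2 structures, so it is granted 10.0.
print(ledger.award("example_work_unit", claimed_credit=10.0, structures=2))
# Second result: totals are now 40 credits over 5 structures (8 per structure),
# so 3 structures are granted 24.0 regardless of the 30 credits claimed.
print(ledger.award("example_work_unit", claimed_credit=30.0, structures=3))
```

Because the granted credit depends only on the number of structures returned and the running average, a host that over-claims or under-claims shifts the average slightly for later results but does not set its own reward once a few results are in.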
Today was an exciting day for the group! In our vaccine design and enzyme design calculations, the end result is an amino acid sequence for a protein predicted to be a good vaccine or catalyst of a chemical reaction. The next step is to make a gene--a piece of DNA--that codes for the amino acid sequence. Due to advances in technology, rather than having to laboriously synthesize each gene in the lab, we can buy genes for any amino acid sequence for not too much from DNA synthesis companies, and we are lucky to be collaborating with a startup company in Boston called Codon, which can make them for us quite cheaply. Today we ordered genes for 16 potential HIV vaccines, 15 potential new enzymes, and 4 potential new protein-protein complexes. I say potential above because our design calculations are not perfect, and we won't really know if these proteins act as designed until after we get the genes back in a month or so. Then we take advantage of modern molecular biology techniques to put the genes into bacteria, where they direct the cells to make large amounts of the designed proteins. We can then separate the designed proteins from the rest of the stuff in the bacteria using a special tag we include in each of them that provides a good handle. Once we have the purified designed proteins, we can see whether they bind the desired antibodies in the case of the vaccine designs or catalyze the desired reactions in the case of the enzymes. In this way, we will learn about both the strengths and weaknesses of the Rosetta design methodology, and hopefully have created proteins that can have a very positive effect on the world!
As Hugo pointed out, we have not quite gotten the design methodology to the point where we can run it on Rosetta@home, but this should be coming before too long as several people in my group are now focusing on this. Before this, look for protein-protein docking calculations in which we are trying to predict the structures of the complexes between proteins, which mediate many of the basic processes important to life. Chu Wang, a graduate student in the group, is close to having his docking methodology compatible with distributed computing, and we anticipate breakthroughs in this important problem as it also seems largely limited by CPU power.
Currently running on Rosetta@home are the last of the CASP tests on protein structure refinement (see the CASP7 website) and tests of a general approach for estimating how much compute power is necessary to find the lowest energy structure for a sequence. I will describe the basic idea behind this approach in one of my next posts.
Native structures for several of the CASP targets are beginning to be released. The prediction from Rosetta@home for target T0299 is quite accurate! For details and pictures of the predicted structures, see the top predictions page.
Rosetta@home has been updated to version 5.32 -- our first update after CASP7. The new update has two major new features: 1. protein sidechains are shown in the graphics during the Rosetta full-atom simulation stage; 2. a new type of structure prediction task -- protein-protein docking -- is now running on Rosetta@home.