Log in on the mdstud-system. Open a connection to bio.lundberg.gu.se.
HMMER 2.1 is installed on the bio-server. On-line documentation for HMMER 2.1 can be found at http://hmmer.wustl.edu/.
A new "Lab5" account is in your home directory (at lundberg).
In this exercise, you will study a group of sequences belonging to the Rel-homology family. They all have a similar 3D structure and similar function.
A second use of HMMER is to look for known domains in a query sequence by searching a single sequence against a library of HMMs (in contrast to the previous section, in which we searched a single HMM against a sequence database). To do this, you need a library of profile HMMs. In this case, we will construct the database ourselves. Larger databases, such as the PFAM database, are available for download.
HMM databases are simply concatenated single HMM files. You can build them either by invoking the -A "append" option of hmmbuild, or by concatenating HMM files you've already built.
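As a minimal sketch of why plain concatenation is enough, the snippet below fuses three stand-in text files (not real HMMs) into one library and counts the entries. It assumes that each model in a HMMER 2 save file ends with a line containing only //; count_models and the temporary file names are hypothetical helpers used only for illustration.

```shell
# count_models FILE: print how many models FILE holds, assuming each
# model record ends with a line that contains only "//".
count_models() {
    grep -c '^//$' "$1"
}

# Stand-in files (NOT real HMMs) so the sketch runs without HMMER.
tmpdir=$(mktemp -d)
for name in rrm fn3 pkinase; do
    printf 'NAME  %s\n//\n' "$name" > "$tmpdir/$name.hmm"
done

# A library is just the single files written one after another.
cat "$tmpdir/rrm.hmm" "$tmpdir/fn3.hmm" "$tmpdir/pkinase.hmm" > "$tmpdir/myhmms"

n_models=$(count_models "$tmpdir/myhmms")
echo "models in library: $n_models"

# List the model names in the order they were concatenated.
grep '^NAME' "$tmpdir/myhmms"

rm -r "$tmpdir"
```

Once you have built the real library, running count_models on myhmms is a quick way to check that all three profiles ended up in it.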
Search PFAM (as in the previous exercise) for the following
entries and download the seed alignment of each one in
msf-format:
PF00041 (fn3 domain)
PF00076 (rrm domain)
PF00069 (pkinase domain)
Don't be confused by the files already in your Lab5 directory: the "rrm.slx", "fn3.slx" and "pkinase.slx" files do not work (last-minute discovery)!
Use either of the two alternatives below:
The first alternative is to build three
profiles with hmmbuild, just as in the previous section. You should then be able to fuse
them into a new file, for example myhmms, with cat (assuming
you called the profiles rrm.hmm, fn3.hmm and
pkinase.hmm):
bio> cat rrm.hmm fn3.hmm pkinase.hmm > myhmms
Calibrate the fused file just as before with
hmmcalibrate.
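The whole first alternative can be sketched as a small script. This is only an illustration: it assumes the seed alignments were saved as rrm.msf, fn3.msf and pkinase.msf in the current directory, and build_library is a hypothetical helper name. The script does nothing if hmmbuild or the alignments are missing.

```shell
# Sketch of the first alternative: build, concatenate, calibrate.
# Assumed input files: rrm.msf, fn3.msf, pkinase.msf (current directory).
build_library() {
    for name in rrm fn3 pkinase; do
        # hmmbuild <output HMM file> <alignment file>
        hmmbuild "$name.hmm" "$name.msf" || return 1
    done
    # Fuse the three single-profile files into one library.
    cat rrm.hmm fn3.hmm pkinase.hmm > myhmms || return 1
    # Calibrate every model in the fused file.
    hmmcalibrate myhmms
}

# Only run where HMMER and the alignments are actually present.
if command -v hmmbuild >/dev/null 2>&1 && [ -f rrm.msf ]; then
    build_library
    status=$?
else
    echo "hmmbuild or rrm.msf not found; nothing was built"
    status=skipped
fi
```

Stopping at the first failed hmmbuild keeps a half-built library from being calibrated and searched by mistake.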
The second alternative
is to use hmmbuild three
times with the append option -A:
bio> hmmbuild -A myhmms rrm.msf
bio> hmmbuild -A myhmms fn3.msf
bio> hmmbuild -A myhmms pkinase.msf
Then, calibrate myhmms with
hmmcalibrate.
Note that hmmcalibrate can be run on HMM databases as well as single HMMs.
Now that you have a small HMM database called
myhmms, let us use it to analyze the Drosophila Sevenless sequence,
7LES_DROME (in Lab5/):
bio> hmmpfam myhmms 7LES_DROME
Does the sequence seem to belong to any of the protein families
of the library?
Try to find out whether the results from HMMER are better than
those of other methods in the case of the Rel-homology proteins. As a
starting point, imagine that you do not know the homologs of your
protein sequence, Swissprot id o96458. Download this
sequence in fasta format from http://www.expasy.ch. This is actually one of
the proteins in the alignment file of the rel-domain from pfam (so we do in fact know
homologs to it, i.e. the other proteins in the alignment).
Do an ordinary BLAST (command line or on the web) with o96458 against the PDB.
See Exercise 1 for syntax of the BLAST (blastall) program!
Use the same sequence to search the PDB with psi-BLAST (command line or on the web).
See Exercise 3 for the syntax of the psi-BLAST program (i.e. blastpgp). Set the number of rounds to 6 with the option "-j 6". Psi-BLAST also uses profiles (but not HMMs) to search a database iteratively until nothing new is found (i.e. until convergence is reached).
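For the command-line route, the two searches might look roughly like the sketch below. The query file name o96458.fasta, the output file names and the database name pdb are all assumptions made for illustration; use the database name given in Exercises 1 and 3. The commands are skipped if blastall or the query file is not available.

```shell
# Sketch only: the database name and file names below are assumptions.
query=o96458.fasta   # assumed name of the downloaded FASTA file
db=pdb               # hypothetical database name; use the one from Exercise 1

if command -v blastall >/dev/null 2>&1 && [ -f "$query" ]; then
    # Ordinary protein-protein BLAST against the database
    blastall -p blastp -d "$db" -i "$query" -o o96458.blastp.out
    # PSI-BLAST with at most 6 rounds (it stops earlier on convergence)
    blastpgp -d "$db" -i "$query" -j 6 -o o96458.psiblast.out
    blast_status=ran
else
    echo "blastall or $query not found; commands not run"
    blast_status=skipped
fi
```

Writing each result to its own file makes it easier to compare the hit lists side by side with the HMMER output afterwards.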
Compare the results you got from HMMER with the ones from BLAST and psi-BLAST. Can you see any differences? Does any of the methods seem to be more efficient in finding Rel-homologs?
Write a summary of what you have done in this exercise and the conclusions that you have drawn, and hand it in no later than next Friday!
The basic strengths of profile HMMs are (even if you did not find any evidence whatsoever...):