18.05.2017 | Slides | Lecture Recording
- Profile-sequence comparisons are more accurate than sequence-sequence alignments
- Profile-profile alignments gain even more accuracy
Question: How do you build up a family (profile) of sequences?
- Find proteins of similar sequence with BLAST
- Use the found proteins to build a PSSM (profile)
- Use profile-sequence alignment with the calculated PSSM to retrieve more distant family members
- Add the newly found proteins to the family by recalculating the PSSM
- When building up a profile, start with a high threshold (only very similar sequences are taken), so the profile is not corrupted from the beginning
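The iterative procedure above can be sketched in a few lines. This is a minimal illustration, not how BLAST or PSI-BLAST compute their profiles: it assumes ungapped sequences of equal length, a uniform background frequency, and a simple pseudocount. The function names (`build_pssm`, `score_sequence`, `extend_family`) are made up for this example.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(aligned_seqs, pseudocount=1.0):
    """Return one {residue: log-odds score} dict per alignment column."""
    background = 1.0 / len(AMINO_ACIDS)  # assumption: uniform background
    pssm = []
    for col in range(len(aligned_seqs[0])):
        counts = Counter(s[col] for s in aligned_seqs if s[col] in AMINO_ACIDS)
        total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
        column = {}
        for aa in AMINO_ACIDS:
            freq = (counts[aa] + pseudocount) / total
            column[aa] = math.log2(freq / background)
        pssm.append(column)
    return pssm


def score_sequence(pssm, seq):
    """Profile-sequence score for an ungapped sequence of profile length."""
    return sum(column[aa] for column, aa in zip(pssm, seq))


def extend_family(family, candidates, threshold):
    """One iteration: accept candidates above the threshold, rebuild the PSSM."""
    pssm = build_pssm(family)
    accepted = [c for c in candidates
                if c not in family and score_sequence(pssm, c) >= threshold]
    new_family = family + accepted
    return new_family, build_pssm(new_family)


family = ["ACDA", "ACDA", "ACEA"]
family, pssm = extend_family(family, ["ACDG", "WWWW"], threshold=0.0)
print(family)
```

Starting with a high `threshold` keeps unrelated sequences such as `WWWW` out of the first profile; the threshold can be lowered in later iterations once the profile is more informative.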
Sequence uniquely determines structure! ➡ Thus, from a sequence it should be possible to predict 3D structure and function
How would you assess prediction performance?
CASP: Critical Assessment of Structure Prediction
- Held every two years
- Submit predictions, before a deadline, for structures that are about to be determined experimentally
- Compare (after release of experimental structures) how the methods performed
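One score commonly used in CASP to compare a prediction with the experimental structure is GDT_TS: the average fraction of C-alpha atoms that lie within 1, 2, 4 and 8 Å of their experimental position. The sketch below is a simplification that assumes the two structures are already superimposed and have the same residues in the same order; the real score searches over many superpositions. The function name `gdt_ts_like` is hypothetical.

```python
import math


def gdt_ts_like(predicted, experimental, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Average percentage of C-alpha atoms within each distance cutoff.

    Both inputs are lists of (x, y, z) tuples in angstroms, assumed to be
    superimposed already (no superposition search is done here).
    """
    distances = [math.dist(p, e) for p, e in zip(predicted, experimental)]
    fractions = [sum(d <= c for d in distances) / len(distances)
                 for c in cutoffs]
    return 100.0 * sum(fractions) / len(cutoffs)


experimental = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (3, 0, 0)]
predicted = [(0, 0, 0), (1, 1.5, 0), (2, 3, 0), (3, 10, 0)]
print(gdt_ts_like(predicted, experimental))  # 56.25
```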
Current State
- Only Homology Modeling is good
- No general prediction of 3D structure from sequence yet
- BUT: Important improvement in many fields
Different Methods to determine 3D structure
- 90% - X-Ray Crystallography
- 9% - Nuclear Magnetic Resonance Spectroscopy (NMR)
- 1% - Cryo-Electron Microscopy (Cryo-EM)
**X-Ray Crystallography**
- Grow Crystal: Force the protein to grow a crystal
- Observe Diffraction Pattern: Shoot x-rays onto crystal and observe the diffraction pattern
- Compute Electron Density Map
- Fit observations to atomic model
**NMR**
- The protein has to be in a solution similar to its natural environment
- Requires very strong magnets
**Cryo-EM**
- worse resolution than other methods
- cheaper than other methods
- Pushing the boundaries of resolution of Cryo-EM is the future
Question: Which methods to experimentally determine the structure of a protein exist? How much are they used?
Fraction of proteins in the PDB by experimental method:
- 90% - X-Ray Crystallography
- 9% - Nuclear Magnetic Resonance Spectroscopy (NMR)
- 1% - Electron Microscopy (EM)
💡 Idea: Secondary structure is completely explained by hydrogen bond formation.
Helix: Hydrogen bonds between residue i and residue i+4 stabilize the helix.
Sheet: Two strands come together to form a sheet by forming hydrogen bonds between them.
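To make the helix rule concrete, here is a small sketch that marks helical residues from a list of backbone hydrogen bonds. It follows the minimal-helix rule used by DSSP (two consecutive i to i+4 bonds starting at residues i-1 and i make residues i to i+3 helical). The input format, a set of `(i, j)` index pairs meaning the C=O of residue i is bonded to the N-H of residue j, is an assumption of this example.

```python
def assign_helix(n_residues, hbonds):
    """Return a string with 'H' for helical residues and '-' otherwise.

    hbonds: set of (i, j) pairs, C=O of residue i bonded to N-H of residue j.
    """
    # A "turn" starts at i when there is an i -> i+4 hydrogen bond.
    turns = [(i, i + 4) in hbonds for i in range(n_residues)]
    states = ["-"] * n_residues
    for i in range(1, n_residues):
        # Two consecutive turns (at i-1 and i) define a minimal helix.
        if turns[i - 1] and turns[i]:
            for k in range(i, min(i + 4, n_residues)):
                states[k] = "H"
    return "".join(states)


print(assign_helix(10, {(0, 4), (1, 5), (2, 6)}))  # -HHHHH----
```

A single isolated i to i+4 bond is not enough under this rule: it is a turn, not a helix.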
Question: How to get 1D secondary structure from 3D coordinates?
Two methods were used to annotate 3D coordinates:
1) DEFINE, based on geometry (not used anymore)
2) DSSP, based on hydrogen bond pattern (Coulomb energy)
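The Coulomb energy criterion of DSSP can be written down directly. The sketch below uses the published DSSP constants (partial charges 0.42e and 0.20e, conversion factor 332, cutoff of -0.5 kcal/mol); the function names and the coordinate format (one `(x, y, z)` tuple per atom, in Å) are choices made for this example.

```python
import math

# DSSP hydrogen-bond model: partial charges (units of e) and a factor that
# yields kcal/mol when distances are given in angstroms.
Q1, Q2, F = 0.42, 0.20, 332.0
HBOND_CUTOFF = -0.5  # kcal/mol; energies below this count as a hydrogen bond


def hbond_energy(c, o, n, h):
    """Electrostatic energy between a C=O group and an N-H group."""
    r_on = math.dist(o, n)
    r_ch = math.dist(c, h)
    r_oh = math.dist(o, h)
    r_cn = math.dist(c, n)
    return Q1 * Q2 * F * (1 / r_on + 1 / r_ch - 1 / r_oh - 1 / r_cn)


def is_hbond(c, o, n, h, cutoff=HBOND_CUTOFF):
    """True if the C=O ... H-N pair satisfies the DSSP energy criterion."""
    return hbond_energy(c, o, n, h) < cutoff


# Idealised collinear geometry: C=O 1.23 A, O...H 1.9 A, H-N 1.0 A.
c, o, h, n = (0, 0, 0), (1.23, 0, 0), (3.13, 0, 0), (4.13, 0, 0)
print(is_hbond(c, o, n, h))  # True
```

Running this test for every C=O / N-H pair in a structure gives the hydrogen bond pattern from which the 1D secondary structure string is then read off.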
Assumption: Sequence uniquely determines structure; therefore, similar sequences have similar structures.
Question: How does Homology Modeling (Comparative Modeling) work?
Target: Protein to model
Template: Protein to model from
- Identify Template: Query the PDB for similar sequences to your Target
- Align Target / Template: Select the best match as **Template** and assume the Target has the same structure
- Build Model
- Assess Model
- Refine Model
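The template identification step can be illustrated with a small sketch that ranks candidate templates by percentage sequence identity. It assumes the pairwise alignments already exist (for example from a BLAST search against the PDB) and are given as equal-length strings with `-` for gaps; the template identifiers and function names are hypothetical.

```python
def percent_identity(aligned_target, aligned_template):
    """Percentage of identical residues over aligned (non-gap) columns."""
    pairs = [(a, b) for a, b in zip(aligned_target, aligned_template)
             if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    return 100.0 * sum(a == b for a, b in pairs) / len(pairs)


def pick_template(alignments):
    """alignments: {template_id: (aligned_target, aligned_template)}."""
    return max(alignments, key=lambda t: percent_identity(*alignments[t]))


alignments = {
    "tmplA": ("ACDEFG", "ACDEYG"),  # 5 of 6 aligned columns identical
    "tmplB": ("ACDEFG", "AC--WW"),  # 2 of 4 aligned columns identical
}
print(pick_template(alignments))  # tmplA
```

Real pipelines also weigh alignment coverage and template quality, not identity alone.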
Question: Which tradeoff does comparative modeling face? What are the limiting factors based on PSI (Percentage Sequence Identity)?
Tradeoff: Accuracy vs Coverage
Limiting factor in homology modeling:
75% - 100% - Speed of Modeling
50% - 75% - Quality of Model
25% - 50% - Alignment Accuracy
0% - 25% - Detection of Homology
**Summary:** lots of bells and whistles, downloadable, very accurate
Constraint Satisfaction: use a set of objective functions to check whether the model is plausible
- $$C_{\alpha} - C_{\alpha}$$ distance
- Molecular dynamics
- Langevin dynamics
- Rigid bodies
- Rigid molecular dynamics
- ...
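As an illustration of a distance-based objective function, the sketch below sums squared violations of C-alpha to C-alpha distance restraints. The restraint format `(i, j, target_distance)` and the plain quadratic penalty are assumptions made for this example, not the functional form of any particular modeling program.

```python
import math


def distance_restraint_penalty(ca_coords, restraints):
    """Sum of squared deviations from the target C-alpha distances.

    ca_coords: list of (x, y, z) tuples in angstroms, one per residue.
    restraints: list of (i, j, target_distance) tuples.
    """
    penalty = 0.0
    for i, j, target in restraints:
        deviation = math.dist(ca_coords[i], ca_coords[j]) - target
        penalty += deviation ** 2
    return penalty


# Hypothetical restraints, e.g. distances copied from a template structure.
restraints = [(0, 1, 3.8), (1, 2, 3.8), (0, 2, 7.6)]
good_model = [(0, 0, 0), (3.8, 0, 0), (7.6, 0, 0)]
bad_model = [(0, 0, 0), (4.8, 0, 0), (7.6, 0, 0)]
print(distance_restraint_penalty(good_model, restraints))
print(distance_restraint_penalty(bad_model, restraints))
```

A model that satisfies all restraints scores zero; the second model violates two of them by 1 Å each and is penalised accordingly.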
Optimization Steps (run repeatedly)
- explore different local minima
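Why repeated optimization helps can be seen on a toy problem. The sketch below runs plain gradient descent from several starting points on a one-dimensional function with two minima and keeps the lowest result. The energy function, step size and starting points are invented for illustration and have nothing to do with a real protein energy landscape.

```python
def gradient_descent(grad, x0, step=0.01, n_steps=2000):
    """Follow the negative gradient from x0 and return the end point."""
    x = x0
    for _ in range(n_steps):
        x -= step * grad(x)
    return x


def multi_start_minimize(f, grad, starts):
    """Run one descent per starting point and keep the lowest minimum."""
    minima = [gradient_descent(grad, x0) for x0 in starts]
    return min(minima, key=f)


def toy_energy(x):
    # Two minima: a shallow one near x = 1.13, a deeper one near x = -1.30.
    return x ** 4 - 3 * x ** 2 + x


def toy_gradient(x):
    return 4 * x ** 3 - 6 * x + 1


print(gradient_descent(toy_gradient, 2.0))  # ends in the shallow minimum
print(multi_start_minimize(toy_energy, toy_gradient, [-2.0, 0.5, 2.0]))
```

A single run started at 2.0 gets stuck in the shallow minimum; only the run started at -2.0 reaches the deeper one, which is why several starts are compared.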
Typical Errors
- side chain packing
- misalignment
- wrong template
Pick the right solution:
- DOPE score (Discrete Optimized Protein Energy)
- based on knowledge-based pair potentials
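The general idea behind a knowledge-based pair potential can be sketched with the inverse Boltzmann relation: distances seen more often in known structures than in a reference distribution get a negative (favourable) energy. This is a toy version with invented counts and a single atom-pair type; it does not reproduce the reference state or parameters of DOPE.

```python
import math


def pair_potential(observed_counts, reference_counts, kT=1.0):
    """Energy per distance bin: -kT * ln(P_observed / P_reference)."""
    if any(c <= 0 for c in observed_counts + reference_counts):
        raise ValueError("all bin counts must be positive in this sketch")
    obs_total = sum(observed_counts)
    ref_total = sum(reference_counts)
    return [-kT * math.log((o / obs_total) / (r / ref_total))
            for o, r in zip(observed_counts, reference_counts)]


def score_model(pair_distances, energies, bin_width=1.0):
    """Sum bin energies over all pair distances; beyond the last bin adds 0."""
    total = 0.0
    for d in pair_distances:
        index = int(d // bin_width)
        if index < len(energies):
            total += energies[index]
    return total


energies = pair_potential([1, 2, 1], [1, 1, 2])  # invented counts
models = {"model_a": [0.5, 1.5], "model_b": [2.5, 2.5]}
best = min(models, key=lambda m: score_model(models[m], energies))
print(best)  # model_a
```

As with DOPE, the model with the lowest score is the one picked.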
Question: How to handle a missing loop in comparative modeling?
- One way would be to find similar loops and compute the average over them.
- Another solution would be to apply molecular dynamics to the loop sequence (only for short loops).
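The first option, averaging over similar loops, can be sketched as follows. The example assumes that the candidate loops have the same number of residues and have already been superimposed onto the anchor residues on both sides of the gap; the function name is hypothetical.

```python
def average_loop(loops):
    """Average the coordinates of several loops position by position.

    loops: list of loops, each a list of (x, y, z) tuples of equal length,
    assumed to be superimposed on the same anchor residues.
    """
    length = len(loops[0])
    if any(len(loop) != length for loop in loops):
        raise ValueError("all loops must have the same number of residues")
    n = len(loops)
    return [tuple(sum(loop[i][axis] for loop in loops) / n
                  for axis in range(3))
            for i in range(length)]


loops = [
    [(0, 0, 0), (1, 0, 0)],
    [(0, 2, 0), (1, 2, 2)],
]
print(average_loop(loops))  # [(0.0, 1.0, 0.0), (1.0, 1.0, 1.0)]
```

Averaged coordinates can have distorted bond geometry, so the result would normally still go through the refinement step.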
**Summary:** automated, increasingly comprehensive and flexible
Underlying 'Philosophy'
- fully automated
- for non-expert users / experimental biologists
- do less, make fewer mistakes
Original
- alignment by BLAST / PSI-BLAST
- copy the coordinates
- end
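The original "align, then copy" idea fits in a few lines. The sketch below assumes a finished pairwise alignment (equal-length strings, `-` for gaps) and one coordinate per template residue; the names and data are invented for illustration.

```python
def copy_coordinates(aligned_target, aligned_template, template_coords):
    """Map target residue index -> coordinate copied from the template.

    Target residues aligned to a gap get no coordinate and are left out;
    they would have to be built separately (e.g. as loops).
    """
    model = {}
    target_index = 0
    template_index = 0
    for t, p in zip(aligned_target, aligned_template):
        if t != "-" and p != "-":
            model[target_index] = template_coords[template_index]
        if t != "-":
            target_index += 1
        if p != "-":
            template_index += 1
    return model


template_coords = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (3, 0, 0)]
model = copy_coordinates("AC-DE", "ACWD-", template_coords)
print(model)  # {0: (0, 0, 0), 1: (1, 0, 0), 2: (3, 0, 0)}
```

Target residue 3 has no template partner in this alignment, so it is missing from the model.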
Today: More complicated ...