Lecture 14: Protein Structure Prediction



CECS 694-02 Introduction to Bioinformatics University of Louisville   Spring 2004 Dr. Eric Rouchka
Review of Proteins
• Proteins: polypeptides with a three-dimensional
  structure

• Primary structure – sequence of amino
  acids constituting polypeptide chain

• Secondary structure – local organization of
  polypeptide chain into secondary structures
  such as α helices and β sheets

Review of Proteins
• Tertiary structure – three-dimensional
  arrangement of amino acids as they react to
  one another due to polarity and interactions
  between side chains

• Quaternary structure – Interaction of several
  protein subunits



Protein Structure
• Proteins: chains of amino acids joined by
  peptide bonds

• Amino Acids:
  – Polar (separate positively and negatively
    charged regions)
  – free C=O group (carbonyl, part of the carboxyl
    group), can act as hydrogen bond acceptor
  – free NH group (amino), can act as hydrogen
    bond donor


Protein Structure
• Many conformations possible due to
  rotation around the bonds to the alpha
  carbon (Cα) atom

• Conformational changes lead to
  differences in the three-dimensional
  structure of the protein


Protein Structure
• Polypeptide chain has pattern of N-Cα-C
  repeated

• Rotation angle about the bond between the
  amino nitrogen and Cα is the phi (φ) angle;
  rotation angle about the bond between Cα and
  the carboxyl carbon is the psi (ψ) angle
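
The φ and ψ angles can be computed as torsion angles over four consecutive backbone atoms: C(i-1), N(i), Cα(i), C(i) for φ, and N(i), Cα(i), C(i), N(i+1) for ψ. The sketch below is a minimal pure-Python version; the function name `dihedral` is an illustrative choice, and the sample coordinates are the Val 1 and Leu 2 backbone atoms from the partial PDB listing shown later in these slides.

```python
import math

def dihedral(p0, p1, p2, p3):
    """Torsion angle in degrees about the p1-p2 bond, in the range -180..180."""
    def sub(a, b):
        return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

    def dot(a, b):
        return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

    def cross(a, b):
        return (a[1] * b[2] - a[2] * b[1],
                a[2] * b[0] - a[0] * b[2],
                a[0] * b[1] - a[1] * b[0])

    b0 = sub(p1, p0)
    b1 = sub(p2, p1)
    b2 = sub(p3, p2)
    n1 = cross(b0, b1)            # normal to the plane p0, p1, p2
    n2 = cross(b1, b2)            # normal to the plane p1, p2, p3
    b1_len = math.sqrt(dot(b1, b1))
    b1_hat = tuple(c / b1_len for c in b1)
    m1 = cross(b1_hat, n1)        # completes an orthogonal frame with n1 and b1
    return math.degrees(math.atan2(dot(m1, n2), dot(n1, n2)))

# psi of Val 1: N(1), CA(1), C(1), N(2), coordinates in angstroms
n_val = (6.452, 16.459, 4.843)
ca_val = (7.060, 17.792, 4.760)
c_val = (8.561, 17.703, 5.038)
n_leu = (9.333, 18.209, 4.095)
print(round(dihedral(n_val, ca_val, c_val, n_leu), 1))
```
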



Differences between Amino Acids
• The 20 amino acids differ in their R
  side chains

• Amino acids can be separated based on the
  chemical properties of the side chains:
  – Hydrophobic
  – Charged
  – Polar



Differences between Amino Acids
• Hydrophobic: Alanine (A), Valine (V),
  Phenylalanine (F), Proline (P), Methionine
  (M), Isoleucine (I), and Leucine (L)

• Charged: Aspartic acid (D), Glutamic acid
  (E), Lysine (K), Arginine (R)

• Polar: Serine (S), Threonine (T), Tyrosine (Y),
  Histidine (H), Cysteine (C), Asparagine (N),
  Glutamine (Q), Tryptophan (W)
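
The grouping above can be written as a lookup table. This is a small sketch, assuming the standard one-letter codes (F for phenylalanine); the names `SIDE_CHAIN_CLASS` and `classify` are illustrative. Glycine (G) does not appear in any of the three lists on the slide, so it is reported as "unclassified" here rather than assigned to a group.

```python
# Side-chain classes as listed on the slide, keyed by one-letter code.
GROUPS = {
    "hydrophobic": "AVFPMIL",
    "charged": "DEKR",
    "polar": "STYHCNQW",
}

SIDE_CHAIN_CLASS = {
    letter: group for group, letters in GROUPS.items() for letter in letters
}

def classify(sequence):
    """Return the side-chain class of every residue in a one-letter sequence."""
    return [SIDE_CHAIN_CLASS.get(aa, "unclassified") for aa in sequence.upper()]

print(classify("VLKS"))
```
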
Secondary Structure




•   Image source: http://www.ebi.ac.uk/microarray/biology_intro.html
Secondary Structures
• Core of each protein made up of regular
  secondary structures

• Regular patterns of hydrogen bonds are
  formed between neighboring amino acids

• Amino acids in secondary structures have
  similar φ and ψ angles


Secondary Structures
• Structures act to neutralize the polar groups
  on each amino acid

• Secondary structures tightly packed in the
  protein core, in a hydrophobic environment

• Each amino acid side group has a limited
  space to occupy -- therefore a limited number
  of possible interactions

Types of Secondary Structures
•   α Helices
•   β Sheets
•   Loops
•   Coils




α Helix
• Most abundant secondary structure

• 3.6 amino acids per turn

• Hydrogen bond formed between every fourth
  residue

• Average length: 10 amino acids, or 3 turns

• Varies from 5 to 40 amino acids

Image source: http://www.hhmi.princeton.edu/sw/2002/psidelsk/scavengerhunt.htm; http://www4.ocn.ne.jp/~bio/biology/protein.htm
α Helix
• Normally found on the surface of protein
  cores

• Interact with aqueous environment
  – Inner facing side has hydrophobic amino
    acids
  – Outer-facing side has hydrophilic amino
    acids

α Helix
• Every third or fourth amino acid tends to be
  hydrophobic

• Pattern can be detected computationally

• Rich in alanine (A), glutamic acid (E), leucine
  (L), and methionine (M)

• Poor in proline (P), glycine (G), tyrosine (Y),
  and serine (S)
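
One way to detect the hydrophobic periodicity computationally is to place each residue on a helical wheel and sum the directions of the hydrophobic ones. The sketch below assumes 100 degrees per residue (360 / 3.6 residues per turn) and uses the hydrophobic set from the earlier slide as a simple yes/no scale; `periodicity_score` is an illustrative name, not a published program. A high score at 100 degrees suggests one hydrophobic face, as expected for a surface helix.

```python
import math

HYDROPHOBIC = set("AVFPMIL")  # hydrophobic residues as grouped on the earlier slide

def periodicity_score(sequence, degrees_per_residue=100.0):
    """Length-normalised magnitude of the summed hydrophobic direction vectors."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    sum_x = sum_y = 0.0
    for index, aa in enumerate(sequence.upper()):
        if aa in HYDROPHOBIC:
            angle = math.radians(index * degrees_per_residue)
            sum_x += math.cos(angle)
            sum_y += math.sin(angle)
    return math.hypot(sum_x, sum_y) / len(sequence)

helix_like = "LKKLLKKLLKKL"    # hydrophobic at i, i+3, i+4, i+7, ...
strand_like = "LKLKLKLKLKLK"   # strictly alternating pattern
for seq in (helix_like, strand_like):
    print(seq, round(periodicity_score(seq, 100.0), 3),
          round(periodicity_score(seq, 180.0), 3))
```

Comparing the score at 100 degrees with the score at 180 degrees (an alternating pattern) separates the two toy sequences.
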
β Sheet




     Image source: http://broccoli.mfn.ki.se/pps_course_96/ss_960723_12.html;
                    http://www4.ocn.ne.jp/~bio/biology/protein.htm

β Sheet
• Hydrogen bonds between 5-10
  consecutive amino acids in one portion
  of the chain with another 5-10 farther
  down the chain

• Interacting regions may be adjacent
  with a short loop, or far apart with other
  structures in between

β Sheet
• Directions:
  – Same: Parallel Sheet
  – Opposite: Anti-parallel Sheet
  – Mixed: Mixed Sheet

• Pattern of hydrogen bond formation in
  parallel and anti-parallel sheets is
  different

β Sheet
• Slight counterclockwise rotation

• Alpha carbons (as well as R side
  groups) alternate above and below the
  sheet

• Prediction difficult, due to wide range of
  φ and ψ angles

Interactions in Helices and Sheets




Loop
• Regions between α helices and β
  sheets

• Various lengths and three-dimensional
  configurations

• Located on surface of the structure

Loop
• Hairpin loops: complete turn in the
  polypeptide chain (as in anti-parallel β sheets)

• More variable sequence structure

• Tend to have charged and polar amino acids

• Frequently a component of active sites

Coil
• Region of secondary structure that is
  not a helix, sheet, or loop




6 Classes of Protein Structure
1) Class α: bundles of α helices connected by
  loops on surface of proteins

2) Class β: antiparallel β sheets, usually two
  sheets in close contact forming sandwich

3) Class α/β: mainly parallel β sheets with
  intervening α helices; may also have mixed β
  sheets (metabolic enzymes)

6 Classes of Protein Structure
4) Class α+β: mainly segregated α helices and
   antiparallel β sheets

5) Multidomain (α and β) proteins: more than
   one of the above four domain classes

6) Membrane and cell-surface proteins and
   peptides excluding proteins of the immune
   system

α Class Protein (hemoglobin)




•   http://www.rcsb.org/pdb/cgi/explore.cgi?job=graphics;pdbId=3hhb;page=;pid=&opt=show&size=250


β Class Protein (T-Cell CD8)




•   http://www.rcsb.org/pdb/cgi/explore.cgi?job=graphics;pdbId=1cd8;page=;pid=&opt=show&size=500


α/β Class Protein (tryptophan synthase)




•   http://www.rcsb.org/pdb/cgi/explore.cgi?job=graphics;pdbId=2wsy;page=;pid=&opt=show&size=500


α+β Class Protein (1RNB)




•   http://www.rcsb.org/pdb/cgi/explore.cgi?job=graphics;pdbId=1rnb;page=;pid=&opt=show&size=500


Membrane Protein (1OPF)




•   http://www.rcsb.org/pdb/cgi/explore.cgi?job=graphics;pdbId=1opf;page=;pid=&opt=show&size=500


Protein Structure Databases
• Databases of three dimensional structures of
  proteins, where structure has been solved
  using X-ray crystallography or nuclear
  magnetic resonance (NMR) techniques

• Protein Databases:
  –    PDB
  –    SCOP
  –    Swiss-Prot
  –    PIR

Protein Structure Databases
• Most extensive for 3-D structure is the
  Protein Data Bank (PDB)

• Current release of PDB (April 8, 2003)
  has 20,622 structures




Partial PDB File
ATOM    1   N     VAL    A     1            6.452        16.459         4.843       7.00     47.38           3HHB   162
ATOM    2   CA    VAL    A     1            7.060        17.792         4.760       6.00     48.47           3HHB   163
ATOM    3   C     VAL    A     1            8.561        17.703         5.038       6.00     37.13           3HHB   164
ATOM    4   O     VAL    A     1            8.992        17.182         6.072       8.00     36.25           3HHB   165
ATOM    5   CB    VAL    A     1            6.342        18.738         5.727       6.00     55.13           3HHB   166
ATOM    6   CG1   VAL    A     1            7.114        20.033         5.993       6.00     54.30           3HHB   167
ATOM    7   CG2   VAL    A     1            4.924        19.032         5.232       6.00     64.75           3HHB   168
ATOM    8   N     LEU    A     2            9.333        18.209         4.095       7.00     30.18           3HHB   169
ATOM    9   CA    LEU    A     2           10.785        18.159         4.237       6.00     35.60           3HHB   170
ATOM   10   C     LEU    A     2           11.247        19.305         5.133       6.00     35.47           3HHB   171
ATOM   11   O     LEU    A     2           11.017        20.477         4.819       8.00     37.64           3HHB   172
ATOM   12   CB    LEU    A     2           11.451        18.286         2.866       6.00     35.22           3HHB   173
ATOM   13   CG    LEU    A     2           11.081        17.137         1.927       6.00     31.04           3HHB   174
ATOM   14   CD1   LEU    A     2           11.766        17.306          .570       6.00     39.08           3HHB   175
ATOM   15   CD2   LEU    A     2           11.427        15.778         2.539       6.00     38.96           3HHB   176




Description of PDB File
• second column: atom serial number; sixth
  column: amino acid position in the
  polypeptide chain

• fourth column: current amino acid

• Columns 7, 8, and 9: x, y, and z coordinates
  (in angstroms)

• The 11th column: temperature factor -- can be
  used as a measurement of uncertainty
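
The columns can be read programmatically. The sketch below splits a line of the listing above on whitespace, which works for this excerpt; real PDB files use fixed character columns, so a production parser should slice by position instead. The `Atom` type, its field names, and `parse_atom_line` are illustrative labels, not part of any library.

```python
from typing import NamedTuple

class Atom(NamedTuple):
    serial: int                # column 2
    name: str                  # column 3, e.g. N, CA, C, O
    residue: str               # column 4, three-letter amino acid
    chain: str                 # column 5
    position: int              # column 6, position in the polypeptide chain
    x: float                   # columns 7-9, angstroms
    y: float
    z: float
    temperature_factor: float  # column 11

def parse_atom_line(line):
    """Parse one whitespace-separated ATOM line from the listing."""
    fields = line.split()
    if len(fields) < 11 or fields[0] != "ATOM":
        raise ValueError("not an ATOM line: %r" % line)
    return Atom(int(fields[1]), fields[2], fields[3], fields[4], int(fields[5]),
                float(fields[6]), float(fields[7]), float(fields[8]),
                float(fields[10]))

line = ("ATOM    2   CA    VAL    A     1            7.060        17.792"
        "         4.760       6.00     48.47           3HHB   163")
print(parse_atom_line(line))
```
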
Protein Structure Classification Databases
• Structural Classification of Proteins
  (SCOP)

• based on expert definition of structural
  similarities

• SCOP classifies hierarchically by class, fold,
  superfamily, and family

• http://scop.mrc-lmb.cam.ac.uk/scop/
Protein Structure Classification Databases
• Classification by class, architecture,
  topology, and homology (CATH)

• Classifies proteins into hierarchical levels by
  class

• α/β and α+β are considered to be a single
  class

• http://www.biochem.ucl.ac.uk/bsm/cath/
Protein Structure Classification Databases
• Molecular Modeling Database (MMDB)

• structures from PDB categorized into
  structurally related groups using the VAST
  (Vector Alignment Search Tool) program
• looks for similar arrangements of secondary
  structural elements

• http://www.ncbi.nlm.nih.gov/Entrez

Protein Structure Classification Databases
• Spatial Arrangement of Backbone
  Fragments (SARF)

• categorized on structural similarities,
  similar to the MMDB

• http://www-lmmb.ncifcrf.gov/~nicka/sarf2.html


Visualization of Proteins
• A number of programs convert atomic
  coordinates of 3-d structures into views of the
  molecule

• allow the user to manipulate the molecule by
  rotation, zooming, etc.

• Critical in drug design -- yields insight into
  how the protein might interact with ligands at
  active sites
Visualization of Proteins
• Most popular program for viewing
  three-dimensional structures is Rasmol

Rasmol: http://www.umass.edu/microbio/rasmol/
Chime: http://www.umass.edu/microbio/chime/
Cn3D: http://www.ncbi.nlm.nih.gov/Structure/
Mage: http://kinemage.biochem.duke.edu/website/kinhome.html
Swiss 3D viewer: http://www.expasy.ch/spdbv/mainpage.html




Alignment of Protein Structure
• Three-dimensional structure of one protein
  compared against three-dimensional
  structure of second protein

• Atoms fit together as closely as possible to
  minimize the average deviation

• Structural similarity between proteins does
  not necessarily mean evolutionary
  relationship
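
A common way to express the deviation being minimized is the root-mean-square deviation (RMSD) over paired atoms. The sketch below assumes RMSD is the measure of interest, that the two coordinate lists are already paired atom by atom, and that the structures have already been superimposed; it does not perform the rotation and translation search itself. The function name `rmsd` is illustrative.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally long lists of (x, y, z)."""
    if not coords_a or len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))

structure_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
structure_b = [(0.0, 0.5, 0.0), (3.8, -0.5, 0.0), (7.6, 0.5, 0.0)]
print(rmsd(structure_a, structure_b))
```
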
Alignment of Protein Structure
• Positions of atoms in three-dimensional
  structures compared

• Look for positions of secondary
  structural elements (helices and
  strands) within a protein domain



Alignment of Protein Structure
• Distances between carbon atoms
  examined to determine the degree to which
  structures may be superimposed

• Side chain information can be
  incorporated
  – Buried; visible


SSAP
• Sequential Structure Alignment
  Program

• Incorporates double dynamic
  programming to produce a structural
  alignment between two proteins



Steps in SSAP
• 1)    Calculate vectors from Cβ of one amino
  acid to set of nearby amino acids
  – Vectors from two separate proteins compared
  – Difference (expressed as an angle) calculated,
    and converted to score


• 2)   Matrix for scores of vector differences
  from one protein to the next is computed.


Steps in SSAP
• 3) Optimal alignment found using
  global dynamic programming, with a
  constant gap penalty

• 4) Next amino acid residue
  considered, optimal path to align this
  amino acid to the second sequence
  computed

Steps in SSAP
• 5) Alignments transferred to
  summary matrix
  – If paths cross same matrix position, scores
    are summed
  – If part of alignment path found in both
    matrices, evidence of similarity




Steps in SSAP
• 6) Dynamic programming alignment
  is performed for the summary matrix
  – Final alignment represents optimal
    alignment between the protein structures
  – Resulting score converted so it can be
    compared to see how closely related two
    structures are
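
The dynamic programming used in steps 3 and 6 can be sketched as a global alignment over a precomputed score matrix. This is not the full double dynamic programming of SSAP: it shows only a single pass, with the score matrix taken as given and a constant penalty per gapped position. The function name `global_align` and the gap value of -1.0 are assumptions made for illustration.

```python
def global_align(score, gap=-1.0):
    """Global alignment over score[i][j] (residue i of A vs residue j of B).

    Returns the best total score and the list of aligned (i, j) pairs.
    """
    n, m = len(score), len(score[0])
    best = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        best[i][0] = i * gap
    for j in range(1, m + 1):
        best[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best[i][j] = max(best[i - 1][j - 1] + score[i - 1][j - 1],  # align i with j
                             best[i - 1][j] + gap,                      # gap in B
                             best[i][j - 1] + gap)                      # gap in A
    # Trace back from the bottom-right corner to recover the aligned pairs.
    pairs = []
    i, j = n, m
    while i > 0 and j > 0:
        if best[i][j] == best[i - 1][j - 1] + score[i - 1][j - 1]:
            pairs.append((i - 1, j - 1))
            i -= 1
            j -= 1
        elif best[i][j] == best[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return best[n][m], pairs

# Toy score matrix: residue 1 of protein A has no good partner in protein B.
scores = [[5, 0],
          [0, 0],
          [0, 5]]
print(global_align(scores))
```

In SSAP terms, a pass like this would be run once per residue pair on the vector-difference scores, and once more on the summary matrix.
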



Distance Matrix Approach
• Uses graphical procedure similar to dot
  plots

• Identifies atoms that lie most closely
  together in three-dimensional structure

• Two sequences with similar structure
  can have dot plots superimposed

Distance Matrix Approach
• Values in distance matrix represent distance
  between the Cα atoms in the three
  dimensional structure

• positions of closest packing atoms marked
  with a dot to highlight regions of interest

• Similar groups superimposed as closely as
  possible by minimizing sum of atomic
  distances
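
A distance matrix of this kind is straightforward to build from Cα coordinates. In the sketch below, the 8 angstrom cutoff for marking a "dot" and the minimum sequence separation of 3 are assumptions chosen for illustration, not values from the lecture; `distance_matrix` and `contact_dots` are illustrative names. It requires Python 3.8 or later for `math.dist`.

```python
import math

def distance_matrix(ca_coords):
    """Pairwise distances (angstroms) between Cα atoms."""
    return [[math.dist(a, b) for b in ca_coords] for a in ca_coords]

def contact_dots(matrix, cutoff=8.0, min_separation=3):
    """Positions (i, j), i < j, whose Cα atoms lie within the cutoff."""
    return [(i, j)
            for i, row in enumerate(matrix)
            for j, distance in enumerate(row)
            if j - i >= min_separation and distance <= cutoff]

# Cα atoms of Val 1 and Leu 2 from the partial PDB listing.
ca = [(7.060, 17.792, 4.760), (10.785, 18.159, 4.237)]
print(round(distance_matrix(ca)[0][1], 2))
```

Two structures can then be compared by overlaying their dot positions, as with sequence dot plots.
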
DALI
• Distance Alignment Tool (DALI)

• Uses distance matrix method to align protein
  structures

• Assembly step uses Monte Carlo simulation
  to find submatrices that can be aligned

• Existing structures that have been compared
  are organized into the FSSP database
Fast Structural Similarity Search
• Compare types and arrangements of
  secondary structures within two proteins

• If elements similarly arranged,
  three-dimensional structures are similar

• VAST and SARF are programs that use
  these fast methods

Structural Motifs Based on Sequence Analysis
• Some structural elements can be
  determined by looking at sequence
  composition
  – zinc finger motifs
  – leucine zippers
  – coiled-coil structures




Zinc Finger Motifs
• Found by looking at order and spacing of
  cysteine and histidine residues

• Typical zinc finger motifs are composed of two
  cysteines followed by two histidines

Image source: www.bmb.psu.edu/faculty/tan/lab/tanlab_gallery_protdna.html
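
Searching for the order and spacing of cysteines and histidines maps naturally onto a regular expression. The spacing used below, C-x(2,4)-C-x(12)-H-x(3,5)-H, is a simplified assumption in the style of the C2H2 consensus and is not taken from the lecture; curated motif databases define stricter patterns. The toy sequence is invented for the example.

```python
import re

# Simplified, assumed spacing: two cysteines, a 12-residue spacer, two histidines.
ZINC_FINGER = re.compile(r"C.{2,4}C.{12}H.{3,5}H")

def find_zinc_fingers(sequence):
    """Return (start, matched_text) for each non-overlapping candidate motif."""
    return [(m.start(), m.group()) for m in ZINC_FINGER.finditer(sequence.upper())]

toy_sequence = "MKCPECGKSFSQKSNLQKHQRTHTG"
print(find_zinc_fingers(toy_sequence))
```
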




Leucine Zippers
• Found by looking for two antiparallel alpha
  helices held together

• Interactions between hydrophobic leucine
  residues found every seventh position in helix

Image source: ww2.mcgill.ca/biology/undergra/c200a/sec3-5.htm
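
The "leucine at every seventh position" rule can also be screened for with a regular expression. The sketch below requires four leucines spaced seven residues apart; the count of four is an assumption for illustration, and a hit is only a candidate, since the pattern says nothing about whether the region is actually helical. A lookahead is used so that overlapping candidates are all reported.

```python
import re

# Four leucines, each seven positions after the previous one (assumed minimum).
HEPTAD_LEUCINES = re.compile(r"(?=(L.{6}L.{6}L.{6}L))")

def find_leucine_repeats(sequence):
    """Return (start, matched_text) for every candidate, including overlaps."""
    return [(m.start(), m.group(1)) for m in HEPTAD_LEUCINES.finditer(sequence.upper())]

toy_sequence = "AA" + "LAAAAAA" * 3 + "L" + "AA"
print(find_leucine_repeats(toy_sequence))
```
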




Transmembrane Proteins
• Traverse the membrane back and forth
  through alpha helices

• Typical length: 20-30 residues

• Transmembrane alpha helices have hydrophobic
  residues on the inside-facing portions, and
  hydrophilic residues on the outside

Image source: http://www.northwestern.edu/neurobiology/faculty/pinto2/pinto_12big.jpg
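
Because transmembrane helices are long hydrophobic stretches, a simple first screen is a sliding-window hydropathy average. The sketch below uses the Kyte-Doolittle hydropathy scale with a 19-residue window and a threshold of 1.6; the scale, window, and threshold are swapped-in conventions from hydropathy plotting, not something described in the lecture, and this is not how PHDhtm or Tmpred work.

```python
# Kyte-Doolittle hydropathy values (higher means more hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def hydropathy_profile(sequence, window=19):
    """Average hydropathy of every full window along the sequence."""
    values = [KYTE_DOOLITTLE[aa] for aa in sequence.upper()]
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]

def candidate_helices(sequence, window=19, threshold=1.6):
    """Window start positions whose average hydropathy reaches the threshold."""
    return [i for i, average in enumerate(hydropathy_profile(sequence, window))
            if average >= threshold]

toy_sequence = "K" * 10 + "L" * 20 + "D" * 10
print(candidate_helices(toy_sequence))
```
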

Membrane Prediction Programs
• PHDhtm: employs neural network approach;
  neural network trained to recognize sequence
  patterns and variations of helices in
  transmembrane proteins of known structures

• Tmpred: functions by searching a protein
  against a sequence scoring matrix obtained
  by aligning the sequences of all known
  transmembrane alpha helix regions

Chou-Fasman Method
• based on analyzing frequency of amino acids in
  different secondary structures
   – A, E, L, and M are strong predictors of alpha helices
   – P and G are predictors of a break in a helix


• Table of predictive values created for alpha helices,
  beta sheets, and loops

• Structure with the greatest overall prediction value,
  provided it is greater than 1, is used to determine the structure
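
The averaging idea can be sketched as follows. The propensity values are the helix and sheet parameters as commonly quoted from Chou and Fasman's 1978 tables; they are reproduced from memory for illustration and should be checked against the published table before any real use. The sketch covers only helix and sheet, omits the loop table, and omits the nucleation and extension rules of the full method, so `predict_segment` is a simplification, not the Chou-Fasman algorithm itself.

```python
# Conformational propensities (assumed values, see the note above).
HELIX = {
    "E": 1.51, "M": 1.45, "A": 1.42, "L": 1.21, "K": 1.16, "F": 1.13, "Q": 1.11,
    "W": 1.08, "I": 1.08, "V": 1.06, "D": 1.01, "H": 1.00, "R": 0.98, "T": 0.83,
    "S": 0.77, "C": 0.70, "Y": 0.69, "N": 0.67, "P": 0.57, "G": 0.57,
}
SHEET = {
    "V": 1.70, "I": 1.60, "Y": 1.47, "F": 1.38, "W": 1.37, "L": 1.30, "C": 1.19,
    "T": 1.19, "Q": 1.10, "M": 1.05, "R": 0.93, "N": 0.89, "H": 0.87, "A": 0.83,
    "S": 0.75, "G": 0.75, "K": 0.74, "P": 0.55, "D": 0.54, "E": 0.37,
}

def average_propensity(segment, table):
    """Mean propensity of a segment under one table."""
    return sum(table[aa] for aa in segment) / len(segment)

def predict_segment(segment):
    """Pick the structure with the highest average value, if it exceeds 1."""
    segment = segment.upper()
    scores = {
        "helix": average_propensity(segment, HELIX),
        "sheet": average_propensity(segment, SHEET),
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 1.0 else "other"

for toy in ("AEELLKKLLEEA", "VIVIVYVIV", "GPGPGP"):
    print(toy, predict_segment(toy))
```
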



GOR Method
• Improves upon the Chou-Fasman method

• Assumes amino acids surrounding the central amino
  acid influence the secondary structure the central
  amino acid is likely to adopt

• Scoring matrices used in the GOR method incorporate
  information theory and Bayesian statistics

• Mount, pp. 450-451


Neural Network Models
• Programs trained to recognize amino acid
  patterns located in known secondary
  structures

• distinguish these patterns from patterns not
  located in structures

• PHD and NNPREDICT use neural networks


Nearest-neighbor
• machine learning method

• secondary structure conformation of an amino
  acid calculated by identifying sequences of
  known structures similar to the query by
  looking at the surrounding amino acids

• Nearest-neighbor programs include
  PSSP, Simpa96, SOPM, and SOPMA

Prediction of 3-D Structures
• Threading is the most robust technique
• Time consuming
• Requires knowledge of protein structure




Threading
• Searches for structures with similar folds
  without sequence similarity

• Threading takes a sequence with unknown
  structure and threads it through the
  coordinates of a target protein whose
  structure has been solved
  – X-ray crystallography
  – NMR imaging


Threading
• Sequence considered position by position,
  subject to predetermined constraints

• Thermodynamic calculations made to
  determine the most energetically favorable
  and conformationally stable alignment



Environmental Template
• Environment of each amino acid in each
  known structural core is determined
  – secondary structure
  – area of side chain buried by closeness to
    other atoms
  – types of nearby side chains




Environmental Template
• Each position classified into one of 18
  types
  – 6 representing increasing levels of residue
    burial
  – three classes of secondary structure (alpha
    helices, beta sheets, and loops).
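
The 18 types follow from combining the 6 burial levels with the 3 secondary structure classes (6 × 3 = 18). A small sketch of that bookkeeping is shown below; the numbering scheme, with burial level 0 as the least buried, and the name `environment_class` are assumptions for illustration only.

```python
SECONDARY = ("helix", "sheet", "loop")  # the three secondary structure classes
BURIAL_LEVELS = 6                       # increasing levels of residue burial

def environment_class(burial_level, secondary):
    """Map (burial level 0..5, secondary structure) to a class index 0..17."""
    if not 0 <= burial_level < BURIAL_LEVELS:
        raise ValueError("burial level must be between 0 and 5")
    return SECONDARY.index(secondary) * BURIAL_LEVELS + burial_level

all_classes = {environment_class(level, ss)
               for level in range(BURIAL_LEVELS) for ss in SECONDARY}
print(len(all_classes))
```
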




Upcoming Seminars
• Topic TBA
  – Rafael Irizarry, Johns Hopkins University
       • Friday, 4/23/2004
       • 8:30 AM – 9:30 AM
       • LOCATION: K-Building Room 2036 (HSC
         Campus)




Presentations
•   4:45 – 5:00 Richard Jones
•   5:00 – 5:15 Steven Xu
•   5:15 – 5:30 Olutola Iyun
•   5:30 – 5:45 Frank Baker
•   5:45 – 6:00 Guanghui Lan
•   6:00 – 6:15 Tim Hardin
•   6:15 – 6:30 Satish Bollimpalli & Ravi
    Gundlapalli

     CECS 694-02 Introduction to Bioinformatics University of Louisville   Spring 2004 Dr. Eric Rouchka

Contenu connexe

Tendances (20)

Ab Initio Protein Structure Prediction
Ab Initio Protein Structure PredictionAb Initio Protein Structure Prediction
Ab Initio Protein Structure Prediction
 
Protein database
Protein databaseProtein database
Protein database
 
Protein micro array
Protein micro arrayProtein micro array
Protein micro array
 
Clustal
ClustalClustal
Clustal
 
Homology modeling
Homology modelingHomology modeling
Homology modeling
 
Threading modeling methods
Threading modeling methodsThreading modeling methods
Threading modeling methods
 
Protein protein interaction
Protein protein interactionProtein protein interaction
Protein protein interaction
 
Scop database
Scop databaseScop database
Scop database
 
Motif & Domain
Motif & DomainMotif & Domain
Motif & Domain
 
Protein data bank
Protein data bankProtein data bank
Protein data bank
 
YEAST TWO HYBRID SYSTEM
 YEAST TWO HYBRID SYSTEM YEAST TWO HYBRID SYSTEM
YEAST TWO HYBRID SYSTEM
 
Bioinformatics in drug discovery
Bioinformatics in drug discoveryBioinformatics in drug discovery
Bioinformatics in drug discovery
 
Introduction to NCBI
Introduction to NCBIIntroduction to NCBI
Introduction to NCBI
 
Introduction to ncbi, embl, ddbj
Introduction to ncbi, embl, ddbjIntroduction to ncbi, embl, ddbj
Introduction to ncbi, embl, ddbj
 
Swiss prot database
Swiss prot databaseSwiss prot database
Swiss prot database
 
Sequence alignment
Sequence alignmentSequence alignment
Sequence alignment
 
Fasta
FastaFasta
Fasta
 
Protein structure visualization tools-RASMOL
Protein structure visualization tools-RASMOLProtein structure visualization tools-RASMOL
Protein structure visualization tools-RASMOL
 
Peptide Mass Fingerprinting
Peptide Mass FingerprintingPeptide Mass Fingerprinting
Peptide Mass Fingerprinting
 
PIR- Protein Information Resource
PIR- Protein Information ResourcePIR- Protein Information Resource
PIR- Protein Information Resource
 

Protein Structure Prediction

  • 1. Lecture 14: Protein Structure Prediction CECS 694-02 Introduction to Bioinformatics University of Louisville Spring 2004 Dr. Eric Rouchka (Digitally signed by Rohit Jhawer, Date: 2007.03.09 14:10:44 +05'30')
  • 2. Review of Proteins • Proteins: polypeptides with a three-dimensional structure • Primary structure – sequence of amino acids constituting polypeptide chain • Secondary structure – local organization of polypeptide chain into secondary structures such as α helices and β sheets
  • 3. Review of Proteins • Tertiary structure – three-dimensional arrangement of amino acids as they react to one another due to polarity and interactions between side chains • Quaternary structure – interaction of several protein subunits
  • 4. Protein Structure • Proteins: chains of amino acids joined by peptide bonds • Amino Acids: – Polar (separate positively and negatively charged regions) – free C=O group (CARBOXYL), can act as hydrogen bond acceptor – free NH group (AMINYL), can act as hydrogen bond donor
  • 5. Protein Structure
  • 6. Protein Structure • Many conformations possible due to the rotation around the Alpha-Carbon (Cα) atom • Conformational changes lead to differences in three-dimensional structure of protein
  • 7. Protein Structure • Polypeptide chain has pattern of N-Cα-C repeated • Rotation angle about the bond between the aminyl nitrogen and Cα is the PHI (φ) angle; rotation angle about the bond between Cα and the carboxyl carbon is the PSI (ψ) angle
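The φ and ψ angles are dihedral (torsion) angles, each defined by four consecutive backbone atoms: φ by C(i-1), N(i), Cα(i), C(i) and ψ by N(i), Cα(i), C(i), N(i+1). The following is a minimal sketch of that calculation, not code from the lecture; the function name `dihedral` and its helpers are made up for illustration, and the sample coordinates are the VAL 1 and LEU 2 backbone atoms from the partial PDB file shown later in these slides.

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees for four 3-D points (illustrative helper)."""
    b1 = _sub(p1, p0)
    b2 = _sub(p2, p1)
    b3 = _sub(p3, p2)
    n1 = _cross(b1, b2)          # normal of the plane through p0, p1, p2
    n2 = _cross(b2, b3)          # normal of the plane through p1, p2, p3
    x = _dot(n1, n2)
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    return math.degrees(math.atan2(y, x))

# psi of VAL 1: N(1), CA(1), C(1), N(2), coordinates taken from the sample PDB records
n1_atom = (6.452, 16.459, 4.843)
ca1_atom = (7.060, 17.792, 4.760)
c1_atom = (8.561, 17.703, 5.038)
n2_atom = (9.333, 18.209, 4.095)
psi_val1 = dihedral(n1_atom, ca1_atom, c1_atom, n2_atom)
```

The φ angle of a residue needs the carboxyl carbon of the previous residue, so it is undefined for the first residue of a chain.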
  • 8. Protein Structure
  • 9. Differences between A.A.’s • Difference between the 20 amino acids is the R side chains • Amino acids can be separated based on the chemical properties of the side chains: – Hydrophobic – Charged – Polar
  • 10. Differences between A.A.’s • Hydrophobic: Alanine (A), Valine (V), Phenylalanine (F), Proline (P), Methionine (M), Isoleucine (I), and Leucine (L) • Charged: Aspartic acid (D), Glutamic acid (E), Lysine (K), Arginine (R) • Polar: Serine (S), Threonine (T), Tyrosine (Y), Histidine (H), Cysteine (C), Asparagine (N), Glutamine (Q), Tryptophan (W)
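The grouping above can be turned into a small lookup, which is handy when scanning a sequence for hydrophobic or charged stretches. This is a sketch only: the three sets follow the slide's grouping (with F as the one-letter code for phenylalanine), the names `side_chain_class` and `class_string` are hypothetical, and residues the slide does not list (glycine, for example) are reported as "unlisted".

```python
# Side-chain groups as given on the slide; one-letter amino acid codes.
HYDROPHOBIC = set("AVFPMIL")
CHARGED = set("DEKR")
POLAR = set("STYHCNQW")

def side_chain_class(residue):
    """Return the slide's side-chain class for a one-letter residue code."""
    r = residue.upper()
    if r in HYDROPHOBIC:
        return "hydrophobic"
    if r in CHARGED:
        return "charged"
    if r in POLAR:
        return "polar"
    return "unlisted"  # e.g. glycine, which the slide does not place in a group

def class_string(sequence):
    """Encode a sequence as h (hydrophobic), c (charged), p (polar), - (unlisted)."""
    codes = {"hydrophobic": "h", "charged": "c", "polar": "p", "unlisted": "-"}
    return "".join(codes[side_chain_class(r)] for r in sequence)

encoded = class_string("VLSPADKT")  # the first residues of a made-up example sequence
```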
  • 11. Secondary Structure • Image source: http://www.ebi.ac.uk/microarray/biology_intro.html
  • 12. Secondary Structures • Core of each protein made up of regular secondary structures • Regular patterns of hydrogen bonds are formed between neighboring amino acids • Amino acids in secondary structures have similar φ and ψ angles
  • 13. Secondary Structures • Structures act to neutralize the polar groups on each amino acid • Secondary structures tightly packed in protein core and a hydrophobic environment • Each amino acid side group has a limited space to occupy -- therefore a limited number of possible interactions
  • 14. Types of Secondary Structures • α Helices • β Sheets • Loops • Coils
  • 15. α Helix • Most abundant secondary structure • 3.6 amino acids per turn • Hydrogen bond formed between every fourth residue • Average length: 10 amino acids, or 3 turns • Varies from 5 to 40 amino acids Image source: http://www.hhmi.princeton.edu/sw/2002/psidelsk/scavengerhunt.htm; http://www4.ocn.ne.jp/~bio/biology/protein.htm
  • 16. α Helix • Normally found on the surface of protein cores • Interact with aqueous environment – Inner-facing side has hydrophobic amino acids – Outer-facing side has hydrophilic amino acids
  • 17. α Helix • Every third amino acid tends to be hydrophobic • Pattern can be detected computationally • Rich in alanine (A), glutamic acid (E), leucine (L), and methionine (M) • Poor in proline (P), glycine (G), tyrosine (Y), and serine (S)
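One possible way to detect this periodicity computationally is a hydrophobic-moment style score: with 3.6 residues per turn, successive residues are about 100° apart around the helix axis, so a window whose hydrophobic residues cluster on one face gives a large vector sum. The sketch below is an assumption-laden illustration, not the method used in the lecture; it uses a crude +1/-1 hydrophobicity based on the slide's hydrophobic group, and `helical_moment` is a hypothetical name.

```python
import math

HYDROPHOBIC = set("AVFPMIL")  # hydrophobic group as listed in these slides

def helical_moment(window, degrees_per_residue=100.0):
    """Length-normalized vector sum of +1/-1 hydrophobicities around the helix axis."""
    if not window:
        raise ValueError("window must contain at least one residue")
    x = y = 0.0
    for i, residue in enumerate(window.upper()):
        h = 1.0 if residue in HYDROPHOBIC else -1.0
        angle = math.radians(i * degrees_per_residue)
        x += h * math.cos(angle)
        y += h * math.sin(angle)
    return math.hypot(x, y) / len(window)

# An idealized amphipathic window: leucine on one face, lysine on the other.
amphipathic = "".join(
    "L" if math.cos(math.radians(100 * i)) > 0 else "K" for i in range(18)
)
uniform = "L" * 18  # hydrophobic all the way around, so no preferred face
```

A window that is hydrophobic all the way around scores near zero, while the idealized amphipathic window scores much higher; a real predictor would use a graded hydrophobicity scale and a calibrated threshold.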
  • 18. β Sheet Image source: http://broccoli.mfn.ki.se/pps_course_96/ss_960723_12.html; http://www4.ocn.ne.jp/~bio/biology/protein.htm
  • 19. β Sheet • Hydrogen bonds between 5-10 consecutive amino acids in one portion of the chain with another 5-10 farther down the chain • Interacting regions may be adjacent with a short loop, or far apart with other structures in between
  • 20. β Sheet • Directions: – Same: Parallel Sheet – Opposite: Anti-parallel Sheet – Mixed: Mixed Sheet • Pattern of hydrogen bond formation in parallel and anti-parallel sheets is different
  • 21. β Sheet • Slight counterclockwise rotation • Alpha carbons (as well as R side groups) alternate above and below the sheet • Prediction difficult, due to wide range of φ and ψ angles
  • 22. Interactions in Helices and Sheets
  • 23. Loop • Regions between α helices and β sheets • Various lengths and three-dimensional configurations • Located on surface of the structure
  • 24. Loop • Hairpin loops: complete turn in the polypeptide chain, (anti-parallel β sheets) • More variable sequence structure • Tend to have charged and polar amino acids • Frequently a component of active sites
  • 25. Coil • Region of secondary structure that is not a helix, sheet, or loop
  • 26. Secondary Structure • Image source: http://www.ebi.ac.uk/microarray/biology_intro.html
  • 27. 6 Classes of Protein Structure 1) Class α: bundles of α helices connected by loops on surface of proteins 2) Class β: antiparallel β sheets, usually two sheets in close contact forming sandwich 3) Class α/β: mainly parallel β sheets with intervening α helices; may also have mixed β sheets (metabolic enzymes)
  • 28. 6 Classes of Protein Structure 4) Class α+β: mainly segregated α helices and antiparallel β sheets 5) Multidomain (α and β) proteins: more than one of the above four domains 6) Membrane and cell-surface proteins and peptides, excluding proteins of the immune system
  • 29. α Class Protein (hemoglobin) • http://www.rcsb.org/pdb/cgi/explore.cgi?job=graphics;pdbId=3hhb;page=;pid=&opt=show&size=250
  • 30. β Class Protein (T-Cell CD8) • http://www.rcsb.org/pdb/cgi/explore.cgi?job=graphics;pdbId=1cd8;page=;pid=&opt=show&size=500
  • 31. α/β Class Protein (tryptophan synthase) • http://www.rcsb.org/pdb/cgi/explore.cgi?job=graphics;pdbId=2wsy;page=;pid=&opt=show&size=500
  • 32. α+β Class Protein (1RNB) • http://www.rcsb.org/pdb/cgi/explore.cgi?job=graphics;pdbId=1rnb;page=;pid=&opt=show&size=500
  • 33. Membrane Protein (1OPF) • http://www.rcsb.org/pdb/cgi/explore.cgi?job=graphics;pdbId=1opf;page=;pid=&opt=show&size=500
  • 34. Protein Structure Databases • Databases of three-dimensional structures of proteins, where structure has been solved using X-ray crystallography or nuclear magnetic resonance (NMR) techniques • Protein Databases: – PDB – SCOP – Swiss-Prot – PIR
  • 35. Protein Structure Databases • Most extensive for 3-D structure is the Protein Data Bank (PDB) • Current release of PDB (April 8, 2003) has 20,622 structures
  • 36. Partial PDB File
    ATOM    1  N    VAL  A  1    6.452  16.459  4.843  7.00  47.38  3HHB  162
    ATOM    2  CA   VAL  A  1    7.060  17.792  4.760  6.00  48.47  3HHB  163
    ATOM    3  C    VAL  A  1    8.561  17.703  5.038  6.00  37.13  3HHB  164
    ATOM    4  O    VAL  A  1    8.992  17.182  6.072  8.00  36.25  3HHB  165
    ATOM    5  CB   VAL  A  1    6.342  18.738  5.727  6.00  55.13  3HHB  166
    ATOM    6  CG1  VAL  A  1    7.114  20.033  5.993  6.00  54.30  3HHB  167
    ATOM    7  CG2  VAL  A  1    4.924  19.032  5.232  6.00  64.75  3HHB  168
    ATOM    8  N    LEU  A  2    9.333  18.209  4.095  7.00  30.18  3HHB  169
    ATOM    9  CA   LEU  A  2   10.785  18.159  4.237  6.00  35.60  3HHB  170
    ATOM   10  C    LEU  A  2   11.247  19.305  5.133  6.00  35.47  3HHB  171
    ATOM   11  O    LEU  A  2   11.017  20.477  4.819  8.00  37.64  3HHB  172
    ATOM   12  CB   LEU  A  2   11.451  18.286  2.866  6.00  35.22  3HHB  173
    ATOM   13  CG   LEU  A  2   11.081  17.137  1.927  6.00  31.04  3HHB  174
    ATOM   14  CD1  LEU  A  2   11.766  17.306   .570  6.00  39.08  3HHB  175
    ATOM   15  CD2  LEU  A  2   11.427  15.778  2.539  6.00  38.96  3HHB  176
  • 37. Description of PDB File • second column: atom serial number • fourth column: current amino acid • sixth column: amino acid position in the polypeptide chain • Columns 7, 8, and 9: x, y, and z coordinates (in angstroms) • The 11th column: temperature factor -- can be used as a measurement of uncertainty
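As a rough illustration of the column description, the sketch below splits one ATOM record from the partial PDB file on whitespace and picks out the fields named on the slide. It is a simplification under stated assumptions: real PDB files are fixed-width and should be parsed by column position or with a dedicated library, the function name `parse_atom_line` and the dictionary keys are hypothetical, and column 10 is returned unlabeled because the slide does not describe it.

```python
def parse_atom_line(line):
    """Split a whitespace-separated ATOM record into the fields the slide describes."""
    fields = line.split()
    if fields[0] != "ATOM":
        raise ValueError("not an ATOM record")
    return {
        "serial": int(fields[1]),         # column 2: atom serial number
        "atom": fields[2],                # column 3: atom name (N, CA, C, O, ...)
        "residue": fields[3],             # column 4: current amino acid
        "chain": fields[4],               # column 5: chain identifier
        "position": int(fields[5]),       # column 6: position in the polypeptide chain
        "xyz": (float(fields[6]), float(fields[7]), float(fields[8])),  # angstroms
        "column10": float(fields[9]),     # not described on the slide
        "temp_factor": float(fields[10])  # column 11: temperature factor
    }

record = parse_atom_line("ATOM 9 CA LEU A 2 10.785 18.159 4.237 6.00 35.60 3HHB 170")
```

Whitespace splitting works for these sample lines but can fail on files where adjacent fixed-width fields run together.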
  • 38. Protein Structure Classification Databases • Structural Classification of Proteins (SCOP) • based on expert definition of structural similarities • SCOP classifies by class, fold, superfamily, and family • http://scop.mrc-lmb.cam.ac.uk/scop/
  • 39. Protein Structure Classification Databases • Classification by class, architecture, topology, and homology (CATH) • Classifies proteins into hierarchical levels by class • α/β and α+β are considered to be a single class • http://www.biochem.ucl.ac.uk/bsm/cath/
  • 40. Protein Structure Classification Databases • Molecular Modeling Database (MMDB) • structures from PDB categorized into structurally related groups using VAST (Vector Alignment Search Tool) • looks for similar arrangements of secondary structural elements • http://www.ncbi.nlm.nih.gov/Entrez
  • 41. Protein Structure Classification Databases • Spatial Arrangement of Backbone Fragments (SARF) • categorized on structural similarities, similar to the MMDB • http://www-lmmb.ncifcrf.gov/~nicka/sarf2.html
  • 42. Visualization of Proteins • A number of programs convert atomic coordinates of 3-D structures into views of the molecule • allow the user to manipulate the molecule by rotation, zooming, etc. • Critical in drug design -- yields insight into how the protein might interact with ligands at active sites
  • 43. Visualization of Proteins • Most popular program for viewing 3-dimensional structures is Rasmol Rasmol: http://www.umass.edu/microbio/rasmol/ Chime: http://www.umass.edu/microbio/chime/ Cn3D: http://www.ncbi.nlm.nih.gov/Structure/ Mage: http://kinemage.biochem.duke.edu/website/kinhome.html Swiss 3D viewer: http://www.expasy.ch/spdbv/mainpage.html
  • 44. Alignment of Protein Structure • Three-dimensional structure of one protein compared against three-dimensional structure of second protein • Atoms fit together as closely as possible to minimize the average deviation • Structural similarity between proteins does not necessarily mean evolutionary relationship
  • 45. Alignment of Protein Structure • Positions of atoms in three-dimensional structures compared • Look for positions of secondary structural elements (helices and strands) within a protein domain
  • 46. Alignment of Protein Structure • Distances between carbon atoms examined to determine the degree to which structures may be superimposed • Side chain information can be incorporated – Buried; visible
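The "average deviation" being minimized is commonly reported as the root-mean-square deviation (RMSD) over matched atoms. The sketch below only measures RMSD for two coordinate lists that are assumed to be already paired atom by atom and already superimposed; finding the best rotation and translation is a separate step that is not shown. The function name `rmsd` is illustrative.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equal-length lists of (x, y, z) points."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b):
        total += (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
    return math.sqrt(total / len(coords_a))

# Two toy "structures": the second is the first shifted by (3, 4, 0).
structure_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
structure_b = [(3.0, 4.0, 0.0), (4.0, 4.0, 0.0)]
deviation = rmsd(structure_a, structure_b)
```

Because every atom in the toy example is displaced by the same vector of length 5, the RMSD is 5; an optimal superposition would remove that shift entirely.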
  • 47. SSAP • Sequential Structure Alignment Program • Incorporates double dynamic programming to produce a structural alignment between two proteins
  • 48. Steps in SSAP • 1) Calculate vectors from Cβ of one amino acid to set of nearby amino acids – Vectors from two separate proteins compared – Difference (expressed as an angle) calculated, and converted to score • 2) Matrix for scores of vector differences from one protein to the next is computed.
  • 49. Steps in SSAP • 3) Optimal alignment found using global dynamic programming, with a constant gap penalty • 4) Next amino acid residue considered, optimal path to align this amino acid to the second sequence computed
  • 50. Steps in SSAP • 5) Alignments transferred to summary matrix – If paths cross same matrix position, scores are summed – If part of alignment path found in both matrices, evidence of similarity
  • 51. Steps in SSAP • 6) Dynamic programming alignment is performed for the summary matrix – Final alignment represents optimal alignment between the protein structures – Resulting score converted so it can be compared to see how closely related two structures are
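Steps 3 through 6 can be sketched in a few lines, assuming the per-residue score matrices from steps 1 and 2 are already available. This is a simplified illustration of the double dynamic programming idea, not the actual SSAP implementation: the gap penalty is treated as a fixed cost per gap position (one reading of "constant gap penalty"), and the names `global_align` and `summary_matrix` are hypothetical.

```python
def global_align(score, gap):
    """Global dynamic programming over a score matrix; returns (best score, path).

    score[i][j] is the score for matching residue i of protein A with residue j
    of protein B; path lists the matched (i, j) cells in order.
    """
    n, m = len(score), len(score[0])
    best = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        best[i][0] = -gap * i
    for j in range(1, m + 1):
        best[0][j] = -gap * j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best[i][j] = max(best[i - 1][j - 1] + score[i - 1][j - 1],
                             best[i - 1][j] - gap,
                             best[i][j - 1] - gap)
    # Trace back from the corner to recover the matched cells.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        if best[i][j] == best[i - 1][j - 1] + score[i - 1][j - 1]:
            path.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif best[i][j] == best[i - 1][j] - gap:
            i -= 1
        else:
            j -= 1
    path.reverse()
    return best[n][m], path

def summary_matrix(residue_matrices, gap):
    """Step 5: sum the scores along each residue-level optimal path."""
    n, m = len(residue_matrices[0]), len(residue_matrices[0][0])
    summary = [[0.0] * m for _ in range(n)]
    for matrix in residue_matrices:
        _, path = global_align(matrix, gap)
        for i, j in path:
            summary[i][j] += matrix[i][j]
    return summary

# Toy input: two identical residue-level matrices that favor the diagonal.
toy = [[5, 0, 0], [0, 5, 0], [0, 0, 5]]
summary = summary_matrix([toy, toy], gap=1)
final_score, final_path = global_align(summary, gap=1)  # step 6
```

Cells that lie on several residue-level paths accumulate score in the summary matrix, which is why the final alignment follows positions supported by many local views of the structures.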
  • 52. Distance Matrix Approach • Uses graphical procedure similar to dot plots • Identifies atoms that lie most closely together in three-dimensional structure • Two sequences with similar structure can have dot plots superimposed
  • 53. Distance Matrix Approach • Values in distance matrix represent distance between the Cα atoms in the three-dimensional structure • positions of closest packing atoms marked with a dot to highlight regions of interest • Similar groups superimposed as closely as possible by minimizing sum of atomic distances
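A distance matrix of this kind is straightforward to compute from Cα coordinates. The sketch below builds the matrix and then marks the "dots", meaning pairs closer than a cutoff. The cutoff value, the minimum sequence separation, and the function names are assumptions chosen for illustration; the two real coordinates are the Cα atoms of VAL 1 and LEU 2 from the partial PDB file, and the third point is made up.

```python
import math

def distance_matrix(ca_coords):
    """Pairwise distances (angstroms) between C-alpha coordinates."""
    return [[math.dist(a, b) for b in ca_coords] for a in ca_coords]

def contact_dots(matrix, cutoff=6.0, min_separation=2):
    """Pairs (i, j), i < j, closer than the cutoff and at least min_separation apart in sequence."""
    dots = []
    for i in range(len(matrix)):
        for j in range(i + min_separation, len(matrix)):
            if matrix[i][j] < cutoff:
                dots.append((i, j))
    return dots

ca_atoms = [
    (7.060, 17.792, 4.760),   # CA of VAL 1 (from the sample PDB records)
    (10.785, 18.159, 4.237),  # CA of LEU 2 (from the sample PDB records)
    (12.000, 21.500, 5.000),  # hypothetical third residue, for illustration only
]
matrix = distance_matrix(ca_atoms)
dots = contact_dots(matrix, cutoff=8.0)
```

Consecutive Cα atoms sit roughly 3.8 Å apart, so near-diagonal entries carry little information; that is why the sketch skips immediate neighbors when marking dots.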
  • 54. DALI • Distance Alignment Tool (DALI) • Uses distance matrix method to align protein structures • Assembly step uses Monte Carlo simulation to find submatrices that can be aligned • Existing structures that have been compared are organized into the FSSP database
  • 55. Fast Structural Similarity Search • Compare types and arrangements of secondary structures within two proteins • If elements are similarly arranged, three-dimensional structures are similar • VAST and SARF are programs that use these fast methods
  • 56. Structural Motifs Based on Sequence Analysis • Some structural elements can be determined by looking at sequence composition – zinc finger motifs – leucine zippers – coiled-coil structures
  • 57. Zinc Finger Motifs • Found by looking at order and spacing of cysteine and histidine residues • Typical zinc finger motifs are composed of two cysteines followed by two histidines Image source: www.bmb.psu.edu/faculty/tan/lab/tanlab_gallery_protdna.html
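Searching for the order and spacing of cysteines and histidines can be sketched with a regular expression. The spacings below follow a PROSITE-style C2H2 pattern (C-x(2,4)-C-x(3)-[LIVMFYWC]-x(8)-H-x(3,5)-H); treat the exact ranges as an assumption for illustration. The test sequence is synthetic, not a real protein.

```python
import re

# Two cysteines, then two histidines, with constrained spacing.
# The spacing ranges are an illustrative assumption.
C2H2_PATTERN = re.compile(r"C.{2,4}C.{3}[LIVMFYWC].{8}H.{3,5}H")


def find_zinc_fingers(sequence):
    """Return (start_index, matched_text) for each candidate C2H2 motif."""
    return [(m.start(), m.group())
            for m in C2H2_PATTERN.finditer(sequence.upper())]


# Synthetic sequence built to contain exactly one candidate motif.
toy_sequence = "MK" + "CAACAAAFAAAAAAAAHAAAH" + "GG"
print(find_zinc_fingers(toy_sequence))
```

A pattern match is only a candidate: it says the spacing is compatible with a zinc finger, not that the region folds as one.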
  • 58. Leucine Zippers • Found by looking for two antiparallel alpha helices held together • Interactions between hydrophobic leucine residues found every seventh position in helix Image source: ww2.mcgill.ca/biology/undergra/c200a/sec3-5.htm
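The "leucine every seventh position" rule can be sketched as a sequence scan. This is a minimal sketch assuming four leucines spaced seven apart (L-x(6)-L-x(6)-L-x(6)-L); the required number of repeats and the function name are assumptions for the example, and the test sequence is synthetic.

```python
import re

# Lookahead so that overlapping candidates are all reported.
HEPTAD_LEUCINES = re.compile(r"(?=(L.{6}L.{6}L.{6}L))")


def find_leucine_repeats(sequence):
    """Start indices of four leucines spaced exactly seven residues apart."""
    return [m.start() for m in HEPTAD_LEUCINES.finditer(sequence.upper())]


# Synthetic sequence: leucines at positions 2, 9, 16, and 23.
toy_sequence = "MA" + "LAAAAAA" * 3 + "L" + "GG"
print(find_leucine_repeats(toy_sequence))
```

A pattern this short matches many unrelated proteins by chance, so hits are usually checked for helical propensity before being called leucine zippers.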
  • 59. Transmembrane Proteins • Traverse the membrane back and forth through alpha helices • Typical helix length: 20-30 residues • Transmembrane alpha helices have hydrophobic residues on the inside facing portions, and hydrophilic residues on the outside Image source: http://www.northwestern.edu/neurobiology/faculty/pinto2/pinto_12big.jpg
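A simple way to look for hydrophobic stretches of roughly this length is a sliding-window hydropathy average. The sketch below uses the Kyte-Doolittle hydropathy scale, which is a technique swapped in for illustration and is not mentioned in the lecture; the 19-residue window and the 1.6 threshold are conventional choices treated here as assumptions.

```python
# Kyte-Doolittle hydropathy values (higher = more hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=19):
    """Average hydropathy of every window of the given length."""
    values = [KYTE_DOOLITTLE[aa] for aa in sequence.upper()]
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]


def candidate_helix_starts(sequence, window=19, threshold=1.6):
    """Window start indices whose average hydropathy reaches the threshold."""
    profile = hydropathy_profile(sequence, window)
    return [i for i, avg in enumerate(profile) if avg >= threshold]


# Synthetic sequence: charged flanks around a 20-residue leucine stretch.
toy_sequence = "D" * 20 + "L" * 20 + "K" * 20
starts = candidate_helix_starts(toy_sequence)
print(starts[0], starts[-1])
```

Residues outside the 20 standard amino acids raise a `KeyError` in this sketch; a real tool would need a policy for them.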
  • 60. Membrane Prediction Programs • PHDhtm: employs neural network approach; neural network trained to recognize sequence patterns and variations of helices in transmembrane proteins of known structures • Tmpred: functions by searching a protein against a sequence scoring matrix obtained by aligning the sequences of all known transmembrane alpha helix regions
  • 70. Chou-Fasman Method • Based on analyzing frequency of amino acids in different secondary structures – A, E, L, and M are strong predictors of alpha helices – P and G are predictors of a break in a helix • Table of predictive values created for alpha helices, beta sheets, and loops • The structure type with the greatest overall predictive value, provided it is greater than 1, is assigned
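The "greatest predictive value greater than 1" rule can be sketched by averaging per-residue propensities over a short stretch of sequence. This is a simplified illustration and not the full Chou-Fasman procedure, which also uses nucleation and extension rules. The propensity values are the helix and sheet values as commonly tabulated for the method; check them against the published table before relying on them. Loop (turn) values are omitted, so anything not called helix or sheet is reported as "other".

```python
# Helix and sheet propensities as commonly tabulated (assumed values).
HELIX = {
    "E": 1.51, "M": 1.45, "A": 1.42, "L": 1.21, "K": 1.16, "F": 1.13,
    "Q": 1.11, "W": 1.08, "I": 1.08, "V": 1.06, "D": 1.01, "H": 1.00,
    "R": 0.98, "T": 0.83, "S": 0.77, "C": 0.70, "Y": 0.69, "N": 0.67,
    "P": 0.57, "G": 0.57,
}
SHEET = {
    "V": 1.70, "I": 1.60, "Y": 1.47, "F": 1.38, "W": 1.37, "L": 1.30,
    "C": 1.19, "T": 1.19, "Q": 1.10, "M": 1.05, "R": 0.93, "N": 0.89,
    "H": 0.87, "A": 0.83, "S": 0.75, "G": 0.75, "K": 0.74, "P": 0.55,
    "D": 0.54, "E": 0.37,
}


def predict_stretch(stretch):
    """Label a short stretch by its larger average propensity, if above 1."""
    helix_avg = sum(HELIX[aa] for aa in stretch) / len(stretch)
    sheet_avg = sum(SHEET[aa] for aa in stretch) / len(stretch)
    best_value, label = max((helix_avg, "helix"), (sheet_avg, "sheet"))
    return label if best_value > 1.0 else "other"


print(predict_stretch("AEELLM"))   # rich in A, E, L, M
print(predict_stretch("GPGPGP"))   # rich in P and G
```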
  • 71. GOR Method • Improves upon the Chou-Fasman method • Assumes the amino acids surrounding a central amino acid influence the secondary structure that central amino acid is likely to adopt • Scoring matrices used in the GOR method incorporate information theory and Bayesian statistics • Mount, pp. 450-451
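The idea that neighbouring residues contribute to the state of the central residue can be written compactly. The form below is a sketch of the simplest version of the method as it is commonly described, assuming a 17-residue window (eight positions on each side) and treating each window position as an independent contribution; the symbols S_j (state of residue j) and R_{j+m} (residue at offset m) are notation introduced here.

```latex
% Information that a single residue R carries about state x:
I(S = x;\, R) = \log \frac{P(S = x \mid R)}{P(S = x)}

% Approximate information from the whole window around position j:
I(S_j = x;\, R_{j-8}, \dots, R_{j+8}) \approx \sum_{m=-8}^{8} I(S_j = x;\, R_{j+m})

% Predicted state for residue j:
\hat{x}_j = \arg\max_{x \in \{\text{helix},\ \text{sheet},\ \text{loop}\}} \; \sum_{m=-8}^{8} I(S_j = x;\, R_{j+m})
```

Each term in the sum is looked up in a scoring matrix indexed by state, window offset, and amino acid, which is where the scoring matrices mentioned above come in.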
  • 72. Neural Network Models • Programs trained to recognize amino acid patterns located in known secondary structures • Distinguish these patterns from patterns not located in structures • PHD and NNPREDICT use neural networks
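Before a network can be trained on amino acid patterns, each sequence window has to be turned into numbers. The sketch below shows one common way to do this, a one-hot encoding of a window centred on the residue of interest. The 13-residue default window, the padding symbol, and the function name are assumptions for illustration and are not taken from PHD or NNPREDICT.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
PAD = "-"                      # stands in for positions past either end
ALPHABET = AMINO_ACIDS + PAD   # 21 symbols per window position


def encode_window(sequence, center, half_width=6):
    """One-hot encode the window of residues around `center`.

    Returns a flat list of 0/1 values with 21 entries per window position.
    """
    features = []
    for pos in range(center - half_width, center + half_width + 1):
        residue = sequence[pos] if 0 <= pos < len(sequence) else PAD
        one_hot = [0] * len(ALPHABET)
        one_hot[ALPHABET.index(residue)] = 1
        features.extend(one_hot)
    return features


vector = encode_window("ACDE", center=0, half_width=1)
print(len(vector), sum(vector))
```

The network's training target for each vector would be the known secondary structure of the central residue. Symbols outside the alphabet raise a `ValueError` in this sketch.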
  • 73. Nearest-neighbor • Machine learning method • Secondary structure conformation of an amino acid is predicted by identifying sequences of known structure that are similar to the query, judged by the surrounding amino acids • Nearest-neighbor programs include PSSP, Simpa96, SOPM, and SOPMA
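The nearest-neighbor idea can be sketched as a vote among the most similar windows of known structure. This is a toy illustration: the similarity measure (count of identical positions), the value of k, and the tiny database are all hypothetical, and real programs use substitution-matrix scores and large structure databases.

```python
from collections import Counter


def window_similarity(a, b):
    """Number of positions at which two equal-length windows agree."""
    return sum(1 for x, y in zip(a, b) if x == y)


def predict_central_state(query_window, database, k=3):
    """Majority state among the k database windows most similar to the query.

    `database` is a list of (window, state_of_central_residue) pairs.
    """
    ranked = sorted(database,
                    key=lambda entry: window_similarity(query_window, entry[0]),
                    reverse=True)
    votes = Counter(state for _, state in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical windows labelled H (helix), E (sheet), or C (loop).
TOY_DATABASE = [
    ("AEELLKK", "H"),
    ("AEALLKR", "H"),
    ("EELLKKA", "H"),
    ("VTVYVIT", "E"),
    ("TVYVITV", "E"),
    ("GPGSGNG", "C"),
]
print(predict_central_state("AEELLKR", TOY_DATABASE))
```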
  • 74. Prediction of 3D Structures • Threading is the most robust technique • Time consuming • Requires knowledge of protein structure
  • 75. Threading • Searches for structures with similar folds without sequence similarity • Threading takes a sequence with unknown structure and threads it through the coordinates of a target protein whose structure has been solved – X-ray crystallography – NMR spectroscopy
  • 76. Threading • Sequence is considered position by position, subject to predetermined constraints • Thermodynamic calculations made to determine the most energetically favorable and conformationally stable alignment
  • 77. Environmental Template • Environment of each amino acid in each known structural core is determined – secondary structure – area of side chain buried by closeness to other atoms – types of nearby side chains
  • 78. Environmental Template • Each position classified into one of 18 types – 6 representing increasing levels of residue burial – 3 classes of secondary structure (alpha helices, beta sheets, and loops)
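The 18 environment types (6 burial levels times 3 secondary structure classes) and their use in scoring a threaded sequence can be sketched as follows. This is a minimal sketch under stated assumptions: the class numbering, the scoring table, and the function names are hypothetical, and the scoring is ungapped, whereas real threading programs allow gaps and derive their tables from known structures.

```python
SECONDARY = ("helix", "sheet", "loop")
BURIAL_LEVELS = 6  # 0 = most exposed ... 5 = most buried (assumed ordering)


def environment_class(burial_level, secondary):
    """Map (burial level, secondary structure) to a class index 0..17."""
    if not 0 <= burial_level < BURIAL_LEVELS:
        raise ValueError("burial_level must be between 0 and 5")
    return burial_level * len(SECONDARY) + SECONDARY.index(secondary)


def threading_score(sequence, template_classes, table, default=0.0):
    """Sum of residue-in-environment scores for an ungapped threading.

    `table` maps (environment_class, amino_acid) to a score; pairs that
    are missing from the table contribute `default`.
    """
    if len(sequence) != len(template_classes):
        raise ValueError("sequence and template must have the same length")
    return sum(table.get((cls, aa), default)
               for aa, cls in zip(sequence, template_classes))


# Hypothetical two-position template: a buried helix site, an exposed loop site.
template = [environment_class(5, "helix"), environment_class(0, "loop")]
toy_table = {
    (template[0], "L"): 1.0,   # leucine suits the buried helix site
    (template[0], "D"): -1.0,  # aspartate does not
    (template[1], "D"): 0.5,   # aspartate suits the exposed loop site
}
print(threading_score("LD", template, toy_table))
print(threading_score("DL", template, toy_table))
```

A query sequence that scores well against a template's environment string is a candidate for adopting that fold, even without detectable sequence similarity.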
  • 79. Upcoming Seminars • Topic TBA – Rafael Irizarry, Johns Hopkins University • Friday, 4/23/2004 • 8:30 AM – 9:30 AM • LOCATION: K-Building Room 2036 (HSC Campus)
  • 80. Presentations • 4:45 – 5:00 Richard Jones • 5:00 – 5:15 Steven Xu • 5:15 – 5:30 Olutola Iyun • 5:30 – 5:45 Frank Baker • 5:45 – 6:00 Guanghui Lan • 6:00 – 6:15 Tim Hardin • 6:15 – 6:30 Satish Bollimpalli & Ravi Gundlapalli