Name:  


Exercise 7A: Sequence Alignment
STA 4953 (Spring 2001)
Due 3/22/2001

Do not use the computer in any way to do this exercise.

  1. Find the best global alignment between the DNA sequences:

    AGTAGTCAAG
    AGAAGTAAG

    Use the following scoring system:
      score of a match = 1
      score of a mismatch = -1
      gap penalty function w(k) = -1 - 2k, where k is the length of the gap.

      Solution:
      The best global alignment is

      AGTAGTCAAG
      ||:|||_|||
      AGAAGT_AAG
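
      As a check on the hand computation, the score of this alignment can be tallied column by column. The following is a minimal sketch in Python, assuming "_" marks a gap; the names gap_penalty and score_alignment are illustrative and not part of the exercise.

```python
def gap_penalty(k):
    """Penalty for one gap of k consecutive positions: w(k) = -1 - 2k."""
    return -1 - 2 * k


def score_alignment(top, bottom, match=1, mismatch=-1, gap_char="_"):
    """Score two aligned strings of equal length, charging each run of
    gap characters once for its full length."""
    total = 0
    gap_len = 0      # length of the gap run currently being read
    gap_row = None   # 0 if the run is in `top`, 1 if in `bottom`
    for a, b in zip(top, bottom):
        row = 0 if a == gap_char else 1 if b == gap_char else None
        if row is None or row != gap_row:
            # Any open gap run ends here, so charge it now.
            if gap_len:
                total += gap_penalty(gap_len)
            gap_len = 0
        gap_row = row
        if row is None:
            total += match if a == b else mismatch
        else:
            gap_len += 1
    if gap_len:
        total += gap_penalty(gap_len)
    return total


print(score_alignment("AGTAGTCAAG", "AGAAGT_AAG"))  # 4
```

      The total of 4 comes from 8 matches, 1 mismatch, and one gap of length 1 with penalty w(1) = -3.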

      To arrive at the best global alignment the following matrix was constructed:
      The numbers in the matrix were derived as follows:
      Consider the cell (G,G), which has a score of 2. To obtain this score, we must consider every score that could result from entering the cell horizontally, diagonally, or vertically. In the horizontal direction there are two possibilities: coming from (G,A) and coming from (G,^). First consider coming from (G,A). The alignment would then be
      AG
      G_
      We can see from cell (G,A) that this alignment of A with G has a score of -2. Add this to the score of -3 for the alignment of G with a gap, to get a total score of -5.
      Now consider coming from the horizontal cell (G,^). The alignment would then be
      ^AG
      G__
      We can see that this alignment of G with the beginning of the sequence gets a score of -5. A gap of length 2 would need to be introduced to get to the cell (G,G). So, we would add the gap penalty w(2) = -1 - 2(2) = -5 to the score of -5 to get a total score of -10.

      Next consider the score coming from the diagonal cell (A,A). If we came from this position, the alignment would be
      AG
      AG
      From cell (A,A) we get a score of 1. Add this to the score of 1 for G being aligned with G, a match, to get a score of 2.
      Next consider coming from the vertical cell (A,G). The alignment would be
      G_
      AG

      This alignment of G with A has a score of -2. Add this to the gap penalty score of -3 to get a total score of -5.
      Next consider coming from the vertical cell (^,G). The alignment would be
      G__
      ^AG

      This alignment of G with ^ has a score of -5. Add this to the gap penalty w(2) = -1 - 2(2) = -5 for introducing a gap of length 2 to get a total score of -10.
      Taking into consideration all of the possibilities of arriving at the cell (G,G) and the corresponding scores, find the largest score. Here, the largest score is 2, which was obtained by coming from the diagonal cell (A,A). We enter this score in the matrix with an arrow pointing to the cell (A,A).
      All of the remaining scores in the matrix are arrived at similarly.
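
      The rule applied to cell (G,G) can be sketched in Python as a check on the hand-filled matrix. This is a sketch under assumptions: the first sequence is taken to label the rows, ^ is row 0 and column 0, and the names gap_penalty, candidates, and fill are illustrative.

```python
def gap_penalty(k):
    """Penalty for one gap of k consecutive positions: w(k) = -1 - 2k."""
    return -1 - 2 * k


def candidates(D, i, j, a, b, match=1, mismatch=-1):
    """Every way of entering cell (i, j): one diagonal move, plus a
    horizontal or vertical gap of each possible length k."""
    sub = match if a[i - 1] == b[j - 1] else mismatch
    moves = [("diagonal", 1, D[i - 1][j - 1] + sub)]
    moves += [("horizontal", k, D[i][j - k] + gap_penalty(k))
              for k in range(1, j + 1)]
    moves += [("vertical", k, D[i - k][j] + gap_penalty(k))
              for k in range(1, i + 1)]
    return moves


def fill(a, b):
    """Fill the global alignment matrix; rows follow a, columns follow b."""
    n, m = len(a), len(b)
    D = [[0] * (m + 1) for _ in range(n + 1)]
    for j in range(1, m + 1):
        D[0][j] = gap_penalty(j)      # row ^: a prefix of b against a gap
    for i in range(1, n + 1):
        D[i][0] = gap_penalty(i)      # column ^: a prefix of a against a gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            D[i][j] = max(score for _, _, score in candidates(D, i, j, a, b))
    return D


a, b = "AGTAGTCAAG", "AGAAGTAAG"
D = fill(a, b)
for move in candidates(D, 2, 2, a, b):   # cell (G,G) is row 2, column 2
    print(move)                          # (direction, gap length, score)
print(D[2][2])  # 2, entered from the diagonal cell (A,A)
```

      Because the gap penalty depends on the gap length, each cell has to look back along its whole row and column, not only at its three neighbors.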
      To determine the final alignment, find the highest score in the last row of the matrix and begin there. Follow the arrows back to the top of the matrix to obtain the best global alignment.
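
      Following the arrows can also be sketched in Python. This is a sketch under assumptions: rows follow the first sequence, each arrow is stored as the coordinates of the cell it points to, and the names fill_with_arrows and traceback are illustrative. The sketch starts from the bottom-right cell, which for these two sequences is also the highest score in the last row.

```python
def gap_penalty(k):
    """Penalty for one gap of k consecutive positions: w(k) = -1 - 2k."""
    return -1 - 2 * k


def fill_with_arrows(a, b, match=1, mismatch=-1):
    """Fill the score matrix D and record, for each cell, the cell it was
    entered from (the arrow)."""
    n, m = len(a), len(b)
    D = [[0] * (m + 1) for _ in range(n + 1)]
    arrow = [[None] * (m + 1) for _ in range(n + 1)]
    for j in range(1, m + 1):
        D[0][j], arrow[0][j] = gap_penalty(j), (0, 0)
    for i in range(1, n + 1):
        D[i][0], arrow[i][0] = gap_penalty(i), (0, 0)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            options = [(D[i - 1][j - 1] + sub, (i - 1, j - 1))]
            options += [(D[i][j - k] + gap_penalty(k), (i, j - k))
                        for k in range(1, j + 1)]
            options += [(D[i - k][j] + gap_penalty(k), (i - k, j))
                        for k in range(1, i + 1)]
            D[i][j], arrow[i][j] = max(options, key=lambda o: o[0])
    return D, arrow


def traceback(a, b, arrow, gap_char="_"):
    """Follow the arrows from the bottom-right cell back to cell (^,^)."""
    top, bottom = [], []
    i, j = len(a), len(b)
    while (i, j) != (0, 0):
        pi, pj = arrow[i][j]
        if pi == i:                       # horizontal move: gap in a
            top.append(gap_char * (j - pj))
            bottom.append(b[pj:j])
        elif pj == j:                     # vertical move: gap in b
            top.append(a[pi:i])
            bottom.append(gap_char * (i - pi))
        else:                             # diagonal move: a letter pair
            top.append(a[pi])
            bottom.append(b[pj])
        i, j = pi, pj
    # Pieces were collected from the end of the alignment backwards.
    return "".join(reversed(top)), "".join(reversed(bottom))


a, b = "AGTAGTCAAG", "AGAAGTAAG"
D, arrow = fill_with_arrows(a, b)
top, bottom = traceback(a, b, arrow)
print(D[-1][-1])  # 4
print(top)        # AGTAGTCAAG
print(bottom)     # AGAAGT_AAG
```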

  2. Find the best local alignment between the amino acid sequences:

    MAGSPVSSYS
    AGPESPHSAY


    using the BLOSUM 62 score matrix (which can be found in the notes of the 3/6 lecture) and a gap penalty w(k) = -1 - 2k.

    Solution:
    The best local alignment is

    AG__SPVSSY
    AGPESPHSAY


    To arrive at the best local alignment the following matrix was constructed:

    The scores entered in the cells of this matrix were derived by the method described for the global alignment, with two exceptions: any cell whose score would be negative is given a score of 0 instead, and the scores for matches and mismatches are taken from the BLOSUM 62 scoring matrix.
    To determine the final alignment, first find the cell with the highest score in the matrix. Follow the arrows until a 0 is reached. This will give the best local alignment.
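
    The local procedure can be sketched in Python as a check on the hand-filled matrix. This is a sketch under assumptions: rows follow the first sequence, "_" marks a gap, the names blosum62 and local_align are illustrative, and only the BLOSUM 62 entries needed for these two sequences are typed in (rows M A G S P V Y against columns A G P E S H Y), so the lookup does not cover other amino acids.

```python
COLS = "AGPESHY"   # letters occurring in the second sequence
SUB = {            # BLOSUM 62 entries for letters of the first sequence
    "M": [-1, -3, -2, -2, -1, -2, -1],
    "A": [4, 0, -1, -1, 1, -2, -2],
    "G": [0, 6, -2, -2, 0, -2, -3],
    "S": [1, 0, -1, 0, 4, -1, -2],
    "P": [-1, -2, 7, -1, -1, -2, -3],
    "V": [0, -3, -2, -2, -2, -3, -1],
    "Y": [-2, -3, -3, -2, -2, 2, 7],
}


def blosum62(x, y):
    """Score for x (first sequence) aligned with y (second sequence)."""
    return SUB[x][COLS.index(y)]


def gap_penalty(k):
    """Penalty for one gap of k consecutive positions: w(k) = -1 - 2k."""
    return -1 - 2 * k


def local_align(a, b, gap_char="_"):
    n, m = len(a), len(b)
    H = [[0] * (m + 1) for _ in range(n + 1)]
    arrow = [[None] * (m + 1) for _ in range(n + 1)]
    best, best_cell = 0, (0, 0)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # A score that would be negative becomes 0 and gets no arrow.
            options = [(0, None)]
            options.append((H[i - 1][j - 1] + blosum62(a[i - 1], b[j - 1]),
                            (i - 1, j - 1)))
            options += [(H[i][j - k] + gap_penalty(k), (i, j - k))
                        for k in range(1, j + 1)]
            options += [(H[i - k][j] + gap_penalty(k), (i - k, j))
                        for k in range(1, i + 1)]
            H[i][j], arrow[i][j] = max(options, key=lambda o: o[0])
            if H[i][j] > best:
                best, best_cell = H[i][j], (i, j)
    # Start at the highest score and follow the arrows until a 0 is reached.
    top, bottom = [], []
    i, j = best_cell
    while arrow[i][j] is not None:
        pi, pj = arrow[i][j]
        if pi == i:                       # horizontal move: gap in a
            top.append(gap_char * (j - pj))
            bottom.append(b[pj:j])
        elif pj == j:                     # vertical move: gap in b
            top.append(a[pi:i])
            bottom.append(gap_char * (i - pi))
        else:                             # diagonal move: a letter pair
            top.append(a[pi])
            bottom.append(b[pj])
        i, j = pi, pj
    return best, "".join(reversed(top)), "".join(reversed(bottom))


score, top, bottom = local_align("MAGSPVSSYS", "AGPESPHSAY")
print(score)
print(top)
print(bottom)
```

    When several alignments tie for the highest score, this sketch reports only one of them; which one depends on the order in which the moves are tried.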