STA 4953 Test II
Introduction to Bioinformatics
April 5, 2001, 3:30-4:45 p.m.
Please show your work in detail. Use additional paper as
necessary.
Part I [8 points] Substitution Matrix
The observed relative frequencies of nucleotide pairs in a database of DNA
blocks (the qij values) are:
| A | C | G | T |
A | 0.15 | 0.05 | 0.04 | 0.06 |
C | | 0.10 | 0.05 | 0.11 |
G | | | 0.18 | 0.04 |
T | | | | 0.22 |
Construct a BLOSUM matrix and display it in the table below.
| A | C | G | T |
A | 3 | -2 | -3 | -3 |
C | | 3 | -2 | -1 |
G | | | 3 | -4 |
T | | | | 2 |
The above calculations were derived as follows:
To find the expected frequencies, use the formulas
$$p_{ij} = \begin{cases} p_i^2 & \text{if } i = j \\ 2 p_i p_j & \text{if } i \neq j \end{cases}$$
where $p_i = q_{ii} + \frac{1}{2} \sum_{j \neq i} q_{ij}$.
Expected relative frequencies (pij):
| A | C | G | T |
A | 0.050625 | 0.09225 | 0.11025 | 0.14625 |
C | | 0.042025 | 0.10045 | 0.13325 |
G | | | 0.060025 | 0.15925 |
T | | | | 0.105625 |
The elements of the table should sum to 1.
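As a check on the arithmetic, the expected frequencies can be recomputed from the observed table. The following is a minimal Python sketch; the names `Q`, `marginals`, and `expected` are illustrative choices, not part of the exam.

```python
# Observed pair frequencies q_ij from Part I (upper triangle only).
Q = {
    ("A", "A"): 0.15, ("A", "C"): 0.05, ("A", "G"): 0.04, ("A", "T"): 0.06,
    ("C", "C"): 0.10, ("C", "G"): 0.05, ("C", "T"): 0.11,
    ("G", "G"): 0.18, ("G", "T"): 0.04,
    ("T", "T"): 0.22,
}
BASES = "ACGT"


def marginals(q):
    """Return p_i = q_ii + (sum of q_ij over j != i) / 2 for each base."""
    p = {}
    for i in BASES:
        off_diagonal = sum(
            value for (a, b), value in q.items() if a != b and i in (a, b)
        )
        p[i] = q[(i, i)] + off_diagonal / 2
    return p


def expected(p, pairs):
    """Return p_ij = p_i^2 if i == j, else 2 * p_i * p_j, for each pair."""
    return {
        (a, b): p[a] ** 2 if a == b else 2 * p[a] * p[b]
        for (a, b) in pairs
    }


p = marginals(Q)
e = expected(p, Q.keys())
for pair in sorted(e):
    print(pair, round(e[pair], 6))
print("total:", round(sum(e.values()), 6))
```

The printed total gives a quick way to confirm that the ten entries sum to 1.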
To find the BLOSUM80 scores, compute
$$2 s_{ij} = \frac{2 \ln(q_{ij} / p_{ij})}{\ln 2} = 2 \log_2 \frac{q_{ij}}{p_{ij}}$$
and round to the nearest integer.
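The scoring step can be sketched in Python as follows. This is an illustrative check, not the method used to prepare the key; the function name `blosum_scores` and the two-letter dictionary keys are assumptions made for the example.

```python
import math

# Observed pair frequencies q_ij from Part I, keyed by two-letter pair.
Q = {
    "AA": 0.15, "AC": 0.05, "AG": 0.04, "AT": 0.06,
    "CC": 0.10, "CG": 0.05, "CT": 0.11,
    "GG": 0.18, "GT": 0.04,
    "TT": 0.22,
}


def blosum_scores(q):
    """Return round(2 * log2(q_ij / p_ij)) for every pair in q."""
    bases = sorted({base for pair in q for base in pair})
    # Marginal frequency: p_i = q_ii + (sum of q_ij over j != i) / 2.
    p = {
        i: q[i + i]
        + sum(v for pair, v in q.items() if pair[0] != pair[1] and i in pair) / 2
        for i in bases
    }
    scores = {}
    for pair, observed in q.items():
        a, b = pair
        # Expected frequency: p_i^2 on the diagonal, 2 * p_i * p_j off it.
        exp = p[a] ** 2 if a == b else 2 * p[a] * p[b]
        scores[pair] = round(2 * math.log2(observed / exp))
    return scores


scores = blosum_scores(Q)
for pair in sorted(scores):
    print(pair, scores[pair])
```

Run on the frequencies from Part I, this yields the same integers as the substitution matrix shown above.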
Part II [8 points] Sequence Alignment
Using the substitution matrix constructed in Part I and the gap penalty
function w(k) = -1 - k, where k is the length of the gap, find the best
local alignment(s) between the following pair of sequences:
To arrive at the best local alignment, the following matrix was
constructed (minus the arrows):
Thus, the best local alignment is
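
The matrix-filling step can be sketched in Python. This is a minimal local-alignment (Smith-Waterman style) recurrence that takes the gap penalty as a general function of gap length, so it runs in cubic time and is meant only for short sequences. The sequences in the usage lines are hypothetical and are not the pair from the question; the names `SCORES`, `gap`, and `local_alignment_score` are likewise illustrative.

```python
# Substitution scores from Part I, keyed by alphabetically ordered pair.
SCORES = {
    "AA": 3, "AC": -2, "AG": -3, "AT": -3,
    "CC": 3, "CG": -2, "CT": -1,
    "GG": 3, "GT": -4,
    "TT": 2,
}


def score(x, y):
    """Look up the symmetric substitution score for bases x and y."""
    return SCORES["".join(sorted(x + y))]


def gap(k):
    """Gap penalty w(k) = -1 - k for a gap of length k."""
    return -1 - k


def local_alignment_score(a, b):
    """Return (best score, filled matrix H) for a local alignment of a and b."""
    n, m = len(a), len(b)
    H = [[0] * (m + 1) for _ in range(n + 1)]
    best = 0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # A cell may restart at 0 or extend the diagonal with a (mis)match.
            candidates = [0, H[i - 1][j - 1] + score(a[i - 1], b[j - 1])]
            # Gap of length k that skips k characters of a.
            candidates += [H[i - k][j] + gap(k) for k in range(1, i + 1)]
            # Gap of length k that skips k characters of b.
            candidates += [H[i][j - k] + gap(k) for k in range(1, j + 1)]
            H[i][j] = max(candidates)
            best = max(best, H[i][j])
    return best, H


# Hypothetical sequences, chosen only to exercise the recurrence.
best, H = local_alignment_score("AACGG", "AAGG")
print(best)
for row in H:
    print(row)
```

Tracing back from the highest-scoring cell until a 0 is reached recovers the alignment itself; that step is omitted here to keep the sketch short.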
Part III [4 points] Genome Project
- What is a genomic library?
- What is an EST (Expressed Sequence Tag)?
- A piece of DNA which carries another piece of DNA by allowing its
replication and selection is called a:
A. Volkswagen
B. nice guy
C. insert
D. vector
E. bacterial virus
- A long DNA sequence made up of the aligned sequences from several smaller
pieces of DNA is called a:
A. genome
B. contig
C. chromosome
D. library
E. RNA
Part IV [2 points] Extra Credit
- Here are the recognition sites and cleavage positions of four
restriction enzymes:
BlaHI | GC|ATGC |
EcoRI | GA|ATTC |
HindIII | AA|GCTT |
BgmII | AC|ATGT |
Which of these enzyme combinations will not create compatible
(complementary) sticky ends?
A. BlaHI and EcoRI
B. EcoRI and HindIII
C. BlaHI and BgmII
D. EcoRI and BgmII
E. None of the above.
- Which is the sequence read from the following gel?
A | C | G | T |
______ | | | |
| | ______ | |
| ______ | | |
| | | ______ |
| | ______ | |
______ | | | |
| | ______ | |
| ______ | | |
| | ______ | |
______ | | | |
| | | ______ |
A. TTAGGGCCTGA
B. TAGCGAGTCGA
C. AGCTGCGGGTA
D. GGGCATGCTGA
E. TAGGGCGTCGA
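
Reading a sequencing gel can be expressed as a short Python sketch. It assumes the usual dideoxy (Sanger) convention, in which the shortest fragments migrate farthest, so the sequence is read from the bottom band to the top band; it also assumes the first row of the table is the top of the gel. The names `ROWS_TOP_TO_BOTTOM` and `read_gel` are illustrative.

```python
# Lane (A, C, G, or T) holding the band in each row of the gel table,
# listed from the top row to the bottom row.
ROWS_TOP_TO_BOTTOM = ["A", "G", "C", "T", "G", "A", "G", "C", "G", "A", "T"]


def read_gel(rows_top_to_bottom):
    """Read the bands from the bottom (shortest fragment) to the top."""
    return "".join(reversed(rows_top_to_bottom))


print(read_gel(ROWS_TOP_TO_BOTTOM))
```

If the gel were drawn with the wells at the bottom instead, the `reversed` call would be dropped.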