STA 4953 (Spring 2001) Exercise 2A

Name:


Full credit will only be given to correct answers with a clear explanation of how they are obtained. Use additional paper as necessary.

The following is the transition probability matrix of a Markov nucleotide sequence. Fill in the blanks.

P =	(	0.2	0.3	0.1	____	)
	(	0.4	____	0.1	0.2	)
	(	0.5	0.1	____	0.2	)
	(	____	0.2	0.3	0.4	)
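
Each row of a transition probability matrix is a conditional probability distribution, so its entries must sum to 1, and a single missing entry in a row is fixed by the others. The sketch below illustrates that idea on a hypothetical three-state row that is not taken from this exercise; the function name `fill_missing` and the use of `None` to mark a blank are illustrative assumptions.

```python
def fill_missing(row):
    """Return a copy of row with its single None entry set so the row sums to 1."""
    missing = [i for i, p in enumerate(row) if p is None]
    if len(missing) != 1:
        raise ValueError("expected exactly one missing entry")
    known = sum(p for p in row if p is not None)
    filled = list(row)
    # Round to suppress floating-point noise in the subtraction.
    filled[missing[0]] = round(1.0 - known, 10)
    return filled


# Hypothetical row for illustration only (not a row of P above).
example_row = [0.5, None, 0.2]
print(fill_missing(example_row))
```

The same check applied row by row also confirms that a completed matrix is a valid transition matrix.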

For a Markov chain X₀, X₁, X₂, ... with transition probability matrix P as in question 1, suppose the probability distribution of X₀ is

x	f₀(x)
A	1/4
C	1/4
G	1/4
T	1/4

That is, the initial nucleotide is equally likely to be any of the four bases.

Work out the probability distribution of X₁. (Hint: Use the Law of Total Probability. For mutually exclusive and exhaustive events B₁, B₂, ...,

P(E) = Σᵢ P(E and Bᵢ)
= Σᵢ P(Bᵢ) P(E | Bᵢ).)

x	f₁(x)
A
C
G
T
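
Applied to a Markov chain, the Law of Total Probability conditions on the previous state: the probability of being in state y after one step is the sum over x of (probability of starting in x) times (probability of moving from x to y). A minimal sketch of that computation follows, using a hypothetical two-state chain rather than the matrix from this exercise; the names `step`, `P_demo`, and `f0_demo` are illustrative assumptions.

```python
def step(f, P):
    """One step of the chain: f_next[y] = sum over x of f[x] * P[x][y]."""
    n = len(f)
    return [sum(f[x] * P[x][y] for x in range(n)) for y in range(n)]


# Hypothetical two-state transition matrix (rows sum to 1); not the P above.
P_demo = [[0.9, 0.1],
          [0.4, 0.6]]
f0_demo = [0.5, 0.5]  # hypothetical initial distribution

f1_demo = step(f0_demo, P_demo)
print(f1_demo)
```

Here the first entry is 0.5 × 0.9 + 0.5 × 0.4 and the second is 0.5 × 0.1 + 0.5 × 0.6, one term for each possible starting state.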

Then also work out the probability distribution of X₂.

x	f₂(x)
A
C
G
T

Can you suggest a method for finding the probability distribution of Xₙ?
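
One possible approach, offered as a sketch rather than the intended answer, is to repeat the one-step calculation n times; when the same matrix P governs every step, this is equivalent to multiplying the initial distribution, written as a row vector, by the n-th power of P. The example uses a hypothetical two-state chain, and the name `distribution_after` is an illustrative assumption.

```python
def distribution_after(f0, P, n):
    """Distribution of X_n, obtained by applying the one-step update n times."""
    f = list(f0)
    size = len(f)
    for _ in range(n):
        # Law of Total Probability, conditioning on the previous state x.
        f = [sum(f[x] * P[x][y] for x in range(size)) for y in range(size)]
    return f


# Hypothetical two-state chain; not the matrix from this exercise.
P_demo = [[0.9, 0.1],
          [0.4, 0.6]]
start = [1.0, 0.0]  # chain starts in the first state with certainty

for n in range(4):
    print(n, distribution_after(start, P_demo, n))
```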

Construct a Markov chain model for a nucleotide sequence generated according to these rules:

(i). The present nucleotide is equally likely to be A, C, G, or T if the preceding two nucleotides are identical.

(ii). The present nucleotide is twice as likely to be C or G as A or T if the preceding two nucleotides are different. Furthermore, when choosing between C and G, or between A and T, the purine (G or A, respectively) is used 60% of the time.

Write out its transition probability matrix.
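
When the next symbol depends on the two preceding symbols, a common device is to take ordered pairs of consecutive symbols as the states, so that the pair (a, b) can only move to a pair of the form (b, c). The sketch below shows that construction on a hypothetical two-letter alphabet with a made-up rule; the alphabet, the probabilities 0.7 and 0.3, and the function names are all illustrative assumptions and not the rules of this exercise.

```python
from itertools import product

ALPHABET = ["0", "1"]  # hypothetical two-letter alphabet, not nucleotides


def next_symbol_probs(prev2, prev1):
    """Hypothetical second-order rule: P(next symbol | two preceding symbols)."""
    if prev2 == prev1:
        return {"0": 0.5, "1": 0.5}
    # Made-up rule: after two different symbols, repeat the latest with prob. 0.7.
    return {s: (0.7 if s == prev1 else 0.3) for s in ALPHABET}


# States of the first-order chain are ordered pairs (X_{n-1}, X_n).
states = list(product(ALPHABET, repeat=2))


def pair_transition_matrix():
    """Build the transition matrix of the chain on pairs."""
    P = [[0.0] * len(states) for _ in states]
    for i, (a, b) in enumerate(states):
        for c, p in next_symbol_probs(a, b).items():
            j = states.index((b, c))  # (a, b) can only move to some (b, c)
            P[i][j] = p
    return P


for state, row in zip(states, pair_transition_matrix()):
    print("".join(state), row)
```

With four nucleotides the same construction gives 16 pair states, and each row has at most four nonzero entries.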