Information Entropy Analysis of the H1N1 Genetic Code
ORAL
Abstract
During the current H1N1 pandemic, viral samples are being obtained from large numbers of infected people worldwide, and the resulting sequences are being deposited in the NCBI Influenza Virus Resource database. The information entropy of the sequences was computed from the probability of occurrence of each nucleotide base at every position of each set of sequences, using Shannon's definition of information entropy,
\[ H = \sum\limits_b p_b \,\log_2\!\left( \frac{1}{p_b} \right) \]
where $H$ is the observed information entropy at each nucleotide position and $p_b$ is the probability of occurrence of base $b \in \{\mathrm{A, C, G, U}\}$ at that position. The information entropy of the current pandemic H1N1 sequences is compared to that of reference human and swine H1N1 sequences. As expected, the current pandemic H1N1 is in a low-entropy state and therefore has a very large mutation potential. Applying the entropy method to mature genes, we can identify low-entropy nucleotide regions that generally correspond to critical protein function.
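The per-position entropy described above can be computed directly from an alignment of the sequences. The following is a minimal sketch in Python, assuming the sequences have already been aligned and trimmed to equal length; the function name \texttt{positional\_entropy} and the toy sequences are illustrative and not taken from the original analysis.

\begin{verbatim}
import math
from collections import Counter

def positional_entropy(sequences):
    """Shannon entropy (bits) at each position of a set of
    aligned, equal-length nucleotide sequences."""
    length = len(sequences[0])
    entropies = []
    for pos in range(length):
        # Count how often each base occurs at this position.
        counts = Counter(seq[pos] for seq in sequences)
        total = sum(counts.values())
        h = 0.0
        for base, count in counts.items():
            p = count / total
            h += p * math.log2(1.0 / p)   # p_b * log2(1/p_b)
        entropies.append(h)
    return entropies

# Hypothetical toy alignment: three short sequences that
# differ only at the second position.
seqs = ["ACGU", "ACGU", "AUGU"]
print(positional_entropy(seqs))  # [0.0, 0.918..., 0.0, 0.0]
\end{verbatim}

A fully conserved position yields zero entropy, while a position that varies across the sample set yields a positive value, up to 2 bits when all four bases are equally likely.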
Authors
-
Andy Martwick
Portland State University