GS 010133/BME 387J Statistical Methods in Bioinformatics

Course Description

Recent advances in high-throughput experimental technologies have generated enormous amounts of data and provided valuable resources for studying gene sequences, gene expression, and biological interactions at a genome-wide scale. Robust statistical models and efficient computational methods are needed to take full advantage of this rapid data accumulation. This course introduces students to the concepts and statistical methods for analyzing large-scale biological data generated by emerging genomic and proteomic techniques. The statistical methods covered include dynamic programming, maximum likelihood estimation, Bayesian inference, Hidden Markov Models, Markov chain Monte Carlo, classification, and clustering. Students will master advanced applications of statistical computing in a wide range of biological and biomedical problems, including multiple sequence alignment, biomarker and disease gene identification, and inference of transcriptional regulatory networks, protein interaction networks, and protein functional modules.


This course is designed for graduate students and advanced undergraduate students interested in the emerging areas of biomedical informatics and computational biology. Basic knowledge of statistical inference and algorithms, along with programming experience in R, MATLAB, C, or C++, is expected. Knowledge of biology is a plus but not required.

Recommended Texts

Warren J. Ewens, Gregory Grant: Statistical Methods in Bioinformatics: An Introduction, Springer, 2005

Richard Durbin, Sean R. Eddy, Anders Krogh, Graeme Mitchison: Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids, Cambridge University Press, 1999

Grading: homework (40%), midterm exam (30%), and final project (30%).

Time: Course schedule varies. Please see syllabus for details.

Location: We meet in room ETC 2.146 (Austin) or MSB G.520c (Houston). See syllabus for details.

Instructor: Dr. Yin Liu
Office location: BME Student Advising Office (Austin) or MSB 7.162 (Houston)
Office hours: Wednesday 10:00 am-12:00 pm, or by appointment
Phone: (713) 500-5632

Syllabus (Tentative)

Week  Date   Location    Topic
1     08/26  MSB G.520c  Introduction
2     09/02  MSB G.520c  Sequence analysis I
3     09/09  MSB G.520c  Sequence analysis II
5     09/16  ETC 2.146   Markov chains and Hidden Markov Models
6     09/23  MSB G.520c  Hidden Markov Models and applications
7     09/30  MSB G.520c  Midterm exam
8     10/07  MSB G.520c  Microarray data analysis I
9     10/14  ETC 2.146   Microarray data analysis II
10    10/21  MSB G.520c  Bayesian inference
11    10/28  MSB G.520c  Network biology
12    11/04  MSB G.520c  Data integration from multiple sources
13    11/11  ETC 2.146   Network reconstruction and pathway analysis
14    11/18  MSB G.520c  Bayesian inference and its application in network reconstruction
15    12/02  MSB G.520c  Wrap-up and final presentations

Additional Reading and Resources