To search for papers
Possible Topics and some papers for you to start with
If you decide to write a review on a specific topic, make sure you read the representative papers/methods listed below as well as other relevant papers. For a mini-review paper, I expect it to cite at least 20-30 papers as references.
Multiple sequence alignment
- Edgar RC, Batzoglou S (2006) Multiple sequence alignment. Curr Opin Struct Biol 16: 368–373.
- Edgar RC (2004) MUSCLE: A multiple sequence alignment method with reduced time and space complexity. BMC Bioinformatics 5: 113.
- Do CB, Mahabhashyam MS, Brudno M, Batzoglou S (2005) ProbCons: Probabilistic consistency-based multiple sequence alignment. Genome Res 15: 330–340.
- Katoh K, Kuma K, Toh H, Miyata T (2005) MAFFT version 5: Improvement in accuracy of multiple sequence alignment. Nucleic Acids Res 33: 511–518.
- Notredame C, Higgins DG, Heringa J (2000) T-Coffee: A novel method for fast and accurate multiple sequence alignment. J Mol Biol 302: 205–217.
- Wallace IM, O’Sullivan O, Higgins DG, Notredame C (2006) M-Coffee: Combining multiple sequence alignment methods with T-Coffee. Nucleic Acids Res 34: 1692–1699.
- Pei J, Sadreyev R, Grishin NV (2003) PCMA: Fast and accurate multiple sequence alignment based on profile consistency. Bioinformatics 19: 427–428.
- Pei J, Grishin NV (2006) MUMMALS: Multiple sequence alignment improved by using hidden Markov models with local structural information. Nucleic Acids Res 34: 4364–4374.
- Raphael B, Zhi D, Tang H, Pevzner P (2004) A novel method for multiple alignment of sequences with repeated and shuffled elements. Genome Res 14: 2336–2346.
Application of Hidden Markov models
- Nguyen C, Gardiner KJ, Cios KJ. A hidden Markov model for predicting protein interfaces. J Bioinform Comput Biol. 2007 Jun;5(3):739-53.
- Cui X, Vinar T, Brejova B, Shasha D, Li M. Homology search for genes. Bioinformatics. 2007 Jul 1;23(13):i97-103.
- Bagos PG, Liakopoulos TD, Hamodrakas SJ. Algorithms for incorporating prior topological information in HMMs: application to transmembrane proteins. BMC Bioinformatics. 2006 Apr 5;7:189.
- Aydin Z, Altunbasak Y, Borodovsky M. Protein secondary structure prediction for a single-sequence using hidden semi-Markov models. BMC Bioinformatics. 2006 Mar 30;7:178.
Application of array technologies
- Probe design
- Gasieniec L, Li CY, Sant P, Wong PW. J Theor Biol. 2007 Oct 7;248(3):512-21.
- Tomiuk S, Hofmann K. Microarray probe selection strategies. Brief Bioinform. 2001 Dec;2(4):329-40.
- L. Kaderali and A. Schliep, Selecting signature oligonucleotides to identify organisms using DNA arrays, Bioinformatics 18 (2002), pp. 1340–1349.
- DE (Differentially Expressed) gene detection methods
- Tusher VG, Tibshirani R, Chu G: Significance analysis of microarrays applied to the ionizing radiation response. Proc Natl Acad Sci U S A 2001, 98(9):5116-5121.
- Benjamini Y, Yekutieli D: The control of the false discovery rate in multiple testing under dependency. The Annals of Statistics 2001, 29(4):1165-1188.
- Cui X, Churchill GA: Statistical tests for differential expression in cDNA microarray experiments. Genome Biol 2003, 4(4):210.
- Dudoit S, Shaffer JP, Boldrick JC: Multiple Hypothesis Testing in Microarray Experiments. Statistical Science 2003, 18(1):71-103.
- McLachlan GJ, Bean RW, Jones LB: A simple implementation of a normal mixture approach to differential gene expression in multiclass microarrays. Bioinformatics 2006, 22(13):1608-1615.
- Storey JD, Dai JY, Leek JT: The optimal discovery procedure for large-scale significance testing, with applications to comparative microarray experiments. Biostatistics 2007, 8(2):414-432.
- Pearson RD: A comprehensive re-analysis of the Golden Spike data: towards a benchmark for differential expression methods. BMC Bioinformatics 2008, 9:164.
- Tiling array
- A. Schliep and R. Krause, Efficient Computational Design of Tiling Arrays Using a Shortest Path Approach, Algorithms in Bioinformatics, Pages 383-394, Springer Berlin / Heidelberg, 2007
- Graf S, Nielsen FG, Kurtz S, Huynen MA, Birney E, Stunnenberg H, Flicek P. Optimized design and assessment of whole genome tiling arrays. Bioinformatics. 2007 Jul 1;23(13):i195-204.
- Sharp AJ, Itsara A, Cheng Z, Alkan C, Schwartz S, Eichler EE. Optimal design of oligonucleotide microarrays for measurement of DNA copy-number. Hum Mol Genet. 2007 Nov 15;16(22):2770-9.
- Bertone P, Trifonov V, Rozowsky JS, Schubert F, Emanuelsson O, Karro J, Kao MY, Snyder M, Gerstein M. Design optimization methods for genomic DNA tiling arrays. Genome Res. 2006 Feb;16(2):271-81.
- Huber W, Toedling J, Steinmetz LM. Transcript mapping with high-density oligonucleotide tiling arrays. Bioinformatics. 2006 Aug 15;22(16):1963-70.
- Exon array
- Xing Y, Kapur K, Wong WH. Probe selection and expression index computation of Affymetrix Exon Arrays. PLoS ONE. 2006 Dec 20;1:e88.
- Lee C, Wang Q. Bioinformatics analysis of alternative splicing. Brief Bioinform. 2005 Mar;6(1):23-33.
- CGH array
- Munch K, Gardner PP, Arctander P, Krogh A. A hidden Markov model approach for determining expression from genomic tiling micro arrays. BMC Bioinformatics. 2006 May 3;7:239.
- Picard F, Robin S, Lebarbier E, Daudin JJ. A segmentation/clustering model for the analysis of array CGH data. Biometrics. 2007 Sep;63(3):758-66.
- Stjernqvist S, Ryden T, Skold M, Staaf J. Continuous-index hidden Markov modelling of array CGH copy number data. Bioinformatics. 2007 Apr 15;23(8):1006-14.
- Shi Y, Klustein M, Simon I, Mitchell T, Bar-Joseph Z. Continuous hidden process model for time series expression experiments. Bioinformatics. 2007 Jul 1;23(13):i459-67.
- Rueda OM, Diaz-Uriarte R. Flexible and accurate detection of genomic copy-number changes from aCGH. PLoS Comput Biol. 2007 Jun 22;3(6):e122. Epub 2007 May 16.
- Gaeta BA, Malming HR, Jackson KJ, Bain ME, Wilson P, Collins AM. iHMMune-align: hidden Markov model-based alignment and identification of germline genes in rearranged immunoglobulin gene sequences. Bioinformatics. 2007 Jul 1;23(13):1580-7.
- Hu J, Gao JB, Cao Y, Bottinger E, Zhang W. Exploiting noise in array CGH data to improve detection of DNA copy number change. Nucleic Acids Res. 2007;35(5):e35.
- Marioni JC, Thorne NP, Tavare S. BioHMM: a heterogeneous hidden Markov model for segmenting array CGH data. Bioinformatics. 2006 May 1;22(9):1144-6.
Next-generation Sequencing Data Analysis
- Euskirchen GM, Rozowsky JS, Wei CL, Lee WH, Zhang ZD, Hartman S, Emanuelsson O, Stolc V, Weissman S, Gerstein MB et al: Mapping of transcription factor binding regions in mammalian cells by ChIP: comparison of array- and sequencing-based technologies. Genome Res 2007, 17(6):898-909.
- Ji H, Jiang H, Ma W, Johnson DS, Myers RM, Wong WH: An integrated software system for analyzing ChIP-chip and ChIP-seq data. Nat Biotechnol 2008, 26(11):1293-1300.
- Johnson DS, Mortazavi A, Myers RM, Wold B: Genome-wide mapping of in vivo protein-DNA interactions. Science 2007, 316(5830):1497-1502.
- Robertson G, Hirst M, Bainbridge M, Bilenky M, Zhao Y, Zeng T, Euskirchen G, Bernier B, Varhol R, Delaney A et al: Genome-wide profiles of STAT1 DNA association using chromatin immunoprecipitation and massively parallel sequencing. Nat Methods 2007, 4(8):651-657.
- Valouev A, Johnson DS, Sundquist A, Medina C, Anton E, Batzoglou S, Myers RM, Sidow A: Genome-wide analysis of transcription factor binding sites based on ChIP-Seq data. Nat Methods 2008, 5(9):829-834.
- Zhang Y, Liu T, Meyer CA, Eeckhoute J, Johnson DS, Bernstein BE, Nussbaum C, Myers RM, Brown M, Li W et al: Model-based analysis of ChIP-Seq (MACS). Genome Biol 2008, 9(9):R137.
Application of Bayesian inference in Network Biology
- Jansen R, Yu H, Greenbaum D, Kluger Y, Krogan NJ, Chung S, Emili A, Snyder M, Greenblatt JF, and Gerstein M. A Bayesian networks approach for predicting protein-protein interactions from genomic data. Science, 302:449-453, 2003.
- Friedman N. Inferring cellular networks using probabilistic graphical models. Science, 303: 799-805, 2004.
- Kim I, Liu Y, and Zhao H. Bayesian methods for predicting interacting protein pairs using domain information. Biometrics, 63: 824-33, 2007.
- Qi Y, Missiuro PE, Kapoor A, Hunter CP, Jaakkola TS, Gifford DK, and Ge H. Semi-supervised analysis of gene expression profiles for lineage-specific development in the Caenorhabditis elegans embryo. Bioinformatics, 22:e417-423, 2006.
- Schäfer J, and Strimmer K. An empirical Bayes approach to inferring large-scale gene association networks. Bioinformatics, 21: 754-764, 2005.
- Werhli AV, Grzegorczyk M, and Husmeier D. Comparative evaluation of reverse engineering gene regulatory networks with relevance networks, graphical Gaussian models and Bayesian networks. Bioinformatics, 22: 2523-2531, 2006.
- Wilkinson DJ. Bayesian methods in bioinformatics and computational systems biology. Brief Bioinform., 8:109-116, 2007.
Motif finding
- Eden E, Lipson D, Yogev S, Yakhini Z. 2007. Discovering motifs in ranked lists of DNA sequences. PLoS Comput Biol 3 (3):e39.
- Ng P, Keich U. 2008. GIMSAN: a Gibbs motif finder with significance analysis. Bioinformatics 24 (19):2256-2257.
- Chen XY, Hughes TR, Morris Q. 2007. RankMotif++: a motif-search algorithm that accounts for relative ranks of K-mers in binding transcription factors. Bioinformatics 23 (13):i72-i79.
- Frickey T, Weiller G. 2007. Mclip: motif detection based on cliques of gapped local profile-to-profile alignments. Bioinformatics 23 (4):502-503.
- Fratkin E, Naughton BT, Brutlag DL, Batzoglou S. 2006. MotifCut: regulatory motifs finding with maximum density subgraphs. Bioinformatics 22 (14):e150-e157.
- Hon LS, Jain AN. 2006. A deterministic motif finding algorithm with application to the human genome. Bioinformatics 22 (9):1047-1054.
- Leung H, Chin F. 2006. Finding motifs from all sequences with and without binding sites. Bioinformatics 22 (18):2217-2223.
- Mendes ND, Casimiro AC, Santos PM, Sa-Correia I, Oliveira AL, Freitas AT. 2006. MUSA: a parameter free algorithm for the identification of biologically significant motifs. Bioinformatics 22 (24):2996-3002.
- Newberg LA, Thompson WA, Conlan S, Smith TM, McCue LA, and Lawrence CE. (2007) A phylogenetic Gibbs sampler that yields centroid solutions for cis regulatory site prediction. Bioinformatics 23(14):1718-27.
- Down TA, Bergman CM, Su J, Hubbard TJ. Large-scale discovery of promoter motifs in Drosophila melanogaster. PLoS Comput Biol. 2007 Jan 19;3(1):e7.
- Sabatti C, Rohlin L, Lange K, Liao JC. Vocabulon: a dictionary model approach for reconstruction and localization of transcription factor binding sites. Bioinformatics. 2005 Apr 1;21(7):922-31.
- Bussemaker HJ, Li H, Siggia ED. Building a dictionary for genomes: identification of presumptive regulatory sites by statistical analysis. Proc Natl Acad Sci U S A. 2000 Aug 29;97(18):10096-100.
- Elemento O, Tavazoie S (2005) Fast and systematic genome-wide discovery of conserved regulatory elements using a non-alignment based approach. Genome Biol 6: R18.
Protein interaction networks / complex prediction / Data integration
- Hart GT, Ramani AK and Marcotte EM (2006) How complete are current yeast and human protein-interaction networks?, Genome Biol, 7, 120.
- Qi YA, Ge H. Modularity and dynamics of cellular networks. PLoS Comput Biol 2006, 2(12):1502-1510.
- Gavin AC, Aloy P, Grandi P, Krause R, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature, 440:631-636, 2006.
- Huang H, Jedynak BM, and Bader JS. Where have all the interactions gone? Estimating the coverage of two-hybrid protein interaction maps. PLoS Comput Biol., 3:e214, 2007.
- Krogan NJ, Cagney G, Yu H, Zhong G, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature, 440:637-643, 2006.
- Scholtens D, Vidal M, and Gentleman R. Local modeling of global interactome networks. Bioinformatics, 21:3548-3557, 2005.
- Spirin V and Mirny LA. Protein complexes and functional modules in molecular networks. Proc Natl Acad Sci USA, 100:12123-12128, 2003.
- Chu W, Ghahramani Z, Krause R, and Wild DL. Identifying protein complexes in high-throughput protein interaction screens using an infinite latent feature model. Pac Symp Biocomput., 2006:231-42, 2006.
- Qi Y, Bar-Joseph Z, and Klein-Seetharaman J. Evaluation of different biological data and computational classification methods for use in protein interaction prediction. Proteins, 63: 490-500, 2006.