References
==========

.. _cock2009:

[1] Peter J. A. Cock, Tiago Antao, Jeffrey T. Chang, Brad A. Chapman, Cymon J. Cox, Andrew Dalke, Iddo Friedberg, Thomas Hamelryck, Frank Kauff, Bartek Wilczynski, Michiel J. L. de Hoon: “Biopython: freely available Python tools for computational molecular biology and bioinformatics”. *Bioinformatics* **25** (11): 1422–1423 (2009). `doi:10.1093/bioinformatics/btp163 <https://doi.org/10.1093/bioinformatics/btp163>`__

.. _pritchard2006:

[2] Leighton Pritchard, Jennifer A. White, Paul R. J. Birch, Ian K. Toth: “GenomeDiagram: a python package for the visualization of large-scale genomic data”. *Bioinformatics* **22** (5): 616–617 (2006). `doi:10.1093/bioinformatics/btk021 <https://doi.org/10.1093/bioinformatics/btk021>`__

.. _toth2006:

[3] Ian K. Toth, Leighton Pritchard, Paul R. J. Birch: “Comparative genomics reveals what makes an enterobacterial plant pathogen”. *Annual Review of Phytopathology* **44**: 305–336 (2006). `doi:10.1146/annurev.phyto.44.070505.143444 <https://doi.org/10.1146/annurev.phyto.44.070505.143444>`__

.. _vanderauwera2009:

[4] Géraldine A. van der Auwera, Jaroslaw E. Król, Haruo Suzuki, Brian Foster, Rob van Houdt, Celeste J. Brown, Max Mergeay, Eva M. Top: “Plasmids captured in C. metallidurans CH34: defining the PromA family of broad-host-range plasmids”. *Antonie van Leeuwenhoek* **96** (2): 193–204 (2009). `doi:10.1007/s10482-009-9316-9 <https://doi.org/10.1007/s10482-009-9316-9>`__

.. _proux2002:

[5] Caroline Proux, Douwe van Sinderen, Juan Suarez, Pilar Garcia, Victor Ladero, Gerald F. Fitzgerald, Frank Desiere, Harald Brüssow: “The dilemma of phage taxonomy illustrated by comparative genomics of Sfi21-Like Siphoviridae in lactic acid bacteria”. *Journal of Bacteriology* **184** (21): 6026–6036 (2002). `http://dx.doi.org/10.1128/JB.184.21.6026-6036.2002 <http://dx.doi.org/10.1128/JB.184.21.6026-6036.2002>`__

.. _jupe2012:

[6] Florian Jupe, Leighton Pritchard, Graham J. Etherington, Katrin MacKenzie, Peter JA Cock, Frank Wright, Sanjeev Kumar Sharma, Dan Bolser, Glenn J Bryan, Jonathan DG Jones, Ingo Hein: “Identification and localisation of the NB-LRR gene family within the potato genome”. *BMC Genomics* **13**: 75 (2012).
`http://dx.doi.org/10.1186/1471-2164-13-75 <http://dx.doi.org/10.1186/1471-2164-13-75>`__

.. _cock2010:

[7] Peter J. A. Cock, Christopher J. Fields, Naohisa Goto, Michael L. Heuer, Peter M. Rice: “The Sanger FASTQ file format for sequences with quality scores, and the Solexa/Illumina FASTQ variants”. *Nucleic Acids Research* **38** (6): 1767–1771 (2010). `doi:10.1093/nar/gkp1137 <https://doi.org/10.1093/nar/gkp1137>`__

[8] Patrick O. Brown, David Botstein: “Exploring the new world of the genome with DNA microarrays”. *Nature Genetics* **21** (Supplement 1): 33–37 (1999). `doi:10.1038/4462 <https://doi.org/10.1038/4462>`__

.. _talevich2012:

[9] Eric Talevich, Brandon M. Invergo, Peter J. A. Cock, Brad A. Chapman: “Bio.Phylo: A unified toolkit for processing, analyzing and visualizing phylogenetic trees in Biopython”. *BMC Bioinformatics* **13**: 209 (2012). `doi:10.1186/1471-2105-13-209 <https://doi.org/10.1186/1471-2105-13-209>`__

.. _cornish1985:

[10] Athel Cornish-Bowden: “Nomenclature for incompletely specified bases in nucleic acid sequences: Recommendations 1984.” *Nucleic Acids Research* **13** (9): 3021–3030 (1985). `doi:10.1093/nar/13.9.3021 <https://doi.org/10.1093/nar/13.9.3021>`__

.. _cavener1987:

[11] Douglas R. Cavener: “Comparison of the consensus sequence flanking translational start sites in Drosophila and vertebrates.” *Nucleic Acids Research* **15** (4): 1353–1361 (1987). `doi:10.1093/nar/15.4.1353 <https://doi.org/10.1093/nar/15.4.1353>`__

.. _bailey1994:

[12] Timothy L. Bailey and Charles Elkan: “Fitting a mixture model by expectation maximization to discover motifs in biopolymers”. *Proceedings of the Second International Conference on Intelligent Systems for Molecular Biology*, 28–36. AAAI Press, Menlo Park, California (1994).

.. _chapman2000:

[13] Brad Chapman and Jeff Chang: “Biopython: Python tools for computational biology”. *ACM SIGBIO Newsletter* **20** (2): 15–19 (August 2000).

.. _dehoon2004:

[14] Michiel J. L. de Hoon, Seiya Imoto, John Nolan, Satoru Miyano: “Open source clustering software”. *Bioinformatics* **20** (9): 1453–1454 (2004). `doi:10.1093/bioinformatics/bth078 <https://doi.org/10.1093/bioinformatics/bth078>`__

[15] Michael B. Eisen, Paul T. Spellman, Patrick O.
Brown, David Botstein: “Cluster analysis and display of genome-wide expression patterns”. *Proceedings of the National Academy of Sciences USA* **95** (25): 14863–14868 (1998). `doi:10.1073/pnas.96.19.10943-c <https://doi.org/10.1073/pnas.96.19.10943-c>`__

.. _golub1971:

[16] Gene H. Golub, Christian Reinsch: “Singular value decomposition and least squares solutions”. In *Handbook for Automatic Computation*, **2**, (Linear Algebra) (J. H. Wilkinson and C. Reinsch, eds), 134–151. New York: Springer-Verlag (1971).

.. _golub1989:

[17] Gene H. Golub, Charles F. Van Loan: *Matrix computations*, 2nd edition (1989).

.. _hamelryck2003a:

[18] Thomas Hamelryck and Bernard Manderick: “PDB parser and structure class implemented in Python”. *Bioinformatics* **19** (17): 2308–2310 (2003). `doi:10.1093/bioinformatics/btg299 <https://doi.org/10.1093/bioinformatics/btg299>`__

.. _hamelryck2003b:

[19] Thomas Hamelryck: “Efficient identification of side-chain patterns using a multidimensional index tree”. *Proteins* **51** (1): 96–108 (2003). `doi:10.1002/prot.10338 <https://doi.org/10.1002/prot.10338>`__

.. _hamelryck2005:

[20] Thomas Hamelryck: “An amino acid has two sides; A new 2D measure provides a different view of solvent exposure”. *Proteins* **59** (1): 29–48 (2005). `doi:10.1002/prot.20379 <https://doi.org/10.1002/prot.20379>`__

[21] John A. Hartigan: *Clustering algorithms*. New York: Wiley (1975).

[22] Anil K. Jain, Richard C. Dubes: *Algorithms for clustering data*. Englewood Cliffs, N.J.: Prentice Hall (1988).

.. _kachitvichyanukul1988:

[23] Voratas Kachitvichyanukul, Bruce W. Schmeiser: “Binomial Random Variate Generation”. *Communications of the ACM* **31** (2): 216–222 (1988). `doi:10.1145/42372.42381 <https://doi.org/10.1145/42372.42381>`__

.. _kohonen1997:

[24] Teuvo Kohonen: *Self-organizing maps*, 2nd edition. Berlin; New York: Springer-Verlag (1997).

.. _lecuyer1988:

[25] Pierre L’Ecuyer: “Efficient and Portable Combined Random Number Generators.” *Communications of the ACM* **31** (6): 742–749,774 (1988). `doi:10.1145/62959.62969 <https://doi.org/10.1145/62959.62969>`__

.. _majumdar2005:

[26] Indraneel Majumdar, S. Sri Krishna, Nick V.
Grishin: “PALSSE: A program to delineate linear secondary structural elements from protein structures.” *BMC Bioinformatics* **6**: 202 (2005). `doi:10.1186/1471-2105-6-202 <https://doi.org/10.1186/1471-2105-6-202>`__

.. _matys2003:

[27] V. Matys, E. Fricke, R. Geffers, E. Gößling, M. Haubrock, R. Hehl, K. Hornischer, D. Karas, A. E. Kel, O. V. Kel-Margoulis, D. U. Kloos, S. Land, B. Lewicki-Potapov, H. Michael, R. Münch, I. Reuter, S. Rotert, H. Saxel, M. Scheer, S. Thiele, E. Wingender: “TRANSFAC: transcriptional regulation, from patterns to profiles.” *Nucleic Acids Research* **31** (1): 374–378 (2003). `doi:10.1093/nar/gkg108 <https://doi.org/10.1093/nar/gkg108>`__

[28] Robin Sibson: “SLINK: An optimally efficient algorithm for the single-link cluster method”. *The Computer Journal* **16** (1): 30–34 (1973). `doi:10.1093/comjnl/16.1.30 <https://doi.org/10.1093/comjnl/16.1.30>`__

.. _snedecor1989:

[29] George W. Snedecor, William G. Cochran: *Statistical methods*. Ames, Iowa: Iowa State University Press (1989).

.. _tamayo1999:

[30] Pablo Tamayo, Donna Slonim, Jill Mesirov, Qing Zhu, Sutisak Kitareewan, Ethan Dmitrovsky, Eric S. Lander, Todd R. Golub: “Interpreting patterns of gene expression with self-organizing maps: Methods and application to hematopoietic differentiation”. *Proceedings of the National Academy of Sciences USA* **96** (6): 2907–2912 (1999). `doi:10.1073/pnas.96.6.2907 <https://doi.org/10.1073/pnas.96.6.2907>`__

[31] Robert C. Tryon, Daniel E. Bailey: *Cluster analysis*. New York: McGraw-Hill (1970).

[32] John W. Tukey: *Exploratory data analysis*. Reading, Mass.: Addison-Wesley Pub. Co. (1977).

.. _yeung2001:

[33] Ka Yee Yeung, Walter L. Ruzzo: “Principal Component Analysis for clustering gene expression data”. *Bioinformatics* **17** (9): 763–774 (2001). `doi:10.1093/bioinformatics/17.9.763 <https://doi.org/10.1093/bioinformatics/17.9.763>`__

[34] Alok Saldanha: “Java Treeview—extensible visualization of microarray data”. *Bioinformatics* **20** (17): 3246–3248 (2004). `http://dx.doi.org/10.1093/bioinformatics/bth349 <http://dx.doi.org/10.1093/bioinformatics/bth349>`__