Nucleotide Sequence Databases (the principal ones)
- NCBI - National Center for Biotechnology Information
- EBI - European Bioinformatics Institute
- DDBJ - DNA Data Bank of Japan
Protein Sequence Databases
- SWISS-PROT & TrEMBL - protein sequence database and its computer-annotated supplement
- UniProt - UniProt (Universal Protein Resource) is the world's most comprehensive catalog of information on proteins. It is a central repository of protein sequence and function created by joining the information contained in Swiss-Prot, TrEMBL, and PIR.
- PIR - Protein Information Resource
- MIPS - Munich Information centre for Protein Sequences
- HUPO - HUman Proteome Organization
Database Searching by Sequence Similarity
Sequence Alignment
- USC Sequence Alignment Server - align 2 sequences with all possible varieties of dynamic programming
- T-COFFEE - multiple sequence alignment
- ClustalW @ EBI - multiple sequence alignment
- MSA 2.1 - optimal multiple sequence alignment using the Carrillo-Lipman method
- BOXSHADE - pretty printing and shading of multiple alignments
- Splign - a utility for computing cDNA-to-genomic, or spliced, sequence alignments. At the heart of the program is a global alignment algorithm that specifically accounts for introns and splice signals.
- Spidey - an mRNA-to-genomic alignment program
- SIM4 - a program to align cDNA and genomic DNA (my personal favorite!)
- Wise2 - align a protein or profile HMM against genomic sequence to predict a gene structure, and related tools
- PipMaker - computes alignments of similar regions in two (long) DNA sequences (yet another of my favorites!)
- VISTA - align + detect conserved regions in long genomic sequences
- myGodzilla - align a sequence to its ortholog in the human genome
Human Genome Databases
Databases of other Organisms
Genome-wide Analysis
- MBGD - comparative analysis of completely sequenced microbial genomes
- COGs - phylogenetic classification of orthologous proteins from complete genomes
- STRING - detect whether a given query gene occurs repeatedly with certain other genes in potential operons
- Pedant - automatic whole genome annotation
- GeneCensus - various whole genome comparisons
Protein Domains: Databases and Search Tools
- InterPro - integration of Pfam, PRINTS, PROSITE, SWISS-PROT + TrEMBL
- PROSITE - database of protein families and domains
- Pfam - alignments and hidden Markov models covering many common protein domains
- SMART - analysis of domains in proteins
- ProDom - protein domain database
- PRINTS Database - groups of conserved motifs used to characterise protein families
- Blocks - multiply aligned ungapped segments corresponding to the most highly conserved regions of proteins
- Protein Domain Profile Analysis @ BMERC - search a library of profiles with a protein sequence
- TIGRFAMs - yet more protein families based on Hidden Markov Models
Motif and Pattern Search in Sequences
- Gibbs Motif Sampler - identification of conserved motifs in DNA or protein sequences
- AlignACE Homepage - gene regulatory motif finding
- MEME - motif discovery and search in protein and DNA sequences
- SAM - tools for creating and using Hidden Markov Models
- Pratt - discover patterns in unaligned protein sequences
- Motivated Proteins - a web facility for exploring small hydrogen-bonded motifs
Protein 3D Structure
Phylogeny & Taxonomy
Gene Prediction
Gene Expression Databases
Gene Regulation
- TRAFAC - For identifying conserved and shared cis regulatory elements between a pair of genes.
- CisMols - For identifying conserved and shared cis regulatory elements between a set of co-expressed genes.
- TRANSFAC - database of eukaryotic cis-acting regulatory DNA elements and trans-acting factors
- EPD - eukaryotic promoter database
- DBTSS - DataBase of Transcriptional Start Sites (human)
- SCPD - Saccharomyces cerevisiae promoter database
- DCPD - Drosophila Core Promoter Database
- RegulonDB - a database on transcriptional regulation in E. coli
- DPInteract - protein binding sites on E. coli DNA
- PromoterInspector - prediction of promoter regions in mammalian genomic sequences
- MatInspector - search for transcription factor binding sites
- Cister - cis-element cluster finder
- Gene regulatory Tools
- microRNA.org - microRNA Targets & Expression Profiles
- miRBase
- TarBase - provides a means of searching through a comprehensive set of experimentally supported microRNA targets in at least 8 organisms
- microRNA resource - a gateway to all types of information about microRNAs, including articles, products, news, events, and other websites
Metabolic, Gene Regulatory & Signal Transduction Network Databases
- KEGG - Kyoto Encyclopedia of Genes and Genomes
- BioCarta
- DAVID - Database for Annotation, Visualization and Integrated Discovery - a useful server for annotating microarray and other genetic data
- STKE - Signal Transduction Knowledge Environment
- BIND - Biomolecular Interaction Network Database
- EcoCyc
- WIT
- PathGuide - a very useful collection of resources dealing primarily with pathways
- SPAD - Signaling Pathway Database
- CSNDB - Cell Signalling Networks Database
- PathDB
- Transpath
- DIP - Database of Interacting Proteins
- PFBP - Protein Function and Biochemical Networks
- Alliance for Cellular Signalling
Systems Biology
Other Databases (Annotations, Ontologies, Consortia, etc.)
- Entrez Gene - Gene provides a unified query environment for genes defined by sequence and/or in NCBI's Map Viewer. You can query on names, symbols, accessions, publications, GO terms, chromosome numbers, E.C. numbers, and many other attributes associated with genes and the products they encode. Replaces LocusLink.
- LocusLink - A single query interface to curated sequence and descriptive information about genetic loci presenting information on official nomenclature, aliases, sequence accessions, phenotypes, EC numbers, MIM numbers, UniGene clusters, homology, map locations, and related web sites.
- Cancer Genome Anatomy Project
- HUGO's Human Gene Nomenclature
- Gene Ontology Consortium - a controlled vocabulary of eukaryotic gene roles
- Open Biological Ontologies - an umbrella web address for well-structured controlled vocabularies for shared use across different biological domains
- ACUTS - compilation of Ancient Conserved UnTranslated Sequences
- UTR database
- ENZYME - enzyme nomenclature database
- BRENDA - enzyme database
- TC-DB - comprehensive classification of membrane transport proteins
- The SNP Consortium
- HGBASE - database of sequence variations in the human genome
- MethDB - DNA methylation database
- SpliceDB - canonical and non-canonical splice site sequences in mammalian genes
- SpliceOme - database of intron-exon boundaries
- InBase - intein database
- The I.M.A.G.E. Consortium
- The Kabat Database of Sequences of Proteins of Immunological Interest
- Nelson Lab: Cytochrome C
- REBASE - restriction enzyme database
- Chemfinder.com - molecule database
- Genomics Institute of the Novartis Research Foundation
- Mouse SNPs Database - 670,000+ SNP records, 8.0+ million allele calls. Allele tables are provided by investigators or retrieved from public sources. All SNPs are mapped to NCBI Mouse Genome build 33 (C57BL/6J assembly). Most are linked to NCBI dbSNP build 123.
- MetaBase - a user-contributed database of databases, listing all the biological databases currently available on the internet
- Bio-computing.org - Bioinformatics, Databases and Software for Medicine
Miscellaneous Tools
Computational Resources
Bioinformatics on-line course materials and tutorials (not an exhaustive collection)
Intro to bioinformatics and computational biology:
Algorithms:
Miscellaneous:
Web Sites for Background Information & News
Other Collections of Bioinformatics Resources
Suggestions and comments: Anil Jegga

This page was last updated on December 16, 2007