To the editor:

Blood coagulation proteins (BCPs) play a major role in hemostasis.1 Except for a few that have their own dedicated databases, information on most BCPs is scattered across disparate data sources in multiple formats. This information has been compiled, manually curated, and assembled into a knowledgebase called ClotBase with the aim of accelerating clinical diagnosis and research in the area of coagulation disorders (Table 1). It presents up-to-date information on BCPs, including sequence and structure, source organisms, function, subcellular location, tissue specificity, and related literature. Links to external databases such as PubMed, European Molecular Biology Laboratory, Protein Information Resource, Protein Data Bank, and Online Mendelian Inheritance in Man are also provided for retrieval of additional information on BCPs. Interactive search features permit easy retrieval of the information available in ClotBase.

Deficiency of BCPs leads to various diseases such as hemophilia, thrombosis, and increased risk of myocardial infarction.2,4 Identification of disease-causing mutations in patients supports genetic testing to confirm or rule out a suspected syndrome, or to help determine a person's chance of developing or passing on a genetic disorder. To date, more than 2796 mutations have been identified in BCPs. These data have been compiled from various data sources and are presented in ClotBase as information on protein sequence, position of mutation, wild-type and mutant residues, domain involved, codon and exon/intron position, associated diseases, and relevant literature links.

Evolutionarily conserved residues are known to be crucial for maintaining the structural stability and function of a protein.5 The vast sequence information available on BCPs makes them well suited to data mining tools that identify conserved residues. A consensus sequence summarizes a multiple sequence alignment of homologs: each position holds the residue that is most abundant at that position in the alignment. A consensus sequence is thus a single-sequence representation of a protein family. Pattern information, in turn, shows the extent of conservation and the residues that can be accommodated at a particular position without perturbing the structure and function of the protein. Patterns, also known as motifs, signatures, or fingerprints of a protein family, reflect tight constraints imposed during the evolution of these sequences. Consensus sequences and patterns present in BCPs were identified using various in silico tools. Users can access this information and also search for homologs in ClotBase. A query sequence can be searched for similarity against all or specific BCPs.
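The relationship between a consensus sequence and a pattern can be illustrated with a minimal Python sketch. It is not the in silico tooling used to build ClotBase: the function name `consensus_and_pattern`, the toy alignment, and the PROSITE-like pattern notation are all assumptions made for illustration, and the input is assumed to be already aligned to equal length.

```python
from collections import Counter


def consensus_and_pattern(aligned, gap="-"):
    """Return (consensus, pattern) for equal-length, pre-aligned sequences.

    Hypothetical helper for illustration only; not part of ClotBase.
    """
    if not aligned:
        raise ValueError("alignment is empty")
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must have equal length")

    consensus = []
    pattern = []
    for column in zip(*aligned):
        # Ignore gap characters when counting residues in a column.
        residues = [r for r in column if r != gap]
        if not residues:
            consensus.append(gap)
            pattern.append("x")  # no information at this position
            continue
        counts = Counter(residues)
        # Most abundant residue; ties are broken alphabetically.
        top = min(counts, key=lambda r: (-counts[r], r))
        consensus.append(top)
        # Every residue observed at this position, in sorted order.
        allowed = "".join(sorted(counts))
        pattern.append(allowed if len(allowed) == 1 else f"[{allowed}]")
    return "".join(consensus), "-".join(pattern)


# Toy alignment invented for this example (not real BCP sequences).
toy_alignment = ["CKDGL", "CKNGL", "CRDGI", "CKDG-"]
consensus, pattern = consensus_and_pattern(toy_alignment)
print(consensus)
print(pattern)
```

In this sketch, positions where the pattern shows a single residue are fully conserved in the toy alignment, whereas bracketed positions list the alternatives observed there.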

An important feature of ClotBase is the sequence-based Screen Mutation tool, with which researchers and clinicians can detect lethal mutations in protein sequences on the basis of reported literature or of the evolutionarily conserved residues of BCPs identified in our study.
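The general idea of such a screen can be sketched in Python as follows. This is a hedged illustration and not the Screen Mutation tool itself: the function `screen_mutations`, the record fields, the conserved positions, and the reported-mutation table are hypothetical, and the sketch assumes the query differs from the reference only by substitutions (equal length, no insertions or deletions). It flags substitutions for follow-up and does not decide whether a mutation is lethal.

```python
def screen_mutations(reference, query, conserved_positions, reported):
    """Flag substitutions in `query` relative to `reference`.

    Hypothetical sketch: positions are 1-based; `reported` maps
    (position, wild_type, mutant) to a disease annotation.
    """
    if len(reference) != len(query):
        raise ValueError("this sketch handles substitutions only")

    findings = []
    for pos, (wt, mut) in enumerate(zip(reference, query), start=1):
        if wt == mut:
            continue
        findings.append({
            "position": pos,
            "wild_type": wt,
            "mutant": mut,
            # True if the substitution falls on a conserved residue.
            "conserved": pos in conserved_positions,
            # Annotation from the lookup table, or None if unreported.
            "reported_disease": reported.get((pos, wt, mut)),
        })
    return findings


# Toy inputs invented for this example (not real BCP data).
reference = "CKDGL"
query = "SKDGP"
conserved = {1, 4}
reported = {(5, "L", "P"): "hypothetical disorder"}

hits = screen_mutations(reference, query, conserved, reported)
for hit in hits:
    print(hit)
```

A real screen would first align the query to the reference protein so that insertions and deletions do not shift the position numbering.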

ClotBase is currently the only open-access, manually curated database that stores information on all the known BCPs. It is designed in a user-friendly manner to allow easy and interactive navigation across its various interfaces. ClotBase aims to be a one-stop information portal for accessing manually curated data as well as submitting relevant data on BCPs. It will be updated every 4 months and can be freely accessed at http://www.clotbase.bicnirrh.res.in.

This work was supported by grants from the Indian Council of Medical Research (63/128/2001-BMS).

Contribution: A.S. was involved in data collection; P.N. developed the database; R.S.B. created the Web interface; and S.I.-T. guided the work.

Conflict-of-interest disclosure: The authors declare no competing financial interests.

Correspondence: Susan Idicula-Thomas, Biomedical Informatics Center, National Institute for Research in Reproductive Health, J.M. St, Parel, Mumbai 400012, India; e-mail: thomass@nirrh.res.in.

1. Furie B, Furie BC. The molecular basis of blood coagulation. Cell. 1988;53(4):505-518.
2. White GC, Shoemaker CB. Factor VIII gene and hemophilia A. Blood. 1989;73(1):1-12.
3. Lowe GD. Factor IX and thrombosis. Br J Haematol. 2001;115(3):507-513.
4. Patnaik MM, Moll S. Inherited antithrombin deficiency: a review. Haemophilia. 2008;14(6):1229-1239.
5. Chiang P, Sampaleanu LM, Ayers M, Pahuta M, Howell PL, Burrows LL. Functional role of conserved residues in the characteristic secretion NTPase motifs of the Pseudomonas aeruginosa type IV pilus motor proteins PilB, PilT and PilU. Microbiology. 2008;154(1):114-126.