Supplementary Material: Current vocabulary of method and category names used in FANTOM annotation

Method names

'annotation_type'
Categories of methods and evidence by which the cDNA annotation was determined.
'InterPro'
Containing InterPro motifs
'Pfam'
Containing Pfam motifs
'GO'
Assigned Gene Ontology terms
'library'
Libraries from which the cDNAs were derived
'TU'
Transcription Units (TU) from which the cDNAs were transcribed

Category names

'annotation_type' method

'type_category1'
Directly annotated in an external database (Known in a specific species)
'type_category2'
Hit to a DNA sequence (CDS complete)
'type_category3'
Hit to a DNA sequence (CDS partial)
'type_category4'
Hit to a protein sequence (>98% identity, 100% length, same species)
'type_category5'
Hit to a protein sequence (>85% identity, 100% length)
'type_category6'
Hit to a protein sequence (>85% identity)
'type_category7'
Hit to a protein sequence (>70% identity, 100% length)
'type_category8'
Hit to a protein sequence (>70% identity)
'type_category9'
Hit to a protein sequence (>50% identity, 100% length)
'type_category10'
Hit to a protein sequence (>50% identity)
'type_category11'
Hit to both TIGR gene indices and UniGene named clusters
'type_category12'
Hit to UniGene named clusters
'type_category13'
Hit to TIGR gene indices named clusters
'type_category14'
Containing InterPro domains/motifs
'type_category15'
Containing MDS domains/motifs
'type_category16'
Containing SCOP domains/motifs
'type_category17'
Having a CDS >303 bp
'type_category18'
Hit to unknown ESTs
'type_category19'
Unclassifiable
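
The protein-hit categories ('type_category4' to 'type_category10') can be read as an ordered set of thresholds. The following is a minimal sketch of that reading, not the FANTOM pipeline itself. It assumes that the rules are tried in order of category number and that the first satisfied rule wins; the function name, the rule table and the argument names are hypothetical.

```python
from typing import Optional

# Each rule: (category name, identity threshold in percent (exclusive),
#             requires 100% length, requires same species).
# Thresholds are copied from the category list; the evaluation order
# (lowest category number first) is an assumption of this sketch.
PROTEIN_HIT_RULES = [
    ("type_category4", 98.0, True, True),
    ("type_category5", 85.0, True, False),
    ("type_category6", 85.0, False, False),
    ("type_category7", 70.0, True, False),
    ("type_category8", 70.0, False, False),
    ("type_category9", 50.0, True, False),
    ("type_category10", 50.0, False, False),
]


def protein_hit_category(identity: float, full_length: bool,
                         same_species: bool) -> Optional[str]:
    """Return the first protein-hit category whose conditions are met.

    identity     -- percent identity of the hit (0 to 100)
    full_length  -- True if the hit covers 100% of the length
    same_species -- True if the hit protein is from the same species
    Returns None when no protein-hit category applies.
    """
    for name, threshold, needs_full_length, needs_same_species in PROTEIN_HIT_RULES:
        if identity <= threshold:
            continue  # the category list uses strict ">" thresholds
        if needs_full_length and not full_length:
            continue
        if needs_same_species and not same_species:
            continue
        return name
    return None
```

Under these assumptions, a full-length hit with 99% identity from another species falls through to 'type_category5', and a hit with exactly 50% identity matches none of the protein-hit categories.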

'InterPro' method

InterPro accession number

'Pfam' method

Pfam accession number

'GO' method

Gene Ontology ID

'library' method

Library ID

'TU' method

TU ID
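
Each annotation can be viewed as a (method name, category name) pair drawn from the vocabulary above. The sketch below checks such a pair under stated assumptions: the accession patterns follow the public InterPro (IPR plus six digits), Pfam (PF plus five digits) and Gene Ontology (GO: plus seven digits) conventions, which this document does not itself specify. The formats of Library IDs and TU IDs are not given here, so any non-empty token without whitespace is accepted for them. The names CATEGORY_PATTERNS and is_valid_pair are hypothetical.

```python
import re

# Method name -> pattern that a category name is expected to match.
# Accession formats are assumptions based on the public databases,
# not on this document.
CATEGORY_PATTERNS = {
    "annotation_type": re.compile(r"type_category([1-9]|1[0-9])"),  # 1 to 19
    "InterPro": re.compile(r"IPR\d{6}"),
    "Pfam": re.compile(r"PF\d{5}"),
    "GO": re.compile(r"GO:\d{7}"),
    "library": re.compile(r"\S+"),  # ID format unknown; any non-blank token
    "TU": re.compile(r"\S+"),       # ID format unknown; any non-blank token
}


def is_valid_pair(method: str, category: str) -> bool:
    """Return True if the category name is well formed for the method name."""
    pattern = CATEGORY_PATTERNS.get(method)
    return pattern is not None and pattern.fullmatch(category) is not None
```

A check of this kind only verifies the shape of a category name; it does not confirm that the accession or ID exists in the corresponding database.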