Chart 1. Decay of E-values in PSI-BLAST results using human ABCG proteins as query.
A minor decay occurs after the first 4 sequences (ABCG1 and ABCG4 of human and mouse), which are followed by ABCG2, ABCG5 and ABCG8 of human and mouse. Then, sequences homologous to Drosophila White and Abcg3 are found. Beginning with sequence 19, the plant WBC and PDR families appear, starting with WBC14 from Arabidopsis; many members of these families follow, forming a slowly descending slope until the point of decay is reached at sequence 80 (NAP12, Arabidopsis). The decay extends to sequence 83 (Q18900, Caenorhabditis elegans). A second plateau begins at sequence 84 (MALK, Escherichia coli) and comprises numerous Escherichia coli proteins and human proteins of other subfamilies (starting with the ABCAs, which are known as the closest human homologs of the ABCG transporter proteins).


Table 1. Sequences found in the vicinity of the decay in E-value
Sequence number  Sequence name  Organism  E-value  Comment
...              ...            ...       ...      ...
73               PDR2           Plant     4e-47    Full-transporter
74               PDR11          Yeast     4e-45    Full-transporter
75               WBC25          Plant     5e-45
76               PDR12          Yeast     1e-42    Full-transporter
77               WBC29          Plant     4e-42
78               WBC3           Plant     3e-41
79               YOR011W        Yeast     7e-41    Full-transporter
80               NAP12          Plant     5e-40
81               CG18633        Fly       3e-29
82               Q9XTU7         Worm      2e-25
83               Q18900         Worm      8e-22    Last sequence of decay and last ABCG-related sequence in final alignment.
84               MALK           Bacteria  1e-21    Used as outgroup to root the tree.
85               Q9TXV8         Worm      3e-20    Full-transporter
86               ABCA10         Human     5e-20    Full-transporter, subfamily A, member 10
87               GLNQ           Bacteria  5e-20
88               LOLD           Bacteria  6e-20
89               ABCA3          Human     1e-19    Full-transporter, subfamily A, member 3
90               Q18901         Worm      3e-19
91               EM27392        Mouse     3e-19    ABCA12 homolog
92               NAP11          Plant     3e-19
...              ...            ...       ...      ...
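
The position of the decay can be checked numerically from the E-values listed in Table 1. The following is a minimal sketch, not the procedure used to produce the chart: it assumes that the decay can be located as the largest rise in log10(E-value) between consecutive hits, and the names HITS and largest_step are illustrative only.

```python
import math

# (sequence number, sequence name, E-value) for hits 73-92, copied from Table 1.
HITS = [
    (73, "PDR2", 4e-47), (74, "PDR11", 4e-45), (75, "WBC25", 5e-45),
    (76, "PDR12", 1e-42), (77, "WBC29", 4e-42), (78, "WBC3", 3e-41),
    (79, "YOR011W", 7e-41), (80, "NAP12", 5e-40), (81, "CG18633", 3e-29),
    (82, "Q9XTU7", 2e-25), (83, "Q18900", 8e-22), (84, "MALK", 1e-21),
    (85, "Q9TXV8", 3e-20), (86, "ABCA10", 5e-20), (87, "GLNQ", 5e-20),
    (88, "LOLD", 6e-20), (89, "ABCA3", 1e-19), (90, "Q18901", 3e-19),
    (91, "EM27392", 3e-19), (92, "NAP11", 3e-19),
]


def largest_step(hits):
    """Return (step, before, after) for the largest rise in log10(E)
    between two consecutive hits (assumed criterion, see lead-in)."""
    steps = [
        (math.log10(cur[2]) - math.log10(prev[2]), prev, cur)
        for prev, cur in zip(hits, hits[1:])
    ]
    return max(steps, key=lambda s: s[0])


step, before, after = largest_step(HITS)
print(f"Largest step: {step:.1f} orders of magnitude, "
      f"between sequence {before[0]} ({before[1]}) and {after[0]} ({after[1]})")
```

Under this assumption the largest single step, about 10.8 orders of magnitude, lies between sequence 80 (NAP12) and sequence 81 (CG18633), which agrees with the point of decay named in the chart description.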

Table 2. Sequences found before the decay in E-value that are included in our analysis of ABCG-related proteins.
Arabidopsis thaliana
Sequence name  E-value  Accession code
WBC14          7e-83    Q9C6W5 (SWISSPROT/TREMBL)
WBC21          3e-80    Q9LI82 (SWISSPROT/TREMBL)
WBC7           8e-78    Q9ZU35 (SWISSPROT/TREMBL)
WBC10          2e-74    Q9MAH4 (SWISSPROT/TREMBL)
WBC26          7e-73    Q9C8W6 (SWISSPROT/TREMBL)
WBC5           3e-72    Q9SIT6 (SWISSPROT/TREMBL)
WBC23          4e-70    Q9FG17 (SWISSPROT/TREMBL)
WBC19          1e-69    Q9M3D6 (SWISSPROT/TREMBL)
WBC28          2e-69    Q9FT51 (SWISSPROT/TREMBL)
WBC12          1e-68    Q9C8K2 (SWISSPROT/TREMBL)
WBC9           1e-68    Q9SZR9 (SWISSPROT/TREMBL)
WBC18          5e-68    Q9M2V5 (SWISSPROT/TREMBL)
WBC22          1e-67    Q9LJC3 (SWISSPROT/TREMBL)
WBC16          3e-66    Q9M2V7 (SWISSPROT/TREMBL)
WBC1           5e-65    O80946 (SWISSPROT/TREMBL)
WBC27          8e-65    Q9LK50 (SWISSPROT/TREMBL)
WBC13          2e-64    Q9C8J8 (SWISSPROT/TREMBL)
WBC2           6e-63    Q9ZUT0 (SWISSPROT/TREMBL)
WBC6           2e-62    Q9FNB5 (SWISSPROT/TREMBL)
WBC17          7e-62    Q9M2V6 (SWISSPROT/TREMBL)
WBC4           7e-62    Q9SW08 (SWISSPROT/TREMBL)
WBC8           8e-62    Q9FLX5 (SWISSPROT/TREMBL)
WBC11          4e-60    Q9LMU4 (SWISSPROT/TREMBL)
WBC20          9e-60    Q9LFG8 (SWISSPROT/TREMBL)
PDR12          2e-59    Q9M9E1 (SWISSPROT/TREMBL)
PDR4           5e-59    O81016 (SWISSPROT/TREMBL)
PDR3           2e-58    O80878 (SWISSPROT/TREMBL)
PDR8           4e-58    Q9XIE2 (SWISSPROT/TREMBL)
PDR9           1e-57    Q9LFH0 (SWISSPROT/TREMBL)
PDR13          9e-57    Q9C623 (SWISSPROT/TREMBL)
PDR1           4e-56    O04323 (SWISSPROT/TREMBL)
PDR7           7e-56    Q9XI48 (SWISSPROT/TREMBL)
PDR10          1e-54    Q9LHK8 (SWISSPROT/TREMBL)
PDR6           4e-54    Q9SJR6 (SWISSPROT/TREMBL)
PDR5           6e-52    Q9ZUT8 (SWISSPROT/TREMBL)
PDR2           4e-47    O23377 (SWISSPROT/TREMBL)
WBC25          5e-45    Q9MAG3 (SWISSPROT/TREMBL)
WBC29          4e-42    Q9FF46 (SWISSPROT/TREMBL)
WBC3           3e-41    Q9ZUU9 (SWISSPROT/TREMBL)
NAP12          5e-40    Q9SJK6 (SWISSPROT/TREMBL)

Homo sapiens
Sequence name  E-value  Accession code
ABCG1          0.0      NP_004906 (RefSeq)
ABCG4          0.0      NP_071452 (RefSeq)
ABCG8          1e-168   NP_071882 (RefSeq)
ABCG5          1e-166   NP_071881 (RefSeq)
ABCG2          1e-165   NP_004818 (RefSeq)

Mus musculus
Sequence name  E-value  Accession code
ABCG1          0.0      Q64343 (SWISSPROT/TREMBL)
ABCG4          0.0      Q91WA9 (SWISSPROT/TREMBL)
ABCG8          1e-149   Q9DBM0 (SWISSPROT/TREMBL)
ABCG5          1e-145   Q99PE8 (SWISSPROT/TREMBL)
ABCG2          1e-140   Q9R004 (SWISSPROT/TREMBL)
ABCG3          2e-89    Q99P81 (SWISSPROT/TREMBL)

Drosophila melanogaster
Sequence name  E-value  Accession code
ATET           1e-134   Q9VQY4 (SWISSPROT/TREMBL)
CG5853         1e-108   Q9VL61 (SWISSPROT/TREMBL)
CG9663         1e-99    Q9VQN5 (SWISSPROT/TREMBL)
white          1e-92    P10090 (SWISSPROT/TREMBL)
CG9664         2e-90    Q9VQN4 (SWISSPROT/TREMBL)
CG9892         1e-89    Q9VQF1 (SWISSPROT/TREMBL)
CG17646        6e-89    Q9VQ41 (SWISSPROT/TREMBL)
CG4822         1e-73    Q9VPJ7 (SWISSPROT/TREMBL)
CG7346         1e-73    Q9VTL3 (SWISSPROT/TREMBL)
scarlet        1e-71    P45843 (SWISSPROT/TREMBL)
CG3327         5e-63    Q9VQP3 (SWISSPROT/TREMBL)
CG11069        3e-49    Q9VC15 (SWISSPROT/TREMBL)
brown          2e-47    P12428 (SWISSPROT/TREMBL)
CG18633        3e-29    Q9VC16 (SWISSPROT/TREMBL)

Caenorhabditis elegans
Sequence name  E-value  Accession code
O16574         7e-73    O16574 (SWISSPROT/TREMBL)
P90746         1e-70    P90746 (SWISSPROT/TREMBL)
Q19585         2e-70    Q19585 (SWISSPROT/TREMBL)
Q22802         3e-65    Q22802 (SWISSPROT/TREMBL)
Y47D3A.11      3e-64    Q9U2D0 (SWISSPROT/TREMBL)
Q09466         2e-60    Q09466 (SWISSPROT/TREMBL)
Q9XTU7         2e-25    Q9XTU7 (SWISSPROT/TREMBL)
Q18900         8e-22    Q18900 (SWISSPROT/TREMBL)

Saccharomyces cerevisiae
Sequence name  E-value  Accession code
YOL075C        4e-70    Q08234 (SWISSPROT/TREMBL)
ADP1           1e-66    P25371 (SWISSPROT/TREMBL)
YNR070W        7e-51    P53756 (SWISSPROT/TREMBL)
PDR5           1e-49    P33302 (SWISSPROT/TREMBL)
PDR15          3e-49    Q04182 (SWISSPROT/TREMBL)
SNQ2           3e-48    P32568 (SWISSPROT/TREMBL)
PDR10          2e-47    P51533 (SWISSPROT/TREMBL)
PDR11          4e-45    P40550 (SWISSPROT/TREMBL)
PDR12          1e-42    Q02785 (SWISSPROT/TREMBL)
YOR011W        7e-41    Q08409 (SWISSPROT/TREMBL)

Escherichia coli
Sequence name  E-value  Accession code
MALK           1e-21    P02914 (SWISSPROT/TREMBL)