Conservation and gene birth and death across vertebrate species
To comprehensively understand the gene birth and death events for the
ABC transporter superfamily in vertebrates, we interrogated 62 vertebrate
ABC genes across 64 vertebrate species (12 primates, five rodents, 21
other mammals, three marsupials, five birds, 13 fish, two reptiles, and
one amphibian). Each gene was examined in the gene tree of the human or
representative species in the ENSEMBL database. We noted the appearance
of a full-length or partial gene as well as potential missing or
duplicated genes. We compared these species against species with formal
analyses of the ABC superfamily (human, mouse, zebrafish, and lamprey)
(Dean, Rzhetsky, et al., 2001). There are high-coverage genomes for 13
species that are likely to provide an accurate gene count (human, chimp,
macaque, mouse, rat, dog, opossum, chicken, Xenopus, zebrafish, and
fugu). This set provides at least one index species for most of the
major groups: primates, rodents, carnivores, marsupials, birds,
amphibians, and fish. However, as many of the remaining species have
low-coverage draft genome assemblies, many apparently missing genes are
unlikely to reflect true gene loss events (Milinkovitch, Helaers,
Depiereux, Tzika, & Gabaldon, 2010).
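The caution about low-coverage assemblies can be expressed as a simple triage rule. The following is a minimal sketch, not the procedure used in this study: it assumes that presence/absence calls have already been read from the gene trees, and it treats only the high-coverage species named in the text as able to support a candidate loss call. The function name and the output labels are hypothetical.

```python
# High-coverage index species named in the text. Using set membership as
# the only criterion is an illustrative assumption of this sketch.
HIGH_COVERAGE = {
    "human", "chimp", "macaque", "mouse", "rat", "dog",
    "opossum", "chicken", "Xenopus", "zebrafish", "fugu",
}


def triage_gene_call(species: str, gene_found: bool) -> str:
    """Label a presence/absence call for one gene in one species."""
    if gene_found:
        return "present"
    if species in HIGH_COVERAGE:
        # Absence in a high-coverage genome is worth following up as a loss.
        return "candidate loss"
    # Absence in a low-coverage draft may simply be an assembly gap.
    return "unresolved (possible assembly gap)"


if __name__ == "__main__":
    print(triage_gene_call("chicken", False))  # e.g. ABCD1 in chicken
    print(triage_gene_call("gibbon", False))   # e.g. ABCA10 in gibbon
```

Under this rule, an absence in a draft genome is never counted as a loss without further evidence, which matches the conservative reading adopted here.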
The number of ABC genes in primates is very stable. The ABCA10 gene is
missing from the orangutan, gibbon, and marmoset genomes; ABCA10 is part
of a cluster of five ABCA5-related genes
that are duplicated head-to-tail on human chromosome 17. The gene loss
event converting ABCC13 into a pseudogene (Annilo & Dean, 2004)
appears to be confined to the great apes, as ABCC13 is intact in
all other primates. The bushbaby (Otolemur garnettii) genome
seems to have an additional TAP2/ABCB3 gene. The two bushbaby TAP2
genes lie in the same sequence contig, and their predicted amino acid
sequences have diverged, consistent with gene duplication. TAP1 and TAP2
play essential roles in antigen presentation, and duplication of TAP2
also occurs in many fish genomes. This result is of potential interest
for the study of the evolution of primate immunogenetics. In total, all
primates contain between 48 and 50 ABC genes.
Rodents have many gene gain and loss events affecting the A, B, and G
subfamilies. The ABCA5-like cluster contains from three to five
genes, and a cluster of Abca14, Abca15, Abca16, and Abca17 genes (Ban,
Sasaki, Sakai, Ueda, & Inagaki, 2005; Z. Q. Chen, Annilo, Shulenin, &
Dean, 2004) is present only in the mouse, rat, and squirrel genomes, not
in the guinea pig or kangaroo rat. The
well-described duplication of the Abcb1 gene in the mouse and rat
genomes is also found in the guinea pig but not in other rodents. The
loss of the ABCC11 gene from the mouse genome extends to all
rodents, but ABCC11 is present in the Lagomorphs (rabbit, pika),
indicating that this gene loss is specific to rodents. Abcg3, a gene
first discovered in the mouse genome, is closely related to ABCG2, a
well-described efflux transporter (Mickley et al., 2001). Abcg3 is
found only in rodents, but the rat genome is predicted to have two
Abcg3 genes, and the hamster four to six copies. Further examination of
additional rodent genomes shows an Abcg3 gene present in the prairie
vole and up to four copies in the deer mouse genome. The function of
Abcg3 is unknown, but it has been proposed to be an efflux pump because
of its close sequence homology with ABCG2. However, it is exclusively
expressed in the spleen and thymus in the mouse, suggesting that it has
a role in the immune response (Mickley et al., 2001). In addition, the
presence of multiple Abcg3 gene birth events in the rodent lineage
suggests that it has a vital, as yet unknown, function.
There are no other apparent ABC gene death or birth events within other
mammalian genomes, and for those mammals with complete genome
assemblies, there are 44-54 ABC genes annotated. However, it is
difficult to accurately determine the gene counts in the ABCA5 and
ABCA14 gene clusters. These clusters contain three to five genes in most
mammals, along with pseudogene fragments (Annilo, Chen, Shulenin, & Dean,
2003). Examination of the assemblies in these regions in species with
apparently missing genes shows gaps in the assembly. More complete
genomes, produced with long-range sequencing or assembly methods, are
needed to resolve these areas. In addition, we did not search these
species for new ABC genes, and there may be as yet undiscovered gene
birth events.
There has been no previous formal analysis of the ABC gene family for
birds, amphibians, or marsupials. The opossum is the index marsupial
species, with 7.3x genome coverage, and contains 37 predicted
full-length and ten partial ABC genes, for a total of 47 genes. The
opossum appears to be missing ABCA15, ABCA16, and ABCA17, as well as
ABCB5 and ABCB13. These same genes are absent from the genomes of the
other marsupials, the Tasmanian devil and the wallaby. The frog, Xenopus
tropicalis, is the amphibian index species, with 37 full-length and four
partial predicted genes. There are two predicted Xenopus ABCB5 genes on
separate contigs. An alignment of these sequences shows considerable
divergence in well-aligned regions, suggesting that this is an actual
duplication. The anole lizard is the one reptile species with a
high-density genome assembly (Alfoldi et al., 2011). There are 38
complete and four partial gene annotations, for a total of 42 ABC
transporters. The lizard and other reptile genomes (snake, turtles,
tortoises, tegu lizard, and tuatara) carry a duplication of the ABCG2
gene.
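The per-species totals for the index species are the sum of the full-length and partial annotations. A short sketch of that tally follows, using only the counts stated in the text; the dictionary layout and variable names are illustrative, and the Xenopus total is implied by the counts rather than stated.

```python
# (full-length, partial) predicted ABC genes per index species,
# taken from the counts given in the text.
annotations = {
    "opossum": (37, 10),
    "Xenopus tropicalis": (37, 4),
    "anole lizard": (38, 4),
}

# Total annotated genes = full-length + partial.
totals = {
    species: full + partial
    for species, (full, partial) in annotations.items()
}

for species, total in totals.items():
    print(f"{species}: {total} ABC genes")
```

Because partial annotations may be fragments of a single gene split across contigs, totals computed this way are best read as upper estimates for draft assemblies.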
The chicken is the index bird species and has multiple apparent ABC gene
loss events, with the genome lacking ABCB12, ABCB13, ABCD1, and ABCF1.
As ABCD1 and ABCF1 are highly conserved genes, this is unexpected. ABCD1
and ABCD2 are closely related, and a single ABCD1/2 gene is found in
invertebrates. However, fish have both ABCD1 and ABCD2 orthologs,
suggesting that birds lost the ABCD1 gene. In the human genome, the
ABCD1 gene is on the X chromosome, and
mutations in ABCD1 are responsible for the severe, often lethal,
X-linked recessive disease, adrenoleukodystrophy. ABCD1 is expressed in
the peroxisome and adrenoleukodystrophy is a demyelinating disorder, but
the functional effect of ABCD1 defects in the disease is not clear.
There have been detailed analyses published of the ABC gene superfamily
in zebrafish, carp, catfish, and lamprey (S. Liu, Li, & Liu, 2013; X.
Liu et al., 2016; Ren et al., 2015). These studies all document multiple
gene birth events in fish, such as duplications of ABCA1, ABCA4, ABCB3,
ABCB6, ABCB11, ABCC5, ABCC6, ABCG2, and ABCG4. Only the ABCC6 genes have
been studied in detail, with the Abcc6a gene shown to be essential and
Abcc6b expressed in the developing kidney (Li et al., 2010). As fish
underwent a whole-genome duplication, the pattern of genes that have
been retained and now carry out new functions is complex. Some
duplications are confined to specific species, such as a duplicated
ABCF2 in zebrafish, catfish, and a few other species (S. Liu et al.,
2013). Many other examples of lineage-specific duplications and losses
in fish have been described, and it will require highly accurate genome
assemblies to understand the complexity (X. Liu et al., 2016). For
example, there are four ABCG2-related genes in the zebrafish, and other
fish species have complex combinations of these genes, including
additional duplications.
As a representative of a more primitive fish species, the lamprey has
few of the gene duplications seen in jawed fish and has only 34
predicted ABC genes.
In conclusion, the availability of many vertebrate genome assemblies
allows a more detailed analysis of the evolution of ABC transporters.
There have been dynamic changes in the gene number in each of the seven
common subfamilies, with the most dramatic changes in the A, B, and G
subfamilies. Because ABC proteins can carry out a wide variety of
transport functions, it is likely that individual lineages of species,
and even specific animals, would develop specific transporters for
highly specialized functions, probably due to environmental pressure. It
is also apparent within the phylogenetic trees of individual genes that
considerable diversification has taken place. As even a single amino
acid change can alter the substrate specificity of an ABC transporter,
the true diversity of substrates is enormous. Among the most
diversified genes are the multi-specific transporters ABCB1/PGP
and ABCG2. This finding is consistent with an essential role for these
pumps in xenobiotic elimination and maintenance of tissue barriers in
the brain, intestine, and placenta. ABCB1 has independently
duplicated in several species such as certain rodents, the cow, and
fish. Even more dramatic are the duplications of ABCG2 that have
taken place in fish species. As fish live in highly diverse aquatic
environments, they are exposed internally and externally to an aqueous
environment. Therefore, it is not surprising that they need to excrete
many environmental toxins and protect internal organs from xenobiotics.
Some gene clusters, particularly the ABCA5 and ABCA14 clusters, are
challenging to assemble, as the genes are large and closely related.
Therefore, complete annotation will require more complete genome
assemblies.
One of the most intriguing ABC gene subfamilies is the ABCH family.
Initially identified in Drosophila and Dictyostelium, ABCH
genes are half transporters, with an N-terminal NBD, the same structure
as the ABCG genes. In vertebrates, the ABCH genes are only found in fish.
There is a single ABCH1 in most fish species and the coelacanth, but the
gene is missing from lamprey and other fish species (Jeong et al.,
2015). A function in lipid transport has been described for an ABCH gene
(LmABCH-9C) in the locust, Locusta migratoria (Yu et al., 2017).
Still, to date, there is no functional information on this gene group in
vertebrates.