Molecular Biology & Evolution

Molecular Biology and Evolution - RSS feed of recent issues (covers the latest 3 issues, including the current issue)

The Influence of Anticodon-Codon Interactions and Modified Bases on Codon Usage Bias in Bacteria

Tue, 08/17/2010 - 04:50

Most transfer RNAs (tRNAs) can translate more than one synonymous codon, and most codons can be translated by more than one isoacceptor tRNA. The rates of translation of synonymous codons are dependent on the concentrations of the tRNAs and on the rates of pairing of each anticodon–codon combination. Translational selection causes a significant bias in codon frequencies in highly expressed genes in most bacteria. By comparing codon frequencies in high and low-expression genes, we determine which codons are preferred for each amino acid in a large sample of bacterial genomes. We relate this to the number of copies of each tRNA gene in each genome. In two-codon families, preferred codons have Watson–Crick pairs (GC and AU) between the third codon base and the wobble base of the anticodon rather than GU pairs. This suggests that these combinations are more rapidly recognized by the ribosome. In contrast, in four-codon families, preferred codons do not correspond to Watson–Crick rules. In some cases, a wobble-U tRNA can pair with all four codons. In these cases, A and U codons are preferred over G and C. This indicates that the nonstandard UU combination appears to be translated surprisingly well. Differences in modified bases at the wobble position of the anticodon appear to be responsible for the differences in behavior of tRNAs in two- and four-codon families. We discuss the way changes in the bases in the anticodon influence both the speed and the accuracy of translation. The number of tRNA gene copies and the strength of translational selection correlate with the growth rate of the organism, as we would expect if the primary cause of translational selection in bacteria is the requirement to optimize the speed of protein production.
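The comparison of codon frequencies between high- and low-expression genes described above can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the function names, the toy sequences, and the choice of the phenylalanine two-codon family (TTT/TTC in the DNA alphabet) are assumptions made for the example.

```python
from collections import Counter

# Hypothetical two-codon family chosen for illustration: phenylalanine.
FAMILY = ("TTT", "TTC")

def codon_counts(genes):
    """Count in-frame codons across a list of coding sequences."""
    counts = Counter()
    for seq in genes:
        usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
        for i in range(0, usable, 3):
            counts[seq[i:i + 3]] += 1
    return counts

def relative_frequencies(counts, family):
    """Frequency of each synonymous codon within its own family."""
    total = sum(counts[c] for c in family)
    return {c: (counts[c] / total if total else 0.0) for c in family}

def preferred_codon(high_genes, low_genes, family):
    """Codon whose within-family frequency rises most in high-expression genes."""
    high = relative_frequencies(codon_counts(high_genes), family)
    low = relative_frequencies(codon_counts(low_genes), family)
    return max(family, key=lambda c: high[c] - low[c])

# Toy sequences (made up, not real data).
high_expression = ["TTCTTCTTCTTT", "TTCTTC"]
low_expression = ["TTTTTCTTTTTC", "TTTTTT"]
print(preferred_codon(high_expression, low_expression, FAMILY))
```

A real analysis would need expression-level assignments for each gene and a statistical test for the frequency difference; the sketch only shows the direction of the comparison.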

Categories: Journals

Evolution in Regulatory Regions Rapidly Compensates the Cost of Nonoptimal Codon Usage

Tue, 08/17/2010 - 04:50

The redundant genetic code contains synonymous codons, whose relative frequencies vary among species. Nonoptimal codon usage lowers gene translation efficiency, potentially leading to a fitness cost. This is particularly relevant for horizontal gene transfer, which is common among bacteria and a key player in the propagation of antibiotic resistance. By mimicking the horizontal transfer of an antibiotic resistance gene, we established that nonoptimal codon usage renders Escherichia coli 10–20 times more sensitive to the antibiotic. After 350 generations of experimental evolution under antibiotic selection pressure, this cost was compensated through both in cis changes in the gene promoter and in trans changes in the host bacterial genome, without introducing mutations in the coding sequence of the resistance gene. Further, we have found experimental evidence for convergent molecular adaptive evolution. Because it is compensated so readily, the high fitness cost of nonoptimal codon usage is only a minor obstacle to gene fixation upon horizontal transfer. Our results highlight the importance of rapid evolution of regulatory mechanisms in the adaptation to new environmental and genetic situations.

Categories: Journals

Evolutionary Pattern of Gene Homogenization between Primate-Specific Paralogs after Human and Macaque Speciation Using the 4-2-4 Method

Tue, 08/17/2010 - 04:50

Homogenization of duplicated genes is an important factor for gene family evolution. In a previous study, we developed a method, named 4-2-4 here, to detect partial homogenization with high sensitivity and high specificity using quartets. A quartet is a set of four genes generated by a duplication event and the subsequent speciation of two closely related species. We searched the human and macaque genomes and found 430 nonredundant quartets, which correspond to primate-specific paralogs. The prevalence of homogenization in these quartets was 10.0% (43/430), which was ca. one-third of that (29.8% = 206/691) in the rodent-specific nonredundant quartets obtained through comparison of mouse and rat genomes. Part of this difference comes from the fact that primate paralogs tend to be located farther from each other than rodent paralogs, and the remainder may be explained by the inherent difference in the neutral evolutionary rate between the primate and rodent lineages. A statistical analysis taking account of the effects of false negatives uncovered negative correlations between sequence divergence and homogenization prevalence in both primates and rodents. Further statistical analyses controlling for false-negative rates and sequence divergences revealed two characteristics shared by the primate and rodent paralogs: 1) significant negative correlations of the homogenization prevalence with physical distances, and 2) no significant correlation between the prevalence and relative transcriptional orientations. Patterns of the homogenization in the genomic alignments of human–macaque quartets indicate that gene conversion, rather than unequal crossing-over, is the major cause of the homogenization.
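The "ca. one-third" comparison can be checked directly from the counts reported in the abstract. The snippet below is only a worked arithmetic check; the helper name is an assumption for illustration.

```python
def prevalence(homogenized, total):
    """Fraction of quartets showing homogenization."""
    return homogenized / total

primate = prevalence(43, 430)   # human-macaque quartets
rodent = prevalence(206, 691)   # mouse-rat quartets
ratio = primate / rodent        # how the primate prevalence compares to the rodent one

print(f"primate: {primate:.1%}, rodent: {rodent:.1%}, ratio: {ratio:.2f}")
```

The ratio comes out slightly above one-third, consistent with the rounded figure quoted in the text.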

Categories: Journals

Adaptation and Constraint at Toll-Like Receptors in Primates

Tue, 08/17/2010 - 04:50

Frequent positive selection is a hallmark of genes involved in the adaptive immune system of vertebrates, but the incidence of positive selection for genes underlying innate immunity in vertebrates has not been well studied. The toll-like receptors (TLRs) of the innate immune system represent the first line of defense against pathogens. TLRs lie directly at the host–environment interface, and they target microbial molecules. Because of this, they might be subject to frequent positive selection due to coevolutionary dynamics with their microbial counterparts. However, they also recognize conserved molecular motifs, and this might constrain their evolution. Here, we investigate the evolution of the ten human TLRs in the framework of these competing ideas. We studied rates of protein evolution among primate species and we analyzed patterns of polymorphism in humans and chimpanzees. This provides a window into TLR evolution at both long and short timescales. We found a clear signature of positive selection in the rates of substitution across primates in most TLRs. Some of the implicated sites fall in structurally important protein domains, involve radical amino acid changes, or overlap with polymorphisms with known clinical associations in humans. However, within species, patterns of nucleotide variation were generally compatible with purifying selection, and these patterns differed between humans and chimpanzees and between viral and nonviral TLRs. Thus, adaptive evolution at TLRs does not appear to reflect a constant turnover of alleles and instead might be more episodic in nature. This pattern is consistent with more ephemeral pathogen–host associations rather than with long-term coevolution.

Categories: Journals

The Evolution of Thrombospondins and Their Ligand-Binding Activities

Tue, 08/17/2010 - 04:50

The extracellular matrix (ECM) is a complex, multiprotein network that has essential roles in tissue integrity and intercellular signaling in the metazoa. Thrombospondins (TSPs) are extracellular, calcium-binding glycoproteins that have biologically important roles in mammals in angiogenesis, vascular biology, connective tissues, immune response, and synaptogenesis. The evolution of these complex functional properties is poorly understood. We report here on the evolution of TSPs and their ligand-binding capacities, from comparative genomics of species representing the major phyla of metazoa and experimental analyses of the oligomerization properties of noncanonical TSPs of basal deuterostomes. Monomeric, dimeric, trimeric, and pentameric TSPs have arisen through separate evolutionary events involving gain, loss, or modification of a coiled-coil domain or distinct domains at the amino-terminus. The relative transience of monomeric forms under evolution implicates a biological importance for multivalency of the C-terminal region of TSPs. Most protostomes have a single TSP gene encoding a pentameric TSP. The pentameric form is also present in deuterostomes, and gene duplications at the origin of deuterostomes and gene loss and further gene duplication events in the vertebrate lineage gave rise to distinct forms and novel domain architectures. Parallel analysis of the major ligands of mammalian TSPs revealed that many binding activities are neofunctions representing either coevolutionary innovations in the deuterostome lineage or neofunctions of ancient molecules such as CD36. Contrasting widely conserved capacities include binding to heparan glycosaminoglycans, fibrillar collagen, or RGD-dependent integrins. These findings identify TSPs as fundamental components of the extracellular interaction systems of metazoa and thus impact understanding of the evolution of ECM networks. 
The widely conserved activities of TSPs in binding to ECM components or PS2 clade integrins will be relevant to use of TSPs in synthetic extracellular matrices or tissue engineering. In contrast, the neofunctions of vertebrate TSPs likely include interactions suitable for therapeutic targeting without general disruption of ECM.

Categories: Journals

Low-Complexity Regions in Plasmodium falciparum: Missing Links in the Evolution of an Extreme Genome

Tue, 08/17/2010 - 04:50

Over the past decade, attempts to explain the unusual size and prevalence of low-complexity regions (LCRs) in the proteins of the human malaria parasite Plasmodium falciparum have used both neutral and adaptive models. This past research has offered conflicting explanations for LCR characteristics and their role in, and influence on, the evolution of genome structure. Here we show that P. falciparum LCRs (PfLCRs) are not a single phenomenon, but rather consist of at least three distinct types of sequence, and this heterogeneity is the source of the conflict in the literature. Using molecular and population genetics, we show that these families of PfLCRs are evolving by different mechanisms. One of these families, named here the HighGC family, is of particular interest because these LCRs act as recombination hotspots, both in genes under positive selection for high levels of diversity which can be created by recombination (antigens) and those likely to be evolving neutrally or under negative selection (metabolic enzymes). We discuss how the discovery of these distinct species of PfLCRs helps to resolve previous contradictory studies on LCRs in malaria and contributes to our understanding of the evolution of the parasite's unusual genome.

Categories: Journals

Constrained Intron Structures in a Microsporidian

Tue, 08/17/2010 - 04:50

The 2.9-Mbp genome of the microsporidian Encephalitozoon cuniculi is severely reduced and compacted, possessing only 16 known tiny spliceosomal introns. Based on motif and expression data, intron profiles were constructed to screen the genome. Twenty additional introns were predicted and verified, more than doubling the previous estimate. We further predict that accurate 3' splice site (3'SS) selection is accomplished via a scanning mechanism with specificity achieved by maintaining a constrained variable length between the branch point motif and 3'SS. Only introns in ribosomal protein genes exhibit positional bias, and we hypothesize that splicing could be regulating expression of these genes. The large set of new introns in non-ribosomal protein genes suggests that current models of intron loss are unlikely to be sufficient to explain the distribution of introns. Together, these results extend our understanding of the role of intron loss in genome evolution and contribute to a novel model for splice site selection.

Categories: Journals

Improved Phylogenomic Taxon Sampling Noticeably Affects Nonbilaterian Relationships

Tue, 08/17/2010 - 04:50

Despite expanding data sets and advances in phylogenomic methods, deep-level metazoan relationships remain highly controversial. Recent phylogenomic analyses depart from classical concepts in recovering ctenophores as the earliest branching metazoan taxon and propose a sister-group relationship between sponges and cnidarians (e.g., Dunn CW, Hejnol A, Matus DQ, et al. (18 co-authors). 2008. Broad phylogenomic sampling improves resolution of the animal tree of life. Nature 452:745–749). Here, we argue that these results are artifacts stemming from insufficient taxon sampling and long-branch attraction (LBA). By increasing taxon sampling from previously unsampled nonbilaterians and using an identical gene set to that reported by Dunn et al., we recover monophyletic Porifera as the sister group to all other Metazoa. This suggests that the basal position of the fast-evolving Ctenophora proposed by Dunn et al. was due to LBA and that broad taxon sampling is of fundamental importance to metazoan phylogenomic analyses. Additionally, saturation in the Dunn et al. character set is comparatively high, possibly contributing to the poor support for some nonbilaterian nodes.

Categories: Journals

Robustness of Ancestral Sequence Reconstruction to Phylogenetic Uncertainty

Tue, 08/17/2010 - 04:50

Ancestral sequence reconstruction (ASR) is widely used to formulate and test hypotheses about the sequences, functions, and structures of ancient genes. Ancestral sequences are usually inferred from an alignment of extant sequences using a maximum likelihood (ML) phylogenetic algorithm, which calculates the most likely ancestral sequence assuming a probabilistic model of sequence evolution and a specific phylogeny—typically the tree with the ML. The true phylogeny is seldom known with certainty, however. ML methods ignore this uncertainty, whereas Bayesian methods incorporate it by integrating the likelihood of each ancestral state over a distribution of possible trees. It is not known whether Bayesian approaches to phylogenetic uncertainty improve the accuracy of inferred ancestral sequences. Here, we use simulation-based experiments under both simplified and empirically derived conditions to compare the accuracy of ASR carried out using ML and Bayesian approaches. We show that incorporating phylogenetic uncertainty by integrating over topologies very rarely changes the inferred ancestral state and does not improve the accuracy of the reconstructed ancestral sequence. Ancestral state reconstructions are robust to uncertainty about the underlying tree because the conditions that produce phylogenetic uncertainty also make the ancestral state identical across plausible trees; conversely, the conditions under which different phylogenies yield different inferred ancestral states produce little or no ambiguity about the true phylogeny. Our results suggest that ML can produce accurate ASRs, even in the face of phylogenetic uncertainty. Using Bayesian integration to incorporate this uncertainty is neither necessary nor beneficial.
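The contrast between reconstructing on a single tree and integrating over a distribution of trees can be illustrated with a small sketch. All tree weights and per-tree state probabilities below are invented for illustration, the function names are hypothetical, and the best-supported tree simply stands in for the ML tree; this is not the simulation design used in the study.

```python
def single_tree_state(tree_weights, state_probs_by_tree):
    """Most probable ancestral state on the single best-supported tree."""
    best_tree = max(tree_weights, key=tree_weights.get)
    probs = state_probs_by_tree[best_tree]
    return max(probs, key=probs.get)

def integrated_state(tree_weights, state_probs_by_tree):
    """Most probable state after weighting each tree by its probability."""
    marginal = {}
    for tree, weight in tree_weights.items():
        for state, p in state_probs_by_tree[tree].items():
            marginal[state] = marginal.get(state, 0.0) + weight * p
    return max(marginal, key=marginal.get), marginal

# Made-up numbers: two plausible trees and the state probabilities at one site.
tree_weights = {"T1": 0.6, "T2": 0.4}
state_probs_by_tree = {
    "T1": {"A": 0.9, "G": 0.1},
    "T2": {"A": 0.7, "G": 0.3},
}

print(single_tree_state(tree_weights, state_probs_by_tree))
print(integrated_state(tree_weights, state_probs_by_tree))
```

In this toy case both approaches pick the same state, which mirrors the abstract's point that integrating over topologies rarely changes the inferred ancestral state when plausible trees agree at a site.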

Categories: Journals

Combined EST and Proteomic Analysis Identifies Rapidly Evolving Seminal Fluid Proteins in Heliconius Butterflies

Tue, 08/17/2010 - 04:50

Seminal fluid proteins (SFPs) directly influence a wide range of reproductive processes, including fertilization, sperm storage, egg production, and immune response. Like many other reproductive proteins, the molecular evolution of SFPs is generally characterized by rapid and frequently adaptive evolution. However, the evolutionary processes underlying this often-documented pattern have not yet been confidently determined. A robust understanding of the processes governing SFP evolution will ultimately require identifying SFPs and characterizing their evolution in many different taxa, often where only limited genomic resources are available. Here, we report the first comprehensive molecular genetic and evolutionary analysis of SFPs conducted in Lepidoptera (moths and butterflies). We have identified 51 novel SFPs from two species of Heliconius butterflies (Heliconius erato and Heliconius melpomene) by combining "indirect" bioinformatic and expression analyses of expressed sequence tags from male accessory gland and wing tissues with "direct" proteomic analyses of spermatophores. Proteomic analyses identified fewer SFPs than the indirect criteria but gave consistent results. Of 51 SFPs, 40 were identified in both species but fewer than half could be functionally annotated via similarity searches (Blast, IPRscan, etc.). The majority of annotated Heliconius SFPs were predicted to be chymotrypsins. Comparisons of Heliconius SFPs with those from fruit fly, mosquito, honeybee, and cricket suggest that gene turnover is high among these proteins and that SFPs are rarely conserved across insect orders. Pairwise estimates of evolutionary rates between SFPs and nonreproductive proteins show that, on average, Heliconius SFPs are evolving rapidly. At least one of these SFPs is evolving adaptively (dN/dS > 1), implicating a role for positive selection in this rapid evolution. 
This work establishes a strong precedent for future research on the causes and consequences of reproductive protein evolution in the Lepidoptera. Butterflies and moths have an extremely rich history of organismal research, which will provide an informative ecological context for further molecular evolutionary investigations.

Categories: Journals

Population-Level Analyses Indirectly Reveal Cryptic Sex and Life History Traits of Pseudoperkinsus tapetis (Ichthyosporea, Opisthokonta): A Unicellular Relative of the Animals

Tue, 08/17/2010 - 04:50

We use population genetics to detect the molecular footprint of a sexual cycle, of a haploid vegetative state, and of lack of host specificity in Pseudoperkinsus tapetis, a marine unicellular relative of the animals. Prior to this study, complete life cycles were not known for any of the unicellular lineages sharing common ancestry with multicellular animals and fungi. We established the first collection of conspecific cultures of any member from the unicellular opisthokont lineage ichthyosporea, isolating 126 cultures of P. tapetis from guts of marine invertebrates ranging from clams to sea cucumbers. We sequenced fragments of the elongation factor alpha-like (EFL) and heat-shock protein 70 (HSP70) genes for a subset of our isolates. Absence of heterozygotes from the EFL locus in 52 isolates provided evidence for haploidy. Phylogenetic incongruence and a lack of support for linkage between two loci from 34 sequenced isolates signified a history of recombination consistent with a sexual cycle. Shared haplotypes in different invertebrate species showed that P. tapetis was not host specific. Based on estimates of the frequency of sex and on observations of cultures, we propose that P. tapetis is transmitted between hosts via asexual endospores. New protists are continually being discovered, and, as this study illustrates, analysis of culturable collections from natural habitats can transform a species from a near unknown to a model system for better understanding the evolution of life histories.

Categories: Journals

Evolution of Characterized Phosphorylation Sites in Budding Yeast

Tue, 08/17/2010 - 04:50

Phosphorylation is one of the most studied and important regulatory mechanisms that modulate protein function in eukaryotic cells. Recently, several studies have investigated the evolution of phosphorylation sites identified by high-throughput methods. These studies have revealed varying degrees of evidence for constraint and plasticity, and therefore, there is currently no consensus as to the evolutionary properties of this important regulatory mechanism. Here, we present a study of high-confidence annotated sites from budding yeast and show that these sites are significantly constrained compared with their flanking region in closely related species. We show that this property does not change in structured or unstructured regions. We investigate the birth, death and compensation rates of the phosphorylation sites and test if sites are more likely to be gained or lost in proteins with greater numbers of sites. Finally, we also show that this evolutionary conservation can yield significant improvement for kinase target predictions when the kinase recognition motif is known, and can be used to infer the recognition motif when a set of targets is known. Our analysis indicates that phosphorylation sites are under selective constraint, consistent with their functional importance. We also find that a small fraction of phosphorylation sites turn over during evolution, which may be an important process underlying the evolution of regulatory networks.

Categories: Journals

Using Time-Structured Data to Estimate Evolutionary Rates of Double-Stranded DNA Viruses

Tue, 08/17/2010 - 04:50

Double-stranded (ds) DNA viruses are often described as evolving through long-term codivergent associations with their hosts, a pattern that is expected to be associated with low rates of nucleotide substitution. However, the hypothesis of codivergence between dsDNA viruses and their hosts has rarely been rigorously tested, even though the vast majority of nucleotide substitution rate estimates for dsDNA viruses are based upon this assumption. It is therefore important to estimate the evolutionary rates of dsDNA viruses independent of the assumption of host-virus codivergence. Here, we explore the use of temporally structured sequence data within a Bayesian framework to estimate the evolutionary rates for seven human dsDNA viruses, including variola virus (VARV) (the causative agent of smallpox) and herpes simplex virus-1. Our analyses reveal that although the VARV genome is likely to evolve at a rate of approximately 1 × 10⁻⁵ substitutions/site/year and hence approaching that of many RNA viruses, the evolutionary rates of many other dsDNA viruses remain problematic to estimate. Synthetic data sets were constructed to inform our interpretation of the substitution rates estimated for these dsDNA viruses and the analysis of these demonstrated that given a sequence data set of appropriate length and sampling depth, it is possible to use time-structured analyses to estimate the substitution rates of many dsDNA viruses independently from the assumption of host-virus codivergence. Finally, the discovery that some dsDNA viruses may evolve at rates approaching those of RNA viruses has important implications for our understanding of the long-term evolutionary history and emergence potential of this major group of viruses.

Categories: Journals

Evolution and Cytological Diversification of the Green Seaweeds (Ulvophyceae)

Tue, 08/17/2010 - 04:50

The Ulvophyceae, one of the four classes of the Chlorophyta, is of particular evolutionary interest because it features an unrivaled morphological and cytological diversity. Morphological types range from unicells and simple multicellular filaments to sheet-like and complex corticated thalli. Cytological layouts range from typical small cells containing a single nucleus and chloroplast to giant cells containing millions of nuclei and chloroplasts. In order to understand the evolution of these morphological and cytological types, the present paper aims to assess whether the Ulvophyceae are monophyletic and elucidate the ancient relationships among its orders. Our approach consists of phylogenetic analyses (maximum likelihood and Bayesian inference) of seven nuclear genes, small subunit nuclear ribosomal DNA and two plastid markers with carefully chosen partitioning strategies, and models of sequence evolution. We introduce a procedure for fast site removal (site stripping) targeted at improving phylogenetic signal in a particular epoch of interest and evaluate the specificity of fast site removal to retain signal about ancient relationships. From our phylogenetic analyses, we conclude that the ancestral ulvophyte likely was a unicellular uninucleate organism and that macroscopic growth was achieved independently in various lineages involving radically different mechanisms: either by evolving multicellularity with coupled mitosis and cytokinesis (Ulvales—Ulotrichales and Trentepohliales), by obtaining a multinucleate siphonocladous organization where every nucleus provides for its own cytoplasmic domain (Cladophorales and Blastophysa), or by developing a siphonous organization characterized by either one macronucleus or millions of small nuclei and cytoplasmic streaming (Bryopsidales and Dasycladales). 
We compare different evolutionary scenarios giving rise to siphonous and siphonocladous cytologies and argue that these did not necessarily evolve from a multicellular or even multinucleate state but instead could have evolved independently from a unicellular ancestor.

Categories: Journals

Biosynthesis of Wyosine Derivatives in tRNA: An Ancient and Highly Diverse Pathway in Archaea

Tue, 08/17/2010 - 04:50

Wyosine (imG) and its derivatives such as wybutosine (yW) are found at position 37 of phenylalanine-specific transfer RNA (tRNA(Phe)), 3' adjacent to the anticodon in Eucarya and Archaea. In Saccharomyces cerevisiae, formation of yW requires five enzymes acting in a strictly sequential order: Trm5, Tyw1, Tyw2, Tyw3, and Tyw4. Archaea contain wyosine derivatives, but their diversity is greater than in eukaryotes and the corresponding biosynthesis pathways are still unknown. To identify these pathways, we analyzed the phylogenetic distribution of homologues of the yeast wybutosine biosynthesis proteins in 62 archaeal genomes and proposed a scenario for the origin and evolution of wyosine derivative biosynthesis in Archaea that was partly experimentally validated. The key observations were 1) that four of the five wybutosine biosynthetic enzymes are ancient and may have been present in the last common ancestor of Archaea and Eucarya, and 2) that the variations in the distribution pattern of biosynthesis enzymes reflect the diversity of the wyosine derivatives found in different Archaea. We also identified 7-aminocarboxypropyl-demethylwyosine (yW-86) and its N4-methyl derivative (yW-72) as final products in tRNAs of several Archaea, whereas these were previously thought to be only intermediates of the eukaryotic pathway. We confirmed that isowyosine (imG2) and 7-methylwyosine (mimG) are two archaeal-specific guanosine-37 derivatives found in tRNA of both Euryarchaeota and Crenarchaeota. Finally, we proposed that the duplication of the trm5 gene in some Archaea led to a change in function from N1 methylation of guanosine to C7 methylation of 4-demethylwyosine (imG-14).

Categories: Journals

The Enamelin Genes in Lizard, Crocodile, and Frog and the Pseudogene in the Chicken Provide New Insights on Enamelin Evolution in Tetrapods

Tue, 08/17/2010 - 04:50

Enamelin (ENAM) has been shown to be a crucial protein for enamel formation and mineralization. Previous molecular analyses have indicated a probable origin early in vertebrate evolution, which is supported by the presence of enamel/enameloid tissues in early vertebrates. Despite these hypotheses, ENAM had so far been characterized only in mammals. Our aims were to 1) look for ENAM in representatives of nonmammalian tetrapods, 2) search for a pseudogene in the chicken genome, and 3) see whether the new sequences could bring new information on ENAM evolution. Using an in silico approach and polymerase chain reaction, we obtained and characterized the messenger RNA sequences of ENAM in a frog, a lizard, and a crocodile; the genomic DNA sequences of ENAM in a frog and a lizard; and the putative sequence of the chicken ENAM pseudogene. The comparison with mammalian ENAM sequences has revealed 1) the presence of an additional coding exon, named exon 8b, in sauropsids and marsupials, 2) a simpler 5'-untranslated region in nonmammalian ENAMs, 3) many sequence variations in the large exons while there are a few conserved regions in small exons, and 4) 25 amino acids that have been conserved during 350 million years of tetrapod evolution and are hence likely of crucial biological importance. The chicken pseudogene was identified in a region that was not expected when considering the gene synteny in mammals. Together with the location of lizard ENAM in a homologous region, this result indicates that enamel genes were probably translocated in an ancestor of the sauropsid lineage. This study supports the origin of ENAM earlier in vertebrate evolution, confirms that tooth loss in modern birds led to the invalidation of enamel genes, and adds information on the important role played by, for example, the phosphorylated serines and the glycosylated asparagines for correct ENAM functions.

Categories: Journals

Compositional Heterogeneity and Phylogenomic Inference of Metazoan Relationships

Tue, 08/17/2010 - 04:50

Compositional heterogeneity of sequences between taxa may cause systematic error in phylogenetic inference. The potential influence of such bias might be mitigated by strategies to reduce compositional heterogeneity in the data set or by phylogeny reconstruction methods that account for compositional heterogeneity. We adopted several of these strategies to analyze a large ribosomal protein data set representing all major metazoan taxa. Posterior predictive tests revealed that there is compositional bias in this data set. Only a few taxa with strongly deviating amino acid composition had to be excluded to reduce this bias. Thus, this is a good solution, if these taxa are not central to the phylogenetic question at hand. Deleting individual proteins from the data matrix may be an appropriate method, if compositional heterogeneity among taxa is concentrated in a few proteins. However, half of the ribosomal proteins had to be excluded to reduce the compositional heterogeneity to a degree that the CAT model was no longer significantly violated. Recoding of amino acids into groups is another alternative but causes a loss of information and may result in badly resolved trees as demonstrated by the present data set. Bayesian inference with the CAT–BP model directly accounts for compositional heterogeneity between lineages by introducing breakpoints along the branches of the phylogeny at which the amino acid composition is allowed to change but is computationally expensive. Finally, a neighbor joining tree based on equal input distances that consider pattern and rate heterogeneity showed several unusual groupings, which are most likely artifacts, probably caused by the loss of information resulting from the transformation of the sequence data into distances. 
As long as no more efficient phylogenetic inference methods are available that can directly account for compositional heterogeneity in large data sets, using methods for reducing compositional heterogeneity in the data in combination with methods that assume a stationary amino acid composition remains an option for controlling systematic errors in tree reconstruction that result from compositional bias. Our analyses indicated that the paraphyly of Deuterostomia in some analyses is the result of systematic errors that also affected the relationships of Entoprocta and Ectoprocta.
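Recoding amino acids into groups, one of the strategies mentioned above, can be sketched as follows. The abstract does not say which grouping was used; the six-category Dayhoff grouping below is a commonly used scheme and is assumed here purely for illustration, as are the names and toy sequences.

```python
# Assumed grouping: six Dayhoff categories (not stated in the abstract).
DAYHOFF_GROUPS = {
    "AGPST": "0",
    "DENQ": "1",
    "HKR": "2",
    "ILMV": "3",
    "FWY": "4",
    "C": "5",
}
RECODE = {aa: code for group, code in DAYHOFF_GROUPS.items() for aa in group}

def recode(seq):
    """Replace each amino acid by its group code; anything else becomes '-'."""
    return "".join(RECODE.get(aa, "-") for aa in seq.upper())

# Two different toy peptides collapse to the same recoded string,
# which is the loss of information the abstract refers to.
print(recode("MKVLA"))
print(recode("IRLMS"))
```

Because substitutions within a group become invisible, compositional differences between taxa are dampened, but so is part of the phylogenetic signal.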

Categories: Journals