

Phylogeny and signatures distinctive of α-proteobacteria



Critical Reviews in Microbiology, 31:101–135, 2005. Copyright © Taylor & Francis Inc. ISSN: 1040-841X print / 1549-7828 online. DOI: 10.1080/10408410590922393

Protein Signatures Distinctive of Alpha Proteobacteria and Its Subgroups and a Model for α-Proteobacterial Evolution
Radhey S. Gupta
Department of Biochemistry and Biomedical Sciences, McMaster University, Hamilton, Ontario, Canada

Alpha (α) proteobacteria comprise a large and metabolically diverse group. No biochemical or molecular feature is presently known that can distinguish these bacteria from other groups. The evolutionary relationships among this group, which includes numerous pathogens and agriculturally important microbes, are also not understood. Shared conserved inserts and deletions (i.e., indels or signatures) in molecular sequences provide a powerful means for identification of different groups in clear terms, and for evolutionary studies (see www.bacterialphylogeny.com). This review describes, for the first time, a large number of conserved indels in broadly distributed proteins that are distinctive and unifying characteristics of either all α-proteobacteria, or many of its constituent subgroups (i.e., orders, families, etc.). These signatures were identified by systematic analyses of proteins found in the Rickettsia prowazekii (RP) genome. Conserved indels that are unique to α-proteobacteria are present in the following proteins: cytochrome c oxidase assembly protein CtaG, PurC, DnaB, ATP synthase α-subunit, exonuclease VII, prolipoprotein phosphatidylglycerol transferase, RP-400, FtsK, pyruvate phosphate dikinase, cytochrome b, MutY, and homoserine dehydrogenase. The signatures in succinyl-CoA synthetase, cytochrome oxidase I, alanyl-tRNA synthetase, and MutS proteins are found in all α-proteobacteria, except the Rickettsiales, indicating that this group has diverged prior to the introduction of these signatures. A number of proteins contain conserved indels that are specific for Rickettsiales (XerD integrase and leucine aminopeptidase), Rickettsiaceae (Mfd, ribosomal protein L19, FtsZ, Sigma 70 and exonuclease VII), or Anaplasmataceae (Tgt and RP-314), and they distinguish these groups from all others.
Signatures in DnaA, RP-057, and DNA ligase A are commonly shared by various Rhizobiales, Rhodobacterales, and Caulobacter, suggesting that these groups shared a common ancestor exclusive of other α-proteobacteria. A specific relationship between Rhodobacterales and Caulobacter is indicated by a large insert in the Asn-Gln amidotransferase. The Rhizobiales group of species is distinguished from others by a large insert in the Trp-tRNA synthetase. Signature sequences in a number of other proteins (viz. oxoglutarate dehydrogenase, succinyl-CoA synthase, LytB, DNA gyrase A, LepA, and Ser-tRNA synthetase) serve to distinguish the Rhizobiaceae, Brucellaceae, and Phyllobacteriaceae families from Bradyrhizobiaceae and Methylobacteriaceae. Based on the distribution patterns of these signatures, it is now possible to logically deduce a model for the branching order among α-proteobacteria, which is as follows: Rickettsiales → Rhodospirillales-Sphingomonadales → Rhodobacterales-Caulobacterales → Rhizobiales (Rhizobiaceae-Brucellaceae-Phyllobacteriaceae, and Bradyrhizobiaceae). The deduced branching order is also consistent with the topologies in the 16S rRNA and other phylogenetic trees. Signature sequences in a number of other proteins provide evidence that α-proteobacteria is a late branching taxon within Bacteria, which branched after the δ,ε-subdivisions but prior to the β,γ-proteobacteria. The shared presence of many of these signatures in the mitochondrial (eukaryotic) homologs also provides evidence of the α-proteobacterial ancestry of mitochondria.

Keywords Bacterial Phylogeny; Alpha Proteobacteria Trees; Protein Signatures; Rickettsiales; Rhodobacterales; Branching Order; Mitochondrial Origin; Rickettsia prowazekii; Rhizobiales

Received 20 December 2004; accepted 8 December 2005. Address correspondence to Radhey S. Gupta, Department of Biochemistry and Biomedical Sciences, McMaster University, Hamilton, Ontario, Canada L8N 3Z5. E-mail: gupta@mcmaster.ca

INTRODUCTION

The alpha (α) proteobacteria comprise an important group within Bacteria, which has contributed seminally to many aspects of the history of life (Margulis 1970; Kersters et al. 2003). It is now established that mitochondria, which enable eukaryotic cells to produce energy via oxidative phosphorylation, are the result of endosymbiotic capture of an α-proteobacterium by the primitive eukaryotic cell (Margulis 1970; Falah & Gupta 1994; Viale & Arakaki 1994; Andersson et al. 1998; Gray et al. 1999; Karlin & Brocchieri 2000; Emelyanov 2001a; Esser et al. 2004). There is also strong evidence indicating that the ancestral eukaryotic cell itself may have originated via a fusion, or long-term symbiotic association, between one or more α-proteobacteria and an archaebacterium (or Archaea) (Gupta et al. 1994; Lake & Rivera 1994; Gupta & Golding 1996; Margulis 1996; Gupta 1998; Martin & Muller 1998; Ribeiro & Golding 1998; Andersson et al. 1998; Karlin et al. 1999; Lang et al. 1999; Kurland & Andersson 2000; Emelyanov 2001a, 2003b). The symbiosis between α-proteobacteria (viz. Rhizobiaceae species) and plant root nodules plays a central role in the fixation of atmospheric nitrogen by plants (Sadowsky & Graham 2000; Van Sluys et al. 2002; Kersters et al. 2003; Sawada et al. 2003). Additionally, many α-proteobacterial species (viz. Rickettsiales,




Brucella, Bartonella) are adapted to an intracellular life style and are major human and animal pathogens (Moreno & Moriyon 2001; Kersters et al. 2003; Yu & Walker 2003). The α-proteobacteria exhibit enormous diversity in terms of their morphological and metabolic characteristics and they include numerous phototrophs, chemolithotrophs and chemoorganotrophs (Stackebrandt et al. 1988; De Ley 1992; Kersters et al. 2003). This group also harbors all known aerobic photoheterotrophic bacteria, which contain bacteriochlorophyll a, but are unable to grow photosynthetically under anaerobic conditions (Yurkov & Beatty 1998). These bacteria are abundant in the upper layers of oceans (Kolber et al. 2001). The α-proteobacterial species are presently recognized on the basis of their branching pattern in the 16S rRNA trees, where they form a distinct clade within the proteobacterial phylum (Woese et al. 1984; Stackebrandt et al. 1988; Olsen et al. 1994; Gupta 2000; Kersters et al. 2003). This group has been given the rank of a Class or subdivision within the Proteobacteria phylum (Stackebrandt et al. 1988; Murray et al. 1990; De Ley 1992; Stackebrandt 2000; Ludwig & Klenk 2001; Garrity & Holt 2001; Kersters et al. 2003). Other than their distinct branching in the 16S rRNA or other phylogenetic trees (De Ley 1992; Viale et al. 1994; Eisen 1995; Gupta et al. 1997; Gupta 2000; Stepkowski et al. 2003; Emelyanov 2003a; Battistuzzi et al. 2004), there is no reliable phenotypic or molecular characteristic known at present that is uniquely shared by different α-proteobacteria and that distinguishes them from all other bacteria (Kersters et al. 2003). On the basis of 16S rRNA trees the α-proteobacteria have been divided into seven main subgroups or orders (viz. Caulobacterales, Rhizobiales, Rhodobacterales, Rhodospirillales, Rickettsiales, Sphingomonadales, and Parvularculales) (Maidak et al. 2001; Garrity & Holt 2001; Kersters et al. 2003).
However, the branching order and interrelationships among these subgroups are presently not resolved and no distinctive features that can distinguish these groups from each other are known (Kersters et al. 2003). In our recent work, we have been utilizing a new approach based on identification of conserved indels (also referred to as signatures) in protein sequences that is proving very useful in identifying different groups within Bacteria in clear molecular terms and clarifying evolutionary relationships among them (see www.bacterialphylogeny.com) (Gupta 1998, 2003, 2004; Griffiths & Gupta 2002, 2004a; Gupta & Griffiths 2002; Gupta et al. 2003). We have previously described many protein signatures that are distinctive characteristics of the proteobacterial phylum and which also provided information regarding its branching position relative to other bacterial groups (Gupta 1998, 2000; Griffiths & Gupta 2004b). This review focuses on examining the evolutionary relationships among α-proteobacteria using the signature sequence as well as traditional phylogenetic approaches. In recent years, complete genomes of several α-proteobacteria (viz. Bartonella henselae, Bart. quintana, Bradyrhizobium japonicum, Brucella melitensis, Bru. suis, Caulobacter crescentus, Mesorhizobium loti, Sinorhizobium meliloti, Rhodopseudomonas palustris, Agrobacterium tumefaciens, Rickettsia conorii, Ri. prowazekii, Ri. typhi, and Wolbachia sp. (Drosophila endosymbiont)) have become available (Andersson et al. 1998; Kaneko et al. 2000, 2002; Nierman et al. 2001; Wood et al. 2001; Ogata et al. 2001; Galibert et al. 2001; DelVecchio et al. 2002; Paulsen et al. 2002; Larimer et al. 2004; McLeod et al. 2004). These provide valuable resources for identifying novel molecular features that are likely distinctive characteristics of α-proteobacteria and its various subgroups, and which may prove helpful in clarifying the evolutionary relationships among them. This article describes, for the first time, a large number of conserved indels in widely distributed proteins that are either uniquely shared by all α-proteobacteria, or which are shared by only particular subgroups (i.e., families or orders) of this Class. These signatures provide novel and definitive molecular means for distinguishing α-proteobacteria and many of its subgroups from all other bacteria. The distribution of these signatures in different α-proteobacteria also enables one to logically deduce the relative branching orders and interrelationships among different α-proteobacterial subgroups. Phylogenetic studies have also been carried out based on 16S rRNA and a number of protein sequences. Based on this information, a detailed model for the evolutionary relationships among α-proteobacteria has been developed.

PHYLOGENETIC TREE FOR ALPHA PROTEOBACTERIA BASED ON 16S rRNA SEQUENCES

Although α-Proteobacteria comprise a major group within Bacteria (Garrity & Holt 2001) with >5200 sequences in the Ribosomal Database Project II (Maidak et al. 2001), there is no detailed review or article that discusses the evolutionary relationships among this group (i.e., indicating the relationships among different subgroups and orders within this Class) (Kersters et al. 2003).
Most of the articles on α-Proteobacteria are aimed at clarifying the phylogenetic placement of particular species at either genus or family levels (Dumler et al. 2001; Gaunt et al. 2001; Young et al. 2001; Taillardat-Bisch et al. 2003; van Berkum et al. 2003; Broughton 2003; Stepkowski et al. 2003; Sawada et al. 2003). The second edition of Bergey’s Manual (Ludwig & Klenk 2001) and the third edition of Prokaryotes (Kersters et al. 2003) present condensed phylogenetic trees for the α-Proteobacteria (or Proteobacteria) as a whole to indicate presumed relationships among different subgroups comprising this subdivision. However, most of these trees do not show any bootstrap scores or even individual species (Ludwig & Klenk 2001; Kersters et al. 2003), making it difficult to get a clear sense of the reliability of the observed (or indicated) relationships. Hence, as an initial step toward understanding the evolutionary relationships among α-Proteobacteria, a phylogenetic tree based on 16S rRNA sequences was constructed from 65 α-proteobacterial species, covering its major subgroups. The resulting neighbor-joining bootstrapped consensus tree is presented in Figure 1. The tree shown was rooted using the 16S rRNA sequences from epsilon proteobacteria, which show deeper branching than the α-subdivision in the rRNA as well as various other trees (Olsen





FIG. 1. A neighbor-joining bootstrap consensus tree for α-proteobacteria based on 16S rRNA sequences. The tree was bootstrapped 100 times and bootstrap scores that were >60 are indicated on the nodes. The tree was rooted using H. pylori. However, the tree topology was not altered on rooting with other deep branching bacteria (e.g., Aq. aeolicus). The groups of species corresponding to some of the main subgroups within α-proteobacteria are marked. indicates anomalous branching in the tree.



et al. 1994; Viale et al. 1994; Eisen 1995; Gupta 1998). The bootstrap scores for all nodes that were >60 (out of 100) are indicated on the tree. In the resulting tree a number of different clades are either clearly (>90% bootstrap score) or reasonably well resolved. These included the clades corresponding to groups of species that are recognized as major orders within the α-Proteobacteria (Rhizobiales, Rhodospirillales, Caulobacterales, Sphingomonadales, Rhodobacterales, and Rickettsiales) (Ludwig & Klenk 2001; Garrity & Holt 2001; Kersters et al. 2003). Within Rhizobiales, the Bradyrhizobiaceae family of species was clearly separated from some of the other families within this order (viz. Rhizobiaceae, Brucellaceae, and Phyllobacteriaceae) (Wang et al. 1998; Sadowsky & Graham 2000; Dumler et al. 2001; van Berkum et al. 2003; Stepkowski et al. 2003). Within α-Proteobacteria, the deepest branching was observed for the Rickettsiales group of species. Within the Rickettsiales, the Rickettsia and Orientia genera, which form part of the Rickettsiaceae family, were clearly resolved from the Anaplasmataceae family comprised of Ehrlichia, Wolbachia, Anaplasma, and Neorickettsia species (Dumler et al. 2001; Yu & Walker 2003). In contrast to these well-resolved clades or relationships, various nodes indicating the interrelationships among different orders had lower bootstrap scores (50% are indicated on various nodes. All inserts and deletions were excluded from the sequence alignment used for phylogenetic analysis. The α-proteobacteria formed a well-defined clade in this tree; however, their branching position relative to other groups was not resolved. The Rickettsiales order formed the deepest branch within α-proteobacteria and they were also clearly resolved from other α-proteobacteria. The arrows mark the suggested positions where the identified signatures were introduced in this protein.

(Figure 10). One of these deletions is a distinctive characteristic of all α-proteobacteria and not found in any other bacteria. The other deletion, in addition to the α-proteobacteria, is also commonly present in the two Desulfovibrio species (δ-proteobacteria), suggesting a distant relationship of this group to α-proteobacteria, as also seen with the PPDK protein (Figure 9). In addition to these deletions, the FtsK protein also contains a 5–6 aa insert that is unique to various α-proteobacteria in comparison to the other groups of proteobacteria (present in the position corresponding to aa 513–520 in the Ri. prowazekii protein). Since the region where this insert is found exhibits variability in other bacteria, this signature is not shown. The FtsK protein has also been previously shown to contain an 8–9 aa insert in a different region of the protein that is a distinctive characteristic of various Bacteroidetes and Chlorobium species (Gupta 2004). The FtsK homologs are not found in most eukaryotic organisms. However, a homolog of this protein is present in Plasmodium yoelii (GenBank accession number 23485217). The origin and



FIG. 8. Partial sequence alignment of RP-400 protein showing a 4–6 aa insert that is specific for various α-proteobacteria, except Z. mobilis.

possible significance of this gene/protein is presently unclear. A 1 aa deletion that is specific for various α-proteobacteria is also present in the Cytochrome b (Cyt b; PetB) protein (Figure 11), which is a subunit of the cytochrome reductase, an integral part of the electron transport chain (Daldal et al. 1987; Stryer 1995; Emelyanov 2003a). This indel is not present in Cyt b from other bacteria, including that from Aquifex aeolicus, indicating that it is a deletion in α-proteobacteria, rather than an insert in other bacteria. Cyt b is one of the 13 proteins that are still encoded by mitochondrial DNA (Lang et al. 1999). Sequence information for Cyt b is available from a large number (>500) of mitochondrial genomes and phylogenetic studies based on this protein provide evidence for the origin of mitochondria from within the Rickettsiaceae (Sicheritz-Ponten et al. 1998; Emelyanov 2003a). Similar to the α-proteobacteria, Cyt b from all eukaryotic mitochondrial homologs was found to lack this 1 aa indel, providing evidence of their specific relationship to the α-proteobacteria.

B. Signature Sequences Distinguishing Rickettsiales from Other α-Proteobacteria

In phylogenetic trees based on 16S rRNA, as well as many protein sequences, the Rickettsiales are found to form the deepest branching clade within α-proteobacteria (see Figures 1 and 7) (Dumler et al. 2001; Gaunt et al. 2001; Yu et al. 2001; Kersters et al. 2003; Yu & Walker 2003; Stepkowski et al. 2003). We have identified several signatures that are present in various α-proteobacteria, except the Rickettsiales. These signatures are described below. The enzyme succinyl-CoA synthetase, which is part of the citric acid cycle, carries out cleavage of the thioester bond in succinyl-CoA in a coupled reaction that generates succinate and produces GTP (Bridger et al. 1987; Stryer 1995). It is the only step in the citric acid cycle that directly leads to the formation of a high-energy phosphate bond. The beta subunit of this protein contains a conserved insert of 10 aa that is commonly present in all other α-proteobacteria, except the Rickettsiales (Figure 12). Surprisingly, this insert is also present in Ral. metallidurans (a β-proteobacterium), but not in any other β-proteobacteria, including the closely related species Ral. solanacearum. This suggests that the Succ-CoA synthetase gene in Ral. metallidurans has likely originated by non-specific means such as lateral gene transfer (LGT). A smaller unrelated insert in this region, which is presumably of independent origin, is also present in Cytophaga and Rhodopirellula species (not shown). It is of interest that a 7–8 aa insert is also present in this position in various eukaryotic homologs. It is unclear at present whether this latter insert



FIG. 9. Excerpt from sequence alignment for pyruvate phosphate dikinase (PPDK) protein showing a signature for α-proteobacteria. The Rickettsiales species contain a 5 aa long insert, where all other α-proteobacteria have a 12 aa insert in the same position. Two different homologs of PPDK are found in Brad. japonicum, only one of which is found to contain the insert. A smaller conserved insert of 10 aa is also present in this position in various δ-proteobacteria suggesting that they may be specifically, but distantly, related to the α-proteobacteria.

has originated from an α-proteobacterial ancestor or is of independent origin. If these inserts are of common origin, then this would suggest that the eukaryotic homologs of Succ-CoA synthetase have originated from an α-proteobacterial ancestor other than the Rickettsiales. This observation would be at variance with other evidence pointing to a closer relationship of mitochondria to the Rickettsiales species (Viale & Arakaki 1994; Gupta 1995; Andersson et al. 1998; Sicheritz-Ponten et al. 1998; Gray et al. 1999; Lang et al. 1999; Emelyanov 2001a, 2001b, 2003a). Emelyanov (2001a, 2001b) has observed a closer relationship of mitochondrial homologs to certain rickettsial species (e.g., Holospora obtusa, Caedibacter caryophila), for which sequence information for this protein is lacking at present. It is possible that Succ-CoA synthetase from these species may contain this insert. Presently, the possibility that the insert in eukaryotic homologs was independently introduced also cannot be excluded.
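The presence/absence reasoning applied above to the Succ-CoA synthetase insert can be sketched in code. The Python fragment below is only an illustration under stated assumptions and is not the procedure used in this review: the sequences are invented placeholders (not real Succ-CoA synthetase data), the taxon labels are hypothetical, and the helper names insert_lengths and unexpected_carriers are introduced here for the example. It counts residues inside the indel columns of a toy alignment and flags any taxon that carries the insert although it lies outside the expected group, which is the situation described for Ral. metallidurans.

```python
from typing import Dict, List, Set


def insert_lengths(alignment: Dict[str, str], start: int, end: int) -> Dict[str, int]:
    """Count non-gap residues in alignment columns [start, end) for each taxon."""
    return {
        taxon: sum(1 for residue in seq[start:end] if residue != "-")
        for taxon, seq in alignment.items()
    }


def unexpected_carriers(lengths: Dict[str, int], expected: Set[str]) -> List[str]:
    """Return taxa that carry the insert but are outside the expected group."""
    return sorted(t for t, n in lengths.items() if n > 0 and t not in expected)


# Invented placeholder sequences: 5 flanking columns, 10 indel columns, 5 flanking columns.
GAP = "-" * 10
toy_alignment = {
    "alpha_non_rickettsial_1": "MKVHE" + "AGDLTPEQRA" + "LLGIP",
    "alpha_non_rickettsial_2": "MKIHE" + "SGDLSPEQKA" + "LLGVP",
    "rickettsiales_1": "MKVHE" + GAP + "LLGIP",
    "beta_typical": "MRVHE" + GAP + "LLGLP",
    "beta_with_insert": "MRVHE" + "AGELTPDQRA" + "LLGLP",
}

lengths = insert_lengths(toy_alignment, 5, 15)
expected = {"alpha_non_rickettsial_1", "alpha_non_rickettsial_2"}

print(lengths)
print(unexpected_carriers(lengths, expected))
```

A taxon flagged in this way is only a candidate for explanations such as lateral gene transfer or independent origin; the check itself cannot decide between them.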

Another signature showing a similar distribution pattern has been identified in cytochrome oxidase polypeptide I (Cox I). In this case, a 5 aa insert in a conserved region is commonly present in various α-proteobacterial species except the Rickettsiales (Figure 13). It should be noted that α-proteobacteria contain two different related proteins. One of these, which harbors this insert, seems to correspond to Cox I, whereas the other homologs lacking the insert are mainly those from Cytochrome o ubiquinol oxidase (Davidson & Daldal 1987). However, all Rickettsiales species contain only a single homolog of this protein, corresponding to Cox I. The observed inserts in both Succ-CoA synthetase and Cox I were thus likely introduced in a common ancestor of the remainder of the α-proteobacteria after the branching of Rickettsiales. Similar to Cyt b, Cox I in eukaryotic cells is also encoded by mitochondrial DNA (Andersson et al. 1998; Gray et al. 1999) and sequence information for this protein is available from a large number of mitochondrial



FIG. 10. Partial sequence alignments of FtsK protein showing two different signatures (1 aa deletions) that are informative characteristics of α-proteobacteria. The deletion on the left is unique to various α-proteobacteria, whereas the one on the right is also commonly shared by two Desulfovibrio species (δ-proteobacteria) suggesting their relatedness to the α-proteobacteria.

genomes. The eukaryotic homologs of Cox I do not contain the identified insert (results not shown), indicating their possible derivation from Rickettsiales (Emelyanov 2003a). Two other proteins were found to contain inserts of variable lengths in highly conserved regions in various α-proteobacterial species, with the exception of Rickettsiales (Figure 14). In alanyl-tRNA synthetase (AlaRS), which is ubiquitously found in all organisms, an insert of 5–11 aa is present in a highly conserved region in various α-proteobacteria, except the Rickettsiales (and also Mag. magnetotacticum) (Figure 14A). Another signature showing a similar distribution pattern is found in the MutS protein, which is involved in DNA mismatch repair (Sixma 2001; Martins-Pinheiro et al. 2004). In this case, a conserved insert of 2–5 aa is present in various α-proteobacteria (Figure 14B), but not in Rickettsiales. The simplest explanation for these signatures is that they were introduced in an ancestral α-proteobacterial lineage, after the branching of Rickettsiales (and also possibly Mag. magnetotacticum). The observed variations in the lengths of these inserts have presumably resulted from subsequent genetic changes. We have also identified a number of α-proteobacteria-specific signatures in proteins for which no homologs are found in the Rickettsiales. In the MutY protein, which is an A-G specific DNA glycosylase involved in DNA repair (Parker & Eshleman 2003; Martins-Pinheiro et al. 2004), a 4–9 aa insert in a conserved region is present in various α-proteobacteria (Figure 15A). An insert of similar length is also present in most eukaryotic homologs (with the exception of Anopheles gambiae), indicating their possible derivation from α-proteobacteria. Another signature showing similar species distribution is present in



FIG. 11. Partial sequence alignment for Cyt b protein showing a 1 aa deletion that is specific for various α-proteobacteria. This deletion is also present in all mitochondrial homologs (Cyt b is encoded by mitochondrial DNA) providing strong evidence of their α-proteobacterial ancestry.

the protein homoserine dehydrogenase (Figure 15B). This indel consists of a 1 aa insert in a conserved region that is present in various α-proteobacteria, but not in any other proteobacteria. The homologs of both these proteins were not detected in the Rickettsiales species and their absence is very likely due to selective loss of these genes in a common ancestor of the Rickettsiales (Martins-Pinheiro et al. 2004), presumably due to the intracellular life-style of these organisms (Boussau et al. 2004). The observed inserts in these genes could have been introduced in a common ancestor of the α-proteobacteria, either before or after the loss of these genes in Rickettsiales. Several proteins contain conserved inserts that are either unique for the Rickettsiales or for the two main families, Rickettsiaceae and Anaplasmataceae, comprising this order (Dumler et al. 2001; Yu & Walker 2003). The Rickettsiales-specific signatures are present in the proteins XerD and leucine aminopeptidase



FIG. 12. Partial sequence alignment of Succ-CoA synthetase showing a 10 aa insert that is present in various α-proteobacteria, except the Rickettsiales. This insert is not found in other bacteria, except Ral. metallidurans, which has likely acquired it by non-specific means. A smaller insert is also present in this position in various eukaryotic homologs.

(Figure 16). XerD protein (Figure 16A) is a part of the XerCD integrase/recombinase that is involved in the cell division process and decatenation of DNA duplexes (Ip et al. 2003). A 7 aa insert is present in a conserved region of this protein which is uniquely shared by all Rickettsiales and not found in any other bacteria (Figure 16A). Another 2 aa insert that is specific for Rickettsiales is present in leucine aminopeptidase (Figure 16B), which is an exopeptidase that selectively releases N-terminal amino acids from peptides and proteins (Gonzales & Robert-Baudouy 1996). The signatures that are specific for Rickettsia include a 4 aa insert in a highly conserved region of the transcription repair coupling factor (Mfd) (Martins-Pinheiro et al. 2004) (Figure 17A), a 10 aa insert in ribosomal protein L19 (Figure 17B) and a 1 aa insert in the FtsZ protein (Figure 17C). Two additional Rickettsia-specific signatures consisting of a 1 aa insert in the major sigma factor-70 (at position 141 in the Ri. prowazekii sequence) and a 1 aa deletion in exonuclease VII (at position 137 in the Ri. prowazekii homolog) were also identified, but they are not shown here. The identified signatures in these proteins are present only in various Rickettsiaceae species and not found in other Rickettsiales (viz. Ehrlichia, Wolbachia, Anaplasma) or other groups of bacteria. Within eukaryotes, a homolog of the transcription repair-coupling factor is only detected in Arabidopsis thaliana and it lacks the identified insert (results not shown). The homologs of ribosomal protein L19 are found in various plants and algae but not in any of the animal



FIG. 13. Partial sequence alignment of Cox I showing a 5 aa insert that is present in various α-proteobacteria, except Rickettsiales. The other α-proteobacteria also contain a second, more distantly related homolog that lacks this insert.

species. Of these, an 8 aa insert in the same position is present only in the homolog from Cyanophora paradoxa (not shown). The significance and possible origin of this insert is not clear. Similar to the ribosomal protein L19, FtsZ homologs are also found only in plants but not in animals. These homologs also lacked the insert that is present in Rickettsiaceae. The plant homologs of these proteins likely correspond to those of the plastids, which because of their cyanobacterial ancestry (Gray 1989; Morden et al. 1992; Margulis 1993; Gupta et al. 2003) are expected to be lacking Rickettsia-specific signatures. We have also identified two large inserts that are commonly shared by the Ehrlichia, Wolbachia, and Anaplasma species but not found in any of the Rickettsia species or other bacteria. These signatures include a 15 aa insert in the HlyD family of secretory protein (Figure 18A) and a 10–11 aa insert in the tRNA guanine transglycosylase (Tgt) protein (Figure 18B), involved in the synthesis of the hypermodified nucleoside queuosine (Reuter & Ficner 1995). The eukaryotic homologs of Tgt do not contain this insert, providing evidence against their origin from the Anaplasmataceae family of species (results not shown). The homologs of HlyD are not found in eukaryotes. These signatures point to a close relationship between Ehrlichia, Wolbachia, and Anaplasma species, which is also seen in phylogenetic trees based on many other sequences (Dumler et al. 2001; Gaunt et al. 2001; Yu et al. 2001; Taillardat-Bisch et al. 2003; Yu & Walker 2003; Stepkowski et al. 2003; Emelyanov 2003a). These signatures were likely introduced in a common ancestor of the Anaplasmataceae family, which now includes all Ehrlichia, Anaplasma, Cowdria, Wolbachia, and Neorickettsia species (Dumler et al. 2001; Yu & Walker 2003).
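As a compact restatement of this subsection, the group-specific signatures shown in Figures 16 to 18 can be written as a lookup table and used to ask which groups a set of observed signatures supports. The Python sketch below is a hedged illustration only: the signature labels (e.g., "XerD_7aa_insert") and the function name supported_groups are hypothetical names chosen here, the sigma factor-70 and exonuclease VII signatures that are not shown in the figures are omitted, and requiring every listed signature of a group to be present is a simplifying assumption rather than a rule stated in the text.

```python
from typing import Dict, List, Set

# Signatures described in the text; the string labels are hypothetical identifiers.
GROUP_SIGNATURES: Dict[str, Set[str]] = {
    "Rickettsiales": {"XerD_7aa_insert", "LeuAP_2aa_insert"},
    "Rickettsiaceae": {"Mfd_4aa_insert", "L19_10aa_insert", "FtsZ_1aa_insert"},
    "Anaplasmataceae": {"HlyD_15aa_insert", "Tgt_10-11aa_insert"},
}


def supported_groups(observed: Set[str]) -> List[str]:
    """Return the groups whose listed signatures are all present in `observed`."""
    return [
        group
        for group, signatures in GROUP_SIGNATURES.items()
        if signatures <= observed  # subset test: every group signature was observed
    ]


# Hypothetical genome carrying the order-level and the Anaplasmataceae inserts.
observed_in_genome = {
    "XerD_7aa_insert",
    "LeuAP_2aa_insert",
    "HlyD_15aa_insert",
    "Tgt_10-11aa_insert",
}

print(supported_groups(observed_in_genome))
```

For the hypothetical genome above, the order-level and one family-level group are both supported, which mirrors the nesting of the Rickettsiaceae and Anaplasmataceae families within the Rickettsiales.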

C. Signature Sequences for Other Subgroups of α-Proteobacteria and Providing Information Regarding Their Interrelationships

Signature sequences in a number of other proteins are useful in distinguishing other subgroups of α-proteobacteria and they also provide information clarifying the interrelationships among them. In the DnaA protein involved in chromosomal replication (Messer 2002), a 5 aa insert is present in various Rhizobiales and Caulobacter/Rhodobacter species (Figure 19A). However, this insert is not found in any of the Rickettsiales, as well as most α-proteobacterial species belonging to the orders Sphingomonadales and Rhodospirillales. The species Mag. magnetotacticum contains two different homologs of this protein, only one of which is found to contain the insert. Another insert showing a similar distribution pattern is present in the protein RP-057, which is a homolog of the glucose-inhibited division protein B (Romanowski et al. 2002). This protein contains a 3 aa insert that is common to the same subgroups of α-proteobacteria



FIG. 14. Signature sequences in alanyl-tRNA synthetase (AlaRS) and MutS proteins that are informative for the α-proteobacteria. In AlaRS (upper panel) an insert of variable length in a highly conserved region is present in various α-proteobacteria, except the Rickettsiales and Mag. magnetotacticum. The DNA mismatch repair protein MutS (lower panel) also contains a 3–5 aa insert in various α-proteobacteria, except Rickettsiales. The insert lengths in this case also serve to differentiate Rhodospirillales and Sphingomonadales species from the Rickettsiales, Rhodobacterales, and Caulobacterales.



FIG. 15. Partial sequence alignments of MutY (upper panel) and homoserine dehydrogenase (lower panel) proteins showing inserts (boxed) in conserved regions that are specific for α-proteobacteria. The homologs of both these proteins are not found in the Rickettsiales. For MutY, an insert of approximately similar length is also present in various eukaryotic homologs, with the exception of Anopheles gambiae.

as the insert in the DnaA protein, but which is not found in the Rickettsiales or Rhodospirillales/Sphingomonadales species (Figure 19B). Variable-length inserts are also present in this position in other bacteria (not shown). However, within proteobacteria this insert is limited to the above subgroups of α-proteobacteria. Based on the distribution patterns of these signatures, these inserts were likely introduced in a common ancestor of the Rhizobiales and Caulobacter/Rhodobacter after the branching of the Rickettsiales and Rhodospirillales/Sphingomonadales orders (Figure 19C).
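The logic of deducing a branching order from nested signature distributions can be illustrated with a small sketch. The Python code below rests on simplifying assumptions that are not part of the original analysis: each signature is treated as a clean presence/absence character at the level of whole orders, exceptions such as Mag. magnetotacticum and species with two homologs are ignored, and the names SIGNATURE_CARRIERS, is_nested and branching_tiers are introduced here for illustration. Orders that carry fewer of the nested signatures are placed in earlier-branching tiers.

```python
from collections import defaultdict
from typing import Dict, List, Set

ORDERS = {
    "Rickettsiales", "Rhodospirillales", "Sphingomonadales",
    "Rhodobacterales", "Caulobacterales", "Rhizobiales",
}

# Simplified, order-level carrier sets for three classes of signatures in the text.
SIGNATURE_CARRIERS: Dict[str, Set[str]] = {
    "alpha-wide (e.g. Cyt b deletion)": set(ORDERS),
    "post-Rickettsiales (e.g. Cox I insert)": ORDERS - {"Rickettsiales"},
    "DnaA / RP-057 inserts": {"Rhizobiales", "Rhodobacterales", "Caulobacterales"},
}


def is_nested(carrier_sets: List[Set[str]]) -> bool:
    """True if, ordered from largest to smallest, each set contains the next."""
    ordered = sorted(carrier_sets, key=len, reverse=True)
    return all(small <= big for big, small in zip(ordered, ordered[1:]))


def branching_tiers(carriers: Dict[str, Set[str]]) -> List[List[str]]:
    """Group taxa by how many nested signatures they carry, earliest branch first."""
    if not is_nested(list(carriers.values())):
        raise ValueError("signature distributions are not nested")
    counts: Dict[str, int] = defaultdict(int)
    for taxa in carriers.values():
        for taxon in taxa:
            counts[taxon] += 1
    tiers: Dict[int, List[str]] = defaultdict(list)
    for taxon, n in counts.items():
        tiers[n].append(taxon)
    return [sorted(tiers[n]) for n in sorted(tiers)]


for tier in branching_tiers(SIGNATURE_CARRIERS):
    print(tier)
```

Under these assumptions the tiers come out as Rickettsiales, then Rhodospirillales and Sphingomonadales, then Caulobacterales, Rhizobiales and Rhodobacterales. Taxa within one tier are not resolved by these signatures alone; additional signatures are needed to order them.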



FIG. 16. Signature sequences in XerD integrase (upper panel) and leucine aminopeptidase (lower panel) that are distinctive of the Rickettsiales order and not found in other α-proteobacteria or other bacteria.



FIG. 17. Signature sequences in transcription repair coupling factor Mfd (A), ribosomal protein L19 (B), and FtsZ (C) proteins that are distinctive of Rickettsia species and not found in other α-proteobacteria, including species of the Anaplasmataceae family (e.g., Wolbachia, Ehrlichia, Anaplasma). Two additional signatures showing a similar distribution are found in the sigma factor-70 and exonuclease VII proteins.



FIG. 18. Signature sequences in RP-314 (A) and tRNA guanine transglycosylase (Tgt) (B) proteins that are distinctive of the Anaplasmataceae family of species and not found in Rickettsia or various other bacteria.

The protein DNA ligase (NAD dependent; LigA) contains a 12 aa insert in a highly conserved region that is commonly shared by various Rhizobiales as well as Rhodobacterales species (Figure 20A), but which is not found in C. crescentus, Rhodospirillales (Rhodo. rubrum, Mag. magnetotacticum), and Sphingomonadales (Z. mobilis, Novo. aromaticivorans). The absence of this insert in Mesorhizobium sp. BNC1 is somewhat surprising, but it could result from non-specific mechanisms. This signature suggests that Rhizobiales species may be more closely related to Rhodobacterales in comparison to



FIG. 19. Partial sequence alignments of DnaA (panel A) and RP-057 (panel B) proteins showing inserts in conserved regions (boxed) that are only present in various Rhizobiales, Rhodobacterales, and Caulobacter, but not found in other α-proteobacteria or bacteria. These inserts were likely introduced in a common ancestor of the above groups after the branching of Rickettsiales, Rhodospirillales, and Sphingomonadales as indicated in panel C.

Caulobacter and other α-proteobacteria. However, another prominent insert (11 aa) in a highly conserved region of the protein asparagine-glutamine amidotransferase points to a specific relationship between Rhodobacterales and Caulobacter species (Figure 20B), to the exclusion of all other α-proteobacteria. Martins-Pinheiro et al. (2004) have reported a phylogenetic analysis based on LigA sequences. The α-proteobacteria formed a distinct clade in the tree, but they consisted of only certain Rhizobiaceae and Caulobacter species (Martins-Pinheiro et al. 2004). To fully understand the evolutionary significance of these signatures, it would be necessary to obtain sequence information for these proteins from additional Caulobacterales. We have also identified many conserved inserts that are specific for species belonging to the Rhizobiales order. The



FIG. 20. Signature sequences in DNA ligase A (upper panel) and Asn-Gln amidotransferase (lower panel) that are informative for α-proteobacteria. The signature in DNA ligase is commonly shared by various Rhizobiales as well as C. crescentus species, while that in the Asn-Gln amidotransferase is uniquely shared by Rhodobacterales and Caulobacter, indicating a specific relationship between these subgroups.

Trp-tRNA synthetase (TrpRS) contains a large insert in a highly conserved region which is uniquely shared by various Rhizobiales species (Figure 21A), but not found in any of the other α-proteobacteria or other groups of bacteria (results for other groups of bacteria not shown). The absence of this insert in various Rickettsiales, Rhodospirillales, Sphingomonadales, and Rhodobacterales as well as Caulobacter provides evidence that these groups branched off prior to the introduction of this insert (Figure 21A). The length of the insert in TrpRS also serves to distinguish species of the Rhizobiaceae, Brucellaceae, and Phyllobacteriaceae families from those belonging to Bradyrhizobiaceae and Methylobacteriaceae. The insert in the former group of species is 19 aa long, whereas the latter species contain only a 9–10 aa insert. Because the insert sequence in all of these species is conserved, it is likely that the insert was introduced only once in a common ancestor of the Rhizobiales and that subsequent modification has led to the observed length variation. The distinctness of Bradyrhizobium and Rhodopseudomonas from other Rhizobiales is also supported by a signature (3 aa insert) in Seryl-tRNA synthetase (SerRS), which is uniquely present in these species (Figure 21B) and serves to distinguish them from other Rhizobiales as well as other α-proteobacteria. A schematic diagram indicating the suggested positions where the signatures described in Figures 20 to 23 have been introduced is presented in Figure 21C. We have also identified several signatures that are uniquely present in species of the Rhizobiaceae, Brucellaceae, and Phyllobacteriaceae families, but not found in other α-proteobacteria including Bradyrhizobium and Rhodopseudomonas. These signatures include a 7 aa insert in Oxoglutarate dehydrogenase (Figure 22A), a 5 aa insert in Succ-CoA synthase (Figure 22B),



FIG. 21. Signature sequences in Trp-tRNA synthetase (upper panel) and Ser-tRNA synthetase (lower panel) that are informative for α-proteobacteria. The first of these signatures is specific for Rhizobiales. The insert length in this signature also distinguishes Bradyrhizobiaceae and Methylobacteriaceae species from other Rhizobiales. The insert in the Ser-tRNA synthetase is specific for the Bradyrhizobiaceae species and distinguishes this family from other Rhizobiales.

a 3 aa insert in LytB metalloproteinase (Figure 23A) and a 2 aa insert in DNA gyrase A subunit (Figure 23B). A smaller insert in oxoglutarate dehydrogenase is also present in Novosphingobacteria, but since its sequence is unrelated, it is either of independent origin or could have resulted from LGT. In addition to these proteins, a 1–2 aa insert that is specific for Rhizobiaceae is also found in a conserved region of the LepA protein (Figure 23C). The evolutionary positions where these signatures have been introduced are indicated in Figure 21C. It is of interest that in contrast to other Rhizobiaceae species, which contain only 1 aa inserts, Sinorhizobium meliloti and Agrobacterium tumefaciens are found to contain 2 aa inserts in the LepA protein (Figure 23C). This observation points to a specific relationship between these two Rhizobiaceae species, as has been suggested based on other lines of evidence (Young et al. 2001). A 2 aa insert in the DnaK protein, which is commonly shared by species belonging to the Rhizobium and Sinorhizobium genera, as well as Ehrlichia and a few other proteobacteria, has also been described by Stepkowski et al. (2003).

D. Signature Sequences Indicating the Phylogenetic Placement of α-Proteobacteria

A number of signatures described in earlier work have indicated that proteobacteria is a late branching phylum in comparison to other main groups within Bacteria (Gupta 1998, 2000, 2003; Gupta & Griffiths 2002; Griffiths & Gupta 2004b). These signatures included a 4 aa insert in alanyl-tRNA synthetase, an insert of >100 aa in RNA polymerase β (RpoB) subunit, a 10 aa insert in CTP synthase, a 2 aa insert in inorganic pyrophosphatase, and a 2 aa insert in Hsp70 protein. The identified signatures in these proteins were present in all proteobacterial homologs, but they were absent from most other bacterial phyla (viz. Firmicutes, Actinobacteria, Thermotogae, Deinococcus-Thermus, Cyanobacteria, Spirochetes). In a number of cases,



FIG. 22. Signature sequences in Oxoglutarate dehydrogenase (upper panel) and Succ-CoA synthase (lower panel) proteins that are commonly shared by only certain Rhizobiales families (e.g., Rhizobiaceae, Brucellaceae, and Phyllobacteriaceae), and not found in Bradyrhizobiaceae or other α-proteobacteria.

where the corresponding proteins were present in Archaea (viz. RpoB, Hsp70, AlaRS), the archaeal homologs also lacked the indicated inserts, indicating that the absence of these indels constitutes the ancestral state and that these signatures were introduced after the branching of the groups lacking these indels (Gupta & Griffiths 2002; Gupta 2003; Griffiths & Gupta 2004b). A number of identified signatures (7 aa insert in SecA, 1 aa deletion in the Lon protease) were uniquely shared by only the α, β, and γ-proteobacteria, providing evidence of the later branching of these subdivisions (Gupta 2000, 2001, 2003). Two additional signatures that are helpful in understanding the phylogenetic placement of α-proteobacteria are described in the following section.

Figure 24 shows an excerpt from a sequence alignment for the transcription termination factor Rho, which is an RNA-binding protein that plays a central role in RNA chain termination (Opperman & Richardson 1994). This protein is present in all main groups of bacteria, except cyanobacteria (Gupta & Griffiths 2002; Gupta 2003), where RNA chain termination presumably occurs via a Rho-independent mechanism. A 3 aa insert is present in a highly conserved region of Rho, which is a distinctive characteristic of all α, β, and γ-proteobacteria. The length of this insert is 2–3 aa longer in various Rickettsiales species, which suggests an additional insert in this group of bacteria. In contrast to the α, β, and γ-proteobacteria, this insert is not present in δ, ε-proteobacteria or any other



FIG. 23. Signature sequences in LytB (A), DNA gyrase A (B), and LepA (C) proteins that are distinctive characteristics of only certain Rhizobiales families (e.g., Rhizobiaceae, Brucellaceae, and Phyllobacteriaceae), but not found in Bradyrhizobiaceae or other α-proteobacteria.



FIG. 24. Partial sequence alignment of Rho protein showing a conserved insert that is commonly shared by various α, β, and γ-proteobacteria, but not found in any other groups of bacteria including the δ, ε-proteobacteria and all other phyla of Gram-positive and Gram-negative bacteria. This insert was likely introduced in a common ancestor of the α, β, and γ-proteobacteria after the branching of other bacterial phyla (see Figure 26). Many other signatures showing a similar distribution pattern and supporting the indicated branching position of α, β, and γ-proteobacteria have been described in earlier work.

groups of Gram-negative and Gram-positive bacteria. This signature provides evidence that the groups consisting of α, β, and γ-proteobacteria have branched off late in comparison to the other groups of bacteria. Another novel signature that is useful in understanding the branching position of α-proteobacteria is present in the ATP synthase alpha subunit. In this case, an 11 aa insert in a highly conserved region is present in various β and γ-proteobacteria, but it is not found in any α-proteobacteria or other groups of bacteria (Figure 25). The absence of this insert in various other bacteria as well as archaeal homologs provides evidence that it was introduced in a common ancestor of the β and γ-proteobacteria after the divergence of other bacteria, including α-proteobacteria (Figure 26).
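The branching argument built from the Rho and ATP synthase α-subunit signatures can be sketched as a nesting test on the sets of groups that carry each insert. The sketch below is illustrative only. It assumes, as the text argues from the archaeal homologs, that absence of an insert is the ancestral state, and it reduces the alignments to group-level calls; the names `DERIVED`, `order_by_nesting`, and `branched_between` are hypothetical.

```python
from typing import Dict, FrozenSet, List

# Groups carrying each insert (the derived state), as described in the text.
# Absence is assumed to be ancestral, following the outgroup argument.
DERIVED: Dict[str, FrozenSet[str]] = {
    "Rho 3 aa insert": frozenset({"alpha", "beta", "gamma"}),
    "ATP synthase alpha 11 aa insert": frozenset({"beta", "gamma"}),
}


def order_by_nesting(derived: Dict[str, FrozenSet[str]]) -> List[str]:
    """Order signatures from earliest to latest introduction.

    A signature whose carrier set strictly contains another's is taken to
    have been introduced earlier. Raises ValueError if the carrier sets do
    not form a strictly nested chain.
    """
    names = sorted(derived, key=lambda name: len(derived[name]), reverse=True)
    for outer, inner in zip(names, names[1:]):
        if not derived[inner] < derived[outer]:
            raise ValueError(f"{inner!r} is not nested within {outer!r}")
    return names


def branched_between(
    derived: Dict[str, FrozenSet[str]], older: str, younger: str
) -> FrozenSet[str]:
    """Groups that carry the older insert but lack the younger one."""
    return derived[older] - derived[younger]


if __name__ == "__main__":
    order = order_by_nesting(DERIVED)
    print("order of introduction:", order)
    print("branched in between:", sorted(branched_between(DERIVED, *order)))
```

Under these assumptions, the α-proteobacteria are the only group that carries the Rho insert but lacks the ATP synthase insert, which places their divergence between the two events.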



FIG. 25. Partial sequence alignment of ATP synthase α-subunit showing a highly conserved insert that is commonly shared by various β and γ-proteobacteria, but not found in any other groups of bacteria including the α- and δ, ε-proteobacteria and all other phyla of Gram-positive and Gram-negative bacteria. This insert is also not present in archaeal or eukaryotic homologs, indicating that it was introduced in a common ancestor of the β and γ-proteobacteria after the branching of all other groups including α-proteobacteria. Other signatures showing similar relationships have been described in earlier work (Gupta 1998, 2000, 2001, 2003).



FIG. 26. Evolutionary relationships among α-proteobacteria based on signature sequences in different proteins. The branching position of α-proteobacteria relative to other groups of bacteria is based on signature sequences such as those shown in Figures 24 and 25. The evolutionary stages where these signatures have been introduced are indicated by thick arrows. Many other signatures that are helpful in resolving the branching order of other groups have been described in our earlier work (Gupta 1998, 2000, 2001, 2003, 2004; Gupta & Griffiths 2002; Griffiths & Gupta 2004b (see also www.bacterialphylogeny.com)). The evolutionary relationship among α-proteobacteria shown here was deduced based on the distribution patterns of different signatures described in this review. The long thin arrows mark the positions where the signature sequences in various proteins have likely been introduced.
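One way to check whether a set of signatures can be placed on a single branching diagram of this kind is a pairwise compatibility test: when absence is ancestral, two signatures fit the same tree only if their carrier sets are nested or disjoint. The following is a minimal sketch under stated assumptions. The group-level calls are simplified from the descriptions in this review (for example, the missing LigA insert in Mesorhizobium sp. BNC1 is ignored and Caulobacterales is represented by C. crescentus alone), and the names `DERIVED`, `compatible`, and `conflicts` are hypothetical.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List, Tuple

# Orders carrying each insert, simplified from the text to group level.
DERIVED: Dict[str, FrozenSet[str]] = {
    "MutS insert": frozenset({
        "Rhodospirillales", "Sphingomonadales",
        "Rhodobacterales", "Caulobacterales", "Rhizobiales",
    }),
    "DnaA insert": frozenset({"Rhodobacterales", "Caulobacterales", "Rhizobiales"}),
    "TrpRS insert": frozenset({"Rhizobiales"}),
    "Asn-Gln amidotransferase insert": frozenset({"Rhodobacterales", "Caulobacterales"}),
    "LigA insert": frozenset({"Rhizobiales", "Rhodobacterales"}),
}


def compatible(a: FrozenSet[str], b: FrozenSet[str]) -> bool:
    """Two carrier sets fit one tree if they are nested or disjoint."""
    return a <= b or b <= a or a.isdisjoint(b)


def conflicts(derived: Dict[str, FrozenSet[str]]) -> List[Tuple[str, str]]:
    """Return every pair of signatures whose carrier sets partially overlap."""
    return [
        (x, y)
        for x, y in combinations(derived, 2)
        if not compatible(derived[x], derived[y])
    ]


if __name__ == "__main__":
    for pair in conflicts(DERIVED):
        print("conflict:", pair)
```

With these simplified inputs, the only conflicting pair is the LigA insert and the Asn-Gln amidotransferase insert, which disagree on the placement of C. crescentus; all other pairs are nested or disjoint.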



CONCLUSIONS

The α-proteobacteria are a morphologically and metabolically very diverse group of organisms, which are presently recognized as a distinct group solely on the basis of their branching pattern in the 16S rRNA tree (Woese et al. 1984; Stackebrandt et al. 1988; Murray et al. 1990; De Ley 1992; Ludwig & Klenk 2001; Kersters et al. 2003). No biochemical, molecular, or other features are presently known that are uniquely shared by various α-proteobacteria and that can clearly distinguish this group from all others. The evolutionary relationships within this group of bacteria are also presently not understood. This review describes many novel signatures consisting of conserved inserts and deletions in widely distributed proteins that provide definitive means for defining the α-proteobacteria and many of its subgroups, and for understanding evolutionary relationships among them. Because of the rarity and highly specific nature of these genetic changes, the possibility of their arising independently by either convergent or parallel evolution is low (Gupta 1998; Rokas & Holland 2000). The simplest and most parsimonious explanation for such rare genetic changes, when restricted to a particular clade(s), is that they were introduced only once in common ancestors of the particular group(s) and then passed on to various descendants. The signature approach has proven very useful in the past in clarifying a number of important evolutionary relationships, which could not be reliably resolved based on phylogenetic trees (Rivera & Lake 1992; Baldauf & Palmer 1993). Our earlier work has identified many signatures that are either specific for particular groups of bacteria (viz. chlamydiae, cyanobacteria, Bacteroidetes-Chlorobi-Fibrobacter, Deinococcus-Thermus, Proteobacteria) (Gupta 2000, 2004; Griffiths & Gupta 2002, 2004a; Gupta et al.
2003), or which are commonly shared by certain bacterial phyla providing information regarding their interrelationships (Gupta 1998, 2003; Gupta & Griffiths 2002; Griffiths & Gupta 2004b).

A summary of the different signatures that were described in this review and the overall picture of α-proteobacterial evolution that emerges based upon them is presented in Figure 26. Most of the signatures described here were unique for either all α-proteobacteria or certain of its subgroups, and except for a few isolated instances, they were not found in other bacteria. These findings provide evidence that the genes containing these signatures have not been laterally transferred from α-proteobacteria to other bacteria, although LGT for certain other genes has been previously reported (Wolf et al. 1999). A large number of these signatures, present in broadly distributed proteins (cytochrome assembly protein CtaG, SAICAR synthetase, DnaB, ATP synthase α, exonuclease VII, PLPG transferase, RP-400, pyruvate phosphate dikinase, FtsK, and Cyt b), were distinctive characteristics of all α-proteobacteria. Two additional proteins, MutY and homoserine dehydrogenase, also contain signatures that were specific for α-proteobacteria. However, the homologs of these proteins were not found in Rickettsiales. These signatures, for the first time, describe molecular characteristics that unify all α-proteobacteria, and provide means to clearly distinguish them from all other bacteria. The unique presence of these signatures in various α-proteobacteria, which is a very diverse group (Kersters et al. 2003), strongly suggests that these indels should be functionally important for this group of organisms. Hence, studies examining their functional effects should be of much interest.

Signature sequences in other proteins are helpful in defining many of the α-proteobacteria subgroups and in clarifying evolutionary relationships among them. A number of proteins, which include Succ-CoA synthetase, Cox I, AlaRS, and MutS, contain conserved inserts that are shared by all other α-proteobacteria, except the Rickettsiales. The homologs of these proteins from other bacteria also lack these indels, providing evidence that these signatures were introduced in a common ancestor of other α-proteobacteria after the divergence of Rickettsiales. The Rickettsiales order also consistently forms the deepest branching lineage in 16S rRNA and various protein trees (Dumler et al. 2001; Gaunt et al. 2001; Kersters et al. 2003; Yu & Walker 2003; Stepkowski et al. 2003). Signature sequences in a number of proteins were found to be specific for either the Rickettsiales order (viz. XerD integrase and leucine aminopeptidase) or the two main families, Rickettsiaceae (viz. transcription repair coupling factor, ribosomal protein L19, and FtsZ proteins) and Anaplasmataceae (RP-314 and Tgt proteins). These signatures were likely introduced in the common ancestors of these groups. These groups are also clearly distinguished in the phylogenetic trees based on 16S rRNA (Figure 1) (Dumler et al. 2001; Yu & Walker 2003) and various proteins (Figure 7) (Stepkowski et al. 2003; Emelyanov 2003a). Signature sequences in a number of proteins (viz. chromosomal replication factor, RP-057 and DNA ligase) were commonly shared by various Rhizobiales, Rhodobacterales, and in most cases Caulobacterales (currently represented by only C.
crescentus), but they were not present in Rickettsiales, Rhodospirillales, or Sphingomonadales species. These results provide evidence that the groups lacking these signatures diverged prior to the introduction of these signatures. A unique signature has also been identified for the Rhizobiales order (viz. in TrpRS), as has one that is commonly shared by Rhodobacterales and C. crescentus. The latter signature suggests a specific relationship between the Rhodobacterales and Caulobacter groups. The relationships indicated by these signatures are also generally supported by the phylogenetic trees based on 16S rRNA and various proteins (Gaunt et al. 2001; Kersters et al. 2003; Stepkowski et al. 2003; Emelyanov 2003a). Signature sequences in a number of other proteins (viz. oxoglutarate dehydrogenase, Succ-CoA synthase, DNA gyrase A, LepA, and LytB) are able to distinguish the Rhizobiaceae, Brucellaceae, and Phyllobacteriaceae families from the Bradyrhizobiaceae species. The distinctness of Bradyrhizobiaceae from other Rhizobiales is also clearly indicated by a signature sequence in seryl-tRNA synthetase that is specific for this group. These signatures are also consistent with the observation that Bradyrhizobiaceae species are only distantly related to other Rhizobiales (viz. Rhizobiaceae, Brucellaceae, and Phyllobacteriaceae) (Figure 1) (Sadowsky & Graham 2000; Gaunt et al. 2001; van Berkum et al. 2003; Kersters et al. 2003; Stepkowski et al. 2003; Moulin et al. 2004). A specific relationship between Sinorhizobium and Agrobacterium species was also indicated by the signature sequence in the LepA protein.

On the basis of 16S rRNA or various gene/protein trees, it has proven difficult to reliably determine the interrelationships among different α-proteobacterial subgroups (Ludwig & Klenk 2001; Kersters et al. 2003). However, based upon the distribution patterns of various signatures, it is now possible to logically deduce the branching order of the main α-proteobacterial subgroups (Figure 26). The model for α-proteobacterial evolution that has been developed here is based upon a large number of proteins, which are involved in different functions. This model is internally highly consistent, and it is difficult to logically explain the observed distributions of these signatures by alternate means. The model developed here is also consistent with the relationships that are resolved in the 16S rRNA or other phylogenetic trees (viz. deep branching and distinctness of Rickettsiales; a closer relationship between Rhizobiaceae, Brucellaceae, and Phyllobacteriaceae as compared to Bradyrhizobiaceae; a closer relationship between Rhodobacterales and Caulobacterales; distinctness of Rickettsiaceae from Anaplasmataceae species; distinctness of the Rhizobiales order containing various root nodule bacteria, etc.) (Sadowsky & Graham 2000; Dumler et al. 2001; Kersters et al. 2003; Yu & Walker 2003; Moulin et al. 2004). A few minor inconsistencies seen at present (e.g., phylogenetic placement of Ca. crescentus) should be clarified when sequence information from additional species becomes available. In this context, it is important to acknowledge that sequence information is available at present from only a limited number of α-proteobacterial species.
Although these species include representatives from different α-proteobacterial orders, it is necessary to obtain sequence information for many other species from different genera and families to test and validate this model.

Signature sequences in a number of proteins, a few of which are described here, also provide evidence that α-proteobacteria is a late diverging group within Bacteria (Gupta 1998, 2000, 2003; Gupta & Griffiths 2002). Within proteobacteria, δ and ε-subdivisions are indicated to have branched prior to α-proteobacteria, whereas β and γ-subdivisions are indicated as later branching groups (see also www.bacterialphylogeny.com). The branching of α-proteobacteria in this position is also supported by the 16S rRNA and various protein trees (Olsen et al. 1994; Viale et al. 1994; Eisen 1995; Kersters et al. 2003). The α-proteobacteria, which is a very large group within Bacteria (>5000 entries in the RDP-II database) (Maidak et al. 2001), are presently recognized as a Class within the Proteobacteria phylum (Woese et al. 1984; Stackebrandt et al. 1988; Murray et al. 1990; Ludwig & Schleifer 1999; Boone et al. 2001; Kersters et al. 2003). However, presently there are no clearly defined criteria for the higher taxa (viz. Phylum, Class, Order, etc.) within Bacteria (Woese et al. 1985; Stackebrandt 2000; Ludwig & Klenk 2001; Gupta & Griffiths 2002; Gupta 2002). Based on the observations that α-proteobacteria can now be clearly distinguished from all other bacteria based upon a large number of molecular characteristics, and that this group also branches distinctly from all other groups of bacteria including the β, γ- and δ, ε-proteobacteria, it is suggested that α-proteobacteria should be recognized as a main group or phylum within Bacteria, rather than as a subdivision or class of the Proteobacteria (Gupta 2000, 2004; Gupta & Griffiths 2002). Signature sequences in a few proteins (viz. PPDK and FtsK) indicate that α-proteobacteria might have shared a distant ancestry with the δ-proteobacteria exclusive of other bacteria, but this relationship needs to be further investigated and confirmed.

The α-proteobacteria have also given rise to mitochondria (Margulis 1970; Gray & Doolittle 1982; Andersson et al. 1998; Sicheritz-Ponten et al. 1998; Gray et al. 1999; Gupta 2000; Emelyanov 2001a, 2003a, 2003b) and very likely played a central role in the origin of the ancestral eukaryotic cell (Gupta & Singh 1994; Gupta & Golding 1996; Margulis 1996; Gupta 1998; Martin & Muller 1998; Lopez-Garcia & Moreira 1999; Karlin et al. 1999; Lang et al. 1999; Emelyanov 2003b; Rivera & Lake 2004). Many of the α-proteobacteria specific signatures identified in the present work are also present in the mitochondrial/eukaryotic homologs, providing additional evidence of their derivation from an α-proteobacterial ancestor. In a few cases, the α-proteobacterial signatures are present in genes which are encoded by the mitochondrial DNA (viz. Cox I and Cyt b). The shared presence of these signatures in the mitochondrial homologs provides further strong evidence for the α-proteobacterial ancestry of mitochondria, as previously shown by phylogenetic analysis (Andersson et al. 1998; Sicheritz-Ponten et al. 1998; Emelyanov 2003a).
The current evidence suggests that within α-proteobacteria, the Rickettsiales group of species are the closest relatives of mitochondria (Gupta 1995; Andersson et al. 1998; Sicheritz-Ponten et al. 1998; Gray et al. 1999; Lang et al. 1999; Emelyanov 2001a, 2001b). However, this view is supported by only some of the identified signatures and further work is needed to clarify this aspect.

LIST OF ABBREVIATIONS AlaRS, alanyl-tRNA synthetase; CFBG, ChlamydiaFibrobacter-Bacteroidetes-Green sulfur bacteria; Cyt., Cytochrome; Cox I, Cytochrome oxidase polypeptide I; LGT, lateral gene transfer; PLPG, Prolipoprotein-phosphatidylgycerol; PPDK, pyruvate phosphate dikinase; RP, Rickettsia prowazekii; SerRS, serine-tRNA synthetase; Succ-CoA, Succinyl-CoA; Tgt, tRNA-guanine transglycosylase; TrpRS, tryptophanyl-tRNA synthetase; Abbreviations in the species names are: A., Agrobacterium; Ana., Anaplasma; Aqu., Aquifex; Azo., Azotobacter; Azospir., Azospirillum; Bac., Bacillus; Bact., Bacteroides; Bart., Bartonella; Bdello., Bdeollovibrio; Bif., Bifidobacterium; Bor., Borrelia; Bord., Bordetella; Brad. Bradyrhizobium; Bru.,


132

R. S. GUPTA Boone, D.R., Castenholz, R.W., and Garrity, G.M. 2001. Bergey’s Manual of Systematic Bacteriology. Springer, New York. Boussau, B., Karlberg, E.O., Frank, A.C., Legault, B.A., and Andersson, S.G. 2004. Computational inference of scenarios for alpha-proteobacterial genome evolution. Proc. Natl. Acad. Sci. USA 101, 9722–9727. Bridger, W.A., Wolodko, W.T., Henning, W., Upton, C., Majumdar, R., and Williams, S.P. 1987. The subunits of succinyl-coenzyme A synthetase—function and assembly. Biochem. Soc. Symp. 103–111. Broughton, W. J. 2003. Roses by other names: Taxonomy of the Rhizobiaceae. J. Bacteriol. 185, 2975–2979. Capiaux, H., Lesterlin, C., Perals, K., Louarn, J.M., and Cornet, F. 2002. A dual role for the FtsK protein in Escherichia coli chromosome segregation. EMBO Rep. 3, 532–536. Chase, J.W., Rabin, B.A., Murphy, J.B., Stone, K.L., and Williams, K.R. 1986. Escherichia coli exonuclease VII. Cloning and sequencing of the gene encoding the large subunit (xseA). J. Biol. Chem. 261, 14929–14935. Daldal, F., Davidson, E., and Cheng, S. 1987. Isolation of the structural genes for the Rieske Fe-S protein, cytochrome b and cytochrome c1 all components of the ubiquinol: Cytochrome c2 oxidoreductase complex of Rhodopseudomonas capsulata. J. Mol. Biol. 195, 1–12. Davidson, E., and Daldal, F. 1987. Primary structure of the bc1 complex of Rhodopseudomonas capsulata. Nucleotide sequence of the pet operon encoding the Rieske cytochrome b, and cytochrome c1 apoproteins. J. Mol. Biol. 195, 13–24. De Ley, J. 1992. The Proteobacteria: Ribosomal RNA cistron similarities and bacterial taxonomy. In The Prokaryotes, eds. A. Balows, H.G. Tr¨ per, u M. Dworkin, W. Harder, and K.H. Schleifer, 2111–2140. Springer-Verlag, New York. DelVecchio, V.G., Kapatral, V., Redkar, R.J., Patra, G., Mujer, C., Los, T., Ivanova, N., Anderson, I., Bhattacharyya, A., Lykidis, A., Reznik, G., Jablonski, L., Larsen, N., D’Souza, M., Bernal, A., Mazur, M., Goltsman, E., Selkov, E., Elzer, P. 
H., Hagius, S., O’Callaghan, D., Letesson, J. J., Haselkorn, R., and Kyrpides, N. 2002. The genome sequence ofthe facultative intracellular pathogen Brucella melitensis. Proc. Natl. Acad. Sci. USA 99, 443–448. Dumler, J.S., Barbet, A.F., Bekker, C.P., Dasch, G.A., Palmer, G.H., Ray, S.C., Rikihisa, Y., and Rurangirwa, F.R. 2001. Reorganization of genera in the families Rickettsiaceae and Anaplasmataceae in the order Rickettsiales: Unification of some species of Ehrlichia with Anaplasma, Cowdria with Ehrlichia and Ehrlichia with Neorickettsia, descriptions of six new species combinations and designation of Ehrlichia equi and ‘HGE agent’ as subjective synonyms of Ehrlichia phagocytophila. Int. J. Syst. Evol. Microbiol. 51, 2145–2165. Eisen, J.A. 1995. The RecA protein as a model molecule for molecular systematic studies of bacteria: Comparison of trees of RecAs and 16S rRNAs from the same species. J. Mol. Evol. 41, 1105–1123. Emelyanov, V.V. 2001a. Evolutionary relationship of Rickettsiae and mitochondria. FEBS Letters 501, 11–18. Emelyanov, V.V. 2001b. Rickettsiaceae, rickettsia-like endosymbionts, and the origin of mitochondria. Biosci. Rep. 21, 1–17. Emelyanov, V.V. 2003a. Common evolutionary origin of mitochondrial and rickettsial respiratory chains. Arch. Biochem. Biophys. 420, 130–141. Emelyanov, V.V. 2003b. Mitochondrial connection to the origin of eukaryotic cell. Eur. J. Biochem. 270, 1599–1618. Espeli, O., Lee, C., and Marians, K.J. 2003. A physical and functional interaction between Escherichia coli FtsK and topoisomerase IV. J. Biol. Chem. 278, 44639–44644. Esser, C., Ahmadinejad, N., Wiegand, C., Rotte, C., Sebastiani, F., GeliusDietrich, G., Henze, K., Kretschmann, E., Richly, E., Leister, D.,Bryant, D., Steel, M.A., Lockhart, P.J., Penny, D., and Martin, W. 2004. A Genome Phylogeny for Mitochondria Among -Proteobacteria and a Predominantly Eubacterial Ancestry of Yeast Nuclear Genes. Mol. Biol. Evol. 21, 1643–1660. Falah, M., and Gupta, R.S. 1994. 
Cloning of the hsp70 (dnaK) genes from Rhizobium meliloti and Pseudomonas cepacia: Phylogenetic analyses of mitochondrial origin based on a highly conserved protein sequence. J. Bacteriol. 176, 7748–7753.

Brucella; Buch., Buchnera; Burk., Burkholderia; Ca., Caulobacter; Camp., Campylobacter; Cb., Chlorobium; Cfx., Chloroflexus; Chl., Chlamydia; Chlam, Chlamydophila; Chromo., Chromobacterium; Clo., Clostridium; Cor., Corynebacterium; Cox., Coxiella; Cyt., Cytophaga; Dei., Deinococcus; Dechloro., Dechloromonas; Des., Desulfovibrio; Desulf., Desulfitobacterium; Dros. endo., Drosophila endosymbiont; E., Escherichia; Ent., Enterococcus; Fuso., Fusobacterium; Geo., Geobacter; H., Haemophilus; Hel., Helicobacter; Lac., Lactococcus; Lactobac., Lactobacillus; Lep., Leptospira; Lis., Listeria; Leg., Legionella; Mag., Magnetococcus; Meso., Mesorhizobium; Methano., Methanobacterium; Methyl., Methylobacillus; Microbul., Microbulbifer; Myc., Mycobacterium; Myx., Myxococcus; Nei., Neisseria; Nit., Nitrosomonas; Nitro., Nitrosospira; Novo., Novosphingobium; Olig., Oligotropha; Para., Paracoccus; Pas., Pasteurella; Photobac., Photobacterium; Por., Porphyromonas; Pse., Pseudomonas; Ral., Ralstonia; Rhi., Rhizobium; Rho., Rhodobacter; Rhodo., Rhodospirillum; Rhodopseud., Rhodopseudomonas; Ri., Rickettsia; Shew., Shewanella; Sino., Sinorhizobium; Sta., Staphylococcus; Str., Streptomyces; Strep., Streptococcus; Syn., Synechococcus; Sulfo., Sulfolobus; T., Thermotoga; Thermoan., Thermoanaerobacter; Thermosyn., Thermosynechococcus; Tre., Treponema; Vib., Vibrio; Xan., Xanthomonas; Thiobac., Thiobacillus; Wol., Wolinella; Xyl., Xylella; Yer., Yersinia; Z., Zymomonas.

ACKNOWLEDGMENTS
The competent technical assistance of Pinay Kanth, Jeveon Clements, Larissa Shamseer, and Adeel Mahmood in creating sequence alignments of proteins from Rickettsia prowazekii and other genomes is gratefully acknowledged. I am also thankful to Yan Li for developing certain computer programs that facilitated the creation of signature sequence files and for help in setting up the bacterial signatures website (www.bacterialphylogeny.com). Thanks are also due to Emma Griffiths and Pinay Kanth for helpful comments on the manuscript. The work on signature sequences described here was mostly completed by August 2004. This work was supported by a research grant from the Natural Sciences and Engineering Research Council of Canada and the Canadian Institutes of Health Research.

REFERENCES
Andersson, S.G., Zomorodipour, A., Andersson, J.O., Sicheritz-Ponten, T., Alsmark, U.C., Podowski, R.M., Naslund, A.K., Eriksson, A.S., Winkler, H.H., and Kurland, C.G. 1998. The genome sequence of Rickettsia prowazekii and the origin of mitochondria. Nature 396, 133–140.
Baldauf, S.L., and Palmer, J.D. 1993. Animals and fungi are each other’s closest relatives: Congruent evidence from multiple proteins. Proc. Natl. Acad. Sci. USA 90, 11558–11562.
Battistuzzi, F.U., Feijao, A., and Hedges, S.B. 2004. A genomic timescale of prokaryote evolution: Insights into the origin of methanogenesis, phototrophy, and the colonization of land. BMC. Evol. Biol. 4, 44.
Bengtsson, J., von Wachenfeldt, C., Winstedt, L., Nygaard, P., and Hederstedt, L. 2004. CtaG is required for formation of active cytochrome c oxidase in Bacillus subtilis. Microbiology 150, 415–425.


Galibert, F., Finan, T.M., Long, S.R., Puhler, A., Abola, P., Ampe, F., Barloy-Hubler, F., Barnett, M.J., Becker, A., Boistard, P., Bothe, G., Boutry, M., Bowser, L., Buhrmester, J., Cadieu, E., Capela, D., Chain, P., Cowie, A., Davis, R.W., Dreano, S., Federspiel, N.A., Fisher, R.F., Gloux, S., Godrie, T., Goffeau, A., Golding, B., Gouzy, J., Gurjal, M., Hernandez-Lucas, I., Hong, A., Huizar, L., Hyman, R.W., Jons, T., Kahn, D., Kahn, M.L., Kalman, S., Keating, D.H., Kiss, E., Komp, C., Lelaure, V., Masuy, D., Palm, C., Peck, M.C., Pohl, T.M., Portetelle, D., Purnelle, B., Ramsperger, U., Surzycki, R., Thebault, P., Vandenbol, M., Vorholter, F.J., Weidner, S., Wells, D.H., Wong, K., Yeh, K.C., and Batut, J. 2001. The composite genome of the legume symbiont Sinorhizobium meliloti. Science 293, 668–672.
Garrity, G.M., and Holt, J.G. 2001. The road map to the manual. In Bergey’s Manual of Systematic Bacteriology, eds. D. R. Boone and R. W. Castenholz, 119–166. Springer-Verlag, Berlin.
Gaunt, M.W., Turner, S.L., Rigottier-Gois, L., Lloyd-Macgilp, S.A., and Young, J.P. 2001. Phylogenies of atpD and recA support the small subunit rRNA-based classification of rhizobia. Int. J. Syst. Evol. Microbiol. 51, 2037–2048.
Gonzales, T., and Robert-Baudouy, J. 1996. Bacterial aminopeptidases: Properties and functions. FEMS Microbiol. Rev. 18, 319–344.
Gray, M.W. 1989. The evolutionary origins of organelles. Trends in Genet. 5, 294–299.
Gray, M.W., Burger, G., and Lang, B.F. 1999. Mitochondrial evolution. Science 283, 1476–1481.
Gray, M.W., and Doolittle, W.F. 1982. Has the endosymbiont hypothesis been proven? Microbiol. Rev. 46, 1–42.
Griffiths, E., and Gupta, R.S. 2002. Protein signatures distinctive of chlamydial species: Horizontal transfer of cell wall biosynthesis genes glmU from Archaebacteria to Chlamydiae, and murA between Chlamydiae and Streptomyces. Microbiology 148, 2541–2549.
Griffiths, E., and Gupta, R.S. 2004a. Distinctive protein signatures provide molecular markers and evidence for the monophyletic nature of the Deinococcus-Thermus phylum. J. Bacteriol. 186, 3097–3107.
Griffiths, E., and Gupta, R.S. 2004b. Signature sequences in diverse proteins provide evidence for the late divergence of the order Aquificales. International Microbiol. 7, 41–52.
Gupta, R.S. 1995. Evolution of the chaperonin families (Hsp60, Hsp10 and Tcp1) of proteins and the origin of eukaryotic cells. Mol. Microbiol. 15, 1–11.
Gupta, R.S. 1998. Protein phylogenies and signature sequences: a reappraisal of evolutionary relationships among archaebacteria, eubacteria, and eukaryotes. Microbiol. Mol. Biol. Rev. 62, 1435–1491.
Gupta, R.S. 2000. The phylogeny of Proteobacteria: Relationships to other eubacterial phyla and eukaryotes. FEMS Microbiol. Rev. 24, 367–402.
Gupta, R.S. 2001. The branching order and phylogenetic placement of species from completed bacterial genomes, based on conserved indels found in various proteins. Inter. Microbiol. 4, 187–202.
Gupta, R.S. 2002. Phylogeny of Bacteria: Are we now close to understanding it? ASM News. 68, 284–291.
Gupta, R.S. 2003. Evolutionary relationships among photosynthetic bacteria. Photosynth. Res. 76, 173–183.
Gupta, R.S. 2004. The phylogeny and signature sequences characteristics of Fibrobacteres, Chlorobi and Bacteroidetes. Crit. Rev. Microbiol. 30, 123–143.
Gupta, R.S., Aitken, K., Falah, M., and Singh, B. 1994. Cloning of Giardia lamblia heat shock protein HSP70 homologs: Implications regarding origin of eukaryotic cells and of endoplasmic reticulum. Proc. Natl. Acad. Sci. USA 91, 2895–2899.
Gupta, R.S., Bustard, K., Falah, M., and Singh, D. 1997. Sequencing of heat shock protein 70 (DnaK) homologs from Deinococcus proteolyticus and Thermomicrobium roseum and their integration in a protein-based phylogeny of prokaryotes. J. Bacteriol. 179, 345–357.
Gupta, R.S., and Golding, G.B. 1996. The origin of the eukaryotic cell. Trends Biochem. Sci. 21, 166–171.
Gupta, R.S., and Griffiths, E. 2002. Critical issues in bacterial phylogenies. Theor. Popul. Biol. 61, 423–434.


Gupta, R.S., Pereira, M., Chandrasekera, C., and Johari, V. 2003. Molecular signatures in protein sequences that are characteristic of Cyanobacteria and plastid homologues. Int. J. Syst. Evol. Microbiol. 53, 1833–1842.
Gupta, R.S., and Singh, B. 1994. Phylogenetic analysis of 70 kD heat shock protein sequences suggests a chimeric origin for the eukaryotic cell nucleus. Curr. Biol. 4, 1104–1114.
Hiser, L., Di Valentin, M., Hamer, A.G., and Hosler, J.P. 2000. Cox11p is required for stable formation of the Cu(B) and magnesium centers of cytochrome c oxidase. J. Biol. Chem. 275, 619–623.
Hui, F.M., and Morrison, D.A. 1993. Identification of a purC gene from Streptococcus pneumoniae. J. Bacteriol. 175, 6364–6367.
Ip, S.C., Bregu, M., Barre, F.X., and Sherratt, D.J. 2003. Decatenation of DNA circles by FtsK-dependent Xer site-specific recombination. EMBO J. 22, 6399–6407.
Jeanmougin, F., Thompson, J.D., Gouy, M., Higgins, D.G., and Gibson, T.J. 1998. Multiple sequence alignment with Clustal X. Trends Biochem. Sci. 23, 403–405.
Kaneko, T., Nakamura, Y., Sato, S., Asamizu, E., Kato, T., Sasamoto, S., Watanabe, A., Idesawa, K., Ishikawa, A., Kawashima, K., Kimura, T., Kimura, T., Kishida, Y., Kiyokawa, C., Kohara, M., Matsumoto, M., Matsuno, A., Mochizuki, Y., Nakayama, S., Nakazaki, N., Shimpo, S., Sugimoto, M., Takeuchi, C., Yamada, M., and Tabata, S. 2000. Complete genome structure of the nitrogen-fixing symbiotic bacterium Mesorhizobium loti. DNA Res. 7, 331–338.
Kaneko, T., Nakamura, Y., Sato, S., Minamisawa, K., Uchiumi, T., Sasamoto, S., Watanabe, A., Idesawa, K., Iriguchi, M., Kawashima, K., Kohara, M., Matsumoto, M., Shimpo, S., Tsuruoka, H., Wada, T., Yamada, M., and Tabata, S. 2002. Complete genomic sequence of nitrogen-fixing symbiotic bacterium Bradyrhizobium japonicum USDA110. DNA Res. 9, 189–197.
Karlin, S., and Brocchieri, L. 2000. Heat shock protein 60 sequence comparisons: Duplications, lateral transfer, and mitochondrial evolution. Proc. Natl. Acad. Sci. USA 97, 11348–11353.
Karlin, S., Brocchieri, L., Mrazek, J., Campbell, A.M., and Spormann, A.M. 1999. A chimeric prokaryotic ancestry of mitochondria and primitive eukaryotes. Proc. Natl. Acad. Sci. USA 96, 9190–9195.
Kersters, K., Devos, P., Gillis, M., Vandamme, P., and Stackebrandt, E. 2003. Introduction to the Proteobacteria. In The Prokaryotes: An Evolving Electronic Resource for the Microbiological Community, ed. M. Dworkin et al. Springer-Verlag, New York.
Kolber, Z.S., Plumley, F.G., Lang, A.S., Beatty, J.T., Blankenship, R.E., VanDover, C.L., Vetriani, C., Koblizek, M., Rathgeber, C., and Falkowski, P.G. 2001. Contribution of aerobic photoheterotrophic bacteria to the carbon cycle in the ocean. Science 292, 2492–2495.
Ku, M.S., Kano-Murakami, Y., and Matsuoka, M. 1996. Evolution and expression of C4 photosynthesis genes. Plant Physiol. 111, 949–957.
Kurland, C.G., and Andersson, S.G. 2000. Origin and evolution of the mitochondrial proteome. Microbiol. Mol. Biol. Rev. 64, 786–820.
Lake, J.A., and Rivera, M.C. 1994. Was the nucleus the first endosymbiont? Proc. Natl. Acad. Sci. USA 91, 2880–2881.
Lang, B.F., Gray, M.W., and Burger, G. 1999. Mitochondrial genome evolution and the origin of eukaryotes. Annual Review of Genetics 33, 351–397.
Larimer, F.W., Chain, P., Hauser, L., Lamerdin, J., Malfatti, S., Do, L., Land, M.L., Pelletier, D.A., Beatty, J.T., Lang, A.S., Tabita, F.R., Gibson, J.L., Hanson, T.E., Bobst, C., Torres, J.L., Peres, C., Harrison, F.H., Gibson, J., and Harwood, C.S. 2004. Complete genome sequence of the metabolically versatile photosynthetic bacterium Rhodopseudomonas palustris. Nat. Biotechnol. 22, 55–56.
Leyva, J.A., Bianchet, M.A., and Amzel, L.M. 2003. Understanding ATP synthesis: Structure and mechanism of the F1-ATPase (Review). Mol. Membr. Biol. 20, 27–33.
Lopez-Garcia, P., and Moreira, D. 1999. Metabolic symbiosis at the origin of eukaryotes. Trends Biochem. Sci. 24, 88–93.
Ludwig, W., and Klenk, H.-P. 2001. Overview: A phylogenetic backbone and taxonomic framework for prokaryotic systematics. In Bergey’s Manual of Systematic Bacteriology, eds. D. R. Boone and R. W. Castenholz, 49–65. Springer-Verlag, Berlin.
Ludwig, W., and Schleifer, K.H. 1999. Phylogeny of Bacteria beyond the 16S rRNA Standard. ASM News 65, 752–757.
Maidak, B.L., Cole, J.R., Lilburn, T.G., Parker, C.T., Jr., Saxman, P.R., Farris, R.J., Garrity, G.M., Olsen, G.J., Schmidt, T.M., and Tiedje, J.M. 2001. The RDP-II (Ribosomal Database Project). Nucleic Acids Res. 29, 173–174.
Margulis, L. 1970. Origin of Eukaryotic cells. Yale University Press, New Haven, CT.
Margulis, L. 1993. Symbiosis in Cell Evolution. W.H. Freeman and Company, New York.
Margulis, L. 1996. Archaeal-eubacterial mergers in the origin of Eukarya: Phylogenetic classification of life. Proc. Natl. Acad. Sci. USA 93, 1071–1076.
Martin, W., and Muller, M. 1998. The hydrogenosome hypothesis for the first eukaryote. Nature 392, 37–41.
Martins-Pinheiro, M., Galhardo, R.S., Lage, C., Lima-Bessa, K.M., Aires, K.A., and Menck, C.F. 2004. Different patterns of evolution for duplicated DNA repair genes in bacteria of the Xanthomonadales group. BMC. Evol. Biol. 4, 29.
McLeod, M.P., Qin, X., Karpathy, S.E., Gioia, J., Highlander, S.K., Fox, G.E., McNeill, T.Z., Jiang, H., Muzny, D., Jacob, L.S., Hawes, A.C., Sodergren, E., Gill, R., Hume, J., Morgan, M., Fan, G., Amin, A.G., Gibbs, R.A., Hong, C., Yu, X.J., Walker, D.H., and Weinstock, G.M. 2004. Complete genome sequence of Rickettsia typhi and comparison with sequences of other rickettsiae. J. Bacteriol. 186, 5842–5855.
Messer, W. 2002. The bacterial replication initiator DnaA. DnaA and oriC, the bacterial mode to initiate DNA replication. FEMS Microbiol. Rev. 26, 355–374.
Morden, C.W., Delwiche, C.F., Kuhsel, M., and Palmer, J.D. 1992. Gene phylogenies and the endosymbiotic origin of plastids. Biosystems 28, 75–90.
Moreno, E., and Moriyon, I. 2001. The Genus Brucella. In The Prokaryotes: An Evolving Electronic Resource for the Microbiological Community, ed. M. Dworkin et al. Springer-Verlag, New York.
Moulin, L., Bena, G., Boivin-Masson, C., and Stepkowski, T. 2004. Phylogenetic analyses of symbiotic nodulation genes support vertical and lateral gene cotransfer within the Bradyrhizobium genus. Mol. Phylogenet. Evol. 30, 720–732.
Murray, R.G.E., Brenner, D.J., Colwell, R.R., De Vos, P., Goodfellow, M., Grimont, P.A.D., Pfennig, N., Stackebrandt, E., and Zavarzin, G.A. 1990. Report of the Ad Hoc Committee on approaches to taxonomy within the Proteobacteria. Int. J. Syst. Bacteriol. 40, 213–215.
Nierman, W.C., Feldblyum, T.V., Laub, M.T., Paulsen, I.T., Nelson, K.E., Eisen, J., Heidelberg, J.F., Alley, M.R., Ohta, N., Maddock, J.R., Potocka, I., Nelson, W.C., Newton, A., Stephens, C., Phadke, N.D., Ely, B., DeBoy, R.T., Dodson, R.J., Durkin, A.S., Gwinn, M.L., Haft, D.H., Kolonay, J.F., Smit, J., Craven, M.B., Khouri, H., Shetty, J., Berry, K., Utterback, T., Tran, K., Wolf, A., Vamathevan, J., Ermolaeva, M., White, O., Salzberg, S.L., Venter, J.C., Shapiro, L., and Fraser, C.M. 2001. Complete genome sequence of Caulobacter crescentus. Proc. Natl. Acad. Sci. USA 98, 4136–4141.
Ogata, H., Audic, S., Renesto-Audiffren, P., Fournier, P.E., Barbe, V., Samson, D., Roux, V., Cossart, P., Weissenbach, J., Claverie, J.M., and Raoult, D. 2001. Mechanisms of evolution in Rickettsia conorii and R. prowazekii. Science 293, 2093–2098.
Olsen, G.J., Woese, C.R., and Overbeek, R. 1991. The winds of (evolutionary) change: Breathing new life into microbiology. J. Bacteriol. 176, 1–6.
Opperman, T., and Richardson, J.P. 1994. Phylogenetic analysis of sequences from diverse bacteria with homology to the Escherichia coli rho gene. J. Bacteriol. 176, 5033–5043.
Parker, A.R., and Eshleman, J.R. 2003. Human MutY: Gene structure, protein functions and interactions, and role in carcinogenesis. Cell Mol. Life Sci. 60, 2064–2083.
Paulsen, I.T., Seshadri, R., Nelson, K.E., Eisen, J.A., Heidelberg, J.F., Read, T.D., Dodson, R.J., Umayam, L., Brinkac, L.M., Beanan, M.J., Daugherty, S.C., DeBoy, R.T., Durkin, A.S., Kolonay, J.F., Madupu, R., Nelson, W.C., Ayodeji, B., Kraul, M., Shetty, J., Malek, J., Van Aken, S.E., Reidmuller, S., Tettelin, H., Gill, S.R., White, O., Salzberg, S.L., Hoover, D.L., Lindler, L.E., Halling, S.M., Boyle, S.M., and Fraser, C.M. 2002. The Brucella suis genome reveals fundamental similarities between animal and plant pathogens and symbionts. Proc. Natl. Acad. Sci. USA 99, 13148–13153.
Qi, H.Y., Sankaran, K., Gan, K., and Wu, H.C. 1995. Structure-function relationship of bacterial prolipoprotein diacylglyceryl transferase: Functionally significant conserved regions. J. Bacteriol. 177, 6820–6824.
Reuter, K., and Ficner, R. 1995. Sequence analysis and overexpression of the Zymomonas mobilis tgt gene encoding tRNA-guanine transglycosylase: Purification and biochemical characterization of the enzyme. J. Bacteriol. 177, 5284–5288.
Ribeiro, S., and Golding, G.B. 1998. The mosaic nature of the eukaryotic nucleus. Mol. Biol. Evol. 15, 779–788.
Rivera, M.C., and Lake, J.A. 1992. Evidence that eukaryotes and eocyte prokaryotes are immediate relatives. Science 257, 74–76.
Rivera, M.C., and Lake, J.A. 2004. The ring of life provides evidence for a genome fusion origin of eukaryotes. Nature 431, 152–155.
Rokas, A., and Holland, P.W. 2000. Rare genomic changes as a tool for phylogenetics. Trends Ecol. Evol. 15, 454–459.
Romanowski, M.J., Bonanno, J.B., and Burley, S.K. 2002. Crystal structure of the Escherichia coli glucose-inhibited division protein B (GidB) reveals a methyltransferase fold. Proteins 47, 563–567.
Sadowsky, M.J., and Graham, P.H. 2000. Root and Stem Nodule Bacteria of Legumes. In The Prokaryotes: An Evolving Electronic Resource for the Microbiological Community, ed. M. Dworkin et al. Springer-Verlag, New York.
Sawada, H., Kuykendall, L.D., and Young, J.M. 2003. Changing concepts in the systematics of bacterial nitrogen-fixing legume symbionts. J. Gen. Appl. Microbiol. 49, 155–179.
Sicheritz-Ponten, T., Kurland, C.G., and Andersson, S.G. 1998. A phylogenetic analysis of the cytochrome b and cytochrome c oxidase I genes supports an origin of mitochondria from within the Rickettsiaceae. Biochim. Biophys. Acta. 1365, 545–551.
Sixma, T.K. 2001. DNA mismatch repair: MutS structures bound to mismatches. Curr. Opin. Struct. Biol. 11, 47–52.
Soni, R.K., Mehra, P., Choudhury, N.R., Mukhopadhyay, G., and Dhar, S.K. 2003. Functional characterization of Helicobacter pylori DnaB helicase. Nucleic Acids Res. 31, 6828–6840.
Stackebrandt, E. 2000. Defining Taxonomic Ranks. In The Prokaryotes: An Evolving Electronic Resource for the Microbiological Community, ed. M. Dworkin et al. Springer-Verlag, New York.
Stackebrandt, E., Murray, R.G.E., and Trüper, H.G. 1988. Proteobacteria classis nov., a name for the phylogenetic taxon that includes the “Purple bacteria and their Relatives.” Int. J. Syst. Bacteriol. 38, 321–325.
Stepkowski, T., Czaplinska, M., Miedzinska, K., and Moulin, L. 2003. The variable part of the dnaK gene as an alternative marker for phylogenetic studies of rhizobia and related alpha Proteobacteria. Syst. Appl. Microbiol. 26, 483–494.
Stryer, L. 1995. Biochemistry. W.H. Freeman and Co., New York.
Taillardat-Bisch, A.V., Raoult, D., and Drancourt, M. 2003. RNA polymerase beta-subunit-based phylogeny of Ehrlichia spp., Anaplasma spp., Neorickettsia spp. and Wolbachia pipientis. Int. J. Syst. Evol. Microbiol. 53, 455–458.
van Berkum, P., Terefework, Z., Paulin, L., Suomalainen, S., Lindstrom, K., and Eardly, B.D. 2003. Discordant phylogenies within the rrn loci of Rhizobia. J. Bacteriol. 185, 2988–2998.
Van Sluys, M.A., Monteiro-Vitorello, C.B., Camargo, L.E., Menck, C.F., da Silva, A.C., Ferro, J.A., Oliveira, M.C., Setubal, J.C., Kitajima, J.P., and Simpson, A.J. 2002. Comparative genomic analysis of plant-associated bacteria. Annu. Rev. Phytopathol. 40, 169–189.
Viale, A.M., and Arakaki, A.K. 1994. The chaperone connection to the origins of the eukaryotic organelles. FEBS Letters 341, 146–151.
Viale, A.M., Arakaki, A.K., Soncini, F.C., and Ferreyra, R.G. 1994. Evolutionary relationships among eubacterial groups as inferred from GroEL (chaperonin) sequence comparisons. Int. J. Syst. Bacteriol. 44, 527–533.


Wang, E.T., van Berkum, P., Beyene, D., Sui, X.H., Dorado, O., Chen, W.X., and Martinez-Romero, E. 1998. Rhizobium huautlense sp. nov., a symbiont of Sesbania herbacea that has a close phylogenetic relationship with Rhizobium galegae. Int. J. Syst. Bacteriol. 48 Pt. 3, 687–699.
Woese, C.R., Stackebrandt, E., Macke, R.J., and Fox, G.E. 1985. A phylogenetic definition of the major eubacterial taxa. System. Appl. Microbiol. 6, 143–151.
Woese, C.R., Stackebrandt, E., Weisburg, W.G., Paster, B.J., Madigan, M.T., Fowler, C.M.R., Hahn, C.M., Blanz, P., Gupta, R., Nealson, K.H., and Fox, G.E. 1984. The phylogeny of purple bacteria: The alpha subdivision. System. Appl. Microbiol. 5, 315–326.
Wolf, Y.I., Aravind, L., and Koonin, E.V. 1999. Rickettsiae and Chlamydiae: evidence of horizontal gene transfer and gene exchange. Trends Genet 15, 173–175.
Wood, D.W., Setubal, J.C., Kaul, R., Monks, D.E., Kitajima, J.P., Okura, V.K., Zhou, Y., Chen, L., Wood, G.E., Almeida, N.F., Jr., Woo, L., Chen, Y., Paulsen, I.T., Eisen, J.A., Karp, P.D., Bovee, D., Sr., Chapman, P., Clendenning, J., Deatherage, G., Gillet, W., Grant, C., Kutyavin, T., Levy, R., Li, M.J., McClelland, E., Palmieri, A., Raymond, C., Rouse, G., Saenphimmachak, C., Wu, Z., Romero, P., Gordon, D., Zhang, S., Yoo, H., Tao, Y., Biddle, P., Jung, M., Krespan, W., Perry, M., Gordon-Kamm, B., Liao, L., Kim, S., Hendrick, C., Zhao, Z.Y., Dolan, M., Chumley, F., Tingey, S.V., Tomb, J.F., Gordon, M.P., Olson, M.V., and Nester, E.W. 2001. The genome of the natural genetic engineer Agrobacterium tumefaciens C58. Science 294, 2317–2323.
Young, J.M., Kuykendall, L.D., Martinez-Romero, E., Kerr, A., and Sawada, H. 2001. A revision of Rhizobium Frank 1889, with an emended description of the genus, and the inclusion of all species of Agrobacterium Conn 1942 and Allorhizobium undicola de Lajudie et al. 1998 as new combinations: Rhizobium radiobacter, R. rhizogenes, R. rubi, R. undicola and R. vitis. Int. J. Syst. Evol. Microbiol. 51, 89–103.
Yu, X.J., and Walker, D.H. 2003. The Order Rickettsiales. In The Prokaryotes: An Evolving Electronic Resource for the Microbiological Community, ed. M. Dworkin et al. Springer-Verlag, New York.
Yu, X.J., Zhang, X.F., McBride, J.W., Zhang, Y., and Walker, D.H. 2001. Phylogenetic relationships of Anaplasma marginale and ‘Ehrlichia platys’ to other Ehrlichia species determined by GroEL amino acid sequences. Int. J. Syst. Evol. Microbiol. 51, 1143–1146.
Yurkov, V.V., and Beatty, J.T. 1998. Aerobic anoxygenic phototrophic bacteria. Microbiol. Mol. Biol. Rev. 62, 695–724.

 




