Some examples are provided to illustrate the results for the GEBA

Some examples are provided to illustrate the results for the GEBA project itself and for a more focused project that targets a much smaller group of organisms, the Roseobacter clade [17,18] within Rhodobacteraceae (Alphaproteobacteria) [19].

Material and methods

Design goals of the phylogenetic scoring

The major goals of the novel approach were that the scoring (i) is independent of changes in the set of ongoing or finished genome projects, (ii) considers the contribution of a species to the total phylogenetic diversity, as measured using branch lengths, (iii) gives a relatively low weight to organisms in densely sampled groups and a relatively high weight to isolated species, and (iv) provides a biologically sensible score for a subtree when summed over all leaves of that subtree.

The first goal, independence of changes in the set of ongoing or finished genome projects, was primarily of practical importance: it avoids recalculating the scores each time a genome project is initiated. A stable score that depends only on the underlying phylogenetic tree is also much easier to use for calculating summary statistics; examples are given below. Further, the same scores can be used for distinct projects if the scoring depends only on a phylogenetic hypothesis and not on the set of (un-)selected targets. In addition to genome sequencing, phylogeny-based target selection might also be of interest in projects on the extraction of secondary metabolites such as antibiotics (e.g., [20-25]), pigments [26] or siderophores [27].

Genome sequencing of phylogenetically selected strains revealed more novel protein families than sequencing randomly selected targets [10]. Hence, it is promising to apply phylogeny-based target selection also to phenotypic investigations, as phylogenetically more distant organisms might be expected to display more divergent phenotypes than close relatives.

The second goal, to consider the contribution of a species to the total phylogenetic diversity in the scoring, as measured using branch lengths [10], is justified as follows. Whereas a rooted tree topology alone indicates the relative branching order, the lengths of the branches also indicate the expected or minimal number of character changes on the respective branch [28], depending on whether the tree was estimated under maximum likelihood [29] or maximum parsimony [30].

These character changes within the dataset (e.g., gene) from which the tree has been inferred can then serve as a proxy for the estimated number of changes within the characters of interest (e.g., content of protein families [10] and possibly also selected phenotypic traits, see above). This approach presupposes only that some correlation exists between the rates of change of the distinct kinds of characters considered; it does not presuppose the existence of a molecular (or even phenotypic) clock [28].
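To make design goals (ii) to (iv) more concrete, the following minimal Python sketch computes one possible branch-length-based leaf score. It is an illustration under stated assumptions and not necessarily the scoring used in the original work: each branch length is divided equally among the leaves that descend from it, and a leaf's score is the sum of its shares along the path to the root. The names `Node` and `leaf_scores` and the toy tree are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A node of a rooted tree; `length` is the branch leading to this node."""
    name: str
    length: float = 0.0
    children: list = field(default_factory=list)


def count_leaves(node, counts):
    """Store, for every node, the number of leaves in its subtree."""
    if not node.children:
        counts[id(node)] = 1
    else:
        counts[id(node)] = sum(count_leaves(c, counts) for c in node.children)
    return counts[id(node)]


def leaf_scores(root):
    """Assumed scheme: share each branch length equally among the leaves
    below it and sum the shares from the root down to each leaf."""
    counts = {}
    count_leaves(root, counts)
    scores = {}

    def descend(node, inherited):
        share = inherited + node.length / counts[id(node)]
        if not node.children:
            scores[node.name] = share
        for child in node.children:
            descend(child, share)

    descend(root, 0.0)
    return scores


# Toy tree: one isolated species (A) and a densely sampled clade (B, C, D).
tree = Node("root", 0.0, [
    Node("A", 4.0),
    Node("X", 1.0, [Node("B", 1.0), Node("C", 1.0), Node("D", 1.0)]),
])

scores = leaf_scores(tree)
for name, value in sorted(scores.items()):
    print(name, round(value, 3))
```

In this toy tree the isolated leaf A receives a score of 4.0, whereas B, C and D each receive 4/3. The scores depend only on the tree and not on which targets have already been selected. Summed over the leaves of clade X they equal the branch length of that clade including its stem branch (4.0), and summed over all leaves they equal the total branch length of the tree (8.0).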
