Phylogenetic Systematics

A laboratory exercise for undergraduate evolutionary biology courses


One of the distinctive aspects of biology as a science is that its objects of study, living organisms, have constantly changed through time. A significant portion of the field of biology is thus devoted to the investigation of these changes to determine how organisms have evolved, how the enormous diversity of life on earth has arisen, and how different organisms are related to one another. These fundamental questions are historical in nature.

A powerful approach for answering historical questions in biology is phylogenetic systematics. The purpose of phylogenetic systematics is to reconstruct the historical relationships among organisms. That is, it attempts to determine (a) the evolutionary pathway by which modern species arose, (b) how and to what degree they are related, and (c) what their ancestors may have looked like.

The goal of today's lab is to learn how to apply some of the methods of phylogenetic systematics, then to use these methods to examine degrees of relatedness among a group of species. By doing so, you will be able to make inferences about the course of speciation. Usually speciation proceeds far too slowly to be observed directly, but many of the changes that occurred as species diverged are preserved as characteristics in the organisms presently alive. Using these characteristics and applying the methods of phylogenetic systematics to determine the one correct set of evolutionary pathways is a stimulating intellectual challenge.

The Premise

Evolutionary theory provides an extremely powerful set of guidelines for thinking about diversity, ancestors, morphological change, and the relationships among species. For the sake of simplicity, we will assume that species originate from other species by dichotomous splitting. That is, during speciation species A gives rise to two new species (no more, no less), B and C, and at that time the ancestral species, A, is considered to cease to exist (see diagram below). Each of the resulting species may now gradually change form (evolve) over time, and may itself split into two new species when conditions for speciation are encountered. As you will see, these seemingly trivial propositions have some enormous consequences.


How many phylogenetic trees are there?

Suppose you have three species of animals or plants (A, B, and C) and you wish to know how they are related to each other. Ultimately, of course, they will share a common ancestor (which we will call ancestor 1), but the three species could have arisen in three different ways, depending on which two of them are each other's closest relatives.


Given our assumption that evolution proceeds by dichotomous splitting, then a) only two species can be most closely related to each other, and b) that pair must have had one additional ancestor (ancestor 2).

Now suppose we discover a new species, D, which we wish to fit into our existing tree for A, B, and C. It can attach to the tree at only one point: as the closest relative of one of the existing species or of one of their ancestors. If you drew all the possible phylogenetic trees that could result from adding species D, you would have 15 different trees. If you wished to determine the phylogenetic relationships of 4 species, it would be a fairly simple matter to draw all 15 possible trees and decide, by some subjective argument, which one you think represents the most likely relationship. The difficulty with such an approach, however, is that adding even a few more species to the tree makes the problem quite unmanageable. For instance, with 10 species there are 34,459,425 possible dichotomous trees! Clearly, trial and error methods are ruled out if we are to study and understand the relationships between even a modest number of species. We will have to adopt a method that approaches the problem in an organized and rational manner.
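The growth in the number of trees can be checked with a short calculation. The sketch below assumes rooted trees in which every ancestor splits into exactly two descendants, as in the premise above; the function name is ours. Each time a species is added to a tree that already has k - 1 species, there are 2k - 3 places to attach it (any existing branch, or below the current root), so the counts multiply. Counts that allow an ancestor to split into more than two species at once are larger still.

```python
def count_rooted_bifurcating_trees(n_species: int) -> int:
    """Number of distinct rooted, strictly dichotomous trees for n labelled species."""
    count = 1  # one or two species can be arranged in only one way
    for k in range(3, n_species + 1):
        # A rooted tree with k - 1 species has 2k - 4 branches plus the
        # position below the root, giving 2k - 3 places for species k.
        count *= 2 * k - 3
    return count


for n in (3, 4, 5, 10):
    print(n, count_rooted_bifurcating_trees(n))
# 3 3
# 4 15
# 5 105
# 10 34459425
```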

How to construct a phylogenetic tree

Deciding on one tree is easy in principle, but quite a mind bender in practice. All one really needs to do is find, in the collection of species under consideration, the two species that are most closely related to each other. These can then be connected by a simple Y-shaped diagram. One can then proceed to find the next species that is most closely related to either or both and attach it via an appropriate branch, and so forth until all species are placed. In practice this procedure has two difficulties.

The first difficulty is in deciding how closely two species are related, and specifically, which two species are most closely related. If by relatedness we mean commonality of descent, then the problem can also be phrased: how do we know which two species share the most recent common ancestor?

There is, as you might guess, a difference of opinion among biologists about how to determine the "degree of relatedness" of existing species. One group believes that this should be done on the basis of greatest overall similarity, taking into account as many characteristics of the organisms as possible. A second group believes that some characters (e.g. embryonic ones) are much better indicators of relatedness than others and insists that classification should emphasize those particular traits. Yet another group, the phylogenetic systematists, believes that since the sequence of evolution proceeds from "ancestral" to "derived", and since different characteristics of an organism are likely to evolve independently, it is essential to discover which characters of a species are ancestral and which ones are derived, and then use this information to interpret relationships. Phylogenetic systematists emphasize that ancestral characters are of no use in determining relationships among species. Relationships can be discovered ONLY BY STUDYING DERIVED CHARACTERS. The reason for this assertion will be discussed below.

The second difficulty arises because every time we join two species, we in fact "create" a new species in the form of a hypothetical common ancestor whose characteristics must also be taken into account. In other words, if we have joined two species A and B, and wish to add a third species C, we now have to determine whether it is most closely related to A, or to B, or to the common ancestor of A and B.

In cases 1 and 2 we have decided that C is most closely related to B or A, respectively, while in case 3 it is most closely related to the common ancestor of A and B. The point is that once we join two species in a tree we must make a decision about what the common ancestor looked like. Likewise, adding C to the tree for A and B involves the creation of an additional common ancestor - either to C and A, to C and B, or to C and (A and B) - whose characteristics will have to be taken into account when a fourth species is added to the tree.
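The three placements of C, and the way each added species multiplies the possibilities, can be enumerated mechanically. The following is a minimal sketch, not part of the lab procedure: a tree is represented as either a species name or a pair of subtrees, and the helper names `canonical` and `add_species` are ours.

```python
def canonical(tree):
    """Return the tree with the two children of every ancestor in a fixed order,
    so that (A, B) and (B, A) count as the same tree."""
    if isinstance(tree, str):
        return tree
    left, right = (canonical(child) for child in tree)
    return tuple(sorted((left, right), key=repr))


def add_species(tree, new):
    """Yield every tree made by attaching `new` to one branch of `tree`."""
    # Attach below this node: `new` becomes the closest relative of the
    # whole subtree, i.e. of its common ancestor.
    yield canonical((tree, new))
    if isinstance(tree, tuple):
        left, right = tree
        # Otherwise attach somewhere inside the left or the right subtree.
        for sub in add_species(left, new):
            yield canonical((sub, right))
        for sub in add_species(right, new):
            yield canonical((left, sub))


three_species = list(add_species(("A", "B"), "C"))
for t in three_species:
    print(t)  # C with (A and B), C with A, C with B

four_species = {t for base in three_species for t in add_species(base, "D")}
print(len(three_species), len(four_species))  # 3 15
```

Every attachment creates one new pair, which corresponds to the new hypothetical common ancestor described above.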

Ancestral versus derived

Before continuing we must correct some common misconceptions about the meanings of "ancestral" and "derived". They do not mean that an ancestral species is primitive or has primitive characters. When the form (or state) of a character changes in the course of evolution (e.g. a reduction in the number of toes from 4 to 3), the initial condition is deemed ancestral and the new condition is considered derived. If evolution proceeds further (e.g. a reduction in the number of toes to 2), then the original derived condition becomes ancestral to the new character state. Hence the character state "3 toes" is "ancestral" relative to the character state "2 toes", but it is derived relative to the state "4 toes". It is crucial that you recognize that a given species invariably has a mixture of both ancestral (shared with its ancestor) and derived (not shared with its ancestor) character states.

The importance of derived characters

We now have the information necessary to develop an important rule to use when we examine the phylogeny of fasteners. Suppose we have a group consisting of two species, A and B, which share a common ancestor. Such a group is said to be monophyletic. This group will therefore be characterized (that is, it can be told apart from all other groups of organisms) by the features that its members share with their common ancestor. Suppose we also have a second monophyletic group consisting of species C and D and their common ancestor, and suppose further that these two groups (A-B and C-D) together form a larger monophyletic group, that is, they are linked by a third common ancestor.

Because of their recent common ancestors, A-B and C-D are called sister taxa. A "natural group" of sister taxa can only be distinguished from another natural group of taxa by the characters that all the members of the group uniquely share. And since any character that is unique to a natural group of taxa must be a derived character (why?), it follows that group A-B and group C-D (and their respective ancestors) must have different sets of shared derived characters. This, in turn, implies immediately that one particular character must always occur in a more ancestral condition in one group than in the other group. This conclusion is an exercise in logic, and if it is not intuitively obvious to you, the following procedure should make it clear.

In Figure A (below) a number of hatch marks have been drawn across each of the intervals connecting species and ancestors. These marks correspond to evolutionary "events". That is to say, at each mark one or more of the characters of the species in question underwent a change. Suppose that each species in this diagram (A, B, C, D, and their ancestors) possesses 7 analyzable characters a, b, c, d, e, f, g (these could be leg length, body size, hair color, etc.). If we let each hatch mark represent a change in a single character, we would get a scheme like the one shown in Figure B, in which all characters that are primed (e.g. x') represent a derived character state and all unmarked characters (e.g. x) represent ancestral states. The important thing to do next is to picture what each of the 7 species in the diagram looks like with respect to characters a - g. This is shown in Figure C.


The above figures should clarify several of the things we have mentioned so far.

1)    Each species has a mix of ancestral (x) and derived (x') characters.

2)    Each monophyletic group (note that a single species is a group of one) is defined only by its derived characters. That is, members of different groups often share ancestral characters.

3)    The particular characters that define (i.e. are unique to) any monophyletic group occur in a more ancestral form in the sister group. This is the conclusion we were after and which you can now verify in Figure C.

It should be evident to you now that in order to reconstruct the phylogeny of a set of organisms it is essential both to have an extensive list of characters and to determine the ancestral or derived state for each character.
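The three points above can also be checked on a small worked example. The sketch below assumes a tree of four species (A, B, C, D) and three ancestors, with one character changing on each branch. The particular changes are invented for illustration and are not taken from the lab's figures, and the names `BRANCHES`, `derived_states`, `describe`, and `shared_derived` are ours.

```python
CHARACTERS = "abcdefg"

# Hypothetical tree: child -> (parent, character that changed on that branch).
# anc1 is the root and carries the ancestral state of every character.
BRANCHES = {
    "anc2": ("anc1", "a"),  # ancestor of A and B
    "anc3": ("anc1", "b"),  # ancestor of C and D
    "A": ("anc2", "c"),
    "B": ("anc2", "d"),
    "C": ("anc3", "e"),
    "D": ("anc3", "f"),
}


def derived_states(taxon):
    """Characters that are in the derived (primed) state in `taxon`."""
    derived = set()
    while taxon in BRANCHES:  # walk back toward the root
        taxon, changed = BRANCHES[taxon]
        derived.add(changed)
    return derived


def describe(taxon):
    """Write out all 7 characters, priming the derived ones."""
    derived = derived_states(taxon)
    return " ".join(c + "'" if c in derived else c for c in CHARACTERS)


def shared_derived(group):
    """Derived characters common to every member of `group`."""
    return set.intersection(*(derived_states(t) for t in group))


for taxon in ("anc1", "anc2", "anc3", "A", "B", "C", "D"):
    print(f"{taxon:5s} {describe(taxon)}")

print(shared_derived(["A", "B"]))  # {'a'}
print(shared_derived(["C", "D"]))  # {'b'}
```

In this example group A-B is defined by a', which remains in its ancestral form (a) in both C and D, while every species still carries g in its ancestral state.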

The latter is an important task that you and other phylogenetic systematists perform because, unless it is done right, your interpretation of the relationships among the organisms under study will be incorrect. How can we determine which "state" of a given character is most ancestral?

There are three criteria for identifying an ancestral character.

1) The outgroup criterion. Suppose you can identify a species that you are certain should be placed outside your group of interest (it is best to use several such species). These taxa would be placed onto your phylogeny by connecting them somewhere below the ancestor of your entire group of interest. These taxa are called outgroups, and relative to them, your group of interest is clearly monophyletic. Now recall that the characters that define a monophyletic group are derived characters (above), so the characters that define taxa within your group of interest must be derived. By contrast, any characters that members of your group of interest share with the outgroup must be ancestral, since both groups must have inherited that character from the ancestor of the entire phylogeny. With DNA sequence data, only the outgroup criterion can be used for defining ancestral versus derived characters, so nowadays this is the primary method for identifying derived characters.

2) The fossil criterion. Suppose you have a well-preserved fossil that you are certain is ancestral to your group of interest. Any characters your group shares with the fossil are ancestral, and any characters unique to your group of interest are derived. We will use this criterion because the ancestors of modern fasteners are well preserved in the fossil record.

3) The ontogenetic criterion. Early development appears very similar in a wide variety of organisms, with more derived characters that define different taxa appearing later in development. If this is true, it follows that derived characters could, in some cases, be identified if they appear later in an organism’s development.
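Of the three criteria, the outgroup criterion is the easiest to express as a procedure. The sketch below is one simple reading of it, under our own assumptions: a character is scored only when all outgroups agree on its state, that state is taken as ancestral, and any other state in the group of interest is labelled derived. The species names, characters, and states are invented for illustration.

```python
def polarize(ingroup, outgroups):
    """Label each ingroup character state as 'ancestral' or 'derived'.

    Both arguments map a taxon name to a dict of {character: state}.
    Characters on which the outgroups disagree are left as None.
    """
    characters = next(iter(outgroups.values())).keys()
    result = {}
    for ch in characters:
        outgroup_states = {states[ch] for states in outgroups.values()}
        if len(outgroup_states) != 1:
            result[ch] = None  # outgroups disagree: cannot decide
            continue
        ancestral = outgroup_states.pop()
        result[ch] = {
            taxon: "ancestral" if states[ch] == ancestral else "derived"
            for taxon, states in ingroup.items()
        }
    return result


# Hypothetical data set.
outgroups = {
    "out1": {"toes": 4, "tail": "long"},
    "out2": {"toes": 4, "tail": "short"},
}
ingroup = {
    "X": {"toes": 4, "tail": "long"},
    "Y": {"toes": 3, "tail": "short"},
    "Z": {"toes": 3, "tail": "short"},
}

for character, labels in polarize(ingroup, outgroups).items():
    print(character, labels)
```

Here "3 toes" is labelled derived in Y and Z because both outgroups have 4 toes, while tail length is left undecided because the two outgroups differ, which is one reason to use several outgroups.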