Tricky!: estimating diversity

Post by Joe Botting on Sept 14, 2005 8:35:14 GMT -5

Here's one that could potentially be really useful, even leading to publication one day. Anyone with a mathematical bent will probably have something very useful to say here. Or so we hope. Apologies to casual readers for it being more technical, but it has to happen sometimes! ;-)
A while ago, Lucy and I were trying to think of ways to estimate the real diversity of a locality. Obviously, we can never be sure that we've found every species in a particular bed, but can we find any ways to suggest how many more there are to go..?

To tackle this, we need to know a bit about 'normal' diversity patterns. In a typical ecosystem, there is what we call a 'hollow curve' distribution. Effectively, this means that there are a few species that are very common, quite a lot that are rarer, and loads that are very scarce. The chance of us ever finding a specimen of one of the rarest ones is very low, even assuming that it was fossilisable.
(Actually, that's a point - let's ignore soft-bodied things for the moment at least; they're just unmanageable!)
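
To make the 'hollow curve' a bit more concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration only: the geometric fall-off in abundance, the 50 species, the decay value of 0.85 and the 500 specimens are made-up numbers, not data from any real locality. It just shows how a few common species swamp a collection while many of the scarce ones turn up once or not at all.

```python
import random
from collections import Counter

def hollow_curve_abundances(n_species=50, decay=0.85):
    """Relative abundances that fall off geometrically (an assumed model)."""
    raw = [decay ** i for i in range(n_species)]
    total = sum(raw)
    return [r / total for r in raw]

def sample_specimens(abundances, n_specimens, seed=0):
    """Draw specimens at random, each species weighted by its abundance."""
    rng = random.Random(seed)
    return rng.choices(range(len(abundances)), weights=abundances, k=n_specimens)

abund = hollow_curve_abundances()
counts = Counter(sample_specimens(abund, 500))

found = len(counts)                                      # species seen at least once
singletons = sum(1 for c in counts.values() if c == 1)   # species seen exactly once

print(f"species present: {len(abund)}, found: {found}, known from one specimen: {singletons}")
print("five commonest species (id, specimens):", counts.most_common(5))
```

Changing `decay` towards 1 flattens the distribution; pushing it lower makes the rare tail even harder to sample.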

There is one established way of estimating diversity, called a 'rarefaction curve.' Here we make a graph with the number of specimens on the x-axis and the number of species on the y-axis. We then plot each specimen as a point, in the order found, the line stepping upwards from left to right each time a new species turns up. The curve goes up fast to begin with, but rapidly flattens out as new species become rarer. Eventually, the curve would become a horizontal line when there are no new species appearing, but we can extrapolate from an incomplete curve to 'guesstimate' the final diversity.
The problem with this method is that it's time-consuming, and requires completely unbiased sampling - you sit at the locality, hammer through rocks, and take note of every specimen that turns up (several hundred are normally needed, at least, to give a reasonable idea). There are also problems with things like trilobites, which moult (how many specimens count as one individual? If you find three ribs, it's a single specimen, but what about three 'tails'?). Rarefaction curves are useful, but they take a long time to build and are very approximate indeed.
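
For anyone who wants to play with the curve described above, here is a rough Python sketch. The specimen list is simulated (50 species with an assumed geometric fall-off in abundance, 400 specimens), so the numbers mean nothing in themselves. `accumulation_curve` follows the specimens in the order found, and `rarefied_curve` averages that over many random re-orderings to smooth out the luck of the draw; both function names are just mine for illustration.

```python
import random

def accumulation_curve(specimens):
    """Number of distinct species seen after each successive specimen."""
    seen = set()
    curve = []
    for species in specimens:
        seen.add(species)
        curve.append(len(seen))
    return curve

def rarefied_curve(specimens, n_shuffles=100, seed=0):
    """Average the accumulation curve over random orderings of the same specimens."""
    rng = random.Random(seed)
    pool = list(specimens)
    totals = [0] * len(pool)
    for _ in range(n_shuffles):
        rng.shuffle(pool)
        for i, n_species in enumerate(accumulation_curve(pool)):
            totals[i] += n_species
    return [t / n_shuffles for t in totals]

# Simulated, unbiased collection: species 0 is commonest, species 49 rarest (assumed).
rng = random.Random(1)
weights = [0.85 ** i for i in range(50)]
specimens = rng.choices(range(50), weights=weights, k=400)

curve = accumulation_curve(specimens)
smooth = rarefied_curve(specimens)
for n in (10, 50, 100, 200, 400):
    print(f"{n:3d} specimens: {curve[n - 1]:2d} species (as found), {smooth[n - 1]:.1f} (averaged)")
```

The flattening is easy to see in the printed table, and so is the weakness: the tail of the curve is built from very few finds, so any extrapolation from it is shaky.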

So we wanted something better, which can make use of large, biased collections made over years. We came up with a few ideas, and have been trying to see if they're viable. To start with, I'll just give you the basic scenario we came up with, and we'll see if there are any suggestions unsullied by our probably false extrapolations...
It makes use of the large number of rare species, by assuming that, among the rarest ones, each is effectively equally likely to be found. It follows, we thought, that we can get something by collecting from the locality after previous extensive collections, and noting (a) the number of new species recorded, and (b) the number of species going from one to two known specimens. If all species have been recorded, then clearly a=0, but b probably will be >0. In the early stages of collecting, there are far more rare species still 'out there' than there are rare species already collected, so a>b. Therefore, it should be possible statistically to produce an idealised curve of a/b against completeness of the collection, and then to use the former to calculate the latter, and hence the total diversity.
Your challenge, should you choose to accept it, is to work out exactly how to do this. We've been playing around with ideas, and got some vague graphs, but nothing usefully analytical. Getting a really accurate technique out of this (statistically speaking; I guess several samples will end up being needed, to give a few successive points on the graph) would actually be a surprisingly big advance over current methods, because it would cut out a lot of the problems of rarefaction curves - although it would inevitably have some more of its own.
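
As a starting point for experiments, here is one possible way to simulate the a/b idea in Python. It is only a sketch under strong assumptions, not a worked-out method: it treats the rare species as a pool of `n_rare` equally likely species (200 is an arbitrary number), ignores the common species and any collecting bias, and counts b as species known from exactly one specimen beforehand that turn up again in the new sample. Under those assumptions a/b should sit close to (species not yet found) / (species known from one specimen), and a Poisson approximation then suggests completeness of roughly 1 - exp(-b/a). The function names and that formula are mine for illustration, so please check them before relying on them.

```python
import math
import random
from collections import Counter

def simulate_a_b(n_rare=200, n_before=300, n_new=60, seed=0):
    """One simulated locality: returns (a, b, true completeness of the earlier collection)."""
    rng = random.Random(seed)
    before = Counter(rng.randrange(n_rare) for _ in range(n_before))  # earlier collecting
    new = set(rng.randrange(n_rare) for _ in range(n_new))            # species in the follow-up sample
    a = sum(1 for sp in new if before[sp] == 0)   # species recorded for the first time
    b = sum(1 for sp in new if before[sp] == 1)   # species going from one specimen to two or more
    return a, b, len(before) / n_rare

def completeness_from_ratio(a, b):
    """Assumed Poisson approximation: completeness ~ 1 - exp(-b/a)."""
    if a == 0:
        return 1.0   # nothing new turned up, so treat the list as complete
    return 1.0 - math.exp(-b / a)

def pooled_estimate(n_before, n_runs=200):
    """Pool many simulated follow-up samples to get a stable a/b for a given collection size."""
    runs = [simulate_a_b(n_before=n_before, seed=s) for s in range(n_runs)]
    a_tot = sum(r[0] for r in runs)
    b_tot = sum(r[1] for r in runs)
    true_mean = sum(r[2] for r in runs) / n_runs
    ratio = a_tot / b_tot if b_tot else float("inf")
    return ratio, completeness_from_ratio(a_tot, b_tot), true_mean

for n_before in (100, 200, 400, 800):
    ratio, est, true = pooled_estimate(n_before)
    print(f"{n_before:4d} specimens: a/b = {ratio:5.2f}, "
          f"estimated completeness = {est:.2f}, simulated = {true:.2f}")
```

If the approximation holds, dividing the number of rare species already recorded by the estimated completeness gives a guess at the total number in the rare pool. The obvious next test is to replace the equal-probability pool with a proper hollow-curve distribution and see how badly the estimate degrades.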

Over to you!