One method of discriminating boundaries binned the start and end locations of the strings into bins one unit wide. Peaks in the binned data, which represent regions where the start or end points of species' strings coincide strongly, were used as one of the regionalisation schemes.
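The binning step can be sketched as follows. This is a minimal illustration only: the function name `count_boundaries`, the representation of each string as a `(start, end)` pair of positions, and the example values are assumptions, not taken from the original analysis.

```python
import math
from collections import Counter


def count_boundaries(ranges, bin_width=1.0):
    """Count how many range starts and ends fall in each bin.

    `ranges` is a list of (start, end) positions along the string.
    Bins are indexed by floor(position / bin_width); this indexing
    convention is an assumption made for the sketch.
    """
    counts = Counter()
    for start, end in ranges:
        counts[math.floor(start / bin_width)] += 1
        counts[math.floor(end / bin_width)] += 1
    return counts


# Hypothetical ranges: several strings start or end inside bin 4.
example_ranges = [(0.2, 4.7), (1.5, 4.1), (4.3, 9.8), (4.9, 7.0)]
boundary_counts = count_boundaries(example_ranges)
```

In this made-up example, bin 4 collects four of the eight boundaries, which is the kind of coincidence the peak search looks for.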
A second method computed the dissimilarity in species composition between two bins using the Jaccard Index. The Index is defined here as the ratio of the number of species that differ between the two bins to the total number of distinct species in both bins. In our implementation, the two bins being compared may be separated by a number of intervening bins (i.e. a moving window filter). This essentially acts as a smoothing filter on local disparities, allowing more regional trends to emerge. For the draft regionalisations in the report submitted in May 1996 (CSIRO, 1996), a filter width of 7 bins was chosen. For the regionalisations presented in this report, species dissimilarity was computed for adjacent bins in order to preserve the spatial information to at least a unit bin scale, and to simplify the statistical threshold probability analysis.
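As a sketch of the Index as defined above, the species that differ between two bins form the symmetric difference of the two species sets, and the distinct species in both bins form their union. The function names and the `lag` parameter are hypothetical; in particular, how the filter width of the draft report maps onto a separation between bins is not specified in the text, so `lag` is only an illustrative stand-in.

```python
def jaccard_dissimilarity(species_a, species_b):
    """Species that differ between the bins / distinct species in both."""
    a, b = set(species_a), set(species_b)
    union = a | b
    if not union:
        return 0.0  # two empty bins: treated as no turnover (assumption)
    return len(a ^ b) / len(union)


def turnover_profile(bins, lag=1):
    """Dissimilarity between bin i and bin i + lag along the string.

    lag=1 compares adjacent bins, as in the regionalisations presented
    in this report; larger lags are a hypothetical way to compare bins
    that are further apart.
    """
    return [jaccard_dissimilarity(bins[i], bins[i + lag])
            for i in range(len(bins) - lag)]


# Hypothetical species lists for three consecutive bins.
example_bins = [{"a", "b", "c"}, {"b", "c", "d"}, {"d", "e"}]
adjacent = turnover_profile(example_bins)       # [0.5, 0.75]
separated = turnover_profile(example_bins, 2)   # [1.0]
```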
For the draft regionalisation (CSIRO, 1996), two peak-detection algorithms were employed. The first was a global search, which required a peak to be larger by a prescribed factor than the average number of string boundaries in a bin. The second, denoted a local peak, required the peak to be larger by a prescribed factor than the median of a sample drawn from a number of bins adjacent to the bin being examined. Both peak detectors were then run for a range of realisations of the filter parameters (filter widths of 3 to 9 bins, multiplication factors of 2.5 to 3.5). Each detection was scored as 1, and the scores were summed over the range of realisations; these sums provided the information used to designate bioregional boundaries and zootones.
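The two detectors and the summation over realisations might be sketched as below. Several details are assumptions made for illustration: the step sizes within the stated parameter ranges, the exclusion of the examined bin from the median sample, the truncation of the window at the ends of the string, and all function names.

```python
from statistics import mean, median


def global_peaks(counts, factor):
    """Flag bins whose count exceeds factor * mean count over all bins."""
    threshold = factor * mean(counts)
    return [1 if c > threshold else 0 for c in counts]


def local_peaks(counts, factor, width):
    """Flag bins whose count exceeds factor * median of nearby bins.

    The sample is the width // 2 bins on each side of the examined bin,
    excluding the bin itself (an assumption of this sketch).
    """
    half = width // 2
    flags = []
    for i, c in enumerate(counts):
        neighbours = counts[max(0, i - half):i] + counts[i + 1:i + half + 1]
        is_peak = bool(neighbours) and c > factor * median(neighbours)
        flags.append(1 if is_peak else 0)
    return flags


def summed_detections(counts, widths=(3, 5, 7, 9), factors=(2.5, 3.0, 3.5)):
    """Sum detections (=1) over all parameter realisations."""
    total = [0] * len(counts)
    for factor in factors:
        for i, flag in enumerate(global_peaks(counts, factor)):
            total[i] += flag
        for width in widths:
            for i, flag in enumerate(local_peaks(counts, factor, width)):
                total[i] += flag
    return total


# Hypothetical boundary counts per bin with one obvious peak at index 3.
example_counts = [1, 2, 1, 12, 1, 2, 1, 1, 2, 1]
detections = summed_detections(example_counts)
```

Bins with a high summed score are detected under most parameter choices, so the sum gives a rough measure of how robust a candidate boundary is to the choice of filter.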
In the regionalisations presented here, the critical value of the proportional turnover is based on species dissimilarity computed for adjacent bins. The distribution of starts and ends in the bins is assumed to be Binomial. This assumption can then be used to calculate a critical value for the null hypothesis that the starts and ends of ranges are independent of each other across species; that is, that they do not group together at bioregion boundaries.
We first calculate the expected number of starts or stops per species in a pair of bins. This is the number of ranges divided by the number of species intervals, where the number of species intervals is the sum over all bins of the number of species in each bin. (Each range segment has a start and an end, and there may be more than one segment per species; the artefactual starts at the origin of the mainland string were removed.) This ratio is the probability of a species having a start or stop in a bin. We want either a start in interval one or an end in interval two, which is the same as either a start or a stop in the same interval.
If Ni is the number of species present in the ith pair of bins, then the Binomial distribution can be used to give, say, the 99th percentile of the random distribution. That is, 99% of the time the proportion of species showing starts or ends in this pair of bins will be less than this value; it would be exceeded by chance only one time in 100. Observed proportional turnover values larger than this could be thought of as "statistically significant".
Because the Binomial distribution is discrete, it can take only certain values; therefore, particularly with small values of Ni, a probability of exactly 99% was not achievable. This is why the critical value can change more than one might expect given differences in Ni.
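The effect of discreteness can be illustrated with a short sketch. It assumes that the critical count is taken as the smallest k for which the cumulative Binomial probability reaches the chosen level; the text does not state which convention was used, and the function name and the value p = 0.1 are hypothetical.

```python
from math import comb


def binomial_critical_count(n, p, level=0.99):
    """Smallest k with P(X <= k) >= level for X ~ Binomial(n, p).

    Returns (k, achieved), where `achieved` is the cumulative
    probability actually reached, which generally overshoots `level`.
    """
    cumulative = 0.0
    for k in range(n + 1):
        cumulative += comb(n, k) * p**k * (1 - p)**(n - k)
        if cumulative >= level:
            return k, cumulative
    return n, 1.0


# Hypothetical probability of a start or stop per species.
p = 0.1
k_small, achieved_small = binomial_critical_count(10, p)
k_large, achieved_large = binomial_critical_count(20, p)

# Critical values expressed as proportions of the species present.
critical_small = k_small / 10
critical_large = k_large / 20
```

With these made-up inputs the critical proportion is 0.4 for 10 species and 0.3 for 20 species, and in neither case is the achieved probability exactly 99%, because the cumulative distribution jumps past that level between consecutive counts.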