Modelling


This section discusses the different modelling algorithms, which fall into two categories: inductive and deductive.  Inductive modelling relies on occurrence data, generated by sampling species in the field, to inform the model.  Deductive modelling uses element-environment relationships that have been established through research to inform the model.

Before running the modelling program: 

Before you run your chosen modelling program, consider how you will test the results of the model.

  1. Reserve some of the sample data for testing the model.  The reserved sample localities can be overlaid on the areas of predicted occurrence; the number of reserved localities falling outside these areas gives an indication of the accuracy of the model.
  2. Investigate species occurrence in the field.  The predicted localities are visited to determine whether the species is present there.  This may require a number of field visits over a number of days.
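
As an illustration of the first option, the following minimal Python sketch reserves a fraction of the sample localities and then counts how many of them fall outside the predicted area.  The function names, the 25% test fraction and the `predicted_present` callback (standing in for whatever the chosen modelling package predicts) are assumptions made for this example only.

```python
import random


def split_localities(localities, test_fraction=0.25, seed=0):
    """Shuffle the sample localities and reserve a fraction for testing."""
    rng = random.Random(seed)
    shuffled = list(localities)
    rng.shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    # Return (training localities, reserved localities).
    return shuffled[n_test:], shuffled[:n_test]


def omission_rate(reserved, predicted_present):
    """Fraction of reserved localities that fall outside the predicted area.

    `predicted_present` is a hypothetical function that returns True when
    the model predicts occurrence at the given locality.
    """
    missed = sum(1 for locality in reserved if not predicted_present(locality))
    return missed / len(reserved)
```

A low omission rate suggests that the model captures the reserved localities well; it does not, on its own, show that the predicted area is not too large.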

Various modelling techniques and software packages can then be used to analyse the predicted and projected distributions of a species.  Certain software packages model distribution using only presence data, while others use both presence and absence data.  The packages may also offer different outputs, such as distribution data listed in text files or mapped in GIS shapefiles.  It is best to choose a modelling method that accepts the available input data (presence only or presence/absence) and offers the required output.  A number of different model types and resources exist.

Please note that this is in no way an exhaustive list of all modelling packages, nor does it attempt to promote or elevate the status of the packages listed below.

Try this website for more ecological modelling packages:

http://ecobas.org/www-server/mod-info/index.html

CLIMEX

Description

The CLIMEX simulation model is based on expert knowledge of how key environmental variables affect critical functions of a species (such as the temperatures at which growth ceases). This knowledge is transformed into simple response curves. In other words, CLIMEX attempts to mimic the mechanisms that limit species’ geographical distributions and determine their seasonal phenology and, to a lesser extent, their relative abundance.
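
To give a feel for what a simple response curve looks like, here is a minimal Python sketch of a trapezoidal temperature response.  It is not CLIMEX code: the function name, the parameter names and the trapezoidal shape are assumptions chosen for illustration.

```python
def temperature_index(temp, lower_limit, lower_optimum, upper_optimum, upper_limit):
    """Illustrative response curve (an assumption, not CLIMEX's own code).

    Returns 0 where growth ceases (outside the limits), 1 between the two
    optima, and a linear ramp in between.
    """
    if temp <= lower_limit or temp >= upper_limit:
        return 0.0
    if temp < lower_optimum:
        return (temp - lower_limit) / (lower_optimum - lower_limit)
    if temp <= upper_optimum:
        return 1.0
    return (upper_limit - temp) / (upper_limit - upper_optimum)
```

For example, with limits of 10 and 40 degrees and optima of 20 and 30 degrees, a temperature of 15 degrees gives an index of 0.5.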

Download CLIMEX 3.0.2 here.

Hulls and Kernels

Data type
Presence data only

Description

In this technique two different methods are considered.  Firstly, the convex hull method creates the smallest possible convex polygon that encloses all of the sample points.  This method will represent irregular shapes; however, outliers can cause the polygon to be extremely large.  Furthermore, clusters and sample density are not taken into account.
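
The convex hull idea can be sketched in a few lines of Python.  The example below uses Andrew's monotone chain algorithm, which is one common way of computing a hull and is not necessarily what the ESRI scripts use; the function names are chosen for this example.

```python
def cross(o, a, b):
    """Z-component of the cross product of vectors OA and OB."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Return the hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        # Drop points that would make a clockwise (or straight) turn.
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]


def polygon_area(vertices):
    """Area of a simple polygon (shoelace formula)."""
    total = 0.0
    for i, (x1, y1) in enumerate(vertices):
        x2, y2 = vertices[(i + 1) % len(vertices)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0
```

Comparing `polygon_area` with and without a suspected outlier shows how strongly a single distant record can inflate the polygon.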

The second method is kernel mapping.  This method creates a continuous density surface in which the density of sample points is represented by 3-dimensional peaks whose height is determined by the number of sample points in that locality.  It represents irregular shapes and allows the prediction of occurrence over the sample area.
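
A density surface of this kind can be sketched as follows.  The Gaussian kernel, the `bandwidth` parameter and the function names are assumptions for this example; the downloadable scripts may use a different kernel or search radius.

```python
import math


def kernel_density(x, y, samples, bandwidth=1.0):
    """Gaussian kernel density estimate at the point (x, y)."""
    norm = 1.0 / (2.0 * math.pi * bandwidth ** 2 * len(samples))
    total = 0.0
    for sx, sy in samples:
        squared_distance = (x - sx) ** 2 + (y - sy) ** 2
        total += math.exp(-squared_distance / (2.0 * bandwidth ** 2))
    return norm * total


def density_grid(samples, xs, ys, bandwidth=1.0):
    """Evaluate the density on a grid; each row corresponds to one y value."""
    return [[kernel_density(x, y, samples, bandwidth) for x in xs] for y in ys]
```

Localities with many nearby sample points receive higher values than isolated ones, which is the "peak" described above.  A larger bandwidth gives a smoother surface.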

Download scripts for ESRI software here.

 

BIOCLIM

Data type
Presence data only

Description

The user sets maximum and minimum values for each environmental predictor where the species is known to occur, thereby creating a rectangular environmental envelope.  The software then uses these values to predict the areas in which the species can be found.  The Mahalanobis technique can be used to create an oblique elliptical environmental envelope based on the environmental variables supplied.
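
The rectangular envelope can be sketched in Python as below.  In this example the minimum and maximum values are derived from the presence records rather than entered by the user, and the function names and the dictionary-based record format are assumptions made for illustration.

```python
def fit_envelope(presence_records):
    """Minimum and maximum of each predictor over the presence records.

    Each record is a dict mapping a predictor name to its value.
    """
    variables = list(presence_records[0].keys())
    return {
        v: (min(r[v] for r in presence_records), max(r[v] for r in presence_records))
        for v in variables
    }


def in_envelope(site, envelope):
    """True if every predictor at the site lies inside its [min, max] range."""
    return all(low <= site[v] <= high for v, (low, high) in envelope.items())
```

A site is predicted as suitable only when all of its predictors fall inside the envelope, so a single extreme presence record widens the prediction for that predictor.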

Download BIOCLIMav (v1.2) here.

Download the Mahalanobis Distances extension for ArcView here.

Logistic Regression (GLM)

Data type
Presence and absence data

Description

This technique allows the user to formulate a relationship between species occurrence and the environmental variables.  The function consists of a response (dependent) variable, predictor (independent) variables and a link (relationship) function, which describes the relationship between the variables.  Generalized additive models (GAMs) bring additional smoothing functions into the relationship function, but require large sample sizes.
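
The relationship described above can be sketched with a small logistic regression fitted by gradient descent.  This is a teaching sketch only; the function names, learning rate and iteration count are assumptions, and a statistical package should be used for real analyses.

```python
import math


def sigmoid(z):
    """Logistic link function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_probability(weights, row):
    """Probability of presence; weights[0] is the intercept."""
    z = weights[0] + sum(w * v for w, v in zip(weights[1:], row))
    return sigmoid(z)


def fit_logistic(X, y, learning_rate=0.1, iterations=5000):
    """Fit by gradient descent.  X holds predictor rows, y holds 0/1 labels."""
    weights = [0.0] * (len(X[0]) + 1)
    for _ in range(iterations):
        gradient = [0.0] * len(weights)
        for row, label in zip(X, y):
            error = predict_probability(weights, row) - label
            gradient[0] += error
            for j, value in enumerate(row):
                gradient[j + 1] += error * value
        weights = [w - learning_rate * g / len(X) for w, g in zip(weights, gradient)]
    return weights
```

Here the response variable is presence (1) or absence (0), each row of `X` holds the predictor values for one site, and the sigmoid plays the role of the link between the two.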

Most standard statistical packages offer logistic regression.

Download BIOMOD for R here.   
Download the StatMod Zone extension to ArcView for SAS here.
Download ArcView-SDM for ordinary regression here.

Classification and Regression Trees (CART)

Data type
Presence and absence data

Description

This technique creates a classification or regression tree in which the occurrence data are split into mostly present and mostly absent groups according to the value of an environmental variable.  This process is repeated recursively until the data cannot be split any further or until the stop value is reached.

Most advanced statistical packages offer classification techniques.
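
The recursive splitting can be sketched as follows.  The Gini impurity criterion, the `max_depth` stop value and the function names are assumptions for this example; packages differ in the splitting criterion and stopping rules they use.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 (absent/present) labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def best_split(rows, labels):
    """Find the (score, feature, threshold) with the lowest weighted impurity."""
    best = None
    for feature in range(len(rows[0])):
        for threshold in sorted({r[feature] for r in rows}):
            left = [l for r, l in zip(rows, labels) if r[feature] <= threshold]
            right = [l for r, l in zip(rows, labels) if r[feature] > threshold]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if best is None or score < best[0]:
                best = (score, feature, threshold)
    return best


def build_tree(rows, labels, max_depth=3, depth=0):
    """Return a leaf (0 or 1) or a tuple (feature, threshold, left, right)."""
    majority = 1 if 2 * sum(labels) >= len(labels) else 0
    if depth >= max_depth or gini(labels) == 0.0:
        return majority
    split = best_split(rows, labels)
    if split is None:
        return majority
    _, feature, threshold = split
    left = [(r, l) for r, l in zip(rows, labels) if r[feature] <= threshold]
    right = [(r, l) for r, l in zip(rows, labels) if r[feature] > threshold]
    return (
        feature,
        threshold,
        build_tree([r for r, _ in left], [l for _, l in left], max_depth, depth + 1),
        build_tree([r for r, _ in right], [l for _, l in right], max_depth, depth + 1),
    )


def predict(tree, row):
    """Follow the splits down to a leaf and return its label."""
    while isinstance(tree, tuple):
        feature, threshold, left, right = tree
        tree = left if row[feature] <= threshold else right
    return tree
```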

Download the StatMod Zone extension to ArcView 3X for SAS here.

 

MaxEnt (Maximum Entropy)

Data type
Presence data only

Description

This technique applies the principle of maximum entropy: it estimates the most uniform distribution over the study area that is still consistent with the occurrence data.  This distribution is constrained by the environmental values, or the proportion of occurrence points in a category, observed at the occurrence points.  The resulting predicted distribution is regularized to avoid overfitting.  The output values are represented as percentages, with 100% being most suitable and 0% being least suitable.
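
The idea can be sketched in Python for a study area divided into cells.  This is not the MaxEnt program's algorithm: the function names, the plain gradient ascent and the simple L2 penalty (used here in place of MaxEnt's own regularization) are assumptions made for this example.

```python
import math


def gibbs(cell_features, lambdas):
    """Distribution over cells proportional to exp(sum of lambda * feature)."""
    scores = [sum(l * v for l, v in zip(lambdas, f)) for f in cell_features]
    top = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


def fit_maxent(cell_features, presence_cells, learning_rate=0.1,
               iterations=2000, l2=0.01):
    """Match the model's expected feature values to the presence-point means.

    `presence_cells` holds the indices of cells with occurrence records.
    With all lambdas at zero the distribution is uniform (maximum entropy);
    the lambdas move only as far as the constraints require.
    """
    n_features = len(cell_features[0])
    target = [
        sum(cell_features[c][j] for c in presence_cells) / len(presence_cells)
        for j in range(n_features)
    ]
    lambdas = [0.0] * n_features
    for _ in range(iterations):
        probs = gibbs(cell_features, lambdas)
        expected = [
            sum(p * f[j] for p, f in zip(probs, cell_features))
            for j in range(n_features)
        ]
        lambdas = [
            lam + learning_rate * (t - e - l2 * lam)
            for lam, t, e in zip(lambdas, target, expected)
        ]
    return lambdas
```

The probabilities returned by `gibbs` sum to 1 over the study area; they would need to be rescaled to obtain suitability percentages like those described above.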

Download MaxEnt here:

 

BioMapper

Data type
Presence data only

Description

BioMapper is a kit of GIS and statistical tools designed to build habitat suitability (HS) models and maps for any kind of animal or plant. It is centred on Ecological Niche Factor Analysis (ENFA), which allows it to compute habitat suitability models without the need for absence data.

Download BioMapper here.

 

Ordination

Data type
Presence and absence data

Description

Canonical Correspondence Analysis (CCA) is a widely used method for direct gradient analysis.  CCA assumes that species have a distribution with one mode. This means that the species has one optimal environmental condition. If any aspect of the environment is greater or less than this optimum, the species will perform more poorly (i.e. it will have a lower abundance).
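
The unimodal assumption can be illustrated with a Gaussian response curve.  The sketch below shows only this assumption, not CCA itself; the function and parameter names are chosen for the example.

```python
import math


def unimodal_response(value, optimum, tolerance, max_abundance=1.0):
    """Gaussian response: abundance peaks at the optimum and falls off
    symmetrically as the environmental value moves away from it."""
    return max_abundance * math.exp(-((value - optimum) ** 2) / (2.0 * tolerance ** 2))
```

The `tolerance` parameter controls how quickly abundance declines on either side of the optimum.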

Some statistical packages offer ordination (DA – Splus).

Download XLSTAT here.

 

MARS (Multivariate Adaptive Regression Splines)

Data type
Presence and absence data

Description

This technique requires a target variable and a set of predictor variables.  Irrelevant predictors are removed and interactions between predictors are discovered.  The model tests itself to avoid overfitting and handles missing values by creating a missing value indicator.
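
MARS models are built from hinge (piecewise-linear) basis functions.  The sketch below shows how a fitted model of this kind could be evaluated and how a missing value indicator can be created; it does not perform the fitting, and the function names and the coefficients in the usage note are hypothetical.

```python
def hinge(x, knot, direction=1):
    """Hinge basis function: max(0, x - knot) for direction=1,
    max(0, knot - x) for direction=-1."""
    return max(0.0, direction * (x - knot))


def mars_predict(x, intercept, terms):
    """Evaluate a one-predictor MARS-style model.

    `terms` is a list of (coefficient, knot, direction) tuples.
    """
    return intercept + sum(c * hinge(x, k, d) for c, k, d in terms)


def with_missing_indicator(value, fill=0.0):
    """Replace a missing value (None) by `fill` and flag it with a 1."""
    if value is None:
        return fill, 1
    return value, 0
```

For example, `mars_predict(5, 1.0, [(2.0, 3, 1), (-1.0, 3, -1)])` evaluates a response that rises with slope 2 above a knot at 3 and falls with slope 1 as the predictor rises towards the knot from below.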

Download MARS here.