13: Modelling Species’ Distributions

MCED Chapter 13


Carsten F. Dormann

Abstract (from the book)

Species distribution models have become a commonplace exercise over the last 10 years; however, analyses vary due to different traditions, aims of applications and statistical backgrounds. In this chapter, I lay out what I consider to be the most crucial steps in a species distribution analysis: data pre-processing and visualisation, dimensional reduction (including collinearity), model formulation, model simplification, model type, assessment of model performance (incl. spatial autocorrelation) and model interpretation. For each step, the most relevant considerations are discussed, mainly illustrated with Generalised Linear Models and Boosted Regression Trees as the two most contrasting methods. In the second section, I draw attention to the three most challenging problems in species distribution modelling: identifying (and incorporating into the model) the factors that limit a species’ range; separating the fundamental, realised and potential niche; and niche evolution.
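The collinearity step named in the abstract can be illustrated with a minimal sketch. It is written in plain Python for illustration only (the chapter's own scripts are in R) and is not the author's method. It flags pairs of predictors whose absolute Pearson correlation exceeds a threshold. The threshold of 0.7 is a commonly used rule of thumb and is an assumption here, as are the function names, the predictor names and all data values.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def collinear_pairs(predictors, threshold=0.7):
    """Return (name_a, name_b, r) for every pair with |r| above the threshold.

    The default threshold is an assumed rule of thumb, not a value from the chapter.
    """
    flagged = []
    for (name_a, x_a), (name_b, x_b) in combinations(predictors.items(), 2):
        r = pearson(x_a, x_b)
        if abs(r) > threshold:
            flagged.append((name_a, name_b, round(r, 3)))
    return flagged


# Hypothetical predictor values for six grid cells (made up for illustration).
predictors = {
    "sst": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "ice_cover": [9.1, 8.0, 6.9, 6.2, 5.0, 3.8],
    "depth": [3.0, 1.0, 4.0, 1.0, 5.0, 2.0],
}

flagged = collinear_pairs(predictors)
for name_a, name_b, r in flagged:
    print(f"{name_a} and {name_b} are strongly correlated (r = {r})")
```

With these made-up values only the pair sst and ice_cover is flagged. In a real analysis one would then keep one predictor of each flagged pair or combine them, for example through a principal component analysis.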

Additional material

  • Where’s the sperm whale? A species distribution example analysis (co-authored by Kristin Kaschner)
  • Dormann_Kaschner_MCED

  • We analyse the coarse-scale distribution of sperm whales around Antarctica as an example of a typical species distribution model. Following the outline and structure of Chapter 13, we demonstrate each point with this data set, show results and their interpretation, compare two modelling techniques (GLM and Boosted Regression Trees), and discuss the steps of the analysis in the light of the ecology of the target species. Data and R-code are provided so that readers and teachers can reproduce our analysis.
    Case Study Sperm Whale (link to pdf)
  • R-scripts:
    1: spermwhale_R.Analysis
    2: brtfunctions_cfd.r
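As a rough illustration of how the predictions of two models for the same presence/absence data might be compared, the following sketch computes the area under the ROC curve (AUC). It is written in plain Python and is not taken from the R scripts listed above. AUC is only one common performance measure and is an assumption here, as are the function name and all data values. The AUC is the probability that a randomly chosen presence receives a higher predicted value than a randomly chosen absence, with ties counted as one half.

```python
def auc(observed, predicted):
    """AUC as the fraction of presence/absence pairs ranked correctly."""
    presences = [p for o, p in zip(observed, predicted) if o == 1]
    absences = [p for o, p in zip(observed, predicted) if o == 0]
    if not presences or not absences:
        raise ValueError("need at least one presence and one absence")
    wins = 0.0
    for p_pres in presences:
        for p_abs in absences:
            if p_pres > p_abs:
                wins += 1.0
            elif p_pres == p_abs:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(presences) * len(absences))


# Hypothetical observations (1 = presence, 0 = absence) and the predicted
# probabilities of two hypothetical models; all values are made up.
observed = [1, 1, 1, 0, 0, 0, 1, 0]
model_a = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1, 0.6, 0.4]
model_b = [0.9, 0.4, 0.7, 0.5, 0.2, 0.1, 0.6, 0.8]

auc_a = auc(observed, model_a)
auc_b = auc(observed, model_b)
print(f"AUC model A: {auc_a:.2f}")
print(f"AUC model B: {auc_b:.2f}")
```

Values computed on the training data are optimistic. A fairer comparison would use held-out data, ideally with spatially blocked cross-validation when the data are spatially autocorrelated.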

Summarising questions

  • Do the data you analyse match the question you pose? E.g., if the data are limited to a small region, do you think you can deduce an animal’s climate niche? Or, if the data cover a large area but only at very low resolution, do you think you can infer habitat usage?

  • Do you have additional abundance data? Hierarchical models may allow you to augment coarse-grain, large-extent data with fine-grain, small-extent data (see references below)!
  • Why do you use this approach and these data? Are you aware of the implications of these choices? Do you know the literature comparing this approach to others? Do you know the papers on the impact of different types of predictor sets on model predictions? Would you bet your career on the conclusions you draw from your study, or should they be couched much more cautiously?
  • Can you list all factors that introduce uncertainty into your model (data, model structure, parameterisation, resolution, …)? Do you have a feeling for how much they affect model predictions? Are you aware of studies investigating such effects?

Further Reading
(from the book and additional recommendations)

  1. Franklin, J. 2009. Mapping Species Distributions: Spatial Inference and Prediction. Cambridge, UK: Cambridge University Press.
  2. Special features on SDMs:

    1. Richardson, D. M., R. J. Whittaker 2010. Conservation biogeography – foundations, concepts and challenges. Diversity and Distributions 16: 313–320.
    2. Zimmermann, N. E., T. C. Edwards, C. H. Graham, P. B. Pearman, J.-C. Svenning 2010. New trends in species distribution modelling. Ecography 33 (6): 985–989.
  3. The many, many publications resulting from the NCEAS working group Species Distribution Modelling; most papers were co-authored by Jane Elith (http://www.botany.unimelb.edu.au/envisci/about/staff/elith.html), Antoine Guisan (http://www.unil.ch/Jahia/site/dee/cache/offonce/pid/55116;jsessionid=32F098CE53050B5119CA0D455588A242.jvm1) and Catherine Graham (http://life.bio.sunysb.edu/ee/grahamlab/Publication.html). Their homepages give specific references.
  4. Bayesian hierarchical models allow for more complex analyses, involving, e.g., data at different resolutions, detection probabilities, repeated visits, spatial effects, species interactions, etc. They are more demanding to implement and much slower to compute. Good entry points are:
    1. Gelman, A., J. Hill 2007. Data Analysis Using Regression and Multilevel/Hierarchical Models. Cambridge, UK: Cambridge University Press.
    2. Royle, J. A., R. M. Dorazio 2008. Hierarchical Modeling and Inference in Ecology. Amsterdam: Academic Press.