Statistical Approaches for Hidden Variables in Ecology. Nathalie Peyrard

ISBN 9781119902782




these behaviors and understanding the underlying determinisms. In this chapter, we present two latent variable models widely used in movement ecology for trajectory analysis. Each model addresses a specific objective: reconstructing real trajectories by removing geolocation errors, and identifying different behaviors in the course of movement.

      1.1.1. Reconstructing a real trajectory from imperfect observations

      Observation errors are generally small (a few meters) when positions are obtained by GPS on open ground with good satellite coverage. Far larger errors, up to tens of kilometers, may occur with other technologies such as the Argos system. A hierarchical model for reconstructing real trajectories from observed trajectories is presented in section 1.2.1.

      1.1.2. Identifying different behaviors in movement

      Individuals rarely move in a homogeneous manner, and different movement patterns are often observed. Nathan et al. (2008) propose a formalization of the mechanisms responsible for individual movement. Among the aspects they mention, the internal state of the individual and the environment in which it lives are identified as important drivers of movement. It seems reasonable to believe that the internal state of an individual affects its behavior, resulting in a change of movement regime.

      Any study of individual movement must permit the identification of different states or activities. In this case, the hidden variable is the activity of the individual, while the observed variable is its position, or various metrics derived from this position, as we shall see later. Section 1.2.2 presents a reconstruction of behavior based on movement observations, using a specific latent variable model known as a hidden Markov model.

      1.2.1. Trajectory reconstruction model

      1.2.1.1. Overview

      In cases where there are errors in observed positions, data can be smoothed in order to recreate the real trajectory. To smooth errors, all collected data points are combined with a movement model in order to “straighten out” outlying observations and thus correct positioning errors.

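      We write Zt for the real position and Yt for the observed position at time t, for 0 ≤ t ≤ n, and consider that these random variables obey the following hierarchical model:

      \[
      \begin{aligned}
      Z_0 &\sim \mathcal{N}(\mu_0, \Sigma_0), \\
      Z_t \mid Z_{t-1} &\sim \mathcal{N}(\mu + A Z_{t-1}, \Sigma_m), \quad 1 \le t \le n, \\
      Y_t \mid Z_t &\sim \mathcal{N}(\nu + B Z_t, \Sigma_o), \quad 0 \le t \le n.
      \end{aligned}
      \tag{1.1}
      \]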

      From top to bottom, these three equations define:

       – The initial distribution: the a priori initial position of the individual. In this case, we have a normal distribution (in dimension 2) about an initial position μ0, with a variance–covariance matrix Σ0.

       – The transition distribution (or dynamic model): in this case, a model of the individual’s movement. We consider that the current position is a Gaussian random variable, centered about an affine transformation of the previous position, with a variance–covariance matrix Σm. The affine transformation is defined by two parameters: a matrix A (of size 2 × 2) and a vector μ of dimension 2. The most common approach is to set μ = 0 and to take A as the identity matrix. The resulting model is a random walk.

       – The emission distribution (or observation model): the observation is taken to be a Gaussian random variable centered about an affine transformation of the current position, with variance–covariance matrix Σo. The affine transformation is defined by two parameters: a matrix B (of size 2 × 2) and a vector ν of dimension 2. The most common approach is to set ν = 0 and to take B as the identity matrix. The observation is thus presumed to be centered about the real position.
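      The three distributions described above can be simulated directly. The following is a minimal sketch in Python with NumPy, assuming the model reads exactly as described (Gaussian initial position, affine Gaussian transition, affine Gaussian emission). The function name `simulate_trajectory`, its argument names, and all numerical values are illustrative assumptions, not taken from the original text.

```python
import numpy as np

def simulate_trajectory(n, mu0, Sigma0, A, mu, Sigma_m, B, nu, Sigma_o, seed=0):
    """Draw real positions Z_0..Z_n and observations Y_0..Y_n
    from the linear Gaussian hierarchical model (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    Z = np.empty((n + 1, 2))
    Y = np.empty((n + 1, 2))
    # Initial distribution: Z_0 ~ N(mu0, Sigma0)
    Z[0] = rng.multivariate_normal(mu0, Sigma0)
    for t in range(1, n + 1):
        # Transition: Z_t | Z_{t-1} ~ N(mu + A Z_{t-1}, Sigma_m)
        Z[t] = rng.multivariate_normal(mu + A @ Z[t - 1], Sigma_m)
    for t in range(n + 1):
        # Emission: Y_t | Z_t ~ N(nu + B Z_t, Sigma_o)
        Y[t] = rng.multivariate_normal(nu + B @ Z[t], Sigma_o)
    return Z, Y

# Illustrative values only: random walk (mu = 0, A = I) observed
# without bias (nu = 0, B = I), with noisy positions.
I2 = np.eye(2)
Z, Y = simulate_trajectory(
    n=200, mu0=np.zeros(2), Sigma0=I2,
    A=I2, mu=np.zeros(2), Sigma_m=0.1 * I2,
    B=I2, nu=np.zeros(2), Sigma_o=4.0 * I2,
)
```

      With μ = 0, ν = 0 and A = B equal to the identity, the simulated real trajectory `Z` is a random walk and `Y` is a noisy copy of it, which is the most common setting mentioned above.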

      1.2.1.2. Inference

      Inference in this model serves two purposes:

       – Estimation of positions: in this case, inference is used to determine the distribution of actual positions based on observations, that is, for 0 ≤ t ≤ n, the distribution of the random variable Zt|Y0:n. This distribution is known as the smoothing distribution.

       – Estimation of parameters: in this case, inference is used to estimate the unknown parameters of the model (which, in the majority of cases, are the two variance–covariance matrices Σm and Σo).

      With known parameters and for any 0 ≤ t ≤ n, the distribution of Zt|Y0:n is Gaussian. The mean and the variance–covariance matrix of this distribution can be calculated explicitly. This step is carried out using Kalman smoothing, which will not be described in detail here; interested readers may wish to consult Tusell (2011). It is important to note that the explicit nature of this solution is exceptional in the context of latent variable models, and is a result of the Gaussian linear formulation of model [1.1].
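      Although the text does not detail the algorithm, the explicit computation can be illustrated. The sketch below assumes the standard textbook form of Kalman smoothing (a forward Kalman filter followed by a Rauch-Tung-Striebel backward pass), which may differ in presentation from Tusell (2011). The function name `kalman_smoother`, its argument names, and the simulated data are illustrative assumptions.

```python
import numpy as np

def kalman_smoother(Y, mu0, Sigma0, A, mu, Sigma_m, B, nu, Sigma_o):
    """Return the means and covariances of Z_t | Y_{0:n}, t = 0..n,
    for known parameters (hypothetical helper, standard recursions)."""
    n1, d = Y.shape[0], mu0.shape[0]
    m_pred = np.empty((n1, d)); P_pred = np.empty((n1, d, d))
    m_filt = np.empty((n1, d)); P_filt = np.empty((n1, d, d))
    # Forward pass (Kalman filter): distribution of Z_t | Y_{0:t}
    for t in range(n1):
        if t == 0:
            m_pred[0], P_pred[0] = mu0, Sigma0
        else:
            m_pred[t] = mu + A @ m_filt[t - 1]
            P_pred[t] = A @ P_filt[t - 1] @ A.T + Sigma_m
        S = B @ P_pred[t] @ B.T + Sigma_o           # innovation covariance
        K = P_pred[t] @ B.T @ np.linalg.inv(S)      # Kalman gain
        m_filt[t] = m_pred[t] + K @ (Y[t] - nu - B @ m_pred[t])
        P_filt[t] = P_pred[t] - K @ S @ K.T
    # Backward pass (Rauch-Tung-Striebel): distribution of Z_t | Y_{0:n}
    m_s, P_s = m_filt.copy(), P_filt.copy()
    for t in range(n1 - 2, -1, -1):
        G = P_filt[t] @ A.T @ np.linalg.inv(P_pred[t + 1])
        m_s[t] = m_filt[t] + G @ (m_s[t + 1] - m_pred[t + 1])
        P_s[t] = P_filt[t] + G @ (P_s[t + 1] - P_pred[t + 1]) @ G.T
    return m_s, P_s

# Illustrative data: a random walk observed with large position errors.
rng = np.random.default_rng(1)
I2 = np.eye(2)
n = 200
steps = rng.normal(scale=np.sqrt(0.1), size=(n + 1, 2))
steps[0] = rng.normal(size=2)                 # Z_0 ~ N(0, I)
Z = np.cumsum(steps, axis=0)                  # real positions
Y = Z + rng.normal(scale=2.0, size=Z.shape)   # observed positions

m_smooth, P_smooth = kalman_smoother(
    Y, mu0=np.zeros(2), Sigma0=I2, A=I2, mu=np.zeros(2),
    Sigma_m=0.1 * I2, B=I2, nu=np.zeros(2), Sigma_o=4.0 * I2,
)
```

      Here `m_smooth[t]` is the reconstructed position at time t and `P_smooth[t]` quantifies its uncertainty. Explicit matrix inverses are used for readability; for matrices of size 2 × 2 this is harmless, but a linear solve would be preferable in larger problems.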

      In practice, the parameter θ = {μ, A, ν, B, Σm, Σo} is unknown. In a frequentist context, the natural aim is to identify the parameter that maximizes the likelihood associated with observations