Bout analysis

Tools and classes for the identification of behavioural bouts

A histogram of log-transformed frequencies of x with a chosen bin width and upper limit forms the basis for models. Histogram bins following empty ones have their frequencies averaged over the number of previous empty bins plus one. Models attempt to discern the number of random Poisson processes, and their parameters, generating the underlying distribution of log-transformed frequencies.
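The histogram construction described above can be sketched with NumPy. This is a minimal illustration under stated assumptions, not the library's own implementation: the function name `log_frequency_histogram`, the choice of bins starting at zero, and the decision to ignore empty bins before the first occupied one are all hypothetical.

```python
import numpy as np


def log_frequency_histogram(x, bw, upper=None):
    """Return centres and log frequencies of the non-empty bins of x.

    Hypothetical helper: bins start at 0 and have width bw.
    """
    x = np.asarray(x, dtype=float)
    if upper is None:
        upper = x.max()
    n_bins = int(np.ceil(upper / bw))
    edges = np.arange(n_bins + 1) * bw
    counts, _ = np.histogram(x[x <= upper], bins=edges)
    centres = edges[:-1] + bw / 2

    filled = np.flatnonzero(counts)
    # Number of empty bins immediately before each non-empty bin
    # (empty bins before the first occupied bin are ignored here)
    gaps = np.diff(filled, prepend=filled[0] - 1) - 1
    # Average each frequency over the preceding empty bins plus one
    averaged = counts[filled] / (gaps + 1)
    return centres[filled], np.log(averaged)


# Two empty bins precede the last one, so its count of 4 is divided by 3
centres, lnfreq = log_frequency_histogram(
    [0.5, 0.6, 1.5, 4.5, 4.6, 4.7, 4.8], bw=1.0
)
```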

The abstract class Bouts provides basic methods.

Abstract class & methods summary

Bouts(x, bw[, method])

Abstract base class for models of log-transformed frequencies

Bouts.init_pars(x_break[, plot, ax])

Find starting values for mixtures of random Poisson processes

Bouts.fit(start)

Fit Poisson mixture model to log frequencies

Bouts.bec(coefs)

Calculate bout ending criteria from model coefficients

Bouts.plot_fit(coefs[, ax])

Plot log frequency histogram and fitted model

Nonlinear least squares models

The model that describes the histogram directly is currently implemented in the BoutsNLS class. For a mixture of two Poisson processes, this class sets up the model:

(1)\[y = \log\left[N_f \lambda_f e^{-\lambda_f t} + N_s \lambda_s e^{-\lambda_s t}\right]\]

where \(N_f\) and \(N_s\) are the numbers of events belonging to processes \(f\) and \(s\), respectively, and \(\lambda_f\) and \(\lambda_s\) are the rate parameters of each process, i.e. the probability per unit time of an event occurring. More processes can be added to the mixture.

The bout-ending criterion (BEC) corresponding to equation (1) is:

(2)\[BEC = \frac{1}{\lambda_f - \lambda_s} \log \frac{N_f \lambda_f}{N_s \lambda_s}\]

Note that there is one BEC per transition between Poisson processes.
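Equations (1) and (2) can be checked numerically with a short sketch. The function names and the coefficient values below are hypothetical, chosen only for illustration. The BEC is the interval length at which the fast and slow components contribute equally to the model.

```python
import numpy as np


def two_process_log_freq(t, n_f, lam_f, n_s, lam_s):
    """Equation (1): modelled log frequency at interval length t."""
    t = np.asarray(t, dtype=float)
    return np.log(n_f * lam_f * np.exp(-lam_f * t)
                  + n_s * lam_s * np.exp(-lam_s * t))


def bec_two_process(n_f, lam_f, n_s, lam_s):
    """Equation (2): bout-ending criterion for two processes."""
    return np.log((n_f * lam_f) / (n_s * lam_s)) / (lam_f - lam_s)


# Hypothetical coefficients, for illustration only
n_f, lam_f, n_s, lam_s = 1000.0, 0.5, 200.0, 0.01
bec = bec_two_process(n_f, lam_f, n_s, lam_s)

# At t = BEC both components have the same value
fast = n_f * lam_f * np.exp(-lam_f * bec)
slow = n_s * lam_s * np.exp(-lam_s * bec)
```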

This subclass inherits its methods from the abstract superclass Bouts, and adds the method below.

Methods summary

BoutsNLS.plot_ecdf(coefs[, ax])

Plot observed and modelled empirical cumulative frequencies

Maximum likelihood models

This is the preferred approach to modelling mixtures of random Poisson processes, as it does not rely on the subjective construction of a histogram. The histogram is used only to generate reasonable starting values; the underlying parameters of the model are obtained via maximum likelihood, which makes the approach more robust.

For the case of a mixture of two processes, as above, the log likelihood of all \(N_t\) events in the mixture can be expressed as:

(3)\[\log L_2 = \sum_{i=1}^{N_t} \log\left[p \lambda_f e^{-\lambda_f t_i} + (1-p) \lambda_s e^{-\lambda_s t_i}\right]\]

where \(p\) is a mixing parameter indicating the proportion of fast-process events in the sampled population, so that \(1-p\) is the proportion of slow-process events.
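Equation (3) translates almost directly into NumPy. The sketch below uses a hypothetical function name and untransformed parameters; it is not the signature of BoutsMLE.loglik_fun. As a sanity check, setting \(p = 1\) should reduce the expression to the log likelihood of a single exponential distribution.

```python
import numpy as np


def loglik_two_process(p, lam_f, lam_s, t):
    """Equation (3): log likelihood of a two-process mixture."""
    t = np.asarray(t, dtype=float)
    dens = (p * lam_f * np.exp(-lam_f * t)
            + (1 - p) * lam_s * np.exp(-lam_s * t))
    return np.sum(np.log(dens))


# With p = 1 only the fast process remains:
# log L = n * log(lam_f) - lam_f * sum(t)
t_obs = [0.5, 1.0]
ll_single = loglik_two_process(1.0, 2.0, 0.1, t_obs)
```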

The BEC in this case can be estimated as:

(4)\[BEC = \frac{1}{\lambda_f - \lambda_s} \log \frac{p\lambda_f}{(1-p)\lambda_s}\]
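Equation (4) has the same form as equation (2): writing \(N_f = p N_t\) and \(N_s = (1-p) N_t\) in equation (2), the total \(N_t\) cancels inside the logarithm. A small numerical check, with hypothetical function names and parameter values:

```python
import numpy as np


def bec_from_counts(n_f, lam_f, n_s, lam_s):
    """Equation (2), in terms of event counts per process."""
    return np.log((n_f * lam_f) / (n_s * lam_s)) / (lam_f - lam_s)


def bec_from_mixing(p, lam_f, lam_s):
    """Equation (4), in terms of the mixing parameter p."""
    return np.log((p * lam_f) / ((1 - p) * lam_s)) / (lam_f - lam_s)


# Hypothetical values, for illustration only
p, lam_f, lam_s, n_t = 0.7, 0.5, 0.01, 1500
bec_mle = bec_from_mixing(p, lam_f, lam_s)
bec_nls = bec_from_counts(p * n_t, lam_f, (1 - p) * n_t, lam_s)
```

Both expressions give the same value for any \(N_t > 0\).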

The subclass BoutsMLE offers the framework for these models.

Class & methods summary

BoutsMLE.loglik_fun(params, x[, transformed])

Log likelihood function of parameters given observed data

BoutsMLE.fit(start[, fit1_opts, fit2_opts])

Maximum likelihood estimation of log frequencies

BoutsMLE.bec(fit)

Calculate bout ending criteria from model coefficients

BoutsMLE.plot_fit(fit[, ax])

Plot log frequency histogram and fitted model

BoutsMLE.plot_ecdf(fit[, ax])

Plot observed and modelled empirical cumulative frequencies

API

class bouts.Bouts(x, bw, method='standard')[source]

Abstract base class for models of log-transformed frequencies

Bouts is an abstract base class that sets up bout identification procedures; other classes build on it and do the model fitting. Subclasses must implement the fit and bec methods, or re-use the default NLS methods in Bouts.

x

1D array with input data.

Type

array_like

method

Method used for calculating the histogram.

Type

str

lnfreq

DataFrame with the centers of histogram bins, and corresponding log-frequencies of x.

Type

pandas.DataFrame

abstract bec(coefs)[source]

Calculate bout ending criteria from model coefficients

The default implementation is that of the NLS method.

Parameters

coefs (pandas.DataFrame) – DataFrame with model coefficients in columns, and indexed by parameter names “a” and “lambda”.

Returns

out – 1-D array with BECs implied by coefs. Length is coefs.shape[1]

Return type

ndarray, shape (n,)

abstract fit(start)[source]

Fit Poisson mixture model to log frequencies

Default is non-linear least squares method.

Parameters

start (pandas.DataFrame) – DataFrame with coefficients for each process in columns.

Returns

  • coefs (pandas.DataFrame) – Coefficients of the model.

  • pcov (2D array) – Covariance of coefs.

init_pars(x_break, plot=True, ax=None, **kwargs)[source]

Find starting values for mixtures of random Poisson processes

Starting values are calculated using the “broken stick” method.

Parameters
  • x_break (array_like) – One- or two-element array with values determining the break(s) for broken stick model, such that x < x_break[0] is first process, x >= x_break[0] & x < x_break[1] is second process, and x >= x_break[1] is third one.

  • plot (bool, optional) – Whether to plot the broken stick model.

  • ax (matplotlib.Axes, optional) – An Axes instance to use as target. Default is to create one.

  • **kwargs (optional keyword arguments) – Passed to plotting function.

Returns

out – DataFrame with coefficients for each process.

Return type

pandas.DataFrame
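The “broken stick” idea can be sketched as one straight-line fit per segment of the log-frequency histogram. The helper below is hypothetical and is not the code behind init_pars. It assumes each segment follows a single process, \(y = \log(a \lambda) - \lambda x\), so that the slope gives \(-\lambda\) and the intercept gives \(\log(a \lambda)\); the column names are also made up for the example.

```python
import numpy as np
import pandas as pd


def broken_stick_start(x, y, x_break):
    """Fit one straight line per segment of (x, y) split at x_break."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Segment 0 is x < x_break[0], segment 1 is the next interval, etc.
    segment = np.digitize(x, np.atleast_1d(x_break))
    pars = {}
    for k in np.unique(segment):
        in_seg = segment == k
        # Each segment needs at least two points for a line fit
        slope, intercept = np.polyfit(x[in_seg], y[in_seg], 1)
        lam = -slope
        pars[f"process_{k}"] = {"a": np.exp(intercept) / lam, "lambda": lam}
    # Processes in columns, parameters "a" and "lambda" in the index
    return pd.DataFrame(pars)


# Noise-free synthetic log frequencies with a break at x = 10
x = np.arange(0.5, 20, 1.0)
y = np.where(x < 10,
             np.log(1000 * 0.5) - 0.5 * x,
             np.log(200 * 0.01) - 0.01 * x)
start = broken_stick_start(x, y, [10])
```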

plot_fit(coefs, ax=None)[source]

Plot log frequency histogram and fitted model

Parameters
  • coefs (pandas.DataFrame) – DataFrame with model coefficients in columns, and indexed by parameter names “a” and “lambda”.

  • ax (matplotlib.Axes instance) – An Axes instance to use as target.

Returns

ax

Return type

matplotlib.Axes

class bouts.BoutsMLE(x, bw, method='standard')[source]

Maximum likelihood estimation for bout identification

bec(fit)[source]

Calculate bout ending criteria from model coefficients

Parameters

fit (scipy.optimize.OptimizeResult) – Object with the optimization result, having an x attribute with coefficients of the solution.

Returns

out

Return type

ndarray

Notes

Current implementation is for a two-process mixture, hence an array of a single float is returned.

fit(start, fit1_opts=None, fit2_opts=None)[source]

Maximum likelihood estimation of log frequencies

Parameters
  • start (pandas.DataFrame) – DataFrame with starting values for coefficients of each process in columns. These can come from the “broken stick” method as in Bouts.init_pars(), and will be transformed to minimize the first log likelihood function.

  • fit1_opts, fit2_opts (dict, optional) – Dictionaries with keywords to be passed to scipy.optimize.minimize(), for the first and second fits.

Returns

fit1, fit2 – Objects with the optimization result from the first and second fit, each having an x attribute with coefficients of the solution.

Return type

scipy.optimize.OptimizeResult

Notes

Current implementation handles mixtures of two Poisson processes.
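A single maximum likelihood fit on a transformed scale can be sketched with scipy.optimize.minimize. This is an illustration under assumptions, not the two-stage procedure of BoutsMLE.fit: the logit transform for \(p\), the log transform for the \(\lambda\) parameters, the Nelder-Mead optimizer, the starting values, and the simulated data are all choices made for the example.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit, logit


def negloglik_transformed(theta, t):
    """Negative log likelihood of equation (3) on a transformed scale."""
    p = expit(theta[0])               # logit(p) -> p in (0, 1)
    lam_f, lam_s = np.exp(theta[1:])  # log(lambda) -> lambda > 0
    dens = (p * lam_f * np.exp(-lam_f * t)
            + (1 - p) * lam_s * np.exp(-lam_s * t))
    return -np.sum(np.log(dens))


# Simulated intervals from a known mixture (illustrative values)
rng = np.random.default_rng(42)
n, p_true, lam_f_true, lam_s_true = 4000, 0.7, 1.0, 0.05
is_fast = rng.random(n) < p_true
t = np.where(is_fast,
             rng.exponential(1 / lam_f_true, n),
             rng.exponential(1 / lam_s_true, n))

# Rough starting values, e.g. as a broken stick fit might provide
start = np.array([logit(0.5), np.log(0.5), np.log(0.1)])
res = minimize(negloglik_transformed, start, args=(t,),
               method="Nelder-Mead", options={"maxiter": 2000})

# Back-transform the solution to the original parameter scale
p_hat = expit(res.x[0])
lam_f_hat, lam_s_hat = np.exp(res.x[1:])
```

Optimizing on the transformed scale keeps \(p\) within (0, 1) and the rates positive without explicit bounds.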

loglik_fun(params, x, transformed=True)[source]

Log likelihood function of parameters given observed data

Parameters
  • params (array_like) – 1-D array with parameters to fit. Currently must be 3-length, with mixing parameter \(p\), and density parameters \(\lambda_f\) and \(\lambda_s\), in that order.

  • x (array_like) – Independent data array described by parameters p and lambdas.

  • transformed (bool) – Whether params are transformed and need to be un-transformed to calculate the likelihood.

Returns

out

plot_ecdf(fit, ax=None)[source]

Plot observed and modelled empirical cumulative frequencies

Parameters
  • fit (scipy.optimize.OptimizeResult) – Object with the optimization result, having an x attribute with coefficients of the solution.

  • ax (matplotlib.Axes instance) – An Axes instance to use as target.

Returns

ax

Return type

matplotlib.Axes
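The comparison drawn by an ECDF plot can be sketched as follows. The cumulative distribution function of the two-process mixture in equation (3) is \(F(t) = 1 - p e^{-\lambda_f t} - (1-p) e^{-\lambda_s t}\). The helper names are hypothetical, and the plotting method may compute its curves differently.

```python
import numpy as np


def mixture_cdf(t, p, lam_f, lam_s):
    """Modelled cumulative frequency for the two-process mixture."""
    t = np.asarray(t, dtype=float)
    return 1 - p * np.exp(-lam_f * t) - (1 - p) * np.exp(-lam_s * t)


def ecdf(x):
    """Observed cumulative frequency: sorted values and step heights."""
    xs = np.sort(np.asarray(x, dtype=float))
    return xs, np.arange(1, xs.size + 1) / xs.size


xs, observed = ecdf([3.0, 1.0, 2.0])
modelled = mixture_cdf(xs, p=0.7, lam_f=1.0, lam_s=0.05)
```

Plotting `observed` and `modelled` against `xs` on the same axes shows how closely the fitted mixture tracks the data.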

plot_fit(fit, ax=None)[source]

Plot log frequency histogram and fitted model

Parameters
  • fit (scipy.optimize.OptimizeResult) – Object with the optimization result, having an x attribute with coefficients of the solution.

  • ax (matplotlib.Axes instance) – An Axes instance to use as target.

Returns

ax

Return type

matplotlib.Axes

class bouts.BoutsNLS(x, bw, method='standard')[source]

Nonlinear least squares bout identification

bec(coefs)[source]

Calculate bout ending criteria from model coefficients

The abstract base class bouts.Bouts implements this method.

Parameters

coefs (pandas.DataFrame) – DataFrame with model coefficients in columns.

Returns

out – List of BECs implied by coefs.

Return type

list

fit(start)[source]

Fit non-linear least squares to log frequencies

The abstract base class bouts.Bouts implements this method.

Parameters

start (pandas.DataFrame) – DataFrame with coefficients for each process in columns.

Returns

  • coefs (pandas.DataFrame) – Coefficients of the model.

  • pcov (2D array) – Covariance of coefs.

plot_ecdf(coefs, ax=None, **kwargs)[source]

Plot observed and modelled empirical cumulative frequencies

Parameters
  • coefs (pandas.DataFrame) – DataFrame with model coefficients in columns.

  • ax (matplotlib.Axes instance) – An Axes instance to use as target.

  • **kwargs (optional keyword arguments) – Passed to matplotlib.pyplot.gca.

Returns

ax

Return type

matplotlib.Axes

bouts.label_bouts(x, bec, as_diff=False)[source]

Classify data into bouts based on bout ending criteria

Parameters
  • x (pandas.Series) – Series with data to classify according to bec.

  • bec (array_like) – Array with bout-ending criteria. It is assumed to be sorted.

  • as_diff (bool, optional) – Whether to apply diff on x so it matches bec’s scale.

Returns

out – Integer array with the same shape as x.

Return type

ndarray
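One simple way to turn a single BEC into bout labels is sketched below. It is a hypothetical helper, not the labelling scheme of bouts.label_bouts: it handles only one BEC, and it assumes the input already holds the intervals between successive events, with 0 for the first event.

```python
import numpy as np
import pandas as pd


def label_bouts_sketch(intervals, bec):
    """Number bouts: an interval longer than bec starts a new bout."""
    intervals = pd.Series(intervals, dtype=float)
    starts_new_bout = intervals > bec
    # Running count of bout starts, offset so labels begin at 1
    return (starts_new_bout.cumsum() + 1).to_numpy()


# Intervals of 50 and 80 exceed the BEC of 10 and open bouts 2 and 3
labels = label_bouts_sketch([0, 2, 3, 50, 1, 2, 80, 4], bec=10)
```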