---
title: "GLMMcosinor"
author: "Oliver Jayasinghe and Rex Parsons"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{GLMMcosinor}
  %\VignetteEncoding{UTF-8}
  %\VignetteEngine{knitr::rmarkdown}
bibliography: REFERENCES.bib
editor_options: 
  markdown: 
    wrap: 72
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

```{r srr, eval = FALSE, echo = FALSE}
#' @srrstats {G1.0}
#' @srrstats {G1.1}
#' @srrstats {G1.3}
```

## An brief introduction to the cosinor model

A cosinor model aims to model the amplitude ($A$), acrophase ($\phi$),
and MESOR ($M$) of a rhythmic dataset.

-   MESOR ($M$) is the Midline Estimating Statistic of Rhythm, and may
    also be referred to as the equilibrium point.

-   Amplitude ($A$) is the difference between the MESOR and the maximum
    height of the rhythm.

-   Acrophase ($\phi$) is the phase at which the maximal response
    occurs.

These could be modeled using a cosine function:

$$Y(t) = M + Acos(\frac{2\pi t}{\tau} - \phi) + e(t)$$ where $e(t)$ is
the error term.

However, these cannot be estimated using a linear modeling framework!
Other packages, including
[`{circacompare}`](https://cran.r-project.org/package=circacompare)
[@parsons2020circacompare], fit this exact nonlinear model but most
packages (including this one) decomposes this into linear terms,
creating the cosinor model:

$$Y(t) = M + \beta x + \gamma z + e(t)$$

Where $x =cos(\frac{2\pi t}{τ})$, $z =sin(\frac{2\pi t}{τ})$,
$\beta = A cos(\phi)$, $\gamma = A sin(\phi)$

This linear model is passed to the package
[`{glmmTMB}`](https://cran.r-project.org/package=glmmTMB)
[@brooks2017glmmTMB] in `lme4` syntax. If the model has no random
effects, `glmmTMB` uses maximum likelihood estimation to estimate the
linear coefficients of the model. For models with random effects, a
Laplace approximation is used to integrate over the random effects. This
approximation is handled by the
[`{TMB}`](https://cran.r-project.org/package=TMB)[@Kristensen2016TMB]
package which uses automatic differentiation of the joint likelihood
function to provide fast computations of parameter estimates. A detailed
explanation of this process is described
[here](https://doi.org/10.18637/jss.v070.i05) [@Kristensen2016TMB]

`glmmTMB` returns the estimates of the linear coefficients from the
linear model. To recover the estimates of the original parameters for
amplitude ($A$) and acrophase ($\phi$), the estimates for $\hat\beta$
and $\hat\gamma$ must be transformed as per the following equations:

$$\hat\phi = \arctan(\frac{\hat\gamma}{\hat\beta}) $$

$$\hat A = \sqrt{\hat\beta ^2 + \hat\gamma ^ 2}$$ These transformed
parameters for acrophase and amplitude, along with MESOR are returned as
part of the `cglmm` output. For a more thorough introduction to cosinor
modeling, see [here](https://doi.org/10.1186%2F1742-4682-11-16)
[@cornelissen2014cosinor].

## Introduction

`{GLMMcosinor}` allows the user to fit generalized linear models based
on rhythmic data with a cosinor model. It allows users to summarize,
predict, and plot these models too. Existing packages have focused
primarily on Gaussian data. Some circadian regression modeling packages
have allowed users to specify generalized linear models, but with
limited flexibility. `{GLMMcosinor}` takes a comprehensive approach to
modeling by harnessing the `{glmmTMB}` package, that has a wide range
of available link functions, allowing users to model rhythmic data from
a wide range of distributions (for full list - see `?family` and
`?glmmTMB::family_glmmTMB`) including:

-   Binomial
-   Guassian
-   Inverse Gaussian
-   Gamma
-   Poisson
-   Negative Binomial

The table below shows what features are available within `{GLMMcosinor}`
and other methods.


![flextable methods](../man/figures/methods-table.png){width=100%}

## `cglmm()`

`cglmm()` wrangles the data appropriately to fit the cosinor model given
the formula specified by the user. It returns a model, providing
estimates of amplitude, acrophase, and MESOR (Midline Statistic Of
Rhythm).

The formula argument for `cglmm()` is specified using the `{lme4}` style
(for details see `vignette("lmer", package = "lme4")`). The only
difference is that it allows for use of `amp_acro()` within the formula
that is used to identify the cosinor (rhythmic) components and relevant
variables in the provided data. Any other combination of covariates can
also be included in the formula as well as random effects. Additionally,
zero-inflation (`ziformula`) and dispersion (`dispformula`) formulae can
be incorporated if required. For detailed examples of how to specify
these types of models, see the
[mixed-models](https://docs.ropensci.org/GLMMcosinor/articles/mixed-models.html),
[model-specification](https://docs.ropensci.org/GLMMcosinor/articles/model-specification.html)
and
[multiple-components](https://docs.ropensci.org/GLMMcosinor/articles/multiple-components.html)
vignettes.

For example, consider the following model and its output:

```{r, message=F, warning=F}
library(GLMMcosinor)

cosinor_model <- cglmm(
  vit_d ~ X + amp_acro(time, period = 12, group = "X"),
  data = vitamind
)
```

Notice how both the raw and transformed coefficients are provided as
output. The adapted `data.frame` that was used to fit the raw model can
be accessed from the model and includes `main_rrr1` and `main_sss1`
columns of data:

```{r, message=F, warning=F}
head(cosinor_model$newdata)
```

In this example, the `main` prefix indicates that this is the data for
the conditional model, as opposed to (potential) dispersion or
zero-inflation models, which have the prefixes `disp` and `zi`,
respectively. The numeric suffix, indicates that this is the data for
the first (and only) cosinor component. If there are multiple
components, the columns of data will be named accordingly.

## A basic overview of `cglmm()`

The `cglmm()` function is used to fit cosinor models to a variety of
distributions using the `glmmTMB()` function.

```{r, warning=F, message=F}
cglmm(
  formula = vit_d ~ amp_acro(time, period = 12),
  data = vitamind,
  family = gaussian
)
```

-   `formula`: A formula specifying the model structure, including the
    response variable and the cosinor components (using `amp_acro()`).
-   `data`: The `data.frame` containing the variables used in the
    formula.
-   `family`: The family of the distribution for the response variable
    (e.g., poisson, gaussian, or any family found in`?family` and
    `?glmmTMB::family_glmmTMB`)

The `amp_acro()` function is used within the formula to specify the
cosinor components. It allows you to specify the period of the rhythm
and, if necessary, the grouping structure and the number of components.
The arguments of `amp_acro()` are:

-   `group`: The name of the grouping variable in the dataset.
-   `time_col`: The name of the time column.
-   `n_components`: The number of components in the cosinor model.
-   `period`: The period(s) of the rhythm.

### Understanding the output

The most relevant output from the `cglmm()` function is likely to be the
parameter estimates for MESOR, amplitude, and acrophase under the
'Transformed Coefficients' heading. These are the recovered estimates
mentioned at the beginning of this vignette: the amplitude and phase.
The 'Raw Coefficients' are the coefficients from the cosinor model. In
this example, the `main_rrr1` and `main_sss1` correspond to $\hat\beta$
and $\hat\gamma$ in the first section, respectively.

The following example fits a grouped single-component model with a
Guassian distribution (the default).

```{r, message=F, warning=F}
cglmm(
  vit_d ~ X + amp_acro(time, period = 12, group = "X"),
  data = vitamind
)
```

Under the 'Transformed Coefficients' heading:

-   `(Intercept) = 29.6898`is the MESOR estimate of group 0

-   `[X=1] = 1.90186` is the difference between the MESOR estimates of
    group 1 and 2 \*

-   `[X=0]:amp = 6.27046` is the amplitude estimate for group 0

-   `[X=1]:amp = 8.09947` is the amplitude estimate for group 1

-   `[X=0]:acr = 1.42181` is the acrophase estimate in radians for group
    0 \*\*

-   `[X=1]:acr = 0.63715` is the acrophase estimate in radians for group
    1

\* Hence, the MESOR estimate for group 1 would be
`29.6898 + 1.90186 = 31.59166`. This is due to the behaviour of the
`glmmTMB()` function. This can be adjusted by adding a `0 +` to the
beginning of the formula:

```{r, message=F, warning=F}
cglmm(
  vit_d ~ 0 + X + amp_acro(time,
    period = 12,
    group = "X"
  ),
  data = vitamind
)
```

Note how now, `[X=1] = 31.59165` and this represents the estimate for
the MESOR for group 1, rather than the difference.

\*\* Note how the acrophase is provided in units of radians. Since the
period is 12, an acrophase of 1.42181 radians corresponds to a time of
$\frac{1.42181}{2 \pi} \times 12 = 2.715457$. This means the maximum
response occurs at 2.715 time units. We can check this visually using
the `autoplot()` function, looking at the `[X=0]` level (red line)

```{r}
cosinor_model <- cglmm(
  vit_d ~ 0 + X + amp_acro(time,
    period = 12,
    group = "X"
  ),
  data = vitamind
)
autoplot(cosinor_model, predict.ribbon = FALSE)
```

## More advanced `cglmm()` model specification

The `cglmm()` function allows you to specify different types of cosinor
models with or without grouping variables. The function can also
generate dispersion models and zero-inflation models. For more detailed
explanations and examples, see the
[model-specification](https://docs.ropensci.org/GLMMcosinor/articles/model-specification.html)
article.

Additionally, the `cglmm()` function provides more advanced
functionality for multi-component models, and detailed explanations can
be found in the
[multiple-components](https://docs.ropensci.org/GLMMcosinor/articles/multiple-components.html)
article.

The `cglmm()` function also allows mixed model specification. See the
[mixed-models](https://docs.ropensci.org/GLMMcosinor/articles/mixed-models.html)
article for more details.

## Using `summary()` and testing for differences between estimates

The `summary()` method for the outputs from `cglmm()` provides a more
detailed summary of the model and its parameter estimates and
uncertainty. It outputs the estimates, standard errors, confidence
intervals, and p-values for both the raw model parameters and the
transformed parameters. The summary statistics do not represent a
comparison between any groups for the cosinor components - that is the
role of the `test_cosinor_components()` and `test_cosinor_levels()`
functions.

Here is an example of how to use `summary()` with some simulated data:

```{r, echo=F}
withr::with_seed(
  50,
  {
    testdata_simple <- simulate_cosinor(
      1000,
      n_period = 2,
      mesor = 5,
      amp = 2,
      acro = 1,
      beta.mesor = 4,
      beta.amp = 1,
      beta.acro = 0.5,
      family = "poisson",
      period = 12,
      n_components = 1,
      beta.group = TRUE
    )
  }
)
```

```{r, eval=F}
testdata_simple <- simulate_cosinor(
  1000,
  n_period = 2,
  mesor = 5,
  amp = 2,
  acro = 1,
  beta.mesor = 4,
  beta.amp = 1,
  beta.acro = 0.5,
  family = "poisson",
  period = 12,
  n_components = 1,
  beta.group = TRUE
)
```

```{r, message=F, warning=F}
object <- cglmm(
  Y ~ group + amp_acro(times, period = 12, group = "group"),
  data = testdata_simple, family = poisson()
)
summary(object)
```

If we wanted to test the difference between the amplitude estimate for
component 1 between `group 1` and `group 2`, we can use the
`test_cosinor_levels()` function:

```{r}
test_cosinor_levels(object, x_str = "group", param = "amp")
```

The estimate here is the estimate of the difference between the inputted
values, along with its confidence interval. The real parameters for
`amp` in the first component were 2 and 1 for groups 0 and 1
respectively, and so the difference is approximately -1.

Now, consider an example where the difference is not so clear.

```{r, echo=F}
withr::with_seed(
  50,
  {
    testdata_poisson <- simulate_cosinor(100,
      n_period = 2,
      mesor = 7,
      amp = c(0.1, 0.5),
      acro = c(1, 1),
      beta.mesor = 4.4,
      beta.amp = c(0.1, 0.46),
      beta.acro = c(0.5, -1.5),
      family = "poisson",
      period = c(12, 6),
      n_components = 2,
      beta.group = TRUE
    )
  }
)
```

```{r, eval=F}
testdata_poisson <- simulate_cosinor(100,
  n_period = 2,
  mesor = 7,
  amp = c(0.1, 0.5),
  acro = c(1, 1),
  beta.mesor = 4.4,
  beta.amp = c(0.1, 0.46),
  beta.acro = c(0.5, -1.5),
  family = "poisson",
  period = c(12, 6),
  n_components = 2,
  beta.group = TRUE
)
```

```{r}
cosinor_model <- cglmm(
  Y ~ group + amp_acro(times,
    period = c(12, 6),
    n_components = 2,
    group = "group"
  ),
  data = testdata_poisson,
  family = poisson()
)
test_cosinor_levels(
  cosinor_model,
  x_str = "group",
  param = "amp",
  component_index = 1
)
```

In this example, there is no significant difference in the estimate of
`amp` for the first component between the reference group and the
comparator group. Also notice how if we are comparing between levels, we
should keep the component the same, and that is what `component_index`
sets. Likewise, when we test between components using
`test_cosinor_components()`, we can indicate which level this comparison
occurs using `level_index`. There may be multiple `groups`, in which
case we can fix the `group` using the `x_str` argument.

As an example of testing the difference between components for the same
level:

```{r}
test_cosinor_components(
  cosinor_model,
  x_str = "group",
  param = "acr",
  level_index = 1
)
```

In this situation, there is a significant difference between the
acrophase for the comparator group between its two components.

## Using `predict()`

The `predict()` method allows users to get predicted values from the
model on either the existing or new data.

```{r, eval=F}
cbind(predictions = predict(cosinor_model, type = "response"), testdata_poisson)
```

```{r, echo=F}
head(cbind(
  predictions = predict(cosinor_model, type = "response"),
  testdata_poisson
))
```

## Plotting `cglmm` objects

The `{GLMMcosinor}` package includes two ways to visualize `cglmm()`
objects. Firstly, the `autoplot()` method creates a time-response plot
of the fitted model for all groups:

```{r, message=F, warning=F}
autoplot(cosinor_model, superimpose.data = TRUE)
```

This function also allows users to superimpose the data (that was used
to fit the model) over the fitted model, using the
`superimpose.data = TRUE`, as demonstrated above. By default, the
generated plot will have x-limits corresponding to the minimum and
maximum values of the time-vector in the original dataframe, although
the x-limits can be manually defined by the user using the `xlims`
argument. The details of using the `autoplot` function are found in the
[model-visualization](https://docs.ropensci.org/GLMMcosinor/articles/model-visualizations.html)
vignette.

## References