--- title: "GLMMcosinor" author: "Oliver Jayasinghe and Rex Parsons" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{GLMMcosinor} %\VignetteEncoding{UTF-8} %\VignetteEngine{knitr::rmarkdown} bibliography: REFERENCES.bib editor_options: markdown: wrap: 72 --- ```{r, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) ``` ```{r srr, eval = FALSE, echo = FALSE} #' @srrstats {G1.0} #' @srrstats {G1.1} #' @srrstats {G1.3} ``` ## An brief introduction to the cosinor model A cosinor model aims to model the amplitude ($A$), acrophase ($\phi$), and MESOR ($M$) of a rhythmic dataset. - MESOR ($M$) is the Midline Estimating Statistic of Rhythm, and may also be referred to as the equilibrium point. - Amplitude ($A$) is the difference between the MESOR and the maximum height of the rhythm. - Acrophase ($\phi$) is the phase at which the maximal response occurs. These could be modelled using a cosine function: $$Y(t) = M + Acos(\frac{2\pi t}{\tau} - \phi) + e(t)$$ where $e(t)$ is the error term. However, these cannot be estimated using a linear modelling framework! Other packages, including [`{circacompare}`](https://cran.r-project.org/package=circacompare) [@parsons2020circacompare], fit this exact nonlinear model but most packages (including this one) decomposes this into linear terms, creating the cosinor model: $$Y(t) = M + \beta x + \gamma z + e(t)$$ Where $x =cos(\frac{2\pi t}{τ})$, $z =sin(\frac{2\pi t}{τ})$, $\beta = A cos(\phi)$, $\gamma = A sin(\phi)$ This linear model is passed to the package [`{glmmTMB}`](https://cran.r-project.org/package=glmmTMB) [@brooks2017glmmTMB] in `lme4` syntax. If the model has no random effects, `glmmTMB` uses maximum likelihood estimation to estimate the linear coefficients of the model. For models with random effects, a Laplace approximation is used to integrate over the random effects. This approximation is handled by the [`{TMB}`](https://cran.r-project.org/package=TMB)[@Kristensen2016TMB] package which uses automatic differentiation of the joint likelihood function to provide fast computations of parameter estimates. A detailed explanation of this process is described [here](https://doi.org/10.18637/jss.v070.i05) [@Kristensen2016TMB] `glmmTMB` returns the estimates of the linear coefficients from the linear model. To recover the estimates of the original parameters for amplitude ($A$) and acrophase ($\phi$), the estimates for $\hat\beta$ and $\hat\gamma$ must be transformed as per the following equations: $$\hat\phi = \arctan(\frac{\hat\gamma}{\hat\beta}) $$ $$\hat A = \sqrt{\hat\beta ^2 + \hat\gamma ^ 2}$$ These transformed parameters for acrophase and amplitude, along with MESOR are returned as part of the `cglmm` output. For a more thorough introduction to cosinor modelling, see [here](https://doi.org/10.1186%2F1742-4682-11-16) [@cornelissen2014cosinor]. ## Introduction `{GLMMcosinor}` allows the user to fit generalised linear models based on rhythmic data with a cosinor model. It allows users to summarise, predict, and plot these models too. Existing packages have focused primarily on Gaussian data. Some circadian regression modelling packages have allowed users to specify generalised linear models, but with limited flexibility. `{GLMMcosinor}` takes a comprehensive approach to modelling by harnessing the `{glmmTMB}` package, that has a wide range of available link functions, allowing users to model rhythmic data from a wide range of distributions (for full list - see `?family` and `?glmmTMB::family_glmmTMB`) including: - Binomial - Guassian - Inverse Gaussian - Gamma - Poisson - Negative Binomial The table below shows what features are available within `{GLMMcosinor}` and other methods. ![flextable methods](../man/figures/methods-table.png){width=100%} ## `cglmm()` `cglmm()` wrangles the data appropriately to fit the cosinor model given the formula specified by the user. It returns a model, providing estimates of amplitude, acrophase, and MESOR (Midline Statistic Of Rhythm). The formula argument for `cglmm()` is specified using the `{lme4}` style (for details see `vignette("lmer", package = "lme4")`). The only difference is that it allows for use of `amp_acro()` within the formula that is used to identify the cosinor (rhythmic) components and relevant variables in the provided data. Any other combination of covariates can also be included in the formula as well as random effects. Additionally, zero-inflation (`ziformula`) and dispersion (`dispformula`) formulae can be incorporated if required. For detailed examples of how to specify these types of models, see the [mixed-models](https://docs.ropensci.org/GLMMcosinor/articles/mixed-models.html), [model-specification](https://docs.ropensci.org/GLMMcosinor/articles/model-specification.html) and [multiple-components](https://docs.ropensci.org/GLMMcosinor/articles/multiple-components.html) vignettes. For example, consider the following model and its output: ```{r, message=F, warning=F} library(GLMMcosinor) cosinor_model <- cglmm( vit_d ~ X + amp_acro(time, period = 12, group = "X"), data = vitamind ) ``` Notice how both the raw and transformed coefficients are provided as output. The adapted `data.frame` that was used to fit the raw model can be accessed from the model and includes `main_rrr1` and `main_sss1` columns of data: ```{r, message=F, warning=F} head(cosinor_model$newdata) ``` In this example, the `main` prefix indicates that this is the data for the conditional model, as opposed to (potential) dispersion or zero-inflation models, which have the prefixes `disp` and `zi`, respectively. The numeric suffix, indicates that this is the data for the first (and only) cosinor component. If there are multiple components, the columns of data will be named accordingly. ## A basic overview of `cglmm()` The `cglmm()` function is used to fit cosinor models to a variety of distributions using the `glmmTMB()` function. ```{r, warning=F, message=F} cglmm( formula = vit_d ~ amp_acro(time, period = 12), data = vitamind, family = gaussian ) ``` - `formula`: A formula specifying the model structure, including the response variable and the cosinor components (using `amp_acro()`). - `data`: The `data.frame` containing the variables used in the formula. - `family`: The family of the distribution for the response variable (e.g., poisson, gaussian, or any family found in`?family` and `?glmmTMB::family_glmmTMB`) The `amp_acro()` function is used within the formula to specify the cosinor components. It allows you to specify the period of the rhythm and, if necessary, the grouping structure and the number of components. The arguments of `amp_acro()` are: - `group`: The name of the grouping variable in the dataset. - `time_col`: The name of the time column. - `n_components`: The number of components in the cosinor model. - `period`: The period(s) of the rhythm. ### Understanding the output The most relevant output from the `cglmm()` function is likely to be the parameter estimates for MESOR, amplitude, and acrophase under the 'Transformed Coefficients' heading. These are the recovered estimates mentioned at the beginning of this vignette: the amplitude and phase. The 'Raw Coefficients' are the coefficients from the cosinor model. In this example, the `main_rrr1` and `main_sss1` correspond to $\hat\beta$ and $\hat\gamma$ in the first section, respectively. The following example fits a grouped single-component model with a Guassian distribution (the default). ```{r, message=F, warning=F} cglmm( vit_d ~ X + amp_acro(time, period = 12, group = "X"), data = vitamind ) ``` Under the 'Transformed Coefficients' heading: - `(Intercept) = 29.6898`is the MESOR estimate of group 0 - `[X=1] = 1.90186` is the difference between the MESOR estimates of group 1 and 2 \* - `[X=0]:amp = 6.27046` is the amplitude estimate for group 0 - `[X=1]:amp = 8.09947` is the amplitude estimate for group 1 - `[X=0]:acr = 1.42181` is the acrophase estimate in radians for group 0 \*\* - `[X=1]:acr = 0.63715` is the acrophase estimate in radians for group 1 \* Hence, the MESOR estimate for group 1 would be `29.6898 + 1.90186 = 31.59166`. This is due to the behaviour of the `glmmTMB()` function. This can be adjusted by adding a `0 +` to the beginning of the formula: ```{r, message=F, warning=F} cglmm( vit_d ~ 0 + X + amp_acro(time, period = 12, group = "X" ), data = vitamind ) ``` Note how now, `[X=1] = 31.59165` and this represents the estimate for the MESOR for group 1, rather than the difference. \*\* Note how the acrophase is provided in units of radians. Since the period is 12, an acrophase of 1.42181 radians corresponds to a time of $\frac{1.42181}{2 \pi} \times 12 = 2.715457$. This means the maximum response occurs at 2.715 time units. We can check this visually using the `autoplot()` function, looking at the `[X=0]` level (red line) ```{r} cosinor_model <- cglmm( vit_d ~ 0 + X + amp_acro(time, period = 12, group = "X" ), data = vitamind ) autoplot(cosinor_model, predict.ribbon = FALSE) ``` ## More advanced `cglmm()` model specification The `cglmm()` function allows you to specify different types of cosinor models with or without grouping variables. The function can also generate dispersion models and zero-inflation models. For more detailed explanations and examples, see the [model-specification](https://docs.ropensci.org/GLMMcosinor/articles/model-specification.html) article. Additionally, the `cglmm()` function provides more advanced functionality for multi-component models, and detailed explanations can be found in the [multiple-components](https://docs.ropensci.org/GLMMcosinor/articles/multiple-components.html) article. The `cglmm()` function also allows mixed model specification. See the [mixed-models](https://docs.ropensci.org/GLMMcosinor/articles/mixed-models.html) article for more details. ## Using `summary()` and testing for differences between estimates The `summary()` method for the outputs from `cglmm()` provides a more detailed summary of the model and its parameter estimates and uncertainty. It outputs the estimates, standard errors, confidence intervals, and p-values for both the raw model parameters and the transformed parameters. The summary statistics do not represent a comparison between any groups for the cosinor components - that is the role of the `test_cosinor_components()` and `test_cosinor_levels()` functions. Here is an example of how to use `summary()` with some simulated data: ```{r, echo=F} withr::with_seed( 50, { testdata_simple <- simulate_cosinor( 1000, n_period = 2, mesor = 5, amp = 2, acro = 1, beta.mesor = 4, beta.amp = 1, beta.acro = 0.5, family = "poisson", period = 12, n_components = 1, beta.group = TRUE ) } ) ``` ```{r, eval=F} testdata_simple <- simulate_cosinor( 1000, n_period = 2, mesor = 5, amp = 2, acro = 1, beta.mesor = 4, beta.amp = 1, beta.acro = 0.5, family = "poisson", period = 12, n_components = 1, beta.group = TRUE ) ``` ```{r, message=F, warning=F} object <- cglmm( Y ~ group + amp_acro(times, period = 12, group = "group"), data = testdata_simple, family = poisson() ) summary(object) ``` If we wanted to test the difference between the amplitude estimate for component 1 between `group 1` and `group 2`, we can use the `test_cosinor_levels()` function: ```{r} test_cosinor_levels(object, x_str = "group", param = "amp") ``` The estimate here is the estimate of the difference between the inputted values, along with its confidence interval. The real parameters for `amp` in the first component were 2 and 1 for groups 0 and 1 respectively, and so the difference is approximately -1. Now, consider an example where the difference is not so clear. ```{r, echo=F} withr::with_seed( 50, { testdata_poisson <- simulate_cosinor(100, n_period = 2, mesor = 7, amp = c(0.1, 0.5), acro = c(1, 1), beta.mesor = 4.4, beta.amp = c(0.1, 0.46), beta.acro = c(0.5, -1.5), family = "poisson", period = c(12, 6), n_components = 2, beta.group = TRUE ) } ) ``` ```{r, eval=F} testdata_poisson <- simulate_cosinor(100, n_period = 2, mesor = 7, amp = c(0.1, 0.5), acro = c(1, 1), beta.mesor = 4.4, beta.amp = c(0.1, 0.46), beta.acro = c(0.5, -1.5), family = "poisson", period = c(12, 6), n_components = 2, beta.group = TRUE ) ``` ```{r} cosinor_model <- cglmm( Y ~ group + amp_acro(times, period = c(12, 6), n_components = 2, group = "group" ), data = testdata_poisson, family = poisson() ) test_cosinor_levels( cosinor_model, x_str = "group", param = "amp", component_index = 1 ) ``` In this example, there is no significant difference in the estimate of `amp` for the first component between the reference group and the comparator group. Also notice how if we are comparing between levels, we should keep the component the same, and that is what `component_index` sets. Likewise, when we test between components using `test_cosinor_components()`, we can indicate which level this comparison occurs using `level_index`. There may be multiple `groups`, in which case we can fix the `group` using the `x_str` argument. As an example of testing the difference between components for the same level: ```{r} test_cosinor_components( cosinor_model, x_str = "group", param = "acr", level_index = 1 ) ``` In this situation, there is a significant difference between the acrophase for the comparator group between its two components. ## Using `predict()` The `predict()` method allows users to get predicted values from the model on either the existing or new data. ```{r, eval=F} cbind(predictions = predict(cosinor_model, type = "response"), testdata_poisson) ``` ```{r, echo=F} head(cbind( predictions = predict(cosinor_model, type = "response"), testdata_poisson )) ``` ## Plotting `cglmm` objects The `{GLMMcosinor}` package includes two ways to visualise `cglmm()` objects. Firstly, the `autoplot()` method creates a time-response plot of the fitted model for all groups: ```{r, message=F, warning=F} autoplot(cosinor_model, superimpose.data = TRUE) ``` This function also allows users to superimpose the data (that was used to fit the model) over the fitted model, using the `superimpose.data = TRUE`, as demonstrated above. By default, the generated plot will have x-limits corresponding to the minimum and maximum values of the time-vector in the original dataframe, although the x-limits can be manually defined by the user using the `xlims` argument. The details of using the `autoplot` function are found in the [model-visualisations](https://docs.ropensci.org/GLMMcosinor/articles/model-visualisations.html) vignette. ## References