---
title: "Using colocr"
subtitle: "An R package for conducting co-localization analysis"
author: "Mahmoud Ahmed"
date: "`r Sys.Date()`"
output:
  rmarkdown::html_vignette:
    self_contained: yes
vignette: >
  %\VignetteIndexEntry{Using colocr}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.align = 'center',
  message = FALSE
)
```

## Overview

A few R packages are available for conducting image analysis, which is a very wide topic. As a result, some of us might feel at a loss when all they want to do is a simple co-localization calculations on a small number of microscopy images. This package provides a simple straight forward workflow for loading images, choosing regions of interest (ROIs) and calculating co-localization statistics. Included in the package, is a [shiny app](https://shiny.rstudio.com) that can be invoked locally to interactively select the regions of interest in a semi-automatic way. The package is based on the R package [`imager`](https://CRAN.R-project.org/package=imager).

## Getting started

To get started, load the required packages and the images. The images below
are from [DU145](https://en.wikipedia.org/wiki/DU145) cell line and were 
stained for two proteins; [RKIP](https://en.wikipedia.org/wiki/Raf_kinase_inhibitor_protein) and [LC3](https://en.wikipedia.org/wiki/MAP1LC3B).
Then, apply the appropriate parameters for choosing the regions of interest
using the `roi_select`. Finally, check the appropriateness of the 
parameters by highlighting the ROIs on the image.

```{r getting_started, fig.width=7, fig.height=7}
# load libraries
library(colocr)

# load images
fl <- system.file('extdata', 'Image0001_.jpg', package = 'colocr')
img <- image_load(fl)

# select ROI and show the results
par(mfrow = c(2,2), mar = rep(1, 4))

img %>%
  roi_select(threshold = 90) %>%
  roi_show()
```

The same can be achieved interactively using an accompanying **shiny** app.
To launch the app run.

```{r colocr_app, eval=FALSE}
colocr_app()
```

The rest of the analysis depends on the particular kind of images. Now, `colocr`
implements two simple co-localization statistics; Pearson's Coefficient Correlation [(PCC)](https://www.ncbi.nlm.nih.gov/pubmed/20653013) and the Manders Overlap Coefficient [(MOC)](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3074624/).

To apply both measures of correlation, we first get the pixel intensities and call `roi_test` on the merge image.

```{r call_roi_test}
# calculate co-localization statistics
img %>%
  roi_select(threshold = 90) %>%
  roi_test(type = 'both')
```

The same analysis and more can be conducted using a web interface for the package available [here](https://mahshaaban.shinyapps.io/colocr_app2/)

## Detailed Example

The following example uses images from the [DU145](https://en.wikipedia.org/wiki/DU145) prostate cancer cell line. In this experiment, the cell line was treated with probes for two proteins [RKIP](https://en.wikipedia.org/wiki/Raf_kinase_inhibitor_protein) and [LC3](https://en.wikipedia.org/wiki/MAP1LC3B). The aim of this experiment is to determine, how much of the two proteins are co-localized or co-distributed in the particular cell line.  

```{r show_images, fig.width=7, fig.height=2.5}
# load required libraries
library(colocr)

# load images
img <- image_load(system.file('extdata', 'Image0001_.jpg', package = 'colocr'))       # merge
img1 <- imager::channel(img, 1)  # red
img2 <- imager::channel(img, 2)  # green

# show images
par(mfrow = c(1,3), mar = rep(1,4))
plot(img, axes = FALSE, main = 'Merged')
plot(img1, axes = FALSE, main = 'RKIP')
plot(img2, axes = FALSE, main = 'LC3')
```

The `colocr` package provides a straight forward workflow for determining the amount of co-localization. This workflow consists of two steps only:

  * Choosing regions of interest (ROI)
  * Calculating the correlation between the pixel intensities

The first step can be achieved by calling `roi_select` on the image. In addition, `roi_show` can be used to visualize the regions that were selected to make sure they
match the expectations. Similarly, `roi_check` can be used to visualize the 
pixel intensities of the selected regions. The second step is calling `roi_test`
to calculate the co-localization statistics.

The calls to these functions can be piped using `%>%` to reduce the amount of typing
and make the code more readable.  

```{r workflow, echo=FALSE, out.width='300px'}
workflow_fig <- system.file('colocr_app', 'workflow.png', package = 'colocr')
knitr::include_graphics(workflow_fig)
```

### Choosing ROIs

The function `roi_select` relies on different algorithms from the `imager` package. However, using the functions to select the ROIs doesn't require any background knowledge in the workings of the algorithms and can be done through trying different parameters and choosing the most appropriate ones. Typically, one wants to select the 
regions of the image occupied by a cell or a group of cells. However, the package can
also be used to select certain areas/structures within the cell if they are distinct 
enough. By default, the largest contiguous region of the image is selected, more regions can be added using the argument `n`. The details of the other inputs are documented in the function help page `?roi_select`.
 
```{r call_roi_select}
# select regions of interest
img_rois <- img %>%
  roi_select(threshold = 90)
```

This function returns `cimg` object containing the original input image and an added attribute called `label` which indicates the selected regions. `label` is a vector of `integer`s; with 0 indicating the non-selected pixels and 1 for the selected regions. When the argument `n` is provided to `roi_select`, 1 is replaced by `integer` labels for each of the selected regions separately.

```{r output_str}
# class of the returned object
class(img); class(img_rois)

# name of added attribut
names(attributes(img)); names(attributes(img_rois))

# str of labels
label <- attr(img_rois, 'label')
str(label)

# unique labels 
unique(label)
```

Now, to make sure these parameters are appropriately encompassing the ROIs, call the `roi_show` to visualize side by side the original merge picture, a low resolution picture of the ROI and the images from the two different channels highlighted by the ROIs.

```{r call_roi_show, fig.width=7, fig.height=7}
# select ROI and show the results
par(mfrow = c(2,2), mar = rep(1, 4))

img_rois %>%
  roi_show()
```

Both the co-localization statistics implemented in this package quantify different aspects of the linear trend between the pixel intensities from the two channels of the image. Therefore, it is useful to visualize this trend and the distribution of the intensities to make sure the analysis is appropriate.

```{r call_coloc_show, fig.width=7, fig.height=3}
# show the scatter and density of the pixel values
par(mfrow=c(1,2), mar = c(4,4,1,1))

img_rois %>%
  roi_check()
```

Arguably, selecting the regions of interest is the most time consuming step in this kind of analysis. Usually, one has to do this selection by hand when using image analysis software such as [imageJ](https://imagej.nih.gov/ij/). This package only semi-automates this step, but still relies on the user's judgment on which parameters to use and whether or not the selected ROIs are appropriate. To make life easier, the package provides a simple shiny app to interactively determine these parameters and use it in the rest of the workflow. To launch the app run the following

```{r colocr_app2, eval=FALSE}
# run the shiny app
colocr_app()
```

And here is a screen shot from the app after applying the same parameters used previously.

```{r app_screenshot, echo=FALSE, out.width="600px"}
# show the screen shot of the shiny app
app_ss <- system.file('colocr_app/tests/one_image-expected', '001.png', package = 'colocr')
knitr::include_graphics(app_ss)
```

Although this app was designed to be invoked from within the package to help the users to choose the selection parameters interactively, it's a stand alone app and can run the same analysis described here. The app can be accessed from the web [here](https://mahshaaban.shinyapps.io/colocr_app2/)

### Calculating Correlation Statistics

The two different statistics implemented in this package are the PCC and SCC. The formal description and the rational for using them is detailed elsewhere. Invoking the test is a one function call on the selected regions of interest. 

```{r call_roi_test2}
# Calculate the co-localization statistics
tst <- img_rois %>%
  roi_test(type = 'both')
tst
```

`roi_test` returns a `data.frame` with a column for each of the desired statistics. When `n` is used in the selection of the regions of interest, a separate row is returned for each 
region.

```{r output_str2}
# str of the roi_test output
str(tst)
```

## Selecting a multiple regions in an image

In this example, we want to select the all cells as regions of interest 
and make sure no background is included in the selected region. To do that, we
use the different arguments in `roi_select`.

```{r roi_subset, fig.width=7, fig.height=7}
# load image
img2 <- image_load(system.file('extdata', 'Image0003_.jpg', package = 'colocr'))       # merge

# select ROI and show the results
par(mfrow = c(2,2), mar = rep(1, 4))

img2 %>%
  roi_select(threshold = 85,
             shrink = 10,
             clean = 10,
             n = 3) %>%
  roi_show()
```

## Analyzing a collection of images at once

To process a collection of images at once, we first make a `list` of the image objects
and pass it to the `roi_select`. Other arguments can be provided as a single value to
be applied to all images or specific values for each image.

```{r analyze_collection}
# make a list of images
fls <- c(system.file('extdata', 'Image0001_.jpg', package = 'colocr'),
         system.file('extdata', 'Image0003_.jpg', package = 'colocr'))
image_list <- image_load(fls)

# call roi_select on multiple images
image_list %>%
  roi_select(threshold = 90) %>%
  roi_test()
```

```{r analyze_collection2}
# make threshold input list
thresholds <- c(90, 95)

# call roi_select on multiple images and specific thresholds for each
image_list %>%
  roi_select(threshold = thresholds) %>%
  roi_test()
```

The same applies of the other two functions; `roi_show` and `roi_check`. When a `list`
of images is provided, they return same set of plots for each of the images.

## Reproducing the shiny app output

The interactive shiny app can be useful when trying the analysis for the first time.
Or when choosing the appropriate input values to select the regions of interest. 
To make things easier for the users, we enabled exporting the input and the output 
from the shiny app. These can be used to report an analysis directly, or to 
reproduce the results using R code.

This section describes how to use the exported input and output tables from the app
and using them to reproduce the same results. First, we start by loading `inputs_19.05.23_05.55.22.csv` which is the output of an analysis of two images 
performed by in the shiny app. Second, we load the `inputs_18.09.02_05.15.08.csv`
which contains the input parameters used in that analysis. Then, we call `roi_select`
and `roi_test` using these inputs. Finally, we compare the outputs form the app
and the script.

```{r read_output}
# show the output
stats <- read.csv(system.file('colocr_app', 'stats_19.05.23_05.55.21.csv', package = 'colocr'))

stats
```

```{r read_inputs}
# show the inputs
inputs <- read.csv(system.file('colocr_app', 'inputs_19.05.23_05.55.22.csv', package = 'colocr'), stringsAsFactors = FALSE)

inputs
```

```{r apply_inputs}
# read images
fls <- lapply(inputs$image, function(x) {
  system.file('extdata', x, package = 'colocr')
  })
imgs <- image_load(fls)

# use the app input to the roi_select function
rep_stats <- imgs %>%
  roi_select(threshold = inputs$threshold,
             shrink = inputs$shrink,
             grow = inputs$grow,
             fill = inputs$fill,
             clean = inputs$clean,
             tolerance = inputs$tolerance,
             n = inputs$roi_num) %>%
  roi_test(type = 'both')

rep_stats
```

```{r check_equal}
# check the app and the package output is equal
all.equal(round(stats$pcc, 2), round(c(rep_stats[[1]]$pcc, rep_stats[[2]]$pcc), 2))
all.equal(round(stats$moc, 2), round(c(rep_stats[[1]]$moc, rep_stats[[2]]$moc), 2))
```

## Description of the co-localization statistics

The following is a brief discussion of the theory and interpretations of different statistics that we used in this package as a measure of co-localization. For thorough and formal details, check this article by [Dunn et al.](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3074624/).

### PCC

Pearson's correlation coefficient is the co-variance of the pixel intensity from the two channels. The mean of the intensities is subtracted from each pixel which makes the coefficient independent of the background level.

The PCC is calculated as follows

$$
PCC = \frac{\sum_i{(R_i-\bar R)\times(G_i-\bar G)}}{\sqrt{\sum_i{(R_i-\bar R)^2\times\sum_i(G_i-\bar G)^2}}}
$$

Where $R_i$ and the $G_i$ is the intensities of the red and green channels and the $\bar R$ and $\bar G$ are the average intensities.

The values of PCC are between 1 and -1 for perfect correlations in the positive and negative directions respectively and 0 means no correlation.

### MOC

On the other hand, the Manders Overlap Coefficient doesn't require subtraction of the mean. Therefore, the values are always between 0 and 1. Also, the MOC is independent from signal proportionality. 

$$
MOC = \frac{\sum_i{(R_i\times G_i)}}{\sqrt{\sum_i{R_i^2\times\sum_i G_i^2}}}
$$
Where $R_i$ and the $G_i$ is the intensities of the red and green channels.

## Colocalization Benchmark Source

> The Colocalization Benchmark Source (CBS) is a free collection of downloadable images to test and validate the degree of colocalization of markers in fluorescence microscopy studies. It consists of computer-simulated images with exactly known (pre-defined) values of colocalization ranging from 0% to 90%. They can be downloaded as sets as well as separately.

Here, we used two examples from the CBS to test the results of the analysis using `colocr`.

```{r example1_roi, fig.height=4, fig.width=4}
# load image
fl1 <- system.file('extdata', 'example1.png', package = 'colocr')
ex1 <- image_load(fl1)

# select and show regions of interest (based on the app)
par(mfrow=c(2,2), mar = rep(1, 4))

ex1 %>%
  roi_select(threshold = 90, shrink = 8, n = 50) %>%
  roi_show()
```

```{r example1_test}
# calculate co-localization statistics
# (expected pcc = 0.68, moc = 0.83)
ex1 %>%
  roi_select(threshold = 90, shrink = 8, n = 50) %>%
  roi_test(type = 'both') %>%
  colMeans()
```

```{r example3_roi, fig.height=4, fig.width=4}
# load image
fl3 <- system.file('extdata', 'example3.png', package = 'colocr')
ex3 <- image_load(fl3)

# select and show regioins of interest (based on the app)
par(mfrow=c(2,2), mar = rep(1, 4))

ex3 %>%
  roi_select(threshold = 90, shrink = 9, grow = 1, fill = 10, clean = 10, n = 20) %>%
  roi_show()
```

```{r example3_test}
# calculate co-localization statistics
# (expected pcc = 0.61, moc = 0.71)
ex3 %>%
  roi_select(threshold = 90, shrink = 9, grow = 1, fill = 10, clean = 10, n = 20) %>%
  roi_test() %>%
  colMeans()
```

## Acknowledgement

* The vignette images from [Lai Huyen Trang](https://www.researchgate.net/profile/Lai_Huyen_Trang)  
* The test examples from [Colocalization Benchmark Source (CBS)](https://www.colocalization-benchmark.com/index.html)  
* The implementation of the co-localization statistics from [Dunn et al.](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3074624/)