--- title: "Primers and probes calibration vignette" author: "Edward Wallace" date: "April 2022" output: rmarkdown::html_vignette: toc: true toc_depth: 2 vignette: > %\VignetteIndexEntry{Primers and probes calibration vignette} %\VignetteEngine{knitr::rmarkdown} \usepackage[utf8]{inputenc} --- # Summary: calibrating primer sets from a real experimental test This vignette shows how to use tidyqpcr functions to calibrate qPCR probes. This is real qPCR data by Edward Wallace in Feb 2019, testing new RT-qPCR primer sets against _S. cerevisiae_ genes. We took exponential-phase total RNA previously extracted by Jamie Auxillos. We tested 2-3 primer sets each for 7 genes: * ECM38/YLR299W (3 primer sets) * FET5/YFL041W (3 primer sets) * GPT2/YKR067W * ILV5/YLR355C * NRD1/YNL251C * RDL1/YOR285W * TFS1/YLR178C We started with two biological replicate RNA samples, treated with DNase and then split for a test sample with reverse transcriptase (RT) and negative control without reverse transcriptase (-RT). We also took a no template (NT) negative control. For each RT reaction we do serial 5x dilutions down to 125x to form a quantitative calibration curve. The data were measured on a Roche LC480 instrument in a single 384-well plate. Quantification was performed in the Roche LightCycler software prior to exporting the data. ## Setup knitr options and load packages ```{r setup,warning=FALSE,message=FALSE} # knitr options for report generation knitr::opts_chunk$set( warning = FALSE, message = FALSE, echo = TRUE, cache = FALSE, results = "show" ) # Load packages library(tidyr) library(ggplot2) library(dplyr) library(tidyqpcr) # set default theme for graphics theme_set(theme_bw(base_size = 11) %+replace% theme( strip.background = element_blank(), panel.grid = element_blank() )) ``` # Set up experiment ## Describe which primer set we put in which well using a row key In this experiment, each primer set was in a different row of the 384-well plate. We describe this by creating a row key, a data frame describing the rows of the plate, the primer sets, and the genes that they detect. We refer to each primer set as a `target_id`, because each primer set has a different target amplicon, on a different location in a gene. tidyqpcr insists on having a variable called `target_id` that uniquely identifies each different target that you detect. ```{r create_rowkey,dependson="plate_functions"} # Names of target genes gene_name_levels <- c("ECM38", "FET5", "GPT2", "ILV5", "NRD1", "RDL1", "TFS1") # ORF ids of target genes target_levels <- c("YLR299W", "YFL041W", "YKR067W", "YLR355C", "YNL251C", "YOR285W", "YLR178C") # Repeats of gene names to account for testing multiple primer sets gene_name_values <- c(rep(gene_name_levels[1:2], each = 3), rep(gene_name_levels[3:7], each = 2)) # id numbers of multiple probesets (reflecting IDs as ordered) target_id_levels <- paste(gene_name_values, c(1, 2, 3, 1, 3, 4, 1, 4, 1, 4, 1, 2, 4, 5, 1, 5), sep = "_" ) rowkey <- tibble( well_row = LETTERS[1:16], gene_name = factor(gene_name_values, levels = gene_name_levels), target_id = factor(target_id_levels, levels = target_id_levels) ) print(rowkey) ``` ## Combine the row key describing primer sets with column key describing on samples and dilutions. We use a default design built in to tidyqpcr, `create_colkey_4diln_2ctrl_in_24`. ```{r label_plates,dependson=c("plate_functions","create_rowkey")} plate1plan <- label_plate_rowcol( create_blank_plate(), rowkey, create_colkey_4diln_2ctrl_in_24() ) %>% mutate(sample_id = paste(biol_rep, dilution_nice, sep = "_")) ``` ## Spot-check the plate plan Checks that for selected technical replicate/probe/dilution combinations, the plate contains the right number of replicates. ```{r print_techreps,results="show"} plate1plan %>% filter(tech_rep == "1", target_id == target_id_levels[1], dilution_nice == "1x") plate1plan %>% filter(tech_rep == "2", target_id == target_id_levels[4]) ``` ## Display the plate plan This can be printed out to facilitate loading the plate correctly. ```{r display_plates,fig.height=8,fig.width=12,dependson="label_plates"} display_plate_qpcr(plate1plan) ``` # Analyse Cq (quantification cycle count) data ## Load and summarize data ```{r load_plates,dependson="label_plates",results="show"} # NOTE: system.file() accesses data from this R package # To use your own data, remove the call to system.file(), # instead pass your data's filename to read_lightcycler_1colour_cq() # or to another relevant read_ function file_path_cq <- system.file("extdata", "Edward_qPCR_Nrd1_calibration_2019-02-02_Cq.txt.gz", package = "tidyqpcr") plates <- file_path_cq %>% read_lightcycler_1colour_cq() %>% right_join(plate1plan, by = "well") ``` ```{r show_plates,dependson="load_plates",results="show"} plates summary(plates) ``` ## Visualise Cq values for each well. Visualising the Cq values shows that the Cq value is different for each primer set in each row. Within each section of a row for a single replicate of dilutions, Cq consistently increases with dilutions as expected. The grey tiles for most -RT and NT columns mean that the value is `NA`, i.e. no Cq value was reported. This is good. ```{r check_edge_effects,fig.height=4,fig.width=6,dependson="load_plates",results="show"} display_plate_value(plates) ``` Visualisation might also help to identify unwanted positional effects. For example, a PCR machine is broken, wells close to an edge of the plate can have different behaviour from central wells. ## Plot unnormalized data shows that -RT and NT controls are low This plot visualises the Cq data in a way that highlights the meaning instead of the position on the plate. Again, it shows that the Cq value is different for each primer set, and that for each primer st Cq consistently increases with dilutions as expected. Again, we detect no signal in NT (no template) negative control so those points are mostly missing. There is a very weak signal with high Cq in some -RT (no reverse transcriptase) negative controls. ```{r plot_unnormalized,dependson="load_plates",fig.height=6,fig.width=9} ggplot(data = plates) + geom_point(aes(x = target_id, y = cq, colour = dilution_nice, shape = prep_type), position = position_jitter(width = 0.2, height = 0) ) + labs( y = "Cycle count to threshold", title = "All reps, unnormalized" ) + facet_wrap(~biol_rep) + scale_y_continuous(breaks = seq(from = 15, to = 35, by = 5)) + theme(axis.text.x = element_text(angle = 90, vjust = 0.5), panel.grid.major.y = element_line(colour="grey80", size = 0.2)) ``` ## Dilution series is linear for all probes Visual display of linearity of cq with log(dilution). ```{r plot_dilutions,dependson="load_plates",fig.height=11,fig.width=6} ggplot(data = filter(plates, prep_type == "+RT"), aes(x = dilution, y = cq)) + geom_point() + stat_smooth( formula = y ~ x, method = "lm", se = FALSE, aes(colour = "fit", linetype = "fit") ) + stat_smooth( formula = y ~ 1 + offset(-x * log(10) / log(2)), method = "lm", se = FALSE, aes(colour = "theory", linetype = "theory") ) + scale_x_log10(breaks = 10 ^ - (0:3)) + scale_y_continuous(breaks = seq(0, 30, 2)) + labs( y = "Cycle count to threshold", title = "All reps, unnormalized", colour = "Dilution", linetype = "Dilution" ) + facet_grid(target_id ~ biol_rep, scales = "free_y") + theme(axis.text.x = element_text(angle = 90, vjust = 0.5)) ``` ## Calculate primer efficiencies for all probes Use regression to estimate linearity of cq with log(dilution), including the slope or efficiency. ```{r calibrate_dilutions,dependson="load_plates",results="show"} calculate_efficiency_bytargetid(plates) ``` ## Dilution series for nice probes only shows linearity clearly ```{r plot_dilutions_nice,dependson="load_plates",fig.height=6,fig.width=4} target_id_levels_niceprobes <- c("ECM38_3", "FET5_1", "GPT2_4", "ILV5_4", "NRD1_1", "RDL1_4", "TFS1_1") ggplot( data = filter(plates, prep_type == "+RT", target_id %in% target_id_levels_niceprobes), aes(x = dilution, y = cq) ) + geom_point() + stat_smooth( formula = y ~ x, method = "lm", se = FALSE, aes(colour = "fit", linetype = "fit") ) + stat_smooth( formula = y ~ 1 + offset(-x * log(10) / log(2)), method = "lm", se = FALSE, aes(colour = "theory", linetype = "theory") ) + scale_x_log10(breaks = 10 ^ - (0:3)) + scale_y_continuous(breaks = seq(0, 30, 2)) + labs( y = "Cycle count to threshold", title = "All reps, unnormalized", colour = "Dilution", linetype = "Dilution" ) + facet_grid(target_id ~ biol_rep, scales = "free_y") + theme(axis.text.x = element_text(angle = 90, vjust = 0.5)) ``` # Analyse amplification and melt curve data ## Load raw data for amplification and melt curves. ```{r load_amp,dependson="label_plates",results="show"} # NOTE: system.file() accesses data from this R package # To use your own data, remove the call to system.file(), # instead pass your data's filename to read_lightcycler_1colour_cq() # or to another relevant read_ function file_path_raw <- system.file("extdata/Edward_qPCR_Nrd1_calibration_2019-02-02.txt.gz", package = "tidyqpcr") plate1curve <- file_path_raw %>% read_lightcycler_1colour_raw() %>% debaseline() %>% left_join(plate1plan, by = "well") # amplification curve is program 2 platesamp <- plate1curve %>% filter(program_no == 2) # melt curve is program 3 or 4, depending on cycler setting platesmelt <- plate1curve %>% filter(program_no == 3) %>% calculate_drdt_plate() %>% filter(temperature >= 61) ``` ## Plot de-baseline'd raw data for single well ```{r plotamp_A1,dependson="load_amp",fig.width=4,fig.height=3} ggplot( data = platesamp %>% filter(well == "A1"), aes(x = cycle, y = fluor_signal) ) + geom_line() + scale_y_continuous(expand = c(0.01, 0.01)) ``` ## Plot all amplification curves Broken up by technical replicate here, to avoid overplotting. ```{r plotamp_all,dependson="load_amp",fig.height=11,fig.width=7} ggplot( data = platesamp %>% filter(tech_rep == "1"), aes(x = cycle, y = fluor_signal, colour = factor(dilution), linetype = prep_type) ) + facet_grid(target_id ~ biol_rep, scales = "free_y") + scale_linetype_manual(values = c("+RT" = "solid", "-RT" = "dashed", "NT" = "dotted")) + geom_line() + scale_x_continuous(breaks = seq(60, 100, 10), minor_breaks = seq(60, 100, 5)) + labs(title = "All Amp Curves, tech_rep A") ggplot( data = platesamp %>% filter(tech_rep == "2"), aes(x = cycle, y = fluor_signal, colour = factor(dilution), linetype = prep_type) ) + facet_grid(target_id ~ biol_rep, scales = "free_y") + scale_linetype_manual(values = c("+RT" = "solid", "-RT" = "dashed", "NT" = "dotted")) + geom_line() + scale_x_continuous(breaks = seq(60, 100, 10), minor_breaks = seq(60, 100, 5)) + labs(title = "All Amp Curves, tech_rep B") ``` ## Plot melt curve for single well ```{r plotmelt_A1,dependson="load_amp",fig.width=4,fig.height=1.5} ggplot( data = platesmelt %>% filter(well == "A1"), aes(x = temperature, y = dRdT) ) + facet_wrap(~target_id) + geom_line() + scale_y_continuous(expand = c(0.02, 0.02)) ``` ## Plot all melt curves Again broken up by technical replicate. ```{r plotmelt_all,dependson="load_amp",fig.height=11,fig.width=7} ggplot( data = platesmelt %>% filter(tech_rep == "1"), aes(x = temperature, y = dRdT, colour = factor(dilution), linetype = prep_type) ) + facet_grid(target_id ~ biol_rep, scales = "free_y") + scale_linetype_manual(values = c("+RT" = "solid", "-RT" = "dashed", "NT" = "dotted")) + geom_line() + scale_x_continuous(breaks = seq(60, 100, 10), minor_breaks = seq(60, 100, 5)) + labs(title = "All Melt Curves, tech_rep A") ggplot( data = platesmelt %>% filter(tech_rep == "2"), aes(x = temperature, y = dRdT, colour = factor(dilution), linetype = prep_type) ) + facet_grid(target_id ~ biol_rep, scales = "free_y") + scale_linetype_manual(values = c("+RT" = "solid", "-RT" = "dashed", "NT" = "dotted")) + geom_line() + scale_x_continuous(breaks = seq(60, 100, 10), minor_breaks = seq(60, 100, 5)) + labs(title = "All Melt Curves, tech_rep B") ``` ## Plot zoomed melt curves ```{r plotmelt_zoomed,dependson="load_amp",fig.height=11,fig.width=7} ggplot( data = platesmelt %>% filter(tech_rep == "1", prep_type == "+RT"), aes(x = temperature, y = dRdT, colour = factor(dilution)) ) + facet_grid(target_id ~ biol_rep, scales = "free_y") + geom_line() + scale_x_continuous( breaks = seq(60, 100, 5), minor_breaks = seq(60, 100, 1), limits = c(73, 87) ) + labs(title = "Melt curves, zoomed, tech_rep A") + theme( panel.grid.major.x = element_line(colour = "grey50", size = 0.4), panel.grid.minor.x = element_line(colour = "grey70", size = 0.1) ) ggplot( data = platesmelt %>% filter(tech_rep == "2", prep_type == "+RT"), aes(x = temperature, y = dRdT, colour = factor(dilution)) ) + facet_grid(target_id ~ biol_rep, scales = "free_y") + geom_line() + scale_x_continuous( breaks = seq(60, 100, 5), minor_breaks = seq(60, 100, 1), limits = c(73, 87) ) + labs(title = "Melt curves, zoomed, tech_rep B") + theme( panel.grid.major.x = element_line(colour = "grey50", size = 0.4), panel.grid.minor.x = element_line(colour = "grey70", size = 0.1) ) ``` ## Plot only zoomed melt curves for nice probes ```{r plotmelt_zoomed_nice,dependson="load_amp",fig.height=6,fig.width=4} ggplot( data = platesmelt %>% filter( tech_rep == "1", prep_type == "+RT", dilution_nice == "1x", target_id %in% target_id_levels_niceprobes ), aes(x = temperature, y = dRdT, colour = biol_rep) ) + facet_grid(target_id ~ ., scales = "free_y") + geom_line() + scale_x_continuous( breaks = seq(60, 100, 5), minor_breaks = seq(60, 100, 1), limits = c(73, 87) ) + labs(title = "Nice probes, tech_rep A") + theme( panel.grid.major.x = element_line(colour = "grey50", size = 0.4), panel.grid.minor.x = element_line(colour = "grey70", size = 0.1) ) ``` # Conclude acceptable primer sets These probes have sensible standard curves, amplification curves, melt curves. In tie-breakers we pick the more highly detected probe. * ECM38 set 3 * FET5 set 1 or 4 * GPT2 set 4 * ILV5 set 4 * NRD1 set 1 or 2 * RDL1 set 4 * TFS1 set 1