--- title: "Phylogenetics with phruta" vignette: > %\VignetteIndexEntry{Phylogenetics with phruta} \usepackage[utf8]{inputenc} %\VignetteEngine{knitr::rmarkdown} output: # pdf_document: # highlight: null # number_sections: yes knitr:::html_vignette: toc: true fig_caption: yes --- # Table of Contents 1. [Installing RAxML](#paragraph1) 2. [Installing PATHd-8 and treePL](#paragraph2) 3. [Phylogenetic inference with `phruta` and `RAxML`](#paragraph3) 3. [Tree dating in `phruta`](#paragraph4) ## Installing RAxML In `MacOS`, `RAxML` can be easily installed to the `PATH` using one of the two lines below in `conda`: ```bash conda install -c bioconda/label/cf201901 raxml ``` ```bash conda install -c bioconda raxml ``` For other `OS` (Windows, Linux), please follow the instructions listed in the official `RAxML` [website](https://cme.h-its.org/exelixis/web/software/raxml/) Once `RAxML` has been installed to your computer, open `R` and make sure that the following line doesn't throw an error. ```r system("raxmlHPC") ``` Depending on how `RAxML` was installed, you may want to check if `RAxML` is called from the terminal using `raxmlHPC` or `raxmlHPC`. This string needs to be passed to `tree.raxml()` using the argument `raxml_exec`. Please note that this argument corresponds to the `exec` argument in `ips::raxml()`. Finally, note that `RStudio` sometimes has issues finding *stuff* in the path while using `system()`. If you're using `macOS`, try starting `RStudio` from the command line by running the following line: ```bash open /Applications/RStudio.app ``` In other `OS`, it might be better to simply avoid using `RStudio` if you're interested in running the phylogenetic functions in `phruta`. ## Installing PATHd-8 and treePL There are excellent guides for installing `PATHd-8` and `treePL`. Here, I summarize two potentially relevant options. First, you can use [Brian O'Meara's](https://github.com/bomeara/phydocker/blob/master/Dockerfile) approach for installing `PATHd-8` in MacOs and linux. I summarize the code in the following [link](https://gist.github.com/cromanpa94/a43bc710a17220f71d796d6590ea7fe4). For Windows users, please use the compiled version of the software provided in the following [link](https://www2.math.su.se/PATHd8/). Second, you can use homebrew to install `treePL` (Windows, MacOS, and Linux), thanks to Jonathan Chang. ```bash brew install brewsci/bio/treepl ``` Please check the following [link](https://docs.brew.sh/Homebrew-on-Linux)) if you're interested in running `brew` from Windows and Linux. ## Phylogenetic inference with `phruta` and `RAxML` Phylogenetic inference is conducted using the `tree.raxml()` function. We need to indicate where the aligned sequences are located (`folder` argument), the patterns of the files in the same folder (`FilePatterns` argument; "`Masked_`" in our case). We'll run a total of 100 boostrap replicates and set the outgroup to "Manis_pentadactyla". ```r tree.raxml(folder='2.Alignments', FilePatterns= 'Masked_', raxml_exec='raxmlHPC', Bootstrap=100, outgroup ="Manis_pentadactyla") ``` The trees are saved in `3.Phylogeny`. Likely, the bipartitions tree, "RAxML_bipartitions.phruta", is the most relevant. `3.Phylogeny` also includes additional `RAxML`-related input and output files. Finally, let's perform tree dating in our phylogeny using secondary calibrations extracted from [Scholl and Wiens (2016)](https://royalsocietypublishing.org/doi/pdf/10.1098/rspb.2016.1334). This study curated potentially the most comprenhensive and reliable set of trees to summarize the temporal dimension in evolution across the tree of life. In `phruta`, the trees from Scholl and Wiens (2016) were renamed to match taxonomic groups. ## Tree dating in `phruta` Tree dating is performed using the `tree.dating()` function in `phruta`. We have to provide the name of the folder containing the `1.Taxonomy.csv` file created in `sq.curate()`. We also have to indicate the name of the folder containing the `RAxML_bipartitions.phruta` file. We will scale our phylogeny using `treePL`. ```r tree.dating(taxonomyFolder="1.CuratedSequences", phylogenyFolder="3.Phylogeny", scale='treePL') ``` Running this line will result in a new folder `4.Timetree`, including the different time-calibrated phylogenies obained (if any) and associated secondary calibrations used in the analyses. We found only a few overlapping calibration points (family-level constraints): Here's the resulting time-calibrated phylogeny. The whole process took \~20 minutes to complete on my computer (16 gb RAM, i5). ![Time-calibrated phylogeny for the target taxa. Analyses performed within the phruta R package.](tree.png)