--- title: "Accessing Publication Data" author: "Helen Miller" date: "2022-06-15" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Accessing Publication Data} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- DataSpace maintains a curated collection of relevant publications, which can be accessed through the [Publications page](https://dataspace.cavd.org/cds/CAVD/app.view?#learn/learn/Publication) through the app. Some publications laos include datasets which can be downloaded as a zip file. `DataSpaceR` provides an interface for browsing publications in DataSpace and downloading publication data where available. ## Browsing publications in DataSpace The `DataSpaceConnection` object includes methods for browsing and downloading publications and publication data. ```r library(DataSpaceR) library(data.table) con <- connectDS() con #> #> URL: https://dataspace.cavd.org #> User: jmtaylor@scharp.org #> Available studies: 273 #> - 77 studies with data #> - 5049 subjects #> - 423195 data points #> Available groups: 3 #> Available publications: 1530 #> - 12 publications with data ``` The `DataSpaceConnection` print method summarizes the publications and publication data. More details about publications can be accessed through `con$availablePublications`. ```r knitr::kable(head(con$availablePublications[, -"link"])) ``` |publication_id |first_author |all_authors |title |journal |publication_date |pubmed_id |related_studies |studies_with_data |publication_data_available | |:--------------|:-------------|:------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:------------------------------------------------------------------------------------------------------------------------------------------------|:---------------------------|:----------------|:---------|:---------------|:-----------------|:--------------------------| |1006 |Abbink P |Abbink P, Maxfield LF, Ng'ang'a D, Borducchi EN, Iampietro MJ, Bricault CA, Teigler JE, Blackmore S, Parenteau L, Wagh K, Handley SA, Zhao G, Virgin HW, Korber B, Barouch DH |Construction and evaluation of novel rhesus monkey adenovirus vaccine vectors |J Virol |2015 Feb |25410856 |NA |NA |FALSE | |1303 |Abbott RK |Abbott RK, Lee JH, Menis S, Skog P, Rossi M, Ota T, Kulp DW, Bhullar D, Kalyuzhniy O, Havenar-Daughton C, Schief WR, Nemazee D, Crotty S |Precursor frequency and affinity determine B Cell competitive fitness in germinal centers, tested with germline-targeting HIV vaccine immunogens |Immunity |2018 Jan 16 |29287996 |NA |NA |FALSE | |1136 |Abu-Raddad LJ |Abu-Raddad LJ, Boily MC, Self S, Longini IM Jr |Analytic insights into the population level impact of imperfect prophylactic HIV vaccines |J Acquir Immune Defic Syndr |2007 Aug 1 |17554215 |NA |NA |FALSE | |1485 |Abu-Raddad LJ |Abu-Raddad LJ, Barnabas RV, Janes H, Weiss HA, Kublin JG, Longini IM Jr, Wasserheit JN, HIV Viral Load Working Group |Have the explosive HIV epidemics in sub-Saharan Africa been driven by higher community viral load? |AIDS |2013 Mar 27 |23196933 |NA |NA |FALSE | |842 |Acharya P |Acharya P, Tolbert WD, Gohain N, Wu X, Yu L, Liu T, Huang W, Huang CC, Kwon YD, Louder RK, Luongo TS, McLellan JS, Pancera M, Yang Y, Zhang B, Flinko R, Foulke JS Jr, Sajadi MM, Kamin-Lewis R, Robinson JE, Martin L, Kwong PD, Guan Y, DeVico AL, Lewis GK, Pazgier M |Structural definition of an antibody-dependent cellular cytotoxicity response implicated in reduced risk for HIV-1 infection |J Virol |2014 Nov |25165110 |NA |NA |FALSE | |1067 |Ackerman ME |Ackerman ME, Mikhailova A, Brown EP, Dowell KG, Walker BD, Bailey-Kellogg C, Suscovich TJ, Alter G |Polyfunctional HIV-specific antibody responses are associated with spontaneous HIV control |PLOS Pathog |2016 Jan |26745376 |NA |NA |FALSE | This table summarizes all publications, providing some information like first author, journal where it was published, and title as a `data.table`. It also includes a pubmed url where available under `link`. Related studies under `related_studies`, and related studies with data available under `studies_with_data`. We can use `data.table` methods to filter and sort this table to browse available publications. For example, we can filter this table to view only publications related to a particular study: ```r vtn096_pubs <- con$availablePublications[grepl("vtn096", related_studies)] knitr::kable(vtn096_pubs[, -"link"]) ``` |publication_id |first_author |all_authors |title |journal |publication_date |pubmed_id |related_studies |studies_with_data |publication_data_available | |:--------------|:------------|:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:--------------------|:----------------|:---------|:----------------------|:-----------------|:--------------------------| |1466 |Cram JA |Cram JA, Fiore-Gartland AJ, Srinivasan S, Karuna S, Pantaleo G, Tomaras GD, Fredricks DN, Kublin JG |Human gut microbiota is associated with HIV-reactive immunoglobulin at baseline and following HIV vaccination |PLoS One |2019 |31869338 |vtn096 |NA |TRUE | |250 |Huang Y |Huang Y, DiazGranados C, Janes H, Huang Y, deCamp A, Metch B, Grant S, Sanchez B, Phogat S, Koutsoukos M, Kanesa-Thasan N, Bourguignon P, Collard A, Buchbinder S, Tomaras GD, McElrath J, Gray G, Kublin J, Corey L, Gilbert PB |Selection of HIV vaccine candidates for concurrent testing in an efficacy trial |Curr Opin Virol |2016 Jan 28 |26827165 |mrv144, vtn096 |NA |FALSE | |267 |Huang Y |Huang Y, Zhang L, Janes H, Frahm N, Isaacs A, Kim JH, Montefiori D, McElrath MJ, Tomaras GD, Gilbert PB |Predictors of durable immune responses six months after the last vaccination in preventive HIV vaccine trials |Vaccine |2017 Feb 22 |28131393 |mrv144, vtn096, vtn205 |NA |FALSE | |268 |Huang Y |Huang Y, Gilbert P, Fu R, Janes H |Statistical methods for down-selection of treatment regimens based on multiple endpoints, with application to HIV vaccine trials |Biostatistics |2017 Apr 1 |27649715 |mrv144, vtn096 |NA |FALSE | |1392 |Pantaleo G |Pantaleo G, Janes H, Karuna S, Grant S, Ouedraogo GL, Allen M, Tomaras GD, Frahm N, Montefiori DC, Ferrari G, Ding S, Lee C, Robb ML, Esteban M, Wagner R, Bart PA, Rettby N, McElrath MJ, Gilbert PB, Kublin JG, Corey L, NIAID HIV Vaccine Trials Network |Safety and immunogenicity of a multivalent HIV vaccine comprising envelope protein with either DNA or NYVAC vectors (HVTN 096): a phase 1b, double-blind, placebo-controlled trial |Lancet HIV |2019 Oct 7 |31601541 |mrv144, vtn096, vtn100 |NA |TRUE | |1420 |Westling T |Westling T, Juraska M, Seaton KE, Tomaras GD, Gilbert PB, Janes H |Methods for comparing durability of immune responses between vaccine regimens in early-phase trials |Stat Methods Med Res |2019 Jan 9 |30623732 |vtn094, vtn096 |NA |FALSE | |281 |Yates NL |Yates NL, deCamp AC, Korber BT, Liao HX, Irene C, Pinter A, Peacock J, Harris LJ, Sawant S, Hraber P, Shen X, Rerks-Ngarm S, Pitisuttithum P, Nitayapan S, Berman PW, Robb ML, Pantaleo G, Zolla-Pazner S, Haynes BF, Alam SM, Montefiori DC, Tomaras GD |HIV-1 envelope glycoproteins from diverse clades differentiate antibody responses and durability among vaccinees |J Virol |2018 Mar 28 |29386288 |mrv144, vtn096 |NA |FALSE | or publications that have related studies with integrated data in DataSpace: ```r pubs_with_study_data <- con$availablePublications[!is.na(studies_with_data)] knitr::kable(head(pubs_with_study_data[, -"link"])) ``` |publication_id |first_author |all_authors |title |journal |publication_date |pubmed_id |related_studies |studies_with_data |publication_data_available | |:--------------|:-----------------|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:---------------------------------------------------------------------------------------------------------------------------------------------|:------------------|:----------------|:---------|:------------------------------------------------------------------------------|:-----------------|:--------------------------| |1540 |Andersen-Nissen E |Andersen-Nissen E, Fiore-Gartland A, Ballweber Fleming L, Carpp LN, Naidoo AF, Harper MS, Voillet V, Grunenberg N, Laher F, Innes C, Bekker L-G, Kublin JG, Huang Y, Ferrari G, Tomaras GD, Gray G, Gilbert PB, McElrath MJ |Innate immune signatures to a partially-efficacious HIV vaccine predict correlates of HIV-1 infection risk |PLOS Pathog |2021 |NA |vtn097 |vtn097 |TRUE | |213 |Andrasik MP |Andrasik MP, Yoon R, Mooney J, Broder G, Bolton M, Votto T, Davis-Vogel A; HVTN 505 study team; NIAID HIV Vaccine Trials Network |Exploring barriers and facilitators to participation of male-to-female transgender persons in preventive HIV vaccine clinical trials. |Prev Sci |2014 Jun |23446435 |vtn505 |vtn505 |FALSE | |218 |Andrasik MP |Andrasik MP, Karuna ST, Nebergall M, Koblin BA, Kublin JG; NIAID HIV Vaccine Trials Network |Behavioral risk assessment in HIV Vaccine Trials Network (HVTN) clinical trials: A qualitative study exploring HVTN staff perspectives |Vaccine |2013 Sep 13 |23859840 |vtn069, vtn404, vtn502, vtn503, vtn504, vtn505, vtn802, vtn903, vtn906, vtn907 |vtn505 |FALSE | |225 |Andrasik MP |Andrasik MP, Chandler C, Powell B, Humes D, Wakefield S, Kripke K, Eckstein D |Bridging the divide: HIV prevention research and Black men who have sex with men |Am J Public Health |2014 Apr |24524520 |vtn505 |vtn505 |FALSE | |226 |Arnold MP |Arnold MP, Andrasik M, Landers S, Karuna S, Mimiaga MJ, Wakefield S, Mayer K, Buchbinder S, Koblin BA |Sources of racial/ethnic differences in HIV vaccine trial awareness: Exposure, attention, or both? |Am J Public Health |2014 Aug |24922153 |vtn505 |vtn505 |FALSE | |7 |Asbach B |Asbach B, Kliche A, Köstler J, Perdiguero B, Esteban M, Jacobs BL, Montefiori DC, LaBranche CC, Yates NL, Tomaras GD, Ferrari G, Foulds KE, Roederer M, Landucci G, Forthal DN, Seaman MS, Hawkins N, Self SG, Sato A, Gottardo R, Phogat S, Tartaglia J, Barnett SW, Burke B, Cristillo AD, Weiss DE, Francis J, Galmin L, Ding S, Heeney JL, Pantaleo G, Wagner R |Potential to streamline heterologous DNA prime and NYVAC/protein boost HIV vaccine regimens in rhesus macaques by employing improved antigens |J Virol |2016 Mar 28 |26865719 |cvd277, cvd281 |cvd277, cvd281 |FALSE | We can also use this information to connect to related studies and pull integrated data. Say we are interested in this Rouphael (2019) publication: ```r rouphael2019 <- con$availablePublications[first_author == "Rouphael NG"] knitr::kable(rouphael2019[, -"link"]) ``` |publication_id |first_author |all_authors |title |journal |publication_date |pubmed_id |related_studies |studies_with_data |publication_data_available | |:--------------|:------------|:--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:---------------------------------------------------------------------------------------------|:-------------|:----------------|:---------|:---------------|:-----------------|:--------------------------| |1390 |Rouphael NG |Rouphael NG, Morgan C, Li SS, Jensen R, Sanchez B, Karuna S, Swann E, Sobieszczyk ME, Frank I, Wilson GJ, Tieu HV, Maenza J, Norwood A, Kobie J, Sinangil F, Pantaleo G, Ding S, McElrath MJ, De Rosa SC, Montefiori DC, Ferrari G, Tomaras GD, Keefer MC, HVTN 105 Protocol Team and the NIAID HIV Vaccine Trials Network |DNA priming and gp120 boosting induces HIV-specific antibodies in a randomized clinical trial |J Clin Invest |2019 Sep 30 |31566579 |vtn105 |vtn105 |TRUE | We can find this publication in the available publications table, determine related studies, and pull data for those studies where available. ```r related_studies <- rouphael2019$related_studies related_studies #> [1] "vtn105" rouphael2019_study <- con$getStudy(related_studies) rouphael2019_study #> #> Study: vtn105 #> URL: https://dataspace.cavd.org/CAVD/vtn105 #> Available datasets: #> - Binding Ab multiplex assay #> - Demographics #> - Intracellular Cytokine Staining #> - Neutralizing antibody #> Available non-integrated datasets: dim(rouphael2019_study$availableDatasets) #> [1] 4 4 ``` We can see that there are datasets available for this study. We can pull any of them using `rouphael2019_study$getDataset()`. ## Downloading Publication Data DataSpace also includes publication datasets for some publications. The format of this data will vary from publication to publication, and is stored in a zip file. The `publication_data_available` field specifies publications where publication data is available. ```r pubs_with_data <- con$availablePublications[publication_data_available == TRUE] knitr::kable(head(pubs_with_data[, -"link"])) ``` |publication_id |first_author |all_authors |title |journal |publication_date |pubmed_id |related_studies |studies_with_data |publication_data_available | |:--------------|:-----------------|:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:---------------|:----------------|:---------|:---------------|:-----------------|:--------------------------| |1540 |Andersen-Nissen E |Andersen-Nissen E, Fiore-Gartland A, Ballweber Fleming L, Carpp LN, Naidoo AF, Harper MS, Voillet V, Grunenberg N, Laher F, Innes C, Bekker L-G, Kublin JG, Huang Y, Ferrari G, Tomaras GD, Gray G, Gilbert PB, McElrath MJ |Innate immune signatures to a partially-efficacious HIV vaccine predict correlates of HIV-1 infection risk |PLOS Pathog |2021 |NA |vtn097 |vtn097 |TRUE | |136 |Bekker LG |Bekker LG, Moodie Z, Grunenberg N, Laher F, Tomaras GD, Cohen KW, Allen M, Malahleha M, Mngadi K, Daniels B, Innes C, Bentley C, Frahm N, Morris DE, Morris L, Mkhize NN, Montefiori DC, Sarzotti-Kelsoe M, Grant S, Yu C, Mehra VL, Pensiero MN, Phogat S, DiazGranados CA, Barnett SW, Kanesa-thasan N, Koutsoukos M, Michael NL, Robb ML, Kublin JG, Gilbert PB, Corey L, Gray GE, McElrath MJ on behalf of the HVTN 100 Protocol Team |A phase 1/2 HIV-1 vaccine trial of a Subtype C ALVAC-HIV and Bivalent Subtype C gp120/MF59 vaccine regimen in low risk HIV uninfected South African adults |Lancet HIV |2018 Jul |29898870 |mrv144, vtn100 |NA |TRUE | |1466 |Cram JA |Cram JA, Fiore-Gartland AJ, Srinivasan S, Karuna S, Pantaleo G, Tomaras GD, Fredricks DN, Kublin JG |Human gut microbiota is associated with HIV-reactive immunoglobulin at baseline and following HIV vaccination |PLoS One |2019 |31869338 |vtn096 |NA |TRUE | |80 |Fong Y |Fong Y, Shen X, Ashley VC, Deal A, Seaton KE, Yu C, Grant SP, Ferrari G, deCamp AC, Bailer RT, Koup RA, Montefiori D, Haynes BF, Sarzotti-Kelsoe M, Graham BS, Carpp LN, Hammer SM, Sobieszczyk M, Karuna S, Swann E, DeJesus E, Mulligan M, Frank I, Buchbinder S, Novak RM, McElrath MJ, Kalams S, Keefer M, Frahm NA, Janes HE, Gilbert PB, Tomaras GD |Modification of the association between T-Cell immune responses and human immunodeficiency virus type 1 infection risk by vaccine-induced antibody responses in the HVTN 505 trial |J Infect Dis |2018 Mar 28 |29325070 |vtn505 |vtn505 |TRUE | |79 |Hammer SM |Hammer SC, Sobieszczyk ME, Janes H, Karuna ST, Mulligan MJ, Grove D, Koblin BA, Buchbinder SP, Keefer MC, Tomaras GD, Frahm N, Hural J, Anude C, Graham BS, Enama ME, Adams E, DeJesus E, Novak RM, Frank I, Bentley C, Ramirez S, Fu R, Koup RA, Mascola JR, Nabel GJ, Montefiori DC, Kublin J, McElrath MJ, Corey L, Gilbert PB |Efficacy trial of a DNA/rAd5 HIV-1 preventive vaccine |N Engl J Med |2013 Nov 28 |24099601 |vtn505 |vtn505 |TRUE | |1467 |Hosseinipour MC |Hosseinipour MC, Innes C, Naidoo S, Mann P, Hutter J, Ramjee G, Sebe M, Maganga L, Herce ME, deCamp AC, Marshall K, Dintwe O, Andersen-Nissen E, Tomaras GD, Mkhize N, Morris L, Jensen R, Miner MD, Pantaleo G, Ding S, Van Der Meeren O, Barnett SW, McElrath MJ, Corey L, Kublin JG |Phase 1 HIV vaccine trial to evaluate the safety and immunogenicity of HIV subtype C DNA and MF59-adjuvanted subtype C Env protein |Clin Infect Dis |2020 Jan 4 |31900486 |vtn111 |NA |TRUE | Data for a publication can be accessed through `DataSpaceR` with `con$downloadPublicationData()`. The publication ID must be specified, as found under `publication_id` in `con$availablePublications`. The file is presented as a zip file. The `unzip` argument gives us the option whether to unzip this file. By default, the file will be unzipped. You may also specify the directory to download the file. By default, it will be saved to your `Downloads` directory. This function invisibly returns the paths to the downloaded files. Here, we download data for publication with ID 1461 (Westling, 2020), and view the resulting downloads. ```r publicationDataFiles <- con$downloadPublicationData("1461", outputDir = tempdir(), unzip = TRUE, verbose = TRUE) basename(publicationDataFiles) #> [1] "causal.isoreg.fns.R" "CD.SuperLearner.R" #> [3] "cd4_analysis.R" "cd4_data.csv" #> [5] "cd8_analysis.R" "cd8_data.csv" #> [7] "README.txt" "Westling_1461_file_format.pdf" ``` All zip files will include a file format document as a PDF, as well as a README. These documents will give an overview of the remaining contents of the files. In this case, data is separated by CD8+ T-cell responses and CD4+ T-cell responses, as described in the `README.txt`. ```r cd4 <- fread(publicationDataFiles[grepl("cd4_data", publicationDataFiles)]) cd4 #> pub_id age sex bmi prot num_vacc dose response sexFemale #> 1: 069-071 23 Male 33.31000 vtn069 3 4.0 0 0 #> 2: 069-068 20 Female 32.74000 vtn069 3 4.0 1 1 #> 3: 069-020 19 Male 20.24000 vtn069 3 4.0 0 0 #> 4: 069-008 19 Male 29.98000 vtn069 3 4.0 0 0 #> 5: 069-058 27 Male 31.48000 vtn069 3 4.0 0 0 #> --- #> 368: 100-182 36 Female 24.32323 vtn100 4 1.5 0 1 #> 369: 100-034 20 Male 25.15590 vtn100 4 1.5 1 0 #> 370: 100-115 20 Male 22.94812 vtn100 4 1.5 1 0 #> 371: 100-144 19 Male 20.95661 vtn100 4 1.5 0 0 #> 372: 100-189 24 Male 19.03114 vtn100 4 1.5 1 0 #> studyHVTN052 studyHVTN068 studyHVTN069 studyHVTN204 studyHVTN100 vacc_type #> 1: 0 0 1 0 0 VRC4or6 #> 2: 0 0 1 0 0 VRC4or6 #> 3: 0 0 1 0 0 VRC4or6 #> 4: 0 0 1 0 0 VRC4or6 #> 5: 0 0 1 0 0 VRC4or6 #> --- #> 368: 0 0 0 0 1 VRC4or6 #> 369: 0 0 0 0 1 VRC4or6 #> 370: 0 0 0 0 1 VRC4or6 #> 371: 0 0 0 0 1 VRC4or6 #> 372: 0 0 0 0 1 VRC4or6 #> vacc_typeSinglePlasmid vacc_typeVRC4or6 #> 1: 0 1 #> 2: 0 1 #> 3: 0 1 #> 4: 0 1 #> 5: 0 1 #> --- #> 368: 0 1 #> 369: 0 1 #> 370: 0 1 #> 371: 0 1 #> 372: 0 1 ``` This publication also includes analysis scripts used for the publication, which can allow users to reproduce the analysis and results. ## Session information ```r sessionInfo() #> R version 4.1.2 (2021-11-01) #> Platform: x86_64-pc-linux-gnu (64-bit) #> Running under: Ubuntu 18.04.5 LTS #> #> Matrix products: default #> BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1 #> LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1 #> #> locale: #> [1] LC_CTYPE=en_US.utf8 LC_NUMERIC=C #> [3] LC_TIME=en_US.utf8 LC_COLLATE=en_US.utf8 #> [5] LC_MONETARY=en_US.utf8 LC_MESSAGES=en_US.utf8 #> [7] LC_PAPER=en_US.utf8 LC_NAME=C #> [9] LC_ADDRESS=C LC_TELEPHONE=C #> [11] LC_MEASUREMENT=en_US.utf8 LC_IDENTIFICATION=C #> #> attached base packages: #> [1] stats graphics grDevices utils datasets methods base #> #> other attached packages: #> [1] data.table_1.14.2 DataSpaceR_0.7.5 knitr_1.37 #> #> loaded via a namespace (and not attached): #> [1] Rcpp_1.0.8 digest_0.6.29 assertthat_0.2.1 R6_2.5.1 #> [5] jsonlite_1.8.0 magrittr_2.0.2 evaluate_0.15 highr_0.9 #> [9] httr_1.4.2 stringi_1.7.6 curl_4.3.2 tools_4.1.2 #> [13] stringr_1.4.0 Rlabkey_2.8.3 xfun_0.29 compiler_4.1.2 ```