Package: biomartr 1.0.8.9000

Hajk-Georg Drost

biomartr: Genomic Data Retrieval

Perform large-scale genomic data retrieval and functional annotation retrieval. This package aims to provide users with a standardized way to automate genome, proteome, 'RNA', coding sequence ('CDS'), 'GFF', and metagenome retrieval from the 'NCBI RefSeq', 'NCBI Genbank', 'ENSEMBL', and 'UniProt' databases. Furthermore, an interface to the 'BioMart' database (Smedley et al. (2009) <doi:10.1186/1471-2164-10-22>) allows users to retrieve functional annotation for genomic loci. In addition, users can download entire databases such as 'NCBI RefSeq' (Pruitt et al. (2007) <doi:10.1093/nar/gkl842>), 'NCBI nr', 'NCBI nt', 'NCBI Genbank' (Benson et al. (2013) <doi:10.1093/nar/gks1195>), etc. with only one command.

Authors: Hajk-Georg Drost [aut, cre], Haakon Tjeldnes [aut, ctb]

biomartr_1.0.8.9000.tar.gz
biomartr_1.0.8.9000.zip (r-4.5), biomartr_1.0.8.9000.zip (r-4.4), biomartr_1.0.8.9000.zip (r-4.3)
biomartr_1.0.8.9000.tgz (r-4.4-any), biomartr_1.0.8.9000.tgz (r-4.3-any)
biomartr_1.0.8.9000.tar.gz (r-4.5-noble), biomartr_1.0.8.9000.tar.gz (r-4.4-noble)
biomartr_1.0.8.9000.tgz (r-4.4-emscripten), biomartr_1.0.8.9000.tgz (r-4.3-emscripten)
biomartr.pdf | biomartr.html
biomartr/json (API)
NEWS

# Install 'biomartr' in R:
install.packages('biomartr', repos = c('https://ropensci.r-universe.dev', 'https://cloud.r-project.org'))
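
Once the package is installed, a typical retrieval session might look like the sketch below. It uses the exported functions `is.genome.available()`, `getGenome()`, and `read_genome()` listed on this page; the wrapper `fetch_genome()`, its default arguments, and the download folder name are hypothetical and chosen only for illustration. The calls need network access, and argument details may differ between package versions, so check the help manual before relying on it.

```r
# Hypothetical helper (not part of biomartr): download and import one genome.
# Assumes 'biomartr' is installed and that network access is available.
fetch_genome <- function(organism,
                         db = "refseq",
                         path = file.path("_ncbi_downloads", "genomes")) {
  if (!requireNamespace("biomartr", quietly = TRUE)) {
    stop("Package 'biomartr' is not installed.")
  }

  # Ask the remote database whether an assembly exists for this organism.
  available <- biomartr::is.genome.available(
    db = db, organism = organism, details = FALSE
  )
  if (!isTRUE(available)) {
    return(NULL)
  }

  # Download the genome assembly; getGenome() returns the path of the file.
  genome_file <- biomartr::getGenome(db = db, organism = organism, path = path)

  # Import the downloaded FASTA file (a Biostrings object by default).
  biomartr::read_genome(genome_file)
}
```

A call such as `fetch_genome("Saccharomyces cerevisiae")` would then return the imported assembly, or `NULL` when the organism is not found in the chosen database.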

Bug tracker: https://github.com/ropensci/biomartr/issues

biomart, genomic-data-retrieval, annotation-retrieval, database-retrieval, ncbi, ensembl, biological-data-retrieval, ensembl-servers, genome, genome-annotation, genome-retrieval, genomics, meta-analysis, metagenomics, ncbi-genbank, peer-reviewed, proteome, sequenced-genomes

11.30 score, 216 stars, 3 packages, 112 scripts, 1.5k downloads, 11 mentions, 64 exports, 79 dependencies

Last updated 5 months ago from: 9caeae524e (on master). Checks: OK: 7. Indexed: yes.

Target | Result | Date
Doc / Vignettes | OK | Oct 28 2024
R-4.5-win | OK | Oct 28 2024
R-4.5-linux | OK | Oct 28 2024
R-4.4-win | OK | Oct 28 2024
R-4.4-mac | OK | Oct 28 2024
R-4.3-win | OK | Oct 28 2024
R-4.3-mac | OK | Oct 28 2024

Exports: biomart, cachedir, cachedir_set, check_annotation_biomartr, download.database, download.database.all, ensembl_divisions, get.ensembl.info, getAssemblyStats, getAttributes, getBioSet, getCDS, getCDSSet, getCollection, getCollectionSet, getDatasets, getENSEMBLGENOMESInfo, getENSEMBLInfo, getFilters, getGenome, getGENOMEREPORT, getGenomeSet, getGFF, getGFFSet, getGO, getGroups, getGTF, getKingdomAssemblySummary, getKingdoms, getMarts, getMetaGenomeAnnotations, getMetaGenomes, getMetaGenomeSummary, getProteome, getProteomeSet, getReleases, getRepeatMasker, getRNA, getRNASet, getSummaryFile, getUniProtInfo, getUniProtSTATS, is.genome.available, listDatabases, listGenomes, listGroups, listKingdoms, listMetaGenomes, listNCBIDatabases, meta.retrieval, meta.retrieval.all, organismAttributes, organismBM, organismFilters, read_assemblystats, read_cds, read_genome, read_gff, read_proteome, read_rm, read_rna, refseqOrganisms, summary_cds, summary_genome

Dependencies: AnnotationDbi, askpass, Biobase, BiocFileCache, BiocGenerics, biomaRt, Biostrings, bit, bit64, bitops, blob, cachem, cli, clipr, cpp11, crayon, curl, data.table, DBI, dbplyr, digest, downloader, dplyr, fansi, fastmap, filelock, fs, generics, GenomeInfoDb, GenomeInfoDbData, glue, hms, httr, httr2, IRanges, jsonlite, KEGGREST, KernSmooth, lifecycle, magrittr, memoise, mime, openssl, philentropy, pillar, pkgconfig, plogr, png, poorman, prettyunits, progress, purrr, R.methodsS3, R.oo, R.utils, R6, rappdirs, Rcpp, RCurl, readr, rlang, RSQLite, S4Vectors, stringi, stringr, sys, tibble, tidyr, tidyselect, tzdb, UCSC.utils, utf8, vctrs, vroom, withr, XML, xml2, XVector, zlibbioc

Ensembl BioMart Examples

Rendered from BioMart_Examples.Rmd using knitr::rmarkdown on Oct 28 2024.

Last update: 2023-08-17
Started: 2017-03-13

Functional Annotation with biomartr

Rendered from Functional_Annotation.Rmd using knitr::rmarkdown on Oct 28 2024.

Last update: 2023-08-17
Started: 2014-11-27

Meta-Genome Retrieval

Rendered from MetaGenome_Retrieval.Rmd using knitr::rmarkdown on Oct 28 2024.

Last update: 2023-08-17
Started: 2016-10-11

NCBI Database Retrieval

Rendered from Database_Retrieval.Rmd using knitr::rmarkdown on Oct 28 2024.

Last update: 2022-02-22
Started: 2015-11-08

Sequence Retrieval

Rendered from Sequence_Retrieval.Rmd using knitr::rmarkdown on Oct 28 2024.

Last update: 2023-09-29
Started: 2014-11-27

Readme and manuals

Help Manual

Help page | Topics
Genomic Data Retrieval | biomartr-package biomartr
Main BioMart Query Function | biomart
Get directory to store back end files like kingdom summaries etc. | cachedir
Set directory to store back end files like kingdom summaries etc. | cachedir_set
Check whether an annotation file contains outlier lines | check_annotation_biomartr
Download an NCBI Database to Your Local Hard Drive | download.database
Download all elements of an NCBI database | download.database.all
List all available ENSEMBL divisions | ensembl_divisions
Helper function to retrieve species information from the ENSEMBL API | get.ensembl.info
Genome Assembly Stats Retrieval | getAssemblyStats
Retrieve All Available Attributes for a Specific Dataset | getAttributes
A wrapper to all bio getters, selected with 'type' argument | getBio
Generic Bio data set extractor | getBioSet
Coding Sequence Retrieval | getCDS
CDS retrieval of multiple species | getCDSSet
Retrieve a Collection: Genome, Proteome, CDS, RNA, GFF, Repeat Masker, AssemblyStats | getCollection
Retrieve a Collection: Genome, Proteome, CDS, RNA, GFF, Repeat Masker, AssemblyStats of multiple species | getCollectionSet
Retrieve All Available Datasets for a BioMart Database | getDatasets
Download sequence or annotation from ENSEMBL | getENSEMBL
Helper function for retrieving gtf files from ENSEMBL | getENSEMBL.gtf
Helper function for retrieving biological sequence files from ENSEMBL | getENSEMBL.Seq
Retrieve ENSEMBLGENOMES info file | getENSEMBLGENOMESInfo
Retrieve ENSEMBL info file | getENSEMBLInfo
Retrieve All Available Filters for a Specific Dataset | getFilters
Genome Retrieval | getGenome
Retrieve NCBI GENOME_REPORTS file | getGENOMEREPORT
Genome Retrieval of multiple species | getGenomeSet
Genome Annotation Retrieval (GFF3) | getGFF
GFF retrieval of multiple species | getGFFSet
Gene Ontology Query | getGO
Retrieve available groups for a kingdom of life (only available for NCBI RefSeq and NCBI Genbank) | getGroups
Genome Annotation Retrieval (GTF) | getGTF
Retrieve and summarise the assembly_summary.txt files from NCBI for all kingdoms | getKingdomAssemblySummary
Retrieve available kingdoms of life | getKingdoms
Retrieve information about available Ensembl Biomart databases | getMarts
Retrieve annotation *.gff files for metagenomes from NCBI Genbank | getMetaGenomeAnnotations
Retrieve metagenomes from NCBI Genbank | getMetaGenomes
Retrieve the assembly_summary.txt file from NCBI Genbank metagenomes | getMetaGenomeSummary
Proteome Retrieval | getProteome
Proteome retrieval of multiple species | getProteomeSet
Retrieve available database releases or versions of ENSEMBL | getReleases
Repeat Masker Retrieval | getRepeatMasker
RNA Sequence Retrieval | getRNA
RNA Retrieval of multiple species | getRNASet
Helper function to retrieve the assembly_summary.txt file from NCBI | getSummaryFile
Get UniProt info from organism | getUniProtInfo
Retrieve UniProt Database Information File (STATS) | getUniProtSTATS
Check Genome Availability | is.genome.available
Retrieve a List of Available NCBI Databases for Download | listDatabases listNCBIDatabases
List All Available Genomes either by kingdom, group, or subgroup | listGenomes
List number of available genomes in each taxonomic group | listGroups
List number of available genomes in each kingdom of life | listKingdoms
List available metagenomes on NCBI Genbank | listMetaGenomes
Perform Meta-Genome Retrieval | meta.retrieval
Perform Meta-Genome Retrieval of all organisms in all kingdoms of life | meta.retrieval.all
Retrieve Ensembl Biomart attributes for a query organism | organismAttributes
Retrieve Ensembl Biomart marts and datasets for a query organism | organismBM
Retrieve Ensembl Biomart filters for a query organism | organismFilters
Import Genome Assembly Stats File | read_assemblystats
Import CDS as Biostrings or data.table object | read_cds
Import Genome Assembly as Biostrings or data.table object | read_genome
Import GFF File | read_gff
Import Proteome as Biostrings or data.table object | read_proteome
Import Repeat Masker output file | read_rm
Import RNA as Biostrings or data.table object | read_rna
Retrieve All Organism Names Stored on RefSeq | refseqOrganisms
Retrieve summary statistics for a coding sequence (CDS) file | summary_cds
Retrieve summary statistics for a genome assembly file | summary_genome