Package: biomartr 1.0.11

Hajk-Georg Drost

biomartr: Genomic Data Retrieval

Perform large scale genomic data retrieval and functional annotation retrieval. This package aims to provide users with a standardized way to automate genome, proteome, 'RNA', coding sequence ('CDS'), 'GFF', and metagenome retrieval from 'NCBI RefSeq', 'NCBI Genbank', 'ENSEMBL', and 'UniProt' databases. Furthermore, an interface to the 'BioMart' database (Smedley et al. (2009) <doi:10.1186/1471-2164-10-22>) allows users to retrieve functional annotation for genomic loci. In addition, users can download entire databases such as 'NCBI RefSeq' (Pruitt et al. (2007) <doi:10.1093/nar/gkl842>), 'NCBI nr', 'NCBI nt', 'NCBI Genbank' (Benson et al. (2013) <doi:10.1093/nar/gks1195>), etc. with only one command.

Authors: Hajk-Georg Drost [aut, cre], Haakon Tjeldnes [aut, ctb]

biomartr_1.0.11.tar.gz
biomartr_1.0.11.zip (r-4.7), biomartr_1.0.11.zip (r-4.6), biomartr_1.0.11.zip (r-4.5)
biomartr_1.0.11.tgz (r-4.6-any), biomartr_1.0.11.tgz (r-4.5-any)
biomartr_1.0.11.tar.gz (r-4.7-any), biomartr_1.0.11.tar.gz (r-4.6-any)
biomartr_1.0.11.tgz (r-4.6-emscripten)

# Install 'biomartr' in R:
install.packages('biomartr', repos = c('https://ropensci.r-universe.dev', 'https://cloud.r-project.org'))
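
Once the package is installed, a typical single-genome retrieval runs in three steps: check availability, download, import. The sketch below is one way to chain the exported functions `is.genome.available()`, `getGenome()`, and `read_genome()`. The helper name `fetch_genome()` and the default download folder are illustrative assumptions, not part of the package, and the helper only accepts an organism given as a character string.

```r
# Hypothetical helper (not part of biomartr): check, download, and import
# one genome assembly. Calling it needs biomartr and an internet connection.
fetch_genome <- function(organism, db = "refseq",
                         path = file.path("_ncbi_downloads", "genomes")) {
  # Validate input first so obvious mistakes fail before any download starts.
  stopifnot(is.character(organism), length(organism) == 1L, nzchar(organism))

  if (!requireNamespace("biomartr", quietly = TRUE)) {
    stop("Package 'biomartr' is required.", call. = FALSE)
  }

  # Step 1: is an assembly for this organism listed in the chosen database?
  available <- biomartr::is.genome.available(db = db, organism = organism)
  if (!isTRUE(available)) {
    stop("No genome found for '", organism, "' in '", db, "'.", call. = FALSE)
  }

  # Step 2: download the assembly; the path of the downloaded file is returned.
  genome_file <- biomartr::getGenome(db = db, organism = organism, path = path)

  # Step 3: import the downloaded file into the R session.
  biomartr::read_genome(file = genome_file)
}

# Example call (downloads data, so it is left commented out):
# genome <- fetch_genome("Saccharomyces cerevisiae")
```

Keeping the availability check separate from the download means a misspelled organism name fails quickly instead of after a long transfer.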

Reviews: rOpenSci Software Review #93

Bug tracker: https://github.com/ropensci/biomartr/issues

Pkgdown/docs site: https://docs.ropensci.org

Topics: biomart, genomic-data-retrieval, annotation-retrieval, database-retrieval, ncbi, ensembl, biological-data-retrieval, ensembl-servers, genome, genome-annotation, genome-retrieval, genomics, meta-analysis, metagenomics, ncbi-genbank, peer-reviewed, proteome, sequenced-genomes

11.35 score, 225 stars, 3 packages, 182 scripts, 1.0k downloads, 11 mentions, 65 exports, 70 dependencies

Last updated from: 5affece173 (on master). Checks: 7 WARNING, 3 OK. Indexed: yes.

Target | Result | Time
linux-devel-x86_64 | WARNING | 233
pkgdown docs | OK | 276
source / vignettes | OK | 303
linux-release-x86_64 | WARNING | 223
macos-release-arm64 | WARNING | 142
macos-oldrel-arm64 | WARNING | 147
windows-devel | WARNING | 176
windows-release | WARNING | 179
windows-oldrel | WARNING | 165
wasm-release | OK | 176

Exports: biomart, cachedir, cachedir_set, check_annotation_biomartr, clean.retrieval, download.database, download.database.all, ensembl_divisions, get.ensembl.info, getAssemblyStats, getAttributes, getBioSet, getCDS, getCDSSet, getCollection, getCollectionSet, getDatasets, getENSEMBLGENOMESInfo, getENSEMBLInfo, getFilters, getGenome, getGENOMEREPORT, getGenomeSet, getGFF, getGFFSet, getGO, getGroups, getGTF, getKingdomAssemblySummary, getKingdoms, getMarts, getMetaGenomeAnnotations, getMetaGenomes, getMetaGenomeSummary, getProteome, getProteomeSet, getReleases, getRepeatMasker, getRNA, getRNASet, getSummaryFile, getUniProtInfo, getUniProtSTATS, is.genome.available, listDatabases, listGenomes, listGroups, listKingdoms, listMetaGenomes, listNCBIDatabases, meta.retrieval, meta.retrieval.all, organismAttributes, organismBM, organismFilters, read_assemblystats, read_cds, read_genome, read_gff, read_proteome, read_rm, read_rna, refseqOrganisms, summary_cds, summary_genome

Dependencies: AnnotationDbi, askpass, Biobase, BiocFileCache, BiocGenerics, biomaRt, Biostrings, bit, bit64, bitops, blob, cachem, cli, clipr, cpp11, crayon, curl, data.table, DBI, dbplyr, digest, downloader, dplyr, fastmap, filelock, fs, generics, glue, hms, httr, httr2, IRanges, jsonlite, KEGGREST, lifecycle, magrittr, memoise, mime, openssl, pillar, pkgconfig, png, prettyunits, progress, purrr, R.methodsS3, R.oo, R.utils, R6, rappdirs, RCurl, readr, rlang, RSQLite, S4Vectors, Seqinfo, stringi, stringr, sys, tibble, tidyr, tidyselect, tzdb, utf8, vctrs, vroom, withr, XML, xml2, XVector

Ensembl BioMart Examples
Use Case #1: Functional Annotation of Genes Sharing a Common Evolutionary History | Step 1 | Enrichment Analyses

Last update: 2025-07-19
Started: 2017-03-13

Meta-Genome Retrieval
Topics | Perform Meta-Genome Retrieval | Getting Started | Retrieve Genomic Sequences | Example NCBI RefSeq: | Example NCBI Genbank: | Example ENSEMBL | Retrieval from NCBI RefSeq | Restarting a corrupted download | Un-zipping downloaded files | Retrieval from NCBI Genbank | Retrieval from ENSEMBL | Retrieve groups or subgroups of species | Example retrieval of all Gammaproteobacteria genomes from NCBI RefSeq: | Example retrieval of all Adenoviridae genomes from NCBI RefSeq: | Meta retrieval of genome assembly quality information | Metagenome project retrieval from NCBI Genbank | Retrieve Protein Sequences | Retrieval from NCBI RefSeq: | Retrieval from NCBI Genbank: | Retrieval from ENSEMBL: | Retrieve CDS Sequences | Retrieve GFF files | Retrieve GTF files | Retrieve RNA sequences | Retrieve Repeat Masker Sequences | Retrieve Individual Genomes for all Species in the Tree of Life | Genome Retrieval | Proteome Retrieval

Last update: 2024-12-12
Started: 2016-10-11
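
As a rough illustration of the meta-genome retrieval workflow this vignette covers, the sketch below checks a kingdom name against `getKingdoms()` and then hands the request to `meta.retrieval()`. The wrapper name `retrieve_kingdom()` is a hypothetical helper, and the argument names passed on are assumed to match the exported functions; the kingdom and group values in the commented call are taken from the vignette's own example headings.

```r
# Hypothetical wrapper (not part of biomartr): validate the kingdom name,
# then start a bulk retrieval. Calling it needs biomartr and network access,
# and a full kingdom can take a long time and a lot of disk space.
retrieve_kingdom <- function(kingdom, db = "refseq", type = "genome",
                             group = NULL) {
  stopifnot(is.character(kingdom), length(kingdom) == 1L, nzchar(kingdom))

  if (!requireNamespace("biomartr", quietly = TRUE)) {
    stop("Package 'biomartr' is required.", call. = FALSE)
  }

  # List the kingdoms the chosen database offers and fail early on a typo.
  kingdoms <- biomartr::getKingdoms(db = db)
  if (!kingdom %in% kingdoms) {
    stop("Unknown kingdom '", kingdom, "'. Available: ",
         paste(kingdoms, collapse = ", "), call. = FALSE)
  }

  # Download every matching file; 'group' optionally narrows the request
  # to one taxonomic group within the kingdom.
  biomartr::meta.retrieval(db = db, kingdom = kingdom,
                           group = group, type = type)
}

# Example call (large download, so it is left commented out):
# retrieve_kingdom("bacteria", group = "Gammaproteobacteria")
```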

Sequence Retrieval
Biological Sequence Retrieval | Topics | Getting Started with Sequence Retrieval | Example NCBI RefSeq (?is.genome.available): | Using the NCBI Taxonomy ID instead of the scientific name to screen for organism availability | Using the accession ID instead of the scientific name or taxid to screen for organism availability | A small negative example | Example NCBI Genbank (?is.genome.available): | Using is.genome.available() with ENSEMBL | Example ENSEMBL (?is.genome.available): | Example UniProt (?is.genome.available): | Listing the total number of available genomes | Retrieving kingdom, group and subgroup information | Analogous computations can be performed for groups and subgroups | Downloading Biological Sequences and Annotations | Genome Retrieval | Example NCBI RefSeq: | Use taxid id for genome retrieval | Use assembly_accession id for genome retrieval | Example NCBI Genbank: | Use taxonomy id for genome retrieval | Example ENSEMBL: | GenomeSet Retrieval | Proteome Retrieval | Example Retrieval Uniprot: | ProteomeSet Retrieval | CDS Retrieval | CDSSet Retrieval | RNA Retrieval | RNASet Retrieval | Retrieve the annotation file of a particular genome | Removing corrupt lines from downloaded GFF files | GFFSet Retrieval | Repeat Masker Retrieval | Genome Assembly Stats Retrieval | Collection Retrieval

Last update: 2024-12-12
Started: 2014-11-27

Functional Annotation with biomartr
Functional Annotation Retrieval from Ensembl Biomart | Getting Started | The old biomaRt query methodology | Extending biomaRt using the new query system of the biomartr package | Getting Started with biomartr | Retrieve marts, datasets, attributes, and filters with biomartr | Retrieve Available Marts | Retrieve Available Datasets from a Specific Mart | Retrieve Available Attributes from a Specific Dataset | Retrieve Available Filters from a Specific Dataset | Organism Specific Retrieval of Information | Construct BioMart queries with biomartr | Gene Ontology | GO Annotation Retrieval via BioMart

Last update: 2023-08-17
Started: 2014-11-27
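
To make the query workflow from this vignette concrete, here is a minimal sketch of a call to the exported `biomart()` function. The wrapper name `annotate_genes()` is a hypothetical helper, and the mart, dataset, attribute, and filter values in the commented call are illustrative assumptions; in practice they would be looked up first with `getMarts()`, `getDatasets()`, `getAttributes()`, and `getFilters()`.

```r
# Hypothetical wrapper (not part of biomartr): run one BioMart query for a
# set of gene identifiers. Calling it needs biomartr and network access.
annotate_genes <- function(genes, mart, dataset, attributes, filters) {
  # Fail early on empty or non-character input, before contacting Ensembl.
  stopifnot(is.character(genes), length(genes) > 0L)
  stopifnot(is.character(attributes), length(attributes) > 0L)

  if (!requireNamespace("biomartr", quietly = TRUE)) {
    stop("Package 'biomartr' is required.", call. = FALSE)
  }

  # 'filters' names the identifier type of 'genes'; 'attributes' names the
  # annotation columns to return for each matching gene.
  biomartr::biomart(genes = genes, mart = mart, dataset = dataset,
                    attributes = attributes, filters = filters)
}

# Example call (values are assumptions for illustration; left commented out
# because it queries a remote server):
# annotate_genes(
#   genes      = c("GUCA2A", "GUCA2B"),
#   mart       = "ENSEMBL_MART_ENSEMBL",
#   dataset    = "hsapiens_gene_ensembl",
#   attributes = c("ensembl_gene_id", "ensembl_peptide_id"),
#   filters    = "hgnc_symbol"
# )
```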

NCBI Database Retrieval
Retrieve Sequence Databases from NCBI | Getting Started | List available databases | Download NCBI databases | Example NCBI nr | Example NCBI nt | Example NCBI RefSeq | Example PDB | Example NCBI Taxonomy | Example NCBI Swissprot | Example NCBI CDD Delta

Last update: 2022-02-22
Started: 2015-11-08
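
The list-then-download pattern this vignette describes could be sketched as follows. The helper name `download_ncbi_chunk()` and the target folder are hypothetical, and the sketch assumes that `listNCBIDatabases()` returns the available archive file names for a database and that `download.database()` accepts one of those names together with a destination path.

```r
# Hypothetical helper (not part of biomartr): list the archive files of an
# NCBI database and download one of them. Calling it needs biomartr and
# network access; NCBI archives can be very large.
download_ncbi_chunk <- function(db = "nr", chunk = 1L, path = "ncbi_db") {
  # Validate the chunk index before any network access.
  stopifnot(is.numeric(chunk), length(chunk) == 1L, chunk >= 1,
            chunk == as.integer(chunk))

  if (!requireNamespace("biomartr", quietly = TRUE)) {
    stop("Package 'biomartr' is required.", call. = FALSE)
  }

  # Assumed to return the archive file names available for this database.
  files <- biomartr::listNCBIDatabases(db = db)
  if (chunk > length(files)) {
    stop("Database '", db, "' lists only ", length(files), " file(s).",
         call. = FALSE)
  }

  # Download the selected archive into 'path'.
  biomartr::download.database(db = files[chunk], path = path)
}

# Example call (large download, so it is left commented out):
# download_ncbi_chunk(db = "nr", chunk = 1)
```

To fetch every archive of a database in one go, the package also exports `download.database.all()`.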

Readme and manuals

Help Manual

Help page | Topics
Genomic Data Retrieval | biomartr-package, biomartr
Main BioMart Query Function | biomart
Get directory to store back end files like kingdom summaries etc. | cachedir
Set directory to store back end files like kingdom summaries etc. | cachedir_set
Check whether an annotation file contains outlier lines | check_annotation_biomartr
Format 'meta.retrieval' output | clean.retrieval
Download an NCBI Database to Your Local Hard Drive | download.database
Download all elements of an NCBI database | download.database.all
List all available ENSEMBL divisions | ensembl_divisions
Helper function to retrieve species information from the ENSEMBL API | get.ensembl.info
Genome Assembly Stats Retrieval | getAssemblyStats
Retrieve All Available Attributes for a Specific Dataset | getAttributes
A wrapper to all bio getters, selected with 'type' argument | getBio
Generic Bio data set extractor | getBioSet
Coding Sequence Retrieval | getCDS
CDS retrieval of multiple species | getCDSSet
Retrieve a Collection: Genome, Proteome, CDS, RNA, GFF, Repeat Masker, AssemblyStats | getCollection
Retrieve a Collection: Genome, Proteome, CDS, RNA, GFF, Repeat Masker, AssemblyStats of multiple species | getCollectionSet
Retrieve All Available Datasets for a BioMart Database | getDatasets
Download sequence or annotation from ENSEMBL | getENSEMBL
Helper function for retrieving gtf files from ENSEMBL | getENSEMBL.gtf
Helper function for retrieving biological sequence files from ENSEMBL | getENSEMBL.Seq
Retrieve ENSEMBLGENOMES info file | getENSEMBLGENOMESInfo
Retrieve ENSEMBL info file | getENSEMBLInfo
Retrieve All Available Filters for a Specific Dataset | getFilters
Genome Retrieval | getGenome
Retrieve NCBI GENOME_REPORTS file | getGENOMEREPORT
Genome Retrieval of multiple species | getGenomeSet
Genome Annotation Retrieval (GFF3) | getGFF
GFF retrieval of multiple species | getGFFSet
Gene Ontology Query | getGO
Retrieve available groups for a kingdom of life (only available for NCBI RefSeq and NCBI Genbank) | getGroups
Genome Annotation Retrieval (GTF) | getGTF
Retrieve and summarise the assembly_summary.txt files from NCBI for all kingdoms | getKingdomAssemblySummary
Retrieve available kingdoms of life | getKingdoms
Retrieve information about available Ensembl Biomart databases | getMarts
Retrieve annotation *.gff files for metagenomes from NCBI Genbank | getMetaGenomeAnnotations
Retrieve metagenomes from NCBI Genbank | getMetaGenomes
Retrieve the assembly_summary.txt file from NCBI Genbank metagenomes | getMetaGenomeSummary
Proteome Retrieval | getProteome
Proteome retrieval of multiple species | getProteomeSet
Retrieve available database releases or versions of ENSEMBL | getReleases
Repeat Masker Retrieval | getRepeatMasker
RNA Sequence Retrieval | getRNA
RNA Retrieval of multiple species | getRNASet
Helper function to retrieve the assembly_summary.txt file from NCBI | getSummaryFile
Get UniProt info from organism | getUniProtInfo
Retrieve UniProt Database Information File (STATS) | getUniProtSTATS
Check Genome Availability | is.genome.available
Retrieve a List of Available NCBI Databases for Download | listDatabases, listNCBIDatabases
List All Available Genomes either by kingdom, group, or subgroup | listGenomes
List number of available genomes in each taxonomic group | listGroups
List number of available genomes in each kingdom of life | listKingdoms
List available metagenomes on NCBI Genbank | listMetaGenomes
Perform Meta-Genome Retrieval | meta.retrieval
Perform Meta-Genome Retrieval of all organisms in all kingdoms of life | meta.retrieval.all
Retrieve Ensembl Biomart attributes for a query organism | organismAttributes
Retrieve Ensembl Biomart marts and datasets for a query organism | organismBM
Retrieve Ensembl Biomart filters for a query organism | organismFilters
Import Genome Assembly Stats File | read_assemblystats
Import CDS as Biostrings or data.table object | read_cds
Import Genome Assembly as Biostrings or data.table object | read_genome
Import GFF File | read_gff
Import Proteome as Biostrings or data.table object | read_proteome
Import Repeat Masker output file | read_rm
Import RNA as Biostrings or data.table object | read_rna
Retrieve All Organism Names Stored on refseq | refseqOrganisms
Retrieve summary statistics for a coding sequence (CDS) file | summary_cds
Retrieve summary statistics for a genome assembly file | summary_genome