Title: | Interface to 'Virtuoso' using 'ODBC' |
---|---|
Description: | Provides users with a simple and convenient mechanism to manage and query a 'Virtuoso' database using the 'DBI' (Data-Base Interface) compatible 'ODBC' (Open Database Connectivity) interface. 'Virtuoso' is a high-performance "universal server," which can act as a relational database, supporting standard Structured Query Language ('SQL') queries, while also supporting data following the Resource Description Framework ('RDF') model for Linked Data. 'RDF' data can be queried using 'SPARQL' ('SPARQL' Protocol and 'RDF' Query Language) queries, a graph-based query language that supports semantic reasoning. This allows users to leverage the performance of local or remote 'Virtuoso' servers using popular 'R' packages such as 'DBI' and 'dplyr', while also providing a high-performance solution for working with large 'RDF' 'triplestores' from 'R'. The package also provides helper routines to install, launch, and manage a 'Virtuoso' server locally on 'Mac', 'Windows' and 'Linux' platforms using the standard interactive installers from the 'R' command-line. By automatically handling these setup steps, the package can make 'Virtuoso' considerably faster and easier for most users to deploy in a local environment. Managing the bulk import of triples from common serializations with a single intuitive command is another key feature of this package. Bulk import performance can be tens to hundreds of times faster than the comparable imports using existing 'R' tools, including 'rdflib' and 'redland' packages. |
Authors: | Carl Boettiger [aut, cre, cph] , Bryce Mecum [ctb] |
Maintainer: | Carl Boettiger <[email protected]> |
License: | MIT + file LICENSE |
Version: | 0.1.8 |
Built: | 2024-12-27 03:12:53 UTC |
Source: | https://github.com/ropensci/virtuoso |
test if the system has a virtuoso installation on the path
has_virtuoso()
Logical indicating whether the virtuoso-t binary was found or not.
has_virtuoso()
Virtuoso Server configuration is determined by a virtuoso.ini file when the
server starts. This file includes both system-specific information from
your install (location of server files, addons, etc.) and user-configurable
parameters. This helper function provides a way to create and modify an
appropriate virtuoso.ini file.
vos_configure(
  dirs_allowed = getwd(),
  gigs_ram = 2,
  template = find_virtuoso_ini(),
  db_dir = vos_db()
)
dirs_allowed |
Paths (relative or absolute) to directories from which Virtuoso should have read and write access (e.g. for bulk uploading). Should be specified as a single comma-separated string. |
gigs_ram |
Indicate approximately the maximum GB of memory Virtuoso can have access to. (Used to set NumberOfBuffers & MaxDirtyBuffers in config.) |
template |
Location of an existing virtuoso.ini file which will be used
as a template. By default, |
db_dir |
location where |
Writes the requested virtuoso.ini file to the db_dir specified
and returns the path to this file.
http://docs.openlinksw.com/virtuoso/dbadm/
# can take > 5s to test
## configure with typical defaults:
vos_configure()
## Increase or decrease RAM available to virtuoso:
vos_configure(gigs_ram = 1)
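Since dirs_allowed is documented as a single comma-separated string, a vector of directories has to be collapsed before it is passed in. The following is a minimal sketch of that step only; the two directories are placeholders chosen for illustration, and the call to vos_configure() is left commented out because it writes a configuration file.

```r
# Placeholder directories that Virtuoso should be allowed to read and write.
dirs <- c(getwd(), tempdir())

# Collapse the vector into the single comma-separated string dirs_allowed expects.
dirs_allowed <- paste(dirs, collapse = ",")
dirs_allowed

# Then, with the virtuoso package installed:
# virtuoso::vos_configure(dirs_allowed = dirs_allowed, gigs_ram = 2)
```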
Connect to a Virtuoso Server over ODBC
vos_connect(
  driver = NULL,
  uid = "dba",
  pwd = "dba",
  host = "localhost",
  port = "1111",
  system_odbcinst = find_odbcinst(),
  local_odbcinst = odbcinst_path()
)
driver |
Name of the Driver line in the ODBC configuration |
uid |
User id. Defaults to "dba" |
pwd |
Password. Defaults to "dba" |
host |
IP address of the Virtuoso Server |
port |
Port used by Virtuoso. Defaults to the Virtuoso standard port, 1111 |
system_odbcinst |
Path to the system |
local_odbcinst |
Path to the local odbcinst we should use. |
Default parameters are appropriate for the automatic installer provided by the package and for the default settings typically used by local Virtuoso installers. Adjust these only if you are connecting to a remote virtuoso server that is not controlled from the R package.
A DBI connection to the Virtuoso database. This can be passed to
additional virtuoso functions such as vos_import() or vos_query(),
and can also be used as a standard DBI or dplyr database backend.
status <- vos_status()
if(has_virtuoso()){
  ## start up
  vos_start()
  con <- vos_connect()
}
delete the entire Virtuoso database for a fresh start.
vos_delete_db(ask = is_interactive(), db_dir = vos_db())
ask |
ask before deleting? |
db_dir |
location of the directory to delete |
vos_delete_db()
Provides a clean reset of the system that purges all
data files, config files, cache and log files created
by the virtuoso R package. This does not uninstall the Virtuoso software
itself; see vos_uninstall() to uninstall.
vos_destroy_all(force = FALSE)
force |
should permissions be changed (if possible) to allow deletion? |
TRUE if entirely successful in removing all files, FALSE otherwise (invisibly).
vos_destroy_all()
While triples can be added one by one over SPARQL queries, Virtuoso bulk import is by far the fastest way to import large triplestores into the database.
vos_import(
  con,
  files = NULL,
  wd = ".",
  glob = "*",
  graph = "rdflib",
  n_cores = 1L
)
con |
an ODBC connection to Virtuoso, from |
files |
paths to files to be imported |
wd |
Alternatively, can specify directory and globbing pattern
to import. Note that in this case, wd must be in (or a subdir of)
the |
glob |
A wildcard, aka globbing pattern (e.g. "*.nq"). |
graph |
Name (technically URI) for a graph in the database. Can leave as default. If a graph is already specified by the import file (e.g. in nquads), that will be used instead. |
n_cores |
specify the number of available cores for parallel loading. Particularly useful when importing large numbers of bulk files. |
The bulk importer imports all files matching a pattern in a given directory. If given a list of files, these are temporarily symlinked (or copied on Windows machines) to a subdirectory of the Virtuoso app cache dir, and the entire subdirectory is loaded (filtered by the globbing pattern). If files are not specified, load is called directly on the specified directory and pattern. This is particularly useful for loading large numbers of files.
Note that Virtuoso recommends breaking large files into multiple smaller ones, which can improve loading time (particularly if using multiple cores.)
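One way to follow that recommendation for line-oriented serializations is sketched below. This is an illustration under stated assumptions, not part of the package: split_ntriples() is a hypothetical helper, it only makes sense for formats with one statement per line (.nt, .nq), and it reads the whole file into memory, so truly large files would call for a streaming approach instead.

```r
# Split a line-oriented RDF file (.nt or .nq, one statement per line) into
# several smaller files. Not suitable for Turtle or RDF/XML, where a single
# statement can span multiple lines. Hypothetical helper, not from the package.
split_ntriples <- function(path, lines_per_file = 100000L, out_dir = tempdir()) {
  lines <- readLines(path)
  if (length(lines) == 0L) return(character(0))
  # Assign each line to a chunk index: 1, 1, ..., 2, 2, ...
  chunk <- ceiling(seq_along(lines) / lines_per_file)
  parts <- split(lines, chunk)
  ext <- tools::file_ext(path)
  stem <- tools::file_path_sans_ext(basename(path))
  out <- file.path(
    out_dir,
    sprintf("%s_%04d.%s", stem, as.integer(names(parts)), ext)
  )
  for (i in seq_along(parts)) writeLines(parts[[i]], out[[i]])
  out
}

# Usage with a small made-up file of five triples.
src <- file.path(tempdir(), "demo.nt")
writeLines(
  sprintf("<http://example.org/s%d> <http://example.org/p> \"%d\" .", 1:5, 1:5),
  src
)
chunks <- split_ntriples(src, lines_per_file = 2L)
basename(chunks)
```

The resulting paths could then be handed to the files argument of vos_import(), optionally with n_cores raised above 1.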
Virtuoso Bulk Importer recognizes the following file formats:
.grdf
.nq
.owl
.nt
.rdf
.trig
.ttl
.xml
Any of these can optionally be gzipped (with a .gz extension).
(Invisibly) returns the status table of the bulk loader, indicating file loading time or errors.
http://vos.openlinksw.com/owiki/wiki/VOS/VirtBulkRDFLoader
vos_status()
if(has_virtuoso()){
  vos_start()
  con <- vos_connect()
  example <- system.file("extdata", "person.nq", package = "virtuoso")
  vos_import(con, example)
}
Installation helper for Mac and Windows machines. By default, this
method will download and launch the official .dmg or .exe installer
for your platform, running the standard drag-n-drop installer or
interactive dialog. Setting ask = FALSE will allow the installer
to run entirely unsupervised, which is suitable for use in scripts.
Mac users can alternatively opt to install Virtuoso through HomeBrew
by setting use_brew=TRUE. Linux users should simply install the
virtuoso-opensource package (e.g. in debian & ubuntu) using the
package manager or by contacting your system administrator.
vos_install(ask = is_interactive(), use_brew = FALSE)
ask |
Should we ask user for interactive installation? |
use_brew |
Should we use homebrew to install? (MacOS only) |
vos_install()
Kill ends the process started by vos_start()
vos_kill(p = NA)
p |
a process object, returned by
|
vos_kill() simply shuts down the local Virtuoso server;
it does not remove any data stored in the database system.
vos_kill() terminates the process, removing the
process id from the process table.
if(has_virtuoso()){
  vos_start()
  vos_kill()
}
List graphs
vos_list_graphs(con)
con |
an ODBC connection to Virtuoso, from |
status <- vos_status()
if(has_virtuoso() && is.null(status)){
  vos_start()
  con <- vos_connect()
  vos_list_graphs(con)
}
Query the server logs
vos_log(p = NA, collapse = NULL, just_errors = FALSE)
p |
a process object, returned by
|
collapse |
an optional character string to separate the lines in a single character string. |
just_errors |
logical, default FALSE. Set to TRUE to return just the lines that contain the term "error", which can be useful in debugging or validating bulk imports. |
Virtuoso logs as a character vector.
if(has_virtuoso()) vos_log()
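The effect of just_errors and collapse can be pictured with plain character-vector operations. The sketch below uses made-up log lines rather than real Virtuoso output, and matching "error" without regard to case is an assumption; the package's own matching rule may differ.

```r
# Made-up log lines for illustration; real Virtuoso log output will differ.
log_lines <- c(
  "10:00:01 Server online at 1111",
  "10:00:05 Error: file not found",
  "10:00:09 Checkpoint finished"
)

# Keep only lines mentioning "error". Ignoring case is an assumption here.
errors <- grep("error", log_lines, ignore.case = TRUE, value = TRUE)
errors

# Optionally join the lines into one string, analogous to the collapse argument.
paste(log_lines, collapse = "\n")
```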
ODBC uses an odbcinst.ini file to point ODBC at the library required
to drive any given database. This function helps us automatically
locate the driver library on different operating systems and configure
the odbcinst appropriately for each OS.
vos_odbcinst(
  system_odbcinst = find_odbcinst(),
  local_odbcinst = odbcinst_path()
)
system_odbcinst |
Path to the system |
local_odbcinst |
Path to the local odbcinst we should use. |
This function is called automatically by vos_install() and thus
does not usually need to be called by the user. Users can also manually
configure ODBC as outlined in
https://github.com/r-dbi/odbc#dsn-configuration-files.
This is merely a convenience function automating that process on most
systems.
the path to the odbcinst file that is created or modified.
## Configures ODBC and returns silently on success.
vos_odbcinst()
## see where the inst file is located:
inst <- vos_odbcinst()
inst
Generally a user will not need to access this function directly, though it may be useful for debugging purposes.
vos_process(p = NA)
p |
a process object, returned by
|
Returns the processx::process() object cached by vos_start()
to control the external Virtuoso server process from R.
if(has_virtuoso()) vos_process()
Run a SPARQL query
vos_query(con, query)
con |
an ODBC connection to Virtuoso, from |
query |
a SPARQL query statement |
SPARQL is a graph query language similar in syntax to SQL, but it allows the use of variables to walk through graph nodes.
A data.frame containing the results of the query.
vos_status()
if(has_virtuoso()){
  vos_start()
  con <- vos_connect()
  # show first 4 triples in the database
  DBI::dbGetQuery(con, "SPARQL SELECT * WHERE { ?s ?p ?o } LIMIT 4")
}
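The example above goes through DBI directly and prefixes the statement with the SPARQL keyword, which is how Virtuoso's SQL interface is told to treat the text as SPARQL rather than SQL. A minimal sketch of that relationship follows; as_virtuoso_sql() is a hypothetical helper introduced only for illustration, and whether vos_query() adds the prefix in exactly this way is an assumption. The calls that need a live connection are left commented out.

```r
# A bare SPARQL statement, as described for the `query` argument.
query <- "SELECT * WHERE { ?s ?p ?o } LIMIT 4"

# Over a plain DBI connection, the statement is prefixed with the SPARQL
# keyword so that it is not parsed as SQL. Hypothetical helper.
as_virtuoso_sql <- function(sparql) paste("SPARQL", sparql)
as_virtuoso_sql(query)

# With a live connection `con` (not created here), either form could be tried:
# virtuoso::vos_query(con, query)
# DBI::dbGetQuery(con, as_virtuoso_sql(query))
```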
Set the location of the Virtuoso database, configuration files, cache, and logs to your preferred location. Set home to the location of your Virtuoso installation.
vos_set_paths(
  db_dir = vos_db(),
  config_dir = vos_config(),
  cache_dir = vos_cache(),
  log_dir = vos_logdir(),
  home = virtuoso_home()
)
db_dir |
Location of data in the Virtuoso database (tables, triplestore) |
config_dir |
Location of configuration files for Virtuoso |
cache_dir |
Location of cache for bulk importing |
log_dir |
Location of Virtuoso Server logs |
home |
Location of the Virtuoso installation |
A logical vector, with elements being true if setting the corresponding variable succeeded (invisibly).
if(has_virtuoso()) vos_set_paths()
This function will attempt to start a virtuoso server
instance that can be managed completely from R. This allows
the user to easily start, stop, and access server logs and functions
from the R command line. This server will be automatically shut
down when R exits or restarts, or can be explicitly controlled using
vos_kill(), vos_log(), and vos_status().
vos_start(ini = NULL, wait = 30)
ini |
path to a virtuoso.ini configuration file. If not provided, function will attempt to determine the location of the default configuration file. |
wait |
number of seconds to wait for server to come online |
It can take some time for the server to come up before it is ready to
accept queries. vos_start() will return as soon as the server is active,
which typically takes about 10 seconds on tested systems. vos_start()
checks the Virtuoso logs once per second, for at most wait seconds
(default 30), to see if the server is ready. If the wait time is exceeded,
vos_start() will simply return the current server status. This does not mean
that starting has failed; the server may simply need longer before it is active.
Use vos_status() to continue to monitor the server status manually.
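The polling behaviour described here can be sketched as a generic wait loop. This is not the package's implementation: wait_until_ready() and the check function are hypothetical stand-ins, and the stub below returns a made-up "not ready" value simply to exercise the loop without a server.

```r
# Poll a status function at a fixed interval until it reports "running" or
# `wait` seconds have elapsed. `check` is a hypothetical stand-in for whatever
# inspects the server; it is not a function from the package.
wait_until_ready <- function(check, wait = 30, interval = 1) {
  deadline <- Sys.time() + wait
  repeat {
    status <- check()
    if (identical(status, "running") || Sys.time() >= deadline) {
      return(status)  # either ready, or simply the last status seen
    }
    Sys.sleep(interval)
  }
}

# Stub that reports "running" on its third call ("not ready" is made up).
calls <- 0
fake_check <- function() {
  calls <<- calls + 1
  if (calls >= 3) "running" else "not ready"
}
result <- wait_until_ready(fake_check, wait = 5, interval = 0.1)
result
```

Returning the last status on timeout, rather than raising an error, mirrors the point made above that an exceeded wait does not by itself mean the start failed.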
If no virtuoso.ini configuration file is provided, vos_start() will
automatically attempt to configure one. For more control over this,
use vos_configure(); see examples.
Invisibly returns the processx::process() object, which can be used
to control the external process from R. It is not necessary for a user
to store this return object, as vos_start() caches the process object so
it can be automatically accessed by other functions without needing to store
and pass the return object.
if(has_virtuoso()){
  vos_start()
  ## or with custom config:
  vos_start(vos_configure(gigs_ram = 3))
}
Query the server status
vos_status(p = NA, wait = 10)
p |
a process object, returned by
|
wait |
number of seconds to wait for server to come online |
Note: Use vos_log() to see the full log.
A character string indicating the state of the server:
"not detected": no process can be found.
"dead": process exists but reports that the server is not alive. The server may fail to come online due to errors in the configuration file; see vos_configure().
"running": server is up and accepting queries.
"sleeping": server is up and accepting queries.
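Because two of the listed states both mean the server accepts queries, calling code may want a single yes/no answer. A minimal sketch follows; server_ready() is a hypothetical helper, not part of the package, and treating any other value (including a missing status) as not ready is an assumption.

```r
# Reduce a status value to a single "can I query?" flag, based on the states
# listed above. Anything that is not a single recognised string counts as
# not ready. Hypothetical helper, not from the package.
server_ready <- function(status) {
  is.character(status) &&
    length(status) == 1L &&
    status %in% c("running", "sleeping")
}

server_ready("running")
server_ready("dead")
server_ready(NULL)
```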
if(has_virtuoso()) vos_status()
Automatic uninstaller for Mac OSX and Windows clients.
vos_uninstall()
## Not run:
vos_uninstall()
## End(Not run)