Title: | A Test Environment for Database Requests |
---|---|
Description: | Testing and documenting code that communicates with remote databases can be painful. Although the interaction with R is usually relatively simple (e.g. data(frames) passed to and from a database), because they rely on a separate service and the data there, testing them can be difficult to set up, unsustainable in a continuous integration environment, or impossible without replicating an entire production cluster. This package addresses that by allowing you to make recordings from your database interactions and then play them back while testing (or in other contexts) all without needing to spin up or have access to the database your code would typically connect to. |
Authors: | Jonathan Keane [aut, cre] , Mauricio Vargas [aut] , Helen Miller [rev] (reviewed the package for rOpenSci, see https://github.com/ropensci/software-review/issues/366), Etienne Racine [rev] (reviewed the package for rOpenSci, see https://github.com/ropensci/software-review/issues/366) |
Maintainer: | Jonathan Keane <[email protected]> |
License: | Apache License (>= 2.0) |
Version: | 0.1.8 |
Built: | 2024-10-27 05:35:32 UTC |
Source: | https://github.com/ropensci/dittodb |
When creating database fixtures, it can sometimes be helpful to record the responses from the database for use in crafting tests.
start_db_capturing(path, redact_columns = NULL) stop_db_capturing() capture_db_requests(expr, path, redact_columns = NULL)
start_db_capturing(path, redact_columns = NULL) stop_db_capturing() capture_db_requests(expr, path, redact_columns = NULL)
path |
the path to record mocks (default if missing: the first path in
|
redact_columns |
a character vector of columns to redact. Any column that matches an entry will be redacted with a standard value for the column type (e.g. characters will be replaced with "[redacted]") |
expr |
an expression to evaluate while capturing requests (for
|
You can start capturing with start_db_capturing()
and end it with
stop_db_capturing()
. All queries run against a database will be executed like
normal, but their responses will be saved to the mock path given, so that if
you use the same queries later inside of a with_mock_db
block, the
database functions will return as if they had been run against the database.
Alternatively, you can wrap the code that you are trying to capture in the
function capture_db_requests({...})
this does the same thing as
start_db_capturing()
and stop_db_capturing()
but without needing to
remember to stop the recording.
You can redact certain columns using the redact_columns
argument. This will
replace the values in the column with a generic redacted version. This works
by always passing the data being saved through redact_columns
.
note You should always call DBI::dbConnect
inside of the capturing
block. When you connect to the database, dittodb sets up the mocks for the
specific database you're connecting to when you call DBI::dbConnect
.
NULL
(invisibily)
if (check_for_pkg("RSQLite", message)) { # Temporary files for examples nycflights_path <- tempfile() con <- nycflights13_create_sqlite(location = nycflights_path) dbDisconnect(con) start_db_capturing() con <- dbConnect(RSQLite::SQLite(), nycflights_path) df_1 <- dbGetQuery(con, "SELECT * FROM airlines LIMIT 1") res <- dbSendQuery(con, "SELECT * FROM airlines LIMIT 2") df_2 <- dbFetch(res) dbClearResult(res) dbDisconnect(con) stop_db_capturing() start_db_capturing(redact_columns = "carrier") con <- dbConnect(RSQLite::SQLite(), nycflights_path) df_3 <- dbGetQuery(con, "SELECT * FROM airlines LIMIT 3") dbDisconnect(con) stop_db_capturing() with_mock_db({ con <- dbConnect(RSQLite::SQLite(), nycflights_path) # the result from df1 above print(dbGetQuery(con, "SELECT * FROM airlines LIMIT 1")) # the result from df3 above print(dbGetQuery(con, "SELECT * FROM airlines LIMIT 3")) }) }
if (check_for_pkg("RSQLite", message)) { # Temporary files for examples nycflights_path <- tempfile() con <- nycflights13_create_sqlite(location = nycflights_path) dbDisconnect(con) start_db_capturing() con <- dbConnect(RSQLite::SQLite(), nycflights_path) df_1 <- dbGetQuery(con, "SELECT * FROM airlines LIMIT 1") res <- dbSendQuery(con, "SELECT * FROM airlines LIMIT 2") df_2 <- dbFetch(res) dbClearResult(res) dbDisconnect(con) stop_db_capturing() start_db_capturing(redact_columns = "carrier") con <- dbConnect(RSQLite::SQLite(), nycflights_path) df_3 <- dbGetQuery(con, "SELECT * FROM airlines LIMIT 3") dbDisconnect(con) stop_db_capturing() with_mock_db({ con <- dbConnect(RSQLite::SQLite(), nycflights_path) # the result from df1 above print(dbGetQuery(con, "SELECT * FROM airlines LIMIT 1")) # the result from df3 above print(dbGetQuery(con, "SELECT * FROM airlines LIMIT 3")) }) }
expect_sql(object, regexp = NULL, ...)
expect_sql(object, regexp = NULL, ...)
object |
the expression to evaluate |
regexp |
the statement to match |
... |
arguments passed to |
Sometimes all you need to check is if a specific SQL statement has been sent and you don't care about retrieving the results.
This works by raising an error that contains the statement that is sent to the
database as well as the location of the result. Currently, expect_sql()
only
works with DBI::dbSendQuery()
(and most implementations of DBI::dbGetQuery()
which call DBI::dbSendQuery()
internally).
Note: this function is experimental and will likely evolve over time. Please be prepared that new releases might break backwards compatibility.
if (check_for_pkg("RSQLite", message)) { with_mock_db({ con <- dbConnect(RSQLite::SQLite(), dbname = "not_a_db") expect_sql( dbGetQuery(con, "SELECT carrier, name FROM airlines LIMIT 3"), "SELECT carrier, name FROM airlines LIMIT 3" ) }) }
if (check_for_pkg("RSQLite", message)) { with_mock_db({ con <- dbConnect(RSQLite::SQLite(), dbname = "not_a_db") expect_sql( dbGetQuery(con, "SELECT carrier, name FROM airlines LIMIT 3"), "SELECT carrier, name FROM airlines LIMIT 3" ) }) }
Various methods (dbSendQuery
, dbFetchQuery
) that are mocks of the
DBI methods of the same name.
Instead of actually interacting with a database, they read in mock responses
and the code proceeds after that. These aren't used directly, but are part of
how dittodb
works.
## S4 method for signature 'DBIMockConnection' dbDisconnect(conn, ...) dbMockConnect(drv, ...) ## S4 method for signature 'DBIMockConnection,character' dbExistsTable(conn, name, ...) ## S4 method for signature 'DBIMockConnection,Id' dbExistsTable(conn, name, ...) ## S4 method for signature 'DBIMockConnection' dbListTables(conn, ...) ## S4 method for signature 'DBIMockConnection,character' dbListFields(conn, name, ...) ## S4 method for signature 'DBIMockConnection,Id' dbListFields(conn, name, ...) ## S4 method for signature 'DBIMockConnection,ANY' dbListFields(conn, name, ...) ## S4 method for signature 'DBIMockConnection,character' dbSendQuery(conn, statement, ...) ## S4 method for signature 'DBIMockConnection,SQL' dbSendQuery(conn, statement, ...) ## S4 method for signature 'DBIMockConnection,character' dbSendStatement(conn, statement, ...) ## S4 method for signature 'DBIMockResult' dbFetch(res, n = -1, ...) ## S4 method for signature 'DBIMockResult,ANY' fetch(res, n = -1, ...) ## S4 method for signature 'DBIMockResult,missing' fetch(res, n = -1, ...) ## S4 method for signature 'DBIMockResult' dbClearResult(res, n, ...) ## S4 method for signature 'DBIMockResult' dbHasCompleted(res, ...) ## S4 method for signature 'DBIMockRPostgreSQLConnection,character' dbGetQuery(conn, statement, ...) ## S4 method for signature 'DBIMockResult' dbGetRowsAffected(res, ...) ## S4 method for signature 'DBIMockConnection' dbGetInfo(dbObj, ...) ## S4 method for signature 'DBIMockConnection,character,data.frame' dbWriteTable(conn, name, value, ...) ## S4 method for signature 'DBIMockConnection,character' dbRemoveTable(conn, name, ...) ## S4 method for signature 'DBIMockResult' dbColumnInfo(res, ...) ## S4 method for signature 'DBIMockResult' dbGetInfo(dbObj, ...) ## S4 method for signature 'DBIMockRPostgresConnection,character' dbQuoteIdentifier(conn, x, ...) ## S4 method for signature 'DBIMockRPostgresConnection,SQL' dbQuoteIdentifier(conn, x, ...) ## S4 method for signature 'DBIMockRPostgresConnection,character' dbQuoteString(conn, x, ...) ## S4 method for signature 'DBIMockRPostgresConnection,SQL' dbQuoteString(conn, x, ...) ## S4 method for signature 'DBIMockMariaDBConnection,character' dbQuoteString(conn, x, ...) ## S4 method for signature 'DBIMockMariaDBConnection,SQL' dbQuoteString(conn, x, ...) ## S4 method for signature 'DBIMockConnection' dbBegin(conn, ..., name = NULL) ## S4 method for signature 'DBIMockConnection' dbCommit(conn, ..., name = NULL) ## S4 method for signature 'DBIMockConnection' dbRollback(conn, ..., name = NULL)
## S4 method for signature 'DBIMockConnection' dbDisconnect(conn, ...) dbMockConnect(drv, ...) ## S4 method for signature 'DBIMockConnection,character' dbExistsTable(conn, name, ...) ## S4 method for signature 'DBIMockConnection,Id' dbExistsTable(conn, name, ...) ## S4 method for signature 'DBIMockConnection' dbListTables(conn, ...) ## S4 method for signature 'DBIMockConnection,character' dbListFields(conn, name, ...) ## S4 method for signature 'DBIMockConnection,Id' dbListFields(conn, name, ...) ## S4 method for signature 'DBIMockConnection,ANY' dbListFields(conn, name, ...) ## S4 method for signature 'DBIMockConnection,character' dbSendQuery(conn, statement, ...) ## S4 method for signature 'DBIMockConnection,SQL' dbSendQuery(conn, statement, ...) ## S4 method for signature 'DBIMockConnection,character' dbSendStatement(conn, statement, ...) ## S4 method for signature 'DBIMockResult' dbFetch(res, n = -1, ...) ## S4 method for signature 'DBIMockResult,ANY' fetch(res, n = -1, ...) ## S4 method for signature 'DBIMockResult,missing' fetch(res, n = -1, ...) ## S4 method for signature 'DBIMockResult' dbClearResult(res, n, ...) ## S4 method for signature 'DBIMockResult' dbHasCompleted(res, ...) ## S4 method for signature 'DBIMockRPostgreSQLConnection,character' dbGetQuery(conn, statement, ...) ## S4 method for signature 'DBIMockResult' dbGetRowsAffected(res, ...) ## S4 method for signature 'DBIMockConnection' dbGetInfo(dbObj, ...) ## S4 method for signature 'DBIMockConnection,character,data.frame' dbWriteTable(conn, name, value, ...) ## S4 method for signature 'DBIMockConnection,character' dbRemoveTable(conn, name, ...) ## S4 method for signature 'DBIMockResult' dbColumnInfo(res, ...) ## S4 method for signature 'DBIMockResult' dbGetInfo(dbObj, ...) ## S4 method for signature 'DBIMockRPostgresConnection,character' dbQuoteIdentifier(conn, x, ...) ## S4 method for signature 'DBIMockRPostgresConnection,SQL' dbQuoteIdentifier(conn, x, ...) ## S4 method for signature 'DBIMockRPostgresConnection,character' dbQuoteString(conn, x, ...) ## S4 method for signature 'DBIMockRPostgresConnection,SQL' dbQuoteString(conn, x, ...) ## S4 method for signature 'DBIMockMariaDBConnection,character' dbQuoteString(conn, x, ...) ## S4 method for signature 'DBIMockMariaDBConnection,SQL' dbQuoteString(conn, x, ...) ## S4 method for signature 'DBIMockConnection' dbBegin(conn, ..., name = NULL) ## S4 method for signature 'DBIMockConnection' dbCommit(conn, ..., name = NULL) ## S4 method for signature 'DBIMockConnection' dbRollback(conn, ..., name = NULL)
conn |
a database connection (for dispatch with these methods, it should
be of class |
... |
arguments passed on inside of the methods |
drv |
a DB driver for use in |
name |
name of the table (for |
statement |
an SQL statement to execute |
res |
a result object (for dispatch with these methods, it should be of
class |
n |
number of results to fetch (ignored) |
dbObj |
a database object (a connection, result, etc.) for use in
|
value |
a value (generally a |
x |
a name to quote (for |
Wrap a chunk of code in with_mock_db()
to use mocked databases that will
use fixtures instead of connecting to a real database. Alternatively, you can
start and stop using a mocked database with start_mock_db()
and
stop_mock_db()
respectively.to execute the whole thing without needing to
remember to stop the mocking. When testing with dittodb
, it will look for
fixtures in all entries of db_mock_paths
.
with_mock_db(expr) start_mock_db() stop_mock_db()
with_mock_db(expr) start_mock_db() stop_mock_db()
expr |
the expression to execute |
You only need to use one approach: either use start_mock_db()
to start
using mocks and then stop_mock_db()
to stop or use with_mock_db()
wrapped
around the code you want to execute against the mocked database. You don't
need to (and should not) use both at the same time. Generally
with_mock_db()
is preferred because it is slightly safer and you don't have
to remember to stop_mock_db()
when you're done. However, it is easier to
step through tests interactively using start_mock_db()
/stop_mock_db()
.
Connections should be made after start_mock_db()
if you're using that
function or they should be made inside of with_mock_db()
if you're using
that function because dittodb
uses the database name (given in dbname
or
Database
argument of dbConnect
depending on the driver) to separate
different fixtures. For ODBC connections with only a dsn provided, the dsn is
used for this directory.
nothing
# Add the mocks included with dittodb to the db_mock_paths to use them below db_mock_paths(system.file("nycflight_mocks", package = "dittodb"), last = TRUE) if (check_for_pkg("RSQLite", message) & check_for_pkg("testthat", message)) { # using `with_mock_db()` with_mock_db({ con <- dbConnect( RSQLite::SQLite(), dbname = "nycflights" ) testthat::test_that("We get one airline", { one_airline <- dbGetQuery( con, "SELECT carrier, name FROM airlines LIMIT 1" ) testthat::expect_s3_class(one_airline, "data.frame") testthat::expect_equal(nrow(one_airline), 1) testthat::expect_equal(one_airline$carrier, "9E") testthat::expect_equal(one_airline$name, "Endeavor Air Inc.") }) dbDisconnect(con) }) # using `start_mock_db()` and `stop_mock_db()` start_mock_db() con <- dbConnect( RSQLite::SQLite(), dbname = "nycflights" ) testthat::test_that("We get one airline", { one_airline <- dbGetQuery( con, "SELECT carrier, name FROM airlines LIMIT 1" ) testthat::expect_s3_class(one_airline, "data.frame") testthat::expect_equal(nrow(one_airline), 1) testthat::expect_equal(one_airline$carrier, "9E") testthat::expect_equal(one_airline$name, "Endeavor Air Inc.") }) dbDisconnect(con) stop_mock_db() }
# Add the mocks included with dittodb to the db_mock_paths to use them below db_mock_paths(system.file("nycflight_mocks", package = "dittodb"), last = TRUE) if (check_for_pkg("RSQLite", message) & check_for_pkg("testthat", message)) { # using `with_mock_db()` with_mock_db({ con <- dbConnect( RSQLite::SQLite(), dbname = "nycflights" ) testthat::test_that("We get one airline", { one_airline <- dbGetQuery( con, "SELECT carrier, name FROM airlines LIMIT 1" ) testthat::expect_s3_class(one_airline, "data.frame") testthat::expect_equal(nrow(one_airline), 1) testthat::expect_equal(one_airline$carrier, "9E") testthat::expect_equal(one_airline$name, "Endeavor Air Inc.") }) dbDisconnect(con) }) # using `start_mock_db()` and `stop_mock_db()` start_mock_db() con <- dbConnect( RSQLite::SQLite(), dbname = "nycflights" ) testthat::test_that("We get one airline", { one_airline <- dbGetQuery( con, "SELECT carrier, name FROM airlines LIMIT 1" ) testthat::expect_s3_class(one_airline, "data.frame") testthat::expect_equal(nrow(one_airline), 1) testthat::expect_equal(one_airline$carrier, "9E") testthat::expect_equal(one_airline$name, "Endeavor Air Inc.") }) dbDisconnect(con) stop_mock_db() }
Included with dittodb
is a small subset of
nycflights13
prepopulated into a sqlite
database.
nycflights_sqlite()
nycflights_sqlite()
This database is helpful for getting to know dittodb
and running example
code. It contains a small subset of the data in nycflights13: namely only the
flights and planes that had a destination of ORD or MDW (the codes for the
two major airports in Chicago) in February of 2013. The airports table has
also been limited to only the New York and Chicago area airports.
an RSQLiteConnection
if (check_for_pkg("RSQLite", message)) { con <- nycflights_sqlite() DBI::dbGetQuery(con, "SELECT flight, tailnum, origin, dest FROM flights LIMIT 10") DBI::dbGetQuery(con, "SELECT faa, name, lat, lon, alt, tz FROM airports") DBI::dbDisconnect(con) }
if (check_for_pkg("RSQLite", message)) { con <- nycflights_sqlite() DBI::dbGetQuery(con, "SELECT flight, tailnum, origin, dest FROM flights LIMIT 10") DBI::dbGetQuery(con, "SELECT faa, name, lat, lon, alt, tz FROM airports") DBI::dbDisconnect(con) }
Using the connection given in con
, create a database including a few tables
from the nycflights13
dataset.
nycflights13_create_sql(con, schema = "", ...)
nycflights13_create_sql(con, schema = "", ...)
con |
an SQL connection (i.e a PostgreSQL connection) |
schema |
schema to write the tables ("", or no schema by default) |
... |
additional parameters to connect to a database |
the connection given in con
invisibly, generally called for the
side effects of writing to the database
if (check_for_pkg("RSQLite", message)) { con <- DBI::dbConnect(RSQLite::SQLite(), ":memory:") nycflights13_create_sql(con) DBI::dbGetQuery( con, "SELECT year, month, day, carrier, flight, tailnum FROM flights LIMIT 10" ) DBI::dbDisconnect(con) }
if (check_for_pkg("RSQLite", message)) { con <- DBI::dbConnect(RSQLite::SQLite(), ":memory:") nycflights13_create_sql(con) DBI::dbGetQuery( con, "SELECT year, month, day, carrier, flight, tailnum FROM flights LIMIT 10" ) DBI::dbDisconnect(con) }
Create an in-memory SQLite database for testing
nycflights13_create_sqlite(location = ":memory:", ...)
nycflights13_create_sqlite(location = ":memory:", ...)
location |
where to store the database |
... |
additional parameters to connect to a database (most are passed on
to |
RSQLiteConnection
if (check_for_pkg("RSQLite", message)) { con <- nycflights13_create_sqlite() DBI::dbGetQuery( con, "SELECT year, month, day, carrier, flight, tailnum FROM flights LIMIT 10" ) DBI::dbDisconnect(con) }
if (check_for_pkg("RSQLite", message)) { con <- nycflights13_create_sqlite() DBI::dbGetQuery( con, "SELECT year, month, day, carrier, flight, tailnum FROM flights LIMIT 10" ) DBI::dbDisconnect(con) }
This function redacts the columns specified in columns
in the data given in
data
using dittodb
's standard redactors.
redact_columns(data, columns, ignore.case = TRUE, ...)
redact_columns(data, columns, ignore.case = TRUE, ...)
data |
a dataframe to redact |
columns |
character, the columns to redact |
ignore.case |
should case be ignored? (default: |
... |
additional options to pass on to |
The column names given in the columns
argument are treated as regular
expressions, however they always have ^
and $
added to the beginning and
end of the strings. So if you would like to match any column that starts with
the string sensitive
(e.g. sensitive_name
, sensitive_date
) you could
use "sensitive.*
and this would catch all of those columns (though it would
not catch a column called most_sensitive_name
).
The standard redactors replace all values in the column with the following values based on the columns type:
integer – 9L
numeric – 9
character – "[redacted]"
POSIXct
(date times) – as.POSIXct("1988-10-11T17:00:00", tz = tzone)
data, with the columns specified in columns
duly redacted
if (check_for_pkg("nycflights13", message)) { small_flights <- head(nycflights13::flights) # with no columns specified, redacting does nothing redact_columns(small_flights, columns = NULL) # integer redact_columns(small_flights, columns = c("arr_time")) # numeric redact_columns(small_flights, columns = c("arr_delay")) # characters redact_columns(small_flights, columns = c("origin", "dest")) # datetiems redact_columns(small_flights, columns = c("time_hour")) }
if (check_for_pkg("nycflights13", message)) { small_flights <- head(nycflights13::flights) # with no columns specified, redacting does nothing redact_columns(small_flights, columns = NULL) # integer redact_columns(small_flights, columns = c("arr_time")) # numeric redact_columns(small_flights, columns = c("arr_delay")) # characters redact_columns(small_flights, columns = c("origin", "dest")) # datetiems redact_columns(small_flights, columns = c("time_hour")) }
dittodb
's debug levelIt can be helpful to see what's going on by increasing dittodb
's verbosity
which will show what's going on under the hood (e.g. what queries are being
requested, from where). This sets the option dittodb.debug
to the value
given in the level
argument. The option can be set directly with
options(dittodb.debug = n)
as well.
set_dittodb_debug_level(level)
set_dittodb_debug_level(level)
level |
a numeric, the level to set to (e.g. 1) |
The level
argument is a numeric, where 0 is the default and (relatively)
silent. The higher the level, the more verbose dittodb
will be.
Currently, dittodb
only has one level of debugging (any value 1 or
greater), but more might be used in the future.
the level, invisibly
set_dittodb_debug_level(1) set_dittodb_debug_level(0)
set_dittodb_debug_level(1) set_dittodb_debug_level(0)
dittodb
in your testsIf you would like to use dittodb
in your package, and you are already using
testthat, use this function to
add dittodb
to Suggests in the package DESCRIPTION and loads it in
tests/testthat/helper.R
. Call it once when you're setting up a new package
test suite.
use_dittodb(path = ".")
use_dittodb(path = ".")
path |
character path to the package |
This function should be called with the path to your package source as the
path
argument. The function is idempotent: if dittodb
is already added to
these files, no additional changes will be made.
It will:
add dittodb
to the Suggests
field of the DESCRIPTION file in the
current working directory
add library(dittodb)
to the file tests/testthat/helper.R
(creating it
if it doesn't already exist)
Nothing: called for file system side effects.
## Not run: use_dittodb() use_dittodb("/path/to/package") ## End(Not run)
## Not run: use_dittodb() use_dittodb("/path/to/package") ## End(Not run)
When testing with dittodb, wrap your tests in with_mock_path({})
to use the
database fixtures located in other directories. dittodb
will look for
fixtures in the directory specified by the user, which can be a temporary
or permanent location.
with_mock_path(path, expr, replace = FALSE)
with_mock_path(path, expr, replace = FALSE)
path |
the alternate directory |
expr |
the expression to execute |
replace |
logical, should the path replace the current mock paths
( |
nothing, called to execute the expression(s) in expr
# Only run if RSQLite and testthat are available if (check_for_pkg("RSQLite", message) & check_for_pkg("testthat", message)) { with_mock_path( system.file("nycflight_mocks", package = "dittodb"), with_mock_db({ con <- DBI::dbConnect( RSQLite::SQLite(), dbname = "nycflights" ) one_airline <- dbGetQuery( con, "SELECT carrier, name FROM airlines LIMIT 1" ) testthat::test_that("We get one airline", { testthat::expect_s3_class(one_airline, "data.frame") testthat::expect_equal(nrow(one_airline), 1) testthat::expect_equal(one_airline$carrier, "9E") testthat::expect_equal(one_airline$name, "Endeavor Air Inc.") }) one_airline }) ) }
# Only run if RSQLite and testthat are available if (check_for_pkg("RSQLite", message) & check_for_pkg("testthat", message)) { with_mock_path( system.file("nycflight_mocks", package = "dittodb"), with_mock_db({ con <- DBI::dbConnect( RSQLite::SQLite(), dbname = "nycflights" ) one_airline <- dbGetQuery( con, "SELECT carrier, name FROM airlines LIMIT 1" ) testthat::test_that("We get one airline", { testthat::expect_s3_class(one_airline, "data.frame") testthat::expect_equal(nrow(one_airline), 1) testthat::expect_equal(one_airline$carrier, "9E") testthat::expect_equal(one_airline$name, "Endeavor Air Inc.") }) one_airline }) ) }