skim() used within a function now prints the data frame name.
We have improved the interaction between focus() and the print methods.
skimr_table_header_width. The default is to use the console width, i.e. the value of the width option.
We have improved performance when handling large data with many columns.
haven_labelled columns are now supported. These columns are summarized using skimmers for the underlying data, typically either numeric or character.
skim_list (most commonly generated by the partition() function) also inherits from a list.
knitr.
We've made to_long() generic, supporting a more intuitive interface. Called on a skim_df, it reshapes the output into the V1 long style.
skim_tee(). Thanks @sethlatimer for suggesting this feature.
tibble.
skimr::summarize().
Address a failed build on CRAN due to lack of UTF-8 support on some platforms.
V2 is a complete rewrite of skimr, incorporating all of the great feedback the
developers have received over the last year. A big thank you goes to @GShotwell,
@akraemer007, @puterleat, @tonyfischetti, @Nowosad, @rgayler, @jrosen48,
@randomgambit, @elben10, @koliii, @AndreaPi, @rubenarslan, @GegznaV, @svraka,
@dpprdan and to our ROpenSci reviewers @jenniferthompson and @jimhester for all
of the great support and feedback over the last year. We couldn't have done this
without you.
For most users, using skimr will not change in terms of visual outputs. However, for users who use skimr outputs as part of a larger workflow, the differences are substantial.
skim_df
We've changed the way data is represented within skimr to more closely match expectations. It is now wide by default. This makes piping statistics much simpler:
skim(iris) %>%
dplyr::filter(numeric.sd > 1)
This means that the old reshaping functions skim_to_wide() and skim_to_list() are deprecated. The latter is replaced with a reshaping function called partition() that breaks a skim_df into a list by data type. Similarly, yank() gets a specific data type from the skim_df. to_long() gets you data that is closest to the format in the old API.
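The three reshaping helpers described above can be sketched as follows. This is a minimal illustration that assumes skimr version 2 is installed and uses the built-in iris data; the variable names (skimmed, parts, numeric_summary, long) are chosen here for illustration only.

```r
library(skimr)

# Skim the built-in iris data; the result is a wide skim_df
skimmed <- skim(iris)

# partition() breaks the skim_df into a list with one element per data type
parts <- partition(skimmed)

# yank() extracts the summary for a single data type
numeric_summary <- yank(skimmed, "numeric")

# to_long() reshapes the skim_df into the format closest to the old API
long <- to_long(skimmed)
```

For iris, parts should hold one element for the numeric columns and one for the factor column, and numeric_summary should correspond to the numeric element.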
As the above example suggests, columns of summary statistics are prefixed by skim_type. That is, statistics from numeric columns all begin with "numeric.", those for factors all begin with "factor.", and so on.
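One way to see the prefixing convention is to inspect the column names of a skim_df directly. The sketch below assumes skimr version 2 and the built-in iris data; numeric_cols and factor_cols are illustrative names, not part of the package.

```r
library(skimr)

skimmed <- skim(iris)

# Columns holding statistics for numeric variables start with "numeric."
numeric_cols <- grep("^numeric\\.", names(skimmed), value = TRUE)

# Columns holding statistics for factor variables start with "factor."
factor_cols <- grep("^factor\\.", names(skimmed), value = TRUE)

print(numeric_cols)
print(factor_cols)
```

Because the prefixes are regular, the same pattern can be used to select every statistic for one data type in a larger pipeline.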
We've deprecated support for pander() and our kable() method. Instead, we now support knitr through the knit_print() API. This is much more seamless than before. Having a skim_df as the final object in a code chunk should produce nice results in the majority of RMarkdown formats.
We've deprecated the previous approach to customization. We no longer use skim_format(), and skim_with() no longer depends on a global state. Instead, skim_with() is now a function factory. Customization creates a new skimming function.
my_skim <- skim_with(numeric = sfl(mad = mad))
The fundamental tool for customization is the sfl object, a skimmer function list. It is used within skim_with() and also within our new API for adding default functions for new data types, the generic get_skimmers().
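To illustrate the function-factory behavior, the sketch below builds a custom skimming function and compares it with the default skim(). It assumes skimr version 2 and the built-in iris data; the names custom, default, has_mad_custom and has_mad_default are illustrative.

```r
library(skimr)

# skim_with() returns a new skimming function; no global state is modified
my_skim <- skim_with(numeric = sfl(mad = mad))

custom <- my_skim(iris)
default <- skim(iris)

# The extra statistic shows up as a prefixed column in the custom result only
has_mad_custom <- "numeric.mad" %in% names(custom)
has_mad_default <- "numeric.mad" %in% names(default)
```

Since customization lives in the returned function, several differently configured skimmers can coexist in one session without affecting each other.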
Most of the options set in skim_format are now either in function arguments or print arguments. The former can be updated using skim_with, the latter in a call to print(). In RMarkdown documents, you can change the number of displayed digits by adding the skimr_digits option to your code chunk.
summary(), and it is now incorporated into print() methods.
focus() is like dplyr::select(), but it keeps around the columns skim_type and skim_variable.
dplyr verbs to make sure that they play nice with skimr objects.
skimr has never really focused on performance, but it should do a better job on big data sets with lots of different columns.
skim_without_charts() as a fallback for when Unicode support is not possible.
skimr removes the tibble metadata when generating output. On some platforms, this can lead to all output getting removed. To disable that behavior, either set strip_metadata = FALSE when calling print() or use options(skimr_strip_metadata = FALSE).
This is likely to be the last release of skimr version 1. Version 2 has major changes to the API. Users should review and prepare for those changes now.
skim_with(.list = mylist) or skim_with(!!!mylist)
rlang: skim_with(iqr = ~IQR(.x, na.rm = TRUE)).
skim_with() to add and remove skimmers at the same time, i.e. skim_with(iqr = IQR, hist = NULL) works as expected.
spark_line() and spark_bar() are no longer exported.
min(x) and max(x) to quantile(x, probs = 0) and quantile(x, probs = 1). These changes lead to more predictable behaviors when a column is all NA values.
NAs threw an error
dplyr::do()
skim_v() is no longer exported. Vectors are now directly supported via skim.default().
kable() and pander() for skim_df objects.
skim_df objects.
skim.default().
dplyr::group_by()
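The note above about moving from min(x) and max(x) to quantile() for all-NA columns can be illustrated in base R. This sketch assumes the statistics are computed with na.rm = TRUE, which the changelog does not state; the variable names are illustrative.

```r
# A column that contains only missing values
x <- c(NA_real_, NA_real_)

# With NAs removed, min() and max() see an empty vector: they warn and
# return Inf and -Inf respectively
min_result <- suppressWarnings(min(x, na.rm = TRUE))
max_result <- suppressWarnings(max(x, na.rm = TRUE))

# quantile() returns NA for an empty input, without a warning
q_min <- quantile(x, probs = 0, na.rm = TRUE)
q_max <- quantile(x, probs = 1, na.rm = TRUE)
```

Returning NA keeps the summary of an all-NA column consistent with its contents, whereas the infinite values from min() and max() could be mistaken for real data.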