checkr Naming Schemes

Joe Thorley

2019-04-24

The checkr packages provides a series of functions that can be used to check the values and classes of common R objects. Before attempting to use the functions it is worth understanding the checkr function naming schemes.

Primary Objects

As discussed in the Data Structures chapter in Advanced R by Hadley Wickham, the five main object types are defined by their dimensionality (1d, 2d and nd) and whether they are homogenous or heterogenous. Currently the check_vector(), check_list() and check_data() functions allow the user to check the type and values of the three main type of (atomic) vectors, lists and data frames. The check_matrix() and check_array() equivalents have yet to be implemented. The functions are named according to the type of the object they check.

It is worth noting that a while a vector, matrix or array will definitely pass the check_homogenous() function, a list or data frame may pass it if all its elements have identical classes.

Scalars

In addition to (atomic) vectors the checkr package also identifies scalars which in R are (atomic) vectors with one element. They can be tested using the check_scalar() function.

Often function arguments are scalars of a particular type. For example the default mean() function has an argument na.rm which is a logical indicating whether NA values should be stripped. The checkr package provides the following functions to test whether objects are non-missing scalars of particular types:

It is also worth noting that almost all the scalar functions include an argument coerce which if set to TRUE tests whether the object satisfies the condition after coercing to the required type. If the test passes then the function returns the coerced object.

Other Objects

Functions to test whether an object is another type of object include check_null(), check_environment() and check_function().

In addition the check_classes() and check_inherits() functions can be used to check the classes or inheritance of an object.

Function Functions

Many other checking functions are named according to the main base function that they use. Thus the check_nrow() function uses the nrow() function to check the number of rows while the check_nlevels() function uses the nlevels() function. It is important to realize that these functions do not check the type of an object but simply whether the return value of the base function(s) is consistent with the user defined conditions.

Functions which fall in this category include check_class(), check_colnames(), check_length(), check_levels(), check_names(), check_nchar(), check_ncol(), check_nlevels(), check_nrow() and check_sorted().

Data Functions

The check_join() function tests whether the columns in one data frame form a many-to-one join with corresponding columns in a second data frame. The check_key() function tests whether specific columns are unique. They are useful for testing whether a set of data frames can be archived in an relational database.

Miscellaneous Functions

The remaining functions which are not captured by the above naming schemes include checkor(), check_named(), check_pattern(), check_regex(), check_sorted(), check_tzone(), check_unique(), check_unnamed() and check_unused().