Modules as R Objects

2021-02-06

Introduction

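In this vignette you can find details on how modules behave as ordinary R objects: nesting, parameterized modules, dependency injection, mutable state, composition, unit tests, and the use of modules in packages.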

Modules as first class citizens in R

Modules are first class citizens in the sense that they can be treated like any other data structure in R. A module is represented as a list, so that:

library("modules")
m <- module({
  foo <- function() "foo"
})
is.list(m)
#> [1] TRUE
class(m)
#> [1] "module" "list"

S3 methods may be defined for the class module. The package itself only implements a method for the generic function print.
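
As a small illustration, further methods can be added in user code. The sketch below defines a summary method for the class module; summary.module is a hypothetical helper written for this example and is not provided by the package.

```r
library("modules")

# Hypothetical S3 method (not part of the package): summarise a module
# by listing the names of its exported members in alphabetical order.
summary.module <- function(object, ...) {
  sort(names(object))
}

m <- module({
  foo <- function() "foo"
  bar <- function() "bar"
})

# Dispatches on the class attribute c("module", "list")
summary(m)
#> [1] "bar" "foo"
```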

Nested Modules

Nested modules are modules defined inside other modules. In this case the dependencies of the top level module are accessible to its children.
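
A minimal sketch of a nested module follows. The names child and fun are chosen for illustration only. Note that module itself has to be imported into the top level module before it can be called there.

```r
library("modules")

m <- module({
  # Dependencies of the top level module
  import("stats", "median")
  import("modules", "module")

  # The nested module uses median without importing it again
  child <- module({
    fun <- function(x) median(x)
  })
})

m$child$fun(1:2)
#> [1] 1.5
```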

Modules as objects

Sometimes it can be useful to pass arguments to a module. If you have a background in object-oriented programming, you may find this natural. From a functional perspective, we define parameters shared by a list of closures. This is achieved by making the enclosing environment of the module available to the module itself.

m <- function(param) {
  amodule({
    fun <- function() param
  })
}
m(1)$fun()
#> [1] 1

amodule is a wrapper around module to abstract the following pattern:


m <- function(param) {
  module(topEncl = environment(), {
    fun <- function() param
  })
}
m(1)$fun()
#> [1] 1

Using one of these approaches you construct a local namespace definition with the option to pass down some arguments.

Dependency injection

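This can be very useful to handle dependencies between two modules. Instead of referring to module a directly inside the definition of module b, which would hard code the dependency, we can write a constructor function for b that takes a as an argument.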

There are many good reasons to follow such a strategy. As an example, consider the case in which module a introduces side effects. By leaving it open as an argument we can later decide what exactly we pass down to the constructor of b. This may be important to us when we want to mock a database, disable logging or otherwise handle access to external resources.
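
The strategy can be sketched as follows. The constructor B and the stand-in mockA are hypothetical names used for this illustration. The constructor relies on amodule, so the argument a is visible inside the module.

```r
library("modules")

# Module a: imagine that foo has side effects, e.g. it talks to a database
a <- module({
  foo <- function() "foo"
})

# Constructor for b: the dependency is an argument and not hard coded
B <- function(a) {
  amodule({
    foo <- function() a$foo()
  })
}

# Regular use: inject the real module
b <- B(a)
b$foo()
#> [1] "foo"

# Testing: inject a stand-in; any list with an element foo will do
mockA <- list(foo = function() "mock")
B(mockA)$foo()
#> [1] "mock"
```

Because b only ever calls a$foo(), the stand-in does not need to be a module at all, which keeps mocks cheap to write.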

Modules to model mutable state

You can put not only functions into your bag (module) but any R object. This includes data: modules can be stateful. To illustrate this we define a module, mutableModule, that encapsulates a value .num and has a get and a set method for it.
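
A minimal sketch of such a module, using the names mutableModule and .num from the text. The leading dot keeps .num out of the exported interface, so it can only be reached through get and set.

```r
library("modules")

mutableModule <- module({
  # .num starts with a dot: it is private and not exported
  .num <- NULL
  get <- function() .num
  # <<- assigns to .num in the scope of the module
  set <- function(val) .num <<- val
})

mutableModule$get()
#> NULL
mutableModule$set(2)
mutableModule$get()
#> [1] 2
```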

Another module can then use mutableModule and rebuild the interface to .num.

Depending on your expectations, it may come as a surprise that we can get and set that value from an attached module while it remains unchanged in mutableModule. This is because use triggers a re-initialization of any module you plug in. You can override this behaviour by setting reInit = FALSE in the call to use.
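
The difference between the two settings can be sketched as follows. The module names freshCopy and shared are hypothetical, and for simplicity the sketch assigns the result of use to a name instead of attaching it.

```r
library("modules")

mutableModule <- module({
  .num <- NULL
  get <- function() .num
  set <- function(val) .num <<- val
})
mutableModule$set(2)

# Default (reInit = TRUE): use works on a re-initialized copy
freshCopy <- module({
  state <- use(mutableModule)
  getNum <- function() state$get()
  state$set(3)
})
freshCopy$getNum()
#> [1] 3
mutableModule$get()   # the original is untouched
#> [1] 2

# reInit = FALSE: both modules share the same state
shared <- module({
  state <- use(mutableModule, reInit = FALSE)
  getNum <- function() state$get()
  state$set(3)
})
shared$getNum()
#> [1] 3
mutableModule$get()   # the original has been changed
#> [1] 3
```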

Module composition

In contrast to object-oriented systems, modules do not provide a formal mechanism of inheritance. Inheritance is often used to reuse code or to add functionality to an existing definition. With modules we can achieve both using various modes of composition instead.

In this context we may use parameterized modules, use, expose and extend. The first two have already been discussed, as has dependency injection as a strategy to encode relationships between modules.

expose is most useful when we want to re-export functions from another module:
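
A minimal sketch of this pattern, assuming two hypothetical constructor functions A and B:

```r
library("modules")

A <- function() {
  amodule({
    foo <- function() "foo"
  })
}

B <- function(a) {
  amodule({
    expose(a)                 # re-export everything that a exports
    bar <- function() "bar"   # and add something new
  })
}

b <- B(A())
b$foo()
#> [1] "foo"
b$bar()
#> [1] "bar"
```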

Here we can easily add functionality to a module, or reuse only parts of it. Another way to achieve this is to use extend. The difference is that with expose we re-export existing functionality unchanged, whereas with extend we add lines of code to an existing module definition. This means we can (a) override private members of that module and (b) generally gain access to all implementation details. Hence the following two definitions are equivalent:

Variant A

Variant B

extend should be used with great care. It is possible, and easy, to break the functionality of the module you extend. This is not possible, or at least more challenging, using expose.
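
To make the risk concrete, here is a hedged sketch in which extend overrides a private member. The member .greeting is a hypothetical name introduced for this example.

```r
library("modules")

a <- module({
  .greeting <- "foo"            # private member
  foo <- function() .greeting
})

# extend appends code to the definition of a and returns a new module
b <- extend(a, {
  .greeting <- "FOO"            # overrides the private member
  bar <- function() "bar"       # adds a new function
})

a$foo()   # the original module is unchanged
#> [1] "foo"
b$foo()   # foo now sees the overridden private member
#> [1] "FOO"
b$bar()
#> [1] "bar"
```

With expose, foo would be re-exported together with its original scope, so its behaviour could not be altered in this way.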

Unit tests for modules

The real use case for extend is to add unit tests to a module. You can think of using one of two patterns:

Variant A

Variant B

The latter alternative keeps the interface clean and gives access to private member functions. Sometimes this can be very useful for testing.
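
A sketch of testing through extend, assuming a hypothetical module calc with a private helper .double. The assertions run when the extended module is constructed, so a failing test stops with an error.

```r
library("modules")

calc <- module({
  .double <- function(x) 2 * x                  # private helper
  quadruple <- function(x) .double(.double(x))  # exported
})

# The tests live outside the module definition but run inside its scope,
# so they can reach the private helper .double.
tested <- extend(calc, {
  stopifnot(.double(2) == 4)
  stopifnot(quadruple(2) == 8)
})

# The interface of calc itself contains no test code
names(calc)
#> [1] "quadruple"
```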

Modules in Packages

Of course, a good way to write R code is to write packages. Modules inside packages make a lot of sense because a package, too, gives us only one scope to work with. Modules provide additional, local scopes within it.

If you write constructor functions for your modules (see example below) you automatically take advantage of R CMD check. R CMD check will provide some static code analysis tools which are generally helpful.
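
A constructor function of this kind might look as follows. This is a sketch under assumptions: the file name R/greeter.R and the names Greeter and greet are hypothetical, and the package is assumed to list modules under Imports in its DESCRIPTION file.

```r
# Hypothetical file R/greeter.R inside a package.
# amodule is called with an explicit namespace prefix, so no call to
# library or modules::import is needed.
Greeter <- function(greeting = "Hello") {
  modules::amodule({
    greet <- function(name) paste0(greeting, ", ", name, "!")
  })
}

# Usage, e.g. in a test or at the console
Greeter("Hi")$greet("R")
#> [1] "Hi, R!"
```

Since Greeter is an ordinary function in the package namespace, R CMD check can analyse it like any other function.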

As you would avoid using library inside of packages, you should also avoid using modules::import. The R package namespace mechanism is more than capable of handling all dependencies.