DataSHIELD Administration

Yannick Marcon

2022-06-09

Opal is the reference implementation of the DataSHIELD infrastructure. All DataSHIELD administration tasks can be performed programmatically using the functions whose names start with dsadmin.

See also the Opal DataSHIELD Administration documentation to learn how to administrate DataSHIELD using the graphical user interface.

Setup

Set up the connection with Opal:

library(opalr)
o <- opal.login("administrator", "password", "https://opal-demo.obiba.org")

R Packages

Opal can handle clusters of R servers. R packages are handled at the cluster level (because all the R servers in a cluster are expected to be identical). See Opal and R server documentation.

List installed DataSHIELD R packages:

dsadmin.package_descriptions(o, profile = "default")

Install a DataSHIELD R package from the configured CRAN repositories (most likely the DataSHIELD repo):

dsadmin.install_package(o, pkg = "dsBase", profile = "default")

R packages whose source code is on GitHub can be installed directly:

dsadmin.install_github_package(o, pkg = "dsSurvival", username = "neelsoumya", ref = "v1.0.0", profile = "default")

When developing a new DataSHIELD R package, it can be built and installed as follows (from the root of the R package source directory):

dsadmin.install_local_package(o, devtools::build(), profile = "default")

To remove a DataSHIELD R package:

dsadmin.remove_package(o, pkg = "dsSurvival", profile = "default")

Note that removing a package does not update the DataSHIELD settings of the associated profiles. See the following sections to administrate the profiles and their settings.

Profiles

A DataSHIELD profile is based on a cluster of R servers. In the simplest setup, there is a single cluster made of one R server; this cluster is called default, and the profile has the same name.

It is possible to define several profiles based on the same cluster. Each profile then has its own DataSHIELD settings and can be restricted to its own group of users.

To list the DataSHIELD profiles:

dsadmin.profiles(o)

To create a new DataSHIELD profile, initialize it with the DataSHIELD settings declared by the installed packages, and enable it, use the following:

# ensure the profile does not exist
if (dsadmin.profile_exists(o, "demo"))
  dsadmin.profile_delete(o, "demo")
# create a profile, disabled
dsadmin.profile_create(o, "demo", cluster = "default")
# make only dsBase and resourcer packages visible
dsadmin.profile_init(o, "demo", packages = c("dsBase", "resourcer"))
# ready to be used
dsadmin.profile_enable(o, "demo")

When a DataSHIELD R package is installed but should be used only by a restricted group of users, proceed as follows:

dsadmin.profile_perm_add(o, "demo", subject = "testers", type = "group")
# verify permissions
dsadmin.profile_perm(o, "demo")

Settings

The DataSHIELD settings are defined per profile (see the Profiles section). They can be minimally initialized by reading the settings declared by the installed DataSHIELD R packages, and amended afterwards.

Methods

The DataSHIELD methods define the allowed function calls and their mapping to a server-side function call.

To list the aggregation functions:
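A minimal sketch, assuming the connection o opened in the Setup section and the default profile. dsadmin.get_methods() takes the method type ("aggregate" or "assign") and the profile name:

```r
library(opalr)
# `o` is assumed to be the connection opened in the Setup section
# list the aggregation methods of the "default" profile
dsadmin.get_methods(o, type = "aggregate", profile = "default")
# the assignment methods can be listed the same way
dsadmin.get_methods(o, type = "assign", profile = "default")
```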

Fully custom settings can be defined (useful for developers).
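As an illustration, the following sketch registers a custom aggregation method named hello on the demo profile. The function body is only an example, and the profile name is an assumption carried over from the Profiles section:

```r
library(opalr)
# `o` is assumed to be the connection opened in the Setup section
# register a custom aggregation method named "hello" (illustrative body)
dsadmin.set_method(o, "hello",
                   func = function(x) { paste0("Hello ", x, "!") },
                   type = "aggregate", profile = "demo")
# check that it is now listed among the profile's aggregation methods
dsadmin.get_methods(o, type = "aggregate", profile = "demo")
```

Such a method can later be removed with dsadmin.rm_method(o, "hello", profile = "demo").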

A simple test of our custom hello() function would be:
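One possible way to run this test is from a DataSHIELD client session, using the DSI and DSOpal packages. The sketch below assumes that a hello aggregation method has been registered on the demo profile, and that the user (the name dsuser and its password are placeholders) is allowed to use DataSHIELD and this profile:

```r
library(DSI)
library(DSOpal)
# describe the connection; user name and password are placeholders
builder <- DSI::newDSLoginBuilder()
builder$append(server = "study1", url = "https://opal-demo.obiba.org",
               user = "dsuser", password = "password", profile = "demo")
logindata <- builder$build()
# open the DataSHIELD session against the "demo" profile
conns <- DSI::datashield.login(logins = logindata)
# call the custom aggregation method on the server side
DSI::datashield.aggregate(conns, quote(hello("friends")))
# close the DataSHIELD session
DSI::datashield.logout(conns)
```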

Options

The DataSHIELD R options affect the behaviour of some methods.
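The options currently defined for a profile can be inspected with dsadmin.get_options(). A minimal sketch, assuming the connection o from the Setup section and the default profile:

```r
library(opalr)
# `o` is assumed to be the connection opened in the Setup section
# list the R options set for the "default" profile
dsadmin.get_options(o, profile = "default")
```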

To modify an R option:
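A sketch using dsadmin.set_option(). The option name and value below are illustrative (datashield.privacyLevel is assumed to be one of the options declared by dsBase), and the demo profile is the one created in the Profiles section:

```r
library(opalr)
# `o` is assumed to be the connection opened in the Setup section
# set an option on the "demo" profile (illustrative name and value)
dsadmin.set_option(o, "datashield.privacyLevel", "10", profile = "demo")
# verify the change
dsadmin.get_options(o, profile = "demo")
```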

R Parser

An advanced setting, mostly for backward compatibility, is the possibility to choose which R parser Opal uses when validating the submitted R code (remember that only a subset of the R language is allowed). See the datashield4j library documentation for possible values.

To set the legacy R parser:
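A sketch using dsadmin.profile_rparser(). The value "v1" is assumed here to designate the legacy parser; the datashield4j documentation remains the reference for the accepted values:

```r
library(opalr)
# `o` is assumed to be the connection opened in the Setup section
# select the R parser version for the "demo" profile
# ("v1" is assumed to be the legacy parser identifier)
dsadmin.profile_rparser(o, "demo", rParser = "v1")
```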

Users and Permissions

DataSHIELD requires two kinds of permissions: permission to access the data (whether in a table or a resource) and permission to use the DataSHIELD service.

Users

To facilitate permissions maintenance, create users in appropriate group(s). Groups can represent data access and DataSHIELD service access.
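For instance, a user can be created directly in two groups, one intended for data access and one for DataSHIELD service access. The user name userx and the group names demo and datashield are hypothetical; when no password is supplied, oadmin.user_add() is expected to generate one and return it:

```r
library(opalr)
# `o` is assumed to be the connection opened in the Setup section
# create a user in a data access group and a DataSHIELD access group
# (hypothetical names); keep the generated password to pass it on
pwd <- oadmin.user_add(o, "userx", groups = c("demo", "datashield"))
# list the registered users
oadmin.users(o)
```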

Permissions

To set DataSHIELD-compatible permissions (view without accessing individual-level data) on each table of a project, use the following:
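A sketch that loops over the tables of a project and grants the view permission to a group. The project name CNSIM and the group name demo are assumptions made for illustration:

```r
library(opalr)
# `o` is assumed to be the connection opened in the Setup section
# grant the "view" permission (no access to individual values)
# on every table of the project to the "demo" group
lapply(opal.tables(o, "CNSIM")$name, function(table) {
  opal.table_perm_add(o, "CNSIM", table, subject = "demo",
                      type = "group", permission = "view")
})
```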

Similarly, granting permission to use all the resources of a project in a DataSHIELD context is even simpler:
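A sketch using opal.resources_perm_add(), which applies to all the resources of a project at once. The project name RSRC and the group name demo are assumptions made for illustration:

```r
library(opalr)
# `o` is assumed to be the connection opened in the Setup section
# grant the "view" permission on all resources of the project
# to the "demo" group
opal.resources_perm_add(o, "RSRC", subject = "demo",
                        type = "group", permission = "view")
```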

Then grant permission to use the DataSHIELD service to a group of users:
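A sketch using dsadmin.perm_add(); the group name datashield is hypothetical:

```r
library(opalr)
# `o` is assumed to be the connection opened in the Setup section
# allow the members of the "datashield" group to use the service
dsadmin.perm_add(o, subject = "datashield", type = "group",
                 permission = "use")
# verify the DataSHIELD service permissions
dsadmin.perm(o)
```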

Note that it is also possible to grant permission to access a specific DataSHIELD profile (see Profiles section).

Teardown

Good practice is to free server resources by sending a logout request:

opal.logout(o)