Writing some tests

 

Pre-requisite reading

Disambiguation of terminology and testthat keywords

The table below summarises some concepts of unit testing and their implementation in the "testthat" package.

Concepts

Description

testthat keywords

contexts

Groups some unit tests together under a common purpose. In R, the context is shown in the output of the tests. DataSHIELD uses some naming labels to improve the readability of the testing report.

syntax: context

argument: a context name, given as a character string

example:

    context("suitable properties of vectors")

 

unit tests

A unit test groups one or more expectations. It establishes whether a certain expected behaviour occurs.

syntax: test_that

arguments:

  1. the name of the test, given as a character string

  2. the body of the test

example:

    test_that("suitable length vector", {
      x <- c(1.91817353, 2.92817, 1.59861)
      expect_equal(length(x), 3)
    })

equal expectation

This expectation answers the question: is the outcome equal to a value?

syntax: expect_equal

arguments:

  1. outcome of function

  2. expected value

  3. tolerance

example:

    test_that("mean of a vector", {
      x <- c(1, 2, 1.5)
      average <- sum(x) / length(x)
      expect_equal(mean(x), average, tolerance = 10^-4)
    })
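To see how the tolerance argument behaves, the sketch below compares values that differ by less and by more than the tolerance. It only assumes that the testthat package is installed; the variable names and numbers are illustrative and are not taken from the DataSHIELD framework. expect_failure is the testthat helper that succeeds only when the wrapped expectation fails.

```r
library(testthat)

# Illustrative values only. The reference value is 1, so the relative
# and the absolute difference between the numbers coincide.
reference   <- 1
close.value <- reference + 1e-7   # differs by less than the tolerance
far.value   <- reference + 1e-2   # differs by more than the tolerance

test_that("tolerance controls how close two numbers must be", {
  # passes: the difference (1e-7) is below the tolerance (1e-6)
  expect_equal(close.value, reference, tolerance = 1e-6)
  # expect_failure() succeeds only if the wrapped expectation fails
  expect_failure(expect_equal(far.value, reference, tolerance = 1e-6))
})
```

A tolerance that is too loose can hide genuine discrepancies, so it is worth choosing the smallest value that the computation can reliably meet.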

error expectation

This expectation answers the question: does the code produce an error?

DataSHIELD can throw errors from the server or from other DataSHIELD functions, which makes it challenging to capture a specific error. For that reason, it is more suitable to check that any error is thrown.

syntax: expect_error

argument: function or R expression

example:

    test_that("vector element does not exist", {
      x <- c(1, 2, 1.5)
      # x[[4]] is out of bounds and throws an error; x[4] would only return NA
      expect_error(x[[4]] * 5)
    })
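The "any error" approach can be sketched as follows. The function failing.call is a hypothetical stand-in for a call that fails on the server side; it is not a DataSHIELD function. The second test illustrates which kind of vector indexing actually raises an error in R.

```r
library(testthat)

# ASSUMPTION: hypothetical stand-in for a call that fails on the server
# side; it is not part of DataSHIELD.
failing.call <- function() {
  stop("some server-side message that may change between versions")
}

test_that("any error is accepted", {
  # no regexp is given, so the text of the message is not inspected
  expect_error(failing.call())
})

test_that("indexing beyond the end of a vector", {
  x <- c(1, 2, 1.5)
  # single brackets return NA instead of failing
  expect_true(is.na(x[4]))
  # double brackets throw an error when the element does not exist
  expect_error(x[[4]])
})
```

Because no message pattern is passed to expect_error, the tests keep passing if the wording of the error changes.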

 

Testing data

The testing virtual machine contains some data to be used for testing. Some R functions have also been developed to help testers connect to those datasets.

Steps to write some tests for DataSHIELD

It is assumed that DataSHIELD and its testing framework have been fully installed. Further information is available on this page: Installing the testing framework.

The design phase

  1. Identify an R function or some computations that can be completed locally to obtain some expected values.

  2. Identify all the mathematical properties that would thoroughly test a DataSHIELD function. These should include tests that assess the outcome properties outside the "comfort zone" of the DataSHIELD programmer.

  3. From this list, identify some possible unit tests, each with a specific definition.
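As an illustration of the design steps above, the sketch below takes the arithmetic mean as the function under study. It computes an expected value independently and then lists a few mathematical properties that could each become a unit test. The data and the choice of properties are assumptions made for the example; they run entirely locally and do not involve a DataSHIELD server.

```r
library(testthat)

# Illustrative data; in practice the values would come from the testing
# datasets. Negative values and zero are included on purpose.
x <- c(-3, 0, 2.5, 10)

test_that("mean agrees with a value computed independently", {
  expected <- sum(x) / length(x)
  expect_equal(mean(x), expected)
})

test_that("mean satisfies some mathematical properties", {
  # shifting every value by a constant shifts the mean by the same constant
  expect_equal(mean(x + 7), mean(x) + 7)
  # scaling every value scales the mean by the same factor
  expect_equal(mean(-2 * x), -2 * mean(x))
  # the mean of a constant vector is that constant
  expect_equal(mean(rep(4.2, 5)), 4.2)
  # the mean lies between the smallest and the largest value
  expect_true(min(x) <= mean(x) && mean(x) <= max(x))
})
```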

The coding phase of the definition of tests

The definition of tests is explained in the technical documentation: Definition of tests

  1. Create a definition file. Save it as def-[name of function].R in the /tests/testthat/ folder.

  2. Add the suitable R scripts that give access to the functions for connecting to the TESTING datasets and to some comparison tools.

  3. For each unit test, write an R function that is general enough to be used with the TESTING datasets used locally and remotely.

Example of definition script for /tests/testthat/def-ds.mean.R

Part a: The source function makes available all the functions of def-assign-stats.R

Part b: This function computes an arithmetic mean locally using the R function mean. The same centrality measure is computed using the DataSHIELD function ds.mean. The expect_equal function compares both results. The test passes if both results agree within the level of tolerance set in the testing framework (i.e., 10^-6).

 

    # Part a
    source("definition_tests/def-assign-stats.R")

    # Part b
    .test.mean.combined <- function(variable.name, some.values)
    {
      mean.local <- mean(some.values)
      mean.server <- ds.mean(x = variable.name, type = 'combine',
                             check = TRUE, save.mean.Nvalid = FALSE,
                             datasources = ds.test_env$connection.opal)
      expect_equal(mean.server[[1]][1], mean.local, tolerance = ds.test_env$tolerance)
    }
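The pattern of the definition can be tried without any server by replacing the remote call with a local stand-in. In the sketch below, fake.ds.mean, .test.mean.local.sketch and the locally built ds.test_env are all hypothetical names introduced for illustration; the real ds.test_env is created by the testing framework, and the stub only mimics the shape of the result that the definition indexes with [[1]][1].

```r
library(testthat)

# ASSUMPTION: a local stand-in for the testing environment. The real
# ds.test_env is created by the DataSHIELD testing framework.
ds.test_env <- new.env()
ds.test_env$tolerance <- 10^-6
ds.test_env$local.values <- data.frame(INTEGER = c(1, 4, 7, 10))

# ASSUMPTION: a hypothetical stub that mimics only the shape of the
# result indexed in the definition; it does not contact a server.
fake.ds.mean <- function(x, ...) {
  column <- sub("^D\\$", "", x)   # "D$INTEGER" becomes "INTEGER"
  list(c(mean(ds.test_env$local.values[[column]])))
}

# Same structure as the definition, with the remote call swapped out.
.test.mean.local.sketch <- function(variable.name, some.values) {
  mean.local  <- mean(some.values)
  mean.server <- fake.ds.mean(x = variable.name)
  expect_equal(mean.server[[1]][1], mean.local,
               tolerance = ds.test_env$tolerance)
}

test_that("definition can be exercised without a server", {
  .test.mean.local.sketch("D$INTEGER", ds.test_env$local.values[, 1])
})
```

Keeping the comparison inside one function means the same check can be reused for every variable and for every set of connections.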

The coding phase of the tests

The technical documentation provides the required information and naming convention on types of tests and testing data: Classes of tests and naming convention and Dataset for testing.

  1. Create some test scripts. These are saved in the /tests/testthat folder using the naming convention for test files.

  2. Add the suitable R scripts that give access to the functions for connecting to the TESTING datasets and to some comparison tools.

  3. Write your tests using some contexts, unit tests, some testing datasets, and some test definitions (see example below).

 

Example of /tests/testthat/test-expt-ds.mean.R

Part a: The source function makes available the test definition written for a function.

Part b: This part simulates the connections to three Opal servers; the TESTING datasets are used, that is, TESTING.DATASET1, TESTING.DATASET2 and TESTING.DATASET3. Locally, the equivalent data are loaded into three data frames and a combined one, using the three sets of data. Both sets of data are the same.

Part c: A context is defined using the naming convention. It will be shown in the test output provided by devtools.

Part d: A unit test is defined. The body of the test repeatedly calls the .test.mean.combined function (see above). The first argument (i.e., the variable.name argument), which starts with the characters D$, refers to a column of the harmonised testing datasets on the server. The second argument (i.e., the some.values argument) uses the corresponding data made available locally by the testing environment.

Part e: This part connects to a single Opal server and uses only one of the TESTING datasets, that is, TESTING.DATASET1. Locally, the equivalent data are loaded into one data frame. Both sets of data are the same.

Part f: This part of the script has the same purpose as parts c and d, with a single server.

    # part a
    source("connection_to_datasets/init_all_datasets.R")
    source("definition_tests/def-ds.mean.R")

    # part b
    connect.all.datasets()

    # part c
    context("ds.mean()::expt::combine::multiple")

    # part d
    test_that("combined data set",
    {
      .test.mean.combined('D$INTEGER', ds.test_env$local.values[,6])
      .test.mean.combined('D$NON_NEGATIVE_INTEGER', ds.test_env$local.values[,7])
      .test.mean.combined('D$POSITIVE_INTEGER', ds.test_env$local.values[,8])
      .test.mean.combined('D$NEGATIVE_INTEGER', ds.test_env$local.values[,9])
      .test.mean.combined('D$NUMERIC', ds.test_env$local.values[,10])
      .test.mean.combined('D$NON_NEGATIVE_NUMERIC', ds.test_env$local.values[,11])
      .test.mean.combined('D$POSITIVE_NUMERIC', ds.test_env$local.values[,12])
      .test.mean.combined('D$NEGATIVE_NUMERIC', ds.test_env$local.values[,13])
    })

    # part e
    connect.dataset.1()

    # part f
    context("ds.mean()::expt::single")
    test_that("combined data set",
    {
      .test.mean.combined('D$INTEGER', ds.test_env$local.values.1[,6])
      .test.mean.combined('D$NON_NEGATIVE_INTEGER', ds.test_env$local.values.1[,7])
      .test.mean.combined('D$POSITIVE_INTEGER', ds.test_env$local.values.1[,8])
      .test.mean.combined('D$NEGATIVE_INTEGER', ds.test_env$local.values.1[,9])
      .test.mean.combined('D$NUMERIC', ds.test_env$local.values.1[,10])
      .test.mean.combined('D$NON_NEGATIVE_NUMERIC', ds.test_env$local.values.1[,11])
      .test.mean.combined('D$POSITIVE_NUMERIC', ds.test_env$local.values.1[,12])
      .test.mean.combined('D$NEGATIVE_NUMERIC', ds.test_env$local.values.1[,13])
    })
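The repeated calls in the test body pair a server-side variable name with a column of the local data. One possible way to make that pairing explicit is a lookup vector, sketched below. This is not how the DataSHIELD test scripts are written: .test.mean.sketch, local.values and columns are hypothetical names, the data are made up, and only the column positions mirror those used in the script.

```r
library(testthat)

# ASSUMPTION: hypothetical stand-in for the test definition; the real
# .test.mean.combined compares a server-side mean with a local one.
.test.mean.sketch <- function(variable.name, some.values) {
  expect_true(is.numeric(mean(some.values)), info = variable.name)
}

# ASSUMPTION: made-up local data with 13 columns, so that the column
# positions 6 to 9 exist as in the script.
local.values <- as.data.frame(matrix(seq_len(26), nrow = 2, ncol = 13))

# server-side variable name -> column index of the local data
columns <- c("D$INTEGER" = 6, "D$NON_NEGATIVE_INTEGER" = 7,
             "D$POSITIVE_INTEGER" = 8, "D$NEGATIVE_INTEGER" = 9)

test_that("combined data set (table-driven sketch)", {
  for (variable.name in names(columns)) {
    .test.mean.sketch(variable.name,
                      local.values[, columns[[variable.name]]])
  }
})
```

The explicit calls in the script are easier to read in a failure report, whereas a lookup vector keeps the mapping in one place; which trade-off suits a given test file is a matter of judgement.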

Running the tests

Some instructions are available on this page: Running tests locally