Writing some tests
Pre-requisite reading
Disambiguation of terminology and testthat keywords
The table below summarises some unit-testing concepts and their implementation in the testthat package.
Concepts | Description | testthat keywords |
---|---|---|
contexts | Groups some unit tests together under a common purpose. In R, the context is shown in the output of the tests. DataSHIELD uses naming labels to improve the readability of the testing report. | syntax: context; argument: a context name as a character string; example: context("suitable properties of vectors") |
unit tests | An expectation expressed as one or more tests, which establish whether a certain expected behaviour occurs. | syntax: test_that; arguments: a description (character string) and a block of code; example: test_that("suitable length vector", { x <- c(1.91817353, 2.92817, 1.59861); expect_equal(length(x), 3) }) |
equal expectation | This expectation answers the question: is the outcome equal to a value? | syntax: expect_equal; arguments: the computed value, the expected value and, optionally, a tolerance; example: test_that("mean of a vector", { x <- c(1, 2, 1.5); average <- sum(x)/length(x); expect_equal(mean(x), average, tolerance = 10^-4) }) |
error expectations | This expectation answers the question: does the code produce an error? DataSHIELD can throw errors from the server or from other DataSHIELD functions, and capturing a specific error is challenging. For that reason, it is more suitable to check that any error is thrown. | syntax: expect_error; argument: a function call or R expression; example: |
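
The keywords summarised in the table can be sketched together in one small script. This is a minimal illustration rather than DataSHIELD code: it assumes the testthat package is installed, and the helper safe.mean is a hypothetical function introduced only to have something that throws an error.

```r
library(testthat)

# In a test file, a context is usually declared first, for example:
#   context("suitable properties of vectors")
# It is left as a comment here because recent testthat editions deprecate
# context() and derive the context from the file name instead.

x <- c(1, 2, 1.5)
average <- sum(x) / length(x)

# Unit test with an equal expectation on the length of the vector.
test_that("suitable length vector", {
  expect_equal(length(x), 3)
})

# Equal expectation with an explicit tolerance.
test_that("mean of a vector", {
  expect_equal(mean(x), average, tolerance = 10^-4)
})

# Hypothetical helper (not part of DataSHIELD): stops on non-numeric input.
safe.mean <- function(values) {
  if (!is.numeric(values)) {
    stop("values must be numeric")
  }
  mean(values)
}

# Error expectation: only checks that some error is thrown,
# without matching a specific message.
test_that("non-numeric input throws an error", {
  expect_error(safe.mean("not a number"))
})
```

Checking that any error is thrown, rather than matching a message, mirrors the recommendation in the table for errors raised by the server.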
Testing data
The testing virtual machine contains some testing data. Some R functions have also been developed to help testers connect to those datasets.
Steps to write tests for DataSHIELD
It is assumed that DataSHIELD and its testing framework have been fully installed. Further information is available on this page: Installing the testing framework.
The design phase
Identify an R function or the computations to complete locally to obtain some expected values.
Identify all the mathematical properties that would thoroughly test a DataSHIELD function. These should include tests that assess the outcome properties outside the "comfort zone" of the DataSHIELD programmer.
From this list, identify some possible unit tests, each with a specific definition.
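
As an illustration of the design phase, the sketch below lists a few mathematical properties of the arithmetic mean and turns each into a candidate unit test. It runs entirely locally with base R and testthat; the chosen values and properties are assumptions made for this example, not a list taken from the DataSHIELD test suite.

```r
library(testthat)

# Expected value obtained from a local computation.
some.values <- c(1.91817353, 2.92817, 1.59861)
expected.mean <- sum(some.values) / length(some.values)

test_that("mean matches the value computed by hand", {
  expect_equal(mean(some.values), expected.mean)
})

test_that("mean lies between the minimum and the maximum", {
  expect_true(mean(some.values) >= min(some.values))
  expect_true(mean(some.values) <= max(some.values))
})

test_that("shifting the data shifts the mean by the same amount", {
  expect_equal(mean(some.values + 10), mean(some.values) + 10)
})

# A case outside the usual range of values: a large negative constant.
test_that("mean of a constant vector is the constant", {
  expect_equal(mean(rep(-1e6, 50)), -1e6)
})
```

Each test_that block corresponds to one property, which keeps the link between the design list and the unit tests easy to follow.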
The coding phase of the definition of tests
The definition of tests is explained in the technical documentation: Definition of tests
Create a definition file. Save it as def-[name of function].R in the /tests/testthat/ folder.
Source the R scripts that provide the functions for connecting to the TESTING datasets and the comparison tools.
For each unit test, write an R function that is general enough to work with the TESTING datasets both locally and remotely.
Example of definition script for /tests/testthat/def-ds.mean.R
Part a: The source function makes available all the functions of def-assign-stats.R
Part b: This function computes an arithmetic mean locally using the R function. The same centrality measure is computed using the DataSHIELD function. The expect_equal function compares both results. The test passes if the two results agree within the level of tolerance set in the testing framework (i.e., 10^-6).
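
The structure described in parts a and b might look like the sketch below. It is not the original definition file: the DataSHIELD call is replaced by a hypothetical stand-in (.remote.mean.stub) so that the code runs without a server, the variable name "D$INTEGER" is an assumed example, and the tolerance is hard-coded here although the text says it is set in the testing framework.

```r
library(testthat)

# Part a (in the real file): make the shared helpers available, e.g.
#   source("def-assign-stats.R")
# Left as a comment because that file is not part of this sketch.

# Tolerance quoted in the text; assumed to be set centrally in the framework.
tolerance.level <- 10^-6

# Hypothetical stand-in for the DataSHIELD computation, so the sketch runs
# offline. A real definition would call the DataSHIELD client function here.
.remote.mean.stub <- function(variable.name, some.values) {
  mean(some.values)
}

# Part b: compare the locally computed mean with the "remote" one.
.test.mean.combined <- function(variable.name, some.values,
                                remote.mean = .remote.mean.stub) {
  mean.local <- mean(some.values)
  mean.remote <- remote.mean(variable.name, some.values)
  expect_equal(mean.remote, mean.local, tolerance = tolerance.level)
}

# Usage: the definition is called from inside a unit test.
test_that("combined mean agrees with the local mean", {
  .test.mean.combined("D$INTEGER", c(1, 2, 3, 4))
})
```

Passing the remote computation as an argument is a design choice of this sketch only; it makes the definition testable without a connection.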
The coding phase of the tests
The technical documentation provides the required information and naming convention on types of tests and testing data: Classes of tests and naming convention and Dataset for testing.
Create some test scripts. Save them in the /tests/testthat/ folder, using the naming convention for test files.
Source the R scripts that provide the functions for connecting to the TESTING datasets and the comparison tools.
Write your tests using contexts, unit tests, testing datasets, and test definitions (see example below).
Example of /tests/testthat/test-expt-ds.mean.R
Part a: The source function makes available the test definition written for a function.
Part b: This part simulates the connections to three Opal servers, using the TESTING datasets TESTING.DATASET1, TESTING.DATASET2 and TESTING.DATASET3. Locally, the equivalent data are loaded into three data frames and into a combined one built from the three sets of data. The local and remote data are the same.
Part c: A context is defined using the naming convention. It will be shown in the test output provided by devtools.
Part d: A unit test is defined. The body of the test repeatedly calls the .test.mean.combined function (see above). The first argument (i.e., the variable.name argument), which follows the pattern D$, refers to a column of the harmonised testing datasets on the server. The second argument (i.e., the some.values argument) uses the corresponding data available locally, which are provided by the testing environment.
Part e: This part connects to a single Opal server and uses one TESTING dataset, that is TESTING.DATASET1. Locally, the equivalent data are loaded into a single data frame. Both sets of data are the same.
Part f: This part of the script has the same purpose as parts c and d, but with a single server.
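
The layout of such a test script can be sketched as follows. Everything here is a local simulation under stated assumptions: the three data frames, their column names (NUMERIC, INTEGER) and their values are hypothetical, and .test.mean.combined is redefined with a placeholder instead of a real DataSHIELD call so that the script runs without Opal servers.

```r
library(testthat)

# Part b (simulated): hypothetical local copies of the three testing datasets.
local.dataset1 <- data.frame(NUMERIC = c(1.5, 2.5, 3.5), INTEGER = c(1L, 2L, 3L))
local.dataset2 <- data.frame(NUMERIC = c(4.5, 5.5, 6.5), INTEGER = c(4L, 5L, 6L))
local.dataset3 <- data.frame(NUMERIC = c(7.5, 8.5, 9.5), INTEGER = c(7L, 8L, 9L))

# Combined data frame built from the three sets of data.
local.combined <- rbind(local.dataset1, local.dataset2, local.dataset3)

# Part a (simulated): in the real script the definition is sourced from the
# definition file. This hypothetical version uses a local placeholder where
# the DataSHIELD function would be called.
.test.mean.combined <- function(variable.name, some.values) {
  remote.result <- mean(some.values)  # placeholder for the server-side mean
  local.result <- sum(some.values) / length(some.values)
  expect_equal(remote.result, local.result, tolerance = 10^-6)
}

# Part c: a context following the project's naming convention would go here.

# Part d: one unit test that calls the definition once per column.
test_that("combined mean values", {
  .test.mean.combined("D$NUMERIC", local.combined$NUMERIC)
  .test.mean.combined("D$INTEGER", local.combined$INTEGER)
})

# Parts e and f: the same checks against a single dataset.
test_that("single-server mean values", {
  .test.mean.combined("D$NUMERIC", local.dataset1$NUMERIC)
  .test.mean.combined("D$INTEGER", local.dataset1$INTEGER)
})
```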
Running the tests
Some instructions are available on this page: Running tests locally
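
For orientation only, the sketch below shows one way of running a single test file with testthat and inspecting the outcome. It writes a throwaway test file to a temporary folder, so the file name and its content are assumptions of this example; the project's own instructions on the linked page take precedence.

```r
library(testthat)

# Create a temporary folder holding one small, hypothetical test file.
demo.dir <- tempfile("testthat-demo")
dir.create(demo.dir)
demo.file <- file.path(demo.dir, "test-demo.R")
writeLines(c(
  'test_that("mean of a vector", {',
  '  expect_equal(mean(c(1, 2, 1.5)), 1.5)',
  '})'
), demo.file)

# Run the file quietly and collect the results.
results <- test_file(demo.file, reporter = "silent")

# One row per test; the "failed" column counts failed expectations.
summary.table <- as.data.frame(results)
number.failed <- sum(summary.table$failed)

# Within a package, devtools::test() runs every file under tests/testthat.
```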
DataSHIELD Wiki by DataSHIELD is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. Based on a work at http://www.datashield.ac.uk/wiki