Test-driven development and unit testing

Test-driven paradigm (TDD)

This software development process relies on well-defined requirements, each of which is turned into a specific test case. Every test case must hold throughout the life of the software; that is, if any code is altered, the tests must continue to pass. Traditionally, the first step is to write a test from the list of test cases. The second step is to run it and expect it to fail; this is the expected outcome, as no code has been written yet. The third step is to write the code and run the test again. Writing code and re-running the test may be repeated until the test passes. Once the test passes, the code is refactored; this phase reorganises the code so that it becomes highly maintainable (see diagram below).
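The cycle described above can be sketched in base R. This is a minimal illustration, not part of the DataSHIELD code: the function add.numbers and the test test.add.numbers are hypothetical names chosen for the example, and the sketch assumes add.numbers is not already defined in the session.

```r
# Step 1: write the test first. add.numbers is a hypothetical function
# that does not exist yet.
test.add.numbers <- function() {
  stopifnot(add.numbers(2, 3) == 5)
  stopifnot(add.numbers(-1, 1) == 0)
  TRUE
}

# Step 2: run the test and expect it to fail, as no code is written yet.
# tryCatch() turns the failure into FALSE so the script can continue.
first.run <- tryCatch(test.add.numbers(), error = function(e) FALSE)

# Step 3: write just enough code to satisfy the test.
add.numbers <- function(x, y) {
  x + y
}

# Step 4: run the test again; it should now pass.
second.run <- tryCatch(test.add.numbers(), error = function(e) FALSE)

print(c(first.run = first.run, second.run = second.run))
```

Once the second run passes, add.numbers can be refactored with the test acting as a safety net.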

 

Refactoring

The last phase of the TDD cycle restructures the code without changing its behaviour. In R, this step often means breaking lengthy functions down to a more manageable and maintainable size.

To achieve this, code is broken down into reusable functions with clear, well-defined, simple-to-use arguments. Each of these functions should be named meaningfully and accomplish only one goal. They are referred to as helper functions.

The benefit of refactoring is R functions that can be maintained more easily. Shorter code should also be easier to understand, because the logical flow of the implementation is self-documented through meaningful names for variables and helper functions.
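As a small sketch of this idea, the block below shows one function before and after it is split into helper functions. All names (summarise.before, summarise.after, check.numeric, remove.missing, value.range) are hypothetical and serve only as an illustration; the final check confirms that the behaviour is unchanged.

```r
# Before refactoring: one function validates, cleans and summarises.
summarise.before <- function(values) {
  if (!is.numeric(values)) {
    stop("values must be numeric")
  }
  cleaned <- values[!is.na(values)]
  list(mean = mean(cleaned), range = max(cleaned) - min(cleaned))
}

# After refactoring: each helper function completes only one goal.
check.numeric <- function(values) {
  if (!is.numeric(values)) {
    stop("values must be numeric")
  }
  invisible(values)
}

remove.missing <- function(values) {
  values[!is.na(values)]
}

value.range <- function(values) {
  max(values) - min(values)
}

# The main function now reads as a short sequence of named steps.
summarise.after <- function(values) {
  check.numeric(values)
  cleaned <- remove.missing(values)
  list(mean = mean(cleaned), range = value.range(cleaned))
}

# Refactoring must not change behaviour: both versions agree.
sample.values <- c(4, 8, NA, 15)
stopifnot(identical(summarise.before(sample.values),
                    summarise.after(sample.values)))
```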

 

“Whenever I have to think to understand what the code is doing, I ask myself if I can refactor the code to make that understanding more immediately apparent.”
Martin Fowler, Refactoring: Improving the Design of Existing Code

Example of refactored code:

The refactored function getOpals uses three helper functions: init.object.list.testing.environment, init.object.list.global.environment and init.opal.list. Each of these helper functions fulfils a single aim; the logic of getOpals becomes easier to read, as each branch of the selection now holds only one line. Each helper function has a certain level of complexity (i.e., use of iterations and selections) that would be challenging to follow inside getOpals itself. The returned list is built by init.opal.list from the objects found.

 

The getOpals function

This function identifies the R environment used during the execution of the R code. Two possible R environments can be used:

  1. The R environment referred to as .GlobalEnv is used by default in the RStudio console or in an R script.

  2. The R environment ds.test_env is used by the testing framework.

For each type of execution, an opal connection object is searched for. This part of the implementation can be quite complex. For that reason, helper functions no 1 and no 2 remove this complexity from the main function. The last step is to build a suitable opal list that can be used to connect to an Opal server.

 

getOpals <- function() {
  # initialise function variables
  flag <- 0
  opal.list <- NULL
  return.list <- list("flag" = flag, "opals" = NULL, "opals.list" = NULL)

  # get0() returns NULL when ds.test_env does not exist, whereas get() would
  # stop with an error before the is.null() check below could run
  curr.ds.test_env <- get0("ds.test_env", envir = .GlobalEnv)

  # Check the computer environment used to execute the code. Then initialise
  # accordingly the variable opal.list
  if (!is.null(curr.ds.test_env)) {
    opal.list <- init.object.list.testing.environment(ls(curr.ds.test_env))
  } else {
    opal.list <- init.object.list.global.environment(ls(.GlobalEnv))
  }

  # Build and return the opal list.
  return.list <- init.opal.list(opal.list)
  return(return.list)
}

helper function no 1

init.object.list.global.environment <- function(objs) {
  opalist <- vector('list')
  counter <- 0
  # seq_along() skips the loop when objs is empty; 1:length(objs) would not
  for (i in seq_along(objs)) {
    class.element.name <- class(eval(parse(text = objs[i])))
    # only non-empty lists can hold opal connection objects
    if (class.element.name[1] == 'list') {
      list2check <- eval(parse(text = objs[i]))
      if (length(list2check) > 0) {
        cl2 <- class(list2check[[1]])
        # keep the object name if its first element has the class 'opal'
        for (s in seq_along(cl2)) {
          if (cl2[s] == 'opal') {
            counter <- counter + 1
            opalist[[counter]] <- objs[i]
          }
        }
      }
    }
  }
  return(opalist)
}

helper function no 2

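init.object.list.testing.environment <- function(objs) {
  opalist <- vector('list')
  counter <- 0
  for (i in seq_along(objs)) {
    # keep only the connection object created by the testing framework
    if (objs[i] == "connection.opal") {
      counter <- counter + 1
      # objs[i] is the matched name; the original indexed objs[1]
      opalist[[counter]] <- paste("ds.test_env$", objs[i], sep = "")
    }
  }
  return(opalist)
}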

helper function no 3

Additional reading:

Unit testing

Unit testing is often associated with test-driven development. The tests are broken into units; each of them ensures that a section of an application meets its design and behaves as intended. For example, if the division operation were implemented as an R function, the following design would be valid:

Name of the function:

divide

Description:

implements the divide arithmetical operation using two numbers.

Arguments:

numerator - number to be divided

denominator - number dividing the numerator

Returned value:

The quotient - a numerical value

Test cases or units:

  1. denominator x quotient = numerator

  2. numerator / quotient = denominator

  3. Division by 0 shows an error message

  4. Character arguments show an error message
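One possible implementation that satisfies this design is sketched below. It is an illustration under stated assumptions rather than a definitive version: both arguments are assumed to be single numbers, and the wording of the error messages is invented for the example. Note that plain R arithmetic returns Inf or NaN for a zero denominator instead of failing, so test case 3 requires an explicit check.

```r
divide <- function(numerator, denominator) {
  # Test case 4: character (or any other non-numeric) arguments raise an error.
  if (!is.numeric(numerator) || !is.numeric(denominator)) {
    stop("both arguments must be numeric")
  }
  # Test case 3: the error for a zero denominator is raised explicitly,
  # because the / operator alone would not stop.
  if (denominator == 0) {
    stop("division by zero")
  }
  numerator / denominator
}

quotient <- divide(10, 4)

# Test case 1: denominator x quotient = numerator
stopifnot(isTRUE(all.equal(4 * quotient, 10)))

# Test case 2: numerator / quotient = denominator
stopifnot(isTRUE(all.equal(10 / quotient, 4)))

# Test cases 3 and 4: capture the error messages instead of stopping.
zero.message <- tryCatch(divide(10, 0), error = function(e) conditionMessage(e))
character.message <- tryCatch(divide("10", 4), error = function(e) conditionMessage(e))
```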

Testthat Package

In R, the testthat package provides a unit testing framework. This package has been integrated into the DataSHIELD test framework.
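The test cases listed for the divide design could be expressed with testthat roughly as follows. This is a sketch, not the DataSHIELD test suite: the divide function defined here is a hypothetical implementation of the design, assumed to take single numbers, and the test descriptions and error messages are invented for the example.

```r
library(testthat)

# Hypothetical implementation of the divide design (scalar arguments assumed).
divide <- function(numerator, denominator) {
  if (!is.numeric(numerator) || !is.numeric(denominator)) {
    stop("both arguments must be numeric")
  }
  if (denominator == 0) {
    stop("division by zero")
  }
  numerator / denominator
}

test_that("the quotient is consistent with its operands", {
  quotient <- divide(10, 4)
  expect_equal(4 * quotient, 10)   # denominator x quotient = numerator
  expect_equal(10 / quotient, 4)   # numerator / quotient = denominator
})

test_that("invalid arguments raise an error", {
  expect_error(divide(10, 0), "division by zero")
  expect_error(divide("10", 4), "numeric")
})
```

Each test_that block groups related expectations under one description, so a failing unit is reported by name.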

 

DataSHIELD Wiki by DataSHIELD is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. Based on a work at http://www.datashield.ac.uk/wiki