Preliminary steps in GitHub
About DataSHIELD R package
In DataSHIELD, developing a package in fact produces two packages. Nearly every function consists of two complementary functions: a server side function, which runs on the remote server (behind the data owner's firewall), and its client side counterpart, which runs on the local computer or analysis centre (i.e. the computer the analyses are run from). Server side functions live in the server side package and client side functions in the client side package, as shown in the very simple graph below.
The rest of this page shows how the server side package repository is initialized in GitHub. The client side package repository is created in exactly the same way, apart from the description section, where we indicate that it is a client side package.
Creating the server side package repository in GitHub
Horizontal DataSHIELD (HDS) uses GitHub for distributed development (read more about GitHub here or watch these videos). The project is therefore hosted on GitHub and its packages are maintained there. The project name in GitHub is DataSHIELD and it can be found online here. The package we are going to develop in this tutorial will also be located under the DataSHIELD project.
Naming of packages
In this tutorial we are going to build a package from scratch. In HDS, package names start with the prefix ds. We also try, whenever possible, to choose names that reflect the main purpose of the package. If the name is composed of two or more words, the second and following words start with a capital letter. Hence the name of our tutorial package is dsTutorial. As explained earlier, however, each package we develop actually consists of two parts: a server side package, whose name has the prefix ds but no suffix, and a client side package, whose name starts with the prefix ds and ends with the suffix Client. So in this tutorial the server side and client side packages are named dsTutorial and dsTutorialClient respectively. It is recommended to avoid very long package names.
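The naming rule above can be sketched in a few lines of Python. This is only an illustration: the helper names package_names and is_server_name, and the regular expression, are assumptions made for this example and are not part of any DataSHIELD tooling.

```python
import re
from typing import Tuple

# Assumption: this pattern is one reading of the rule described above
# (prefix "ds", then one or more capitalised words); it is not an official check.
_NAME_PATTERN = re.compile(r"^ds([A-Z][a-z0-9]*)+$")


def package_names(*words: str) -> Tuple[str, str]:
    """Build (server side, client side) package names from descriptive words."""
    if not words or not all(words):
        raise ValueError("at least one non-empty word is required")
    # Every word after the "ds" prefix starts with a capital letter.
    server = "ds" + "".join(w[0].upper() + w[1:] for w in words)
    # The client side package adds the suffix "Client".
    return server, server + "Client"


def is_server_name(name: str) -> bool:
    """True if the name fits the sketched pattern and has no Client suffix."""
    return bool(_NAME_PATTERN.match(name)) and not name.endswith("Client")


server, client = package_names("tutorial")
print(server, client)
```

Calling package_names("tutorial") yields the two names used throughout this tutorial.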
Each new HDS package requires a new repository: each HDS R package corresponds to one GitHub repository. To create a new repository, the coordinator or anyone with the same rights goes to the project page and clicks on the button +New repository. A new page will then open up (see image below).
Who creates the new repository?
This is strictly the responsibility of the project coordinator or someone with the same privileges. Any developer can be given such rights, but it is better to have a designated person for this task. The same person might also be responsible for the main branch, the master branch, which no other developer should use for his/her own development. The project coordinator therefore creates a branch for each developer involved in the development of the package; each developer might be responsible for one or more functions if it is a large package.
Follow the instructions below to fill in the information required on the page:
- Make sure Owner is DataSHIELD.
- Choose a Repository name that respects HDS nomenclature (see the note Naming of packages).
- Give a brief Description of what the repository/package is for.
- Set access to Public.
- Enable Initialize this repository with a README by ticking the box.
- Under Add .gitignore select R because we want an R package.
- Under Add a licence select GNU GPL v3.0.
- To finish click on the button Create Repository.
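For readers who prefer to see the form settings as data, the sketch below maps them onto the request body that GitHub's REST API accepts when creating a repository in an organization. It only builds and prints the body; it sends nothing. The tutorial itself uses the web form, so this is an illustration and not the project's procedure. The function name new_repo_payload is hypothetical, and the server side description string is an assumption that mirrors the client side wording used later on this page.

```python
import json


def new_repo_payload(name: str, description: str) -> dict:
    """Return a request body mirroring the settings chosen on the web form."""
    return {
        "name": name,                   # Repository name
        "description": description,     # Description
        "private": False,               # access set to Public
        "auto_init": True,              # initialize with a README
        "gitignore_template": "R",      # Add .gitignore: R
        "license_template": "gpl-3.0",  # Add a licence: GNU GPL v3.0
    }


OWNER = "DataSHIELD"  # Owner, as given on this page

payload = new_repo_payload(
    "dsTutorial",
    # Assumed wording, modelled on the client side description.
    "A server side package to illustrate the development of a Horizontal DataSHIELD R package",
)

# The body would be sent to this endpoint with suitable credentials.
print(f"POST /orgs/{OWNER}/repos")
print(json.dumps(payload, indent=2))
```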
Creating a branch for a developer
This also falls under the responsibility of the project/package coordinator or someone with similar privileges, as mentioned in the warning Who creates the new repository.
Because only the coordinator is allowed to 'push' into the main branch, all other developers should be allocated their own branches, where their development takes place before the final versions are periodically merged into the master branch. To create a branch, click on the repository name (if not already open) and then click on branch:master, as shown in the image below. The branch name should be the first letter of the developer's first name followed by his/her surname. If the name already exists, use a variant (e.g. insert the middle name). Then click on Create branch: ... to finish. In the example below a branch is being created for the developer agaye.
Now repeat the steps above to create the client side package repository: the name of the repository should be dsTutorialClient and, in the description section, write A client side package to illustrate the development of a Horizontal DataSHIELD R package.
DataSHIELD Wiki by DataSHIELD is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. Based on a work at http://www.datashield.ac.uk/wiki