Importing data into Opal with the API

Background

Opal has a Python client that lets you query an Opal server and import some data (see https://opaldoc.obiba.org/en/latest/index.html). It does not let you create projects and tables from scratch, though. Since the web interface does this through the REST API, we should be able to do the same from the command line. This page shows how.

The first point to note is that we are still using the Opal Python client; we are just using its rest command (see https://opaldoc.obiba.org/en/latest/index.html).

The following is loosely based on the AMASED work getting the BL books into Opal - https://github.com/OllyButters/flatten-bl-xml

Test the python client

A good place to start is getting some information from Opal with the client:

opal rest /datasources --opal https://opal-demo.obiba.org --user administrator --password password --json


Change the URL to your own install location and the 'administrator' and 'password' values to your own credentials. If you get something sensible back, the Opal client is installed correctly and your credentials work.
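
The same check can be scripted. The following is a minimal Python sketch, assuming the opal command is on your PATH and that it prints only the JSON response to standard output. The helper names build_rest_command and list_datasources are made up for illustration and are not part of the Opal client.

```python
import json
import subprocess


def build_rest_command(url, user, password, path):
    """Return the argument list for an `opal rest` GET request with JSON output."""
    return [
        "opal", "rest", path,
        "--opal", url,
        "--user", user,
        "--password", password,
        "--json",
    ]


def list_datasources(url, user, password):
    """Run the command and parse whatever the client prints to stdout as JSON."""
    cmd = build_rest_command(url, user, password, "/datasources")
    # check=True raises CalledProcessError if the client exits with an error.
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return json.loads(result.stdout)
```

Calling list_datasources("https://opal-demo.obiba.org", "administrator", "password") should then return the parsed response, provided the server is reachable and the credentials are valid.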

Adding a project

Let's call this project 'my_project', and let's assume I am looking at books. You can create the project with the API, but I haven't yet; I made the project via the web interface.

Adding a table

Once you have a project you want to add a table to it. The first step is to make a JSON file and save it as e.g. my_table.json. The entityType seems to be entirely arbitrary, so call it something relevant to you.


{"entityType": "Book", "name": "Hamlet"}
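
JSON requires double-quoted strings, so one way to avoid quoting mistakes is to generate the file rather than type it. This is a small sketch using only the Python standard library; the helper name write_json is an illustrative choice, and the file is written to the current directory.

```python
import json

# Table definition: an entity type and a table name, as in the example above.
table = {"entityType": "Book", "name": "Hamlet"}


def write_json(obj, path):
    """Serialise obj as JSON; json.dump always emits double-quoted strings."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(obj, fh, indent=4)


write_json(table, "my_table.json")
```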

Now we can push this JSON file into Opal.


opal rest -o http://opal_url --user administrator --password password -m POST -ct "application/json" /datasource/my_project/tables < my_table.json

Now there should be a table called Hamlet in the my_project project - have a look in the web interface.
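
If you want to drive this step from a script, the POST can be wrapped in Python. This is a sketch only: it assumes the opal command is on your PATH, reuses the flags from the command above, and the helper names build_post_command and post_json_file are hypothetical.

```python
import subprocess


def build_post_command(url, user, password, path):
    """Return the argument list for an `opal rest` POST with a JSON body."""
    return [
        "opal", "rest",
        "-o", url,
        "--user", user,
        "--password", password,
        "-m", "POST",
        "-ct", "application/json",
        path,
    ]


def post_json_file(url, user, password, path, json_file):
    """POST the contents of json_file to path.

    The file is connected to the client's standard input, which is what
    `< my_table.json` does in the shell.
    """
    cmd = build_post_command(url, user, password, path)
    with open(json_file, "rb") as body:
        return subprocess.run(cmd, stdin=body, check=True)
```

For the table above the call would be post_json_file("http://opal_url", "administrator", "password", "/datasource/my_project/tables", "my_table.json").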

Adding some variables

The table will not have any variables associated with it at this stage. We can add them in a similar way to the table itself. First, build a JSON file and save it as e.g. my_variables.json


[
    {
        "name": "content",
        "entityType": "Book",
        "valueType": "text",
        "isRepeatable": false,
        "index": 1
    },
    {
        "name": "clean_content",
        "entityType": "Book",
        "valueType": "text",
        "isRepeatable": false,
        "index": 2
    }
]


Here name is the variable name, entityType matches the one defined for the table above, valueType is the type of the variable's values (allowed types include text, integer and decimal, among others), isRepeatable indicates whether the variable can hold more than one value per entity, and index is the order in which the columns are displayed.
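
When there are many variables it can be easier to generate the definitions than to write them by hand. The sketch below uses only the Python standard library; make_variables is a made-up helper name, and it assumes every variable shares the same entityType and valueType. Note that JSON spells booleans in lower case (false), which json.dumps takes care of.

```python
import json


def make_variables(names, entity_type, value_type="text"):
    """Build one variable definition per name, numbering index from 1."""
    return [
        {
            "name": name,
            "entityType": entity_type,
            "valueType": value_type,
            "isRepeatable": False,  # Python False is serialised as JSON false
            "index": position,
        }
        for position, name in enumerate(names, start=1)
    ]


variables = make_variables(["content", "clean_content"], "Book")
variables_json = json.dumps(variables, indent=4)
print(variables_json)
```

Redirecting the printed output to my_variables.json gives a file that can be pushed with the opal rest command.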

Then an API call like the following will add them:

opal rest -o http://opal_address --user administrator --password password -m POST -ct "application/json" /datasource/my_project/table/Hamlet/variables < my_variables.json


Variables can have quite complex definitions: attributes, categories and so on can be added to them, as in the following example.


[
    {
        "name": "VAR1",
        "entityType": "Participant",
        "valueType": "text",
        "unit": "",
        "isRepeatable": false,
        "referencedEntityType": "",
        "mimeType": "",
        "occurrenceGroup": "",
        "index": 1,
        "attributes": [
            {
                "name": "label",
                "value": "Which language do you speak?",
                "locale": "en"
            },
            {
                "name": "label",
                "value": "Quelles langues parlez-vous?",
                "locale": "fr"
            }
        ]
    },
    {
        "name": "VAR2",
        "entityType": "Participant",
        "valueType": "integer",
        "mimeType": "",
        "isRepeatable": false,
        "occurrenceGroup": "",
        "unit": "",
        "referencedEntityType": "",
        "index": 2,
        "attributes": [
            {
                "name": "label",
                "value": "Do you like that?"
            }
        ],
        "categories": [
            {
                "name": "1",
                "isMissing": false,
                "attributes": [
                    {
                        "name": "label",
                        "value": "yes"
                    }
                ]
            },
            {
                "name": "2",
                "isMissing": false,
                "attributes": [
                    {
                        "name": "label",
                        "value": "no"
                    }
                ]
            },
            {
                "name": "9",
                "isMissing": true,
                "attributes": [
                    {
                        "name": "label",
                        "value": "do not know"
                    }
                ]
            }
        ]
    }
]
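
The nested structure above is repetitive, so it can be built from a simple mapping of category codes to labels. The following is an illustrative Python sketch, not part of the Opal client: label and make_categories are hypothetical helpers, and the empty-string fields (unit, mimeType and so on) are left out, as in the shorter variable example earlier on this page.

```python
import json


def label(value, locale=None):
    """Build a label attribute, optionally tagged with a locale."""
    attribute = {"name": "label", "value": value}
    if locale is not None:
        attribute["locale"] = locale
    return attribute


def make_categories(labels, missing=()):
    """Turn {code: label} into category definitions.

    Codes listed in `missing` are flagged with isMissing.
    """
    return [
        {
            "name": code,
            "isMissing": code in missing,
            "attributes": [label(text)],
        }
        for code, text in labels.items()
    ]


var2 = {
    "name": "VAR2",
    "entityType": "Participant",
    "valueType": "integer",
    "isRepeatable": False,
    "index": 2,
    "attributes": [label("Do you like that?")],
    "categories": make_categories(
        {"1": "yes", "2": "no", "9": "do not know"}, missing={"9"}
    ),
}
print(json.dumps([var2], indent=4))
```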


Adding some data

Now to import some data. The general workflow here is to make a CSV file elsewhere, upload it to Opal, then tell Opal to use the uploaded file.

First of all upload the file:

opal file --opal http://opal_address --user administrator --password password -up all_the_words.csv /home/administrator

Here all_the_words.csv is the file you want to upload, and /home/administrator is where the file gets saved. This assumes you are logged in as the administrator user and want to save the file in your home space; you could save it in the project space instead (I think). The CSV file itself should have a header row whose column names match the variable names defined above.
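
Before uploading, it can be worth checking that the header row really does contain the variable names. This is a minimal Python sketch using the standard csv module; missing_columns is a made-up helper, and the 'id' column in the sample data is a hypothetical extra column that the check simply ignores.

```python
import csv
import io


def missing_columns(csv_file, variable_names):
    """Return the variable names that do not appear in the CSV header row."""
    header = next(csv.reader(csv_file), [])
    present = {column.strip() for column in header}
    return [name for name in variable_names if name not in present]


# Small in-memory example standing in for a real file.
sample = io.StringIO("id,content,clean_content\n1,To be,to be\n")
print(missing_columns(sample, ["content", "clean_content"]))  # prints []
```

For a real file, pass open("all_the_words.csv", newline="") in place of the in-memory sample.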

Now to use the uploaded data:

opal import-csv -o http://opal_address --user administrator --password password --destination my_project --table Hamlet --path /home/administrator/all_the_words.csv --type Book

Here we are telling opal where the CSV file is and which project/table to import it into.

Opal may take a few minutes to process all the data, and it will also try to index it. You can follow the progress from the web interface.


DataSHIELD Wiki by DataSHIELD is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. Based on a work at http://www.datashield.ac.uk/wiki