API

The API service provides data submitters with functionality to control their submissions. Users are authenticated with a JWT.

Service Description

Endpoints:

  • /files
  • Parses and validates the JWT against the public keys, which are either locally provisioned or fetched from OIDC JWK endpoints.
  • The sub field is extracted from the token and used as the user's identifier.
  • All files belonging to this user are fetched from the database, together with their latest status and creation date.

    Example:

    bash $ curl 'https://server/files' -H "Authorization: Bearer $token"
    [{"inboxPath":"requester_demo.org/data/file1.c4gh","fileStatus":"uploaded","createAt":"2023-11-13T10:12:43.144242Z"}]

    The returned results can be limited by supplying a base path prefix in the query. In the example below, only files whose path starts with submission-1 are returned.

    bash curl -H "Authorization: Bearer $token" -X GET "https://HOSTNAME/files?path_prefix=submission-1"
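
When the prefix contains spaces or other characters that are special in URLs, it can be safer to let curl build the query string. The following is a minimal sketch, not part of the service: the `list_files` helper and the `API_URL` variable are hypothetical names introduced here, and `token` is assumed to hold a valid JWT as in the examples above.

```bash
# Hypothetical helper, not part of the service.
# Assumes API_URL (e.g. https://HOSTNAME) and token are set by the caller.
list_files() {
  if [ -n "${1:-}" ]; then
    # -G makes curl append the URL-encoded data to the URL as a query string.
    curl -G -H "Authorization: Bearer ${token}" \
      --data-urlencode "path_prefix=$1" "${API_URL}/files"
  else
    curl -H "Authorization: Bearer ${token}" "${API_URL}/files"
  fi
}
```

Calling `list_files submission-1` would then request only the files under that prefix, while `list_files` without arguments requests all of the user's files.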

If the token is invalid, 401 is returned.
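
When debugging a 401 or an unexpectedly empty result, it can help to check which sub value a token carries. The sketch below decodes the claims segment of a JWT locally. The `jwt_claims` name is hypothetical, `base64 -d` is assumed to be available (as in GNU coreutils), and the helper does not verify the signature, so it only suits inspection.

```bash
# Hypothetical debugging helper: print the claims (payload) segment of a JWT.
# It does NOT verify the signature, so never use it for access decisions.
jwt_claims() (
  # The payload is the second dot-separated segment, base64url-encoded.
  seg=$(printf '%s' "$1" | cut -d. -f2 | tr '_-' '/+')
  # base64url drops the '=' padding; restore it before decoding.
  while [ $(( ${#seg} % 4 )) -ne 0 ]; do seg="${seg}="; done
  printf '%s' "$seg" | base64 -d
)
```

Running `jwt_claims "$token"` prints the claims as JSON; the sub entry is the value the service uses as the user's identifier.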

  • /datasets
  • accepts GET requests
  • Returns all datasets, along with their status and last modified timestamp, for which the user has submitted data.

  • Error codes

    • 200 Query executed successfully.
    • 400 Error due to bad payload.
    • 401 Token user is not in the list of admins.
    • 500 Internal error due to DB failures.

    Example:

    bash $ curl -H "Authorization: Bearer $token" -X GET https://HOSTNAME/datasets
    [{"DatasetID":"EGAD74900000101","Status":"deprecated","Timestamp":"2024-11-05T11:31:16.81475Z"}]

Admin endpoints

Admin endpoints are only available to a set of whitelisted users specified in the application config.

  • /file/ingest
  • accepts POST requests with either:
    • A JSON payload: {"filepath": "</PATH/TO/FILE/IN/INBOX>", "user": "<USERNAME>"}
    • OR a fileid query parameter: /file/ingest?fileid=<FILE_UUID>
  • triggers the ingestion of the file.

  • If both a JSON payload and a fileid query parameter are provided in the same request, a 400 Bad Request is returned.

  • Error codes

    • 200 Query executed successfully.
    • 400 Bad request (e.g. wrong user + filepath combination, both payload and fileid provided, invalid fileid, or invalid JSON).
    • 401 Token user is not in the list of admins.
    • 500 Internal error due to DB or MQ failures.

    Example (JSON payload):

    bash curl -H "Authorization: Bearer $token" -H "Content-Type: application/json" -X POST -d '{"filepath": "/uploads/file.c4gh", "user": "testuser"}' https://HOSTNAME/file/ingest

    Example (fileid query parameter):

    bash curl -H "Authorization: Bearer $token" -X POST "https://HOSTNAME/file/ingest?fileid=<FILE_UUID>"
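
Because the two request forms are mutually exclusive, a client-side wrapper can make it hard to send both by accident. The sketch below is one possible shape under stated assumptions: `ingest_file` and `API_URL` are hypothetical names, `token` is assumed to hold an admin JWT, and the JSON is built with plain string quoting, so it only works for values without double quotes or backslashes.

```bash
# Hypothetical wrapper, not part of the service.
# Assumes API_URL (e.g. https://HOSTNAME) and token are set by the caller.
#   ingest_file <FILE_UUID>            -> uses the fileid query parameter
#   ingest_file <FILEPATH> <USERNAME>  -> uses the JSON payload
ingest_file() {
  case $# in
    1)
      curl -X POST -H "Authorization: Bearer ${token}" \
        "${API_URL}/file/ingest?fileid=$1"
      ;;
    2)
      # Simple quoting only: assumes neither value contains '"' or '\'.
      curl -X POST -H "Authorization: Bearer ${token}" \
        -H "Content-Type: application/json" \
        -d "{\"filepath\": \"$1\", \"user\": \"$2\"}" "${API_URL}/file/ingest"
      ;;
    *)
      echo "usage: ingest_file <FILE_UUID> | <FILEPATH> <USERNAME>" >&2
      return 2
      ;;
  esac
}
```

Since each branch sends exactly one of the two forms, the 400 response for mixed requests cannot be triggered through this wrapper.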

  • /file/accession

  • accepts POST requests with either:
    • A JSON payload: {"accession_id": "<FILE_ACCESSION>", "filepath": "</PATH/TO/FILE/IN/INBOX>", "user": "<USERNAME>"}
    • OR query parameters: /file/accession?fileid=<FILE_UUID>&accessionid=<ACCESSION_ID>
  • assigns accession ID to the file.

  • If both a JSON payload and query parameters are provided in the same request, a 400 Bad Request is returned.

  • Error codes

    • 200 Query executed successfully.
    • 400 Bad request (e.g. wrong user + filepath combination, both payload and parameters provided, invalid fileid, or invalid JSON).
    • 401 Token user is not in the list of admins.
    • 404 Decrypted checksum not found.
    • 500 Internal error due to DB or MQ failures.

    Example (JSON payload):

    bash curl -H "Authorization: Bearer $token" -H "Content-Type: application/json" -X POST -d '{"accession_id": "my-id-01", "filepath": "/uploads/file.c4gh", "user": "testuser"}' https://HOSTNAME/file/accession

    Example (query parameters):

    bash curl -H "Authorization: Bearer $token" -X POST "https://HOSTNAME/file/accession?fileid=<FILE_UUID>&accessionid=<ACCESSION_ID>"

  • /file/verify/:accession

  • accepts PUT requests with an accession ID as the last element of the path
  • triggers re-verification of the file with the specific accession ID.

  • Error codes

    • 200 Query executed successfully.
    • 404 Error due to a nonexistent accession ID.
    • 401 Token user is not in the list of admins.
    • 500 Internal error due to DB or MQ failures.

    Example:

    bash curl -H "Authorization: Bearer $token" -X PUT https://HOSTNAME/file/verify/my-id-01

  • /file/:username/:fileid

  • accepts DELETE requests
  • marks the file as disabled in the database, and deletes it from the inbox.
  • The file is identified by its ID, as returned by /users/:username/files

  • Response codes

    • 200 Query executed successfully.
    • 400 File ID not provided.
    • 401 Token user is not in the list of admins.
    • 404 File not found.
    • 500 Internal error due to Inbox, DB or MQ failures.

    Example:

    bash curl -H "Authorization: Bearer $token" -X DELETE https://HOSTNAME/file/user@demo.org/123abc

  • /dataset/create

  • accepts POST requests with JSON data with the format: {"accession_ids": ["<FILE_ACCESSION_01>", "<FILE_ACCESSION_02>"], "dataset_id": "<DATASET_01>", "user": "<SUBMISSION_USER>"}
  • creates a dataset from the list of accession IDs and the dataset ID.

  • Error codes

    • 200 Query executed successfully.
    • 400 Error due to bad payload.
    • 401 Token user is not in the list of admins.
    • 500 Internal error due to DB or MQ failures.

    Example:

    bash curl -H "Authorization: Bearer $token" -H "Content-Type: application/json" -X POST -d '{"accession_ids": ["my-id-01", "my-id-02"], "dataset_id": "my-dataset-01", "user": "user@example.org"}' https://HOSTNAME/dataset/create

  • /dataset/release/*dataset

  • accepts POST requests with the dataset name as the last part of the path
  • releases a dataset so that it can be downloaded.

  • Error codes

    • 200 Query executed successfully.
    • 400 Error due to bad payload.
    • 401 Token user is not in the list of admins.
    • 500 Internal error due to DB or MQ failures.

    Example:

    bash curl -H "Authorization: Bearer $token" -X POST https://HOSTNAME/dataset/release/my-dataset-01

  • /dataset/verify/*dataset

  • accepts PUT requests with the dataset name as the last part of the path
  • triggers re-verification of all files in the dataset.

  • Error codes

    • 200 Query executed successfully.
    • 404 Error due to a wrong dataset name.
    • 401 Token user is not in the list of admins.
    • 500 Internal error due to DB or MQ failures.

    Example:

    bash curl -H "Authorization: Bearer $token" -X PUT https://HOSTNAME/dataset/verify/my-dataset-01

  • /dataset/rotatekey/:dataset

  • accepts POST requests with the dataset name as parameter
  • Triggers key rotation for all files in the dataset by sending a message to the rotatekey queue for each file.

  • Error codes

    • 200 Key rotation triggered successfully.
    • 400 Dataset ID not provided.
    • 401 Token user is not in the list of admins.
    • 500 Internal error due to DB or MQ failures.

    Example:

    bash curl -H "Authorization: Bearer $token" -X POST https://HOSTNAME/dataset/rotatekey/my-dataset-01

  • /file/rotatekey/:fileid

  • accepts POST requests with the file ID as parameter
  • Triggers key rotation for the specified file by sending a message to the rotatekey queue.

  • Error codes

    • 200 Query executed successfully.
    • 400 File ID not provided or message validation failed.
    • 401 Token user is not in the list of admins.
    • 500 Internal error due to MQ or DB failures.

    Example:

    bash curl -H "Authorization: Bearer $token" -X POST https://HOSTNAME/file/rotatekey/c2acecc6-f208-441c-877a-2670e4cbb040

  • /datasets/list

  • accepts GET requests
  • Returns all datasets together with their status and last modified timestamp.

  • Error codes

    • 200 Query executed successfully.
    • 400 Error due to bad payload.
    • 401 Token user is not in the list of admins.
    • 500 Internal error due to DB failures.

    Example:

    bash $ curl -H "Authorization: Bearer $token" -X GET https://HOSTNAME/datasets/list
    [{"DatasetID":"EGAD74900000101","Status":"deprecated","Timestamp":"2024-11-05T11:31:16.81475Z"},{"DatasetID":"SYNC-001-12345","Status":"registered","Timestamp":"2024-11-05T11:31:16.965226Z"}]

  • /datasets/list/:username

  • accepts GET requests with the username as the last part of the path
  • Returns all datasets, along with their status and last modified timestamp, for which the user has submitted data.

  • Error codes

    • 200 Query executed successfully.
    • 400 Error due to bad payload.
    • 401 Token user is not in the list of admins.
    • 500 Internal error due to DB failures.

    Example:

    bash $ curl -H "Authorization: Bearer $token" -X GET https://HOSTNAME/datasets/list/submission-user
    [{"DatasetID":"EGAD74900000101","Status":"deprecated","Timestamp":"2024-11-05T11:31:16.81475Z"}]

  • /users

  • accepts GET requests
  • Returns all users with active uploads as a JSON array

    Example:

    bash curl -H "Authorization: Bearer $token" -X GET https://HOSTNAME/users

  • Error codes

    • 200 Query executed successfully.
    • 401 Token user is not in the list of admins.
    • 500 Internal error due to DB failure.

  • /users/:username/files

  • accepts GET requests
  • Returns all files (that are not part of a dataset) for a user with active uploads as a JSON array

    Example:

    bash curl -H "Authorization: Bearer $token" -X GET https://HOSTNAME/users/submitter@example.org/files

    The returned results can be limited by supplying a base path prefix in the query. In the example below, only files whose path starts with submission-1 are returned.

    bash curl -H "Authorization: Bearer $token" -X GET "https://HOSTNAME/users/submitter@example.org/files?path_prefix=submission-1"

  • Error codes

    • 200 Query executed successfully.
    • 401 Token user is not in the list of admins.
    • 500 Internal error due to DB failure.

  • /users/:username/file/:fileid

  • accepts GET requests.
  • downloads a file from the inbox, re-encrypted with the client’s public key provided in the request header.
  • the public key provided in the header should be base64 encoded.

  • Error codes

    • 200 Query executed successfully.
    • 400 Client public key not provided or not valid.
    • 401 Token user is not in the list of admins.
    • 404 File not found.
    • 500 Internal error due to Inbox, Reencrypt, DB, MQ or streaming failures.

    Example:

    bash curl -H "Authorization: Bearer $token" -H "C4GH-Public-Key: $base64_encoded_public_key" -X GET https://HOSTNAME/users/submitter@example.org/file/c2acecc6-f208-441c-877a-2670e4cbb040
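
The header value must be a single line of base64. As a hedged sketch, the hypothetical `encode_pubkey` helper below produces such a value without relying on the GNU-specific `-w0` flag of `base64`; the key path is a placeholder.

```bash
# Hypothetical helper: base64-encode a key file as a single line.
# Stripping newlines with tr avoids relying on the GNU-only "base64 -w0".
encode_pubkey() {
  base64 < "$1" | tr -d '\n'
}

# Example use (placeholder path):
# base64_encoded_public_key=$(encode_pubkey /PATH/TO/c4gh.pub)
```

The resulting variable can be passed directly in the C4GH-Public-Key header shown in the example above.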

  • /c4gh-keys/add

  • accepts POST requests with a JSON payload containing the base64-encoded public key (pubkey) and its description
  • registers the key hash in the database.

  • Error codes

    • 200 Query executed successfully.
    • 400 Error due to bad payload.
    • 401 Token user is not in the list of admins.
    • 409 Key hash already exists in the database.
    • 500 Internal error due to DB failures.

    Example:

    bash curl -H "Authorization: Bearer $token" -H "Content-Type: application/json" -X POST -d '{"pubkey": "'"$( base64 -w0 /PATH/TO/c4gh.pub)"'", "description": "this is the key description"}' https://HOSTNAME/c4gh-keys/add

Configure RBAC

RBAC is configured according to the JSON schema below. The path to the JSON file containing the RBAC policies needs to be passed through the api.rbacFile config definition.

The policy section will configure access to the defined endpoints. Unless specific rules are set, an endpoint will not be accessible.

  • action: either a single string value, e.g. GET, or a regex string with | as separator, e.g. (GET)|(POST)|(PUT). In the latter case all actions in the list are allowed.
  • path: the endpoint, given as a string. Two wildcard notations are supported: * matches any value, and : matches a specific named value.
  • role: the role that is allowed to access the path. "*" matches any role or user.

The roles section defines the available roles

  • role: role name or username from the access token
  • rolebinding: maps a user or role to another role. This makes roles work as groups, which simplifies the policy definitions.

{
   "policy": [
      {
         "role": "admin",
         "path": "/c4gh-keys/*",
         "action": "(GET)|(POST)|(PUT)"
      },
      {
         "role": "submission",
         "path": "/file/ingest",
         "action": "POST"
      },
      {
         "role": "submission",
         "path": "/file/accession",
         "action": "POST"
      },
      {
         "role": "submission",
         "path": "/users",
         "action": "GET"
      },
      {
         "role": "submission",
         "path": "/users/:username/files",
         "action": "GET"
      },
      {
         "role": "*",
         "path": "/files",
         "action": "GET"
      }
   ],
   "roles": [
      {
         "role": "admin",
         "rolebinding": "submission"
      },
      {
         "role": "dummy@example.org",
         "rolebinding": "admin"
      },
      {
         "role": "test@example.org",
         "rolebinding": "submission"
      }
   ]
}
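
To reason about which request paths a policy entry would cover, the wildcard rules can be approximated with a small matcher. This is an illustrative sketch only and not the service's own matcher: the `policy_path_matches` name is hypothetical, and it assumes that a `:name` element matches exactly one path segment and that `*` matches any remaining text.

```bash
# Illustrative only: approximates the wildcard rules described above and is
# not the service's matcher. Assumes ":name" matches exactly one path segment
# and "*" matches any remaining text. Other regex metacharacters in the
# pattern (such as ".") are not escaped.
policy_path_matches() (
  # Turn the policy path into an extended regular expression.
  regex=$(printf '%s' "$1" | sed -E 's#:[^/]+#[^/]+#g; s#\*#.*#g')
  printf '%s\n' "$2" | grep -Eq "^${regex}\$"
)
```

Under these assumptions, `policy_path_matches '/users/:username/files' '/users/submitter@example.org/files'` succeeds, while the same pattern does not match `/users/a/b/files`.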

Storage settings

The API service requires access to the "inbox" storage. To configure that, the following configuration is required:

storage:
  inbox:
    ${STORAGE_IMPLEMENTATION}:

For more details on available configuration see storage/v2 README.md