Submission Workflow

The submission workflow follows four major steps:

Create

A submission consists of one or more Dataset protocol buffers containing user-defined Reaction messages. Datasets can be created programmatically or interactively with the ORD web editor.

To create a submission interactively, use the online ORD reaction editor.

The ORD reaction editor validates entries automatically and flags errors as they are entered into the form, so there is no separate validation step.

To create a submission programmatically, see the Python examples.

If you create your submission programmatically, be sure to run the validate_dataset.py script to identify any validation errors:

$ python validate_dataset.py --input="example_dataset.pbtxt"

When defining reactions and datasets programmatically, it is good practice to use the validation methods in ord-schema as part of your workflow.
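When a submission contains several datasets, the validation command above can be wrapped in a small loop. The following is a minimal sketch and not part of ord-schema: the function names `run_validator` and `validate_all` are hypothetical, and it assumes that `validate_dataset.py` exits with a non-zero status when validation fails.

```shell
# Sketch: validate every .pbtxt file in a directory, stopping at the first failure.
# Assumption: validate_dataset.py exits with a non-zero status on validation errors.

# Runs the validation script from the text above on a single file.
run_validator() {
  python validate_dataset.py --input="$1"
}

validate_all() {
  local dir="${1:-.}"
  local file
  local count=0
  for file in "${dir}"/*.pbtxt; do
    # An unmatched glob is left as a literal pattern; skip it.
    [ -e "${file}" ] || continue
    echo "Validating ${file}"
    if ! run_validator "${file}"; then
      echo "Validation failed: ${file}" >&2
      return 1
    fi
    count=$((count + 1))
  done
  if [ "${count}" -eq 0 ]; then
    echo "No .pbtxt files found in ${dir}" >&2
    return 2
  fi
  echo "Validated ${count} dataset(s)"
}
```

For example, `validate_all path/to/datasets` returns a non-zero status as soon as one dataset fails, which makes it easy to chain with `&&` before committing.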

Prepare

Submissions are received as GitHub pull requests from a fork of the ORD repository. In essence, you are creating a personal copy of the repository, updating it with your data, and then requesting that your changes be merged into the main repository.

If you haven’t done so already, you will need to create a fork of the ord-data repository on GitHub.

Important

Please make sure your fork is up to date with the latest datasets in the official repo.

Clone your forked repository to your workstation, bring it up to date, and create a new branch for your submission. You may want to use the --depth flag to create a shallow clone instead of fetching the entire commit history.

Important

Be sure to clone your forked repository and not the official repo.

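$ git clone --depth=1 "https://github.com/${GITHUB_USERNAME}/${REPOSITORY}"
$ cd "${REPOSITORY}"
# Register the official repository as "upstream" (once per clone).
# This URL assumes the official repo is open-reaction-database/ord-data.
$ git remote add upstream "https://github.com/open-reaction-database/ord-data"
# Make sure your fork is up to date.
$ git checkout main
$ git pull --rebase upstream main
# Create a new branch for your submission.
$ git checkout -b my_submission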
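Because cloning the official repository instead of your fork is the mistake the note above warns against, it can help to check where `origin` points before going further. The sketch below is illustrative only: the function name `origin_is_fork` and the `OFFICIAL_ORG` variable are hypothetical, and it assumes the official repository lives under the `open-reaction-database` GitHub organization.

```shell
# Sketch: succeed only if "origin" does not point at the official organization.
# OFFICIAL_ORG is an assumption; adjust it if the official repo lives elsewhere.
OFFICIAL_ORG="${OFFICIAL_ORG:-open-reaction-database}"

origin_is_fork() {
  local url
  # Fails if there is no remote named "origin".
  url="$(git remote get-url origin)" || return 2
  case "${url}" in
    *"github.com/${OFFICIAL_ORG}/"*|*"github.com:${OFFICIAL_ORG}/"*)
      echo "origin points at the official repo: ${url}" >&2
      return 1
      ;;
    *)
      echo "origin looks like a fork: ${url}"
      return 0
      ;;
  esac
}
```

Run `origin_is_fork` from inside the clone; a non-zero status means the clone should be redone from your fork.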

Submit

Copy your dataset(s) into your submission branch, commit the result, and push the branch to your fork on GitHub.

# Copy your dataset(s) into your submission branch.
$ cp path/to/example_dataset.pbtxt .
# Commit your changes.
$ git add example_dataset.pbtxt
$ git commit -m "Example dataset submission"
# Push the submission to your fork.
$ git push origin my_submission
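A common slip at this stage is committing directly on main instead of the submission branch. The following guard is a minimal sketch with the hypothetical function name `on_submission_branch`; it could be run just before `git push`.

```shell
# Sketch: refuse to continue when the current branch is "main".
on_submission_branch() {
  local branch
  # symbolic-ref fails on a detached HEAD, which is not a submission branch either.
  branch="$(git symbolic-ref --short HEAD 2>/dev/null)" || {
    echo "Not on a branch (detached HEAD?)" >&2
    return 2
  }
  if [ "${branch}" = "main" ]; then
    echo "You are on main; switch to your submission branch first" >&2
    return 1
  fi
  echo "On branch ${branch}"
}
```

For example, `on_submission_branch && git push origin my_submission` pushes only when the check passes.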

Next, log in to GitHub, navigate to the official ord-data repository, and create a pull request from your fork to the official repository.
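The pull request can also be opened from the command line. The sketch below assumes the GitHub CLI (`gh`) is installed and authenticated, and that the official repository is `open-reaction-database/ord-data`; the function name `open_submission_pr`, the `UPSTREAM_REPO` variable, and the `DRY_RUN` switch are hypothetical. With `DRY_RUN=1` it only prints the command it would run.

```shell
# Sketch: open a pull request from a fork branch to the official repository.
# UPSTREAM_REPO is an assumption about where the official repository lives.
UPSTREAM_REPO="${UPSTREAM_REPO:-open-reaction-database/ord-data}"

open_submission_pr() {
  local user="${1:-}"    # your GitHub username
  local branch="${2:-}"  # your submission branch, e.g. my_submission
  local title="${3:-}"   # pull request title
  if [ -z "${user}" ] || [ -z "${branch}" ] || [ -z "${title}" ]; then
    echo "usage: open_submission_pr USER BRANCH TITLE" >&2
    return 2
  fi
  # Build the command in the positional parameters so quoting stays intact.
  set -- gh pr create \
    --repo "${UPSTREAM_REPO}" \
    --base main \
    --head "${user}:${branch}" \
    --title "${title}" \
    --body "Dataset submission from ${user}:${branch}"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    # Print the command instead of running it.
    echo "$*"
  else
    "$@"
  fi
}
```

Setting `DRY_RUN=1` before calling `open_submission_pr "${GITHUB_USERNAME}" my_submission "Example dataset submission"` lets you inspect the command before anything is sent to GitHub.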

Review

Your submission will be automatically validated and manually reviewed by one of the ORD reviewers. The reviewers may suggest additional changes and continue to iterate with you until they are satisfied with the submission. After your pull request is approved, it will be merged into a new branch in the official repository; this new branch is a staging point for the automated preprocessing that is required before merging into the official database.

After your submission has been accepted, a reviewer will trigger various automated preprocessing steps, such as renaming the dataset and assigning reaction and dataset IDs. Once these changes are verified by the reviewer, the dataset will be merged into the “main” branch and become part of the official database.