These are tests that have been explicitly written and stored in the regression testing repo in a path of the form tests/test_<dataset_name>.py
Tests minimally consist of one or more API request dictionaries for the dataset in the filename. They can also return a summary data structure describing aspects of the expected output data.
The output from the test requests will be compared against this summary, and any differences will be highlighted and result in a failed test.
Requests which are expected to fail can also be tested, with the resulting exception compared against the expected exception.
Writing explicit tests is most useful when testing a dataset for the first time, when no reference stack or dataset exists to compare against.
Tests can be assigned a category which allows subsets of tests to be run easily.
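For illustration, a minimal explicit test file might look like the sketch below. The function name, the return convention (a list of request dictionaries plus an optional summary) and the summary keys shown are assumptions made for the purpose of the example, not a definitive description of the test API; check existing files under tests/ for the exact format. The mechanisms for assigning a category or an expected exception are not shown here.

# tests/test_example_dataset.py -- hypothetical dataset name
def test_basic_retrieval():
    # One API request dictionary for the dataset named in the filename
    request = {
        "variable": "2m_temperature",
        "year": "2020",
        "month": "01",
        "day": "01",
        "time": "00:00",
    }
    # Optional summary describing aspects of the expected output data;
    # the keys shown here are illustrative only
    summary = {
        "number_of_fields": 1,
        "format": "grib",
    }
    return [request], summary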
In order to avoid writing explicit tests, you can ask the system to auto-generate tests from the dataset sample.json files.
Run on a single stack, the system will just check that the request succeeds, with little-to-no checking of the returned data. When comparing two stacks, however, it will check that both stacks return the same data, using a comparison that is more intelligent than a simple binary diff.
This makes the sample.json tests quite powerful when checking that a test stack behaves the same as a production stack for a large number of published datasets.
These tests are assigned the pseudo-category, "samples".
You can ask the system to make a random request of no more than N fields for each matching dataset.
As with tests generated from the sample.json files, little-to-no checking will be done on the output data if only one stack is specified, but when comparing two stacks the system will check that both return the same data.
These tests are assigned the pseudo-category, "random:N" where N is the maximum number of fields to use.
Example. Run a random request of no more than 1000 fields for each era5 dataset:
regression_test -k random:1000 -d era5 c3stest
Users can do unexpected and chaotic things, so to increase confidence that the stack will handle the full range of what users throw at it, you can ask the system to take recently completed (successful) requests from the broker database and create tests from them.
As with tests from the sample.json files, little-to-no checking will be done on the output data if only one stack is specified, but when comparing two stacks the system will check that both return the same data.
These tests are assigned the pseudo-category, "broker:<stack>" where <stack> is the stack name whose broker should be used for the tests.
Example. Take MARS-method requests from the c3sprod broker and use them to compare c3stest and c3sprod:
regression_test -k broker:c3sprod -m mars c3stest-c3sprod