Label Analysis for Automated Test Selection

Details about Label Analysis

What is Label Analysis?


Labels = Test Name

You can assume that a "label" is a "test name".


What is BASE and HEAD?

HEAD is the current commit, the one for which Codecov decides which tests to run (for example, the HEAD of your feature branch).

BASE is the remote commit that we are comparing to. We have historical coverage info about it.

Label analysis is the process through which Codecov takes the set of tests in your test suite (tests in HEAD) and derives a subset of them that will properly cover the diff between two given commits (HEAD vs BASE).

To do that it breaks the testing process into 2 parts:

  1. Collect test names (labels) from the test suite in the checked-out code (i.e. the collection of all tests you have in your HEAD commit).
  2. Send the set of labels and the commits to Codecov and get back the subset of labels, taken from the set collected in step 1, to be executed.

What happens with this set of labels is up to you. You can have them reported (with --dry-run) or executed by the Codecov CLI runner (more info below).

Notice that the Codecov CLI needs to be able to collect your tests. To do that, you need to set up your environment to the point where test collection can be performed. You might also have to add config for the runner that does the collection (more info below).

How is the label subset to run calculated?

Codecov uses (1) the set of labels collected in the checked out HEAD code, (2) Static Analysis information already uploaded to Codecov for the BASE and HEAD commits and (3) the git diff between HEAD and BASE to calculate the subset of labels that need to be executed.

From the information above Codecov extracts 4 different lists, that are returned to the CLI at the end of stage 2:

  • absent_labels - the set of new labels found in HEAD that are not present in BASE
    Codecov has no record of these test labels ever being run
  • present_diff_labels - the set of labels affected by the git diff
    The diff between HEAD and BASE includes lines that are executed by these labels
  • global_level_labels - the set of labels that possibly touch global code
    It may never be safe to skip these tests
  • present_report_labels - the set of labels previously uploaded
    Codecov has a record of these labels. They might be skipped in this run

The subset that necessarily needs to be run on the current run is the union of the first three subsets (excluding the labels already recorded that are not in the diff).

set(absent_labels + present_diff_labels + global_level_labels)

Notice that by changing the BASE-HEAD pair the set of present_diff_labels will also change.
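The union described above can be sketched in a few lines of Python. This is only an illustration of the set arithmetic, not Codecov's implementation; the function name split_labels and the sample values are assumptions made for the example.

```python
from typing import Dict, List, Set, Tuple


def split_labels(result: Dict[str, List[str]]) -> Tuple[Set[str], Set[str]]:
    """Return (labels that must run, labels that may be skipped)."""
    to_run = (
        set(result["absent_labels"])
        | set(result["present_diff_labels"])
        | set(result["global_level_labels"])
    )
    # Labels Codecov already has on record that this diff does not require.
    skippable = set(result["present_report_labels"]) - to_run
    return to_run, skippable


# Hypothetical result, using the four list names described above.
example = {
    "present_report_labels": ["label_1", "label_2", "label_3", "label_4"],
    "absent_labels": ["label_new"],
    "present_diff_labels": ["label_1", "label_2"],
    "global_level_labels": ["label_3"],
}
to_run, skippable = split_labels(example)
print(sorted(to_run))     # ['label_1', 'label_2', 'label_3', 'label_new']
print(sorted(skippable))  # ['label_4']
```

Using sets rather than list concatenation also removes duplicates when a label appears in more than one list.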

The label-analysis CLI command

Label Analysis is the process that collects a set of test names (labels) from the test suite, and given a BASE commit to compare against, gets the subset of labels that actually need to be run in order to fully test the diff.

Usage: codecovcli label-analysis [OPTIONS]

  --token TEXT                  The static analysis token (NOT the same token
                                as upload)  [required]
  --head-sha TEXT               Commit SHA (with 40 chars)  [required]
  --base-sha TEXT               Commit SHA (with 40 chars)  [required]
  --runner-name, --runner TEXT  Runner to use
  --max-wait-time INTEGER       Max time (in seconds) to wait for the label
                                analysis result before falling back to running
                                all tests. Default is to wait forever.
  --dry-run                     Useful during setup. This will run the label
                                analysis, but will print the result to stdout
                                and terminate instead of calling the
  -h, --help                    Show this message and exit.

Above is the list of options for the label-analysis command.

  • The CLI will look in CODECOV_STATIC_TOKEN for the value of --token if one is not specified.
  • For this process to work, both the head-sha and the base-sha must have static analysis information already uploaded.
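As a rough sketch, a CI script could assemble the command line from the options listed above before handing it to subprocess.run. The helper name build_label_analysis_args is hypothetical, and the sketch assumes the token is supplied through the CODECOV_STATIC_TOKEN environment variable rather than on the command line.

```python
from typing import List, Optional


def build_label_analysis_args(
    head_sha: str,
    base_sha: str,
    runner: Optional[str] = None,
    max_wait_time: Optional[int] = None,
    dry_run: bool = False,
) -> List[str]:
    """Build an argv list for `codecovcli label-analysis` (hypothetical helper)."""
    for name, sha in (("head", head_sha), ("base", base_sha)):
        # The help text asks for full 40-character commit SHAs.
        if len(sha) != 40:
            raise ValueError(f"{name} SHA must have 40 characters, got {len(sha)}")
    args = [
        "codecovcli",
        "label-analysis",
        f"--head-sha={head_sha}",
        f"--base-sha={base_sha}",
    ]
    if runner is not None:
        args.append(f"--runner-name={runner}")
    if max_wait_time is not None:
        args.append(f"--max-wait-time={max_wait_time}")
    if dry_run:
        args.append("--dry-run")
    return args


# Placeholder SHAs, for illustration only.
print(build_label_analysis_args("a" * 40, "b" * 40, dry_run=True))
```

Passing the resulting list to subprocess.run(args, check=True) would invoke the CLI. Starting with --dry-run is a low-risk way to inspect the result during setup.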


Runners are the plugins that collect and execute tests in your test suite. To understand how to use and configure them let's start by checking the available ones, and then go over how to create your own runner.

To select a runner when running label analysis use the --runner-name option in the CLI command.

Available runners

Codecov CLI ships with 2 runners available: PythonStandardRunner and DANRunner.

Pytest Standard Runner

This runner is for Python users who run their tests with pytest. Under the hood it runs pytest to collect and execute tests. It should suit most Python users.

👀 Want to check the code for this runner? See PytestStandardRunner in GitHub


Configuration options for the python standard runner.

      coverage_root: "./"
      collect_tests_options:
        - "--ignore=path/to/ignore"
        - "path/to/tests"
      execute_tests_options:
        - "--cov-report=xml"
        - "--verbose"
      python_path: "/path/to/interpreter/python"


Prefer --option=value format

When adding configuration for the collection phase, always prefer the --option=value form, with the option and its value in the same string of the list.

  • coverage_root - used in the --cov=<coverage_root> argument passed to pytest when running collected tests.
  • collect_tests_options - options passed to pytest when collecting tests. Here you should put your options to ignore paths or look for certain paths.
  • execute_tests_options - options passed to pytest when executing tests. Here you should put options that control reporting and debug level, etc. Don't use --cov=/path here, use the coverage_root config option.
  • python_path - the python interpreter to use. It allows you to specify a python interpreter different than the system one. Default is python.

Collection Phase

In the collection phase, the python runner runs the command equivalent to the command below. Notice that if you don't provide any collect_tests_options configuration it will try to collect the entire test suite.

python -m pytest \
  -q \
  --collect-only \
  [options-in-collect_tests_options]
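The way the configuration maps onto the collection command can be sketched as follows. This is an illustration rather than the runner's actual code: the function name is hypothetical, and placing the collect_tests_options at the end of the command is an assumption.

```python
from typing import List, Optional


def build_collect_command(
    python_path: str = "python",
    collect_tests_options: Optional[List[str]] = None,
) -> List[str]:
    """Assemble a pytest collection command from runner config (sketch)."""
    command = [python_path, "-m", "pytest", "-q", "--collect-only"]
    # Assumption: user-provided options are appended after the fixed flags.
    command.extend(collect_tests_options or [])
    return command


print(build_collect_command())
# ['python', '-m', 'pytest', '-q', '--collect-only']
print(build_collect_command(collect_tests_options=["--ignore=path/to/ignore", "path/to/tests"]))
```

With no options the command targets the entire test suite, which matches the default behavior described above.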

Test execution phase

In the test execution phase the subset of labels is fed into the python runner and that set of labels is executed. The equivalent command is below. You can see the progress of test execution in your CI as it goes.

python -m pytest \
  --cov=[coverage-root] \
  --cov-context=test \
  [options-in-execute_tests_options] \


DAN Runner

DAN stands for Do Anything Now. This runner is a "nuclear option" that lets the user take full control of the code that's executed in the collection and execution phases. It does nothing by itself; it only runs the commands it is provided with.

Internally, it runs the command in a subprocess. The output is captured, then the subprocess's stdout is decoded and returned to the CLI as the result of your command.


With great power comes great responsibility

There are no safety checks for the provided commands. It's your responsibility to make sure they are safe and work properly with the label analysis process.


      collect_tests_command:
        - "./my_command"
        - "--option=value"
      process_labelanalysis_result_command: "./other_command --option value"

Directly provide the commands that will be executed in the collection and test execution phases. You need to provide both commands.

Commands can be provided as a list, like the first command in the example above, or directly as a string, like the second command. Prefer the list form.

Collection phase

The DANRunner will run the command provided in collect_tests_command. The output of this command should be 1 test label per line (i.e. separated by \n).
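For illustration, a collection command written in Python could filter the output of pytest -q --collect-only down to one label per line. This is a sketch under stated assumptions: the function name extract_labels and the sample output are hypothetical, and it assumes that pytest's quiet collection output lists one node ID per line followed by a summary line.

```python
from typing import List


def extract_labels(collect_output: str) -> List[str]:
    """Keep only lines that look like pytest node IDs (they contain '::')."""
    return [line.strip() for line in collect_output.splitlines() if "::" in line]


# Hypothetical output of `pytest -q --collect-only`.
sample_output = """\
tests/test_math.py::test_add
tests/test_math.py::TestDiv::test_by_zero

2 tests collected in 0.01s
"""

# Print one label per line, the format expected from collect_tests_command.
for label in extract_labels(sample_output):
    print(label)
```

In a real script the sample string would be replaced by the captured stdout of the collection command.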


Test execution phase

The DANRunner will run the command provided in process_labelanalysis_result_command. It will receive as the last argument a string representation of the JSON result of label-analysis. It should run the tests. We recommend running the tests in the subset set(absent_labels) | set(present_diff_labels) | set(global_level_labels).

# Last argument given to the command is a stringified version of the dictionary
{
    "present_report_labels": ["label_1", "label_2", "label_3", "label_4"],
    "absent_labels": ["label_new"],
    "present_diff_labels": ["label_1", "label_2"],
    "global_level_labels": ["label_3"]
}

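A process_labelanalysis_result_command written in Python might look like the sketch below. It assumes the last argument is valid JSON containing the four lists shown above; the function names are hypothetical. Instead of executing pytest it only prints the command it would run.

```python
import json
import sys
from typing import List


def labels_from_argument(raw: str) -> List[str]:
    """Parse the stringified result and return the recommended labels, sorted."""
    result = json.loads(raw)  # assumption: the argument is valid JSON
    selected = (
        set(result["absent_labels"])
        | set(result["present_diff_labels"])
        | set(result["global_level_labels"])
    )
    return sorted(selected)


def build_pytest_command(argv: List[str]) -> List[str]:
    """Build a pytest invocation from the last command-line argument."""
    return [sys.executable, "-m", "pytest", *labels_from_argument(argv[-1])]


# Simulated invocation: "./other_command --option value <result>".
example = json.dumps({
    "present_report_labels": ["label_1", "label_2", "label_3", "label_4"],
    "absent_labels": ["label_new"],
    "present_diff_labels": ["label_1", "label_2"],
    "global_level_labels": ["label_3"],
})
command = build_pytest_command(["./other_command", "--option", "value", example])
print(command[3:])  # ['label_1', 'label_2', 'label_3', 'label_new']
```

A real script would pass sys.argv to build_pytest_command and then run the resulting list with subprocess.run.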
Custom Runners

Custom runners allow you to take full control of how ATS interacts with your code. By creating a runner script yourself and using it with the Codecov CLI you can own the behavior of your runner and make sure it only does what you want it to do.


To create a custom runner you need to create a class that adheres to LabelAnalysisRunnerInterface. This essentially means it needs to implement 2 functions and have 1 attribute:

  • def collect_tests(self) -> List[str] - collects the test labels and returns them as a list.
  • def process_labelanalysis_result(self, result: LabelAnalysisRequestResult) - handles the label analysis result. Usually it will execute the tests related to the labels in result.
  • It also needs a params attribute, but it can be None. Ideally it's where you'll put the config for the class.
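The shape of such a class can be sketched as below. To stay self-contained the sketch does not import or subclass LabelAnalysisRunnerInterface, which a real runner would do, and it treats the result as a plain dictionary with the four lists described earlier; both choices are assumptions. The "labels" entry in params is hypothetical.

```python
from typing import Any, Dict, List, Optional


class MyRunner:
    """Minimal runner sketch with the two methods and the params attribute."""

    def __init__(self, params: Optional[Dict[str, Any]] = None):
        self.params = params or {}
        self.executed: List[str] = []

    def collect_tests(self) -> List[str]:
        # Hypothetical: labels are read from a "labels" entry in params.
        # A real runner would discover them from the test suite.
        return list(self.params.get("labels", []))

    def process_labelanalysis_result(self, result: Dict[str, List[str]]) -> None:
        to_run = (
            set(result["absent_labels"])
            | set(result["present_diff_labels"])
            | set(result["global_level_labels"])
        )
        # A real runner would execute these tests; this sketch only records them.
        self.executed = sorted(to_run)


runner = MyRunner({"labels": ["tests/test_a.py::test_one", "tests/test_a.py::test_two"]})
print(runner.collect_tests())
```

Recording the selected labels instead of executing them keeps the sketch free of side effects, so the selection logic can be checked before wiring in a real test run.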

Check the code

LabelAnalysisRunnerInterface source code

LabelAnalysisRequestResult source code


To configure your custom runner add the config options to the CLI config file (for example codecov.yml).

The name of this config key (in the example, "MY_RUNNER") will be the name of your runner. Pass that to the label-analysis command in the runner option (e.g. --runner MY_RUNNER).

Then you need to add the path to the module where MY_RUNNER is defined. Best to put the absolute path to avoid issues. You also need to provide the class name to be imported.

Optionally you can define params that will be passed to MY_RUNNER when trying to initialize the class.

    MY_RUNNER:
      module: project.helpers.runner
      class: MyRunner
      params:
        foo: "bar"

This configuration will try to import the MyRunner class from project.helpers.runner and instantiate it with params {"foo": "bar"}, which is equivalent to writing

from project.helpers.runner import MyRunner

runner = MyRunner({"foo": "bar"})
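How a runner could be loaded dynamically from such a config entry can be sketched with importlib. This is not the CLI's actual loader; the function name load_runner is hypothetical, and a standard-library class stands in for MyRunner so the snippet runs anywhere.

```python
import importlib
from typing import Any, Dict


def load_runner(config: Dict[str, Any]) -> Any:
    """Import config['class'] from config['module'] and instantiate it with params."""
    module = importlib.import_module(config["module"])
    runner_class = getattr(module, config["class"])
    return runner_class(config.get("params"))


# collections.Counter accepts a mapping, just as MyRunner accepts params.
stand_in = load_runner({"module": "collections", "class": "Counter", "params": {"foo": 2}})
print(stand_in["foo"])  # 2
```

If the module is not importable from the working directory, importlib raises ModuleNotFoundError, so the module path in the config needs to be resolvable from wherever the CLI runs.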

Then, to use MY_RUNNER, you'd call the command like this:

codecovcli --codecov-yml-path=codecov.yml label-analysis --runner=MY_RUNNER --base-sha=$BASE_SHA