Coverage calculation may be incomplete with crashes #13791

@EliaGeretto

I was reviewing the coverage script to verify how the coverage data was collected and I stumbled on the following code:

# '%1m' will produce separate dump files for every object. For example, if a
# fuzz target loads a shared library, we will have dumps for both of them.
local profraw_file="$DUMPS_DIR/$target.%1m.profraw"
local profraw_file_mask="$DUMPS_DIR/$target.*.profraw"
local profdata_file="$DUMPS_DIR/$target.profdata"
local corpus_real="$CORPUS_DIR/${target}"
# -merge=1 requires an output directory, create a new, empty dir for that.
local corpus_dummy="$OUT/dummy_corpus_dir_for_${target}"
rm -rf $corpus_dummy && mkdir -p $corpus_dummy
# Use -merge=1 instead of -runs=0 because merge is crash resistant and would
# let to get coverage using all corpus files even if there are crash inputs.
# Merge should not introduce any significant overhead compared to -runs=0,
# because (A) corpuses are already minimized; (B) we do not use sancov, and so
# libFuzzer always finishes merge with an empty output dir.
# Use 100s timeout instead of 25s as code coverage builds can be very slow.
local args="-merge=1 -timeout=100 $corpus_dummy $corpus_real"
export LLVM_PROFILE_FILE=$profraw_file
timeout $TIMEOUT $OUT/$target $args &> $LOGS_DIR/$target.log
cov_retcode=$?

The profraw files are collected by running libFuzzer with -merge=1 and the LLVM SourceBasedCodeCoverage instrumentation with online profile merging, i.e. with %1m in the profile file name.

The comment there says that running with -merge=1 is crash resistant and allows coverage to be collected from all test cases even in the presence of crashes. This did not look correct, so I ran some tests to verify what was actually going on. With -merge=1, libFuzzer runs an outer process, which monitors the progress, and an inner process, which may crash and get restarted. With strace, I verified that, on a crash, the inner process does not open the profile file to write out its coverage information, so the coverage for all test cases it ran up to that point is lost. At the end of the execution, only the last instance of the inner process writes its coverage, because it is the only one that exits gracefully: the callback that writes coverage is registered with atexit. So, because of the online profile merging, the final output contains only the coverage of the last instance of the inner process merged with the coverage of the outer process.
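The atexit behaviour described above can be illustrated with a rough shell analogy. This is only a sketch: it uses bash's `trap ... EXIT` as a stand-in for the C-level atexit callback, and SIGKILL as a stand-in for a crash. The function name `flush_demo` and its arguments are hypothetical and are not part of the coverage script or of libFuzzer.

```shell
#!/usr/bin/env bash
# Analogy only: bash's EXIT trap plays the role of the atexit callback that
# would write the profile; nothing here touches LLVM or libFuzzer.

# flush_demo OUT_FILE MODE
# Starts a child bash that registers an exit handler writing OUT_FILE.
# MODE=crash kills the child with SIGKILL, so the handler never runs.
# Any other MODE lets the child exit normally, so the handler runs.
flush_demo() {
  local out="$1" how="$2"
  bash -c '
    trap "echo flushed > \"$1\"" EXIT   # stand-in for the atexit callback
    if [ "$2" = crash ]; then
      kill -KILL $$                      # simulated crash: exit handlers are skipped
    fi
  ' _ "$out" "$how"
}
```

Calling `flush_demo /tmp/ok graceful` should leave a file behind, while `flush_demo /tmp/bad crash` should not create one, which mirrors how an inner process that crashes never gets to write its profile.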

To clarify, assume we have three test cases (A, B, C) and B produces a crash: the coverage for A and B will be lost. The resulting profraw file will contain only the coverage for C and the outer process.
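The A, B, C scenario can be reproduced with a small toy model. This is a simulation under stated assumptions, not the real merge implementation: `run_toy_merge`, the `crash:` prefix, and the progress file are all hypothetical, "coverage" is just the list of input names held in memory, and the outer process's own coverage is not modelled.

```shell
#!/usr/bin/env bash
# Toy model only: no real libFuzzer or LLVM profile runtime is involved.

# run_toy_merge PROFILE INPUT...
# Outer loop: starts an inner subshell and restarts it after each crash.
# Inner subshell: keeps "coverage" in memory and appends it to PROFILE only
# if it reaches its normal end (mimicking a write done from atexit).
# An input named crash:* kills the inner subshell (hypothetical convention).
run_toy_merge() {
  local profile="$1"; shift
  local inputs=("$@")
  local progress next=0
  progress="$(mktemp)"
  : > "$profile"

  while [ "$next" -lt "${#inputs[@]}" ]; do
    if (
      covered=()
      for ((i = next; i < ${#inputs[@]}; i++)); do
        echo "$i" > "$progress"          # record which input is being run
        covered+=("${inputs[i]}")        # coverage lives in memory only
        if [[ "${inputs[i]}" == crash:* ]]; then
          kill -KILL "$BASHPID"          # simulated crash: nothing is flushed
        fi
      done
      printf '%s\n' "${covered[@]}" >> "$profile"   # graceful exit: flush
    ); then
      break                              # inner process finished all inputs
    fi
    # Inner process crashed: resume after the input that killed it.
    next=$(( $(cat "$progress") + 1 ))
  done
  rm -f "$progress"
}
```

Running `run_toy_merge /tmp/toy.profile A crash:B C` leaves only `C` in the toy profile, whereas `run_toy_merge /tmp/toy.profile A B C` records all three inputs. The first case matches the loss described above: everything the crashed inner instance had accumulated is gone.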
