Mbed TLS tests guidelines
Mbed TLS includes an extensive set of test suites in the tests/ folder. Building the test executables initially requires Python, which is used to generate their source code. Each test suite is generated from a function file and a data file, located in the suites/ subfolder. The function file contains the test case code, while the data file contains the test case data, specified as parameters that will be passed to the test case code. In addition, some tests use data resource files (such as certificates and keys), which are in the data_files/ subfolder. You should introduce new tests when you add a new feature to the library, or if you want to improve the test coverage.
The following sections explain the .function and .data files, to help you add your own test suite code.
Note that the test files are independent of each other, and no test should depend on the order in which the tests are run.
.data files
A test data file consists of a sequence of paragraphs separated by a single empty line. Each paragraph is referred to as test case data. Line breaks may be in Unix (LF) or Windows (CRLF) format. Lines starting with the character ‘#’ are ignored (the parser behaves as if they were not present). The first line must not be an empty line.
Each paragraph describes one test case and must consist of:
One line, which is the test case name.
An optional line starting with the 11-character prefix depends_on:. This line consists of a list of compile-time options separated by the character ‘:’, with no whitespace. The test case is executed only if all of these configuration options are enabled in mbedtls_config.h. Note that this filtering is done at run time.
A line containing the test case function to execute and its parameters. This last line contains a test function name and a list of parameters separated by the character ‘:’. Each parameter must be valid for the type of the corresponding function parameter:
int or other integral type: an integer-valued C expression, evaluated in a separate function in the same C source file as the test code. So this expression has access to macros, types and even global variables defined in the header of the .function file, but not to local variables of the test function.
[const] char *: a string between double quotes. A backslash escapes the next character (needed for \":).
[const] data_t *: a byte string written in hexadecimal, between double quotes.
For example:
Parse RSA Key #8 (AES-256 Encrypted)
depends_on:MBEDTLS_MD5_C:MBEDTLS_AES_C:MBEDTLS_PEM_PARSE_C:MBEDTLS_CIPHER_MODE_CBC
pk_parse_keyfile_rsa:"data_files/keyfile.aes256":"testkey":0
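To make the field structure of that last line concrete, here is a minimal, self-contained sketch of splitting a test case line on ‘:’ while honoring backslash escapes. The helper sketch_split_fields is hypothetical and is not part of the Mbed TLS test framework; the real parsing is done by the generator script and may treat edge cases differently.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of the Mbed TLS framework: split `line`
 * in place on every ':' that is not escaped by a backslash.
 * Stores up to `max_fields` field pointers and returns the number of
 * fields found, or 0 if there are more than `max_fields`. */
static size_t sketch_split_fields(char *line, char **fields, size_t max_fields)
{
    assert(line != NULL && fields != NULL);
    size_t count = 0;
    char *start = line;

    for (char *p = line; ; p++) {
        if (*p == '\\' && p[1] != '\0') {
            p++;                        /* skip the escaped character */
        } else if (*p == ':' || *p == '\0') {
            int at_end = (*p == '\0');
            if (count == max_fields) {
                return 0;               /* too many fields for the caller */
            }
            *p = '\0';                  /* terminate the current field */
            fields[count++] = start;
            if (at_end) {
                return count;
            }
            start = p + 1;
        }
    }
}
```

Applied to the last line of the example above, this sketch yields four fields: the function name, two quoted strings and the integer expression 0. The quotes and escapes are left in place; interpreting each field according to its parameter type would be a separate step.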
.function files
This is the code file that contains the actual test functions. The file contains a series of code sequences delimited by the following:
BEGIN_HEADER / END_HEADER - Code that will be added to the header of the generated .c file. It could contain include directives, global variables, type definitions and static functions.
BEGIN_DEPENDENCIES / END_DEPENDENCIES - A list of configuration options that this test suite depends on. The test suite will only be generated if all of these options are enabled in mbedtls_config.h.
BEGIN_SUITE_HELPERS / END_SUITE_HELPERS - Similar to the XXXX_HEADER sequence, except that this code will be added after the header sequence, in the generated .c file.
BEGIN_CASE / END_CASE - The test case functions in the test suite. Between each of these pairs, you should write exactly one function that is used to create the dispatch code. The function must return void and may only take supported parameter types. Comments are allowed before and inside the function’s prototype.
An optional depends_on: annotation has the same usage as in the .data files. The section with this annotation will only be generated if all of the specified options are enabled in mbedtls_config.h. It can be added to the following delimiters:
BEGIN_DEPENDENCIES - When added in this delimiter section, the whole test suite will be generated only if all the configuration options are defined in mbedtls_config.h. For example:
/* BEGIN_DEPENDENCIES
 * depends_on:MBEDTLS_CIPHER_C
 * END_DEPENDENCIES
 */
BEGIN_CASE - When added to this delimiter, this specific test case will be generated at compile time only if the configuration option is defined in mbedtls_config.h. For example:
/* BEGIN_CASE depends_on:MBEDTLS_AES_C */
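One way to picture the compile-time effect of depends_on: in a .function file is a preprocessor guard around the test function. The following self-contained sketch uses hypothetical names (SKETCH_OPTION_A, sketch_dispatch and so on); the code actually produced by the generator may be organized differently.

```c
#include <assert.h>

/* Hypothetical configuration: option A is enabled, option B is not.
 * In a real build these would come from the library configuration. */
#define SKETCH_OPTION_A
/* #define SKETCH_OPTION_B */

enum { SKETCH_OK = 0, SKETCH_UNSUPPORTED = -1, SKETCH_BAD_ID = -2 };

#if defined(SKETCH_OPTION_A)
static int sketch_test_a(void) { return SKETCH_OK; }
#endif

#if defined(SKETCH_OPTION_B)
static int sketch_test_b(void) { return SKETCH_OK; }
#endif

/* Dispatch by test function number. A function whose guard is not
 * satisfied does not exist in the binary, so the dispatcher can only
 * report it as unsupported. */
static int sketch_dispatch(int function_id)
{
    assert(function_id >= 0);
    switch (function_id) {
        case 0:
#if defined(SKETCH_OPTION_A)
            return sketch_test_a();
#else
            return SKETCH_UNSUPPORTED;
#endif
        case 1:
#if defined(SKETCH_OPTION_B)
            return sketch_test_b();
#else
            return SKETCH_UNSUPPORTED;
#endif
        default:
            return SKETCH_BAD_ID;
    }
}
```

In this sketch, sketch_test_b is never compiled, so a test case that refers to it cannot run regardless of what the data file says. This contrasts with depends_on: in a .data file, where the filtering happens at run time.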
Building and running tests
Building your test suites
The test suite .c files are auto-generated with the generate_test_code.py script. You can either use this script directly, or run make in the tests/ folder, as the Makefile uses this script. Once the .c files are generated, you can build the test suite executables by running make again. Running make from the Mbed TLS root folder will also generate the test suite source code and build the test suite executables.
Running unit tests
You can run a single test suite individually. To run all the test suites:
Run make test from the top-level directory of the build tree.
When building with Make, this runs tests/scripts/run-test-suites.pl, which you can call directly.
When building with CMake, this uses ctest.
To skip a few test suites:
With Make, set the environment variable SKIP_TEST_SUITES to a comma-separated list of short names, e.g.
SKIP_TEST_SUITES=constant_time_hmac,lmots,lms,gcm,psa_crypto.pbkdf2,ssl_decrypt make test
With CMake, set the CMake parameter SKIP_TEST_SUITES to a comma or semicolon-separated list of short names, e.g.
cmake -B build-debug -DCMAKE_BUILD_TYPE=Debug -DSKIP_TEST_SUITES=constant_time_hmac,lmots,lms,gcm,psa_crypto.pbkdf2,ssl_decrypt
You have to re-run cmake if you want to change the set of skipped suites.
Running only one test case
If you just want to see information about failing test cases:
tests/test_suite_foo |& grep -Ev '(PASS|SKIP|----)'
But sometimes you want to set a breakpoint in a debugger and not have it trigger on “boring” test cases. At the time of writing, there is no way to skip individual test cases. Various kludges are possible, such as:
Edit the .data file to remove or comment out the boring test cases, and rebuild the test suite. Remember not to commit this change!
Copy the interesting test case to the top of the .data file. Remember to update the test case in its “true” location if you modify it, and not to commit the copy.
Copy the .datax file, remove boring test cases from the copy and pass it to the executable.
awk -vRS= -vORS='\n\n' '/test case description regex/' <tests/test_suite_foo.datax >my.datax
tests/test_suite_foo my.datax
Introducing new tests
When you want to introduce a new test, if the test function:
Already exists and is only missing the test data, then update the .data file with the additional test data. If required, you can add a resource file to the data_files/ subfolder.
Doesn’t exist, you can implement a new test function in the relevant .function file following the guidelines mentioned above and add test cases to the .data file to test your new feature.
You should write your test code using the same platform abstraction as the library, and should not assume the existence of platform-specific functions.
Note that historically, most of SSL was tested differently, with sample programs under the programs/ssl/ folder. These are executed when you run the scripts tests/ssl-opt.sh and tests/compat.sh. However, for new code, we prefer to have unit tests as well.
.function example
/* BEGIN_HEADER */
#include "mbedtls/some_module.h"
/* END_HEADER */
/* BEGIN_DEPENDENCIES
* depends_on:MBEDTLS_MODULE_C
* END_DEPENDENCIES
*/
/* BEGIN_CASE depends_on:MBEDTLS_MODULE_OPTIONAL_PART */
void test_function_example(data_t *input, data_t *expected_output, int expected_ret)
{
    unsigned char *output = NULL;
    size_t output_size = expected_output->len;
    size_t output_length = SIZE_MAX;

    TEST_CALLOC(output, output_size);

    /* Write into the freshly allocated buffer, not into the expected data */
    TEST_EQUAL(mbedtls_module_tested_function(input->x, input->len,
                                              output, output_size,
                                              &output_length),
               expected_ret);
    /* Only check the output when the function is expected to succeed */
    if (expected_ret == 0) {
        TEST_MEMORY_COMPARE(expected_output->x, expected_output->len,
                            output, output_length);
    }

exit:
    mbedtls_free(output);
}
/* END_CASE */
Guidance on writing unit test code
Many helper macros and functions are available in the tests directory of the framework repository (location since Mbed TLS 3.6.0, also applying to TF-PSA-Crypto). They are declared in <test/xxx.h> header files.
Testing expected results
Calls to library functions in test code should always check the function’s return status. Fail the test if anything is unexpected.
The header file <test/macros.h> declares several useful macros, including:
TEST_EQUAL(x, y) when two integer values are expected to be equal, for example TEST_EQUAL(mbedtls_library_function(), 0) when expecting a success or TEST_EQUAL(mbedtls_library_function(), MBEDTLS_ERR_xxx) when expecting an error.
TEST_LE_U(x, y) to test that the unsigned integers x and y satisfy x <= y, and TEST_LE_S(x, y) when x and y are signed integers.
TEST_MEMORY_COMPARE(buffer1, size1, buffer2, size2) to compare the actual output from a function with the expected output.
PSA_ASSERT(psa_function_call()) when calling a function that returns a psa_status_t and is expected to return PSA_SUCCESS.
TEST_FAIL("explanation of why this shouldn't happen") for code that should be unreachable.
TEST_ASSERT(condition) for a condition that doesn’t fit any of the special cases.
Older test code only had TEST_ASSERT. But in new test code, please use higher-level macros where applicable, as they have additional conveniences.
These macros can be used in the .function file, but also in auxiliary functions. If the assertion fails, in addition to marking the test case as failed, the macros cause goto exit to happen, thus the function must have an exit label. Often you’ll need to write some code after the exit: label, but as a convenience, if you have no cleanup code, the test framework will add exit:; to test entry points that don’t have an exit: label.
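The goto exit convention described above can be illustrated with a small self-contained sketch. SKETCH_TEST_EQUAL and the two counters are hypothetical stand-ins, written only for this illustration; the real macros in <test/macros.h> record more information and are implemented differently.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the framework's failure bookkeeping. */
static int sketch_failed = 0;
static int sketch_cleanup_runs = 0;

/* Simplified assertion macro: record the failure, then jump to the
 * `exit` label of the enclosing function. */
#define SKETCH_TEST_EQUAL(a, b)          \
    do {                                 \
        if ((a) != (b)) {                \
            sketch_failed = 1;           \
            goto exit;                   \
        }                                \
    } while (0)

static void sketch_test_case(int actual, int expected)
{
    unsigned char *buffer = NULL;   /* NULL so that free() is safe at exit */

    buffer = calloc(1, 16);
    assert(buffer != NULL);
    SKETCH_TEST_EQUAL(actual, expected);
    buffer[0] = 1;                  /* skipped when the assertion fails */

exit:
    free(buffer);                   /* cleanup runs on success and on failure */
    sketch_cleanup_runs++;
}
```

Because every failing assertion jumps to the same label, resources acquired before the failure are released in one place. This is also why pointers are initialized to NULL before any assertion can fail.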
Output on failure
If a test fails, the location of the error is displayed, as well as the failed assertion.
If the test code runs into more than one failed assertion, only information about the first one is displayed. This is usually the right thing because as soon as one assertion has failed, the data is probably in a bad state anyway.
When a test assertion is in a loop, or in an auxiliary function that is called multiple times, the location is not enough to know exactly where the failure happened. You can call mbedtls_test_set_step() to declare a “step number” which is displayed together with the location on failure. For example:
for (int i = 0; i < max; i++) {
    mbedtls_test_set_step(i);
    one_iteration(i);
    TEST_ASSERT(intermediate_check());
}
mbedtls_test_set_step(max);
final_checks(left_output);
mbedtls_test_set_step(max + 1);
final_checks(right_output);
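To show what the step number adds to a failure report, here is a self-contained sketch of a failure record that keeps only the first failure together with the step that was current at the time. All names (sketch_failure, sketch_set_step, sketch_fail, sketch_check_all) are hypothetical; the framework keeps its own internal state. Unlike the real macros, this sketch does not jump to an exit label, so that the "first failure wins" behavior is visible.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical failure record; the real framework keeps its own state. */
struct sketch_failure {
    int failed;
    const char *file;
    int line;
    unsigned long step;
};

static struct sketch_failure sketch_info;
static unsigned long sketch_current_step;

static void sketch_set_step(unsigned long step)
{
    sketch_current_step = step;
}

/* Only the first failure is kept; later ones are ignored. */
static void sketch_fail(const char *file, int line)
{
    assert(file != NULL);
    if (sketch_info.failed) {
        return;
    }
    sketch_info.failed = 1;
    sketch_info.file = file;
    sketch_info.line = line;
    sketch_info.step = sketch_current_step;
}

/* Check every element. Each iteration fails at the same source line,
 * so only the step number tells the iterations apart. */
static void sketch_check_all(const int *values, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        sketch_set_step(i);
        if (values[i] < 0) {
            sketch_fail(__FILE__, __LINE__);
        }
    }
}
```

With the input { 4, 7, -1, 9, -5 }, both failures are reported from the same line, and the record keeps step 2, the index of the first negative value.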
Buffer allocation
When a function expects an input or an output to have a certain size, you should pass it an allocated buffer with exactly the expected size. The continuous integration system runs tests in many configurations with ASan or Valgrind, and these will cause test failures if there is a buffer overflow or underflow.
For output buffers, it’s usually desirable to also check that the function works if it’s given a buffer that’s larger than necessary, and that it returns the expected error if given a buffer that’s too small.
Here is an example of a test function that checks that a library function has the desired output for a given input.
/* BEGIN_CASE */
void test_function(data_t *input, data_t *expected_output)
{
    // must be set to NULL both for TEST_CALLOC and so that mbedtls_free(actual_output) is safe
    unsigned char *actual_output = NULL;
    size_t output_size;
    size_t output_length;

    /* Good case: exact-size output buffer */
    output_size = expected_output->len;
    TEST_CALLOC(actual_output, output_size);
    // set output_length to a bad value to ensure mbedtls_library_function updates it
    output_length = 0xdeadbeef;
    TEST_EQUAL(mbedtls_library_function(input->x, input->len,
                                        actual_output, output_size,
                                        &output_length), 0);
    // Check both the output length and the buffer contents
    TEST_MEMORY_COMPARE(expected_output->x, expected_output->len,
                        actual_output, output_length);
    // Free the output buffer to prepare it for the next subtest
    mbedtls_free(actual_output);
    actual_output = NULL;

    /* Good case: larger output buffer */
    output_size = expected_output->len + 1;
    TEST_CALLOC(actual_output, output_size);
    output_length = 0xdeadbeef;
    TEST_EQUAL(mbedtls_library_function(input->x, input->len,
                                        actual_output, output_size,
                                        &output_length), 0);
    TEST_MEMORY_COMPARE(expected_output->x, expected_output->len,
                        actual_output, output_length);
    mbedtls_free(actual_output);
    actual_output = NULL;

    /* Bad case: output buffer too small */
    output_size = expected_output->len - 1;
    TEST_CALLOC(actual_output, output_size);
    TEST_EQUAL(mbedtls_library_function(input->x, input->len,
                                        actual_output, output_size,
                                        &output_length),
               MBEDTLS_ERR_XXX_BUFFER_TOO_SMALL);
    mbedtls_free(actual_output);
    actual_output = NULL;

exit:
    mbedtls_free(actual_output);
}
/* END_CASE */
PSA initialization and deinitialization
In a test case that always uses PSA crypto, call PSA_INIT() at the beginning and PSA_DONE() at the end (in the cleanup section). Destroy all keys used by the test before calling PSA_DONE(): if any key is still live at that point, it is considered a resource leak in the test.
In a test case that uses PSA crypto only when building with MBEDTLS_USE_PSA_CRYPTO, call USE_PSA_INIT() at the beginning and USE_PSA_DONE() at the end.
See <test/psa_crypto_helpers.h> for more complex cases.
Constant-flow testing
We run some tests with MemorySanitizer (MSan) and Valgrind configured to detect secret-dependent control flow: branches or memory addresses computed from secret data. These tests detect library code that could leak secret data through timing side channels to local attackers via shared hardware components such as a memory cache or a branch predictor. We refer to such tests as “constant-time” or more accurately “constant-flow” testing.
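To make "secret-dependent control flow" concrete, the following self-contained sketch contrasts a comparison that exits early with one whose control flow depends only on the length. Both functions are hypothetical illustrations and are not library code; in practice a compiler may transform such code, which is one reason the tool-based checks described here are useful.

```c
#include <assert.h>
#include <stddef.h>

/* Not constant-flow: the loop exits at the first mismatch, so the number
 * of iterations (a branch decision) depends on the secret contents. */
static int sketch_leaky_compare(const unsigned char *a,
                                const unsigned char *b, size_t len)
{
    assert(a != NULL && b != NULL);
    for (size_t i = 0; i < len; i++) {
        if (a[i] != b[i]) {
            return 1;
        }
    }
    return 0;
}

/* Constant-flow: every byte is always processed, and differences are
 * accumulated with bitwise operations instead of branches. The control
 * flow and memory access pattern depend only on `len`. */
static int sketch_constant_flow_compare(const unsigned char *a,
                                        const unsigned char *b, size_t len)
{
    assert(a != NULL && b != NULL);
    unsigned char diff = 0;
    for (size_t i = 0; i < len; i++) {
        diff |= (unsigned char) (a[i] ^ b[i]);
    }
    return diff;   /* 0 if equal, non-zero otherwise */
}
```

Both functions agree on whether the buffers are equal; they differ only in how their execution depends on the data. A tool configured as described above would be expected to flag the branch on a[i] != b[i] in the first function when the buffers are marked as secret.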
Constant-flow testing was added relatively recently in the history of the project, and many functions that should be constant-flow are not tested. However, constant-flow testing is preferred when writing new code that claims to be constant-flow, and especially when fixing a timing side channel.
In unit tests, use the following macros, from <test/constant_flow.h>:
TEST_CF_SECRET(buffer, size): marks the given buffer as secret. Call this on keys, plaintext and other confidential data before passing it to library functions.
TEST_CF_PUBLIC(buffer, size): marks the given buffer as public. Call this on outputs before testing their content.
Note that you need to call TEST_CF_PUBLIC before TEST_MEMORY_COMPARE. However, this is not needed with scalar comparison assertions (TEST_EQUAL, etc.), which make a public copy of their arguments before comparing them.
Guidance on writing unit test data
Document the test data
Document how the test data was generated. This helps future maintainers if they need to generate more similar test data, for example to construct a non-regression test for a bug in a particular case.
The documentation can be:
In Python (or other) code committed to the repository. This is preferred when the code can handle a large class of tests. For example, we have frameworks to generate key data for PSA, and to generate bignum tests. Some .data files are fully generated at build time by tests/scripts/generate_*_tests.py.
In a comment in the .data file. This is more convenient when the instructions are specific to a single test case.
In the commit that introduces the test data. This is more convenient when the instructions cover a series of similar test cases.
When adding a new file in tests/data_files, if possible, do it by editing tests/data_files/Makefile to add executable instructions for creating the file, then run make and commit the result. Ideally, removing the files and re-running make should produce identical files. However, in many cases this is not practical because the generation is randomized or depends on the current time (for certificates and related data), and we accept that.
Interoperability testing
Test data for cryptographic algorithms should be, in order of preference:
Official test vectors taken from a standards document.
Generated by a reference implementation from the authors of the algorithm, or by a widespread implementation such as OpenSSL or Cryptodome.
Generated by another independent implementation.
Obtained through manual means, possibly by patching together bits of other tests. For example, the expected output of a multipart operation function can be obtained by splitting the output of a one-shot operation function. Edge cases for parsing can be constructed by manually tweaking nominal cases.
As a last resort, obtained by running the library once. This is a last resort since it cannot validate that the implementation is correct, it can only protect against future behavior changes. This should only be done when there is no other way, for example to construct a non-regression test in an edge case if we’re very confident from code review that our bug fix is correct.
Whatever the source of the data is, remember to document it.
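As an illustration of the "splitting the output of a one-shot operation" approach mentioned in the list above, here is a self-contained sketch. It uses a toy chained transform that is purely illustrative (not a real cipher and not an Mbed TLS API), and all names are hypothetical. It checks that feeding the input in two parts produces the corresponding slices of the one-shot output.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy chained transform, purely illustrative: each output byte is the
 * input byte XORed with the previous output byte. */
struct sketch_op {
    unsigned char state;
};

static void sketch_setup(struct sketch_op *op, unsigned char iv)
{
    op->state = iv;
}

static void sketch_update(struct sketch_op *op, const unsigned char *in,
                          size_t len, unsigned char *out)
{
    assert(op != NULL);
    for (size_t i = 0; i < len; i++) {
        out[i] = (unsigned char) (in[i] ^ op->state);
        op->state = out[i];
    }
}

/* One-shot variant: the whole input in a single call. */
static void sketch_one_shot(unsigned char iv, const unsigned char *in,
                            size_t len, unsigned char *out)
{
    struct sketch_op op;
    sketch_setup(&op, iv);
    sketch_update(&op, in, len, out);
}

/* Multipart variant fed in two parts, split at `first_part_len`.
 * Returns 1 if each part's output matches the corresponding slice of
 * the one-shot output `expected`, 0 otherwise. */
static int sketch_multipart_matches(unsigned char iv, const unsigned char *in,
                                    size_t len, size_t first_part_len,
                                    const unsigned char *expected)
{
    unsigned char out[64];
    struct sketch_op op;

    if (len > sizeof(out) || first_part_len > len) {
        return 0;
    }
    sketch_setup(&op, iv);
    sketch_update(&op, in, first_part_len, out);
    sketch_update(&op, in + first_part_len, len - first_part_len,
                  out + first_part_len);
    return memcmp(out, expected, first_part_len) == 0 &&
           memcmp(out + first_part_len, expected + first_part_len,
                  len - first_part_len) == 0;
}
```

In a real test, the one-shot output would itself come from a more trustworthy source higher in the list, such as an official test vector. Splitting it only derives the expected per-part outputs; it does not validate the one-shot result.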