Parallel mingw builds under Windows fail randomly

We’ve observed an intermittent failure of the Windows-mingw CI job. This job runs python windows_testing.py src logs -b mingw under Windows, which in turn runs mingw32-make.

So far we’ve observed the failure only on the development branch, but that may be a statistical fluke.

Logs:

21:22:57 C:\builds\workspace\mbed-tls-nightly-tests>python windows_testing.py src logs -b mingw
…
21:23:27 "  CC    ../3rdparty/everest/library/Hacl_Curve25519_joined.c"
21:23:27 "  CC    x509.c"
21:23:27 "  CC    x509_create.c"
21:23:27 "  CC    x509_crl.c"
21:23:27 "  CC    x509_crt.c"
21:23:27 "  CC    x509_csr.c"
21:23:27 "  CC    x509write_crt.c"
21:23:27 "  CC    x509write_csr.c"
21:23:27 "  CC    debug.c"
21:23:27 "  CC    net_sockets.c"
21:23:27 "  CC    ssl_cache.c"
21:23:27 "  CC    ssl_ciphersuites.c"
21:23:27 "  CC    ssl_cli.c"
21:23:27 "  CC    ssl_cookie.c"
21:23:27 "  CC    ssl_msg.c"
21:23:27 "  CC    ssl_srv.c"
21:23:27 "  CC    ssl_ticket.c"
21:23:27 "  CC    ssl_tls.c"
21:23:27 "  CC    ssl_tls13_keys.c"
21:23:27 "  CC    ssl_tls13_client.c"
21:23:27 "  CC    ssl_tls13_server.c"
21:23:27 "  CC    ssl_tls13_generic.c"
21:23:27 "  CC    error.c"
21:23:27 "  CC    version_features.c"
21:23:27 "  AR    libmbedx509.a"
21:23:27 "  AR    libmbedtls.a"
21:23:27 ar: libmbedx509.a: Permission denied
21:23:27 Makefile:218: recipe for target 'libmbedx509.a' failed
21:23:27 mingw32-make[1]: *** [libmbedx509.a] Error 1
21:23:27 mingw32-make[1]: *** Waiting for unfinished jobs....
21:23:27 Makefile:18: recipe for target 'lib' failed
21:23:27 mingw32-make: *** [lib] Error 2
21:23:27 
21:23:27 2021-11-08 20:23:23,015 - MinGW - INFO - 
21:23:27 MingW build failed
21:23:27 1 configurations tested, 0 successful

With the failure logs seen so far, the x509 and tls source files are always built in this order (there’s some variation with the crypto files). The order seems to be the same in successful jobs as well. The failure is always with libmbedx509.a.
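To compare build order and failure point across several job logs, a small script can pull out the relevant lines. The following is a minimal sketch, not part of windows_testing.py: the function name summarize_log and both regular expressions are illustrative assumptions, written to match only the line shapes visible in the log excerpt above.

```python
import re

# Assumed line shape, taken from the excerpt: 21:23:27 "  CC    x509.c"
STEP = re.compile(r'"\s+(?P<tool>CC|AR)\s+(?P<target>\S+)"')
# Assumed line shape: 21:23:27 ar: libmbedx509.a: Permission denied
AR_DENIED = re.compile(r"\bar: (?P<archive>\S+): Permission denied")


def summarize_log(text):
    """Return (steps, denied) for one build log.

    steps  -- list of (tool, target) pairs in the order they were logged
    denied -- list of archives that ar reported as "Permission denied"
    """
    steps = []
    denied = []
    for line in text.splitlines():
        step = STEP.search(line)
        if step:
            steps.append((step.group("tool"), step.group("target")))
            continue
        failure = AR_DENIED.search(line)
        if failure:
            denied.append(failure.group("archive"))
    return steps, denied
```

Running this over a failed log and a successful one and comparing the two steps lists would confirm whether the order really is identical. The denied list shows whether the failing archive is always libmbedx509.a.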

Conjecture: this is a race condition that started happening when we added MAKEFLAGS=-j2 to the overall build environment (vars/environ.groovy) in the commit “Parallel make: use -j2 everywhere”, merged on 2021-10-05. @tom-cosgrove-arm notes that this looks like the MinGW bug “parallel builds fail on Windows due to bug in MinGW-w64 used to build binutils”. That bug was fixed in “GNU Arm Embedded Toolchain 9-2020-q2”, which is more recent than what we’re running on the CI.
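One way to test the conjecture would be to run the MinGW build serially and see whether the failure disappears. The following is a hedged sketch of how a Python wrapper could override the inherited MAKEFLAGS for that one job. The helper serial_make_env is hypothetical and is not something windows_testing.py is known to do. It also assumes the jobs flag is written as a single token such as -j2, as in the setting quoted above.

```python
import re

# A single-token jobs flag: -j, -j2, --jobs or --jobs=2.
# Assumption: the split form "-j 2" is not used in MAKEFLAGS.
JOBS_TOKEN = re.compile(r"-j\d*|--jobs(=\d+)?")


def serial_make_env(env):
    """Return a copy of env whose MAKEFLAGS forces a serial build."""
    new_env = dict(env)
    kept = [
        flag
        for flag in new_env.get("MAKEFLAGS", "").split()
        if not JOBS_TOKEN.fullmatch(flag)
    ]
    # -j1 is GNU make's explicit "one job at a time" setting.
    new_env["MAKEFLAGS"] = " ".join(kept + ["-j1"])
    return new_env


# Possible use (not executed here):
#   import os, subprocess
#   subprocess.run(["mingw32-make", "lib"], env=serial_make_env(os.environ))
```

If the failure never reproduces with a serial build, that would support the race-condition explanation, at the cost of a slower MinGW job.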
