One of the more painful issues is a hang (lockup) at full CPU usage. At my workplace, we initially introduced a timeout to work around the hang while trying to determine its cause. It happened in multithreaded R code. We tried various OpenBLAS build flags, to no avail. Setting OPENBLAS_NUM_THREADS=1 does make the problem go away, at the expense of performance.
That R code has since been ported to Python, but we hit the same issue again when using ThreadPoolExecutor, so we had to switch to ProcessPoolExecutor instead.
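
For anyone wanting to try the same switch, here is a minimal sketch of what it can look like. It is not our actual code: heavy_task is a hypothetical stand-in for the BLAS-heavy work, and run_all and its executor_cls parameter are names made up for illustration. Both executors share the same interface in concurrent.futures, so the change can be as small as swapping the class.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def heavy_task(n):
    # Hypothetical stand-in for a computation that would call into OpenBLAS.
    return sum(i * i for i in range(n))


def run_all(sizes, executor_cls=ProcessPoolExecutor, max_workers=2):
    # With ProcessPoolExecutor each task runs in a separate worker process,
    # so the tasks do not share one process's threads while calling the library.
    # Passing ThreadPoolExecutor instead gives the thread-based variant.
    with executor_cls(max_workers=max_workers) as pool:
        # map() returns results in the same order as the inputs.
        return list(pool.map(heavy_task, sizes))


if __name__ == "__main__":
    # The guard matters for process pools: on platforms that start workers
    # by re-importing the main module, unguarded code would run again there.
    print(run_all([10, 100, 1000]))
```

Keep in mind that the task function and its arguments must be picklable when using processes, and that starting processes and passing data between them costs more than doing the same with threads.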