Tools to Help Debug and Optimize the Generated Code
From a given DSP program, the Faust compiler tries to generate the most efficient implementation. Optimizations can be done at DSP writing time, or later on when the target langage is generated (like C++ or LLVM IR). The generated code can have different "shapes" depending of compilation options, and can run faster of slower. Several programs and tools are available to help Faust programmers to test (for possible numerical or precision issues), optimize their programs by discovering the best set of options for a given DSP code, and finally compile them into native code for the target CPUs.
Debugging the DSP Code
The Faust compiler gives error messages when the written code is not syntactically or semantically correct. When a correct program is finally generated, it may still have numerical or precision issues only appearing at runtime. This typically happens when using mathematical functions outside of their definition domain, like calling log(0)
or sqrt(-1)
at some point in the signal path. Those errors have to be then fixed by carefully checking signal range, like verifying the min/max values in vslider/hslider/nentry
user-interface items. One way to detect and understand them is by running the code in a controlled and instrumented environment. A special version of the interpreter
backend can be used for that purpose and is embedded in a dedicated testing tool.
interp-tracer
The interp-tracer
tool runs and instruments the compiled program using the Interpreter backend. Various statistics on the code are collected and displayed while running and/or when closing the application, typically FP_SUBNORMAL
, FP_INFINITE
and FP_NAN
values, or INTEGER_OVERFLOW
and DIV_BY_ZERO
operations. Mode 4 and 5 allow to display the stack trace of the running code when FP_INFINITE
, FP_NAN
or INTEGER_OVERFLOW
values are produced. The -control mode allows to check control parameters, by explicitly setting their min and max values, then running the DSP and setting all controllers (inside their range) in a random way. Mode 4 up to 7 also check LOAD/STORE errors, and are typically used by the Faust compiler developers to check the generated code. A more complete documentation is available on the this page.
Debugging at runtime
On macOS, the faust2caqt script has a -me
option to catch math computation exceptions (floating point exceptions and integer div-by-zero or overflow) at runtime. Developers can possibly use the dsp_me_checker class to decorate a given DSP objet with the math computation exception handling code.
Optimizing the DSP Code
Writing efficient DSP code
TODO
Specializing the DSP code
The Faust compiler can possibly do a lot of optimizations at compile time. The DSP code can for instance be compiled for a fixed sample rate, thus doing at compile time all computation that depends of it. Since the Faust compiler will look for librairies starting from the local folder, a simple way is to locally copy the libraries/platform.lib
file (which contains the SR
definition), and change its definition for a fixed value like 48000 Hz. Then the DSP code has to be recompiled. Note that libraries/platform.lib
also contains the definition of the tablesize
constant which is used in various places to allocate tables for oscillators. Thus decreasing this value can save memory, for instance when compiling for embedded devices. This is the technique used in some Faust services scripts which add the -I /usr/local/share/faust/embedded/
parameter to the Faust command line to use a special version of the platform.lib file.
Optimizing the C++ or LLVM Code
By default the Faust compiler produces a big scalar loop in the generated mydsp::compute
method. Compiler options allow to generate other code "shape", like for instance separated simpler loops connected with buffers in the so-called vectorized mode (obtained using the -vec
option). The assumption is that auto-vectorizer passes in modern compilers will be able to better generate efficient SIMD code for them. In this vec option, the size of the internal buffer can be changed using the -vs value
option. Moreover the computation graph can be organized in deep-first order using -dfs
. A lot of other compilation choices are fully controllable with options. Note that the C/C++ and LLVM backends are the one with the maximum of possible compilation options.
Manually testing each of them and their combination is out of reach. So several tools have been developed to automatize that process and help search the configuration space to discover the best set of compilation options:
faustbench
The faustbench
tool uses the C++ backend to generate a set of C++ files produced with different Faust compiler options. All files are then compiled in a unique binary that will measure DSP CPU of all versions of the compiled DSP. The tool is supposed to be launched in a terminal, but it can be used to generate an iOS project, ready to be launched and tested in Xcode. A more complete documentation is available on the this page.
faustbench-llvm
The faustbench-llvm
tool uses the libfaust
library and its LLVM backend to dynamically compile DSP objects produced with different Faust compiler options, and then measure their DSP CPU usage. Additional Faust compiler options can be given beside the ones that will be automatically explored by the tool. A more complete documentation is available on the this page.
Some faust2xx tools like faust2max6
or faust2caqt
can internally call the faustbench-llvm
tool to discover and later on use the best possible compilation options. Remember that all faust2xx
tools compile in scalar mode by default, but can take any combination of optimal options (like -vec -fun -vs 32 -dfs -mcd 32
for instance) the previously described tools will automatically find.
Compiling for Multiple CPU
On modern CPUs, compiling native code dedicated to the target processor is critical to obtain the best possible performances. When using the C++ backend, the same C++ file can be compiled with gcc
of clang
for each possible target CPU using the appropriate -march=cpu
option. When using the LLVM backend, the same LLVM IR code can be compiled into CPU specific machine code using the dynamic-faust
tool. This step will typically be done using the best compilation options automatically found with the faustbench
tool or faustbench-llvm
tools. A specialized tool has been developed to combine all the possible options.
faust2object
The faust2object
tool either uses the standard C++ compiler or the LLVM dynamic compilation chain (the dynamic-faust
tool) to compile a Faust DSP to object code files (.o) and wrapper C++ header files for different CPUs. The DSP name is used in the generated C++ and object code files, thus allowing to generate distinct versions of the code that can finally be linked together in a single binary. A more complete documentation is available on the this page.