Assignment 2

Software Engineering - Fall 2024

A. Instructions

Assignment 2 is due Nov 24th.

B. General hints

This page will be updated with more hints as I get more questions.

  1. If you get stuck at any stage, immediately contact me.
  2. Try to keep notes of everything you do and how you overcome issues. This will be useful later if you have to perform the same steps again.
  3. It may be easier to prepare your answers in a text file and copy/paste them into Blackboard when you are done.
  4. Your first steps could be as follows:
    • Download and build AFL++.
    • Read the documentation of AFL++ about how to use it effectively.
    • Download and build SCIP 9.1.1:
      • Build SCIP with a regular gcc/clang compiler first (no AFL++).
      • Understand how to use scip from the command line.
      • Run scip on an example file to check that it works.
      • Then build SCIP with AFL++.
      • Start fuzzing SCIP.
  5. Makefiles and make do not handle spaces in file names well. Make sure that the full path to the source code does not contain any spaces.
  6. Whenever you change some compilation parameters, it is a good idea to run make clean to delete all previously-generated compilation results before running make again. In the case of cmake-based builds, one can even delete the whole build/ directory in which cmake was run, and start again from scratch.
  7. Building and fuzzing commands quickly get complex (many parameters). It is often helpful to put such long commands into a script file (e.g. myscript.sh), make the script executable (chmod +x myscript.sh), then run the script file (./myscript.sh). This has the added advantage of allowing you to re-run the exact same building or fuzzing commands later.
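
Hints 6 and 7 can be combined into one small script. The following is only a sketch: the file name build-scip.sh, the variables BUILD_DIR, AFL_CC, AFL_CXX and DRY_RUN, and the default compiler paths are assumptions made for illustration, so adjust them to your system. By default the script only prints the commands it would run, which makes it easy to review before building for real.

```shell
#!/bin/sh
# build-scip.sh (hypothetical name): rebuild SCIP from scratch with AFL++.
# Meant to be run from the top-level SCIP source directory.

BUILD_DIR="${BUILD_DIR:-build}"                        # assumed build directory
AFL_CC="${AFL_CC:-/usr/local/bin/afl-clang-fast}"      # adjust to your system
AFL_CXX="${AFL_CXX:-/usr/local/bin/afl-clang-fast++}"  # adjust to your system
DRY_RUN="${DRY_RUN:-1}"                                # 1 = only print the commands

# Print a command, then run it unless DRY_RUN is 1.
# Stop at the first command that fails.
run() {
    echo "+ $*"
    if [ "$DRY_RUN" != 1 ]; then
        "$@" || exit 1
    fi
}

run rm -rf "$BUILD_DIR"      # delete previous compilation results (hint 6)
run mkdir "$BUILD_DIR"
run cmake -S . -B "$BUILD_DIR" \
    -DCMAKE_C_COMPILER="$AFL_CC" \
    -DCMAKE_C_FLAGS_RELEASE="-O3 -ggdb" \
    -DCMAKE_CXX_COMPILER="$AFL_CXX" \
    -DCMAKE_CXX_FLAGS_RELEASE="-O3 -ggdb" \
    -DPAPILO=off -DIPOPT=off
run make -C "$BUILD_DIR" VERBOSE=1 -j 8
```

Once the printed commands look right, DRY_RUN=0 ./build-scip.sh executes them. Keeping the script next to your notes records exactly how each build was produced.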

C. Compiling SCIP (not platform-specific)

  1. Prefer building scip with cmake:

    mkdir build
    cd build
    cmake .. -DCMAKE_C_COMPILER=/path/to/afl-clang-fast \
             -DCMAKE_C_FLAGS_RELEASE="-O3 -ggdb" \
             -DCMAKE_CXX_COMPILER=/path/to/afl-clang-fast++ \
             -DCMAKE_CXX_FLAGS_RELEASE="-O3 -ggdb" \
             -DPAPILO=off -DIPOPT=off
    make VERBOSE=1 -j 8

    You must replace the /path/to/afl-clang- variants with their actual locations on your system. Prefer putting full absolute paths here (starting with /). Examples include: /opt/homebrew/bin/afl-clang-fast, /usr/local/bin/afl-clang-fast, /usr/bin/afl-clang-fast.

  2. You can use afl-clang-lto if it is available on your platform. Fuzzing will be slightly faster with afl-clang-lto, but compilation will be significantly slower. One option would be to test everything and start fuzzing with afl-clang-fast variants, and only use LTO for long (e.g. overnight) runs in an effort to find more crashes.

  3. Once compilation is done, check that scip is working (executable at build/bin/scip). When scip is started in interactive mode, type q to quit.

  4. If you get an error about no member named 'thesolver' in 'SoPlexBase<R>' when compiling the file soplex/src/soplex/testsoplex.hpp, you can modify this file by adding the following line:

    #define thesolver _solver

    just before the line:

    namespace soplex
  5. Compiling SCIP can take a long time (roughly between 2 and 20 minutes), and you will need to compile it multiple times (e.g. with assert() enabled or disabled, with optimizations enabled or disabled, etc.). Once you successfully produce a scip executable, I would advise copying it in a different directory (i.e. not under build/) for safekeeping. As an alternative, you could store build attempts in different directories (e.g. build-with-assert/, build-without-assert/, etc.)

  6. Many of SCIP’s dependencies are optional. For example, if cmake complains that it cannot find a tool like readline, you can either install it, or disable it by passing the option -DREADLINE=off.

  7. You can tell make to run multiple jobs in parallel by using the -j option. For example, make -j 8 will run up to 8 parallel jobs. One downside of parallel builds is that understanding compilation errors can be harder. In such a case, immediately re-run make without -j and try to fix the compilation error. Once it is fixed, you can interrupt non-parallel make (control+C), then resume building in parallel.

  8. When the Makefile was generated by cmake, you can pass VERBOSE=1 to make to see the exact compilation commands used, for example:

    make VERBOSE=1 -j 8
  9. If you find input files that trigger assertion failures, then make sure they would also cause a crash (e.g. “Segmentation Fault”) if assertions were disabled. For that, create a build of scip without assertions by defining the NDEBUG macro:

    cmake .. -DCMAKE_C_FLAGS_RELEASE="-O3 -ggdb -DNDEBUG" \
             -DCMAKE_CXX_FLAGS_RELEASE="-O3 -ggdb -DNDEBUG" \
             -DPAPILO=off -DIPOPT=off
    make VERBOSE=1 -j 8

    Then, test that new build with the input files and check the results.
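
When checking several input files against the build without assertions (hint 9), the exit status reported by the shell is enough to tell a crash from a clean run. By common shell convention, a process killed by signal N yields status 128+N, so 134 corresponds to SIGABRT (which a failed assert() raises) and 139 to SIGSEGV. The sketch below relies on that convention; the function names classify_status and check_inputs are hypothetical.

```shell
#!/bin/sh
# Map a shell exit status to a short label.
# Convention: a process killed by signal N yields status 128+N.
classify_status() {
    status="$1"
    if [ "$status" -eq 134 ]; then
        echo "SIGABRT (typical of a failed assert)"
    elif [ "$status" -eq 139 ]; then
        echo "SIGSEGV (segmentation fault)"
    elif [ "$status" -gt 128 ]; then
        echo "killed by signal $((status - 128))"
    else
        echo "exited normally with status $status"
    fi
}

# Run a scip binary on each given file and report how it ended.
# $1 = path to scip, remaining arguments = input files.
check_inputs() {
    scip_bin="$1"
    shift
    for input in "$@"; do
        status=0
        "$scip_bin" -f "$input" >/dev/null 2>&1 || status=$?
        echo "$input: $(classify_status "$status")"
    done
}
```

For example, check_inputs build-without-assert/bin/scip file1.mps file2.lp prints one line per file, assuming the build directory naming suggested in hint 5.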

D. Fuzzing SCIP

  1. For the starting set of input files, afl-fuzz works best with a limited number (5 to 20) of small valid files (smaller than, say, 10 kB, and can be as small as 5 to 10 bytes) that are different from each other. Invalid files can be included, but they are typically not useful.

  2. The first time you run afl-fuzz, it may ask you to adjust system settings to allow it to run faster. Whenever those adjustments are easy to perform, it is recommended to follow its advice. For example, this command

    echo core | sudo tee /proc/sys/kernel/core_pattern

    is easy to run and is essentially required for AFL++ to work properly on Linux and WSL2.

  3. When afl-fuzz is running, a red message indicating “no new paths” (or something similar) means that fuzzing is probably not working. The most common cause of that is that the fuzzed program (scip in our case) immediately exits with an error message.

    The first thing to try in such a case is to run the same scip command outside of afl-fuzz, and make sure that everything is working fine.

    If it is, then there must be a difference between us running scip manually and afl-fuzz running it. The next two points describe possible causes of such a difference.

  4. If one runs scip -f input_file, SCIP tries to infer the specific parser to run (e.g. MPS or LP) from the extension of input_file (resp. .mps or .lp). However, by default afl-fuzz generates files with no extension. One can work around this with the -e option of afl-fuzz (see afl-fuzz -h for details). However, this may not be enough; see the next point.

  5. By default afl-fuzz places the “input” files it creates in a directory whose path can confuse scip. If this happens, scip just exits and no fuzzing happens. This problem can be worked around with the -f option of afl-fuzz (see afl-fuzz -h for details), which forces it to use a fixed name and path for input files.

  6. By running scip -f input_file, we are telling SCIP to read a file then solve the corresponding problem instance. This is good if we want to fuzz the whole SCIP codebase, but bad if we are looking for bugs in specific parsers. scip has an interactive mode that gives us more control over what happens. We can simulate this interactive mode with the -c command-line option. For example, the following command

    scip -c "read /path/to/a/file mps" -c "quit"

    tells scip to read /path/to/a/file, specifically with the MPS parser, then quit without solving the corresponding problem. (Note: specifying the mps or lp format like this is only allowed if the file does not already have an extension!) Beware that -c "quit" is necessary, otherwise scip will expect further interactive input from the keyboard.
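
Hints 4 to 6 can be combined into a single afl-fuzz invocation. The following is a sketch under assumptions: the directory names seeds and findings, the fixed input path /tmp/scip_fuzz_input and the default SCIP_BIN location are placeholders, and it relies on afl-fuzz replacing @@ in the target's arguments with the path given to -f. It only prints the command, so that you can review it (or paste it into a script) before running it.

```shell
#!/bin/sh
SCIP_BIN="${SCIP_BIN:-$PWD/build/bin/scip}"   # assumed location of the AFL++-built scip
FUZZ_INPUT="/tmp/scip_fuzz_input"             # fixed input path, deliberately without extension
FORMAT="${FORMAT:-mps}"                       # parser to exercise: mps or lp

# Print the afl-fuzz command line instead of running it.
#   -i seeds     directory with the starting input files
#   -o findings  directory where afl-fuzz stores its results
#   -f ...       fixed name and path for the generated input file (hint 5)
#   @@           placeholder that afl-fuzz replaces with the input file path
print_fuzz_command() {
    printf '%s\n' "afl-fuzz -i seeds -o findings -f $FUZZ_INPUT -- $SCIP_BIN -c \"read @@ $FORMAT\" -c \"quit\""
}

print_fuzz_command
```

Because the input file has no extension, the format can be given explicitly in the read command. The seed files in seeds should be of the same format as the parser being fuzzed.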

  7. If everything is running properly, afl-fuzz should find bugs in the MPS and LP file parsers within the first 2 to 5 minutes.

  8. Once afl-fuzz is running, it keeps looking for new crashes until you stop it. At first, it should be enough to stop it after 5-10 crashes.

  9. You need a varied set of input files to allow AFL++ to find crashes quickly. This is true in particular if you have difficulty finding LP files that cause crashes in scip.

    The source code of SCIP contains example LP files in the scip/check/instances/ directory. I would suggest using all the LP files in this directory, at least all those under 10 kB, and especially those under scip/check/instances/SOS/.
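
Collecting the small LP files by hand is tedious, so the selection can be scripted. This is a minimal sketch: the function name collect_seeds and the seeds directory are hypothetical, and the size limit is expressed in 512-byte blocks because that is the portable unit of find -size.

```shell
#!/bin/sh
# Copy every *.lp file of at most 19 blocks of 512 bytes (9728 bytes, so
# just under 10 kB) from a directory tree into a seed directory.
# $1 = directory to search, $2 = seed directory to fill.
# Note: files with the same name in different subdirectories overwrite
# each other in the seed directory.
collect_seeds() {
    src_dir="$1"
    seed_dir="$2"
    mkdir -p "$seed_dir"
    # -size -20 selects files using fewer than 20 blocks of 512 bytes
    find "$src_dir" -type f -name '*.lp' -size -20 -exec cp {} "$seed_dir" \;
}
```

For example, collect_seeds scip/check/instances seeds fills seeds with candidates, which you can then trim down to a handful of files that differ from each other.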

E. Understanding crashes

  1. Feel free to use a debugger if you already know how to use one. However, some debuggers (e.g. gdb) have a steep learning curve, and for what we are doing, there are other, easier-to-use tools that can give you enough information.

  2. For example, valgrind is very useful for pinpointing the exact location in the source code where a crash happens. It will also give you information about the pointers involved.

    Note: on macOS, if valgrind is not available, you can get the stack trace from lldb.

  3. If not already done, you can also recompile SCIP with assertions enabled, and run it on your crash-causing input. If it still causes a crash even with assertions enabled (unlikely), then this step was not helpful, but on the bright side, you found an answer to the bonus question. If instead you now get an assertion failure, then you get a lot of information about what assumptions SCIP’s coders were making, and how they are not always satisfied.

  4. Next you can add manual instrumentation to SCIP (printf) and/or modify the input files to narrow down the possible explanations for what happens.
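
One way to narrow down an input file, as suggested in hint 4, is to remove one line at a time and keep the removal whenever the crash still occurs. The sketch below is a naive illustration of that idea, not a replacement for dedicated tools such as afl-tmin, which ships with AFL++. The names shrink_input and still_crashes are hypothetical, and still_crashes must be adapted to your own scip command and to the way your crash shows up.

```shell
#!/bin/sh
# Decide whether a candidate file still triggers the crash.
# Assumptions: SCIP_BIN points to your scip build, and a crash shows up as
# an exit status above 128 (process killed by a signal).
still_crashes() {
    status=0
    "${SCIP_BIN:-build/bin/scip}" -f "$1" >/dev/null 2>&1 || status=$?
    [ "$status" -gt 128 ]
}

# Greedy single pass: try deleting each line in turn and keep the deletion
# whenever still_crashes succeeds. $1 = crashing input, $2 = output file.
shrink_input() {
    cp "$1" "$2"
    # The candidate keeps the same file name, hence the same extension,
    # so that scip selects the same parser for it.
    work_dir="$(mktemp -d)"
    candidate="$work_dir/$(basename "$2")"
    line=1
    while [ "$line" -le "$(($(wc -l < "$2")))" ]; do
        sed "${line}d" "$2" > "$candidate"
        if still_crashes "$candidate"; then
            mv "$candidate" "$2"     # line not needed; the next line now has this index
        else
            line=$((line + 1))       # line needed; move on to the next one
        fi
    done
    rm -rf "$work_dir"
}
```

For example, shrink_input crash.mps crash-small.mps leaves the original untouched and writes the reduced file next to it. Each attempt runs scip once, so this is only practical for small inputs.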

F. Hints specific to x86_64 Windows

  1. Preferably, use WSL2.

  2. AFL++ Installation

    1. Option 1 (easy)
      • sudo apt-get install afl++
    2. Option 2 (compile AFL++)
      • Instructions are here: https://github.com/AFLplusplus/AFLplusplus/blob/stable/docs/INSTALL.md. As far as I have been told, the docker-based instructions don’t seem to work, so it is probably better to skip them.

      • You should probably ignore the instructions aiming to install version 14 specifically of LLVM, so replace

        sudo apt-get install -y lld-14 llvm-14 llvm-14-dev clang-14 || sudo apt-get install -y lld llvm llvm-dev clang

        by

        sudo apt-get install -y lld llvm llvm-dev clang
  3. SCIP compilation

    • Download SCIP optimization suite: curl -O "https://scipopt.org/download/release/scipoptsuite-9.1.1.tgz"
    • Build with cmake (see non-platform-specific hints above).
    • Install additional required packages as needed, for example: sudo apt-get install libgmp-dev
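
Before starting a long build, it can save time to check which of the required commands are already installed. This is a small sketch with a hypothetical function name, missing_tools. It only checks commands found on PATH, so libraries such as libgmp-dev are not covered.

```shell
#!/bin/sh
# Print every listed command that cannot be found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Commands used in these hints; an empty output means all were found.
missing_tools cmake make clang afl-fuzz afl-clang-fast
```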

G. Hints specific to M1/M2 macOS

  1. Make sure that your operating system is fully updated.

  2. Ensure Homebrew is installed

    • At least the following packages will be needed: cmake, gmp, bison:

      brew install cmake gmp bison
  3. AFL++ Installation

    1. Option 1 (easy)
      • brew install afl++
      • Note that there is no afl-clang-lto. That is fine: use afl-clang-fast instead.
      • By default, the afl binaries are in:
        • /opt/homebrew/bin/afl-clang-fast
        • /opt/homebrew/bin/afl-clang-fast++
        • /opt/homebrew/bin/afl-fuzz
    2. Option 2 (compile AFL++ from source, not much harder)
      • The installation instructions for macOS are here: https://github.com/AFLplusplus/AFLplusplus/blob/stable/docs/INSTALL.md#macos-on-x86_64-and-arm64 but they are a bit out of date, so before following them, first try this:

      • Just download AFL++ source and run make:

        git clone https://github.com/AFLplusplus/AFLplusplus
        cd AFLplusplus
        make
      • If you get the message “LLVM mode could not be built … please install llvm-13”, then make sure that llvm is installed with brew:

        brew install llvm

        then find the llvm-config utility, which should be somewhere like /opt/homebrew/Cellar/llvm/19.1.3/bin/ (the exact path may vary depending on the version of LLVM installed).

        Then, try to run AFL++’s Makefile again, but specifying where it can find llvm-config, like:

        LLVM_CONFIG=/opt/homebrew/Cellar/llvm/19.1.3/bin/llvm-config   make
      • You may get a message “LLVM LTO mode could not be build”. You can safely ignore this.

      • Test failure messages containing “assembler command failed” can be ignored as well, as long as you get afl-clang-fast and afl-clang-fast++.

      • Note the full paths of the afl-clang-fast and afl-clang-fast++ executable binaries. You will need them later.

  4. SCIP compilation

    • Download SCIP optimization suite: curl -O "https://scipopt.org/download/release/scipoptsuite-9.1.1.tgz"
    • Build with cmake (see non-platform-specific hints above).
  5. If you are on an M1/M2 Mac (not Intel-based) and see a message

    ld: unsupported tapi file type '!tapi-tbd' in YAML file

    when trying to compile, then something went wrong with your system configuration. Find out where the ld utility is located by running:

    which ld

    You should get /opt/homebrew/bin/ld or /usr/local/bin/ld (or possibly /usr/bin/ld) but not anything to do with conda. If you see conda in ld’s path, then you must disable Anaconda.

    conda deactivate

    Note that even if it is deactivated now, Anaconda may have interfered with how LLVM, clang and AFL++ were installed. You may have to re-run, for example, the following with Anaconda deactivated:

    brew install afl++
  6. To get a stack trace, you can use lldb with the following command:

    lldb path/to/bin/scip -o 'run -f path/to/crash/file.mps' -o 'bt' -o 'quit'

    where you need to adjust path/to/bin/scip and -f path/to/crash/file.mps to your specific needs. Explanations:

    • We pass to lldb the path to the binary executable (and, optionally, additional parameters).
    • By default, lldb starts in an interactive mode in which we can type commands. The -o command command-line parameter is equivalent to typing command in interactive mode.
    • The run command tells lldb to run the executable. After run, we type the parameters we want to pass to the executable. In our case, we want scip to read and solve a file, hence -f path/to/crash/file.mps.
    • The bt command tells lldb to print the stack trace.
    • The quit command prevents lldb from entering interactive mode.
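
With several crash files, the lldb command above can be wrapped in a loop that saves one stack trace per file. This is a sketch only: the function name save_backtraces, the LLDB variable and the trace-N.txt file names are assumptions. It also assumes that each crash file has been given a name with the right extension (.mps or .lp), since scip -f infers the parser from it.

```shell
#!/bin/sh
LLDB="${LLDB:-lldb}"   # debugger command, assumed to be on PATH

# Save one stack trace per crash file.
# $1 = scip binary, $2 = output directory, remaining arguments = crash files.
save_backtraces() {
    scip_bin="$1"
    out_dir="$2"
    shift 2
    mkdir -p "$out_dir"
    n=0
    for crash in "$@"; do
        n=$((n + 1))
        # Same lldb invocation as above, with its output redirected to a file.
        "$LLDB" "$scip_bin" -o "run -f $crash" -o 'bt' -o 'quit' \
            > "$out_dir/trace-$n.txt" 2>&1 || true
    done
}
```

For example, save_backtraces build/bin/scip traces crash1.mps crash2.mps writes traces/trace-1.txt and traces/trace-2.txt.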