Structure-Aware Fuzzing With AFL

Dec. 11, 2023 // echel0n

Fuzzing Telegram's rlottie Library

Introduction

Hello guys! It has been a while. This seems to be my final blog post for the year. Do you feel the same way? Has time started to pass quickly? I feel like I have not done much of anything. Anyway, I hope 2024 will bring some prosperity and happiness!

After working on the ELF mutator, I wanted to work on something more broadly useful this time. Yeah, I know, an ELF mutator does not help a lot of developers; its attack surface is mostly analyzers and similar tools. This time I brought you something bigger: an example protobuf mutator that is usable with AFL++. That means that after reading this post you should (I think) be able to write your own harness and protobuf mutator for your target!

I will give the examples using Telegram's fork of the rlottie library (build version 6.1.1_1946). rlottie is a library that loads animations in JSON format. Let's start with the definitions!

Table of Contents

  1. Prologue
  2. Lottie Struct to Protobuf Message
  3. Building & Using AFLplusplus-libprotobuf-mutator
  4. The Harness
  5. Instrumentation
  6. Euphoria
  7. Conclusion/End


Prologue

Coverage-guided, mutation-based fuzzers are really the easiest way to start digging through a project. However, if you are dealing with a target that accepts and processes a complex data type, you need more than random bit flipping: you also need mutations that comply with the grammar, the valid structure. Otherwise, your fuzzer will not get far, because the target will most probably reject your mutated inputs in its very early parsing stages. I have emphasized the importance of structure-aware fuzzing enough, right?

Dealing with the (de)serialization of complex types is very hard. We know it is hard from the bugs discovered every year. We will discover how rlottie deals with it. We will use protobuf (protocol buffers) as an intermediate format to serialize our mutated input, which allows our harness to consume structured data with ease.
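To make the "early rejection" point concrete, here is a minimal sketch in plain C++. Everything in it is hypothetical and only for illustration: `looks_like_json_object` is a toy stand-in for the first checks a JSON parser does (it is not rlottie's parser), `flip_bit` imitates a blind mutation, and `serialize_with_width` imitates a structure-aware one.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Toy stand-in for a parser's first checks: the document must open with '{',
// close with '}', and have balanced braces. (Hypothetical helper; it does not
// treat braces inside string values specially.)
bool looks_like_json_object(const std::string &s) {
    if (s.size() < 2 || s.front() != '{' || s.back() != '}') return false;
    int depth = 0;
    for (char c : s) {
        if (c == '{') ++depth;
        if (c == '}') --depth;
        if (depth < 0) return false;
    }
    return depth == 0;
}

// Blind mutation: flip one bit of one byte, with no knowledge of the format.
std::string flip_bit(std::string s, std::size_t pos, unsigned bit) {
    s[pos] = static_cast<char>(s[pos] ^ (1u << bit));
    return s;
}

// Structure-aware mutation: change a field value and re-serialize,
// so the output is well-formed by construction.
std::string serialize_with_width(int w) {
    return "{\"w\":" + std::to_string(w) + ",\"h\":275}";
}

void demo_early_rejection() {
    const std::string seed = serialize_with_width(250);
    assert(looks_like_json_object(seed));
    // Flipping the lowest bit of '{' (0x7b) yields 'z' (0x7a): rejected at byte 0.
    assert(!looks_like_json_object(flip_bit(seed, 0, 0)));
    // A mutated field value still passes the structural gate.
    assert(looks_like_json_object(serialize_with_width(-7)));
}
```

The blind mutation dies at the very first byte, while the field-level mutation reaches whatever code actually consumes the width.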

Finding our target function

Samsung's rlottie GitHub README.md gives us the candidate functions right away!

We can either target

    auto animation = rlottie::Animation::loadFromData(std::string(rawData), std::string(cacheKey));

or

    auto animation = rlottie::Animation::loadFromFile("absolute_path/test.json"); // giving the input traditionally as argv[1]

Technically, you can choose either function, and we can assume they load the data in the same way. However, I recommend persistent mode for better efficiency, which makes `loadFromData` our ultimate target.

Real Deal

So, what is the valid structure those functions expect anyway?

1) We can inspect Telegram's memory manually at run time.
2) We can look at Lottie animations that have already been created.
3) We can read the code base, especially the structs (nahhh, I am too lazy for this one).

Let's look at this structure.

    // https://github.com/landn172/lottie-miniapp/blob/master/lottie-json.md
    var data = {
        v: '4.6.3', // version
        fr: 30, // 30 fps
        ip: 0,
        op: 73,
        w: 250, // width
        h: 275, // height
        nm: 'B',
        ddd: 0,
        assets: [],
        layers: []
    };
    var assets = ..
    var layers = ..
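As a rough sketch of how those top-level fields fit together, the snippet below mirrors them in a plain C++ struct and prints them back as JSON. The `Composition` struct and both helpers are hypothetical names made up for this example; `ip` and `op` are the in and out points of the animation in frames, so the duration follows from them and `fr`. String values are not escaped here, and nothing in this sketch claims that rlottie accepts a composition with empty `assets` and `layers`.

```cpp
#include <cassert>
#include <string>

// Hypothetical mirror of the top-level Lottie fields listed above.
struct Composition {
    std::string v = "4.6.3"; // version
    int fr = 30;             // frames per second
    int ip = 0;              // in point (first frame)
    int op = 73;             // out point (last frame)
    int w = 250;             // width
    int h = 275;             // height
    std::string nm = "B";    // name
    int ddd = 0;             // 3D flag
};

// Serialize the skeleton with empty assets and layers (no string escaping).
std::string to_json(const Composition &c) {
    return "{\"v\":\"" + c.v + "\",\"fr\":" + std::to_string(c.fr) +
           ",\"ip\":" + std::to_string(c.ip) + ",\"op\":" + std::to_string(c.op) +
           ",\"w\":" + std::to_string(c.w) + ",\"h\":" + std::to_string(c.h) +
           ",\"nm\":\"" + c.nm + "\",\"ddd\":" + std::to_string(c.ddd) +
           ",\"assets\":[],\"layers\":[]}";
}

// Length of the animation in seconds: number of frames divided by the frame rate.
double duration_seconds(const Composition &c) {
    return static_cast<double>(c.op - c.ip) / c.fr;
}

void demo_composition() {
    Composition c;
    assert(to_json(c) ==
           "{\"v\":\"4.6.3\",\"fr\":30,\"ip\":0,\"op\":73,\"w\":250,\"h\":275,"
           "\"nm\":\"B\",\"ddd\":0,\"assets\":[],\"layers\":[]}");
    assert(duration_seconds(c) > 2.43 && duration_seconds(c) < 2.44); // 73 / 30
}
```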

Lottie to Protobuf Message

The translation part is very easy; it looks something like this:

    syntax = "proto3";

    message Asset {
      string id = 1;
      repeated Layer layer = 2;
      message Layer {
        int32 ddd = 1;
        int32 ind = 2;
        int32 ty = 3;
        string nm = 4;
        Keyframes ks = 5;
        int32 ao = 6;
        repeated Shape shapes = 7;
        int32 ip = 8;
        int32 op = 9;
        int32 st = 10;
        int32 bm = 11;
        int32 sr = 12;
      }
    }

    message Layers {
      int32 ddd = 1;
      int32 ind = 2;
      int32 ty = 3;
      string nm = 4;
      Keyframes ks = 5;
      int32 ao = 6;
      repeated Shape shapes = 7;
      int32 ip = 8;
      int32 op = 9;
      int32 st = 10;
      int32 bm = 11;
      int32 sr = 12;
    }

    message Keyframes {
      repeated Keyframe k = 1;
      message Keyframe {
        Bezier i = 1;
        Bezier o = 2;
        Bezier v = 3;
        message Bezier {
          float x = 1;
          float y = 2;
        }
      }
    }

    message Shape {
      string ty = 1;
      Group it = 2;
      int32 ix = 3;
      Keyframes ks = 4;
      string nm = 5;
      string mn = 6;
    }

    message Group {
      repeated Shape shapes = 1;
      string nm = 2;
      int32 np = 3;
      int32 cix = 4;
      int32 ix = 5;
      string mn = 6;
    }

    message LottieMessage {
      string v = 1;
      int32 fr = 2;
      int32 ip = 3;
      int32 op = 4;
      int32 w = 5;
      int32 h = 6;
      string nm = 7;
      int32 ddd = 8;
      repeated Asset assets = 9;
      repeated Layers layers = 10;
    }

In Telegram's fork, some parts may be unused (maybe they did not want that functionality). However, the scope is up to you and depends mostly on which parts you are interested in. If you are willing to write the code that handles the defined parts of the structure, you are free to go! More fields will bring more coverage! (That does not promise more bugs tho, lol.)
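If you are curious what the field numbers in the message definition are for, here is a small sketch of how protobuf's binary wire format encodes an `int32` field. It only shows the encoding itself; whether your mutator setup stores its test cases in binary or in text form is a separate matter, and this sketch makes no claim about that. The helper names are made up for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Base-128 varint: 7 payload bits per byte, least significant group first,
// high bit set on every byte except the last.
std::vector<std::uint8_t> encode_varint(std::uint64_t value) {
    std::vector<std::uint8_t> out;
    while (value >= 0x80) {
        out.push_back(static_cast<std::uint8_t>(value | 0x80));
        value >>= 7;
    }
    out.push_back(static_cast<std::uint8_t>(value));
    return out;
}

// An int32 field is a tag varint ((field_number << 3) | wire type 0) followed
// by the value varint. Negative values are sign-extended to 64 bits first,
// which is why they take ten bytes on the wire.
std::vector<std::uint8_t> encode_int32_field(std::uint32_t field_number, std::int32_t value) {
    std::vector<std::uint8_t> out =
        encode_varint(static_cast<std::uint64_t>(field_number) << 3);
    std::vector<std::uint8_t> payload =
        encode_varint(static_cast<std::uint64_t>(static_cast<std::int64_t>(value)));
    out.insert(out.end(), payload.begin(), payload.end());
    return out;
}

void demo_wire_format() {
    // LottieMessage.fr is field 2; fr = 30 encodes as tag 0x10, value 0x1e.
    assert((encode_int32_field(2, 30) == std::vector<std::uint8_t>{0x10, 0x1e}));
    // LottieMessage.op is field 4; op = 73 encodes as tag 0x20, value 0x49.
    assert((encode_int32_field(4, 73) == std::vector<std::uint8_t>{0x20, 0x49}));
    // One tag byte plus a ten-byte varint for a negative value.
    assert(encode_int32_field(5, -1).size() == 11);
}
```

This is also a reminder that renumbering fields in the .proto file changes the encoding, so a corpus of binary-serialized messages would stop matching the definition.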

Then you can translate this structure directly into C++ classes and compile the result:

    $ protoc --cpp_out=$PWD --proto_path $PWD lottiemessage.proto
    $ clang++ lottiemessage.pb.cc -g -c -fPIC

To wrap up this part, here is what we have done so far:

1) We have found the target function
2) We have written our protobuf message

What we are still missing:

3) Protobuf mutator
4) Harness

Building AFLplusplus-libprotobuf-mutator Project

I found this AFL++ mutator, which is written by P1umer. However, its Quick Start and Usage sections are not much help, especially on Arch Linux, and the default CMake configuration also downloads other dependencies. In my opinion, you can use this mutator like this:

1) Build libprotobuf-mutator separately with set(CMAKE_POSITION_INDEPENDENT_CODE ON)

    $ cmake .. -GNinja -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ -DCMAKE_BUILD_TYPE=Debug -DCMAKE_CXX_STANDARD=17 -DCMAKE_CXX_STANDARD_REQUIRED=ON -DCMAKE_CXX_EXTENSIONS=OFF
    $ sudo ninja install

2) Build afl_mutator.cc with clang++

    $ clang++ -shared -O3 afl_mutator.cc -lprotobuf-mutator -lprotobuf -I /usr/local/include/libprotobuf-mutator -o afl_mutator.so

3) Build Fuzzer/

    $ cd AFLplusplus-protobuf-mutator/src/ && ./build.sh

We are not done yet, but these are the crucial steps before compiling a custom AFL++ protobuf mutator.

Usage of AFLplusplus-libprotobuf-mutator

The usage is very simple: you must write your own `DEFINE_AFL_PROTO_FUZZER(const MessageType& input, unsigned char **out_buf)` and include the `afl_mutator.h` header (found in the mutator's `src/` directory). For our example, it looks something like this:

    #include "/usr/local/include/libprotobuf-mutator/src/libfuzzer/libfuzzer_macro.h"
    #include "afl_mutator.h"      // in AFLplusplus-protobuf-mutator
    #include "lottiemessage.pb.h" // generated by protoc
    #include <array>
    #include <exception>
    #include <fstream>
    #include <google/protobuf/util/json_util.h>
    #include <iostream>
    #include <json/json.h>
    #include <map>
    #include <rlottie.h>
    #include <sstream>
    #include <stdio.h>
    #include <string.h>
    #include <string>
    #include <vector>
    // ...
    using namespace std;

    // Do not blindly rely on util::MessageToJsonString to build the fuzz input: its output
    // breaks down when the mutator injects something that does not comply with the structure.
    // It is still useful for printing, to check that the mutator works as intended.
    #define DUMP_PROTO(input) \
      { \
        std::string json; \
        google::protobuf::util::JsonPrintOptions options; \
        options.add_whitespace = true; \
        options.preserve_proto_field_names = true; \
        options.always_print_primitive_fields = true; \
        options.always_print_enums_as_ints = true; \
        google::protobuf::util::MessageToJsonString(input, &json, options); \
        std::cout << "dump proto: " << json << std::endl; \
      }

    // .. continues

    Json::Value lottieMessageToJSON(const LottieMessage &composition) {
      Json::Value root;
      // carefully convert the message into a JSON value, field by field (trimmed)
      root["v"] = composition.v();
      root["fr"] = composition.fr();
      // ...
      return root;
    }

    DEFINE_AFL_PROTO_FUZZER(const LottieMessage &composition,
                            unsigned char **out_buf) {
      Json::Value json_composition = lottieMessageToJSON(composition); // trimmed..
      Json::StreamWriterBuilder writer;
      // static: the buffer must stay valid after this function returns
      // (assumes the caller copies the data and does not free *out_buf)
      static std::string json_string;
      json_string = Json::writeString(writer, json_composition);
      *out_buf = (unsigned char *)json_string.c_str();
      return json_string.length();
    }
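One thing worth watching in a callback that hands a buffer back through `out_buf` is lifetime: a pointer taken from a local `std::string` dangles as soon as the function returns. Below is a minimal sketch of one way to keep the bytes alive, using only the standard library. The `publish` helper is hypothetical, and it assumes the caller copies the data and never frees the pointer; check how your mutator glue treats `out_buf` before relying on it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper: copy the serialized JSON into storage that outlives
// the callback, then hand out a pointer to that storage.
std::size_t publish(const std::string &json, unsigned char **out_buf) {
    static std::vector<unsigned char> storage; // reused across calls
    storage.assign(json.begin(), json.end());
    storage.push_back('\0');                   // convenience terminator, not counted in the size
    *out_buf = storage.data();
    return json.size();
}

void demo_publish() {
    unsigned char *buf = nullptr;
    std::size_t n = 0;
    {
        std::string local = "{\"w\":250}"; // destroyed at the end of this scope
        n = publish(local, &buf);
    }
    // The bytes are still readable because they live in the static storage.
    assert(n == 9);
    assert(std::memcmp(buf, "{\"w\":250}", n) == 0);
}
```

The trade-off is that the buffer is overwritten by the next call, so this only works when the caller consumes the data before invoking the callback again.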

You may need rlottie's (or your target's) classes too; if so, compile them with clang as well. Be careful about the AFL toolchain tho: up to this point we have not used it, because we did not need to instrument anything so far. We are only compiling the mutator part, not the harness or anything fuzzing related.

Compilation of rlottie:

    $ clang++ -g -fPIC -c -I $(find ../ -name "inc") -I $(find ../ -name "jni") -I $(find ../ -name "stb") -I $(find ../ -name "vector") -I $(find ../ -name "freetype") -I $(find ../ -name "pixman") -I $(find ../ -name "rapidjson") -L $(find ../ -name "libs") $(find ../jni/rlottie -name "*.cpp") -lm -lz
    $ mkdir -p objects_vanilla && mv *.o objects_vanilla/

Compilation of the rlottie mutator:

    # (Fuzzer/ is in AFLplusplus-protobuf-mutator)
    $ clang++ -shared -fPIC -g -O0 mutator.cpp -o libprotobuf_mutator.so -I $(find ../jni/rlottie -name "inc") -I /usr/local/include/libprotobuf-mutator -I Fuzzer/ Fuzzer/libFuzzer.a -lprotobuf -lprotobuf-mutator /usr/local/lib/libprotobuf-mutator-libfuzzer.a /usr/local/lib/libprotobuf-mutator.a -ljsoncpp objects_vanilla/*.o afl_mutator.so
    # your mutator is now libprotobuf_mutator.so
    # ("~" is not expanded inside double quotes, so use $HOME)
    $ export AFL_CUSTOM_MUTATOR_LIBRARY="$HOME/tmp/Telegram/Telegram-release-6.1.1_1946/TMessagesProj/build/libprotobuf_mutator.so"

Now, our mutator is ready! Let's go for the harness and instrumentation!

The Harness

In the harness, thanks to the protobuf mutator, we can assume that we will always get a JSON string as input. I just ripped the example usage from the README.md of the original rlottie repository:

    #include <cstdint>
    #include <map>
    #include <memory>
    #include <string>
    #include <rlottie.h>

    __AFL_FUZZ_INIT();

    int main() {
      __AFL_INIT();
      const unsigned char *raw_buf = __AFL_FUZZ_TESTCASE_BUF; // must be read before the loop
      while (__AFL_LOOP(10000)) {
        // Use the length AFL++ reports; the buffer is not guaranteed to be NUL-terminated.
        int len = __AFL_FUZZ_TESTCASE_LEN;
        const std::string data(raw_buf, raw_buf + len); // a JSON string, which is what rlottie expects
        const std::string CACHE_KEY = "I_DONT_KNOW_WHAT_IS_CACHE_KEY_BUT_WHATEVER";
        // note: this map is never freed in the harness
        std::map<int32_t, int32_t> *colorReplacementMap = new std::map<int32_t, int32_t>();
        auto player = rlottie::Animation::loadFromData(data, CACHE_KEY,
                                                       colorReplacementMap, ".");
        if (player) {
          size_t frame_count = player->totalFrame();
          uint32_t w = 420;
          uint32_t h = 666;
          auto buffer = std::unique_ptr<uint32_t[]>(new uint32_t[w * h]);
          for (size_t frame = 0; frame < frame_count; frame++) {
            rlottie::Surface surface(buffer.get(), w, h, w * 4);
            player->renderSync(frame, surface);
          }
        }
      }
      return 0;
    }
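A note on the input length: `strlen` stops at the first NUL byte and keeps reading until it finds one, so it can both truncate a test case that contains an embedded NUL and run past a buffer that has no terminator at all. AFL++ reports the real size through `__AFL_FUZZ_TESTCASE_LEN`. The sketch below shows the difference with plain standard-library code; `from_testcase` is a hypothetical helper and the byte array stands in for the shared test case buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Build the input string from an explicit length instead of scanning for NUL.
std::string from_testcase(const unsigned char *buf, std::size_t len) {
    return std::string(reinterpret_cast<const char *>(buf), len);
}

void demo_length() {
    // The 9-byte test case {"a":"<NUL>"} followed by one extra NUL so that
    // calling strlen on it stays within bounds for this demonstration.
    const unsigned char testcase[] = {'{', '"', 'a', '"', ':', '"', '\0', '"', '}', '\0'};
    const std::size_t reported_len = 9; // what the fuzzer would report

    // strlen sees only the six bytes before the embedded NUL.
    assert(std::strlen(reinterpret_cast<const char *>(testcase)) == 6);
    // The explicit length keeps all nine bytes, including the NUL.
    assert(from_testcase(testcase, reported_len).size() == 9);
}
```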

The Instrumentation

We are not here only for crashes; we also want to catch stack/heap overflow issues! AddressSanitizer will help us get them. A suitable `ASAN_OPTIONS` setting is:

    $ export ASAN_OPTIONS=verbosity=3:detect_leaks=0:abort_on_error=1:symbolize=0:check_initialization_order=true:detect_stack_use_after_return=true:strict_string_checks=true:detect_invalid_pointer_pairs=2:malloc_context_size=0:allocator_may_return_null=1

With `abort_on_error=1`, the process aborts whenever ASan reports an error (this configuration is crucial for catching memory bugs with AFL++).

Set this variable before compiling, then build everything with the AFL++ toolchain:

    $ export AFL_USE_ASAN=1
    $ afl-clang-lto++ -g -fPIC -shared -c -I $(find ../ -name "inc") -I $(find ../ -name "jni") -I $(find ../ -name "stb") -I $(find ../ -name "vector") -I $(find ../ -name "freetype") -I $(find ../ -name "pixman") -I $(find ../ -name "rapidjson") -L $(find ../ -name "libs") $(find ../jni/rlottie -name "*.cpp") -lm -lz
    $ afl-clang-lto++ lottiemessage.pb.cc -g -fPIC -c -o lottiemessage.pb.o
    $ afl-clang-lto++ -ggdb harn.cpp -o afl_fuzz $(find . -name "*.o" ! -path "./objects_vanilla/*" ! -path "./Fuzzer/*") -I $(find ../jni -name "inc") -I $(find ../jni -name "vector") -L $(find ../ -name "libs") -g -ljsoncpp -lm -lz -lprotobuf -I /usr/local/include/libprotobuf-mutator -labsl_log_internal_check_op -labsl_log_internal_message -labsl_raw_logging_internal -labsl_spinlock_wait
    # (I compiled the vanilla objects within the same directory, so I excluded that directory here)

Before spinning up the fuzzers, set these environment variables so that only the custom mutator is used:

    $ export AFL_CUSTOM_MUTATOR_ONLY=1
    $ export AFL_CUSTOM_MUTATOR_LIBRARY=/where/the/mutator/is/libprotobuf_mutator.so

Then spin up the fuzzer!

(For the best options: https://aflplus.plus/docs/fuzzing_in_depth/)

    $ sudo afl-system-config # this script tunes your system for the best fuzzing performance
    $ LD_LIBRARY_PATH=$PWD afl-fuzz -m none -x json.dict -t 80 -o sync_dir -i corpus/ -M fuzzer01 -- ./afl_fuzz

The Euphoria

So what? Do we happily sit down and wait for a bug? No! The fuzzer can always be fed with new seeds. Whenever you come up with a new valid input, just run:

    $ AFL_BENCH_JUST_ONE=1 AFL_FAST_CAL=1 afl-fuzz -i newseeds -o sync_dir -S newseeds -- ./target

A bigger corpus does not promise more coverage tho. Check these results and compare them:

    $ ls -al corpus/ | wc
    188 1498 15454
    $ ls -al small_corpus/ | wc
    6 42 360

See? Fuzzing with the smaller corpus (but one that contains a "heretic" seed) mutated the input to a known heap overflow case right away, and it reached nearly the same coverage! So, the variety of the seeds is important! Make sure your corpus is diverse! (No, it is not a political joke! lol.)
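To illustrate why many seeds can add up to little extra coverage, here is a toy sketch of a greedy corpus filter: a seed is kept only if it reaches at least one edge that no earlier seed reached. The seed names and edge IDs are made up, and this is not how AFL++ implements it; for real work, AFL++ ships `afl-cmin` for corpus minimization.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <utility>
#include <vector>

using Seed = std::pair<std::string, std::set<int>>; // name, IDs of covered edges

// Greedy pass: keep a seed only if it covers at least one edge not yet seen.
std::vector<std::string> pick_useful_seeds(const std::vector<Seed> &corpus) {
    std::set<int> seen;
    std::vector<std::string> kept;
    for (const Seed &seed : corpus) {
        std::size_t before = seen.size();
        seen.insert(seed.second.begin(), seed.second.end());
        if (seen.size() > before) kept.push_back(seed.first);
    }
    return kept;
}

void demo_corpus() {
    // Three look-alike seeds and one unusual seed that reaches a new edge (9).
    std::vector<Seed> corpus = {
        {"a.json", {1, 2, 3}},
        {"b.json", {1, 2}},
        {"c.json", {2, 3}},
        {"odd.json", {3, 9}},
    };
    std::vector<std::string> kept = pick_useful_seeds(corpus);
    assert((kept == std::vector<std::string>{"a.json", "odd.json"}));
}
```

Two of the four seeds carry all the coverage here, which is the same effect as a six-file corpus keeping up with a much larger one.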

Conclusion

To summarize, we have:

  1. Found a target function
  2. Created a protobuf message
  3. Compiled the AFLplusplus-libprotobuf-mutator
     a) Built libprotobuf-mutator separately with set(CMAKE_POSITION_INDEPENDENT_CODE ON)
          $ cmake .. -GNinja -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ -DCMAKE_BUILD_TYPE=Debug -DCMAKE_CXX_STANDARD=17 -DCMAKE_CXX_STANDARD_REQUIRED=ON -DCMAKE_CXX_EXTENSIONS=OFF
          $ sudo ninja install
     b) Built afl_mutator.cc with clang++
          $ clang++ -shared -O3 afl_mutator.cc -lprotobuf-mutator -lprotobuf -I /usr/local/include/libprotobuf-mutator -o afl_mutator.so
     c) Built Fuzzer/
          $ cd AFLplusplus-protobuf-mutator/src/ && ./build.sh
  4. Written and compiled our mutator with vanilla clang++
          $ clang++ -g -fPIC -c -I $(find ../ -name "inc") -I $(find ../ -name "jni") -I $(find ../ -name "stb") -I $(find ../ -name "vector") -I $(find ../ -name "freetype") -I $(find ../ -name "pixman") -I $(find ../ -name "rapidjson") -L $(find ../ -name "libs") $(find ../jni/rlottie -name "*.cpp") -lm -lz
          $ mv *.o objects_vanilla/
          $ clang++ -shared -fPIC -g -O0 mutator.cpp -o libprotobuf_mutator.so -I $(find ../jni/rlottie -name "inc") -I /usr/local/include/libprotobuf-mutator -I Fuzzer/ Fuzzer/libFuzzer.a -lprotobuf -lprotobuf-mutator /usr/local/lib/libprotobuf-mutator-libfuzzer.a /usr/local/lib/libprotobuf-mutator.a -ljsoncpp objects_vanilla/*.o afl_mutator.so
  5. Written and compiled our harness
          $ export ASAN_OPTIONS=verbosity=3:detect_leaks=0:abort_on_error=1:symbolize=0:check_initialization_order=true:detect_stack_use_after_return=true:strict_string_checks=true:detect_invalid_pointer_pairs=2:malloc_context_size=0:allocator_may_return_null=1
          $ export AFL_USE_ASAN=1
          $ afl-clang-lto++ -g -fPIC -shared -c -I $(find ../ -name "inc") -I $(find ../ -name "jni") -I $(find ../ -name "stb") -I $(find ../ -name "vector") -I $(find ../ -name "freetype") -I $(find ../ -name "pixman") -I $(find ../ -name "rapidjson") -L $(find ../ -name "libs") $(find ../jni/rlottie -name "*.cpp") -lm -lz
          $ afl-clang-lto++ lottiemessage.pb.cc -g -fPIC -c -o lottiemessage.pb.o
          $ afl-clang-lto++ -ggdb harn.cpp -o afl_fuzz $(find . -name "*.o" ! -path "./objects_vanilla/*" ! -path "./Fuzzer/*") -I $(find ../jni -name "inc") -I $(find ../jni -name "vector") -L $(find ../ -name "libs") -g -ljsoncpp -lm -lz -lprotobuf -I /usr/local/include/libprotobuf-mutator -labsl_log_internal_check_op -labsl_log_internal_message -labsl_raw_logging_internal -labsl_spinlock_wait
  6. Spun up the fuzzer!
          $ sudo afl-system-config
          $ LD_LIBRARY_PATH=$PWD afl-fuzz -m none -x json.dict -t 80 -o sync_dir -i corpus/ -M fuzzer01 -- ./afl_fuzz
  7. Fed it new seeds from time to time!
          $ AFL_BENCH_JUST_ONE=1 AFL_FAST_CAL=1 afl-fuzz -i newseeds -o sync_dir -S newseeds -- ./target

The End

This is the end, folks. I may have missed some parts, such as unmentioned or missing libraries like abseil-cpp, jsoncpp, and protobuf. Just reach out to me on Discord/Twitter and we can discuss the problems. I appreciate that you found my blog interesting enough to read! Have a nice year, you absolute legends!

References & Links

  1. https://aflplus.plus/docs/fuzzing_in_depth/
  2. https://github.com/P1umer/AFLplusplus-protobuf-mutator
  3. https://github.com/landn172/lottie-miniapp/tree/master
  4. https://github.com/Samsung/rlottie/tree/master
  5. https://github.com/google/fuzzing/blob/master/docs/structure-aware-fuzzing.md
  6. https://github.com/DrKLO/Telegram/tree/master/TMessagesProj