Hello guys! It has been a while. This seems to be my final blog post for the year. Do you feel the same way? Has time started to pass quickly for you too? I feel like I have not done much of anything. Anyway, I hope 2024 will bring some prosperity and happiness!
After working on the ELF mutator, I wanted to work on something more beneficial this time. Yeah, I know, an ELF mutator does not help a lot of developers; its attack surface is mostly analyzers and similar tools. This time I have brought you something bigger: an example protobuf mutator that is usable with AFL++. That means that after reading this post you should (I think) be able to write your own harness and protobuf mutator for your target!
I will give the examples using Telegram's fork of the rlottie library (6.1.1_1946 build version). rlottie is a library that loads animations in JSON format. Let's start with the definitions!
- Prologue
- Lottie Struct to Protobuf Message
- Building & Using AFLplusplus-libprotobuf-mutator
- The Harness
- Instrumentation
- Euphoria
- Conclusion/End
Coverage-guided, mutation-based fuzzers are really the easiest way to start digging through a project. However, if you are dealing with a target that accepts and processes a complex data type, you need more than random bit flipping: you also need mutations that comply with the grammar, the valid structure. Otherwise your fuzzer will not get far, and the target will most probably reject your mutated inputs at a very early stage. I have emphasized the importance of structure-aware fuzzing enough, right?
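To make the difference concrete, here is a minimal, dependency-free sketch. Everything in it is hypothetical: `Composition` is a three-field stand-in for a Lottie document, and `parses` is a toy parser, not rlottie. It only compares flipping a bit in the serialized bytes with mutating a field and serializing afterwards.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <random>
#include <string>

// Hypothetical, heavily reduced stand-in for a Lottie composition.
struct Composition {
  int fr = 30;
  int w = 250;
  int h = 275;
};

// Turn the structure into the JSON text the (toy) target expects.
std::string serialize(const Composition& c) {
  return "{\"fr\":" + std::to_string(c.fr) + ",\"w\":" + std::to_string(c.w) +
         ",\"h\":" + std::to_string(c.h) + "}";
}

// Toy parser: accepts only the shape produced by serialize().
bool parses(const std::string& s) {
  int fr = 0, w = 0, h = 0, consumed = -1;
  int fields = std::sscanf(s.c_str(), "{\"fr\":%d,\"w\":%d,\"h\":%d}%n",
                           &fr, &w, &h, &consumed);
  return fields == 3 && consumed == static_cast<int>(s.size());
}

// Blind mutation: flip one random bit somewhere in the serialized bytes.
std::string flip_random_bit(std::string data, std::mt19937& rng) {
  assert(!data.empty());
  std::uniform_int_distribution<std::size_t> pos(0, data.size() - 1);
  std::uniform_int_distribution<int> bit(0, 7);
  data[pos(rng)] ^= static_cast<char>(1 << bit(rng));
  return data;
}

// Structure-aware mutation: change one field, then serialize again.
std::string mutate_field(Composition c, std::mt19937& rng) {
  std::uniform_int_distribution<int> field(0, 2);
  std::uniform_int_distribution<int> value(-100000, 100000);
  int* targets[] = {&c.fr, &c.w, &c.h};
  *targets[field(rng)] = value(rng);
  return serialize(c);
}
```

Every output of `mutate_field` still passes `parses`, so the toy target always gets to process the new values. A bit flip only survives when it happens to land on a digit and turn it into another digit.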
Dealing with the (de)serialization of complex types is very hard; the bugs discovered every year prove it. We will find out how rlottie deals with it. We will use protobuf (protocol buffers) as an intermediate format to serialize our mutated input, which lets our harness consume structured data with ease.
Samsung's rlottie GitHub README.md gives away the candidate functions right away!
We can either target
- auto animation = rlottie::Animation::loadFromData(std::string(rawData), std::string(cacheKey));
or
- auto animation = rlottie::Animation::loadFromFile("absolute_path/test.json"); // giving the input traditionally as argv[1]
Technically, you can choose either function; we can assume they load the data in the same way. I recommend persistent mode for better efficiency, though, which makes `loadFromData` our ultimate target.
So, what does a valid input for those functions look like anyway?
1) We can check manually at run time in Telegram's memory. Let's look at this structure.
- https://github.com/landn172/lottie-miniapp/blob/master/lottie-json.md
- var data = {
- v: '4.6.3', // version
- fr: 30, // 30 fps
- ip: 0,
- op: 73,
- w: 250, // width
- h: 275, // height
- nm: 'B',
- ddd: 0,
- assets: [],
- layers: []
- };
- var assets = ..
- var layers = ..
The translation part is very easy; it looks something like this:
- syntax = "proto3";
-
- message Asset {
- string id = 1;
- repeated Layer layer = 2;
-
- message Layer {
- int32 ddd = 1;
- int32 ind = 2;
- int32 ty = 3;
- string nm = 4;
- Keyframes ks = 5;
- int32 ao = 6;
- repeated Shape shapes = 7;
- int32 ip = 8;
- int32 op = 9;
- int32 st = 10;
- int32 bm = 11;
- int32 sr = 12;
- }
- }
-
- message Layers {
- int32 ddd = 1;
- int32 ind = 2;
- int32 ty = 3;
- string nm = 4;
- Keyframes ks = 5;
- int32 ao = 6;
- repeated Shape shapes = 7;
- int32 ip = 8;
- int32 op = 9;
- int32 st = 10;
- int32 bm = 11;
- int32 sr = 12;
- }
-
- message Keyframes {
- repeated Keyframe k = 1;
-
- message Keyframe {
- Bezier i = 1;
- Bezier o = 2;
- Bezier v = 3;
-
- message Bezier {
- float x = 1;
- float y = 2;
- }
- }
- }
-
- message Shape {
- string ty = 1;
- Group it = 2;
- int32 ix = 3;
- Keyframes ks = 4;
- string nm = 5;
- string mn = 6;
- }
-
- message Group {
- repeated Shape shapes = 1;
- string nm = 2;
- int32 np = 3;
- int32 cix = 4;
- int32 ix = 5;
- string mn = 6;
- }
-
- message LottieMessage {
- string v = 1;
- int32 fr = 2;
- int32 ip = 3;
- int32 op = 4;
- int32 w = 5;
- int32 h = 6;
- string nm = 7;
- int32 ddd = 8;
- repeated Asset assets = 9;
- repeated Layers layers = 10;
- }
In Telegram's fork some parts may be unused (maybe they did not want that functionality). However, the scope is up to you; it mostly depends on which parts you are interested in. If you are willing to write the code that handles the defined parts of the structure, you are free to go! More fields will bring more coverage! (That does not promise more bugs tho, lol.)
Then you can translate this structure directly into C++ classes with protoc and compile them:
- $ protoc --cpp_out=$PWD --proto_path $PWD lottiemessage.proto
- $ clang++ lottiemessage.pb.cc -g -c -fPIC
To wrap up this part, here is what we have done so far: we found a target function and created a protobuf message for its input.
What we are still missing:
3) Protobuf Mutator
I have found this AFL++ mutator, written by P1umer. However, its Quick Start and Usage sections are not much help, especially on Arch Linux, and the default CMake configuration also downloads other things. In my opinion, you can use this mutator like this:
1) Build libprotobuf-mutator separately with set(CMAKE_POSITION_INDEPENDENT_CODE ON)
- $ cmake .. -GNinja -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ -DCMAKE_BUILD_TYPE=Debug -DCMAKE_CXX_STANDARD=17 -DCMAKE_CXX_STANDARD_REQUIRED=ON -DCMAKE_CXX_EXTENSIONS=OFF
- $ sudo ninja install
2) Build afl_mutator.cc with clang++
- $ clang++ -shared -O3 afl_mutator.cc -lprotobuf-mutator -lprotobuf -I /usr/local/include/libprotobuf-mutator -o afl_mutator.so
3) Build Fuzzer/
- $ cd AFLplusplus-protobuf-mutator/src/ && ./build.sh
We are not done yet, but these are the crucial parts before compiling a custom AFL++ protobuf mutator.
The usage is very simple: you must write your own `DEFINE_AFL_PROTO_FUZZER(const MessageType& input, unsigned char **out_buf)` and include the `#include "src/afl-mutator.h"` file. For our example, it looks something like this:
- #include "/usr/local/include/libprotobuf-mutator/src/libfuzzer/libfuzzer_macro.h"
- #include "afl_mutator.h" // in aflplusplus-libprotobuf-mutator
- #include "lottiemessage.pb.h" // which is created by protoc
- #include <array>
- #include <exception>
- #include <fstream>
- #include <google/protobuf/util/json_util.h>
- #include <iostream>
- #include <json/json.h>
- #include <map>
- #include <rlottie.h>
- #include <sstream>
- #include <stdio.h>
- #include <string.h>
- #include <string>
- #include <vector>
- // ...
- using namespace std;
- // MessageToJsonString is not suitable for producing the fuzzer input: do not use it blindly,
- // because it can fail once something bad has been injected into the protobuf message and the
- // mutated message no longer complies with the expected structure.
- // You can still use it for printing, to check that the mutator works as intended, tho.
- #define DUMP_PROTO(input) \
- { \
- std::string json; \
- google::protobuf::util::JsonPrintOptions options; \
- options.add_whitespace = true; \
- options.preserve_proto_field_names = true; \
- options.always_print_primitive_fields = true; \
- options.always_print_enums_as_ints = true; \
- google::protobuf::util::MessageToJsonString(input, &json, options); \
- std::cout << "dump proto: " << json << std::endl; \
- }
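- // .. continues
- Json::Value lottieMessageToJSON(const LottieMessage &composition) {
-   Json::Value root;
-   // carefully convert each message field into its JSON counterpart,
-   // for example (illustrative only, the real body is trimmed):
-   root["v"] = composition.v();
-   root["fr"] = composition.fr();
-   // ... remaining fields, assets and layers are trimmed here
-   return root;
- }
-
- DEFINE_AFL_PROTO_FUZZER(const LottieMessage &composition,
-                         unsigned char **out_buf) {
-   // static: *out_buf must stay valid after this function returns, so the
-   // string must not be destroyed at the end of the call
-   static std::string json_string;
-
-   Json::Value json_composition = lottieMessageToJSON(composition); // trimmed..
-   Json::StreamWriterBuilder writer;
-   json_string = Json::writeString(writer, json_composition);
-   *out_buf = (unsigned char *)json_string.c_str();
-   return json_string.length();
- }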
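Two details of this step are easy to get wrong, so here is a dependency-free sketch of them. The names are all made up for illustration: `Composition` stands in for the protoc-generated `LottieMessage`, and `escape_json`, `to_json` and `emit` are hypothetical helpers, not part of jsoncpp or the mutator. The first detail is that mutated strings may contain quotes or control characters and need escaping. The second is that the bytes handed back through `out_buf` must outlive the function call.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical, reduced stand-in for the generated LottieMessage class.
struct Composition {
  std::string v;
  std::string nm;
  int fr;
  int w;
  int h;
};

// Escape a string so it remains a well-formed JSON string literal.
// Bytes >= 0x80 are passed through untouched (no UTF-8 validation here).
std::string escape_json(const std::string& in) {
  std::string out;
  for (unsigned char c : in) {
    switch (c) {
      case '"':  out += "\\\""; break;
      case '\\': out += "\\\\"; break;
      case '\n': out += "\\n";  break;
      case '\r': out += "\\r";  break;
      case '\t': out += "\\t";  break;
      default:
        if (c < 0x20) {
          char tmp[8];
          std::snprintf(tmp, sizeof tmp, "\\u%04x", static_cast<unsigned>(c));
          out += tmp;
        } else {
          out += static_cast<char>(c);
        }
    }
  }
  return out;
}

// Field-by-field conversion of the structure into JSON text.
std::string to_json(const Composition& c) {
  return "{\"v\":\"" + escape_json(c.v) + "\",\"nm\":\"" + escape_json(c.nm) +
         "\",\"fr\":" + std::to_string(c.fr) + ",\"w\":" + std::to_string(c.w) +
         ",\"h\":" + std::to_string(c.h) + "}";
}

// Hand the bytes to the caller. The storage is static, so the pointer stays
// valid after the function returns and until the next call overwrites it.
std::size_t emit(const Composition& c, unsigned char** out_buf) {
  assert(out_buf != nullptr);
  static std::string storage;
  storage = to_json(c);
  *out_buf = reinterpret_cast<unsigned char*>(&storage[0]);
  return storage.size();
}
```

The static buffer is the simplest choice for a single-threaded mutator. If the framework expected to take ownership of the buffer instead, a heap allocation would be needed, so check the mutator header you build against.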
You may need rlottie's (or your target's) classes too; if so, compile them with clang as well. Be careful about the AFL toolchain, though. Up to this point we have not used AFL's toolchain, because we have not needed to instrument anything so far. We are only compiling the mutator part, nothing harness- or fuzzing-related.
Compilation of rlottie:
- - clang++ -g -fPIC -c -I $(find ../ -name "inc") -I $(find ../ -name "jni") -I $(find ../ -name "stb") -I $(find ../ -name "vector") -I $(find ../ -name "freetype") -I $(find ../ -name "pixman") -I $(find ../ -name "rapidjson") -L $(find ../ -name "libs") $(find ../jni/rlottie -name "*.cpp") -lm -lz
Compilation of rlottie mutator:
- # (Fuzzer/ is at AFLplusplus-protobuf-mutator)
- - clang++ -shared -fPIC -g -O0 mutator.cpp -o libprotobuf_mutator.so -I $(find ../jni/rlottie -name "inc") -I /usr/local/include/libprotobuf-mutator -I Fuzzer/ Fuzzer/libFuzzer.a -lprotobuf -lprotobuf-mutator /usr/local/lib/libprotobuf-mutator-libfuzzer.a /usr/local/lib/libprotobuf-mutator.a -ljsoncpp objects_vanilla/*.o afl_mutator.so
- # your mutator is now libprotobuf_mutator.so
- - export AFL_CUSTOM_MUTATOR_LIBRARY="$HOME/tmp/Telegram/Telegram-release-6.1.1_1946/TMessagesProj/build/libprotobuf_mutator.so"
Now, our mutator is ready! Let's go for the harness and instrumentation!
In the harness part, thanks to the protobuf mutator, we can assume that we will always get a JSON string as input. I just adapted the example usage from the original README.md of rlottie here:
- while (__AFL_LOOP(10000)) {
- const unsigned char *raw_buf = __AFL_FUZZ_TESTCASE_BUF; // it is a JSON string and rlottie expects a JSON string!
- // use the length reported by AFL++: the test case is not guaranteed to be NUL-terminated, so strlen is unsafe here
- int len = __AFL_FUZZ_TESTCASE_LEN;
- const std::string data(raw_buf, raw_buf + len);
-
- const std::string CACHE_KEY = "I_DONT_KNOW_WHAT_IS_CACHE_KEY_BUT_WHATEVER";
- std::map<int32_t, int32_t> *colorReplacementMap = new std::map<int32_t, int32_t>();
- auto player = rlottie::Animation::loadFromData(data, CACHE_KEY,
- colorReplacementMap, ".");
- if (player) {
- size_t frame_count = player->totalFrame();
-
- uint32_t w = 420;
- uint32_t h = 666;
-
- auto buffer = std::unique_ptr<uint32_t[]>(new uint32_t[w * h]);
-
- for (size_t frame = 0; frame < frame_count; frame++) {
- rlottie::Surface surface(buffer.get(), w, h, w * 4);
- player->renderSync(frame, surface);
- }
- }
- }
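Two bits of arithmetic in this harness deserve a closer look: how the input string is built from the raw buffer, and how big the render buffer has to be. The sketch below assumes nothing about rlottie or AFL++. `make_input`, `Frame` and `make_frame` are hypothetical helpers that only mirror the sizes used above (32-bit pixels, so one line is `w * 4` bytes).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Build the input from an explicit (pointer, length) pair. Unlike strlen,
// this neither stops at an embedded NUL nor reads past an unterminated buffer.
std::string make_input(const unsigned char* buf, std::size_t len) {
  assert(buf != nullptr || len == 0);
  if (len == 0) return std::string();
  return std::string(reinterpret_cast<const char*>(buf), len);
}

// Hypothetical helper: a w*h frame of 32-bit pixels plus its stride in bytes.
struct Frame {
  std::vector<std::uint32_t> pixels;
  std::size_t width;
  std::size_t height;
  std::size_t bytes_per_line;
};

Frame make_frame(std::size_t w, std::size_t h) {
  Frame f;
  f.pixels.assign(w * h, 0);                    // one uint32_t per pixel
  f.width = w;
  f.height = h;
  f.bytes_per_line = w * sizeof(std::uint32_t); // the "w * 4" passed to Surface
  return f;
}
```

For the 420x666 surface used above, that gives 279720 pixels and 1680 bytes per line.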
We are not here only for crashes; we also want to catch stack/heap overflow issues! AddressSanitizer will help us catch them. A suitable `ASAN_OPTIONS` is:
- $ export ASAN_OPTIONS=verbosity=3:detect_leaks=0:abort_on_error=1:symbolize=0:check_initialization_order=true:detect_stack_use_after_return=true:strict_string_checks=true:detect_invalid_pointer_pairs=2:malloc_context_size=0:allocator_may_return_null=1
With this configuration the target aborts as soon as ASan reports an error (this is crucial for catching memory bugs with AFL++).
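Why does aborting matter? As far as I understand, afl-fuzz counts a run as a crash when the target dies from a signal, while ASan without `abort_on_error=1` prints its report and exits with a non-zero status. The POSIX sketch below is not AFL++ code, and `run_in_child` and `Outcome` are names I made up. It only shows how a parent process can tell those two endings apart.

```cpp
#include <cassert>
#include <cstdlib>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

enum class Outcome { CleanExit, ErrorExit, KilledBySignal, ForkFailed };

// Run body() in a child process and classify how the child ended.
Outcome run_in_child(void (*body)()) {
  assert(body != nullptr);
  pid_t pid = fork();
  if (pid < 0) return Outcome::ForkFailed;
  if (pid == 0) {  // child
    body();
    _exit(0);
  }
  int status = 0;
  if (waitpid(pid, &status, 0) < 0) return Outcome::ForkFailed;
  if (WIFSIGNALED(status)) return Outcome::KilledBySignal;  // e.g. SIGABRT
  return WEXITSTATUS(status) == 0 ? Outcome::CleanExit : Outcome::ErrorExit;
}

// Three ways a target run can end.
void dies_by_abort() { std::abort(); }  // what abort_on_error=1 leads to
void exits_with_error() { _exit(1); }   // a plain non-zero exit
void returns_normally() {}
```

Only `dies_by_abort` ends up as `KilledBySignal`; a plain `_exit(1)` looks like an ordinary failed run to the parent.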
Set these variables before compiling:
- $ export AFL_USE_ASAN=1
- $ afl-clang-lto++ -g -fPIC -shared -c -I $(find ../ -name "inc") -I $(find ../ -name "jni") -I $(find ../ -name "stb") -I $(find ../ -name "vector") -I $(find ../ -name "freetype") -I $(find ../ -name "pixman") -I $(find ../ -name "rapidjson") -L $(find ../ -name "libs") $(find ../jni/rlottie -name "*.cpp") -lm -lz
- $ afl-clang-lto++ lottiemessage.pb.cc -g -fPIC -c -o lottiemessage.pb.o
- $ afl-clang-lto++ -ggdb harn.cpp -o afl_fuzz $(find . -name "*.o" ! -path "./objects_vanilla/*" ! -path "./Fuzzer/*") -I $(find ../jni -name "inc") -I $(find ../jni -name "vector") -L $(find ../ -name "libs") -g -ljsoncpp -lm -lz -lprotobuf -I /usr/local/include/libprotobuf-mutator -labsl_log_internal_check_op -labsl_log_internal_message -labsl_raw_logging_internal -labsl_spinlock_wait
- # (I compiled the vanilla objects in the same directory, so I excluded that directory to avoid linking them)
Before spinning up the fuzzers, set these environment variables to use only the custom mutator:
- $ export AFL_CUSTOM_MUTATOR_ONLY=1
- $ export AFL_CUSTOM_MUTATOR_LIBRARY=/where/the/mutator/is/libprotobuf_mutator.so
Then spin up the fuzzer!
- $ sudo afl-system-config # this script tunes your system for the best fuzzing performance
- $ LD_LIBRARY_PATH=$PWD afl-fuzz -m none -x json.dict -t 80 -o sync_dir -i corpus/ -M fuzzer01 -- ./afl_fuzz
So what? Will we happily sit down and wait for a bug? No! The fuzzer can always be fed with new seeds. Just do this whenever you come up with a new valid input:
- $ AFL_BENCH_JUST_ONE=1 AFL_FAST_CAL=1 afl-fuzz -i newseeds -o sync_dir -S newseeds -- ./afl_fuzz
A bigger corpus does not promise more coverage tho. Check these results and compare them:
- $ ls -al corpus/ | wc
- 188 1498 15454
- $ ls -al small_corpus/ | wc
- 6 42 360
See? Fuzzing with the smaller corpus (which contains one heretic seed) could mutate the input into a known heap overflow case almost immediately, and it reaches nearly the same coverage! So the variety of the seeds is what matters. Make sure your corpus is diverse! (No, it is not a political joke! lol.)
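One way to think about "diverse" is coverage: a seed earns its place only if it reaches something the others do not. Below is a minimal one-pass greedy sketch of that idea. It is not how afl-cmin is implemented; `Seed`, `minimize_corpus` and the edge ids are hypothetical, and in practice the coverage sets would come from an instrumented run.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical record: a seed file and the ids of the edges it covers.
struct Seed {
  std::string name;
  std::set<int> edges;
};

// Keep a seed only if it covers at least one edge no earlier seed covered.
std::vector<std::string> minimize_corpus(const std::vector<Seed>& seeds) {
  std::set<int> covered;
  std::vector<std::string> kept;
  for (const Seed& seed : seeds) {
    bool adds_new = false;
    for (int edge : seed.edges) {
      // insert() reports whether the edge was new; insert all of them.
      if (covered.insert(edge).second) adds_new = true;
    }
    if (adds_new) kept.push_back(seed.name);
  }
  assert(kept.size() <= seeds.size());
  return kept;
}
```

With seeds covering {1, 2}, {1, 2} and {2, 3}, the second one is dropped: it is a perfectly valid input, but it adds nothing the first one did not already reach.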
To summarize, we have:
- Found a target function
- Created a protobuf message
- Compiled the AFLplusplus-libprotobuf-mutator
- Build libprotobuf-mutator separately with set(CMAKE_POSITION_INDEPENDENT_CODE ON)
- - cmake .. -GNinja -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ -DCMAKE_BUILD_TYPE=Debug -DCMAKE_CXX_STANDARD=17 -DCMAKE_CXX_STANDARD_REQUIRED=ON -DCMAKE_CXX_EXTENSIONS=OFF
- - sudo ninja install
- Build afl_mutator.cc with clang++
- - clang++ -shared -O3 afl_mutator.cc -lprotobuf-mutator -lprotobuf -I /usr/local/include/libprotobuf-mutator -o afl_mutator.so
- Build Fuzzer/
- - cd AFLplusplus-protobuf-mutator/src/ && ./build.sh
- Wrote and compiled our mutator with vanilla clang++
- - clang++ -g -fPIC -c -I $(find ../ -name "inc") -I $(find ../ -name "jni") -I $(find ../ -name "stb") -I $(find ../ -name "vector") -I $(find ../ -name "freetype") -I $(find ../ -name "pixman") -I $(find ../ -name "rapidjson") -L $(find ../ -name "libs") $(find ../jni/rlottie -name "*.cpp") -lm -lz
- - mv *.o objects_vanilla/
- - clang++ -shared -fPIC -g -O0 mutator.cpp -o libprotobuf_mutator.so -I $(find ../jni/rlottie -name "inc") -I /usr/local/include/libprotobuf-mutator -I Fuzzer/ Fuzzer/libFuzzer.a -lprotobuf -lprotobuf-mutator /usr/local/lib/libprotobuf-mutator-libfuzzer.a /usr/local/lib/libprotobuf-mutator.a -ljsoncpp objects_vanilla/*.o afl_mutator.so
-
- Wrote and compiled our harness
- - export ASAN_OPTIONS=verbosity=3:detect_leaks=0:abort_on_error=1:symbolize=0:check_initialization_order=true:detect_stack_use_after_return=true:strict_string_checks=true:detect_invalid_pointer_pairs=2:malloc_context_size=0:allocator_may_return_null=1
- - export AFL_USE_ASAN=1
- - afl-clang-lto++ -g -fPIC -shared -c -I $(find ../ -name "inc") -I $(find ../ -name "jni") -I $(find ../ -name "stb") -I $(find ../ -name "vector") -I $(find ../ -name "freetype") -I $(find ../ -name "pixman") -I $(find ../ -name "rapidjson") -L $(find ../ -name "libs") $(find ../jni/rlottie -name "*.cpp") -lm -lz
- - afl-clang-lto++ lottiemessage.pb.cc -g -fPIC -c -o lottiemessage.pb.o
- - afl-clang-lto++ -ggdb harn.cpp -o afl_fuzz $(find . -name "*.o" ! -path "./objects_vanilla/*" ! -path "./Fuzzer/*") -I $(find ../jni -name "inc") -I $(find ../jni -name "vector") -L $(find ../ -name "libs") -g -ljsoncpp -lm -lz -lprotobuf -I /usr/local/include/libprotobuf-mutator -labsl_log_internal_check_op -labsl_log_internal_message -labsl_raw_logging_internal -labsl_spinlock_wait
-
- Spun up the fuzzer!
- - sudo afl-system-config
- - LD_LIBRARY_PATH=$PWD afl-fuzz -m none -x json.dict -t 80 -o sync_dir -i corpus/ -M fuzzer01 -- ./afl_fuzz
- Fed it new seeds from time to time!
- - AFL_BENCH_JUST_ONE=1 AFL_FAST_CAL=1 afl-fuzz -i newseeds -o sync_dir -S newseeds -- ./afl_fuzz
- https://aflplus.plus/docs/fuzzing_in_depth/
- https://github.com/P1umer/AFLplusplus-protobuf-mutator
- https://github.com/landn172/lottie-miniapp/tree/master
- https://github.com/Samsung/rlottie/tree/master
- https://github.com/google/fuzzing/blob/master/docs/structure-aware-fuzzing.md
- https://github.com/DrKLO/Telegram/tree/master/TMessagesProj