[Repost] Resolving kernel symbols
Original: http://ho.ax/posts/2012/02/resolving-kernel-symbols/
KXLD doesn’t like us much. He has KPIs to meet and doesn’t have time to help out shifty rootkit developers. KPIs are Kernel Programming Interfaces - lists of symbols in the kernel that KXLD (the kernel extension linker) will allow kexts to be linked against. The KPIs on which your kext depends are specified in the Info.plist file like this:
<key>OSBundleLibraries</key>
<dict>
<key>com.apple.kpi.bsd</key>
<string>11.0</string>
<key>com.apple.kpi.libkern</key>
<string>11.0</string>
<key>com.apple.kpi.mach</key>
<string>11.0</string>
<key>com.apple.kpi.unsupported</key>
<string>11.0</string>
<key>com.apple.kpi.iokit</key>
<string>11.0</string>
<key>com.apple.kpi.dsep</key>
<string>11.0</string>
</dict>
Those bundle identifiers correspond to the CFBundleIdentifier key specified in the Info.plist files for “plug-ins” to the System.kext kernel extension. Each KPI has its own plug-in kext - for example, the com.apple.kpi.bsd symbol table lives in BSDKernel.kext. These aren’t exactly complete kexts, they’re just Mach-O binaries with symbol tables full of undefined symbols (the symbols really reside within the kernel image), which we can see if we dump the load commands:
$ otool -l /System/Library/Extensions/System.kext/PlugIns/BSDKernel.kext/BSDKernel
/System/Library/Extensions/System.kext/PlugIns/BSDKernel.kext/BSDKernel:
Load command 0
cmd LC_SYMTAB
cmdsize 24
symoff 80
nsyms 830
stroff 13360
strsize 13324
Load command 1
cmd LC_UUID
cmdsize 24
uuid B171D4B0-AC45-47FC-8098-5B2F89B474E6
That’s it - just the LC_SYMTAB (symbol table). So, how many symbols are there in the kernel image?
$ nm /mach_kernel|wc -l
16122
Surely all the symbols in all the KPI symbol tables add up to the same number, right?
$ find /System/Library/Extensions/System.kext/PlugIns -type f|grep -v plist|xargs nm|sort|uniq|wc -l
7677
Nope. Apple doesn’t want us to play with a whole bunch of their toys. 8445 of them. Some of them are pretty fun too :( Like allproc:
$ nm /mach_kernel|grep allproc
ffffff80008d9e40 S _allproc
$ find /System/Library/Extensions/System.kext/PlugIns -type f|grep -v plist|xargs nm|sort|uniq|grep allproc
$
Damn. The allproc symbol is the head of the kernel’s list (the queue(3) kind of list) of running processes. It’s what gets queried when you run ps(1) or top(1). Why do we want to find allproc? If we want to hide processes in a kernel rootkit that’s the best place to start. So, what happens if we build a kernel extension that imports allproc and try to load it?
bash-3.2# kextload AllProcRocks.kext
/Users/admin/AllProcRocks.kext failed to load - (libkern/kext) link error; check the system/kernel logs for errors or try kextutil(8).
Console says:
25/02/12 6:30:47.000 PM kernel: kxld[ax.ho.kext.AllProcRocks]: The following symbols are unresolved for this kext:
25/02/12 6:30:47.000 PM kernel: kxld[ax.ho.kext.AllProcRocks]: _allproc
OK, whatever.
What do we do?
There are a few steps that we need to take in order to resolve symbols in the kernel (or any other Mach-O binary):
- Find the __LINKEDIT segment - this contains an array of struct nlist_64’s which represent all the symbols in the symbol table, and an array of symbol name strings.
- Find the LC_SYMTAB load command - this contains the offsets within the file of the symbol and string tables.
- Calculate the position of the string table within __LINKEDIT based on the offsets in the LC_SYMTAB load command.
- Iterate through the struct nlist_64’s in __LINKEDIT, comparing the corresponding string in the string table to the name of the symbol we’re looking for until we find it (or reach the end of the symbol table).
- Grab the address of the symbol from the struct nlist_64 we’ve found.
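The steps above can be sketched roughly as follows. This is a userland illustration in Python, not the kext code: `mem` and `base` are hypothetical stand-ins for a copy of the loaded image and the address of its first byte, and `resolve_symbol` is a name made up for this example. Like the rest of this post, it assumes the symbol table sits at the very start of `__LINKEDIT`.

```python
import struct

LC_SYMTAB = 0x2
LC_SEGMENT_64 = 0x19
MH_MAGIC_64 = 0xFEEDFACF

HEADER = struct.Struct("<IiiIIIII")        # struct mach_header_64
LOAD_CMD = struct.Struct("<II")            # struct load_command
SEGMENT = struct.Struct("<II16sQQQQiiII")  # struct segment_command_64
SYMTAB = struct.Struct("<IIIIII")          # struct symtab_command
NLIST = struct.Struct("<IBBHQ")            # struct nlist_64


def resolve_symbol(mem, base, name):
    """Return the n_value of `name`, or None if it is not in the symbol table.

    `mem` is a bytes copy of the loaded image and `base` is the virtual
    address of mem[0]; both are hypothetical stand-ins for kernel memory.
    """
    magic, _, _, _, ncmds, _, _, _ = HEADER.unpack_from(mem, 0)
    if magic != MH_MAGIC_64:
        raise ValueError("not a 64-bit Mach-O image")

    # Steps 1 and 2: walk the load commands, noting __LINKEDIT and LC_SYMTAB.
    linkedit = symtab = None
    off = HEADER.size
    for _ in range(ncmds):
        cmd, cmdsize = LOAD_CMD.unpack_from(mem, off)
        if cmd == LC_SEGMENT_64:
            seg = SEGMENT.unpack_from(mem, off)
            if seg[2].rstrip(b"\0") == b"__LINKEDIT":
                linkedit = seg[3]  # vmaddr of the segment
        elif cmd == LC_SYMTAB:
            symtab = SYMTAB.unpack_from(mem, off)
        off += cmdsize
    if linkedit is None or symtab is None:
        raise ValueError("missing __LINKEDIT or LC_SYMTAB")

    # Step 3: the string table lives (stroff - symoff) bytes into __LINKEDIT,
    # assuming the symbol table starts at the beginning of the segment.
    _, _, symoff, nsyms, stroff, _ = symtab
    strtab = linkedit + (stroff - symoff)

    # Steps 4 and 5: compare each entry's name, return the matching address.
    wanted = name.encode()
    for i in range(nsyms):
        entry_off = linkedit - base + i * NLIST.size
        n_strx, _, _, _, n_value = NLIST.unpack_from(mem, entry_off)
        start = strtab - base + n_strx
        end = mem.index(b"\0", start)
        if mem[start:end] == wanted:
            return n_value
    return None
```

Inside a kext the same walk would be done with pointer arithmetic on the in-memory structs rather than on a byte buffer, but the order of operations is the same.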
Parse the load commands
One easy way to look at the symbol table would be to read the kernel file on disk at /mach_kernel, but we can do better than that if we’re already in the kernel - the kernel image is loaded into memory at a known address. If we have a look at the load commands for the kernel binary:
$ otool -l /mach_kernel
/mach_kernel:
Load command 0
cmd LC_SEGMENT_64
cmdsize 472
segname __TEXT
vmaddr 0xffffff8000200000
vmsize 0x000000000052f000
fileoff 0
filesize 5435392
maxprot 0x00000007
initprot 0x00000005
nsects 5
flags 0x0
<snip>
We can see that the vmaddr field of the first segment is 0xffffff8000200000. If we fire up GDB and point it at a VM running Mac OS X (as per my previous posts here and here), we can see the start of the Mach-O header in memory at this address:
gdb$ x/xw 0xffffff8000200000
0xffffff8000200000: 0xfeedfacf
0xfeedfacf is the magic number denoting a 64-bit Mach-O image (the 32-bit version is 0xfeedface). We can actually display this as a struct if we’re using the DEBUG kernel with all the DWARF info:
gdb$ print *(struct mach_header_64 *)0xffffff8000200000
$1 = {
magic = 0xfeedfacf,
cputype = 0x1000007,
cpusubtype = 0x3,
filetype = 0x2,
ncmds = 0x12,
sizeofcmds = 0x1010,
flags = 0x1,
reserved = 0x0
}
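For readers without a DEBUG kernel handy, the same header can be decoded by hand. A minimal sketch, assuming a little-endian x86_64 image; `parse_header` is a hypothetical helper name, and the bytes are rebuilt from the values GDB printed above rather than read from a real kernel:

```python
import struct

MH_MAGIC = 0xFEEDFACE      # 32-bit Mach-O
MH_MAGIC_64 = 0xFEEDFACF   # 64-bit Mach-O

# struct mach_header_64 is eight 32-bit fields: magic, cputype, cpusubtype,
# filetype, ncmds, sizeofcmds, flags, reserved.
HEADER_64 = struct.Struct("<IiiIIIII")
FIELDS = ("magic", "cputype", "cpusubtype", "filetype",
          "ncmds", "sizeofcmds", "flags", "reserved")


def parse_header(raw):
    """Decode a mach_header_64 from `raw`, rejecting anything else."""
    values = HEADER_64.unpack_from(raw, 0)
    if values[0] != MH_MAGIC_64:
        raise ValueError("expected a 64-bit Mach-O header")
    return dict(zip(FIELDS, values))


# Rebuild the header bytes from the GDB dump above, then decode them again.
raw = HEADER_64.pack(0xFEEDFACF, 0x1000007, 0x3, 0x2, 0x12, 0x1010, 0x1, 0x0)
header = parse_header(raw)
print(header["ncmds"], HEADER_64.size)  # prints: 18 32
```

The header is 32 bytes long, so the first load command starts 32 bytes past the start of the image.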
The mach_header and mach_header_64 structs (along with the other Mach-O-related structs mentioned in this post) are documented in the Mach-O File Format Reference, but we aren’t particularly interested in the header at the moment. I recommend having a look at the kernel image with MachOView to get the gist of where everything is and how it’s laid out.
Directly following the Mach-O header is the first load command:
gdb$ set $mh=(struct mach_header_64 *)0xffffff8000200000
gdb$ print *(struct load_command*)((void *)$mh + sizeof(struct mach_header_64))
$6 = {
cmd = 0x19,
cmdsize = 0x1d8
}
This is the load command for the first __TEXT segment we saw with otool. We can cast it as a segment_command_64 in GDB and have a look:
gdb$ set $lc=((void *)$mh + sizeof(struct mach_header_64))
gdb$ print *(struct segment_command_64 *)$lc
$7 = {
cmd = 0x19,
cmdsize = 0x1d8,
segname = "__TEXT\000\000\000\000\000\000\000\000\000",
vmaddr = 0xffffff8000200000,
vmsize = 0x8c8000,
fileoff = 0x0,
filesize = 0x8c8000,
maxprot = 0x7,
initprot = 0x5,
nsects = 0x5,
flags = 0x0
}
This isn’t the load command we are looking for, so we have to iterate through all of them until we come across a segment with cmd of 0x19 (LC_SEGMENT_64) and segname of __LINKEDIT. In the debug kernel, this happens to be located at 0xffffff8000200e68:
gdb$ set $lc=0xffffff8000200e68
gdb$ print *(struct load_command*)$lc
$14 = {
cmd = 0x19,
cmdsize = 0x48
}
gdb$ print *(struct segment_command_64*)$lc
$16 = {
cmd = 0x19,
cmdsize = 0x48,
segname = "__LINKEDIT\000\000\000\000\000",
vmaddr = 0xffffff8000d08000,
vmsize = 0x109468,
fileoff = 0xaf4698,
filesize = 0x109468,
maxprot = 0x7,
initprot = 0x1,
nsects = 0x0,
flags = 0x0
}
Then we grab the vmaddr field from the load command, which specifies the address at which the __LINKEDIT segment’s data will be located:
gdb$ set $linkedit=((struct segment_command_64*)$lc)->vmaddr
gdb$ print $linkedit
$19 = 0xffffff8000d08000
gdb$ print *(struct nlist_64 *)$linkedit
$20 = {
n_un = {
n_strx = 0x68a29
},
n_type = 0xe,
n_sect = 0x1,
n_desc = 0x0,
n_value = 0xffffff800020a870
}
And there’s the first struct nlist_64.
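The cmdsize values in the dumps above are what make it possible to step from one load command to the next. A small sketch of that arithmetic, using struct layouts from the Mach-O headers; `segment_cmdsize` and `next_command` are hypothetical names for this example:

```python
import struct

HEADER_64_SIZE = 32                              # sizeof(struct mach_header_64)
SEGMENT_64 = struct.Struct("<II16sQQQQiiII")     # struct segment_command_64
SECTION_64 = struct.Struct("<16s16sQQIIIIIIII")  # struct section_64


def segment_cmdsize(nsects):
    """A segment command is followed directly by its section headers."""
    return SEGMENT_64.size + nsects * SECTION_64.size


def next_command(addr, cmdsize):
    """Each load command starts cmdsize bytes after the previous one."""
    return addr + cmdsize


base = 0xffffff8000200000
print(hex(base + HEADER_64_SIZE))   # address of the first load command
print(hex(segment_cmdsize(5)))      # __TEXT has 5 sections: 0x1d8
print(hex(segment_cmdsize(0)))      # __LINKEDIT has none: 0x48
print(hex(next_command(0xffffff8000200e68, 0x48)))  # 0xffffff8000200eb0
```

The last line steps over the __LINKEDIT command at 0xffffff8000200e68 and lands on the command that immediately follows it in this debug kernel.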
As for the LC_SYMTAB load command, we just need to iterate through the load commands until we find one with the cmd field value of 0x02 (LC_SYMTAB). In this case, it’s located at 0xffffff8000200eb0:
gdb$ set $symtab=*(struct symtab_command*)0xffffff8000200eb0
gdb$ print $symtab
$23 = {
cmd = 0x2,
cmdsize = 0x18,
symoff = 0xaf4698,
nsyms = 0x699d,
stroff = 0xb5e068,
strsize = 0x9fa98
}
The useful parts here are the symoff field, which specifies the offset in the file to the symbol table (start of the __LINKEDIT segment), and the stroff field, which specifies the offset in the file to the string table (somewhere in the middle of the __LINKEDIT segment). Why, you ask, did we need to find the __LINKEDIT segment as well, since we have the offset here in the LC_SYMTAB command? If we were looking at the file on disk we wouldn’t have needed to, but as the kernel image we’re inspecting has already been loaded into memory, the binary segments have been loaded at the virtual memory addresses specified in their load commands. This means that the symoff and stroff fields, which are file offsets, can’t be used directly as offsets into memory any more. However, they’re still useful, as the difference between the two tells us the offset into the __LINKEDIT segment at which the string table exists:
gdb$ print $linkedit
$24 = 0xffffff8000d08000
gdb$ print $linkedit + ($symtab->stroff - $symtab->symoff)
$25 = 0xffffff8000d719d0
gdb$ set $strtab=$linkedit + ($symtab->stroff - $symtab->symoff)
gdb$ x/16s $strtab
0xffffff8000d719d0: ""
0xffffff8000d719d1: ""
0xffffff8000d719d2: ""
0xffffff8000d719d3: ""
0xffffff8000d719d4: ".constructors_used"
0xffffff8000d719e7: ".destructors_used"
0xffffff8000d719f9: "_AddFileExtent"
0xffffff8000d71a08: "_AllocateNode"
0xffffff8000d71a16: "_Assert"
0xffffff8000d71a1e: "_BF_decrypt"
0xffffff8000d71a2a: "_BF_encrypt"
0xffffff8000d71a36: "_BF_set_key"
0xffffff8000d71a42: "_BTClosePath"
0xffffff8000d71a4f: "_BTDeleteRecord"
0xffffff8000d71a5f: "_BTFlushPath"
0xffffff8000d71a6c: "_BTGetInformation"
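The arithmetic in that GDB session can be checked by hand. A small sketch using the values from the dumps above; `file_offset_to_vmaddr` is a hypothetical helper name, and it uses the segment's own fileoff so that it does not depend on the symbol table sitting at the very start of __LINKEDIT (here fileoff and symoff are the same value, 0xaf4698):

```python
NLIST_64_SIZE = 16  # sizeof(struct nlist_64)


def file_offset_to_vmaddr(offset, seg_vmaddr, seg_fileoff):
    """Translate a file offset inside a segment to its loaded address."""
    return seg_vmaddr + (offset - seg_fileoff)


linkedit_vmaddr = 0xffffff8000d08000   # __LINKEDIT vmaddr
linkedit_fileoff = 0xaf4698            # __LINKEDIT fileoff
symoff, stroff, nsyms = 0xaf4698, 0xb5e068, 0x699d

symtab = file_offset_to_vmaddr(symoff, linkedit_vmaddr, linkedit_fileoff)
strtab = file_offset_to_vmaddr(stroff, linkedit_vmaddr, linkedit_fileoff)
print(hex(symtab))   # 0xffffff8000d08000
print(hex(strtab))   # 0xffffff8000d719d0

# The string table starts right after the last nlist_64 entry.
print(hex(symtab + nsyms * NLIST_64_SIZE))  # 0xffffff8000d719d0
```

The last line is a handy sanity check: nsyms entries of 16 bytes each end exactly where the string table begins.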
Actually finding some symbols
Now that we know where the symbol table and string table live, we can get on to the srs bznz. So, let’s find that damn _allproc symbol we need. Have a look at that first struct nlist_64 again:
gdb$ print *(struct nlist_64 *)$linkedit
$28 = {
n_un = {
n_strx = 0x68a29
},
n_type = 0xe,
n_sect = 0x1,
n_desc = 0x0,
n_value = 0xffffff800020a870
}
The n_un.n_strx field there specifies the offset into the string table at which the string corresponding to this symbol exists. If we add that offset to the address at which the string table starts, we’ll see the symbol name:
gdb$ x/s $strtab + ((struct nlist_64 *)$linkedit)->n_un.n_strx
0xffffff8000dda3f9: "_ps_vnode_trim_init"
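The same lookup can be sketched outside GDB. This assumes the little-endian struct nlist_64 layout (a 4-byte n_strx, two 1-byte fields, a 2-byte n_desc and an 8-byte n_value); the string table and symbol entry below are made up for the example, and `symbol_name` is a hypothetical helper:

```python
import struct

NLIST_64 = struct.Struct("<IBBHQ")  # n_strx, n_type, n_sect, n_desc, n_value


def symbol_name(strtab, n_strx):
    """Read the NUL-terminated name that starts n_strx bytes into strtab."""
    end = strtab.index(b"\0", n_strx)
    return strtab[n_strx:end].decode("ascii")


# A tiny made-up string table; "_allproc" starts at offset 23.
strtab = b"\0\0\0\0.constructors_used\0_allproc\0"
raw = NLIST_64.pack(23, 0xf, 0xb, 0, 0xffffff8000cb5ca0)

n_strx, n_type, n_sect, n_desc, n_value = NLIST_64.unpack(raw)
print(symbol_name(strtab, n_strx), hex(n_value))

# The address GDB examined above is just strtab + n_strx.
print(hex(0xffffff8000d719d0 + 0x68a29))  # 0xffffff8000dda3f9
```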
Now all we need to do is iterate through all the struct nlist_64’s until we find the one with the matching name. In this case it’s at 0xffffff8000d482a0:
gdb$ set $nlist=0xffffff8000d482a0
gdb$ print *(struct nlist_64*)$nlist
$31 = {
n_un = {
n_strx = 0x35a07
},
n_type = 0xf,
n_sect = 0xb,
n_desc = 0x0,
n_value = 0xffffff8000cb5ca0
}
gdb$ x/s $strtab + ((struct nlist_64 *)$nlist)->n_un.n_strx
0xffffff8000da73d7: "_allproc"
The n_value field there (0xffffff8000cb5ca0) is the virtual memory address at which the symbol’s data/code exists. _allproc is not a great example as it’s a piece of data, rather than a function, so let’s try it with a function:
gdb$ set $nlist=0xffffff8000d618f0
gdb$ print *(struct nlist_64*)$nlist
$32 = {
n_un = {
n_strx = 0x52ed3
},
n_type = 0xf,
n_sect = 0x1,
n_desc = 0x0,
n_value = 0xffffff80007cceb0
}
gdb$ x/s $strtab + ((struct nlist_64 *)$nlist)->n_un.n_strx
0xffffff8000dc48a3: "_proc_lock"
If we disassemble a few instructions at that address:
gdb$ x/12i 0xffffff80007cceb0
0xffffff80007cceb0 <proc_lock>: push rbp
0xffffff80007cceb1 <proc_lock+1>: mov rbp,rsp
0xffffff80007cceb4 <proc_lock+4>: sub rsp,0x10
0xffffff80007cceb8 <proc_lock+8>: mov QWORD PTR [rbp-0x8],rdi
0xffffff80007ccebc <proc_lock+12>: mov rax,QWORD PTR [rbp-0x8]
0xffffff80007ccec0 <proc_lock+16>: mov rcx,0x50
0xffffff80007cceca <proc_lock+26>: add rax,rcx
0xffffff80007ccecd <proc_lock+29>: mov rdi,rax
0xffffff80007cced0 <proc_lock+32>: call 0xffffff800035d270 <lck_mtx_lock>
0xffffff80007cced5 <proc_lock+37>: add rsp,0x10
0xffffff80007cced9 <proc_lock+41>: pop rbp
0xffffff80007cceda <proc_lock+42>: ret
We can see that GDB has resolved the symbol for us, and we’re right on the money.
Sample code
I’ve posted an example kernel extension on github to check out. When we load it with kextload KernelResolver.kext, we should see something like this on the console:
25/02/12 8:06:49.000 PM kernel: [+] _allproc @ 0xffffff8000cb5ca0
25/02/12 8:06:49.000 PM kernel: [+] _proc_lock @ 0xffffff80007cceb0
25/02/12 8:06:49.000 PM kernel: [+] _kauth_cred_setuidgid @ 0xffffff80007abbb0
25/02/12 8:06:49.000 PM kernel: [+] __ZN6OSKext13loadFromMkextEjPcjPS0_Pj @ 0xffffff80008f8606
Update: It was brought to my attention that I was using a debug kernel in these examples. Just to be clear - the method described in this post, as well as the sample code, works on a non-debug, default install >=10.7.0 (xnu-1699.22.73) kernel as well, but the GDB inspection probably won’t (unless you load up the struct definitions etc, as they are all stored in the DEBUG kernel). The debug kernel contains every symbol from the source, whereas many symbols are stripped from the distribution kernel (e.g. sLoadedKexts). Previously (before 10.7), the kernel would write out the symbol table to a file on disk and jettison it from memory altogether. I suppose when kernel extensions were loaded, kextd or kextload would resolve symbols from within that on-disk symbol table or from the on-disk kernel image. These days the symbol table memory is just marked as pageable, so it can potentially get paged out if the system is short of memory.
I hope somebody finds this useful. Shoot me an email or get at me on twitter if you have any questions. I’ll probably sort out comments for this blog at some point, but I cbf at the moment.