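Exploring without a detailed walkthrough is a painful process. This article does not discuss the engine's internals or source code; it only explains how to use the engine. To install it: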
pip install capstone
Here is the documentation I worked from (in Chinese):
https://github.com/kabeor/Capstone-Engine-Documentation/blob/master/Capstone-Engine%20Documentation.md
My exploration is based on 32-bit x86 assembly, so I grabbed an arbitrary snippet of disassembly:
0040106C 60 pushad
0040106D BB 0050B0BC mov ebx, 0xBCB05000
00401072 53 push ebx
00401073 68 AD0B0000 push 0xBAD
00401078 C3 retn
00401079 B9 9C000000 mov ecx, 0x9C
0040107E 0BC9 or ecx, ecx
00401080 74 4D je short 004010CF
00401082 833D 1B014B00 0>cmp dword ptr [0x4B011B], 0x0
00401089 61 popad
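If the relationship between the byte column and the operands in this listing is not obvious, the arithmetic can be checked by hand. The sketch below uses plain Python (no Capstone) and assumes the usual x86 conventions: immediates are stored little-endian, and a short jump's displacement is relative to the address of the next instruction. The variable names are mine, purely for illustration.

```python
# "mov ebx, imm32" from the listing: opcode BB followed by a 4-byte immediate.
mov_ebx = bytes.fromhex("BB 00 50 B0 BC")
imm32 = int.from_bytes(mov_ebx[1:], "little")
print(hex(imm32))  # 0xbcb05000

# "je short" at 0x00401080 is two bytes: opcode 74 plus a signed 8-bit displacement.
je_address = 0x00401080
je_bytes = bytes.fromhex("74 4D")
rel8 = int.from_bytes(je_bytes[1:], "little", signed=True)
target = je_address + len(je_bytes) + rel8
print(hex(target))  # 0x4010cf
```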
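Initialization in Python differs a little from the C version shown in the documentation: you only need to choose the architecture and the mode (which carries the bit width), which is very convenient. The corresponding C enums are: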
typedef enum cs_arch {
CS_ARCH_ARM = 0,
CS_ARCH_ARM64,
CS_ARCH_MIPS,
CS_ARCH_X86,
CS_ARCH_PPC,
CS_ARCH_SPARC,
CS_ARCH_SYSZ,
CS_ARCH_XCORE,
CS_ARCH_M68K,
CS_ARCH_TMS320C64X,
CS_ARCH_M680X,
CS_ARCH_EVM,
CS_ARCH_MAX,
CS_ARCH_ALL = 0xFFFF,
} cs_arch;
typedef enum cs_mode {
CS_MODE_LITTLE_ENDIAN = 0,
CS_MODE_ARM = 0,
CS_MODE_16 = 1 << 1,
CS_MODE_32 = 1 << 2,
CS_MODE_64 = 1 << 3,
CS_MODE_THUMB = 1 << 4,
CS_MODE_MCLASS = 1 << 5,
CS_MODE_V8 = 1 << 6,
CS_MODE_MICRO = 1 << 4,
CS_MODE_MIPS3 = 1 << 5,
CS_MODE_MIPS32R6 = 1 << 6,
CS_MODE_MIPS2 = 1 << 7,
CS_MODE_V9 = 1 << 4,
CS_MODE_QPX = 1 << 4,
CS_MODE_SPE = 1 << 5,
CS_MODE_BOOKE = 1 << 6,
CS_MODE_M68K_000 = 1 << 1,
CS_MODE_M68K_010 = 1 << 2,
CS_MODE_M68K_020 = 1 << 3,
CS_MODE_M68K_030 = 1 << 4,
CS_MODE_M68K_040 = 1 << 5,
CS_MODE_M68K_060 = 1 << 6,
CS_MODE_BIG_ENDIAN = 1 << 31,
CS_MODE_MIPS32 = CS_MODE_32,
CS_MODE_MIPS64 = CS_MODE_64,
CS_MODE_M680X_6301 = 1 << 1,
CS_MODE_M680X_6309 = 1 << 2,
CS_MODE_M680X_6800 = 1 << 3,
CS_MODE_M680X_6801 = 1 << 4,
CS_MODE_M680X_6805 = 1 << 5,
CS_MODE_M680X_6808 = 1 << 6,
CS_MODE_M680X_6809 = 1 << 7,
CS_MODE_M680X_6811 = 1 << 8,
CS_MODE_M680X_CPU12 = 1 << 9,
CS_MODE_M680X_HCS08 = 1 << 10,
CS_MODE_BPF_CLASSIC = 0,
CS_MODE_BPF_EXTENDED = 1 << 0,
CS_MODE_RISCV32 = 1 << 0,
CS_MODE_RISCV64 = 1 << 1,
CS_MODE_RISCVC = 1 << 2,
CS_MODE_MOS65XX_6502 = 1 << 1,
CS_MODE_MOS65XX_65C02 = 1 << 2,
CS_MODE_MOS65XX_W65C02 = 1 << 3,
CS_MODE_MOS65XX_65816 = 1 << 4,
CS_MODE_MOS65XX_65816_LONG_M = (1 << 5),
CS_MODE_MOS65XX_65816_LONG_X = (1 << 6),
CS_MODE_MOS65XX_65816_LONG_MX = CS_MODE_MOS65XX_65816_LONG_M | CS_MODE_MOS65XX_65816_LONG_X,
} cs_mode;
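Here is my code. CODE just needs to be a byte sequence (bytes or bytearray):
from capstone import Cs, CS_ARCH_X86, CS_MODE_32
CODE = bytearray.fromhex("60 BB 00 50 B0 BC 53 68 AD 0B 00 00 C3 B9 9C 00 00 00 0B C9 74 4D 83 3D 1B 01 4B 00 00 61")
md = Cs(CS_ARCH_X86, CS_MODE_32)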
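The code runs, so let's keep exploring, starting with the simple things. The disassembly function disasm is defined in Python as follows, which is easier to pick up than the C version, isn't it?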
disasm(code, offset, count=0)
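code needs little explanation: it is the data to disassemble. offset is the address assigned to the first instruction, and you can set it to whatever you like. count is the number of instructions to disassemble; if you leave it out, everything is disassembled (try it yourself and it becomes obvious). The return value is an iterator, and the important fields of each element are listed below (I removed the less important ones; see the documentation for the full list):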
typedef struct cs_insn {
unsigned int id;
uint64_t address;
uint16_t size;
uint8_t bytes[24];
char mnemonic[CS_MNEMONIC_SIZE];
char op_str[160];
} cs_insn;
The complete code is below. With that, the disassembly prints successfully O(∩_∩)O~~
from capstone import Cs, CS_ARCH_X86, CS_MODE_32

CODE = bytearray.fromhex("60 BB 00 50 B0 BC 53 68 AD 0B 00 00 C3 B9 9C 00 00 00 0B C9 74 4D 83 3D 1B 01 4B 00 00 61")
md = Cs(CS_ARCH_X86, CS_MODE_32)
# Print address, mnemonic and operands for every decoded instruction.
for i in md.disasm(CODE, 0x0040106C):
    print("%x:\t%s\t%s" % (i.address, i.mnemonic, i.op_str))
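Naturally I was curious about the other functions too, so I tried disasm_lite, which the Chinese documentation does not clearly define. It takes essentially the same arguments as disasm, but each element it yields is a plain tuple. According to the comments in the source, that tuple contains only (address, size, mnemonic, op_str), so it is used a little differently from disasm. Because it returns less, it also runs faster.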
disasm_lite(code, offset, count=0)
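from capstone import Cs, CS_ARCH_X86, CS_MODE_32

CODE = bytearray.fromhex("60 BB 00 50 B0 BC 53 68 AD 0B 00 00 C3 B9 9C 00 00 00 0B C9 74 4D 83 3D 1B 01 4B 00 00 61")
md = Cs(CS_ARCH_X86, CS_MODE_32)
# Each element is an (address, size, mnemonic, op_str) tuple.
for i in md.disasm_lite(CODE, 0x0040106C):
    print(i)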
That covers the main functionality. The remaining features are probably not essential for basic use.