Stub & lazy bind
When I first learned the Mach-O format, I knew only a little about stubs. Now I have enough free time to study them, so I am recording what I found. I try to explain in detail how stubs work; I hope it is helpful.
Tools
- Xcode
- MachOView
- Hopper Disassembler
- dyld-551.3
Workflow
prepare
- Use Xcode to start a Command Line Tool project on macOS; here I just named it stubDebug for demonstration. Then replace the default NSLog(@"Hello world"); with printf("Hello, World!\n");, like below:
int main(int argc, const char * argv[]) {
    @autoreleasepool {
        // insert code here...
        printf("Hello, World!\n");
    }
    return 0;
}
- Compile the project to generate the executable stubDebug (Mach-O format), then drag it into both MachOView and Hopper Disassembler.
analysis
In Hopper Disassembler, let's start at the _main label. There we find the instruction below:
0000000100000f48 call imp___stubs__printf
This is the point where our source statement printf("Hello, World!\n"); executes.
Click the imp___stubs__printf label to jump to
0000000100000f6e jmp qword [_printf_ptr]
Then click the _printf_ptr label to jump to
0000000100001020 dq _printf
The _printf shown here is just an annotation, not an actual address. So we look up address 0x100001020 in MachOView; it is located in __DATA,__la_symbol_ptr:
100001020 0000000100000F98 Indirect Pointer [0x100001020 -> _printf]
Then go to address 0x100000F98; it is located in __TEXT,__stub_helper.
For convenience, we continue the analysis in Hopper Disassembler.
; Section __stub_helper
When we arrive at address 0x100000f98, we push 0x3f, push 0x100001008, then jump to dyld_stub_binder.
What do 0x3f and 0x100001008 mean? We will explain them later; for now, just treat them as two numbers.
Then we search for dyld_stub_binder in the dyld-551.3 source code. It is assembly code. We look at the __x86_64__ architecture.
Bad luck, it is long! Don't lose heart. We will only analyze the instruction below:
movq MH_PARAM_RBP(%rbp),%rdi # call fastBindLazySymbol(loadercache, lazyinfo)
Then we search for fastBindLazySymbol; it is C++ code. 0x3f and 0x100001008 are actually its two parameters (0x3f -> lazyBindingInfoOffset, 0x100001008 -> imageLoaderCache).
Now imageLoaderCache is 0x100001008, and *imageLoaderCache is the data stored at address 0x100001008. In MachOView we can see that the data at address 0x100001008 is 0x0000000.
So we arrive at dyld::findMappedRange, which does fast address->image lookups. Put simply, it finds the ImageLoader* whose mapped range contains address 0x100001008. That is certainly our stubDebug main executable.
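To illustrate what an address->image lookup does, here is a minimal sketch that scans a table of mapped ranges and returns the owning image. The struct mapped_range, the function find_image_for_address and the sample ranges are hypothetical and only mirror the idea; they are not dyld's actual data structures, and the range sizes are assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record: one mapped range and the image that owns it. */
struct mapped_range {
    uint64_t start;     /* first address of the range */
    uint64_t end;       /* one past the last address of the range */
    const char *image;  /* stands in for dyld's ImageLoader* */
};

/* Return the image whose range contains addr, or NULL if none does. */
const char *find_image_for_address(const struct mapped_range *ranges,
                                   size_t count, uint64_t addr) {
    assert(ranges != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        if (addr >= ranges[i].start && addr < ranges[i].end) {
            return ranges[i].image;
        }
    }
    return NULL;
}

/* Sample table for stubDebug; the 0x1000 sizes are assumptions. */
const struct mapped_range sample_ranges[] = {
    {0x100000000ULL, 0x100001000ULL, "stubDebug"}, /* __TEXT */
    {0x100001000ULL, 0x100002000ULL, "stubDebug"}, /* __DATA */
};
```

With this table, find_image_for_address(sample_ranges, 2, 0x100001008) returns the __DATA entry's image, which matches the conclusion above that the address belongs to the main executable.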
Then the doBindFastLazySymbol function is executed. Let's look at ImageLoaderMachOCompressed::doBindFastLazySymbol.
First, focus on the code below:
getLazyBindingInfo(lazyBindingInfoOffset, start, end, &segIndex, &segOffset, &libraryOrdinal, &symbolName, &doneAfterBind)
This code parses the Lazy Binding Info at offset 0x3f. Open MachOView again and find Dynamic Loader Info -> Lazy Binding Info. The Lazy Binding Info starts at 0x100002020; adding offset 0x3f gives address 0x10000205f,
10000205E 00 BIND_OPCODE_DONE
MachOView already explains the meaning of every field for us. So we go back to ImageLoaderMachOCompressed::doBindFastLazySymbol, knowing that:
segIndex = 2; // 0 is `__PAGEZERO`, 1 is `__TEXT`, 2 is `__DATA`
The general idea of this Lazy Binding Info entry is: go to dylib (3), find the address of symbol _printf, then write that address into segment (2) at offset (32).
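To make the decoding step concrete, here is a minimal sketch of how one lazy binding entry can be decoded. The opcode values are the ones defined in <mach-o/loader.h>. The bytes in sample_entry are an assumed reconstruction of the _printf entry (segment 2, offset 32, dylib ordinal 3), not a dump of the real file, and decode_lazy_bind is a hypothetical helper that is much simpler than dyld's getLazyBindingInfo.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Opcode values as defined in <mach-o/loader.h>. */
#define BIND_OPCODE_MASK                          0xF0
#define BIND_IMMEDIATE_MASK                       0x0F
#define BIND_OPCODE_DONE                          0x00
#define BIND_OPCODE_SET_DYLIB_ORDINAL_IMM         0x10
#define BIND_OPCODE_SET_SYMBOL_TRAILING_FLAGS_IMM 0x40
#define BIND_OPCODE_SET_SEGMENT_AND_OFFSET_ULEB   0x70
#define BIND_OPCODE_DO_BIND                       0x90

/* Hypothetical result record for one decoded entry. */
struct lazy_bind_info {
    uint8_t seg_index;
    uint64_t seg_offset;
    int library_ordinal;
    const char *symbol_name;
};

/* Read one unsigned LEB128 value at *p and advance *p past it. */
uint64_t read_uleb128(const uint8_t **p, const uint8_t *end) {
    uint64_t result = 0;
    int shift = 0;
    while (*p < end) {
        uint8_t byte = *(*p)++;
        if (shift < 64) result |= (uint64_t)(byte & 0x7F) << shift;
        shift += 7;
        if ((byte & 0x80) == 0) break;
    }
    return result;
}

/* Decode one entry. Returns 1 when DO_BIND is reached, 0 otherwise
   (DONE, an opcode this sketch does not handle, or end of data). */
int decode_lazy_bind(const uint8_t *p, const uint8_t *end,
                     struct lazy_bind_info *out) {
    assert(p != NULL && end != NULL && out != NULL);
    while (p < end) {
        uint8_t byte = *p++;
        uint8_t imm = byte & BIND_IMMEDIATE_MASK;
        switch (byte & BIND_OPCODE_MASK) {
        case BIND_OPCODE_SET_SEGMENT_AND_OFFSET_ULEB:
            out->seg_index = imm;
            out->seg_offset = read_uleb128(&p, end);
            break;
        case BIND_OPCODE_SET_DYLIB_ORDINAL_IMM:
            out->library_ordinal = imm;
            break;
        case BIND_OPCODE_SET_SYMBOL_TRAILING_FLAGS_IMM:
            out->symbol_name = (const char *)p;
            while (p < end && *p != 0) p++;  /* skip the name */
            if (p < end) p++;                /* skip its NUL terminator */
            break;
        case BIND_OPCODE_DO_BIND:
            return 1;
        default:
            return 0;
        }
    }
    return 0;
}

/* Assumed bytes for the _printf entry (illustrative only). */
const uint8_t sample_entry[] = {
    0x72, 0x20,                                   /* segment 2, offset 0x20 */
    0x13,                                         /* dylib ordinal 3 */
    0x40, '_', 'p', 'r', 'i', 'n', 't', 'f', 0x00, /* symbol name, flags 0 */
    0x90,                                         /* DO_BIND */
    0x00,                                         /* DONE */
};
```

Decoding sample_entry with decode_lazy_bind fills in seg_index = 2, seg_offset = 32, library_ordinal = 3 and symbol_name = "_printf", the same values that MachOView shows for this entry.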
Then focus on the code below:
uintptr_t address = segActualLoadAddress(segIndex) + segOffset;
segIndex = 2, which is the __DATA segment (0 is __PAGEZERO, 1 is __TEXT, 2 is __DATA).
Following the segActualLoadAddress function, look at LC_SEGMENT_64 (__DATA) in MachOView. The VM Address is 0x100001000; adding offset 0x20 (segOffset = 32) gives address = 0x100001020. This is the _printf placeholder. We actually saw it before: 0x100001020 used to point into __stub_helper, but now we will fill in the actual address of _printf.
100001020 0000000100000F98 Indirect Pointer [0x100001020 -> _printf]
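The address arithmetic can be checked with a tiny sketch. Here seg_actual_load_address is a hypothetical stand-in for dyld's segActualLoadAddress: it reads from a fixed table, assumes a slide of zero (no ASLR), and only the __DATA value comes from MachOView; the __TEXT value is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* VM address per segment index, slide assumed to be 0. */
static const uint64_t segment_vm_address[] = {
    0x0ULL,          /* 0: __PAGEZERO */
    0x100000000ULL,  /* 1: __TEXT (assumed) */
    0x100001000ULL,  /* 2: __DATA (from LC_SEGMENT_64 in MachOView) */
};

/* Hypothetical stand-in for segActualLoadAddress(segIndex). */
uint64_t seg_actual_load_address(unsigned seg_index) {
    assert(seg_index < sizeof segment_vm_address / sizeof segment_vm_address[0]);
    return segment_vm_address[seg_index];
}

/* The location that the bind will overwrite. */
uint64_t bind_location(unsigned seg_index, uint64_t seg_offset) {
    return seg_actual_load_address(seg_index) + seg_offset;
}
```

Calling bind_location(2, 32) gives 0x100001020, the __la_symbol_ptr slot for _printf.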
Then look at the bindAt function,
// resolve symbol
The resolve call stack is roughly as follows:
-resolve()
Let's focus on libImage((unsigned int)libraryOrdinal-1) first. libraryOrdinal is 3, as we found above. In this stubDebug project, libImage(3-1) means libSystem.B.dylib. Why? Look at the Load Commands part of stubDebug in MachOView. We can see
LC_LOAD_DYLIB(Foundation)
Then focus on findExportedSymbol. Because libSystem.B.dylib is a collection of libsystem_c.dylib, libsystem_kernel.dylib, … it looks up each of them recursively.
We know _printf is in libsystem_c.dylib, so assume we are in libsystem_c.dylib and execute findShallowExportedSymbol. This function looks up Dynamic Loader Info -> Export Info of libsystem_c.dylib.
Open libsystem_c.dylib in MachOView. The Export Info is a trie; we can find _printf by the logic below:
92972 5F00 Node Label '_'
Now we get the symbol offset 0x40EC4; this is actually the address of the symbol _printf within libsystem_c.dylib. We can confirm it in Hopper Disassembler:
_printf:
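The trie lookup can be sketched as below. The node layout (ULEB128 terminal size, flags and offset for terminal nodes, a child count, then NUL-terminated edge labels each followed by a ULEB128 child offset) is the export trie format used by dyld. The tiny trie in sample_trie is hand-built for illustration: only the _printf offset 0x40EC4 comes from the text above, the _puts entry and its offset are made up, and trie_lookup is a hypothetical helper rather than dyld's own code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read one unsigned LEB128 value at *p and advance *p past it. */
uint64_t read_uleb128(const uint8_t **p, const uint8_t *end) {
    uint64_t result = 0;
    int shift = 0;
    while (*p < end) {
        uint8_t byte = *(*p)++;
        if (shift < 64) result |= (uint64_t)(byte & 0x7F) << shift;
        shift += 7;
        if ((byte & 0x80) == 0) break;
    }
    return result;
}

/* Walk the trie; returns 1 and stores the exported offset if symbol is found. */
int trie_lookup(const uint8_t *trie, size_t size, const char *symbol,
                uint64_t *address) {
    assert(trie != NULL && symbol != NULL && address != NULL);
    const uint8_t *end = trie + size;
    const uint8_t *p = trie;
    const char *s = symbol;
    size_t visited = 0;                       /* guards against cyclic tries */
    while (p < end && visited++ < size) {
        uint64_t terminal_size = read_uleb128(&p, end);
        if (*s == '\0') {                     /* whole name consumed */
            if (terminal_size == 0) return 0; /* node is not an export */
            (void)read_uleb128(&p, end);      /* flags */
            *address = read_uleb128(&p, end); /* offset from image base */
            return 1;
        }
        if (terminal_size > (uint64_t)(end - p)) return 0;
        p += terminal_size;                   /* skip terminal payload */
        if (p >= end) return 0;
        uint8_t child_count = *p++;
        const uint8_t *next = NULL;
        for (uint8_t i = 0; i < child_count && next == NULL; i++) {
            const char *t = s;                /* compare edge label with s */
            int match = 1;
            while (p < end && *p != 0) {
                if (match && *t == (char)*p) t++; else match = 0;
                p++;
            }
            if (p >= end) return 0;
            p++;                              /* skip the label's NUL */
            uint64_t child_offset = read_uleb128(&p, end);
            if (match) {
                if (child_offset >= size) return 0;
                next = trie + child_offset;
                s = t;                        /* label consumed from the name */
            }
        }
        if (next == NULL) return 0;           /* no edge matches */
        p = next;
    }
    return 0;
}

/* Hand-built trie: "_" -> { "printf" -> 0x40EC4, "puts" -> 0x1000 (made up) }. */
const uint8_t sample_trie[] = {
    0x00, 0x01, '_', 0x00, 0x05,                   /* root: edge "_" -> node 5 */
    0x00, 0x02,                                    /* node 5: two children */
    'p', 'r', 'i', 'n', 't', 'f', 0x00, 0x15,      /* "printf" -> node 21 */
    'p', 'u', 't', 's', 0x00, 0x1B,                /* "puts" -> node 27 */
    0x04, 0x00, 0xC4, 0x9D, 0x10, 0x00,            /* node 21: offset 0x40EC4 */
    0x03, 0x00, 0x80, 0x20, 0x00,                  /* node 27: offset 0x1000 */
};
```

Looking up "_printf" in sample_trie follows the edge '_', then the edge "printf", and ends on a terminal node whose offset decodes to 0x40EC4. Shared prefixes such as the leading '_' are stored only once.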
Now we have found the address of symbol _printf, and we also know we should bind it at stubDebug's address 0x100001020 in __DATA,__la_symbol_ptr, replacing the __TEXT,__stub_helper address stored there with the address of _printf. The lazy bind is finished.
Finally, we go back to dyld_stub_binder
Lbind:
We jump to _printf to finish the statement printf("Hello, World!\n"). The next time _printf is called, the call goes to it directly rather than through __stub_helper.
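The whole round trip can be imitated in plain C with a function pointer. This is only an analogy under stated assumptions: la_symbol_ptr stands in for the __la_symbol_ptr slot, print_stub for imp___stubs__printf, stub_helper for __stub_helper plus dyld_stub_binder, and real_print for the real _printf. All of these names are hypothetical, and the "binding" is a simple assignment instead of a symbol lookup.

```c
#include <assert.h>
#include <stdio.h>

typedef int (*print_fn)(const char *);

int bind_count = 0;  /* how many times the slow (binding) path ran */

/* Stands in for the real _printf in libsystem_c.dylib. */
int real_print(const char *s) {
    assert(s != NULL);
    return fputs(s, stdout);
}

int stub_helper(const char *s);  /* defined below */

/* Stands in for the __la_symbol_ptr slot; it starts out pointing at the helper. */
print_fn la_symbol_ptr = stub_helper;

/* Stands in for __stub_helper + dyld_stub_binder: bind, then jump to the target. */
int stub_helper(const char *s) {
    bind_count++;
    la_symbol_ptr = real_print;  /* overwrite the slot with the real address */
    return la_symbol_ptr(s);
}

/* Stands in for imp___stubs__printf: always jumps through the slot. */
int print_stub(const char *s) {
    return la_symbol_ptr(s);
}
```

The first print_stub("Hello, World!\n") call goes through stub_helper and rewrites the slot; every later call jumps straight to real_print, so bind_count stays at 1.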
Summary
- What is generally called a symbol is a const char * structure.
- Exported symbol info is stored as a trie; a trie can reduce memory use.
- The lazy bind process is very much like a cache mechanism.