Relearning dyld - 鲤鱼池

0x01 Preface

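This article analyzes the arm64 architecture. If anything here is incorrect, feel free to contact me for corrections. 😝 The writing is not polished, the descriptions may not be direct enough, and some details may be omitted, so please bear with it.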

0x02 launchd

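On OS X and iOS, the user environment begins with launchd, the counterpart of init on other UN*X systems. As the first user-mode process in the system, launchd is started directly by the kernel and is responsible for starting, directly or indirectly, every other process. Its core duty is to load other applications or jobs on a predefined schedule or on demand, and it also manages two kinds of background jobs: daemons and agents.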

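Daemons: background services that usually do not interact with the user. The system starts them automatically, regardless of whether a user is logged in. Examples include push notifications, external device handling, and XPC.
Agents: a special kind of daemon that is started only when a user logs in and that can interact with the user. The Finder on the Mac and SpringBoard on iOS are examples, i.e. what we broadly think of as the desktop.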

To see how launchd is created, we need to look at the XNU boot process. Its high-level boot flow is as follows:

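start (iOS): handles low-level processor setup. It configures the ARM control registers, installs the kernel trap handlers, performs some other setup, and then enters arm_init.
arm_init (iOS): initializes the platform in preparation for booting the kernel.
kernel_bootstrap: sets up and initializes the core subsystems of the Mach kernel, for example IPC (inter-process communication is the foundation Mach is built on), clocks, tasks, and threads.
machine_startup: parses the boot arguments, mainly for debugging.
kernel_bootstrap_thread: the main thread starts running as this thread. It first creates the idle thread, then initializes IOKit (XNU's device driver framework), enables interrupts, initializes the shared region module, and initializes the commpage (a page mapped directly from the kernel into every process, containing exported data and some functions). If Mandatory Access Control is enabled, it also starts MAC initialization, which is critical to system security.
bsd_init: initializes the BSD subsystems: threads, processes, file systems, pipes, the shared-memory hash tables, and so on. Near the end of this function it calls bsd_utaskbootstrap(), whose main job is to indirectly start PID 1. bsd_utaskbootstrap() first creates a new Mach task. To actually get that task running, it calls act_set_astbsd() on the newly created thread; act_set_astbsd() eventually leads to bsdinit_task(), and bsdinit_task() finally calls load_init_program(), which turns the process with PID 1 into the well-known launchd.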

Here is the initialization path in the source, showing how launchd gets started:

void bsd_init(void) {
	...
    bsd_utaskbootstrap();
    ...
}
void bsd_utaskbootstrap()
{
	thread_t th_act;
	struct uthread *ut;
    /// Create a new Mach task and a new thread.
	th_act = cloneproc(kernproc, 0);
	initproc = pfind(1);				
	/* Set the launch time for init */
	microtime(&initproc->p_stats->p_start);
    
	ut = (struct uthread *)get_bsdthread_info(th_act);
	ut->uu_sigmask = 0;
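    /// To actually bring the new task to life, this call posts an asynchronous system trap (AST). When the AST is handled, Mach's AST handler treats this as a special case and calls bsd_ast(), which in turn calls bsdinit_task()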
	act_set_astbsd(th_act);
	(void) thread_resume(th_act);
}
void
bsdinit_task(void)
{
	struct proc *p = current_proc();
	struct uthread *ut;
	kern_return_t	kr;
	thread_t th_act;
	shared_region_mapping_t system_region;
    /// Set the name of the initial process to "init"
	process_name("init", p);
    /// Create a separate kernel thread, ux_handler, which handles UNIX exceptions by receiving messages on the global ux_exception_port
	ux_handler_init();

	th_act = current_thread();
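    /// Register the exception ports for the init thread: the global port is registered as its own port, which guarantees that UNIX exceptions from init, and from every UNIX process, get handled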
	(void) host_set_exception_ports(host_priv_self(),
					EXC_MASK_ALL & ~(EXC_MASK_SYSCALL |
							 EXC_MASK_MACH_SYSCALL |
							 EXC_MASK_RPC_ALERT),
					ux_exception_port,
					EXCEPTION_DEFAULT, 0);

	(void) task_set_exception_ports(get_threadtask(th_act),
					EXC_MASK_ALL & ~(EXC_MASK_SYSCALL |
							 EXC_MASK_MACH_SYSCALL |
							 EXC_MASK_RPC_ALERT),
					ux_exception_port,
					EXCEPTION_DEFAULT, 0);




	ut = (uthread_t)get_bsdthread_info(th_act);
	ut->uu_ar0 = (void *)get_user_regs(th_act);

	bsd_hardclockinit = 1;	/* Start bsd hardclock */
	bsd_init_task = get_threadtask(th_act);
	init_task_failure_data[0] = 0;
	system_region = lookup_default_shared_region(ENV_DEFAULT_ROOT, cpu_type());
        if (system_region == NULL) {
		shared_file_boot_time_init(ENV_DEFAULT_ROOT, cpu_type());
	} else {
		vm_set_shared_region(get_threadtask(th_act), system_region);
	}
    /// Load launchd
	load_init_program(p);
	/* turn on app-profiling i.e. pre-heating */
	app_profile = 1;
	lock_trace = 1;
}
static char		init_program_name[128] = "/sbin/launchd";
/*
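 Load the init program; in most cases the init program is launchd.
 Parameter p: the process in which execve() is called to run the init program.
 Note: the process passed in is the first process created in the system, and we get here the first time bsd_ast() fires. This is done to ensure that bsd_init() has run to completion.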
 */
void
load_init_program(struct proc *p)
{
	vm_offset_t	init_addr;
	char		*argv[3];
	int			error;
	register_t 	retval[2];

	error = 0;
    
    /// init_args is copied directly from the bootstrap loader as a string
	do {
		if (boothowto & RB_INITNAME) {
			printf("init program? ");
#if FIXME  /* [ */
			gets(init_program_name, init_program_name);
#endif  /* FIXME ] */
		}

        /// Copy the program name out to the user address space
        init_addr = VM_MIN_ADDRESS;
        (void) vm_allocate(current_map(), &init_addr,
                   PAGE_SIZE, VM_FLAGS_ANYWHERE);
        if (init_addr == 0)
            init_addr++;

        (void) copyout((caddr_t) init_program_name,
                CAST_USER_ADDR_T(init_addr),
                (unsigned) sizeof(init_program_name)+1);

        argv[0] = (char *) init_addr;
        init_addr += sizeof(init_program_name);
        init_addr = (vm_offset_t)ROUND_PTR(char, init_addr);

        /// Similarly, copy out the first (and only) argument
        /// This assumes everything copied out fits in the single page allocated above
        (void) copyout((caddr_t) init_args,
                CAST_USER_ADDR_T(init_addr),
                (unsigned) sizeof(init_args));

        argv[1] = (char *) init_addr;
        init_addr += sizeof(init_args);
        init_addr = (vm_offset_t)ROUND_PTR(char, init_addr);
        /// NULL-terminate the argument list
        argv[2] = (char *) 0;
        /// Copy out the argument list
        (void) copyout((caddr_t) argv,
                CAST_USER_ADDR_T(init_addr),
                (unsigned) sizeof(argv));
        /// Set up the argument block used when calling execve
        init_exec_args.fname = CAST_USER_ADDR_T(argv[0]);
        init_exec_args.argp = CAST_USER_ADDR_T((char **)init_addr);
        init_exec_args.envp = CAST_USER_ADDR_T(0);
        /// Give the mach_init task a security token with uid and gid 0
        set_security_token(p);
		/// Launch
		error = execve(p,&init_exec_args,retval);
	} while (error);
}

0x03 Mach-O format

See Mach-O.

0x04 Address Space Layout Randomization (ASLR)

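ASLR randomizes where a process's key data regions sit in the address space, so that an attacker cannot reliably jump to a specific memory location to reuse existing functions. Modern operating systems generally include this mechanism to defend against Return-to-libc attacks that target known addresses. By laying out the address space randomly, ASLR places sensitive data (for example the operating system kernel) at addresses a malicious program cannot know in advance, which makes attacks much harder. The address space is randomized, i.e. shifted, every time a process starts. This is implemented by having the kernel slide the Mach-O segments by a random amount. We will run into this technique again while reading the code below.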
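The idea of sliding segments by a random amount can be sketched as follows. This is a minimal illustration, not the kernel's code: `pick_aslr_slide` and `slid_address` are hypothetical helper names, and `max_slide_pages` and `page_shift` are assumed inputs (the kernel derives comparable values from the target VM map).

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not an XNU function): turn a random number into a
 * page-aligned slide. max_slide_pages and page_shift are assumed inputs.
 */
static uint64_t pick_aslr_slide(uint64_t random_value,
                                uint64_t max_slide_pages,
                                unsigned page_shift)
{
    assert(max_slide_pages > 0);
    /* Pick a page index in [0, max_slide_pages), then scale it to bytes. */
    return (random_value % max_slide_pages) << page_shift;
}

/* Apply the slide to a link-time (unslid) virtual address. */
static uint64_t slid_address(uint64_t link_time_addr, uint64_t slide)
{
    return link_time_addr + slide;
}
```

Because the slide is always a whole number of pages, segment alignment is preserved, and every address in the image moves by the same amount.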

0x05 How dyld gets loaded

1. execve()

Picking up from the execve() call in 0x02, here is a simplified walkthrough of the source:

/*
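 p: the current process
 uap: the 3 arguments passed in from load_init_program.
    uap->fname: file name
    uap->argp: argument list
    uap->envp: environment variables
 retval: return value for the caller; the function itself returns 0 on success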
*/
int execve {
	struct __mac_execve_args muap;
	int err;

	memoryshot(VM_EXECVE, DBG_FUNC_NONE);

	muap.fname = uap->fname;
	muap.argp = uap->argp;
	muap.envp = uap->envp;
	muap.mac_p = USER_ADDR_NULL;
	err = __mac_execve(p, &muap, retval);
	return err;
}

1.1 mac_execve

A simplified walkthrough of the mac_execve source:

int __mac_execve(proc_t p, struct __mac_execve_args *uap, int32_t *retval)
{
    ...
	
    /// Allocate one big chunk for the locals instead of using the stack, because these structures are large.
	MALLOC(bufp, char *, (sizeof(*imgp) + sizeof(*vap) + sizeof(*origvap)), M_TEMP, M_WAITOK | M_ZERO);
	imgp = (struct image_params *) bufp;
    ..

	/// Initialize
	imgp->ip_user_fname = uap->fname;
	imgp->ip_user_argv = uap->argp;
	imgp->ip_user_envv = uap->envp;
	imgp->ip_vattr = vap;
	imgp->ip_origvattr = origvap;
	imgp->ip_vfs_context = &context;
	imgp->ip_flags = (is_64 ? IMGPF_WAS_64BIT_ADDR : IMGPF_NONE) | ((p->p_flag & P_DISABLE_ASLR) ? IMGPF_DISABLE_ASLR : IMGPF_NONE);
	imgp->ip_seg = (is_64 ? UIO_USERSPACE64 : UIO_USERSPACE32);
	imgp->ip_mac_return = 0;
	imgp->ip_cs_error = OS_REASON_NULL;
	imgp->ip_simulator_binary = IMGPF_SB_DEFAULT;

    ...
    // Launching a program requires forking a new process, so the else branch is taken
	uthread = get_bsdthread_info(current_thread());
	if (uthread->uu_flag & UT_VFORK) {
		imgp->ip_flags |= IMGPF_VFORK_EXEC;
		in_vfexec = TRUE;
	} else {
		imgp->ip_flags |= IMGPF_EXEC;

		imgp->ip_new_thread = fork_create_child(old_task,
		    NULL,
		    p,
		    FALSE,
		    p->p_flag & P_LP64,
		    task_get_64bit_data(old_task),
		    TRUE);
		/* task and thread ref returned by fork_create_child */
		if (imgp->ip_new_thread == NULL) {
			error = ENOMEM;
			goto exit_with_error;
		}

		new_task = get_threadtask(imgp->ip_new_thread);
		context.vc_thread = imgp->ip_new_thread;
	}
    
     // Parse the program image
	error = exec_activate_image(imgp);

    // Set up the process's main thread
	if (!error) {
        ...

		thread_t main_thread = imgp->ip_new_thread;

		task_set_main_thread_qos(new_task, main_thread);
        ...
	}
    ...
	return error;
}

1.2 exec_activate_image

A simplified walkthrough of the exec_activate_image source:

static int
exec_activate_image(struct image_params *imgp)
{
    ...
    /// Allocate memory and perform permission checks
	error = execargs_alloc(imgp);
	if (error) {
		goto bad_notrans;
	}
    /// Save the first argument (the executable path) into the exec argument area
	error = exec_save_path(imgp, imgp->ip_user_fname, imgp->ip_seg, &excpath);
	if (error) {
		goto bad_notrans;
	}
    ...
    
    /// Look up the binary file with namei()
	error = namei(ndp);
	if (error) {
		goto bad_notrans;
	}
    ...
    /// Use the vn interface (a filesystem-independent abstraction) to read the file header, at most one page.
	error = vn_rdwr(UIO_READ, imgp->ip_vp, imgp->ip_vdata, PAGE_SIZE, 0,
	    UIO_SYSSPACE, IO_NODELOCKED,
	    vfs_context_ucred(imgp->ip_vfs_context),
	    &resid, vfs_context_proc(imgp->ip_vfs_context));
    ...
    
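    /// With the file header read, loop over the image activators to find which of the three types below matches:
	for (i = 0; error == -1 && execsw[i].ex_imgact != NULL; i++) {
        /// Call the candidate activator through its ex_imgact function pointer, passing imgp.
        /// For a thin Mach-O this is exec_mach_imgact, which is where the Mach-O file gets parsed.
        error = (*execsw[i].ex_imgact)(imgp);
        // There are three activators in total: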
        /*
            struct execsw {
            int (*ex_imgact)(struct image_params *);
            const char *ex_name;
            } execsw[] = {
            { exec_mach_imgact,        "Mach-O Binary" },
            { exec_fat_imgact,        "Fat Binary" },
            { exec_shell_imgact,        "Interpreter Script" },
            { NULL, NULL}
        };
        */
        switch (error) {
            /* error handling */
            ...
        }
	}
    ...
    return(error);
}

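1.3 Brief summary

The functions above mainly locate the executable, copy it into memory, and call a different parsing function depending on the executable's type. iOS supports three kinds of executables, each with its own parsing function:

Mach-O Binary (an ordinary single-architecture Mach-O binary): exec_mach_imgact
Fat Binary (a multi-architecture Mach-O fat binary): exec_fat_imgact
Interpreter Script (a script): exec_shell_imgact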
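As a rough illustration of how the first bytes of a file distinguish these three types, here is a minimal sketch. `classify_image` and the `SKETCH_*` names are hypothetical and are not part of XNU; only the numeric magic values come from `<mach-o/loader.h>` and `<mach-o/fat.h>`. The real activators perform many more checks than this.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Magic values as defined in <mach-o/loader.h> and <mach-o/fat.h>. */
#define SKETCH_MH_MAGIC     0xfeedfaceu
#define SKETCH_MH_MAGIC_64  0xfeedfacfu
#define SKETCH_FAT_MAGIC    0xcafebabeu  /* stored big-endian on disk */

enum image_kind { IMAGE_UNKNOWN, IMAGE_MACHO, IMAGE_FAT, IMAGE_SCRIPT };

/* Hypothetical helper: guess the image type from the first bytes of a file. */
static enum image_kind classify_image(const unsigned char *buf, size_t len)
{
    uint32_t host_order;
    uint32_t big_endian;

    /* Interpreter scripts start with "#!". */
    if (len >= 2 && buf[0] == '#' && buf[1] == '!')
        return IMAGE_SCRIPT;
    if (len < 4)
        return IMAGE_UNKNOWN;

    /* A thin Mach-O must match the magic in host byte order. */
    memcpy(&host_order, buf, sizeof(host_order));
    if (host_order == SKETCH_MH_MAGIC || host_order == SKETCH_MH_MAGIC_64)
        return IMAGE_MACHO;

    /* The fat header is always big-endian, so decode it byte by byte. */
    big_endian = ((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16) |
                 ((uint32_t)buf[2] << 8)  |  (uint32_t)buf[3];
    if (big_endian == SKETCH_FAT_MAGIC)
        return IMAGE_FAT;

    return IMAGE_UNKNOWN;
}
```

The kernel reaches the same decision differently: it simply tries each activator in execsw in turn, and each one returns -1 when the header is not its format.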

2. exec_mach_imgact

Below is a simplified walkthrough of the core function exec_mach_imgact:

/*
 * exec_mach_imgact
 *
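 * Image activator for Mach-O 1.0 binaries.
 *
 * Returns:
 *  -1: not a fat binary
 *  -2:	binary file
 *  >0: error
 *	EBADARCH: a Mach-O binary, but not one that can be executed
 *  ENOMEM			No memory for child process after -
 *					can only happen after vfork()
 *
 * Important:	This image activator is NOT byte order neutral.
 *
 * Note: A return value other than -1 indicates that subsequent image activators should not be given the opportunity to attempt to activate the image.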
 */
static int exec_mach_imgact(struct image_params *imgp) {
    /// Get the mach_header
	struct mach_header *mach_header = (struct mach_header *)imgp->ip_vdata;
    ...
    
    /// Check the magic in mach_header to see whether it matches a Mach-O executable
    /// Reversed-byte-order Mach-O binaries are recognized but not compatible.
	if ((mach_header->magic == MH_CIGAM) ||
	    (mach_header->magic == MH_CIGAM_64)) {
		error = EBADARCH;
		goto bad;
	}

	if ((mach_header->magic != MH_MAGIC) &&
	    (mach_header->magic != MH_MAGIC_64)) {
		error = -1;
		goto bad;
	}
    /// Check the Mach-O file type; it must be an executable
    // Some other common types:
    // #define    MH_OBJECT    0x1        object file produced during compilation
    // #define    MH_CORE        0x4        core dump written on a crash
	if (mach_header->filetype != MH_EXECUTE) {
		error = -1;
		goto bad;
	}
    /// Record the Mach-O's execution environment: CPU type and subtype
	if (imgp->ip_origcputype != 0) {
		/* Fat header previously had an idea about this thin file */
		if (imgp->ip_origcputype != mach_header->cputype ||
		    imgp->ip_origcpusubtype != mach_header->cpusubtype) {
			error = EBADARCH;
			goto bad;
		}
	} else {
		imgp->ip_origcputype = mach_header->cputype;
		imgp->ip_origcpusubtype = mach_header->cpusubtype;
    }
    ...
    
grade:
    /// Check that the Mach-O's CPU type can run on this machine
	if (!grade_binary(imgp->ip_origcputype, imgp->ip_origcpusubtype & ~CPU_SUBTYPE_MASK, TRUE)) {
		error = EBADARCH;
		goto bad;
	}
    ...

    /// Extract the environment variables and arguments in preparation for running the Mach-O
	error = exec_extract_strings(imgp);
	if (error) {
		goto bad;
	}
    ...
    
    /// In the vfork case, create a new thread for the Mach-O via fork_create_child
	if (vfexec) {
		imgp->ip_new_thread = fork_create_child(task,
		    NULL,
		    p,
		    FALSE,
		    (imgp->ip_flags & IMGPF_IS_64BIT_ADDR),
		    (imgp->ip_flags & IMGPF_IS_64BIT_DATA),
		    FALSE);
		/* task and thread ref returned, will be released in __mac_execve */
		if (imgp->ip_new_thread == NULL) {
			error = ENOMEM;
			goto bad;
		}
	}
    ...
    
	/// Load and map the Mach-O file into memory
	lret = load_machfile(imgp, mach_header, thread, &map, &load_result);
    ...
    
	// Set a bunch of flag bits
    // Worth noting: this part is related to code signing
	if (load_result.csflags & CS_VALID) {
		imgp->ip_csflags |= load_result.csflags &
		    (CS_VALID | CS_SIGNED | CS_DEV_CODE |
		    CS_HARD | CS_KILL | CS_RESTRICT | CS_ENFORCEMENT | CS_REQUIRE_LV |
		    CS_FORCED_LV | CS_ENTITLEMENTS_VALIDATED | CS_DYLD_PLATFORM | CS_RUNTIME |
		    CS_ENTITLEMENT_FLAGS |
		    CS_EXEC_SET_HARD | CS_EXEC_SET_KILL | CS_EXEC_SET_ENFORCEMENT);
	} else {
		imgp->ip_csflags &= ~CS_VALID;
	}

	if (p->p_csflags & CS_EXEC_SET_HARD) {
		imgp->ip_csflags |= CS_HARD;
	}
	if (p->p_csflags & CS_EXEC_SET_KILL) {
		imgp->ip_csflags |= CS_KILL;
	}
	if (p->p_csflags & CS_EXEC_SET_ENFORCEMENT) {
		imgp->ip_csflags |= CS_ENFORCEMENT;
	}
	if (p->p_csflags & CS_EXEC_INHERIT_SIP) {
		if (p->p_csflags & CS_INSTALLER) {
			imgp->ip_csflags |= CS_INSTALLER;
		}
		if (p->p_csflags & CS_DATAVAULT_CONTROLLER) {
			imgp->ip_csflags |= CS_DATAVAULT_CONTROLLER;
		}
		if (p->p_csflags & CS_NVRAM_UNRESTRICTED) {
			imgp->ip_csflags |= CS_NVRAM_UNRESTRICTED;
		}
	}

	/// Set up the system reserved areas in the new address space.
    /// All cpu_subtypes use the same shared region
	int cpu_subtype;
	cpu_subtype = 0; /* all cpu_subtypes use the same shared region */
#if defined(HAS_APPLE_PAC)
	if (cpu_type() == CPU_TYPE_ARM64 &&
	    (p->p_cpusubtype & ~CPU_SUBTYPE_MASK) == CPU_SUBTYPE_ARM64E) {
		assertf(p->p_cputype == CPU_TYPE_ARM64,
		    "p %p cpu_type() 0x%x p->p_cputype 0x%x p->p_cpusubtype 0x%x",
		    p, cpu_type(), p->p_cputype, p->p_cpusubtype);
		/*
		 * arm64e uses pointer authentication, so request a separate
		 * shared region for this CPU subtype.
		 */
		cpu_subtype = p->p_cpusubtype & ~CPU_SUBTYPE_MASK;
	}
#endif /* HAS_APPLE_PAC */
    /// Set up the appropriate execution environment for the executable's platform
	vm_map_exec(map, task, load_result.is_64bit_addr, (void *)p->p_fd->fd_rdir, cpu_type(), cpu_subtype);

	/// Close all files marked close-on-exec
	fdexec(p, psa != NULL ? psa->psa_flags : 0, exec);

	/// Handle setuid-related (permission) logic
	error = exec_handle_sugid(imgp);
    ...
    
    /// Act on the Mach-O load result from above
	lret = activate_exec_state(task, p, thread, &load_result);
	...
	
    /// Set up the user-space stack for the process
	if (load_result.unixproc &&
	    create_unix_stack(get_task_map(task),
        ...
    }
    ...
            
	if (load_result.dynlinker) {
        ....
        
        /// Record some parameters that dyld will need
		task_set_dyld_info(task, load_result.all_image_info_addr,
		    load_result.all_image_info_size);
	}

	/// Prefault data so the process does not immediately take VM faults back into the kernel
	exec_prefault_data(p, imgp, &load_result);

	/// Reset signal state
	execsigs(p, thread);

    /// Cancel any async IO requests that can be cancelled and wait for those already active. May block!
	_aio_exec( p )
    ...
}

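What this function mainly does:

Performs the most basic checks on the Mach-O file
Forks a new thread to run the Mach-O
Maps the Mach-O file into memory
Handles permission-related matters such as setuid and code signing
Does a lot of preparation so that dyld can take over processing of the Mach-O file
Releases resources after the handoff to dyld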

2.1 load_machfile

A simplified walkthrough of the load_machfile source:

load_return_t load_machfile {
    ...
    /// If a new_map was passed in as a parameter, use it;
    /// otherwise create a new address space with pmap_create and vm_map_create
	task_t ledger_task;
	if (imgp->ip_new_thread) {
		ledger_task = get_threadtask(imgp->ip_new_thread);
	} else {
		ledger_task = task;
	}
	pmap = pmap_create_options(get_task_ledger(ledger_task),
	    (vm_map_size_t) 0,
	    pmap_flags);
	if (pmap == NULL) {
		return LOAD_RESOURCE;
	}
	map = vm_map_create(pmap,
	    0,
	    vm_compute_max_offset(result->is_64bit_addr),
	    TRUE);

#if defined(__arm64__)
	if (result->is_64bit_addr) {
		/// Enforce 16KB alignment of virtual map entries
		vm_map_set_page_shift(map, SIXTEENK_PAGE_SHIFT);
	} else {
		vm_map_set_page_shift(map, page_shift_user32);
	}
#elif (__ARM_ARCH_7K__ >= 2) && defined(PLATFORM_WatchOS)
	/// Enforce 16KB alignment for watch targets with the new ABI
	vm_map_set_page_shift(map, SIXTEENK_PAGE_SHIFT);
#endif /* __arm64__ */

#ifndef CONFIG_ENFORCE_SIGNED_CODE
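    /// NX protection on executable pages can be turned off here, which bypasses code-signing enforcement.
    /// The per-process flag (CS_ENFORCEMENT) is not set yet, but the global flag can be used.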
	if (!cs_process_global_enforcement() && (header->flags & MH_ALLOW_STACK_EXECUTION)) {
		vm_map_disable_NX(map);
		// TODO: Message Trace or log that this is happening
	}
#endif

	/// Mark data memory non-executable to guard against exploitation of overflow bugs
	if ((header->flags & MH_NO_HEAP_EXECUTION) && !(imgp->ip_flags & IMGPF_ALLOW_DATA_EXEC)) {
		vm_map_disallow_data_exec(map);
	}

    /// Address randomization: compute the ASLR offsets
	if (!(imgp->ip_flags & IMGPF_DISABLE_ASLR)) {
		vm_map_get_max_aslr_slide_section(map, &aslr_section_offset, &aslr_section_size);
		aslr_section_offset = (random() % aslr_section_offset) * aslr_section_size;

		aslr_page_offset = random();
		aslr_page_offset %= vm_map_get_max_aslr_slide_pages(map);
		aslr_page_offset <<= vm_map_page_shift(map);

		dyld_aslr_page_offset = random();
		dyld_aslr_page_offset %= vm_map_get_max_loader_aslr_slide_pages(map);
		dyld_aslr_page_offset <<= vm_map_page_shift(map);

		aslr_page_offset += aslr_section_offset;
	}
    ...
    /// Parse the Mach-O file format
    
	lret = parse_machfile(vp, map, thread, header, file_offset, macho_size,
	    0, aslr_page_offset, dyld_aslr_page_offset, result,
	    NULL, imgp);

    ...
    /// Replace the old address space with the newly created one
	if (in_exec) {
		proc_t p = vfs_context_proc(imgp->ip_vfs_context);
    
		kret = task_start_halt(task);
		if (kret != KERN_SUCCESS) {
			vm_map_deallocate(map); /* will lose pmap reference too */
			return LOAD_FAILURE;
		}
		proc_transcommit(p, 0);
		workq_mark_exiting(p);
		task_complete_halt(task);
		workq_exit(p);

		/*
		 * Roll up accounting info to new task. The roll up is done after
		 * task_complete_halt to make sure the thread accounting info is
		 * rolled up to current_task.
		 */
		task_rollup_accounting_info(get_threadtask(thread), task);
	}
	*mapp = map;
    ....
}

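What this function does:

Allocates the address space for the new task and sets its alignment.
Hardens security settings, mainly DEP and ASLR.
Calls parse_machfile to parse the Mach-O file.
After parsing succeeds, replaces the old address space with the newly created one.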

2.1.1 parse_machfile

A simplified walkthrough of the parse_machfile source:

static load_return_t parse_machfile{
    ... (initialization and checks omitted)
    
    /// Check the magic in mach_header to determine whether this is a 64-bit Mach-O, and pick the header size accordingly
	if (header->magic == MH_MAGIC_64 ||
	    header->magic == MH_CIGAM_64) {
		mach_header_sz = sizeof(struct mach_header_64);
	}

	/// Guard against unbounded recursion
	if (depth > 1) {
		return LOAD_FAILURE;
	}
    /// This function is entered twice: first to parse the main executable's Mach-O, then to parse dyld
	depth++;
    
	/// Using mach_header, verify that the file's CPU architecture matches the machine it is running on
	if (((cpu_type_t)(header->cputype & ~CPU_ARCH_MASK) != (cpu_type() & ~CPU_ARCH_MASK)) ||
	    !grade_binary(header->cputype,
	    header->cpusubtype & ~CPU_SUBTYPE_MASK, TRUE)) {
		return LOAD_BADARCH;
	}

	abi64 = ((header->cputype & CPU_ARCH_ABI64) == CPU_ARCH_ABI64);
    /// Handle each file type differently
	switch (header->filetype) {
        /// A main executable, i.e. an app
        case MH_EXECUTE:
            if (depth != 1) {
                return LOAD_FAILURE;
            }
#if CONFIG_EMBEDDED
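            /// If the file is input for the dynamic linker (MH_DYLDLINK), this branch is always taken, because dyld will parse the main executable again
            if (header->flags & MH_DYLDLINK) {
			/// Check the properties of dynamic executables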
			if (!(header->flags & MH_PIE) && pie_required(header->cputype, header->cpusubtype & ~CPU_SUBTYPE_MASK)) {
				return LOAD_FAILURE;
			}
			result->needs_dynlinker = TRUE;
            } else {
                /// Check the properties of static executables (rejected except on development/debug kernels)
#if !(DEVELOPMENT || DEBUG)
                return LOAD_FAILURE;
#endif
            }
#endif /* CONFIG_EMBEDDED */

            break;
        /// The dynamic linker
        case MH_DYLINKER:
            if (depth != 2) {
                return LOAD_FAILURE;
            }
            is_dyld = TRUE;
            break;

        default:
            return LOAD_FAILURE;
    }
    ...
    
    /// Map the load commands into kernel memory.
	addr = kalloc(alloc_size);
	if (addr == NULL) {
		return LOAD_NOSPACE;
	}
    ...

	/// For PIE executables and for dyld, assign the random ASLR offset to slide
	if ((header->flags & MH_PIE) || is_dyld) {
		slide = aslr_offset;
	}

	/*
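	 *  Scan the commands and handle each one as required.
	 *  The load commands are parsed in 4 passes:
	 *  0: determine whether the TEXT (code) and DATA (data) boundaries are page-aligned
	 *  1: thread state, uuid, code signature
	 *  2: segments
	 *  3: dyld, encryption, check entry point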
	 */

	boolean_t slide_realign = FALSE;
#if __arm64__
	if (!abi64) {
		slide_realign = TRUE;
	}
#endif

	for (pass = 0; pass <= 3; pass++) {
        /// If no alignment check is needed, go straight to the next pass
		if (pass == 0 && !slide_realign && !is_dyld) {
			/* if we dont need to realign the slide or determine dyld's load
			 * address, pass 0 can be skipped */
			continue;
		} else if (pass == 1) {
#if __arm64__
			boolean_t       is_pie;
			int64_t         adjust;

			is_pie = ((header->flags & MH_PIE) != 0);
			if (pagezero_end != 0 &&
			    pagezero_end < effective_page_size) {
				/// PAGEZERO must be at least one page
				adjust = effective_page_size;
				MACHO_PRINTF(("pagezero boundary at "
				    "0x%llx; adjust slide from "
				    "0x%llx to 0x%llx%s\n",
				    (uint64_t) pagezero_end,
				    slide,
				    slide + adjust,
				    (is_pie
				    ? ""
				    : " BUT NO PIE ****** :-(")));
				if (is_pie) {
					slide += adjust;
					pagezero_end += adjust;
					executable_end += adjust;
					writable_start += adjust;
				}
			}
			if (pagezero_end != 0) {
				result->has_pagezero = TRUE;
			}
			if (executable_end == writable_start &&
			    (executable_end & effective_page_mask) != 0 &&
			    (executable_end & FOURK_PAGE_MASK) == 0) {
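				/// The TEXT / DATA boundary is 4K-aligned but not page-aligned. Adjust the slide to make it page-aligned
                /// and avoid having a page with both write and execute permissions.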
				adjust =
				    (effective_page_size -
				    (executable_end & effective_page_mask));
				MACHO_PRINTF(("page-unaligned X-W boundary at "
				    "0x%llx; adjust slide from "
				    "0x%llx to 0x%llx%s\n",
				    (uint64_t) executable_end,
				    slide,
				    slide + adjust,
				    (is_pie
				    ? ""
				    : " BUT NO PIE ****** :-(")));
				if (is_pie) {
					slide += adjust;
				}
			}
#endif /* __arm64__ */

			if (dyld_no_load_addr && binresult) {
                /// The dyld Mach-O does not specify a load address. Its address = random slide + the main binary's highest virtual address
				slide = vm_map_round_page(slide + binresult->max_vm_addr, effective_page_mask);
			}
		}
        ...
		/// Check that some segment maps the start of the Mach-O file; the dynamic loader needs this to read the mach headers, etc.
		if ((pass == 3) && (found_header_segment == FALSE)) {
			ret = LOAD_BADMACHO;
			break;
		}
		offset = mach_header_sz;
		ncmds = header->ncmds;
		while (ncmds--) {
			/* Make sure there is enough room to parse a load_command */
			if (offset + sizeof(struct load_command) > cmds_size) {
				ret = LOAD_BADMACHO;
				break;
			}
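			/// Get the address of the load_command to parse
			lcp = (struct load_command *)(addr + offset);
            /// oldoffset is the offset from the start of the Mach-O file in memory to the current command
			oldoffset = offset;
            /// Advance offset by the current command's size, so that it becomes the offset from the start of the file in memory to the next command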
            if (os_add_overflow(offset, lcp->cmdsize, &offset) ||
                    lcp->cmdsize < sizeof(struct load_command) ||
                    offset > cmds_size) {
                ret = LOAD_BADMACHO;
                break;
            }
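            /// Dispatch on the command type; the kernel only acts on commands that need its intervention
			switch (lcp->cmd) {
            /// Tells the kernel how to set up the new process's memory space. These segments are loaded directly from the Mach-O into memory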
			case LC_SEGMENT: {
				struct segment_command *scp = (struct segment_command *) lcp;
                ...
                /// Map and parse the segment
                /// Segments are further divided into sections, e.g. __objc_classlist, __objc_protolist
				ret = load_segment(lcp,
				    header->filetype,
				    control,
				    file_offset,
				    macho_size,
				    vp,
				    map,
				    slide,
				    result);
				....
				break;
			}
            /// Maps a specific range of bytes of the file into virtual memory
			case LC_SEGMENT_64: {
				struct segment_command_64 *scp64 = (struct segment_command_64 *) lcp;
                ...

				ret = load_segment(lcp,
				    header->filetype,
				    control,
				    file_offset,
				    macho_size,
				    vp,
				    map,
				    slide,
				    result);
                ...
				break;
			}
            /// UNIX thread (includes the stack)
			case LC_UNIXTHREAD:
				if (pass != 1) {
					break;
				}
				ret = load_unixthread(
					(struct thread_command *) lcp,
					thread,
					slide,
					result);
				break;
            /// Replacement for LC_UNIXTHREAD
			case LC_MAIN:
                ...
                    
				ret = load_main(
					(struct entry_point_command *) lcp,
					thread,
					slide,
					result);
				break;
            /// Load the dynamic linker
			case LC_LOAD_DYLINKER:
				if (pass != 3) {
					break;
				}
				if ((depth == 1) && (dlp == 0)) {
                    //// Remember the dynamic linker command
					dlp = (struct dylinker_command *)lcp;
					dlarchbits = (header->cputype & CPU_ARCH_MASK);
				} else {
					ret = LOAD_FAILURE;
				}
				break;
            /// UUID
			case LC_UUID:
				if (pass == 1 && depth == 1) {
					ret = load_uuid((struct uuid_command *) lcp,
					    (char *)addr + cmds_size,
					    result);
				}
				break;
            /// Code signature
			case LC_CODE_SIGNATURE:
				...
				ret = load_code_signature(
					(struct linkedit_data_command *) lcp,
					vp,
					file_offset,
					macho_size,
					header->cputype,
					result,
					imgp);
				....
				break;
#if CONFIG_CODE_DECRYPTION
            /// Encrypted segment information
			case LC_ENCRYPTION_INFO:
			case LC_ENCRYPTION_INFO_64:
				if (pass != 3) {
					break;
				}
				ret = set_code_unprotect(
					(struct encryption_info_command *) lcp,
					addr, map, slide, vp, file_offset,
					header->cputype, header->cpusubtype);
				...
				break;
#endif
    ...
	if (ret == LOAD_SUCCESS) {
        ...
		if ((ret == LOAD_SUCCESS) && (dlp != 0)) {
			/// Load the dynamic linker, regardless of whether the main binary is PIE; this calls parse_machfile once more
			ret = load_dylinker(dlp, dlarchbits, map, thread, depth,
			    dyld_aslr_offset, result, imgp);
		}
		...
	}
    ...

	return ret;
}

The results of the process above are stored in the load_result_t structure:

typedef struct _load_result {
	user_addr_t		mach_header;
	user_addr_t		entry_point;

	// The user stack pointer and addressable user stack size.
	user_addr_t		user_stack;
	mach_vm_size_t		user_stack_size;

	// The allocation containing the stack and guard area.
	user_addr_t		user_stack_alloc;
	mach_vm_size_t		user_stack_alloc_size;

	mach_vm_address_t	all_image_info_addr;
	mach_vm_size_t		all_image_info_size;

	int			thread_count;
	unsigned int
		/* boolean_t */	unixproc	:1,
				needs_dynlinker 	:1,
				dynlinker			:1,
				validentry			:1,
				has_pagezero		:1,
				using_lcmain		:1,
#if __arm64__
				legacy_footprint	:1,
#endif /* __arm64__ */
				is_64bit_addr		:1,
				is_64bit_data		:1;
	unsigned int		csflags;
	unsigned char		uuid[16];
	mach_vm_address_t	min_vm_addr;
	mach_vm_address_t	max_vm_addr;
	unsigned int		platform_binary;
	off_t			cs_end_offset;
	void			*threadstate;
	size_t			threadstate_sz;
} load_result_t;

So where is entry_point set? It is actually set in load_dylinker:

static load_return_t load_dylinker {
	...
	ret = parse_machfile(vp, map, thread, header, file_offset,
	    macho_size, depth, slide, 0, myresult, result, imgp);

	if (ret == LOAD_SUCCESS) {
		if (result->threadstate) {
			/* don't use the app's threadstate if we have a dyld */
			kfree(result->threadstate, result->threadstate_sz);
		}
		result->threadstate = myresult->threadstate;
		result->threadstate_sz = myresult->threadstate_sz;

		result->dynlinker = TRUE;
        /// Set load_result_t's entry_point to dyld's entry point, so dyld is what runs first at launch.
		result->entry_point = myresult->entry_point;
		result->validentry = myresult->validentry;
		result->all_image_info_addr = myresult->all_image_info_addr;
		result->all_image_info_size = myresult->all_image_info_size;
		if (myresult->platform_binary) {
			result->csflags |= CS_DYLD_PLATFORM;
		}
	}

	...
	return ret;
}

2.2 activate_exec_state

A simplified walkthrough of the activate_exec_state source:

static int activate_exec_state {
    ...
    /// Set the entry point
	thread_setentrypoint(thread, result->entry_point);
	return KERN_SUCCESS;
}

2.3 Brief summary

The functions above mainly load and parse the Mach-O information and set the program's entry_point.
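The heart of that parsing is a bounds-checked walk over the load commands. The sketch below shows the idea in user-space C. It is an illustration under assumptions, not the kernel's code: `count_load_commands` and `struct sketch_load_command` are hypothetical names, and the buffer is assumed to start at the first load command (the kernel's offset starts after the mach_header instead).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Same two leading fields as struct load_command in <mach-o/loader.h>. */
struct sketch_load_command {
    uint32_t cmd;      /* command type, e.g. LC_SEGMENT_64 */
    uint32_t cmdsize;  /* total size of this command in bytes */
};

/*
 * Hypothetical helper: count the commands of type `wanted` in a buffer that
 * holds `ncmds` load commands. Returns -1 if the buffer is malformed.
 */
static int count_load_commands(const unsigned char *cmds, size_t cmds_size,
                               uint32_t ncmds, uint32_t wanted)
{
    size_t offset = 0;  /* invariant: offset <= cmds_size */
    int found = 0;

    while (ncmds--) {
        struct sketch_load_command lc;

        /* There must be room for at least the fixed command header. */
        if (cmds_size - offset < sizeof(lc))
            return -1;
        memcpy(&lc, cmds + offset, sizeof(lc));

        /* cmdsize must cover the header and must not run past the buffer. */
        if (lc.cmdsize < sizeof(lc) || lc.cmdsize > cmds_size - offset)
            return -1;

        if (lc.cmd == wanted)
            found++;
        offset += lc.cmdsize;  /* step to the next command */
    }
    return found;
}
```

Comparing `cmdsize` against the remaining bytes, rather than adding first and checking afterwards, avoids integer overflow; the kernel gets the same effect with os_add_overflow.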

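3. App launch flow

execve(): the user taps an app, and the system calls execve() into the kernel
- mac_execve: forks a new thread (process)
  - exec_activate_image: finds the binary file and determines the parsing function
    - exec_mach_imgact: Mach-O Binary and Fat Binary each have their own loading function; this one is the loader for Mach-O Binary.
      - load_machfile: loads the main executable's Mach-O information.
        
        parse_machfile: parses the main executable's Mach-O information
        
        load_dylinker: after the Mach-O is parsed, uses its LC_LOAD_DYLINKER load command to start the loader for this binary, i.e. /usr/lib/dyld
        
        parse_machfile: parses dyld's own Mach-O file; the entry_point is resolved during this step
      - activate_exec_state:
        
        thread_setentrypoint: sets the entry_point.
- Jump to the entry corresponding to entry_point, which starts dyld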

The corresponding flowchart:

This entry_point is the address of _dyld_start. Why? See the article "Before _dyld_start" (_dyld_start之前).

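0x06 How dyld loads a program

1. __dyld_start

Once the final load above, that of dyld itself, has finished, execution enters dyld's entry function, __dyld_start. In the dyld source, __dyld_start is implemented in assembly. Below is its __arm64__ implementation: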

#if __arm64__
	.text
	.align 2
	.globl __dyld_start
__dyld_start:
	mov 	x28, sp
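	and     sp, x28, #~15		// force 16-byte alignment of the stack
	mov	x0, #0
	mov	x1, #0
	stp	x1, x0, [sp, #-16]!	// push an aligned terminating frame
	mov	fp, sp			// point fp at the terminating frame
	sub	sp, sp, #16             // make room for a local variable
#if __LP64__
	ldr     x0, [x28]               // get the main executable's mach_header
	ldr     x1, [x28, #8]           // load the argc passed in by the kernel into x1 (the kernel passes the 32-bit int argc as 64 bits on the stack to keep alignment)
	add     x2, x28, #16            // put the argv passed in by the kernel into x2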
#endif
	adrp	x3,___dso_handle@page
	add 	x3,x3,___dso_handle@pageoff // get dyld's mach_header
	mov	x4,sp                   // x4 has &startGlue
	
	/// Bootstrap: the entry is the function dyldbootstrap::start

	/// call dyldbootstrap::start(app_mh, argc, argv, dyld_mh, &startGlue)
	bl	__ZN13dyldbootstrap5startEPKN5dyld311MachOLoadedEiPPKcS3_Pm
	/// It returns the main executable's entry address, which is saved in register x16
	mov	x16,x0                  // save entry point address in x16
#if __LP64__
	ldr     x1, [sp]
#endif
	cmp	x1, #0
	b.ne	Lnew

	/// Once all libraries have been loaded, dyld's work is done; in this case the LC_UNIXTHREAD command is what starts the binary's main thread
	// LC_UNIXTHREAD way, clean up stack and jump to result
#if __LP64__
	add	sp, x28, #8             // restore unaligned stack pointer without app mh
#endif
    /// Jump to the program's entry point
#if __arm64e__
	braaz   x16                     // jump to the program's entry point
#else
	br      x16                     // jump to the program's entry point
#endif
	/// Set up the entry address and stack for the program's main thread in order to call main()
	// LC_MAIN case, set up stack for call to main()
Lnew:	mov	lr, x1		    // simulate return address into _start in libdyld.dylib
#if __LP64__
	ldr	x0, [x28, #8]       // main param1 = argc
	add	x1, x28, #16        // main param2 = argv
	add	x2, x1, x0, lsl #3
	add	x2, x2, #8          // main param3 = &env[0]
	mov	x3, x2
Lapple:	ldr	x4, [x3]
	add	x3, x3, #8
#endif
	cmp	x4, #0
	b.ne	Lapple		    // main param4 = apple
	/// Call main()
#if __arm64e__
	braaz   x16
#else
	br      x16
#endif

#endif // __arm64__

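The code above obtains the mach_header of both the main executable and dyld, then calls dyldbootstrap::start to process the main executable further. That call returns the main executable's entry address; the code then sets up the required arguments and calls main(), entering the main program. Next we analyze dyldbootstrap::start.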
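The stack layout that the assembly reads can be sketched in C. The layout is taken from the offsets used above: the app's mach_header at [sp], argc at [sp+8], argv from sp+16, then a NULL, the environment vector, another NULL, and the apple vector. `parse_launch_stack`, `struct launch_args`, and `vec_string` are hypothetical names for this illustration; the words are kept as `uintptr_t` and only converted to strings on access.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view of the words the kernel leaves on the initial stack. */
struct launch_args {
    uintptr_t        app_mh;  /* address of the main executable's mach_header */
    int              argc;
    const uintptr_t *argv;    /* argc string pointers followed by 0 */
    const uintptr_t *envp;    /* string pointers followed by 0 */
    const uintptr_t *apple;   /* string pointers followed by 0 */
};

/* Decode the initial stack; sp plays the role of x28 in __dyld_start. */
static struct launch_args parse_launch_stack(const uintptr_t *sp)
{
    struct launch_args out;
    const uintptr_t *p;

    out.app_mh = sp[0];                    /* ldr x0, [x28]     */
    out.argc   = (int)sp[1];               /* ldr x1, [x28, #8] */
    out.argv   = &sp[2];                   /* add x2, x28, #16  */
    out.envp   = out.argv + out.argc + 1;  /* skip argv and its terminator */

    /* apple starts right after the terminator of envp (the Lapple loop). */
    for (p = out.envp; *p != 0; ++p)
        ;
    out.apple = p + 1;
    return out;
}

/* Read entry i of a vector as a C string. */
static const char *vec_string(const uintptr_t *vec, size_t i)
{
    return (const char *)vec[i];
}
```

This mirrors what the Lnew path computes before jumping to main(): x0 = argc, x1 = argv, x2 = envp, and x3 = apple.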

2. dyldbootstrap::start

Below is a simplified walkthrough of the function dyldbootstrap::start.

uintptr_t start(const dyld3::MachOLoaded* appsMachHeader, int argc, const char* argv[],
				const dyld3::MachOLoaded* dyldsMachHeader, uintptr_t* startGlue)
{

    /// Emit a kdebug tracepoint indicating that the bootstrap has started
    // Emit kdebug tracepoint to indicate dyld bootstrap has started <rdar://46878536>
    dyld3::kdebug_trace_dyld_marker(DBG_DYLD_TIMING_BOOTSTRAP_START, 0, 0, 0, 0);

	/// If the kernel slid dyld, we must fix up (rebase) the contents of dyld.
    rebaseDyld(dyldsMachHeader);
	/// The env pointers set up by the kernel, i.e. the environment variables
	const char** envp = &argv[argc+1];
    ...
	// Set up a random value for the stack canary
	__guard_setup(apple);

#if DYLD_INITIALIZER_SUPPORT
	// run all C++ initializers inside dyld
    /// This is the key line: before initializing other libraries, dyld calls this function to run the initializers of all of its own global C++ objects.
	runDyldInitializers(argc, argv, envp, apple);
#endif
    /// Bootstrapping is done; call dyld's main function to initialize the executable and load all dynamic libraries
	// now that we are done bootstrapping dyld, call dyld's main
	uintptr_t appsSlide = appsMachHeader->getSlide();
	return dyld::_main((macho_header*)appsMachHeader, appsSlide, argc, argv, envp, apple, startGlue);
}

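The code above bootstraps dyld, sets up the environment parameters provided by the kernel, and calls dyld's main function to perform initialization and load all the dynamic libraries. Next we analyze dyld::_main.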
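Both envp and apple are NULL-terminated vectors of `key=value` strings, and dyld::_main reads values such as executable_path out of apple. A minimal sketch of such a lookup follows; `lookup_key` is a hypothetical helper written for this illustration and is not dyld's `_simple_getenv` implementation.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: find "key=value" in a NULL-terminated string vector
 * and return a pointer to the value, or NULL if the key is absent.
 */
static const char *lookup_key(const char *const vec[], const char *key)
{
    size_t key_len = strlen(key);
    const char *const *p;

    for (p = vec; *p != NULL; ++p) {
        /* The entry must start with the key, followed directly by '='. */
        if (strncmp(*p, key, key_len) == 0 && (*p)[key_len] == '=')
            return *p + key_len + 1;
    }
    return NULL;
}
```

Requiring the `=` right after the key keeps a lookup for one key from matching a longer key that merely shares the same prefix.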

3. dyld::_main

Below is a simplified version of the dyld::_main function:

uintptr_t
_main(const macho_header* mainExecutableMH, uintptr_t mainExecutableSlide, 
		int argc, const char* argv[], const char* envp[], const char* apple[], 
		uintptr_t* startGlue)
{
	...
	/// Save the main executable's mach_header
	sMainExecutableMachHeader = mainExecutableMH;
	sMainExecutableSlide = mainExecutableSlide;
	...
	
	/// 1. Set up the runtime context
	setContext(mainExecutableMH, argc, argv, envp, apple);

	/// Get the path of the executable
	sExecPath = _simple_getenv(apple, "executable_path");
	
	/// 2. Configure process restrictions and check environment variables
    configureProcessRestrictions(mainExecutableMH, envp);
    checkEnvironmentVariables(envp);
	...
	/// Get the CPU architecture information of the current device
	getHostInfo(mainExecutableMH, mainExecutableSlide);

	/// 3. Check whether the shared cache is enabled, and map the shared cache
	checkSharedRegionDisable((dyld3::MachOLoaded*)mainExecutableMH, mainExecutableSlide);
	if ( gLinkContext.sharedRegionMode != ImageLoader::kDontUseSharedRegion ) {
#if TARGET_OS_SIMULATOR
		if ( sSharedCacheOverrideDir)
			mapSharedCache();
#else
		mapSharedCache();
#endif
	}
	...

	// Register gdb notifications, used for debugging.
	stateToHandlers(dyld_image_state_dependents_mapped, sBatchHandlers)->push_back(notifyGDB);
	stateToHandlers(dyld_image_state_mapped, sSingleHandlers)->push_back(updateAllImages);
	// make initial allocations large enough that it is unlikely to need to be re-alloced
	sImageRoots.reserve(16);
	sAddImageCallbacks.reserve(4);
	sRemoveImageCallbacks.reserve(4);
	sAddLoadImageCallbacks.reserve(4);
	sImageFilesNeedingTermination.reserve(16);
	sImageFilesNeedingDOFUnregistration.reserve(8);
	...
	/// Instantiate the main executable
	try {
		// 4. Add dyld itself to the UUID list
		addDyldImageToUUIDList();

		CRSetCrashLogMessage(sLoadingCrashMessage);
		
		/// 5. Load the executable at sExecPath and create an ImageLoader object for it
		sMainExecutable = instantiateFromLoadedImage(mainExecutableMH, mainExecutableSlide, sExecPath);
		/// Store the ImageLoader in the link context as the main executable and record whether it is code signed
		gLinkContext.mainExecutable = sMainExecutable;
		gLinkContext.mainExecutableCodeSigned = hasCodeSignatureLoadCommand(mainExecutableMH);
		...
		
		/// 6. Now that the shared cache is loaded, check whether any versioned dylib is newer and, if so, override the existing one
	#if SUPPORT_VERSIONED_PATHS
		checkVersionedPaths();
	#endif
		...
		/// 7. Load any inserted libraries
		if	( sEnv.DYLD_INSERT_LIBRARIES != NULL ) {
			for (const char* const* lib = sEnv.DYLD_INSERT_LIBRARIES; *lib != NULL; ++lib) 
				loadInsertedDylib(*lib);
		}
		
		/// Record the number of inserted libraries so that flat search looks in inserted libraries first, then main, then others
		sInsertedDylibCount = sAllImages.size()-1;

		// 8. Link the main executable
		gLinkContext.linkingMainExecutable = true;
#if SUPPORT_ACCELERATE_TABLES
		if ( mainExcutableAlreadyRebased ) {
			/// A previous link() on the main executable already adjusted its internal pointers for ASLR; work around that by rebasing by the inverse amount
			sMainExecutable->rebase(gLinkContext, -mainExecutableSlide);
		}
#endif
		// Start linking the main executable. It is already stored in gLinkContext.mainExecutable; link() is called here, which internally calls ImageLoader::link().
		link(sMainExecutable, sEnv.DYLD_BIND_AT_LAUNCH, true, ImageLoader::RPathChain(NULL, NULL), -1);
		sMainExecutable->setNeverUnloadRecursive();
		if ( sMainExecutable->forceFlat() ) {
			gLinkContext.bindFlat = true;
			gLinkContext.prebindUsage = ImageLoader::kUseNoPrebinding;
		}

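		// 9. Link any inserted libraries
		// Do this after linking the main executable so that any dylibs pulled in by inserted
		// dylibs (e.g. libSystem) will not be in front of dylibs the program uses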
		if ( sInsertedDylibCount > 0 ) {
			for(unsigned int i=0; i < sInsertedDylibCount; ++i) {
				ImageLoader* image = sAllImages[i+1];
				link(image, sEnv.DYLD_BIND_AT_LAUNCH, true, ImageLoader::RPathChain(NULL, NULL), -1);
				image->setNeverUnloadRecursive();
			}
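			/// Only inserted libraries can interpose
			/// Register interposing info. Interposition replaces a library function's behavior by supplying a function with the same name.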
			for(unsigned int i=0; i < sInsertedDylibCount; ++i) {
				ImageLoader* image = sAllImages[i+1];
				image->registerInterposing(gLinkContext);
			}
		}

		// <rdar://problem/19315404> dyld should support interposition even without DYLD_INSERT_LIBRARIES
		/// dyld should support interposing even without DYLD_INSERT_LIBRARIES
		for (long i=sInsertedDylibCount+1; i < sAllImages.size(); ++i) {
			ImageLoader* image = sAllImages[i];
			if ( image->inSharedCache() )
				continue;
			image->registerInterposing(gLinkContext);
		}
		
			// apply interposing to initial set of images
		for(int i=0; i < sImageRoots.size(); ++i) {
			sImageRoots[i]->applyInterposing(gLinkContext);
		}
		ImageLoader::applyInterposingToDyldCache(gLinkContext);
		...
		
		// 10. Weak binding: only done after all inserted images have been linked
		sMainExecutable->weakBind(gLinkContext);
		gLinkContext.linkingMainExecutable = false;

		sMainExecutable->recursiveMakeDataReadOnly(gLinkContext);
		
		/// 11. Run initializers for the main executable
		initializeMainExecutable(); 

		/// 12. Find the entry point of the main executable
		result = (uintptr_t)sMainExecutable->getEntryFromLC_MAIN();
		if ( result != 0 ) {
			// main executable uses LC_MAIN, we need to use helper in libdyld to call into main()
			if ( (gLibSystemHelpers != NULL) && (gLibSystemHelpers->version >= 9) )
				*startGlue = (uintptr_t)gLibSystemHelpers->startGlueToCallExit;
			else
				halt("libdyld.dylib support not present for LC_MAIN");
		}
		else {
			// main executable uses LC_UNIXTHREAD, dyld needs to let "start" in program set up for main()
			result = (uintptr_t)sMainExecutable->getEntryFromLC_UNIXTHREAD();
			*startGlue = 0;
			}
#if __has_feature(ptrauth_calls)
		/// start() calls the result pointer as a function pointer, so we need to sign it.
		result = (uintptr_t)__builtin_ptrauth_sign_unauthenticated((void*)result, 0, 0);
#endif
	}
	catch(const char* message) {
		...
	}
	catch(...) {
		....
	}
	...
	
	return result;
}

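In outline, this function does the following:

Set up the runtime context and handle environment variables
Configure process restrictions and check environment variables
Check whether the shared cache is enabled, and map the shared cache
Add dyld itself to the UUID list
Instantiate the main executable, creating an ImageLoader object for it
Check whether any versioned library is newer and, if so, override the existing one
Load any inserted libraries
Link the main executable
Link all inserted libraries
Perform weak binding: only after all inserted images have been linked
Run initializers for the main executable
Find the entry point of the main executable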

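3.1 Set up the runtime context and handle environment variables

First the main executable's mach_header is stored in sMainExecutableMachHeader. Then setContext(..) is called with the main executable's mach_header, argc, argv, envp and apple. It populates the global link context gLinkContext: environment variables, callback functions, imageCount, the main executable's mach_header, loadLibrary, and so on. loadLibrary points to this module's libraryLocator() function, which is responsible for loading dynamic libraries.

3.2 Configure process restrictions and check environment variables

If the process is restricted and its environment variables are changed as a result, the following happens:

Update the related gLinkContext fields: because the environment variables changed, the process's envp and path-related parameters have to be refreshed.
checkEnvironmentVariables(): check the environment variables

If both gLinkContext.allowEnvVarsPath and gLinkContext.allowEnvVarsPrint are false, checkEnvironmentVariables() returns immediately; otherwise it calls processDyldEnvironmentVariable() to process and apply each variable. After that, checkLoadCommandEnvironmentVariables() walks the ncmds load commands of the main executable's mach_header looking for environment variables.
- processDyldEnvironmentVariable(): handles each recognized environment variable.
- checkLoadCommandEnvironmentVariables(): iterates over all LC_DYLD_ENVIRONMENT load commands in the Mach-O and calls processDyldEnvironmentVariable() for each.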
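
To make the per-variable step more concrete, here is a minimal sketch of splitting one `KEY=value` entry and recognizing the `DYLD_` namespace. `DyldEnvEntry` and `parseDyldEnvEntry` are hypothetical names made up for this example; they are not dyld functions, and the real processDyldEnvironmentVariable() dispatches on many specific keys.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Hypothetical result type for this sketch (not a dyld type).
struct DyldEnvEntry {
    bool        isDyldVar;   // true when the key starts with "DYLD_"
    std::string key;
    std::string value;
};

// Split one "KEY=value" string and flag keys in the DYLD_ namespace.
DyldEnvEntry parseDyldEnvEntry(const char* entry) {
    assert(entry != nullptr);
    DyldEnvEntry out = { false, "", "" };
    const char* eq = std::strchr(entry, '=');
    if (eq == nullptr)
        return out;                              // no '=': not a usable entry
    out.key.assign(entry, static_cast<size_t>(eq - entry));
    out.value.assign(eq + 1);
    out.isDyldVar = (out.key.compare(0, 5, "DYLD_") == 0);
    return out;
}
```

For example, `parseDyldEnvEntry("DYLD_PRINT_BINDINGS=1")` yields the key `DYLD_PRINT_BINDINGS`, the value `1`, and `isDyldVar == true`, while an entry such as `PATH=/usr/bin` is left unflagged.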

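3.3 Check whether the shared cache is enabled, and map the shared cache

3.3.1 Check whether the shared cache is enabled

checkSharedRegionDisable: checks whether the shared cache is enabled. On iOS it is never disabled.

3.3.2 Map the shared cache

The shared cache is another mechanism supported by dyld. It is a set of libraries that have been pre-linked and stored in a single file on disk. On iOS the shared cache can be found under /System/Library/Caches/com.apple.dyld. Libraries such as libdispatch and UIKit live in the shared cache.

This step mainly calls mapSharedCache() to map the shared cache, which in turn calls loadDyldCache(). A simplified walkthrough of its source: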

bool loadDyldCache(const SharedCacheOptions& options, SharedCacheLoadInfo* results)
{
    ...
    if ( options.forcePrivate ) {
        /// Map the cache privately, into the current process only
        return mapCachePrivate(options, results);
    }
    else {
        // fast path: when cache is already mapped into shared region
        bool hasError = false;
        /// Fast path: the cache is already mapped into the shared region, so reuse that mapping in this process's address space
        /// In other words, it has been loaded before
        if ( reuseExistingCache(options, results) ) {
            hasError = (results->errorMessage != nullptr);
        } else {
            /// Slow path: this is the first process to start, the shared region is still empty, so the cache must be mapped into it
            /// First load
            hasError = mapCacheSystemWide(options, results);
        }
        return hasError;
    }
}

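This function handles the following cases:

When forcePrivate is set, the cache is mapped into the current process only.
When the shared cache is loaded for the first time, mapCacheSystemWide() is called.
When the shared cache has been loaded before, its existing mapping in the shared region is reused in the process's address space.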

mapCacheSystemWide()

static bool mapCacheSystemWide(const SharedCacheOptions& options, SharedCacheLoadInfo* results)
{
    CacheInfo info;
    /// Preflight the cache file
    if ( !preflightCacheFile(options, results, &info) )
        return false;
    ...
    /// This call performs the actual mapping
    int result = __shared_region_map_and_slide_np(info.fd, 3, info.mappings, results->slide, slideInfo, info.slideInfoSize);
    ...
    return true;
}

static bool preflightCacheFile(const SharedCacheOptions& options, SharedCacheLoadInfo* results, CacheInfo* info)
{
    
    /// Locate and open the shared cache file
    int fd = openSharedCacheFile(options, results);
    
    struct stat cacheStatBuf;
    /// stat() on the shared cache file failed
    if ( dyld::my_stat(results->path, &cacheStatBuf) != 0 ) {
        results->errorMessage = "shared cache file stat() failed";
        ::close(fd);
        return false;
    }

    // Sanity check the header and mappings
    uint8_t firstPage[0x4000];
    if ( ::pread(fd, firstPage, sizeof(firstPage), 0) != sizeof(firstPage) ) {
        results->errorMessage = "shared cache file pread() failed";
        ::close(fd);
        return false;
    }
    /// Parse the dyld_cache_header information
    const dyld_cache_mapping_info* const fileMappings = (dyld_cache_mapping_info*)&firstPage[cache->header.mappingOffset];
    
    ...
    // Register the cache file's code signature and validate it
    fsignatures_t siginfo;
    siginfo.fs_file_start = 0;  // cache always starts at beginning of file
    siginfo.fs_blob_start = (void*)cache->header.codeSignatureOffset;
    siginfo.fs_blob_size  = (size_t)(cache->header.codeSignatureSize);
    ...
    
    // Store the parsed cache information in mappings
    info->fd = fd;
    for (int i=0; i < 3; ++i) {
        info->mappings[i].sfm_address       = fileMappings[i].address;
        info->mappings[i].sfm_size          = fileMappings[i].size;
        info->mappings[i].sfm_file_offset   = fileMappings[i].fileOffset;
        info->mappings[i].sfm_max_prot      = fileMappings[i].maxProt;
        info->mappings[i].sfm_init_prot     = fileMappings[i].initProt;
    }
    info->mappings[1].sfm_max_prot  |= VM_PROT_SLIDE;
    info->mappings[1].sfm_init_prot |= VM_PROT_SLIDE;
    info->slideInfoAddressUnslid  = fileMappings[2].address + cache->header.slideInfoOffset - fileMappings[2].fileOffset;
    info->slideInfoSize           = (long)cache->header.slideInfoSize;
    if ( cache->header.mappingOffset >= 0xf8 ) {
        info->sharedRegionStart = cache->header.sharedRegionStart;
        info->sharedRegionSize  = cache->header.sharedRegionSize;
        info->maxSlide          = cache->header.maxSlide;
    }
    else {
        info->sharedRegionStart = SHARED_REGION_BASE;
        info->sharedRegionSize  = SHARED_REGION_SIZE;
        info->maxSlide          = SHARED_REGION_SIZE - (fileMappings[2].address + fileMappings[2].size - fileMappings[0].address);
    }
    return true;
}

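First preflightCacheFile() is called to check the cache. It calls openSharedCacheFile() to open the cache file that matches the current CPU architecture, sanity checks the header and mappings, parses the dyld_cache_header, registers and validates the cache file's code signature, and stores the parsed fileMappings and dyld_cache_header data in mappings and related fields. Finally __shared_region_map_and_slide_np() performs the actual mapping.

3.4 Add dyld to the UUID list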

static void addDyldImageToUUIDList()
{
	const struct macho_header* mh = (macho_header*)&__dso_handle;
	const uint32_t cmd_count = mh->ncmds;
	const struct load_command* const cmds = (struct load_command*)((char*)mh + sizeof(macho_header));
	const struct load_command* cmd = cmds;
	for (uint32_t i = 0; i < cmd_count; ++i) {
		switch (cmd->cmd) {
			case LC_UUID: {
				uuid_command* uc = (uuid_command*)cmd;
				dyld_uuid_info info;
				info.imageLoadAddress = (mach_header*)mh;
				memcpy(info.imageUUID, uc->uuid, 16);
				addNonSharedCacheImageUUID(info);
				return;
			}
		}
		cmd = (const struct load_command*)(((char*)cmd)+cmd->cmdsize);
	}
}

void addNonSharedCacheImageUUID(const dyld_uuid_info& info)
{
	/// set uuidArray to NULL to denote it is in-use
	dyld::gProcessInfo->uuidArray = NULL;
	
	// append the new image
	sImageUUIDs.push_back(info);
	dyld::gProcessInfo->uuidArrayCount = sImageUUIDs.size();
	
	// set uuidArray back to the base address of the vector (other processes can now read it)
	dyld::gProcessInfo->uuidArray = &sImageUUIDs[0];
}

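The function loops over dyld's own load commands to find its LC_UUID, copies the UUID into info.imageUUID with memcpy, and then calls addNonSharedCacheImageUUID() to append dyld's UUID to the process's list of UUIDs for images that are not in the shared cache.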
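
The load-command walk used here recurs throughout dyld, so a standalone sketch may help. It assumes local stand-in structs (`LoadCommand`, `UuidCommand`) instead of the real `<mach-o/loader.h>` types so that it builds on any platform, and `findUuid` is a hypothetical helper name. Unlike the dyld listing, it also bounds-checks the buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Local stand-ins for the <mach-o/loader.h> layouts (assumption of this sketch).
struct LoadCommand { uint32_t cmd; uint32_t cmdsize; };
struct UuidCommand { uint32_t cmd; uint32_t cmdsize; uint8_t uuid[16]; };
static const uint32_t kLC_UUID = 0x1b;   // value of LC_UUID in <mach-o/loader.h>

// Walk `ncmds` load commands stored back to back in `cmds` (`size` bytes long)
// and copy the UUID out of the first LC_UUID command. Returns false if none is found.
bool findUuid(const uint8_t* cmds, size_t size, uint32_t ncmds, uint8_t outUuid[16]) {
    assert(cmds != nullptr && outUuid != nullptr);
    size_t offset = 0;
    for (uint32_t i = 0; i < ncmds; ++i) {
        if (offset + sizeof(LoadCommand) > size)
            return false;                                   // truncated header
        LoadCommand lc;
        std::memcpy(&lc, cmds + offset, sizeof(lc));
        if (lc.cmdsize < sizeof(LoadCommand) || offset + lc.cmdsize > size)
            return false;                                   // malformed cmdsize
        if (lc.cmd == kLC_UUID && lc.cmdsize >= sizeof(UuidCommand)) {
            std::memcpy(outUuid, cmds + offset + offsetof(UuidCommand, uuid), 16);
            return true;
        }
        offset += lc.cmdsize;   // commands are variable-sized; cmdsize is the stride
    }
    return false;
}
```

The key point is that each command carries its own size, so the cursor advances by `cmdsize` rather than by a fixed struct size.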

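3.5 Instantiate the main executable, creating an ImageLoader object for it

instantiateFromLoadedImage() creates an ImageLoader object for the main executable, which is then stored in the global link context. A simplified walkthrough of the source: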

static ImageLoaderMachO* instantiateFromLoadedImage(const macho_header* mh, uintptr_t slide, const char* path)
{
	// try to instantiate the main executable's Mach-O
	if ( isCompatibleMachO((const uint8_t*)mh, path) ) {
		ImageLoader* image = ImageLoaderMachO::instantiateMainExecutable(mh, slide, path, gLinkContext);
		addImage(image);
		return (ImageLoaderMachO*)image;
	}
	
	throw "main executable not a known format";
}

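isCompatibleMachO() checks the cputype and cpusubtype in the Mach-O header to decide whether the program is compatible with the current system. If it is, the main executable's Mach-O is instantiated via ImageLoaderMachO::instantiateMainExecutable(). Once the ImageLoader for the main executable is obtained, it is added to the global master list sAllImages, and finally addMappedRange() is called to allocate an entry for and record the memory ranges the main executable is mapped into. Below is a simplified version of ImageLoaderMachO::instantiateMainExecutable():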

// create image for main executable
ImageLoader* ImageLoaderMachO::instantiateMainExecutable(const macho_header* mh, uintptr_t slide, const char* path, const LinkContext& context)
{
	bool compressed;
	unsigned int segCount;
	unsigned int libCount;
	const linkedit_data_command* codeSigCmd;
	const encryption_info_command* encryptCmd;
	sniffLoadCommands(mh, path, false, &compressed, &segCount, &libCount, context, &codeSigCmd, &encryptCmd);
	// instantiate concrete class based on content of load commands
	if ( compressed ) 
		return ImageLoaderMachOCompressed::instantiateMainExecutable(mh, slide, path, segCount, libCount, context);
	else
#if SUPPORT_CLASSIC_MACHO
		return ImageLoaderMachOClassic::instantiateMainExecutable(mh, slide, path, segCount, libCount, context);
#else
		throw "missing LC_DYLD_INFO load command";
#endif
}

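sniffLoadCommands() extracts the following information from the load commands that follow the mach_header:

compressed: whether the Mach-O is of the Compressed or the Classic kind, decided by whether it contains an LC_DYLD_INFO, LC_DYLD_INFO_ONLY or LC_DYLD_CHAINED_FIXUPS load command. The first two commands record the Mach-O's dynamic loading information, and the last one records its dynamic linking information. The first two commands are represented by struct dyld_info_command: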
```
struct dyld_info_command {
   uint32_t   cmd;		/* LC_DYLD_INFO or LC_DYLD_INFO_ONLY */
   uint32_t   cmdsize;		/* sizeof(struct dyld_info_command) */
   uint32_t   rebase_off;	/* file offset to rebase info  */
   uint32_t   rebase_size;	/* size of rebase info   */
   uint32_t   bind_off;	/* file offset to binding info   */
   uint32_t   bind_size;	/* size of binding info  */
   uint32_t   weak_bind_off;	/* file offset to weak binding info   */
   uint32_t   weak_bind_size;  /* size of weak binding info  */
   uint32_t   lazy_bind_off;	/* file offset to lazy binding info */
   uint32_t   lazy_bind_size;  /* size of lazy binding info */
   uint32_t   export_off;	/* file offset to export info */
   uint32_t   export_size;	/* size of export info */
};
```
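- rebase_off and rebase_size describe the rebase information. When dyld loads a Mach-O at an address different from its preferred address, it has to rebase the image.
- bind_off and bind_size describe the symbols that must be bound when the process starts. A typical example is dyld_stub_binder, which dyld uses to bind symbols lazily; most dynamic libraries reference it.
- weak_bind_off and weak_bind_size describe the weak binding information. Weak symbols are mainly used for symbol overriding in object-oriented languages. A typical example is operator new in C++: by default it binds to libstdc++.dylib, but if dyld finds that some image overrides new through a weak symbol reference, it rebinds the symbol and calls the overriding version.
- lazy_bind_off and lazy_bind_size describe the lazy binding information. Some symbols do not need to be resolved at launch; they are resolved on first call. These are called lazy symbols.
- export_off and export_size describe the exported symbols. Exported symbols can be accessed by other Mach-O files: a dynamic library usually exports one or more symbols for external use, while an executable exports _main and _mh_execute_header for dyld.
The last command is represented by struct linkedit_data_command: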
```
struct linkedit_data_command {
    uint32_t	cmd;		/* LC_CODE_SIGNATURE, LC_SEGMENT_SPLIT_INFO,
				   LC_FUNCTION_STARTS, LC_DATA_IN_CODE,
				   LC_DYLIB_CODE_SIGN_DRS,
				   LC_LINKER_OPTIMIZATION_HINT,
				   LC_DYLD_EXPORTS_TRIE, or
				   LC_DYLD_CHAINED_FIXUPS. */
    uint32_t	cmdsize;	/* sizeof(struct linkedit_data_command) */
    uint32_t	dataoff;	/* file offset of data in __LINKEDIT segment */
    uint32_t	datasize;	/* file size of data in __LINKEDIT segment  */
};
```
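- dataoff and datasize locate the data in the __LINKEDIT segment, i.e. link-edit information.
The code signature command uses this struct as well.
segCount: the number of segments, obtained by counting every LC_SEGMENT_COMMAND load command.
libCount: the number of dynamic libraries to load, obtained by counting every LC_LOAD_UPWARD_DYLIB, LC_LOAD_DYLIB, LC_LOAD_WEAK_DYLIB and LC_REEXPORT_DYLIB load command.
codeSigCmd: the code signature command, taken from LC_CODE_SIGNATURE.
encryptCmd: the encryption command, taken from LC_ENCRYPTION_INFO or LC_ENCRYPTION_INFO_64.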
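
As a rough sketch of how the first three values fall out of a single pass over the command IDs: the constants below mirror `<mach-o/loader.h>`, while `SniffResult` and `sniffSketch` are hypothetical names for this example. It assumes a 64-bit image, where LC_SEGMENT_COMMAND is LC_SEGMENT_64, and it ignores the validation that the real sniffLoadCommands() also performs.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Load-command constants as defined in <mach-o/loader.h>.
constexpr uint32_t kLC_REQ_DYLD            = 0x80000000;
constexpr uint32_t kLC_SEGMENT_64          = 0x19;
constexpr uint32_t kLC_LOAD_DYLIB          = 0xc;
constexpr uint32_t kLC_LOAD_WEAK_DYLIB     = 0x18 | kLC_REQ_DYLD;
constexpr uint32_t kLC_REEXPORT_DYLIB      = 0x1f | kLC_REQ_DYLD;
constexpr uint32_t kLC_LOAD_UPWARD_DYLIB   = 0x23 | kLC_REQ_DYLD;
constexpr uint32_t kLC_DYLD_INFO           = 0x22;
constexpr uint32_t kLC_DYLD_INFO_ONLY      = 0x22 | kLC_REQ_DYLD;
constexpr uint32_t kLC_DYLD_CHAINED_FIXUPS = 0x34 | kLC_REQ_DYLD;

// Hypothetical result type for this sketch.
struct SniffResult { bool compressed; unsigned segCount; unsigned libCount; };

// Classify a list of load-command IDs the way the text above describes.
SniffResult sniffSketch(const std::vector<uint32_t>& cmds) {
    SniffResult r = { false, 0, 0 };
    for (uint32_t cmd : cmds) {
        switch (cmd) {
            case kLC_DYLD_INFO:
            case kLC_DYLD_INFO_ONLY:
            case kLC_DYLD_CHAINED_FIXUPS:
                r.compressed = true;      // any of these selects the Compressed loader
                break;
            case kLC_SEGMENT_64:
                ++r.segCount;
                break;
            case kLC_LOAD_DYLIB:
            case kLC_LOAD_WEAK_DYLIB:
            case kLC_REEXPORT_DYLIB:
            case kLC_LOAD_UPWARD_DYLIB:
                ++r.libCount;
                break;
            default:
                break;                    // other commands do not affect these values
        }
    }
    return r;
}
```

An image whose commands include LC_DYLD_INFO_ONLY, three segments and two dylib loads would therefore come back as `compressed == true`, `segCount == 3`, `libCount == 2`.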

Depending on compressed, either ImageLoaderMachOCompressed::instantiateMainExecutable() or ImageLoaderMachOClassic::instantiateMainExecutable() is called. Taking ImageLoaderMachOCompressed::instantiateMainExecutable() as the example, its source is:

ImageLoaderMachOCompressed* ImageLoaderMachOCompressed::instantiateMainExecutable(const macho_header* mh, uintptr_t slide, const char* path, 
																		unsigned int segCount, unsigned int libCount, const LinkContext& context)
{
	ImageLoaderMachOCompressed* image = ImageLoaderMachOCompressed::instantiateStart(mh, path, segCount, libCount);

	// set slide for PIE programs
	image->setSlide(slide);

	// for PIE record end of program, to know where to start loading dylibs
	if ( slide != 0 )
		fgNextPIEDylibAddress = (uintptr_t)image->getEnd();

	image->disableCoverageCheck();
	image->instantiateFinish(context);
	image->setMapped(context);

	if ( context.verboseMapping ) {
		dyld::log("dyld: Main executable mapped %s\n", path);
		for(unsigned int i=0, e=image->segmentCount(); i < e; ++i) {
			const char* name = image->segName(i);
			if ( (strcmp(name, "__PAGEZERO") == 0) || (strcmp(name, "__UNIXSTACK") == 0)  )
				dyld::log("%18s at 0x%08lX->0x%08lX\n", name, image->segPreferredLoadAddress(i), image->segPreferredLoadAddress(i)+image->segSize(i));
			else
				dyld::log("%18s at 0x%08lX->0x%08lX\n", name, image->segActualLoadAddress(i), image->segActualEndAddress(i));
		}
	}

	return image;
}

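ImageLoaderMachOCompressed::instantiateStart() builds an ImageLoaderMachOCompressed object, image, from the main executable's Mach-O information. The slide is then set on image. image->disableCoverageCheck() disables the coverage check. image->instantiateFinish(context) calls parseLoadCmds() to parse the remaining load commands; it sets up some protected member fields, the dynamic linking information and the symbol table information, and fixes up the links. Finally the image is marked as mapped.

3.6 Check whether any versioned library is newer and, if so, override the existing one

This function reads the DYLD_VERSIONED_LIBRARY_PATH and DYLD_VERSIONED_FRAMEWORK_PATH environment variables and compares the version of each library found there with the version of the library that would otherwise be loaded. If the library on the versioned path is newer, it replaces the older one.

3.7 Load any inserted dynamic libraries

This step iterates over the list of dynamic libraries named in the DYLD_INSERT_LIBRARIES environment variable and calls loadInsertedDylib() for each. loadInsertedDylib() calls load() to do the work. load() calls loadPhase0() to try loading from a file; loadPhase0() calls down through the successive phases to locate the library's path, all the way to loadPhase6(). The search order is DYLD_ROOT_PATH -> LD_LIBRARY_PATH -> DYLD_FRAMEWORK_PATH -> the path itself -> DYLD_FALLBACK_LIBRARY_PATH. Once the file is found, ImageLoaderMachO::instantiateFromFile() instantiates an ImageLoader, and checkandAddImage() validates the image and adds it to the global image list sAllImages through addImage(). If loadPhase0() returns NULL, meaning the library was not found on any path, dyld tries the shared cache and, if the library is found there, loads it with ImageLoaderMachO::instantiateFromCache(); otherwise it throws an image-not-found exception. Simplified source: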

ImageLoader* load(const char* path, const LoadContext& context, unsigned& cacheIndex)
{
	...
	ImageLoader* image = loadPhase0(path, orgPath, context, cacheIndex, NULL);
	if ( image != NULL ) {
		CRSetCrashLogMessage2(NULL);
		return image;
	}
	...
	if ( image == NULL)
		image = loadPhase2cache(path, orgPath, context, cacheIndex, &exceptions);

    CRSetCrashLogMessage2(NULL);
	...
}

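3.8 Link the main executable

This step calls link() to link the main executable. It goes through the ImageLoader's own link() method, which fixes up the dynamic data of the already instantiated main executable so that the process can use it, including fixups to the main executable's symbol table. Simplified source: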

void ImageLoader::link(const LinkContext& context, bool forceLazysBound, bool preflightOnly, bool neverUnload, const RPathChain& loaderRPaths, const char* imagePath)
{
	...
	/// Recursively load the dynamic libraries the image depends on
	this->recursiveLoadLibraries(context, preflightOnly, loaderRPaths, imagePath);

	context.clearAllDepths();
	/// Recursively compute depths so the image and its dependents can be sorted
	this->recursiveUpdateDepth(context.imageCount());
	...
	/// Recursively rebase
	this->recursiveRebaseWithAccounting(context);

	if ( !context.linkingMainExecutable )
		/// Recursively bind symbols
		this->recursiveBindWithAccounting(context, forceLazysBound, neverUnload);

	/// We are linking the main executable here, so none of the !context.linkingMainExecutable branches run
	if ( !context.linkingMainExecutable )
		/// Weak binding
		this->weakBind(context);

	// interpose any dynamically loaded images
	if ( !context.linkingMainExecutable && (fgInterposingTuples.size() != 0) ) {
		dyld3::ScopedTimer timer(DBG_DYLD_TIMING_APPLY_INTERPOSING, 0, 0, 0);
		this->recursiveApplyInterposing(context);
	}

	// now that all fixups are done, make __DATA_CONST segments read-only
	if ( !context.linkingMainExecutable )
		this->recursiveMakeDataReadOnly(context);

    if ( !context.linkingMainExecutable )
        context.notifyBatch(dyld_image_state_bound, false);
	uint64_t t6 = mach_absolute_time();

	if ( context.registerDOFs != NULL ) {
		std::vector<DOFInfo> dofs;
		/// Register the image's DOF sections
		this->recursiveGetDOFSections(context, dofs);
		context.registerDOFs(dofs);
	}
	...
}

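recursiveLoadLibraries() recursively loads the dynamic libraries the program depends on. It calls doGetDependentLibraries() to collect all dependencies from LC_LOAD_DYLIB, LC_LOAD_WEAK_DYLIB, LC_REEXPORT_DYLIB and LC_LOAD_UPWARD_DYLIB, then loops over them and loads each one through context.loadLibrary(). From the context setup earlier (gLinkContext.loadLibrary = &libraryLocator) we can see this is really libraryLocator(), which again goes through load(). After all dependencies are loaded, they are recursively sorted so that depended-upon libraries come first in the list. Then recursiveRebaseWithAccounting() performs the rebase: it calls recursiveRebase(), which calls doRebase(), which calls ImageLoaderMachO's rebase() to process rebase_size bytes of rebase information starting at fDyldInfo->rebase_off. Rebasing is needed because an image may be loaded at a different base address than the one it was linked for. Setting the DYLD_PRINT_REBASINGS environment variable prints the related log.

[Screenshot: setting DYLD_PRINT_REBASINGS]

[Screenshot: the resulting log output]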
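
The arithmetic behind a rebase is small enough to show directly. The following is only a sketch under simplifying assumptions: `image` is a plain vector standing in for a mapped data segment, the slot indexes stand in for the locations the rebase opcodes would yield, and `computeSlide` and `rebaseSlots` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// slide = actual load address - preferred (link-time) load address
int64_t computeSlide(uint64_t actualBase, uint64_t preferredBase) {
    return static_cast<int64_t>(actualBase - preferredBase);
}

// Add `slide` to every pointer-sized slot named in `slotIndexes`.
// Slots that are not listed (plain data) are left untouched.
void rebaseSlots(std::vector<uint64_t>& image,
                 const std::vector<size_t>& slotIndexes,
                 int64_t slide) {
    for (size_t idx : slotIndexes) {
        assert(idx < image.size());
        // Unsigned addition wraps, so a negative slide subtracts as expected.
        image[idx] += static_cast<uint64_t>(slide);
    }
}
```

Applying the same function with `-slide` undoes the adjustment, which is the idea behind the `sMainExecutable->rebase(gLinkContext, -mainExecutableSlide)` call in `_main` shown earlier.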

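recursiveBindWithAccounting() calls recursiveBind(), which binds symbols recursively. Normally only non-lazy symbols are bound immediately, unless one of the following applies:

DYLD_BIND_AT_LAUNCH is set, which causes lazy symbols to be bound immediately
Certain APIs are used (e.g. RTLD_NOW), which also cause lazy symbols to be bound immediately

The rest of the binding code, simplified: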

void ImageLoader::recursiveBind(const LinkContext& context, bool forceLazysBound, bool neverUnload)
{
	// Normally just non-lazy pointers are bound immediately.
	// The exceptions are:
	//   1) DYLD_BIND_AT_LAUNCH will cause lazy pointers to be bound immediately
	//   2) some API's (e.g. RTLD_NOW) can cause lazy pointers to be bound immediately
	if ( fState < dyld_image_state_bound ) {
		// break cycles
		fState = dyld_image_state_bound;
	
		try {
			// bind lower level libraries first
			for(unsigned int i=0; i < libraryCount(); ++i) {
				ImageLoader* dependentImage = libImage(i);
				if ( dependentImage != NULL )
					dependentImage->recursiveBind(context, forceLazysBound, neverUnload);
			}
			// bind this image
			this->doBind(context, forceLazysBound);	
			...
		}
		catch (const char* msg) {
			...
		}
	}
}
void ImageLoaderMachOCompressed::doBind(const LinkContext& context, bool forceLazysBound)
{
	...
	if ( this->usablePrebinding(context) ) {
		// don't need to bind
		// except weak which may now be inline with the regular binds
		if (this->participatesInCoalescing()) {
			// run through all binding opcodes
			eachBind(context, ^(const LinkContext& ctx, ImageLoaderMachOCompressed* image,
								uintptr_t addr, uint8_t type, const char* symbolName,
								uint8_t symbolFlags, intptr_t addend, long libraryOrdinal,
								ExtraBindData *extraBindData,
								const char* msg, LastLookup* last, bool runResolver) {
				if ( libraryOrdinal != BIND_SPECIAL_DYLIB_WEAK_LOOKUP )
					return (uintptr_t)0;
				return ImageLoaderMachOCompressed::bindAt(ctx, image, addr, type, symbolName, symbolFlags,
														  addend, libraryOrdinal, extraBindData,
														  msg, last, runResolver);
			});
		}
	}
	else {
		...
		if ( fChainedFixups != NULL ) {
			...
		}
		else {
			// run through all binding opcodes
			eachBind(context, ^(const LinkContext& ctx, ImageLoaderMachOCompressed* image,
								uintptr_t addr, uint8_t type, const char* symbolName,
								uint8_t symbolFlags, intptr_t addend, long libraryOrdinal,
								ExtraBindData *extraBindData,
								const char* msg, LastLookup* last, bool runResolver) {
				return ImageLoaderMachOCompressed::bindAt(ctx, image, addr, type, symbolName, symbolFlags,
														  addend, libraryOrdinal, extraBindData,
														  msg, last, runResolver);
			});
			...
	}
	..
}
uintptr_t ImageLoaderMachOCompressed::bindAt{
	...
	return image->bindLocation(context, image->imageBaseAddress(), addr, symbolAddress,
    	type, symbolName, addend, image->getPath(), targetImage ? targetImage->getPath() : 
    	NULL, msg, extraBindData, image->fSlide);
}

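eachBind walks the bind_size bytes of binding information starting at fDyldInfo->bind_off to obtain each location and symbol name, and then calls ImageLoaderMachOCompressed::bindAt() to bind it. bindAt() calls image->resolve() to get the symbol's address and then image->bindLocation() to update the binding. The kinds of location to bind are:

BIND_TYPE_POINTER: the location is a pointer. The computed new value is stored directly.
BIND_TYPE_TEXT_ABSOLUTE32: a 32-bit value. The low 32 bits of the computed value are stored.
BIND_TYPE_TEXT_PCREL32: a PC-relative reference. The stored value is computed by subtracting the address being fixed up from the new value.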
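
The three cases can be summarized as one small function. This is a sketch rather than dyld's bindLocation(): `valueToStore` is a hypothetical helper, the constants use the values from `<mach-o/loader.h>`, and the `+ 4` in the PC-relative case (displacement measured from the end of the 4-byte field) is an assumption of this example.

```cpp
#include <cassert>
#include <cstdint>

// Bind type values as defined in <mach-o/loader.h>.
constexpr uint8_t kBIND_TYPE_POINTER         = 1;
constexpr uint8_t kBIND_TYPE_TEXT_ABSOLUTE32 = 2;
constexpr uint8_t kBIND_TYPE_TEXT_PCREL32    = 3;

// Return the value that would be written at `location` for a symbol that
// resolved to `newValue`. For the two 32-bit kinds the result fits in 32 bits.
uint64_t valueToStore(uint8_t type, uint64_t location, uint64_t newValue) {
    switch (type) {
        case kBIND_TYPE_POINTER:
            return newValue;                               // full pointer
        case kBIND_TYPE_TEXT_ABSOLUTE32:
            return static_cast<uint32_t>(newValue);        // low 32 bits only
        case kBIND_TYPE_TEXT_PCREL32:
            // Assumption: displacement relative to the end of the 4-byte field.
            return static_cast<uint32_t>(newValue - (location + 4));
        default:
            assert(false && "unknown bind type");
            return 0;
    }
}
```

On arm64 almost every bind is a BIND_TYPE_POINTER, so in practice the first branch is the one that matters for the images discussed in this article.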

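Setting DYLD_PRINT_BINDINGS prints the binding information.

[Screenshot: setting DYLD_PRINT_BINDINGS]

[Screenshot: log output after running]

recursiveGetDOFSections() recursively collects the DOF information. Finally registerDOFs() registers the program's DOF sections for use by dtrace.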

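3.9 Link all inserted libraries

This step is the same as linking the main executable: link() is called, this time on the libraries in sAllImages other than the main executable, which are the dynamic libraries inserted earlier. After linking, registerInterposing() registers symbol replacements. It reads the replacement and replacee functions from the __DATA,__interpose section and stores them in ImageLoader::fgInterposingTuples in preparation for the replacement that follows. Next, ImageLoader::applyInterposing() is called; it calls recursiveApplyInterposing(), which recurses and eventually calls doInterpose() to replace symbols. doInterpose() calls eachBind() and eachLazyBind() to apply interposing to regular and lazily bound symbols respectively, and those call interposeAt() for the final replacement. interposeAt() calls interposedAddress() to look up the address to substitute in fgInterposingTuples and, when one is found, overwrites the pointer. Setting DYLD_PRINT_INTERPOSING prints a log of the replacements, in the same way as the print settings described earlier. Part of the simplified source: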

for(unsigned int i=0; i < sInsertedDylibCount; ++i) {
	ImageLoader* image = sAllImages[i+1];
	link(image, sEnv.DYLD_BIND_AT_LAUNCH, true, ImageLoader::RPathChain(NULL, NULL), -1);
	image->setNeverUnloadRecursive();
}
for(unsigned int i=0; i < sInsertedDylibCount; ++i) {
    ImageLoader* image = sAllImages[i+1];
    image->registerInterposing(gLinkContext);
}
...
// apply interposing to initial set of images
for(int i=0; i < sImageRoots.size(); ++i) {
    sImageRoots[i]->applyInterposing(gLinkContext);
}

ImageLoader::applyInterposingToDyldCache(gLinkContext);
////////////
void ImageLoaderMachO::registerInterposing(const LinkContext& context)
{
	// mach-o files advertise interposing by having a __DATA __interpose section
	...
	for (uint32_t i = 0; i < cmd_count; ++i) {
		switch (cmd->cmd) {
			case LC_SEGMENT_COMMAND:
						...
						if ( ((sect->flags & SECTION_TYPE) == S_INTERPOSING) || ((strcmp(sect->sectname, "__interpose") == 0) && (strcmp(seg->segname, "__DATA") == 0)) ) {
							...
									ImageLoader::fgInterposingTuples.push_back(tuple);
								}
							}
						}
					}
				}
				break;
		}
		cmd = (const struct load_command*)(((char*)cmd)+cmd->cmdsize);
	}
}

void ImageLoader::applyInterposing(const LinkContext& context)
{
	dyld3::ScopedTimer timer(DBG_DYLD_TIMING_APPLY_INTERPOSING, 0, 0, 0);
	if ( fgInterposingTuples.size() != 0 )
		this->recursiveApplyInterposing(context);
}

void ImageLoader::recursiveApplyInterposing(const LinkContext& context)
{
	// interpose lower level libraries first
	for(unsigned int i=0; i < libraryCount(); ++i) {
		ImageLoader* dependentImage = libImage(i);
		if ( dependentImage != NULL )
			dependentImage->recursiveApplyInterposing(context);
	}
		
	// interpose this image
	doInterpose(context);
}

void ImageLoaderMachOCompressed::doInterpose(const LinkContext& context)
{
	if ( !ma->hasChainedFixups() ) {
		eachLazyBind(xxx) {
			return ImageLoaderMachOCompressed::interposeAt(xxx);
		});

	  	// 2) non-lazy pointers in the dyld cache need to be interposed
		if ( ma->inDyldCache() ) {
			eachBind(xxx) {
				return ImageLoaderMachOCompressed::interposeAt(...);
			});
		}
	}
}
uintptr_t ImageLoaderMachOCompressed::interposeAt(...)
{
	if ( type == BIND_TYPE_POINTER ) {
		uintptr_t* fixupLocation = (uintptr_t*)addr;
		uintptr_t curValue = *fixupLocation;
		uintptr_t newValue = interposedAddress(context, curValue, image);
		if ( newValue != curValue) {
			*fixupLocation = newValue;
		}
	}
	return 0;
}
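
The lookup that interposeAt() relies on can be pictured as a search through a table of (replacement, replacee) pairs. The sketch below is a simplification with hypothetical names (`InterposeTuple`, `interposedAddressSketch`, `applyInterposingSketch`); among other things it leaves out the rule that lets the interposing image itself keep calling the original function.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One table entry: pointers to `replacee` should be redirected to `replacement`.
struct InterposeTuple {
    uintptr_t replacement;
    uintptr_t replacee;
};

// Return the replacement for `address`, or `address` itself if nothing interposes it.
uintptr_t interposedAddressSketch(const std::vector<InterposeTuple>& tuples,
                                  uintptr_t address) {
    for (const InterposeTuple& t : tuples) {
        if (t.replacee == address)
            return t.replacement;
    }
    return address;
}

// Patch every pointer slot in place, as interposeAt() does for BIND_TYPE_POINTER.
// Returns how many slots were changed.
size_t applyInterposingSketch(const std::vector<InterposeTuple>& tuples,
                              std::vector<uintptr_t>& slots) {
    size_t patched = 0;
    for (uintptr_t& slot : slots) {
        uintptr_t newValue = interposedAddressSketch(tuples, slot);
        if (newValue != slot) {
            slot = newValue;
            ++patched;
        }
    }
    return patched;
}
```

Running the patch a second time changes nothing, because the slots already hold the replacement addresses.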

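3.10 Perform weak binding: only after all inserted images have been linked

By this point all inserted dynamic libraries have been interposed. Weak binding first obtains the ImageLoaders that take part in coalescing by calling getCoalescedImages(), which gathers every image containing weak symbols into a list. It then loops over them to count how many ImageLoaders have not yet been weak bound. After that, initializeCoalIterator() is called for each image to set up an iterator over the weak symbols that need binding, and the iterators are kept sorted. incrementCoalIterator() reads the offset and size of the image's weak binding data, given by weak_bind_off and weak_bind_size, to locate the weak symbol data. The iterators are then re-sorted, getAddressCoalIterator() looks up the symbol's address, and once an address is found updateUsesCoalIterator() performs the binding. updateUsesCoalIterator() calls bindLocation(), which was described above. Simplified source: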

void ImageLoader::weakBind(const LinkContext& context)
{
	...
	// get the ImageLoaders that participate in coalescing
	ImageLoader* imagesNeedingCoalescing[fgImagesRequiringCoalescing];
	unsigned imageIndexes[fgImagesRequiringCoalescing];
	int count = context.getCoalescedImages(imagesNeedingCoalescing, imageIndexes);
	
	// count how many ImageLoaders have not yet had weak binding done
	int countNotYetWeakBound = 0;
	int countOfImagesWithWeakDefinitionsNotInSharedCache = 0;
	for(int i=0; i < count; ++i) {
		if ( ! imagesNeedingCoalescing[i]->weakSymbolsBound(imageIndexes[i]) )
			++countNotYetWeakBound;
		if ( ! imagesNeedingCoalescing[i]->inSharedCache() )
			++countOfImagesWithWeakDefinitionsNotInSharedCache;
	}

	if ( (countOfImagesWithWeakDefinitionsNotInSharedCache > 0) && (countNotYetWeakBound > 0) ) {
		if (!context.weakDefMapInitialized) {
			// initialize the weak definition map, since the link context does not run static initializers
			new (&context.weakDefMap) dyld3::Map<const char*, std::pair<const ImageLoader*, uintptr_t>, ImageLoader::HashCString, ImageLoader::EqualCString>();
			context.weakDefMapInitialized = true;
		}
		ImageLoader::CoalIterator iterators[count];
		ImageLoader::CoalIterator* sortedIts[count];
		/// set up and sort the iterators over the weak symbols that need binding
		for(int i=0; i < count; ++i) {
			imagesNeedingCoalescing[i]->initializeCoalIterator(iterators[i], i, imageIndexes[i]);
			sortedIts[i] = &iterators[i];
		}
		int doneCount = 0;
		while ( doneCount != count ) {
		
			// read weak_bind_off and weak_bind_size from the image's dyld info to locate the weak symbol data
			if ( sortedIts[0]->image->incrementCoalIterator(*sortedIts[0]) )
				++doneCount; 
			// re-sort
			for(int i=1; i < count; ++i) {
				int result = strcmp(sortedIts[i-1]->symbolName, sortedIts[i]->symbolName);
				if ( result == 0 )
					sortedIts[i-1]->symbolMatches = true;
				if ( result > 0 ) {
					// new one is bigger then next, so swap
					ImageLoader::CoalIterator* temp = sortedIts[i-1];
					sortedIts[i-1] = sortedIts[i];
					sortedIts[i] = temp;
				}
				if ( result < 0 )
					break;
			}
			if ( sortedIts[0]->symbolMatches && !sortedIts[0]->done ) {
				....
				for(int i=0; i < count; ++i) {
					if ( strcmp(iterators[i].symbolName, nameToCoalesce) == 0 ) {
						if ( iterators[i].weakSymbol ) {
							if ( targetAddr == 0 ) {
								/// look up the symbol's address in the export table, following image load order
								targetAddr = iterators[i].image->getAddressCoalIterator(iterators[i], context);
							}
						}
						else {
							/// look up the symbol's address in the export table, following image load order
							targetAddr = iterators[i].image->getAddressCoalIterator(iterators[i], context);
						}
					}
				}
				if ( targetAddr != 0 ) {
					for(int i=0; i < count; ++i) {
						if ( strcmp(iterators[i].symbolName, nameToCoalesce) == 0 ) {
							if ( ! iterators[i].image->weakSymbolsBound(imageIndexes[i]) )
								
								iterators[i].image->updateUsesCoalIterator(...);
						}
					}
				}
			}
		}

		...
}
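
The selection rule inside the inner loop of the listing above (keep the first weak definition, let a non-weak definition take over) can be isolated into a few lines. This is a sketch with hypothetical names (`WeakCandidate`, `coalesce`); it models only the choice of target address, not the iterator machinery or the final bindLocation() call.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// One image's definition of a symbol that takes part in coalescing.
struct WeakCandidate {
    std::string name;
    bool        weak;      // true: weak definition, false: regular definition
    uintptr_t   address;
};

// Choose the address all images should use for `name`.
// `inLoadOrder` lists candidates in image load order. Returns 0 if nothing matches.
uintptr_t coalesce(const std::vector<WeakCandidate>& inLoadOrder,
                   const std::string& name) {
    uintptr_t target = 0;
    for (const WeakCandidate& c : inLoadOrder) {
        if (c.name != name)
            continue;
        if (c.weak) {
            if (target == 0)
                target = c.address;   // first weak definition is kept
        } else {
            target = c.address;       // a non-weak definition takes over
        }
    }
    return target;
}
```

With two weak definitions of the same name the earlier-loaded one is used; adding a regular definition anywhere in the list makes that one the target instead.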

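3.11 Run initializers for the main executable

This step is done by initializeMainExecutable(). All inserted dylibs are initialized first, then the main executable. initializeMainExecutable() calls runInitializers(), which in turn calls processInitializers() and then recursiveInitialization(). Below is a simplified recursiveInitialization():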

void ImageLoader::recursiveInitialization(...)
{
	context.notifySingle(dyld_image_state_dependents_initialized, this, &timingInfo);
	
	// initialize this image
	bool hasInitializers = this->doInitialization(context);
	// let anyone know we finished initializing this image
	context.notifySingle(dyld_image_state_initialized, this, NULL);
}

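It first calls notifySingle() to tell objc that the image is about to be initialized, then calls doInitialization() to initialize the image, and afterwards calls notifySingle() again to tell objc that initialization is complete. Simplified source of notifySingle: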

static void notifySingle(...)
{
	...
	if ( (state == dyld_image_state_dependents_initialized) && (sNotifyObjCInit != NULL) && image->notifyObjC() ) {
		uint64_t t0 = mach_absolute_time();
		dyld3::ScopedTimer timer(DBG_DYLD_TIMING_OBJC_INIT, (uint64_t)image->machHeader(), 0, 0);
		(*sNotifyObjCInit)(image->getRealPath(), image->machHeader());
		..
	}
	...
}
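
Putting recursiveInitialization() and notifySingle() together, the order of events per image can be sketched as follows. `ImageSketch` and `recursiveInitSketch` are hypothetical names, the log strings are made up for the example, and a plain `initialized` flag stands in for dyld's image state machine.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Minimal stand-in for an ImageLoader in this sketch.
struct ImageSketch {
    std::string               name;
    std::vector<ImageSketch*> dependents;        // libraries this image links against
    bool                      initialized = false;
};

// Initialize lower-level libraries first, then notify (the point where objc
// would run +load), then run this image's own initializers.
void recursiveInitSketch(ImageSketch& image, std::vector<std::string>& log) {
    if (image.initialized)
        return;                                   // breaks cycles, avoids double init
    image.initialized = true;
    for (ImageSketch* dep : image.dependents) {
        if (dep != nullptr)
            recursiveInitSketch(*dep, log);
    }
    log.push_back("notify dependents_initialized: " + image.name);
    log.push_back("run initializers: " + image.name);
    log.push_back("notify initialized: " + image.name);
}
```

For an app that depends on a single library, the log shows all three events for the library before any event for the app, which matches the "dependents first" ordering in the dyld source.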

The item to focus on here is sNotifyObjCInit, the objc initialization notification. It is defined as:

static _dyld_objc_notify_init		sNotifyObjCInit;

Searching for where sNotifyObjCInit is assigned gives:

void registerObjCNotifiers(_dyld_objc_notify_mapped mapped, _dyld_objc_notify_init init, _dyld_objc_notify_unmapped unmapped)
{
	sNotifyObjCMapped	= mapped;
	sNotifyObjCInit		= init;
	sNotifyObjCUnmapped = unmapped;
	...
}

A global search for callers of registerObjCNotifiers gives:

void _dyld_objc_notify_register(_dyld_objc_notify_mapped    mapped,
                                _dyld_objc_notify_init      init,
                                _dyld_objc_notify_unmapped  unmapped)
{
	dyld::registerObjCNotifiers(mapped, init, unmapped);
}

A global search for _dyld_objc_notify_register leads to its declaration:

// 
// Note: only for use by objc runtime
// Register handlers to be called when objc images are mapped, unmapped, and initialized.
// Dyld will call back the "mapped" function with an array of images that contain an objc-image-info section.
// Those images that are dylibs will have the ref-counts automatically bumped, so objc will no longer need to
// call dlopen() on them to keep them from being unloaded.  During the call to _dyld_objc_notify_register(),
// dyld will call the "mapped" function with already loaded objc images.  During any later dlopen() call,
// dyld will also call the "mapped" function.  Dyld will call the "init" function when dyld would be called
// initializers in that image.  This is when objc calls any +load methods in that image.
//
void _dyld_objc_notify_register(_dyld_objc_notify_mapped    mapped,
                                _dyld_objc_notify_init      init,
                                _dyld_objc_notify_unmapped  unmapped);

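In short: this function is for the objc runtime only. It registers the handlers that dyld calls when objc images are mapped, unmapped, and initialized. dyld calls the "mapped" function with an array of images that contain an objc-image-info section. During the call to _dyld_objc_notify_register(), dyld calls "mapped" for the objc images that are already loaded, and it does so again during any later dlopen(). dyld calls the "init" function at the point where it would run the initializers of that image, and that is when objc calls the +load methods in the image. To verify this, create a new iOS project and set breakpoints as follows:

Sample code: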

#import "ViewController.h"

@interface ViewController ()

@end

@implementation ViewController

+(void)load {
    NSLog(@"--------");
}

- (void)viewDidLoad {
    [super viewDidLoad];
    // Do any additional setup after loading the view.
}


@end

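Run the app; the breakpoint is hit (screenshot):

The call stack on the left shows the call order: libSystem_initializer, libdispatch_init, _os_object_init, _objc_init, _dyld_objc_notify_register. It also shows that libSystem_initializer is called by doModInitFunctions, which in turn is called by doInitialization. Simplified source: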

bool ImageLoaderMachO::doInitialization(const LinkContext& context)
{
	// mach-o has -init and static initializers
	doImageInit(context);
	doModInitFunctions(context);
	...
	return (fHasDashInit || fHasInitializers);
}

Simplified source of doImageInit:

void ImageLoaderMachO::doImageInit(const LinkContext& context)
{
	if ( fHasDashInit ) {
		const uint32_t cmd_count = ((macho_header*)fMachOData)->ncmds;
		const struct load_command* const cmds = (struct load_command*)&fMachOData[sizeof(macho_header)];
		const struct load_command* cmd = cmds;
		for (uint32_t i = 0; i < cmd_count; ++i) {
			switch (cmd->cmd) {
				case LC_ROUTINES_COMMAND:
					Initializer func = (Initializer)(((struct macho_routines_command*)cmd)->init_address + fSlide);
					...
					break;
			}
			cmd = (const struct load_command*)(((char*)cmd)+cmd->cmdsize);
		}
	}
}

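This function initializes the image by calling the routine named in the LC_ROUTINES_COMMAND load command. Setting the DYLD_PRINT_INITIALIZERS environment variable prints a log of these calls. Simplified source of doModInitFunctions: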

void ImageLoaderMachO::doModInitFunctions(const LinkContext& context)
{
	if ( fHasInitializers ) {
		...
		for (uint32_t i = 0; i < cmd_count; ++i) {
			if ( cmd->cmd == LC_SEGMENT_COMMAND ) {
				...
				for (const struct macho_section* sect=sectionsStart; sect < sectionsEnd; ++sect) {
					const uint8_t type = sect->flags & SECTION_TYPE;
					if ( type == S_MOD_INIT_FUNC_POINTERS ) {
						Initializer* inits = (Initializer*)(sect->addr + fSlide);
						const size_t count = sect->size / sizeof(uintptr_t);
						...
						for (size_t j=0; j < count; ++j) {
							Initializer func = inits[j];
							...
							bool haveLibSystemHelpersBefore = (dyld::gLibSystemHelpers != NULL);
							{
								dyld3::ScopedTimer(DBG_DYLD_TIMING_STATIC_INITIALIZER, (uint64_t)fMachOData, (uint64_t)func, 0);
								func(context.argc, context.argv, context.envp, context.apple, &context.programVars);
							}
							bool haveLibSystemHelpersAfter = (dyld::gLibSystemHelpers != NULL);
							if ( !haveLibSystemHelpersBefore && haveLibSystemHelpersAfter ) {
								// now safe to use malloc() and other calls in libSystem.dylib
								dyld::gProcessInfo->libSystemInitialized = true;
							}
						}
					}
					else if ( type == S_INIT_FUNC_OFFSETS ) {
						const uint32_t* inits = (uint32_t*)(sect->addr + fSlide);
						...
						for (size_t j=0; j < count; ++j) {
							uint32_t funcOffset = inits[j];
							...
                            Initializer func = (Initializer)((uint8_t*)this->machHeader() + funcOffset);
							bool haveLibSystemHelpersBefore = (dyld::gLibSystemHelpers != NULL);
							{
								dyld3::ScopedTimer(DBG_DYLD_TIMING_STATIC_INITIALIZER, (uint64_t)fMachOData, (uint64_t)func, 0);
                                func(context.argc, context.argv, context.envp, context.apple, &context.programVars);
                            }
							bool haveLibSystemHelpersAfter = (dyld::gLibSystemHelpers != NULL);
							if ( !haveLibSystemHelpersBefore && haveLibSystemHelpersAfter ) {
								dyld::gProcessInfo->libSystemInitialized = true;
							}
						}
					}
				}
			}
			cmd = (const struct load_command*)(((char*)cmd)+cmd->cmdsize);
		}
	}
}
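
Stripped of the Mach-O details, the loop above filters sections by type and then calls every function pointer stored in the matching ones. The sketch below shows that shape in portable C. The `demo_section` struct is a hypothetical, heavily reduced stand-in for a real section header (it holds a direct pointer to the array instead of addr, size, and slide), and the constants are redefined under `DEMO_` names with the values used in `<mach-o/loader.h>` so that the example builds without Apple headers.

```c
#include <stddef.h>
#include <stdint.h>

/* Same values as SECTION_TYPE and S_MOD_INIT_FUNC_POINTERS in <mach-o/loader.h>,
 * redefined here so the sketch compiles on any platform. */
#define DEMO_SECTION_TYPE             0x000000ffu
#define DEMO_S_MOD_INIT_FUNC_POINTERS 0x9u

typedef void (*demo_initializer)(void);

/* Hypothetical reduced section header: flags plus the initializer array. */
typedef struct {
    uint32_t          flags;
    demo_initializer *inits;
    size_t            count;
} demo_section;

/* Records which initializers ran, and in which order. */
static int demo_run_log[8];
static int demo_run_count = 0;

static void demo_init_a(void)   { demo_run_log[demo_run_count++] = 1; }
static void demo_init_b(void)   { demo_run_log[demo_run_count++] = 2; }
static void demo_not_init(void) { demo_run_log[demo_run_count++] = 99; }

/* Call every pointer in every section whose type is "mod init func pointers".
 * Returns the number of initializers called. */
static size_t demo_run_initializers(const demo_section *sects, size_t nsects) {
    size_t called = 0;
    for (size_t s = 0; s < nsects; ++s) {
        uint32_t type = sects[s].flags & DEMO_SECTION_TYPE;
        if (type != DEMO_S_MOD_INIT_FUNC_POINTERS) continue;
        for (size_t j = 0; j < sects[s].count; ++j) {
            sects[s].inits[j]();
            ++called;
        }
    }
    return called;
}

/* Builds two sections, only one of which has the initializer type, and runs them. */
static size_t demo_section_walk(void) {
    static demo_initializer text_funcs[] = { demo_not_init };
    static demo_initializer init_funcs[] = { demo_init_a, demo_init_b };
    demo_section sects[] = {
        { 0x80000400u, text_funcs, 1 },                    /* type bits are 0: skipped */
        { DEMO_S_MOD_INIT_FUNC_POINTERS, init_funcs, 2 },  /* initializer section */
    };
    return demo_run_initializers(sects, 2);
}
```

Calling `demo_section_walk()` runs `demo_init_a` and `demo_init_b` in array order and never touches `demo_not_init`, because the upper attribute bits of `flags` are masked off before the type comparison.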

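This code walks the LC_SEGMENT_COMMAND load commands to find the sections in the image whose type (flags & SECTION_TYPE) is S_MOD_INIT_FUNC_POINTERS, reads the function pointers stored there, and calls each one. For libSystem that pointer is libSystem_initializer, which performs a series of initializations, so in that case func(..) in the source is the call to libSystem_initializer. Before verifying this, here is the simplified source of ScopedTimer: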

class VIS_HIDDEN ScopedTimer {
public:
    ScopedTimer(uint32_t code, kt_arg data1, kt_arg data2, kt_arg data3)
        : code(code), data1(data1), data2(data2), data3(data3), data4(0), data5(0), data6(0) {
#if BUILDING_LIBDYLD || BUILDING_DYLD
        startTimer();
#endif
    }

    ~ScopedTimer() {
#if BUILDING_LIBDYLD || BUILDING_DYLD
        endTimer();
#endif
    }
    ..
    void startTimer();
    void endTimer();
    ...
};

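Now to verify that func(..) really calls libSystem_initializer. Keep the earlier breakpoints and run on a real device; the _dyld_objc_notify_register breakpoint is hit, as shown:

Click the frame marked by the red arrow to see where doModInitFunctions calls libSystem_initializer (screenshot):

Note the address 0x1029fb2c0. Only the last three hex digits need to be remembered: because of ASLR the base address changes from launch to launch, so on the next run the address will not be 0x1029fb2c0, but its low digits will be the same.

Next, set a breakpoint on doModInitFunctions, as shown:

Run the app; the doModInitFunctions breakpoint is hit (note where the red arrow points, it is used in the analysis below):

blr is a branch-with-link instruction that takes its target from a register: it saves the return address in x30 (the link register) and jumps to the address held in x24.

Comparing the red arrow with the doModInitFunctions and ScopedTimer sources shows that blr x24 corresponds to the func(..) call above. This can also be checked using the address in x24 and the first address of the function reached after the jump. Find the address ending in 2c0, here 0x1045f32c0, and set a breakpoint on it: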

(lldb) br set -a 0x1045f32c0
Breakpoint 9: where = dyld`ImageLoaderMachO::doModInitFunctions(ImageLoader::LinkContext const&) + 424, address = 0x00000001045f32c0
(lldb) 
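
Why the last three hex digits are enough: assuming the ASLR slide is always a whole number of pages, the offset of an instruction inside its page does not change between launches. The small C sketch below checks this with the two addresses seen in this walkthrough (0x1029fb2c0 from the first run and 0x1045f32c0 from the second). The helper names are made up for the example.

```c
#include <stdint.h>

/* Offset of addr inside its page; page_size must be a power of two. */
static uint64_t demo_page_offset(uint64_t addr, uint64_t page_size) {
    return addr & (page_size - 1);
}

/* Difference between the same instruction's address in two launches. */
static uint64_t demo_slide_delta(uint64_t first_run, uint64_t second_run) {
    return second_run - first_run;
}
```

With a 4 KB mask both addresses give the offset 0x2c0, and the difference between them, 0x1bf8000, is a multiple of 0x4000 (16 KB), which is consistent with a page-aligned slide.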

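Because inserted dylibs and the main executable all run their initializers, disable the doModInitFunctions breakpoint for now and keep only the 0x1045f32c0 breakpoint. Continue running; when the 0x1045f32c0 breakpoint is hit, examine x24: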

(lldb) x/s $x24
0x1e34247b8: "\xfffffff6W\xffffffbd\xffffffa9\xfffffff4O\x01\xffffffa9\xfffffffd{\x02\xffffffa9\xfffffffd\xffffff83"
(lldb) 

x24 holds the address 0x1e34247b8. Type si in lldb to step into the call (watch the red arrow in the screenshot):

The red arrow and the analysis above show that func(..) is indeed calling libSystem_initializer. Its source:

static __attribute__((constructor)) 
void libSystem_initializer(int argc, const char* argv[], const char* envp[], const char* apple[], const struct ProgramVars* vars)
{
	_libkernel_functions_t libkernel_funcs = {
		.get_reply_port = _mig_get_reply_port,
		.set_reply_port = _mig_set_reply_port,
		.get_errno = __error,
		.set_errno = cthread_set_errno_self,
		.dlsym = dlsym,
	};

	_libkernel_init(libkernel_funcs);

	bootstrap_init();
	mach_init();
	pthread_init();
	__libc_init(vars, libSystem_atfork_prepare, libSystem_atfork_parent, libSystem_atfork_child, apple);
	__keymgr_initializer();
	_dyld_initializer();
	libdispatch_init();
#if !TARGET_OS_EMBEDDED || __IPHONE_OS_VERSION_MAX_ALLOWED >= 50000 // __IPHONE_5_0
	_libxpc_initializer();
#endif
}
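
The `__attribute__((constructor))` on libSystem_initializer is what makes the compiler record the function as an initializer, so the loader runs it before main(). On Mach-O, pointers to such functions typically end up in a section of type S_MOD_INIT_FUNC_POINTERS (or S_INIT_FUNC_OFFSETS), which is exactly what doModInitFunctions scans. A minimal sketch of the attribute itself, using a GCC/Clang extension and made-up `demo_` names:

```c
/* Set by the constructor below; 0 would mean it never ran. */
static int demo_ctor_ran = 0;

/* GCC/Clang extension: run this function before main() is entered. */
__attribute__((constructor))
static void demo_initializer(void) {
    demo_ctor_ran = 1;
}

/* Lets callers check, from main() or later, that the constructor already ran. */
static int demo_constructor_has_run(void) {
    return demo_ctor_ran;
}
```

By the time main() starts, `demo_constructor_has_run()` already returns 1, although nothing in the program calls `demo_initializer` explicitly.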

Source of libdispatch_init and _os_object_init:

void
libdispatch_init(void)
{
	...
	_dispatch_hw_config_init();
	_dispatch_vtable_init();
	_os_object_init();
}

void
_os_object_init(void)
{
	return _objc_init();
}

Source of _objc_init:

void _objc_init(void)
{
    static bool initialized = false;
    if (initialized) return;
    initialized = true;
    
    // fixme defer initialization until an objc-using image is found?
    environ_init();
    tls_init();
    static_init();
    runtime_init();
    exception_init();
    cache_init();
    _imp_implementationWithBlock_init();

    _dyld_objc_notify_register(&map_images, load_images, unmap_image);

#if __OBJC2__
    didCallDyldNotifyRegister = true;
#endif
}
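
The first three lines of _objc_init are a run-once guard: a function-local static flag makes every call after the first return immediately. A minimal C sketch of that guard follows; the counter and the function name are hypothetical and exist only so the effect can be observed.

```c
#include <stdbool.h>

/* Counts how many times the guarded setup body actually executed. */
static int demo_setup_runs = 0;

static void demo_init_once(void) {
    static bool initialized = false;  /* keeps its value across calls */
    if (initialized) return;          /* second and later calls do nothing */
    initialized = true;

    demo_setup_runs++;                /* stands in for the real setup work */
}
```

Calling `demo_init_once()` several times leaves `demo_setup_runs` at 1. Note that a plain flag like this is not a thread-safe "once" primitive; the sketch only shows the control flow.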

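This function performs a series of initializations and then calls _dyld_objc_notify_register to register its callbacks. As the declaration of _dyld_objc_notify_register above shows, the mapped and init parameters correspond to the runtime's map_images and load_images functions. map_images handles the images that dyld has mapped, and load_images is responsible for calling +load methods. Now step past the _dyld_objc_notify_register breakpoint and continue. The +load breakpoint is hit (screenshot):

Simplified source of load_images: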

void
load_images(const char *path __unused, const struct mach_header *mh)
{
    ...
    // Discover load methods
    prepare_load_methods((const headerType *)mh);
    
    // Call +load methods (without runtimeLock - re-entrant)
    call_load_methods();
}

prepare_load_methods first collects the +load methods of the classes and categories in the image, and call_load_methods then calls them. Simplified source of prepare_load_methods:

void prepare_load_methods(const headerType *mhdr)
{
    size_t count, i;

    runtimeLock.assertLocked();

    classref_t const *classlist = 
        _getObjc2NonlazyClassList(mhdr, &count);
    for (i = 0; i < count; i++) {
        schedule_class_load(remapClass(classlist[i]));
    }

    category_t * const *categorylist = _getObjc2NonlazyCategoryList(mhdr, &count);
    for (i = 0; i < count; i++) {
        category_t *cat = categorylist[i];
        Class cls = remapClass(cat->cls);
        if (!cls) continue;  // category for ignored weak-linked class
        realizeClassWithoutSwift(cls, nil);
        ASSERT(cls->ISA()->isRealized());
        add_category_to_loadable_list(cat);
    }
}

static void schedule_class_load(Class cls)
{
    if (!cls) return;
    ASSERT(cls->isRealized());  // _read_images should realize

    if (cls->data()->flags & RW_LOADED) return;

    // Ensure superclass-first ordering
    schedule_class_load(cls->superclass);

    add_class_to_loadable_list(cls);
    cls->setInfo(RW_LOADED); 
}

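It first walks the classes to collect their +load methods and then walks the categories. For each class, the superclass's +load method is added to the loadable_classes array before the class's own. Category +load methods are added to the loadable_categories array. Source of call_load_methods: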

void call_load_methods(void)
{
    static bool loading = NO;
    bool more_categories;

    loadMethodLock.assertLocked();

    // Re-entrant calls do nothing; the outermost call will finish the job.
    if (loading) return;
    loading = YES;

    void *pool = objc_autoreleasePoolPush();

    do {
        // 1. Repeatedly call class +loads until there aren't any more
        while (loadable_classes_used > 0) {
            call_class_loads();
        }

        // 2. Call category +loads ONCE
        more_categories = call_category_loads();

        // 3. Run more +loads if there are classes OR more untried categories
    } while (loadable_classes_used > 0  ||  more_categories);

    objc_autoreleasePoolPop(pool);

    loading = NO;
}

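The +load methods in the loadable_classes array are called first, followed by those in the loadable_categories array. Two conclusions follow from the source above:

A superclass's +load method is always called before the +load methods of its subclasses.
A class's own +load method is always called before the +load methods of its categories.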
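
Both ordering rules fall out of two small mechanisms: the recursion in schedule_class_load, which queues the superclass before the class, and the fact that call_load_methods drains the class list before the category list. The C sketch below models only those two mechanisms. The `demo_class` struct, the fixed-size arrays, and the string names are assumptions for illustration and have nothing to do with the runtime's real data structures.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

typedef struct demo_class {
    const char        *name;
    struct demo_class *superclass;
    bool               loaded;      /* plays the role of the RW_LOADED flag */
} demo_class;

#define DEMO_MAX 16
static const char *demo_loadable_classes[DEMO_MAX];
static size_t      demo_loadable_classes_used = 0;
static const char *demo_loadable_categories[DEMO_MAX];
static size_t      demo_loadable_categories_used = 0;
static const char *demo_call_order[2 * DEMO_MAX];
static size_t      demo_calls = 0;

/* Queue the superclass chain first, then the class itself, each at most once. */
static void demo_schedule_class_load(demo_class *cls) {
    if (!cls || cls->loaded) return;
    demo_schedule_class_load(cls->superclass);
    if (demo_loadable_classes_used < DEMO_MAX)
        demo_loadable_classes[demo_loadable_classes_used++] = cls->name;
    cls->loaded = true;
}

static void demo_add_category(const char *name) {
    if (demo_loadable_categories_used < DEMO_MAX)
        demo_loadable_categories[demo_loadable_categories_used++] = name;
}

/* "Call" all queued class +loads, then all queued category +loads. */
static void demo_call_load_methods(void) {
    for (size_t i = 0; i < demo_loadable_classes_used; ++i)
        demo_call_order[demo_calls++] = demo_loadable_classes[i];
    demo_loadable_classes_used = 0;
    for (size_t i = 0; i < demo_loadable_categories_used; ++i)
        demo_call_order[demo_calls++] = demo_loadable_categories[i];
    demo_loadable_categories_used = 0;
}

/* Root <- Mid <- Leaf, plus one category, scheduled in a deliberately odd order. */
static size_t demo_load_order(void) {
    static demo_class root = { "Root", NULL,  false };
    static demo_class mid  = { "Mid",  &root, false };
    static demo_class leaf = { "Leaf", &mid,  false };
    demo_add_category("Leaf(Extras)");   /* queued first, still called last */
    demo_schedule_class_load(&leaf);     /* pulls in Root and Mid before Leaf */
    demo_schedule_class_load(&mid);      /* already loaded: no effect */
    demo_call_load_methods();
    return demo_calls;
}
```

After `demo_load_order()`, `demo_call_order` holds Root, Mid, Leaf, Leaf(Extras): superclasses come first even though Leaf was scheduled first, and the category comes last even though it was queued first.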

map_images eventually calls _read_images, which does the real work. Its simplified source:

void _read_images(header_info **hList, uint32_t hCount, int totalClasses, int unoptimizedTotalClasses)
{
    ...
    // Fix up @selector references
    /// 1. Fix up selector references and register them in the global map table named namedSelectors
    static size_t UnfixedSelectors;
    {
        mutex_locker_t lock(selLock);
        for (EACH_HEADER) {
            if (hi->hasPreoptimizedSelectors()) continue;
            bool isBundle = hi->isBundle();
            SEL *sels = _getObjc2SelectorRefs(hi, &count);
            UnfixedSelectors += count;
            for (i = 0; i < count; i++) {
                const char *name = sel_cname(sels[i]);
                SEL sel = sel_registerNameNoLock(name, isBundle);
                if (sels[i] != sel) {
                    sels[i] = sel;
                }
            }
        }
    }
    // Discover classes. Fix up unresolved future classes. Mark bundle classes.
    /// 2. Discover classes: read class info from the image and store it in the global hash table named allocatedClasses
    bool hasDyldRoots = dyld_shared_cache_some_image_overridden();

    for (EACH_HEADER) {
        if (! mustReadClasses(hi, hasDyldRoots)) {
            // Image is sufficiently optimized that we need not call readClass()
            continue;
        }

        classref_t const *classlist = _getObjc2ClassList(hi, &count);

        bool headerIsBundle = hi->isBundle();
        bool headerIsPreoptimized = hi->hasPreoptimizedClasses();

        for (i = 0; i < count; i++) {
            Class cls = (Class)classlist[i];
            Class newCls = readClass(cls, headerIsBundle, headerIsPreoptimized);

            if (newCls != cls  &&  newCls) {
                // Class was moved but not deleted. Currently this occurs 
                // only when the new class resolved a future class.
                // Non-lazily realize the class below.
                resolvedFutureClasses = (Class *)
                    realloc(resolvedFutureClasses, 
                            (resolvedFutureClassCount+1) * sizeof(Class));
                resolvedFutureClasses[resolvedFutureClassCount++] = newCls;
            }
        }
    }

    ts.log("IMAGE TIMES: discover classes");

    // Fix up remapped classes
    /// 3. Fix up references between remapped classes
    if (!noClassesRemapped()) {
        for (EACH_HEADER) {
            Class *classrefs = _getObjc2ClassRefs(hi, &count);
            for (i = 0; i < count; i++) {
                remapClassRef(&classrefs[i]);
            }
            // fixme why doesn't test future1 catch the absence of this?
            classrefs = _getObjc2SuperRefs(hi, &count);
            for (i = 0; i < count; i++) {
                remapClassRef(&classrefs[i]);
            }
        }
    }

    ts.log("IMAGE TIMES: remap classes");

#if SUPPORT_FIXUP
    // Fix up old objc_msgSend_fixup call sites
    /// 4. Fix up old objc_msgSend_fixup call sites
    for (EACH_HEADER) {
        ...
        for (i = 0; i < count; i++) {
            fixupMessageRef(refs+i);
        }
    }
#endif

    // Discover protocols. Fix up protocol refs.
    /// 5. Discover protocols and store them in a global map table
    for (EACH_HEADER) {
        extern objc_class OBJC_CLASS_$_Protocol;
        Class cls = (Class)&OBJC_CLASS_$_Protocol;
        ASSERT(cls);
        NXMapTable *protocol_map = protocols();
        bool isPreoptimized = hi->hasPreoptimizedProtocols();

        // Skip reading protocols if this is an image from the shared cache
        // and we support roots
        // Note, after launch we do need to walk the protocol as the protocol
        // in the shared cache is marked with isCanonical() and that may not
        // be true if some non-shared cache binary was chosen as the canonical
        // definition
        if (launchTime && isPreoptimized && cacheSupportsProtocolRoots) {
            if (PrintProtocols) {
                _objc_inform("PROTOCOLS: Skipping reading protocols in image: %s",
                             hi->fname());
            }
            continue;
        }

        bool isBundle = hi->isBundle();

        protocol_t * const *protolist = _getObjc2ProtocolList(hi, &count);
        for (i = 0; i < count; i++) {
            readProtocol(protolist[i], cls, protocol_map, 
                         isPreoptimized, isBundle);
        }
    }

    ts.log("IMAGE TIMES: discover protocols");

    // Fix up @protocol references
    /// 6. Like classes, protocols have an inheritance hierarchy; this step fixes up their references
    for (EACH_HEADER) {
        // At launch time, we know preoptimized image refs are pointing at the
        // shared cache definition of a protocol.  We can skip the check on
        // launch, but have to visit @protocol refs for shared cache images
        // loaded later.
        if (launchTime && cacheSupportsProtocolRoots && hi->isPreoptimized())
            continue;
        protocol_t **protolist = _getObjc2ProtocolRefs(hi, &count);
        for (i = 0; i < count; i++) {
            remapProtocolRef(&protolist[i]);
        }
    }


    // Discover categories. Only do this after the initial category
    /// 7. Discover categories and store them in a global map table
    if (didInitialAttachCategories) {
        for (EACH_HEADER) {
            load_categories_nolock(hi);
        }
    }

 
    // Category discovery MUST BE Late to avoid potential races
    // Realize non-lazy classes (for +load methods and static instances)
    /// 8. Realize classes that have a +load method or static instances
    for (EACH_HEADER) {
        classref_t const *classlist = 
            _getObjc2NonlazyClassList(hi, &count);
        for (i = 0; i < count; i++) {
            Class cls = remapClass(classlist[i]);
            if (!cls) continue;

            addClassTableEntry(cls);

            if (cls->isSwiftStable()) {
                if (cls->swiftMetadataInitializer()) {
                    _objc_fatal("Swift class %s with a metadata initializer "
                                "is not allowed to be non-lazy",
                                cls->nameForLogging());
                }
                // fixme also disallow relocatable classes
                // We can't disallow all Swift classes because of
                // classes like Swift.__EmptyArrayStorage
            }
            realizeClassWithoutSwift(cls, nil);
        }
    }

    ts.log("IMAGE TIMES: realize non-lazy classes");

    // Realize newly-resolved future classes, in case CF manipulates them
    /// 9. Realize classes flagged RO_FUTURE; these are usually Core Foundation classes
    if (resolvedFutureClasses) {
        for (i = 0; i < resolvedFutureClassCount; i++) {
            Class cls = resolvedFutureClasses[i];
            if (cls->isSwiftStable()) {
                _objc_fatal("Swift class is not allowed to be future");
            }
            realizeClassWithoutSwift(cls, nil);
            cls->setInstancesRequireRawIsaRecursively(false/*inherited*/);
        }
        free(resolvedFutureClasses);
    }
    ...
}

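Fix up selector references and register them in the global map table named namedSelectors.
Discover classes: read class info from the image and store it in the global hash table named allocatedClasses.
Fix up references between remapped classes.
Fix up old objc_msgSend_fixup call sites.
Discover protocols and store them in a global map table.
Like classes, protocols have an inheritance hierarchy; this step fixes up their references.
Discover categories and store them in a global map table.
Realize classes that have a +load method or static instances.
Realize classes flagged RO_FUTURE; these are usually Core Foundation classes.

These steps are not analysed in detail here; see the referenced articles for a full walkthrough.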
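
To give the first step a concrete shape: "fixing up" selector references means making every reference to the same selector name point at one canonical pointer, so that selectors can later be compared by address. The C sketch below models that with a tiny linear table. It is an illustration under simplifying assumptions: the real runtime uses a hash table, takes a lock, and all `demo_` names here are invented.

```c
#include <stddef.h>
#include <string.h>

#define DEMO_MAX_SELS 32

/* Plays the role of the namedSelectors table: one canonical pointer per name. */
static const char *demo_named_selectors[DEMO_MAX_SELS];
static size_t      demo_named_selector_count = 0;

/* Return the canonical pointer for `name`, registering it on first sight.
 * Returns NULL if the demo table is full. */
static const char *demo_sel_register_name(const char *name) {
    for (size_t i = 0; i < demo_named_selector_count; ++i) {
        if (strcmp(demo_named_selectors[i], name) == 0)
            return demo_named_selectors[i];
    }
    if (demo_named_selector_count == DEMO_MAX_SELS) return NULL;
    demo_named_selectors[demo_named_selector_count] = name;
    return demo_named_selectors[demo_named_selector_count++];
}

/* Rewrite each reference in place so that equal names share one pointer,
 * like the `if (sels[i] != sel) sels[i] = sel;` loop in _read_images. */
static void demo_fixup_selector_refs(const char **refs, size_t count) {
    for (size_t i = 0; i < count; ++i) {
        const char *sel = demo_sel_register_name(refs[i]);
        if (sel != NULL && refs[i] != sel) {
            refs[i] = sel;
        }
    }
}
```

If two references start out as different buffers that both contain "viewDidLoad", they point at the same buffer after `demo_fixup_selector_refs`, and the table contains that name only once.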

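3.12 Finding the entry point of the main executable

First, getEntryFromLC_MAIN() is called to read the entry point from the LC_MAIN load command. If there is no LC_MAIN, getEntryFromLC_UNIXTHREAD() is called to read the entry point from LC_UNIXTHREAD instead. Because the entry will be called as a function pointer, it has to be signed when pointer authentication is enabled. Execution then jumps to the entry point, which leads into main(). Simplified source: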

/// Find the entry point of the main executable
result = (uintptr_t)sMainExecutable->getEntryFromLC_MAIN();
if ( result != 0 ) {
}
else {
	result = (uintptr_t)sMainExecutable->getEntryFromLC_UNIXTHREAD();
	*startGlue = 0;
}
#if __has_feature(ptrauth_calls)
		/// start() will call the result pointer as a function pointer, so we need to sign it.
		result = (uintptr_t)__builtin_ptrauth_sign_unauthenticated((void*)result, 0, 0);
#endif

void* ImageLoaderMachO::getEntryFromLC_MAIN() const
{
	const uint32_t cmd_count = ((macho_header*)fMachOData)->ncmds;
	const struct load_command* const cmds = (struct load_command*)&fMachOData[sizeof(macho_header)];
	const struct load_command* cmd = cmds;
	for (uint32_t i = 0; i < cmd_count; ++i) {
		if ( cmd->cmd == LC_MAIN ) {
			entry_point_command* mainCmd = (entry_point_command*)cmd;
			void* entry = (void*)(mainCmd->entryoff + (char*)fMachOData);
			// <rdar://problem/8543820&9228031> verify entry point is in image
			if ( this->containsAddress(entry) )
				return entry;
			else
				throw "LC_MAIN entryoff is out of range";
		}
		cmd = (const struct load_command*)(((char*)cmd)+cmd->cmdsize);
	}
	return NULL;
}

void* ImageLoaderMachO::getEntryFromLC_UNIXTHREAD() const
{
	const uint32_t cmd_count = ((macho_header*)fMachOData)->ncmds;
	const struct load_command* const cmds = (struct load_command*)&fMachOData[sizeof(macho_header)];
	const struct load_command* cmd = cmds;
	for (uint32_t i = 0; i < cmd_count; ++i) {
		if ( cmd->cmd == LC_UNIXTHREAD ) {
	// ... (#if branches for the other architectures omitted) ...
	#elif __arm64__ && !__arm64e__
			// temp support until <rdar://39514191> is fixed
			const uint64_t* regs64 = (uint64_t*)(((char*)cmd) + 16);
			void* entry = (void*)(regs64[32] + fSlide); // arm_thread_state64_t.__pc
			// <rdar://problem/8543820&9228031> verify entry point is in image
			if ( this->containsAddress(entry) )
				return entry;
	#endif
		}
		cmd = (const struct load_command*)(((char*)cmd)+cmd->cmdsize);
	}
	throw "no valid entry point";
}
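
Both functions above share one pattern: start right after the Mach-O header and hop from load command to load command using cmdsize until the wanted command type turns up. The C sketch below reproduces that walk and the "LC_MAIN first, LC_UNIXTHREAD as fallback" choice on a hand-built buffer. The structs mirror the leading fields of load_command and entry_point_command, the constants use the values from `<mach-o/loader.h>` under `DEMO_` names, and the buffer contents (including the 24-byte dummy LC_UNIXTHREAD) are invented for the example.

```c
#include <stdint.h>
#include <string.h>

#define DEMO_LC_UNIXTHREAD 0x5u
#define DEMO_LC_MAIN       0x80000028u   /* 0x28 | LC_REQ_DYLD */

typedef struct { uint32_t cmd; uint32_t cmdsize; } demo_load_command;
typedef struct {
    uint32_t cmd;
    uint32_t cmdsize;
    uint64_t entryoff;    /* file offset of main() */
    uint64_t stacksize;
} demo_entry_point_command;

/* Return the byte offset of the first command of type `wanted`, or -1. */
static long demo_find_command(const uint8_t *cmds, uint32_t ncmds, uint32_t wanted) {
    long off = 0;
    for (uint32_t i = 0; i < ncmds; ++i) {
        demo_load_command lc;
        memcpy(&lc, cmds + off, sizeof lc);
        if (lc.cmd == wanted) return off;
        if (lc.cmdsize < sizeof lc) break;   /* malformed command: stop walking */
        off += (long)lc.cmdsize;             /* hop to the next command */
    }
    return -1;
}

/* Prefer LC_MAIN; fall back to LC_UNIXTHREAD. Returns the command type used,
 * or 0 if neither exists. *entryoff is only written for LC_MAIN. */
static uint32_t demo_pick_entry_command(const uint8_t *cmds, uint32_t ncmds,
                                        uint64_t *entryoff) {
    long off = demo_find_command(cmds, ncmds, DEMO_LC_MAIN);
    if (off >= 0) {
        demo_entry_point_command ep;
        memcpy(&ep, cmds + off, sizeof ep);
        *entryoff = ep.entryoff;
        return DEMO_LC_MAIN;
    }
    if (demo_find_command(cmds, ncmds, DEMO_LC_UNIXTHREAD) >= 0)
        return DEMO_LC_UNIXTHREAD;   /* the real code reads pc from the thread state */
    return 0;
}

/* Build a buffer with a dummy LC_UNIXTHREAD and, optionally, an LC_MAIN after it. */
static uint32_t demo_entry_lookup(int with_main, uint64_t *entryoff) {
    uint8_t buf[64] = {0};
    demo_load_command first = { DEMO_LC_UNIXTHREAD, 24 };   /* dummy payload */
    demo_entry_point_command ep = { DEMO_LC_MAIN, sizeof(demo_entry_point_command), 0x4000, 0 };
    uint32_t ncmds = 1;
    memcpy(buf, &first, sizeof first);
    if (with_main) {
        memcpy(buf + first.cmdsize, &ep, sizeof ep);
        ncmds = 2;
    }
    return demo_pick_entry_command(buf, ncmds, entryoff);
}
```

With both commands present the lookup settles on LC_MAIN and reports the entry offset 0x4000, even though LC_UNIXTHREAD comes first in the buffer; without LC_MAIN it falls back to LC_UNIXTHREAD. Unlike the sketch, the real dyld code also checks that the resulting address lies inside the image.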

4. Summary

0x07 Final words

If you spot any mistakes, please get in touch so they can be corrected. Thanks.