System call dispatching on Windows ARM64
Background
Microsoft recently announced that there will be Windows ARM64 devices. Technically, it should be “AArch64” but ARM64 is easier to type. This article briefly documents the system call dispatching mechanism for Windows on ARM64. Readers are assumed to be familiar with ARM64 assembly and system call dispatching on Windows x86/x64.
Previous research and related readings
The only notable Microsoft source about Windows on ARM64 is a talk by Hari Pulapaka and Arun Kishan at BUILD 2017 [2]. In the short talk, they called these devices “Windows PC on ARM” and went on to demonstrate VLC and OpenVPN running on ARM64 Windows without any code modification. They gave a general overview of Compiled Hybrid PE (CHPE) and how it is used to run x86 applications on ARM64; it is a software layer used to emulate x86 code on ARM (note: it does not support x64). If you are interested in the talk, you can view the video on Channel 9.
Ironically, most of the technical information about Windows on ARM64 comes from the open source world. Commits to the Linux kernel indicate that there is Hyper-V support for ARM64. It is most likely for running ARM64 servers. André Hentschel made many commits to enable ARM64 emulation in the Wine project; it is interesting to note that this work started in 2013, four years before Microsoft’s official announcement!
After I wrote this blog post, wup and suezi from Qihoo 360’s IceSword Lab published a post documenting how they got KD working under QEMU by modifying the serial debug transport [3]. As far as I am aware, this is the first time a non-Microsoft entity got KD working on the platform. Hopefully they will release their source code so that others can try it as well. It is worth noting that on retail ARM64 devices, you cannot do KD via USB or Ethernet either (none of the USB ports have debug enabled and there is no physical Ethernet port). There are two serial ports though.
If you are aware of other technical sources of information, please send me an email.
System call dispatching
The ARM64 syscall stubs in ntdll.dll and friends simply contain two instructions. For example, this is ntdll!NtDeviceIoControlFile:
NtDeviceIoControlFile
    SVC 7
    RET
SVC is a trap instruction similar to SYSCALL or SYSENTER on x64/x86 platforms. Where does SVC take us? To answer that question, we need to review some ARM64 system level concepts.
ARM64 introduces the concept of exception levels (EL) to model privilege levels. There are four ELs: EL0, EL1, EL2, and EL3. They are somewhat similar to the ring levels in x86-based architectures except that they also cover hypervisors and security monitors. Unlike ring levels, the numbering is reversed relative to privilege, so a higher number means more privilege: EL0 is for usermode, EL1 is for kernelmode, EL2 is for hypervisors, and EL3 is for security monitors. We illustrate the concept as follows:
The only way to transition from one EL to another is through exceptions. Exceptions are generally triggered by special trap instructions (like SVC) or some unexpected behavior. The CPU maintains a table of handlers for various exceptions and conditions. On ARM64, this is known as the vector table and its base address is stored in the VBAR_EL1 register for the kernel (it also exists for EL2 and EL3 but we don’t discuss those here). It is a blob of 0x800 bytes of code and the layout is described on page D1-1876 of [1]. When an exception happens, the CPU will immediately execute code at some offset within the vector table. Exception handlers are stored at well-defined offsets within the table; for example, when SVC is executed from 64-bit usermode, the CPU will transition to offset 0x400 in the table; the return address is stored in the ELR_EL1 register.
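The offsets follow a regular pattern, which can be sketched in a few lines of Python. This is only an illustration of the architectural layout (four groups of four entries, each 0x80 bytes); the names `SOURCES`, `KINDS`, and `vector_offset` are made up for this example and do not come from Windows or from the ARM manual.

```python
# Sketch of the AArch64 vector table layout: 4 groups x 4 entries x 0x80 bytes.
# All names here are hypothetical and exist only for illustration.
ENTRY_SIZE = 0x80

# Where the exception was taken from; the group order is fixed by the architecture.
SOURCES = ["current_el_sp0", "current_el_spx", "lower_el_aarch64", "lower_el_aarch32"]
# Kind of exception within a group.
KINDS = ["sync", "irq", "fiq", "serror"]


def vector_offset(source: str, kind: str) -> int:
    """Return the byte offset from the vector base for the given (source, kind)."""
    group = SOURCES.index(source)
    slot = KINDS.index(kind)
    return (group * len(KINDS) + slot) * ENTRY_SIZE


TABLE_SIZE = len(SOURCES) * len(KINDS) * ENTRY_SIZE

# SVC from 64-bit usermode is a synchronous exception taken from a lower EL.
print(hex(vector_offset("lower_el_aarch64", "sync")))  # 0x400
print(hex(TABLE_SIZE))                                 # 0x800
```

The two printed values match the 0x400 offset and the 0x800-byte table size mentioned above.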
The CPU automatically stores exception metadata in the Exception Syndrome Register (ESR_EL1). This is a 32-bit register and the format is described on page D1-1877 of [1]. The fields of relevance to us are the Exception Class (EC) and Instruction-Specific Syndrome (ISS). The EC describes the exception type (SVC, BRK, instruction pointer alignment, etc.). Some notable exception type values for us are:
- 0x15: the SVC instruction was executed. This is typically used to implement system calls in the kernel. You are reading about it now.
- 0x16: the HVC instruction was executed. This is typically used to implement calls into the hypervisor. As mentioned earlier, there is a Hyper-V version for ARM64 and it implements hypercalls via this mechanism. Several kernel functions call into the hypervisor (HvlpCallVtl1, HvlpAa64GetVpRegister64, Aa64HviGetVpRegister64, etc.).
- 0x17: the SMC instruction was executed. This is typically used to implement calls into the secure monitor. Hyper-V uses this.
- 0x3C: the BRK instruction was executed. This is commonly used to implement OS-defined exceptions (like breakpoints). Windows uses it to implement breakpoints and other custom exceptions.
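As a rough illustration of how these fields are laid out, the sketch below splits a 32-bit ESR value into EC and ISS. The bit positions (EC in bits 31:26, a length bit at bit 25, ISS in bits 24:0) are taken from the architecture manual [1]; the function and table names are hypothetical, and the sample value is synthetic rather than captured from a real machine.

```python
# Hypothetical helper names; only the bit layout comes from the ARM manual.
EC_NAMES = {
    0x15: "SVC",
    0x16: "HVC",
    0x17: "SMC",
    0x3C: "BRK",
}


def decode_esr(esr: int) -> dict:
    """Split a 32-bit ESR value into its EC, IL and ISS fields."""
    esr &= 0xFFFFFFFF
    ec = (esr >> 26) & 0x3F   # Exception Class, bits [31:26]
    il = (esr >> 25) & 0x1    # instruction length bit, bit [25]
    iss = esr & 0x1FFFFFF     # Instruction-Specific Syndrome, bits [24:0]
    return {"ec": ec, "name": EC_NAMES.get(ec, "other"), "il": il, "iss": iss}


def make_svc_esr(imm16: int) -> int:
    """Build a synthetic ESR value for an SVC with the given immediate."""
    return (0x15 << 26) | (1 << 25) | (imm16 & 0xFFFF)


info = decode_esr(make_svc_esr(7))
print(info["name"], info["iss"])  # SVC 7
```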
The meaning of ISS is different for each exception, but for SVC it contains the immediate value encoded in the instruction. For example, if the instruction was SVC 7, then ISS will be 7. When the stub executes, the CPU will transition to EL1 and transfer control to the appropriate handler in the vector table. Where is the vector table located? The Windows kernel initializes VBAR_EL1 early in the boot process in KiInitializeExceptionVectorTable:
KiInitializeExceptionVectorTable ; CODE XREF: KiInitializeBootStructures+1DC↓p
    MOV X8, SP
    MSR #0, c4, c1, #0, X8
    MOV X9, #0
    MSR #0, c4, c2, #0, X9
    ADRP X8, #KiArm64ExceptionVectors@PAGE
    ADD X8, X8, #KiArm64ExceptionVectors@PAGEOFF
    MSR #0, c12, c0, #0, X8 ; VBAR_EL1 = KiArm64ExceptionVectors
    ISB
    RET
We now know that KiArm64ExceptionVectors is the base of the vector table. We show the first few handlers below. Note that the entries are 0x80 bytes apart.
.text:000000014002D800 MRS X18, #0, c4, c1, #0
.text:000000014002D804 AND SP, X18, #0xFFFFFFFFFFFFFFF0
.text:000000014002D808 SUB SP, SP, #0x370
.text:000000014002D80C MRS X18, #0, c13, c0, #4
...
.text:000000014002D880 KiKernelSp0InterruptHandler
.text:000000014002D880 MRS X18, #0, c4, c1, #0
.text:000000014002D884 AND SP, X18, #0xFFFFFFFFFFFFFFF0
.text:000000014002D888 SUB SP, SP, #0x370
...
.text:000000014002D900
.text:000000014002D900 KiKernelSp0FiqHandler
.text:000000014002D900 MRS X18, #0, c4, c1, #0
.text:000000014002D904 AND SP, X18, #0xFFFFFFFFFFFFFFF0
...
.text:000000014002D980
.text:000000014002D980 KiKernelSp0SystemErrorHandler
.text:000000014002D980 MRS X18, #0, c4, c1, #0
.text:000000014002D984 AND SP, X18, #0xFFFFFFFFFFFFFFF0
...
.text:000000014002DA00
.text:000000014002DA00 KiKernelExceptionHandler
.text:000000014002DA00 MRS X18, #0, c13, c0, #4
.text:000000014002DA04 AND X18, X18, #0xFFFFFFFFFFFFF000
.text:000000014002DA08 STP X23, X30, [X18,#0x70]
...
.text:000000014002DC00
.text:000000014002DC00 KiUserExceptionHandler
.text:000000014002DC00 SUB SP, SP, #0x370
.text:000000014002DC04 STP X18, X30, [SP,#0x370+var_240]
.text:000000014002DC08 MRS X18, #0, c5, c2, #0
.text:000000014002DC0C LSR X30, X18, #0x1A
.text:000000014002DC10 CMP X30, #0x15
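The layout is something like this: the KiKernelSp0* handlers occupy the first 0x200 bytes (KiKernelSp0InterruptHandler at 0x80, KiKernelSp0FiqHandler at 0x100, KiKernelSp0SystemErrorHandler at 0x180), KiKernelExceptionHandler sits at offset 0x200, and KiUserExceptionHandler sits at offset 0x400.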
As explained earlier, the handler for SVC from EL0 is at offset 0x400 from the address in the VBAR_EL1 register (here 0x14002D800 + 0x400 = 0x14002DC00) and this corresponds to KiUserExceptionHandler:
KiUserExceptionHandler
    SUB SP, SP, #0x370
    STP X18, X30, [SP,#0x370+var_240]
    MRS X18, #0, c5, c2, #0 ; read the ESR_EL1
    LSR X30, X18, #0x1A ; extract the EC bits
    CMP X30, #0x15 ; check if it is an SVC64 exception
    B.NE case_otherExceptions ; if not, branch out
    SXTH X18, W18
    CMN X18, #1 ; do we need to transition to 32bit?
    B.EQ KiEnter32BitMode ; yes
    B KiSystemServiceException ; otherwise, handle the SVC exception
case_otherExceptions ; CODE XREF: KiUserExceptionHandler+14↑j
                     ; KiEnter32BitMode+C↓j
    MRS X18, #0, c13, c0, #4
    AND X18, X18, #0xFFFFFFFFFFFFF000
    STP X19, X20, [X18,#0x50]
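The three-way branch in this handler can be mirrored in Python to make the SXTH/CMN pair easier to read: sign-extending the low 16 bits and comparing against -1 is the same as asking whether the SVC immediate is 0xFFFF. This is a sketch of the logic as read from the listing, not a statement about how Windows documents it; the function names are hypothetical and the returned strings are just the labels from the disassembly.

```python
def sign_extend_16(value: int) -> int:
    """Model SXTH: sign-extend the low 16 bits of value."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def route_user_exception(esr: int) -> str:
    """Return the label the handler would branch to for this ESR value."""
    esr &= 0xFFFFFFFF
    if (esr >> 0x1A) != 0x15:         # LSR X30, X18, #0x1A ; CMP X30, #0x15
        return "case_otherExceptions"
    if sign_extend_16(esr) + 1 == 0:  # SXTH X18, W18 ; CMN X18, #1
        return "KiEnter32BitMode"
    return "KiSystemServiceException"


# Synthetic ESR values: EC = 0x15 (SVC) with immediates 7 and 0xFFFF, then a BRK.
print(route_user_exception(0x56000007))  # KiSystemServiceException
print(route_user_exception(0x5600FFFF))  # KiEnter32BitMode
print(route_user_exception(0xF2000000))  # case_otherExceptions
```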
KiSystemServiceException is where system call dispatching starts. It saves the syscall number, return address and a few other values and then calls KiSystemService:
KiSystemServiceException ; CODE XREF: KiUserExceptionHandler+24↑j
    MRS X18, #0, c13, c0, #4
    AND X18, X18, #0xFFFFFFFFFFFFF000
    STP X0, X1, [SP,#arg_A0]
    STP X2, X3, [SP,#arg_B0]
    STP X4, X5, [SP,#arg_C0]
    STP X6, X7, [SP,#arg_D0]
    MOV X0, #2
    MRS X1, #0, c4, c0, #1 ; ELR_EL1
    MRS X2, #0, c4, c0, #0 ; SPSR_EL1
    MRS X3, #0, c5, c2, #0 ; ESR_EL1
    MRS X4, #0, c4, c1, #0 ; SP_EL0
    MOV X5, SP
    MSR #0, c4, c1, #0, X5
    MSR #5, #0
    STR W0, [SP,#arg_0]
    STR XZR, [SP,#arg_10]
    MRS X0, #0, c0, c2, #2
    TBNZ W0, #0, loc_14002F108
loc_14002EFC8 ; CODE XREF: KiSystemServiceExit+110↓j
    BFI X2, X3, #0x20, #0x20 ; ' '
    STP X2, X4, [SP,#arg_90] ; save the ESR which has the immediate (syscall #)
    STP X29, X1, [SP,#arg_140]
    ADD X29, SP, #arg_140
    LDR X0, [X18,#_KPCR.Prcb.CurrentThread]
    LDRB W1, [X0,#_KTHREAD.Header.___u0.__s7.DebugActive]
    CBNZ W1, loc_14002F130
loc_14002EFE4 ; CODE XREF: KiSystemServiceExit+130↓j
              ; KiSystemServiceExit+13C↓j
    MOV X0, SP
    BL KiLogTrapFrame
    LDR X15, [X18,#_KPCR.Prcb.CurrentThread]
    MSR #7, #0xA
    LDR X0, [SP,#arg_A0]
    LDRH W16, [SP,#arg_90+4]
    STP X16, X0, [X15,#_ETHREAD.Tcb.SystemCallNumber]
    BL KiSystemService
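One detail in this listing is easy to miss: the BFI packs ESR_EL1 into the upper half of the register holding SPSR_EL1, the pair is stored at arg_90, and the later LDRH at arg_90+4 reads back just the low 16 bits of the ESR, which is the SVC immediate. The sketch below models that round trip with Python's `struct` module, assuming little-endian memory (which is how Windows runs on ARM64). The helper names and the sample SPSR value are hypothetical.

```python
import struct

MASK64 = 0xFFFFFFFFFFFFFFFF


def bfi(dst: int, src: int, lsb: int, width: int) -> int:
    """Model BFI: copy the low `width` bits of src into dst starting at bit `lsb`."""
    mask = ((1 << width) - 1) << lsb
    return ((dst & ~mask) | ((src << lsb) & mask)) & MASK64


def saved_syscall_number(spsr: int, esr: int) -> int:
    """Pack SPSR/ESR as the listing does and read back the 16-bit immediate."""
    packed = bfi(spsr, esr, 0x20, 0x20)         # BFI X2, X3, #0x20, #0x20
    slot = struct.pack("<Q", packed)            # first half of STP X2, X4, [SP,#arg_90]
    (w16,) = struct.unpack_from("<H", slot, 4)  # LDRH W16, [SP,#arg_90+4]
    return w16


# Synthetic values: an arbitrary SPSR and the ESR produced by SVC 7.
print(saved_syscall_number(0x60000000, 0x56000007))  # 7
```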
KiSystemService extracts the syscall number from the ESR and eventually invokes the syscall. The syscall table structure is the same as x64 so we will not discuss that here.
KiSystemService ; CODE XREF: KiServiceInternal+5C↑p
    SUB SP, SP, #0x60
    STP X29, X30, [SP,#0x50+var_s0]
    ADD X29, SP, #0x50
    ADD X14, SP, #0x50+arg_0
    STR X14, [X15,#_ETHREAD.Tcb.TrapFrame]
    UBFX X9, X16, #0xC, #1
    AND X8, X16, #0xFFF ; X8 = syscall#
loc_14002F1DC ; CODE XREF: KiSystemServiceCopyEnd+50↓j
    LDR W10, [X15,#0x70]
    ADRP X11, #KeServiceDescriptorTable@PAGE
    ADD X11, X11, #KeServiceDescriptorTable@PAGEOFF
    ADRP X12, #KeServiceDescriptorTableShadow@PAGE
    ADD X12, X12, #KeServiceDescriptorTableShadow@PAGEOFF
    TST X10, #0x80
    B.EQ not_GUIthread
    TST X10, #0x200000
    B.EQ not_GUI_restricted
    ADRP X12, #KeServiceDescriptorTableFilter@PAGE
    ADD X12, X12, #KeServiceDescriptorTableFilter@PAGEOFF
    ...
    LDRSW X10, [X11,X8,LSL#2] ; extract the encoded syscall offset
    ADD X11, X11, X10,ASR#4 ; calculate the final syscall target
    ...
    BLR X11 ; issue the syscall
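The arithmetic in the last few instructions can be sketched as follows. The code splits the 16-bit value in X16 into a 12-bit index and the bit above it, then resolves a table entry as a signed 32-bit value shifted right by four and added to the table base. The table contents and base address below are invented for the example, and the sketch deliberately does not interpret the low four bits of each entry or the elided parts of the listing.

```python
import struct


def split_service_number(x16: int) -> tuple:
    """Mirror AND X8, X16, #0xFFF and UBFX X9, X16, #0xC, #1."""
    index = x16 & 0xFFF
    table_bit = (x16 >> 0xC) & 1
    return index, table_bit


def resolve_target(table_base: int, table_bytes: bytes, index: int) -> int:
    """Mirror LDRSW X10, [X11,X8,LSL#2] and ADD X11, X11, X10,ASR#4."""
    (entry,) = struct.unpack_from("<i", table_bytes, index * 4)  # signed 32-bit load
    return table_base + (entry >> 4)                             # arithmetic shift


# Hypothetical table: three entries encoding offsets +0x1000, -0x2000, +0x3000.
FAKE_BASE = 0x140100000
fake_table = struct.pack("<3i", (0x1000 << 4) | 2, (-0x2000) << 4, (0x3000 << 4) | 1)

index, table_bit = split_service_number(0x0001)
print(index, table_bit)                                  # 1 0
print(hex(resolve_target(FAKE_BASE, fake_table, index)))  # 0x1400fe000
```

Because the entry is loaded as a signed value and shifted arithmetically, targets below the table base resolve correctly, as the second entry shows.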
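We can summarize the whole syscall flow as follows: the ntdll stub executes SVC with the syscall number as its immediate; the CPU transitions to EL1 and enters the vector table at VBAR_EL1 + 0x400, which is KiUserExceptionHandler; that handler checks the EC in ESR_EL1 and branches to KiSystemServiceException; KiSystemServiceException saves the trap state and calls KiSystemService; and KiSystemService looks up the target in the service table and invokes it.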
And that’s a brief look at syscall dispatching for Windows on ARM64.
References
[1] ARM 2017. ARM Architecture Reference Manual: ARMv8, for ARMv8-A architecture profile (ARM DDI 0487C.A (ID121917)).
[2] Kishan, A. and Pulapaka, H. 2017. Windows 10 on ARM.
[3] wup and suezi. https://www.iceswordlab.com/2018/07/25/kdhack/.