System call dispatching for Windows on ARM64

Background

Microsoft recently announced that there will be Windows ARM64 devices. Technically, it should be “AArch64” but ARM64 is easier to type. This article briefly documents the system call dispatching mechanism for Windows on ARM64.  Readers are assumed to be familiar with ARM64 assembly and system call dispatching on Windows x86/x64.

Previous research and related readings

The only notable Microsoft source about Windows on ARM64 is a talk by Hari Pulapaka and Arun Kishan at BUILD 2017 [2]. In this short talk, they called these devices “Windows PC on ARM” and went on to demonstrate VLC and OpenVPN running on ARM64 Windows without any code modification. They gave a general overview of Compiled Hybrid PE (CHPE) and how it is used to run x86 applications on ARM64; it is a software layer used to emulate x86 code on ARM (note: it does not support x64). For those interested in the talk, you can view the video on Channel 9.

Ironically, most of the technical information about Windows on ARM64 comes from the open source world. Commits to the Linux kernel indicate that there is Hyper-V support for ARM64. It is most likely for running ARM64 servers. André Hentschel made many commits to enable ARM64 emulation in the Wine project; it is interesting to note that this work started in 2013, four years before Microsoft’s official announcement!

After I wrote this post, wup and suezi from Qihoo 360’s IceSword Lab published a post documenting how they got KD working under QEMU by modifying the serial debug transport [3]. As far as I am aware, this is the first time a non-Microsoft entity got KD working on the platform. Hopefully they will release their source code so that others can try it as well. It is worth noting that on retail ARM64 devices you cannot do KD via USB or Ethernet either (none of the USB ports have debug enabled and there is no physical Ethernet port). There are two serial ports though.

If you are aware of other technical sources of information, please send me an email.

System call dispatching

The ARM64 syscall stubs in ntdll.dll and friends simply contain two instructions. For example, this is ntdll!NtDeviceIoControlFile:

NtDeviceIoControlFile
    SVC             7
    RET

SVC is a trap instruction similar to SYSCALL or SYSENTER on x64/x86 platforms. Where does SVC take us? To answer that question, we need to review some ARM64 system level concepts.

ARM64 introduces the concept of exception levels (EL) to model privilege levels. There are 4 ELs: EL0, EL1, EL2, and EL3. They are somewhat similar to the ring levels in x86-based architectures except that they also cover hypervisors and security monitors. Unlike ring levels, the numbering is reversed with respect to privilege: a higher number means more privilege. EL0 is for user mode, EL1 is for kernel mode, EL2 is for hypervisors, and EL3 is for security monitors. We illustrate the concept as follows:

[Figure: arm_ELs]

The only way to transition from one EL to another is through exceptions. Exceptions are generally triggered by special trap instructions (like SVC) or some unexpected behavior. The CPU maintains a table of handlers for various exceptions and conditions. On ARM64, this is known as the vector table and its base address is stored in the VBAR_EL1 register for the kernel (it also exists for EL2 and EL3 but we don’t discuss those here). It is a blob of 0x800 bytes of code and the layout is described on page D1-1876 of [1]. When an exception happens, the CPU will immediately execute code at some offset within the vector table. Exception handlers are stored at well-defined offsets within the table; for example, when SVC is executed from 64-bit code at EL0, control transfers to offset 0x400 in the table and the return address is stored in the ELR_EL1 register.
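The offsets follow a regular pattern, which the short Python sketch below models. It assumes the standard AArch64 layout of four groups (where the exception came from) with four 0x80-byte entries each (what kind of exception it is); the names `SOURCES`, `KINDS` and `vector_offset` are made up for this illustration and do not come from the Windows kernel.

```python
# Illustrative model of the AArch64 vector table layout (not kernel code).
ENTRY_SIZE = 0x80  # each handler slot is 0x80 bytes long

# Group order within the table: where the exception was taken from.
SOURCES = ["current_el_sp0", "current_el_spx", "lower_el_aarch64", "lower_el_aarch32"]
# Entry order within each group: what kind of exception was taken.
KINDS = ["synchronous", "irq", "fiq", "serror"]


def vector_offset(source: str, kind: str) -> int:
    """Return the byte offset of a handler slot relative to VBAR_EL1."""
    group = SOURCES.index(source)
    entry = KINDS.index(kind)
    return (group * len(KINDS) + entry) * ENTRY_SIZE


if __name__ == "__main__":
    # Print the whole table; SVC from 64-bit user mode is a synchronous
    # exception taken from a lower EL running AArch64.
    for source in SOURCES:
        for kind in KINDS:
            print(f"{vector_offset(source, kind):#05x}  {source:17} {kind}")
```

Sixteen slots of 0x80 bytes account for the 0x800-byte size mentioned above.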

The CPU automatically stores exception metadata in the Exception Syndrome Register (ESR_EL1). This is a 32-bit register and the format is described on page D1-1877 of [1]. The fields of relevance to us are the Exception Class (EC) and Instruction-Specific Syndrome (ISS). The EC describes the exception type (SVC, BRK, instruction pointer alignment, etc.). Some notable exception type values for us are:

  • 0x15: the SVC instruction was executed. This is typically used to implement system calls in the kernel.  You are reading about it now.
  • 0x16: the HVC instruction was executed. This is typically used to implement calls into the hypervisor.  As mentioned earlier, there is a Hyper-V version for ARM64 and it implements hypercalls via this mechanism.  Several kernel functions call into the hypervisor (HvlpCallVtl1, HvlpAa64GetVpRegister64, Aa64HviGetVpRegister64, etc.)
  • 0x17: the SMC instruction was executed. This is typically used to implement calls into the secure monitor.  Hyper-V uses this.
  • 0x3C: the BRK instruction was executed. This is commonly used to implement OS-defined exceptions (like breakpoints).  Windows uses it to implement breakpoints and other custom exceptions.
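As a rough illustration of how these fields are pulled apart, here is a small Python sketch. It assumes the usual ESR layout (EC in bits 31:26, the instruction-length bit IL at bit 25, ISS in bits 24:0) and that, for SVC, the 16-bit immediate sits in the low 16 bits of the ISS. The function names and the sample value 0x56000007 are made up for this example.

```python
# Illustrative decoder for a 32-bit ESR_EL1 value (not kernel code).
EC_NAMES = {
    0x15: "SVC",
    0x16: "HVC",
    0x17: "SMC",
    0x3C: "BRK",
}


def decode_esr(esr: int) -> dict:
    """Split an ESR value into its EC, IL and ISS fields."""
    ec = (esr >> 26) & 0x3F   # bits [31:26]: exception class
    il = (esr >> 25) & 0x1    # bit 25: set for a 32-bit trapping instruction
    iss = esr & 0x1FFFFFF     # bits [24:0]: instruction-specific syndrome
    return {"ec": ec, "il": il, "iss": iss, "name": EC_NAMES.get(ec, "other")}


def svc_immediate(esr: int) -> int:
    """Return the immediate of the SVC instruction that raised this exception."""
    fields = decode_esr(esr)
    if fields["ec"] != 0x15:
        raise ValueError("ESR does not describe an SVC exception")
    return fields["iss"] & 0xFFFF


if __name__ == "__main__":
    sample = 0x56000007  # hypothetical ESR: EC = 0x15, IL = 1, immediate = 7
    print(decode_esr(sample), svc_immediate(sample))
```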

The meaning of ISS is different for each exception, but for SVC it contains the immediate value encoded in the instruction. For example, if the instruction was SVC 7, then ISS will be 7. When the stub executes, the CPU will transition to EL1 and transfer control to the appropriate handler in the vector table. Where is the vector table located? The Windows kernel initializes VBAR_EL1 early in the boot process in KiInitializeExceptionVectorTable:

KiInitializeExceptionVectorTable        ; CODE XREF: KiInitializeBootStructures+1DC↓p
    MOV             X8, SP
    MSR             #0, c4, c1, #0, X8    ; SP_EL0 = current SP
    MOV             X9, #0
    MSR             #0, c4, c2, #0, X9    ; SPSel = 0 (use SP_EL0 as the stack pointer)
    ADRP            X8, #KiArm64ExceptionVectors@PAGE
    ADD             X8, X8, #KiArm64ExceptionVectors@PAGEOFF
    MSR             #0, c12, c0, #0, X8   ; VBAR_EL1 = KiArm64ExceptionVectors
    ISB
    RET
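The disassembler prints system registers as raw encodings (op1, CRn, CRm, op2), which makes the listings in this article hard to read at first. The Python sketch below is a small lookup helper for the encodings that show up here. It assumes op0 = 3 for all of them (the listings do not show op0), and the helper name `name_sysreg` is invented for this example. The TPIDR_EL1 and SPSel entries come from the ARM architecture's register encodings rather than from comments in the listings.

```python
import re

# (op1, CRn, CRm, op2) -> architectural register name, assuming op0 == 3.
SYSREGS = {
    (0, 4, 0, 0): "SPSR_EL1",
    (0, 4, 0, 1): "ELR_EL1",
    (0, 4, 1, 0): "SP_EL0",
    (0, 4, 2, 0): "SPSel",
    (0, 5, 2, 0): "ESR_EL1",
    (0, 12, 0, 0): "VBAR_EL1",
    (0, 13, 0, 4): "TPIDR_EL1",
}

# Matches the operand text as printed in the listings, e.g. "#0, c12, c0, #0".
_OPERAND = re.compile(r"#(\d+),\s*c(\d+),\s*c(\d+),\s*#(\d+)")


def name_sysreg(operand: str) -> str:
    """Translate an '#op1, cN, cM, #op2' operand into a register name."""
    match = _OPERAND.fullmatch(operand.strip())
    if match is None:
        raise ValueError(f"unrecognised operand: {operand!r}")
    op1, crn, crm, op2 = (int(group) for group in match.groups())
    # Fall back to the generic S3_<op1>_C<n>_C<m>_<op2> spelling when unknown.
    return SYSREGS.get((op1, crn, crm, op2), f"S3_{op1}_C{crn}_C{crm}_{op2}")


if __name__ == "__main__":
    for text in ("#0, c12, c0, #0", "#0, c5, c2, #0", "#0, c13, c0, #4"):
        print(f"{text:18} -> {name_sysreg(text)}")
```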

We now know that KiArm64ExceptionVectors is the base of the vector table. We show the first few handlers below. Note that the entries are 0x80 bytes apart.

.text:000000014002D800     MRS             X18, #0, c4, c1, #0
.text:000000014002D804     AND             SP, X18, #0xFFFFFFFFFFFFFFF0
.text:000000014002D808     SUB             SP, SP, #0x370
.text:000000014002D80C     MRS             X18, #0, c13, c0, #4
...
.text:000000014002D880 KiKernelSp0InterruptHandler
.text:000000014002D880     MRS             X18, #0, c4, c1, #0
.text:000000014002D884     AND             SP, X18, #0xFFFFFFFFFFFFFFF0
.text:000000014002D888     SUB             SP, SP, #0x370
...
.text:000000014002D900
.text:000000014002D900 KiKernelSp0FiqHandler
.text:000000014002D900     MRS             X18, #0, c4, c1, #0
.text:000000014002D904     AND             SP, X18, #0xFFFFFFFFFFFFFFF0
...
.text:000000014002D980
.text:000000014002D980 KiKernelSp0SystemErrorHandler
.text:000000014002D980     MRS             X18, #0, c4, c1, #0
.text:000000014002D984     AND             SP, X18, #0xFFFFFFFFFFFFFFF0
...
.text:000000014002DA00
.text:000000014002DA00 KiKernelExceptionHandler
.text:000000014002DA00     MRS             X18, #0, c13, c0, #4
.text:000000014002DA04     AND             X18, X18, #0xFFFFFFFFFFFFF000
.text:000000014002DA08     STP             X23, X30, [X18,#0x70]
...
.text:000000014002DC00
.text:000000014002DC00 KiUserExceptionHandler
.text:000000014002DC00     SUB             SP, SP, #0x370
.text:000000014002DC04     STP             X18, X30, [SP,#0x370+var_240]
.text:000000014002DC08     MRS             X18, #0, c5, c2, #0
.text:000000014002DC0C     LSR             X30, X18, #0x1A
.text:000000014002DC10     CMP             X30, #0x15

The layout looks something like this:

[Figure: vector_table]

As explained earlier, the handler for SVC from EL0 is at offset 0x400 from the VBAR_EL1 register and this corresponds to KiUserExceptionHandler:

KiUserExceptionHandler
    SUB             SP, SP, #0x370
    STP             X18, X30, [SP,#0x370+var_240]
    MRS             X18, #0, c5, c2, #0   ; read the ESR_EL1
    LSR             X30, X18, #0x1A       ; extract the EC bits
    CMP             X30, #0x15            ; check if it is an SVC64 exception
    B.NE            case_otherExceptions  ; if not, branch out
    SXTH            X18, W18
    CMN             X18, #1               ; do we need to transition to 32bit?
    B.EQ            KiEnter32BitMode      ; yes
    B               KiSystemServiceException  ; otherwise, handle the SVC exception
case_otherExceptions                    ; CODE XREF: KiUserExceptionHandler+14↑j
                                        ; KiEnter32BitMode+C↓j
    MRS             X18, #0, c13, c0, #4
    AND             X18, X18, #0xFFFFFFFFFFFFF000
    STP             X19, X20, [X18,#0x50]
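The branch logic at the top of KiUserExceptionHandler is compact, so here is a hedged Python model of it. The SXTH/CMN pair sign-extends the low 16 bits of the ESR and adds 1, so the B.EQ is taken only when the SVC immediate is 0xFFFF. The function names below are invented for this sketch; only the returned label names come from the listing.

```python
# Illustrative model of the first few instructions of KiUserExceptionHandler.

def sign_extend16(value: int) -> int:
    """Mimic SXTH: treat the low 16 bits as a signed halfword."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def classify_user_exception(esr: int) -> str:
    """Return the label the handler would branch to for a given ESR value."""
    ec = (esr >> 26) & 0x3F            # LSR X30, X18, #0x1A
    if ec != 0x15:                     # CMP X30, #0x15 / B.NE
        return "case_otherExceptions"
    imm = sign_extend16(esr)           # SXTH X18, W18
    if imm + 1 == 0:                   # CMN X18, #1 / B.EQ
        return "KiEnter32BitMode"
    return "KiSystemServiceException"  # plain B


if __name__ == "__main__":
    # Hypothetical ESR values: SVC 7, SVC 0xFFFF, and a BRK.
    for esr in (0x56000007, 0x5600FFFF, 0xF2000000):
        print(f"{esr:#010x} -> {classify_user_exception(esr)}")
```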

KiSystemServiceException is where system call dispatching starts. It saves the syscall number, return address and a few other values and then calls KiSystemService:

KiSystemServiceException                ; CODE XREF: KiUserExceptionHandler+24↑j
    MRS             X18, #0, c13, c0, #4
    AND             X18, X18, #0xFFFFFFFFFFFFF000
    STP             X0, X1, [SP,#arg_A0]
    STP             X2, X3, [SP,#arg_B0]
    STP             X4, X5, [SP,#arg_C0]
    STP             X6, X7, [SP,#arg_D0]
    MOV             X0, #2
    MRS             X1, #0, c4, c0, #1  ; ELR_EL1
    MRS             X2, #0, c4, c0, #0  ; SPSR_EL1
    MRS             X3, #0, c5, c2, #0  ; ESR_EL1
    MRS             X4, #0, c4, c1, #0  ; SP_EL0
    MOV             X5, SP
    MSR             #0, c4, c1, #0, X5
    MSR             #5, #0
    STR             W0, [SP,#arg_0]
    STR             XZR, [SP,#arg_10]
    MRS             X0, #0, c0, c2, #2
    TBNZ            W0, #0, loc_14002F108
loc_14002EFC8                           ; CODE XREF: KiSystemServiceExit+110↓j
    BFI             X2, X3, #0x20, #0x20 ; ' '
    STP             X2, X4, [SP,#arg_90] ; save the ESR which has the immediate (syscall #)
    STP             X29, X1, [SP,#arg_140]
    ADD             X29, SP, #arg_140
    LDR             X0, [X18,#_KPCR.Prcb.CurrentThread]
    LDRB            W1, [X0,#_KTHREAD.Header.___u0.__s7.DebugActive]
    CBNZ            W1, loc_14002F130
loc_14002EFE4                           ; CODE XREF: KiSystemServiceExit+130↓j
                                        ; KiSystemServiceExit+13C↓j
    MOV             X0, SP
    BL              KiLogTrapFrame
    LDR             X15, [X18,#_KPCR.Prcb.CurrentThread]
    MSR             #7, #0xA
    LDR             X0, [SP,#arg_A0]
    LDRH            W16, [SP,#arg_90+4]
    STP             X16, X0, [X15,#_ETHREAD.Tcb.SystemCallNumber]
    BL              KiSystemService
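One detail that is easy to miss is how the syscall number ends up in W16. The BFI packs the ESR into the upper 32 bits of X2 (which holds the SPSR), the STP writes that 64-bit value to the trap frame, and the LDRH at offset +4 reads back the low halfword of the ESR, which is the SVC immediate. The Python sketch below models this under the assumption of little-endian memory; the helper names are made up for illustration.

```python
import struct

MASK64 = 0xFFFFFFFFFFFFFFFF


def bfi(dst: int, src: int, lsb: int, width: int) -> int:
    """Mimic BFI: copy the low `width` bits of src into dst at bit `lsb`."""
    mask = ((1 << width) - 1) << lsb
    return ((dst & ~mask) | ((src << lsb) & mask)) & MASK64


def saved_syscall_number(spsr: int, esr: int, sp_el0: int) -> int:
    """Model how the listing recovers the SVC immediate from the trap frame."""
    x2 = bfi(spsr, esr, 32, 32)                  # BFI  X2, X3, #0x20, #0x20
    slot = struct.pack("<QQ", x2, sp_el0)        # STP  X2, X4, [SP,#arg_90]
    (w16,) = struct.unpack_from("<H", slot, 4)   # LDRH W16, [SP,#arg_90+4]
    return w16


if __name__ == "__main__":
    # Hypothetical register values: the ESR describes SVC 7.
    print(saved_syscall_number(spsr=0x60000000, esr=0x56000007, sp_el0=0x7FFE0000))
```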

KiSystemService extracts the syscall number from the saved ESR value (passed in X16) and eventually invokes the syscall. The syscall table structure is the same as on x64, so we will not discuss it here.

KiSystemService                         ; CODE XREF: KiServiceInternal+5C↑p
    SUB             SP, SP, #0x60
    STP             X29, X30, [SP,#0x50+var_s0]
    ADD             X29, SP, #0x50
    ADD             X14, SP, #0x50+arg_0
    STR             X14, [X15,#_ETHREAD.Tcb.TrapFrame]
    UBFX            X9, X16, #0xC, #1
    AND             X8, X16, #0xFFF     ; X8 = syscall#
loc_14002F1DC                           ; CODE XREF: KiSystemServiceCopyEnd+50↓j
    LDR             W10, [X15,#0x70]
    ADRP            X11, #KeServiceDescriptorTable@PAGE
    ADD             X11, X11, #KeServiceDescriptorTable@PAGEOFF
    ADRP            X12, #KeServiceDescriptorTableShadow@PAGE
    ADD             X12, X12, #KeServiceDescriptorTableShadow@PAGEOFF
    TST             X10, #0x80
    B.EQ            not_GUIthread
    TST             X10, #0x200000
    B.EQ            not_GUI_restricted
    ADRP            X12, #KeServiceDescriptorTableFilter@PAGE
    ADD             X12, X12, #KeServiceDescriptorTableFilter@PAGEOFF
...
    LDRSW           X10, [X11,X8,LSL#2]  ; extract the encoded syscall offset
    ADD             X11, X11, X10,ASR#4  ; calculate the final syscall target
...
    BLR             X11 ; issue the syscall
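For readers who do not have the x64 scheme fresh in mind, the Python sketch below models the two decoding steps visible in the listing: splitting the value in X16 into a table selector (bit 12) and a service index (low 12 bits), and turning a 32-bit table entry into a target address by adding the entry, arithmetically shifted right by 4, to the table base. It assumes X11 holds the service table base at the LDRSW, which the elided code does not show. The function names and the sample addresses are invented; the low four bits of each entry are simply shifted out here (on x64 they are commonly described as an argument count).

```python
MASK64 = 0xFFFFFFFFFFFFFFFF


def split_service_number(x16: int) -> tuple:
    """Return (table selector, service index) as computed from X16."""
    table_index = (x16 >> 12) & 0x1   # UBFX X9, X16, #0xC, #1
    service_index = x16 & 0xFFF       # AND  X8, X16, #0xFFF
    return table_index, service_index


def to_signed32(value: int) -> int:
    """Mimic LDRSW: interpret a 32-bit word as a signed integer."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value


def resolve_target(table_base: int, entries: list, service_index: int) -> int:
    """Compute the address the final BLR would jump to."""
    entry = to_signed32(entries[service_index])   # LDRSW X10, [X11,X8,LSL#2]
    return (table_base + (entry >> 4)) & MASK64   # ADD   X11, X11, X10,ASR#4


if __name__ == "__main__":
    # Hypothetical table with two entries; base address is made up.
    base = 0x140100000
    table = [0x00001230, 0xFFFFFFF0]
    _, index = split_service_number(0x0001)
    print(hex(resolve_target(base, table, index)))
```

Python's `>>` on a negative integer is an arithmetic shift, which matches the ASR in the listing.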

We can summarize the whole syscall flow as follows:

[Figure: syscall_flow]

And that’s a brief look at syscall dispatching for Windows on ARM64.

References

[1] ARM 2017. ARM Architecture Reference Manual: ARMv8, for ARMv8-A architecture profile (ARM DDI 0487C.A (ID121917)).
[2] Kishan, A. and Pulapaka, H. 2017. Windows 10 on ARM.
[3] wup and suezi. 2018. https://www.iceswordlab.com/2018/07/25/kdhack/.