ARMv8 has four exception levels:
EL0 -- user applications
EL1 -- OS kernel
EL2 -- Hypervisor for virtualization platforms
EL3 -- Secure Monitor firmware
Moving from one exception level to another, anywhere between EL0 and EL3, is achieved by raising exceptions. An exception is raised at one level and handled at a higher one.
A system call is a synchronous exception from user space (EL0) to the kernel (EL1), raised with the svc (supervisor call) instruction. Thus an application running on Linux should issue svc with the registers set to appropriate values. To learn what those values are, let's see how the kernel handles svc.
Kernel
Exceptions taken to EL1 are dispatched through a vector table whose base address is held in VBAR_EL1 [Vector Base Address Register for EL1]. The table is laid out as follows:
Offset from VBAR_EL1 | Exception type | Exception taken from |
---|---|---|
+0x000 | Synchronous | Current EL with SP0 |
+0x080 | IRQ/vIRQ | Current EL with SP0 |
+0x100 | FIQ/vFIQ | Current EL with SP0 |
+0x180 | SError/vSError | Current EL with SP0 |
+0x200 | Synchronous | Current EL with SPx |
+0x280 | IRQ/vIRQ | Current EL with SPx |
+0x300 | FIQ/vFIQ | Current EL with SPx |
+0x380 | SError/vSError | Current EL with SPx |
+0x400 | Synchronous | Lower EL using AArch64 |
+0x480 | IRQ/vIRQ | Lower EL using AArch64 |
+0x500 | FIQ/vFIQ | Lower EL using AArch64 |
+0x580 | SError/vSError | Lower EL using AArch64 |
+0x600 | Synchronous | Lower EL using AArch32 |
+0x680 | IRQ/vIRQ | Lower EL using AArch32 |
+0x700 | FIQ/vFIQ | Lower EL using AArch32 |
+0x780 | SError/vSError | Lower EL using AArch32 |
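The regular layout of the table can be captured in a few lines of C. The following is only a sketch: the enum and function names are made up for illustration and do not appear in the kernel. It assumes nothing beyond what the table shows, namely four groups of four entries, each entry being 32 instructions of 4 bytes.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for the four groups of rows in the table. */
enum vec_source {
	CUR_EL_SP0 = 0,       /* current EL, using SP0 */
	CUR_EL_SPX = 1,       /* current EL, using SPx */
	LOWER_EL_AARCH64 = 2, /* lower EL, running AArch64 */
	LOWER_EL_AARCH32 = 3, /* lower EL, running AArch32 */
};

/* Hypothetical names for the four exception types within a group. */
enum vec_type {
	VEC_SYNC = 0,
	VEC_IRQ = 1,
	VEC_FIQ = 2,
	VEC_SERROR = 3,
};

enum {
	INSN_BYTES = 4,                         /* one A64 instruction */
	ENTRY_INSNS = 32,                       /* instructions per entry */
	ENTRY_BYTES = INSN_BYTES * ENTRY_INSNS, /* 0x80 */
	GROUP_BYTES = 4 * ENTRY_BYTES,          /* 0x200 */
};

/* Byte offset of a vector entry from the table base (VBAR_EL1). */
static uint32_t vector_offset(enum vec_source src, enum vec_type type)
{
	assert(src <= LOWER_EL_AARCH32 && type <= VEC_SERROR);
	return (uint32_t)src * GROUP_BYTES + (uint32_t)type * ENTRY_BYTES;
}
```

With these definitions, `vector_offset(LOWER_EL_AARCH64, VEC_SYNC)` evaluates to 0x400, the row that matters for a 64-bit system call.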
Linux defines the vector table at arch/arm64/kernel/entry.S + 493. Each kernel_ventry is 32 instructions long. As an instruction in ARMv8 is 4 bytes long, the next kernel_ventry starts at +0x80 from the current kernel_ventry.
The kernel loads the vector table address into VBAR_EL1 at arch/arm64/kernel/head.S +429:
adr_l   x8, vectors         // load VBAR_EL1 with virtual
msr     vbar_el1, x8        // vector table address
isb                         // instruction synchronization barrier
VBAR_EL1 is a system register, so it cannot be accessed directly. The special system instructions msr and mrs must be used to manipulate system registers.
Instruction | Description |
---|---|
adr_l x8, vectors | loads the address of the vector table into general-purpose register X8 |
msr vbar_el1, x8 | moves the value in X8 to system register VBAR_EL1 |
isb | instruction synchronization barrier |
System call flow in Kernel
An application enters the kernel with svc. From the table above, we can see that for an AArch64 synchronous exception from a lower level the offset is +0x400. In the Linux vector definition, the entry at VBAR_EL1+0x400 is el0t_64_sync. It calls el0t_64_sync_handler, defined at arch/arm64/kernel/entry-common.c + 615:
asmlinkage void noinstr el0t_64_sync_handler(struct pt_regs *regs)
{
	unsigned long esr = read_sysreg(esr_el1);	/* read the syndrome register */
	/* ... */
}
A synchronous exception can have multiple causes; the cause is recorded in the syndrome register esr_el1. The handler compares the exception class held in the syndrome register against predefined macros and branches to the corresponding subroutine.
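The comparison against the syndrome register can be sketched in user-space C. The DEMO_ names below are invented for this example. The field position (bits [31:26]) and the value 0x15 for an SVC from AArch64 follow the Arm architecture's ESR_ELx encoding as I understand it, so treat them as assumptions to check against the architecture manual rather than as the kernel's own definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed ESR_ELx layout: exception class (EC) in bits [31:26]. */
#define DEMO_ESR_EC_SHIFT 26u
#define DEMO_ESR_EC_MASK  0x3Fu
/* Assumed EC value for an SVC instruction executed in AArch64 state. */
#define DEMO_ESR_EC_SVC64 0x15u

static_assert((DEMO_ESR_EC_MASK << DEMO_ESR_EC_SHIFT) == 0xFC000000u,
	      "EC must cover bits [31:26]");

/* Extract the exception class from a raw syndrome value. */
static unsigned int demo_esr_ec(uint64_t esr)
{
	return (unsigned int)((esr >> DEMO_ESR_EC_SHIFT) & DEMO_ESR_EC_MASK);
}

/* Non-zero when the syndrome value describes a 64-bit supervisor call. */
static int demo_is_svc64(uint64_t esr)
{
	return demo_esr_ec(esr) == DEMO_ESR_EC_SVC64;
}
```

For instance, a syndrome value of 0x56000000 has 0x15 in its top six bits, so `demo_is_svc64` accepts it, while a value whose exception class differs is rejected.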
In the system call case, control branches to el0_svc, which calls do_el0_svc. They are defined at arch/arm64/kernel/entry-common.c +599 and arch/arm64/kernel/syscall.c +178 as follows:
/*
 * SVC handler.
 */
static void noinstr el0_svc(struct pt_regs *regs)
{
	enter_from_user_mode(regs);
	cortex_a76_erratum_1463225_svc_handler();
	do_el0_svc(regs);
	exit_to_user_mode(regs);
}

void do_el0_svc(struct pt_regs *regs)
{
	sve_user_discard();
	el0_svc_common(regs, regs->regs[8], __NR_syscalls, sys_call_table);
}

static void el0_svc_common(struct pt_regs *regs, int scno, int sc_nr,
			   const syscall_fn_t syscall_table[])
{
	invoke_syscall(regs, scno, sc_nr, syscall_table);	/* the system call is invoked here */
}
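The listing above takes the system call number from regs->regs[8] and hands it to invoke_syscall together with the table and its size. What follows is a user-space model of that dispatch, not the kernel's code. All disp_ names are hypothetical, and the bounds check with a fallback to a "not implemented" handler is my assumption about what invoke_syscall does with an out-of-range number.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct pt_regs: just the general registers. */
struct disp_regs { unsigned long regs[31]; };
typedef long (*disp_fn)(const struct disp_regs *);

/* Fallback for unknown or unimplemented system call numbers. */
static long disp_ni_syscall(const struct disp_regs *regs)
{
	(void)regs;
	return -ENOSYS;
}

/* Toy system call: returns its first argument (register x0). */
static long disp_echo_x0(const struct disp_regs *regs)
{
	return (long)regs->regs[0];
}

/* Toy table with four slots; only slot 2 is implemented. */
static const disp_fn disp_table[4] = {
	disp_ni_syscall, disp_ni_syscall, disp_echo_x0, disp_ni_syscall,
};

/* Look up scno in the table, guarding against out-of-range numbers. */
static long disp_invoke(const struct disp_regs *regs, unsigned int scno,
			unsigned int sc_nr, const disp_fn table[])
{
	assert(regs != NULL && table != NULL);
	if (scno < sc_nr)
		return table[scno](regs);
	return disp_ni_syscall(regs);
}

/* Mirrors the shape of do_el0_svc: the number comes from register x8. */
static long disp_el0_svc(const struct disp_regs *regs)
{
	return disp_invoke(regs, (unsigned int)regs->regs[8], 4, disp_table);
}
```

Setting regs[8] to 2 and regs[0] to 42 makes `disp_el0_svc` return 42; any other number in this toy table yields -ENOSYS.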
sys_call_table
sys_call_table is defined at arch/arm64/kernel/sys.c +58.
#undef __SYSCALL
#define __SYSCALL(nr, sym)	[nr] = sym,
/*
 * The sys_call_table array must be 4K aligned to be accessible from
 * kernel/entry.S.
 */
void * const sys_call_table[__NR_syscalls] __aligned(4096) = {
	[0 ... __NR_syscalls - 1] = sys_ni_syscall,
#include <asm/unistd.h>
};
- __NR_syscalls defines the number of system calls. This varies from architecture to architecture.
- Initially every system call number is set to sys_ni_syscall, the "not implemented" system call. If a system call is removed, its number is not reused; instead the slot is assigned the sys_ni_syscall function.
- The include chain goes like this: arch/arm64/include/asm/unistd.h -> arch/arm64/include/uapi/asm/unistd.h -> include/asm-generic/unistd.h -> include/uapi/asm-generic/unistd.h. The last file has the definitions of all system calls. For example, the write system call is defined there as:
#define __NR_write 64
__SYSCALL(__NR_write, sys_write)
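The way the table is filled, first a default for every slot and then one override per __SYSCALL line, can be reproduced in a small stand-alone program. This is a sketch with invented tbl_ and TBL_ names. A macro list stands in for the included unistd.h header, the slot numbers 64 and 93 are only sample values, and the `[first ... last]` range designator is a GNU C extension, so a GCC- or Clang-compatible compiler is assumed.

```c
#include <assert.h>
#include <errno.h>

#define TBL_NR_SYSCALLS 128

typedef long (*tbl_fn)(void);

/* Default entry: "not implemented". */
static long tbl_ni_syscall(void) { return -ENOSYS; }

/* Two toy handlers that simply report their own slot number. */
static long tbl_write(void) { return 64; }
static long tbl_exit(void)  { return 93; }

/* Stand-in for the header that lists every (number, handler) pair. */
#define TBL_SYSCALL_LIST          \
	TBL_SYSCALL(64, tbl_write) \
	TBL_SYSCALL(93, tbl_exit)

/* Each list line becomes a designated initializer for one slot. */
#define TBL_SYSCALL(nr, sym) [nr] = sym,

static const tbl_fn tbl_table[TBL_NR_SYSCALLS] = {
	/* Fill every slot with the default first (GNU range designator). */
	[0 ... TBL_NR_SYSCALLS - 1] = tbl_ni_syscall,
	/* Later initializers override the default for implemented slots. */
	TBL_SYSCALL_LIST
};

#undef TBL_SYSCALL
```

Slots 64 and 93 now point at their handlers while every other slot still returns -ENOSYS. Some compilers warn about the overridden initializers, which is expected with this pattern.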
sys_call_table is an array of function pointers. As a function pointer is 8 bytes long on ARM64, the address of the table entry for a system call is calculated by shifting the system call number scno left by 3 and adding the result to the system call table address.
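The shift-by-3 arithmetic is the same thing as ordinary array indexing, which a short check makes concrete. The names below are hypothetical, and the sketch assumes a platform with 8-byte function pointers, enforced by the static assertion.

```c
#include <assert.h>
#include <stdint.h>

typedef long (*slot_fn)(void);

static_assert(sizeof(slot_fn) == 8, "sketch assumes 8-byte function pointers");

/* Address of the table slot for scno: base + scno * 8, written as a shift. */
static uintptr_t slot_address(uintptr_t table_base, unsigned int scno)
{
	return table_base + ((uintptr_t)scno << 3);
}
```

For a table at address 0x1000, system call number 64 lands at 0x1000 + (64 << 3) = 0x1200, the same address that `&table[64]` would give.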
Each system call is defined with a SYSCALL_DEFINEn macro, where n corresponds to the number of arguments the system call accepts. For example, write is implemented at fs/read_write.c +652:
SYSCALL_DEFINE3(write, unsigned int, fd, const char __user *, buf,
This macro expands into the sys_write function definition and other alias functions, as mentioned in this LWN article. The expanded function carries the asmlinkage directive. On 32-bit x86 it instructs the compiler to look for arguments on the CPU stack instead of in registers; on ARM64 it does not change how arguments are passed. It is there so that system calls can be implemented in an architecture-independent way. On entry, the kernel_entry macro in el0_sync pushes all general-purpose registers onto the stack. In the ARM64 case, registers X0 to X5 hold the system call arguments and X8 holds the system call number.
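To get a feel for what such a macro does, here is a heavily simplified imitation. It is not the kernel's SYSCALL_DEFINE3: the DEMO_ macro, struct demo_pt_regs and the add3 call are all invented for this sketch. It only assumes the general idea that the generated wrapper receives the saved registers and passes the first three of them, cast to the declared types, to the function body.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the saved user registers. */
struct demo_pt_regs { unsigned long regs[31]; };

/*
 * Declares an inner function with typed arguments, emits a wrapper that
 * unpacks x0..x2 from the saved registers, then opens the definition of
 * the inner function so the body can follow the macro invocation.
 */
#define DEMO_SYSCALL_DEFINE3(name, t1, a1, t2, a2, t3, a3)        \
	static long do_##name(t1 a1, t2 a2, t3 a3);                \
	long demo_sys_##name(const struct demo_pt_regs *r)         \
	{                                                          \
		assert(r != NULL);                                 \
		return do_##name((t1)r->regs[0], (t2)r->regs[1],   \
				 (t3)r->regs[2]);                   \
	}                                                          \
	static long do_##name(t1 a1, t2 a2, t3 a3)

/* A toy three-argument "system call" that adds its arguments. */
DEMO_SYSCALL_DEFINE3(add3, long, a, long, b, long, c)
{
	return a + b + c;
}
```

Calling `demo_sys_add3` with registers x0, x1 and x2 set to 1, 2 and 3 returns 6. The body is written with named, typed parameters even though the entry point only ever sees a register dump.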
Application Flow
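From the application side, the kernel code walked through earlier implies a simple contract: put the system call number in X8, the arguments in X0 onward, and execute svc. The function below is a minimal sketch of that for write, assuming a Linux target and a GCC- or Clang-compatible compiler. The name raw_write is made up, and on non-AArch64 builds it falls back to the C library's syscall() wrapper so the example still compiles.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Issue the write system call directly.
 * On AArch64 the raw svc path returns the byte count, or a negative errno
 * value on failure. The syscall() fallback returns -1 and sets errno.
 */
static long raw_write(int fd, const void *buf, unsigned long len)
{
	assert(buf != NULL);
#if defined(__aarch64__)
	register long x8 __asm__("x8") = SYS_write;  /* system call number */
	register long x0 __asm__("x0") = fd;         /* first argument */
	register long x1 __asm__("x1") = (long)buf;  /* second argument */
	register long x2 __asm__("x2") = (long)len;  /* third argument */

	__asm__ volatile("svc #0"
			 : "+r"(x0)                   /* result comes back in x0 */
			 : "r"(x8), "r"(x1), "r"(x2)
			 : "memory", "cc");
	return x0;
#else
	return syscall(SYS_write, fd, buf, len);
#endif
}
```

Calling `raw_write(1, "hi\n", 3)` writes three bytes to standard output and, on success, returns 3.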
- http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.den0024a/ch09s01s01.html
- http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.den0024a/CHDEEDDC.html
- https://lwn.net/Articles/604287/
- https://courses.cs.washington.edu/courses/cse469/18wi/Materials/arm64.pd
- http://www.osteras.info/personal/2013/10/11/hello-world-analysis.html
- And the Linux kernel source. Thanks to https://elixir.bootlin.com
- https://eastrivervillage.com/Anatomy-of-Linux-system-call-in-ARM64/