Tuesday 23 August 2016

Linux Boot process on ARM CPU

 Linux Boot sequence on ARM CPU

Bootloader preparations

Before jumping to the kernel entry point, the boot loader must do at least the following:
1. Set up and initialise the RAM.
2. Initialise one serial port.
3. Detect the machine type.
4. Set up the kernel tagged list.
5. Call the kernel image with the following CPU register settings:
  r0 = 0
  r1 = machine type number discovered in (3) above
  r2 = physical address of tagged list in system RAM, or
       physical address of device tree block (dtb) in system RAM
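
A minimal sketch of steps 4 and 5, assuming the legacy tagged list (ATAG) format rather than a device tree. The tag IDs and layouts follow the kernel's ATAG definitions; build_atags() and kernel_entry_t are hypothetical names used only for illustration, and the page size is a placeholder value.

```c
#include <stddef.h>
#include <stdint.h>

#define ATAG_NONE 0x00000000u
#define ATAG_CORE 0x54410001u
#define ATAG_MEM  0x54410002u

/* Kernel entry as seen from the boot loader: under the ARM procedure call
 * standard the three arguments land in r0, r1 and r2.
 * (Hypothetical typedef, shown for illustration only.) */
typedef void (*kernel_entry_t)(uint32_t zero, uint32_t machine_type,
                               uint32_t params_phys);

/* Write a minimal tagged list (CORE, MEM, NONE) into buf.
 * Each tag starts with its size in 32-bit words, followed by its ID.
 * Returns the number of words written; buf must hold at least 11 words. */
size_t build_atags(uint32_t *buf, uint32_t ram_start, uint32_t ram_size)
{
    size_t i = 0;

    buf[i++] = 5;          /* ATAG_CORE: 2 header words + 3 payload words */
    buf[i++] = ATAG_CORE;
    buf[i++] = 0;          /* flags */
    buf[i++] = 4096;       /* page size (placeholder) */
    buf[i++] = 0;          /* root device */

    buf[i++] = 4;          /* ATAG_MEM: 2 header words + 2 payload words */
    buf[i++] = ATAG_MEM;
    buf[i++] = ram_size;
    buf[i++] = ram_start;

    buf[i++] = 0;          /* ATAG_NONE terminates the list with size 0 */
    buf[i++] = ATAG_NONE;

    return i;
}
```

On real hardware the boot loader would then jump to the kernel through a pointer of a type like kernel_entry_t, passing 0, the machine type number and the physical address of the list.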

Low level kernel init

Kernel entry point is arch/arm/kernel/head.S:stext
At this point the kernel saves the values passed by the boot loader, checks that the CPU is in the correct state, and enables low-level debug output if it is configured. It then prepares the MMU and enables it. The call trace is below.
1) ./arch/arm/kernel/head.S:stext()
// For an early printk example, see the trace below. For stages that run without the MMU, just set omap_uart_phys to the UART physical address defined in the U-Boot config file.
   ./arch/arm/kernel/head-common.S:__error_p() ->  printascii() -> addruart_current() ->
        arch/arm/mach-omap2/include/mach/debug-macro.S:addruart()
2) ./arch/arm/kernel/head.S:__enable_mmu()
3) ./arch/arm/kernel/head.S:__turn_mmu_on()
4) ./arch/arm/kernel/head-common.S:__mmap_switched()
5) init/main.c:start_kernel()
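
One of the checks on the values saved from the boot loader is whether r2 points at something usable. The sketch below models such a check in C, assuming these rules: the pointer must be 4-byte aligned, and the data must start either with the flattened device tree magic or with an ATAG_CORE tag. The kernel performs a check of this kind in assembly; classify_boot_params() and the enum are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

#define OF_DT_MAGIC_BE       0xd00dfeedu  /* FDT header magic, stored big-endian */
#define ATAG_CORE            0x54410001u
#define ATAG_CORE_SIZE       5u           /* size word of a full ATAG_CORE */
#define ATAG_CORE_SIZE_EMPTY 2u           /* size word of an empty ATAG_CORE */

enum boot_params { BOOT_PARAMS_INVALID, BOOT_PARAMS_ATAGS, BOOT_PARAMS_DTB };

/* Read a 32-bit big-endian value byte by byte, independent of host endianness. */
static uint32_t read_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Decide what the pointer handed over in r2 refers to. */
enum boot_params classify_boot_params(const void *r2)
{
    const uint32_t *words = r2;

    /* Reject NULL and pointers that are not word aligned. */
    if (r2 == NULL || ((uintptr_t)r2 & 3u) != 0)
        return BOOT_PARAMS_INVALID;

    /* A device tree block starts with the FDT magic number. */
    if (read_be32((const uint8_t *)r2) == OF_DT_MAGIC_BE)
        return BOOT_PARAMS_DTB;

    /* A tagged list starts with ATAG_CORE, either full or empty. */
    if ((words[0] == ATAG_CORE_SIZE || words[0] == ATAG_CORE_SIZE_EMPTY) &&
        words[1] == ATAG_CORE)
        return BOOT_PARAMS_ATAGS;

    return BOOT_PARAMS_INVALID;
}
```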

Low level debug

To enable console output as early as possible, CONFIG_EARLY_PRINTK should be enabled. On ARM this option depends on CONFIG_DEBUG_LL, and the earlyprintk parameter must also be passed on the kernel command line.
Then the correct physical and virtual UART addresses must be set up (for an example see [2]). At this stage the kernel writes output characters directly to the UART registers instead of going through the regular kernel log path.
If the UART addresses are set up properly, you can call assembler routines such as ./arch/arm/kernel/debug.S:printascii() from assembler code. From C code you can use the early_printk() routine, which uses the same assembler code to send output characters to the console.
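
The idea of writing characters straight to the UART can be sketched in C as below. This is a model under stated assumptions, not the kernel's assembler code: it assumes an 8250-compatible register layout with registers spaced 4 bytes apart, and early_putc() and early_puts() are hypothetical names. On a host machine the "registers" can simply be an array, which is how the sketch can be exercised.

```c
#include <stdint.h>

#define UART_THR      0      /* transmit holding register (register index) */
#define UART_LSR      5      /* line status register (register index) */
#define UART_LSR_THRE 0x20u  /* LSR bit: transmit holding register empty */

/* Send one character by polling: wait until the transmitter can accept a
 * byte, then write it. 'base' points at 32-bit-spaced UART registers. */
void early_putc(volatile uint32_t *base, char c)
{
    while ((base[UART_LSR] & UART_LSR_THRE) == 0)
        ;   /* busy-wait: no interrupts or scheduler are available yet */
    base[UART_THR] = (uint8_t)c;
}

/* Send a NUL-terminated string, emitting '\r' before each '\n'.
 * Returns the number of bytes written to the transmit register. */
unsigned early_puts(volatile uint32_t *base, const char *s)
{
    unsigned written = 0;

    for (; *s != '\0'; s++) {
        if (*s == '\n') {
            early_putc(base, '\r');
            written++;
        }
        early_putc(base, *s);
        written++;
    }
    return written;
}
```

Before the MMU is on, base would be the physical UART address; after it is on, the virtual one. That is why both addresses have to be configured.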

Single thread kernel initialization

The code in init/main.c:start_kernel() does the following:
1) Obtain CPU id
2) Initialize runtime locking correctness validator
3) Initialize object tracker (initialize the hash buckets and link the static object pool objects into the pool list)
4) Set up the initial canary (GCC stack protector support. The stack protector works by putting a predefined pattern at the start of the stack frame and verifying that it hasn't been overwritten when returning from the function. The pattern is called the stack canary, and gcc expects it to be defined by a global variable called "__stack_chk_guard" on ARM. This unfortunately means that on SMP we cannot have a different canary value per task.)
5) Initialize cgroups at system boot, and initialize any subsystems that request early init. (cgroups (control groups) is a Linux kernel feature to limit, account and isolate resource usage (CPU, memory, disk I/O, etc.) of process groups.)
==== Disable IRQ ====
6) initialize the tick control (Register the notifier with the clockevents framework)
7) Activate the first processor.
8) Initialize page address pool
9) Setup architecture
   a) Setup CPU configuration and CPU initialization
   b) Setup machine device tree (tags)
   c) Parse early parameters
   d) Initialize mem blocks
   e) Set up the page tables, initialise the zone memory maps, and set up the zero page, bad page and bad page tables
   f) Unflatten device tree
   g) Store callbacks from machine description
   h) Init other CPUs if necessary
   i) reserves memory area given in "crashkernel=" kernel command line parameter. The memory reserved is used by a dump capture kernel when primary kernel is crashing.
   j) Initialize TCM memory (Tightly-coupled Memory, memory which resides directly on the processor of a computer)
   k) Early trap initialization
   l) Call machine early_init routine (if exists)
10) Setup init mm owner and cpumask
11) Store command line (We need to store the untouched command line for future reference. We also need to store the touched command line since the parameter  parsing is performed in place, and we should allow a component to store reference of name/value for future reference.)
12) Save number of CPU IDs
13) SMP percpu area setup
14) Run arch-specific boot CPU hooks (SMP-related?)
15) Build zone lists
16) Initialize page allocation
17) Parse early parameters (earlycon, console)
18) Parse other parameters
19) Initialize jump_labels
20) Setup log buffer
21) Initialize pid hash table
22) early initialization of vfs caches
23) Sort the kernel's built-in exception table
24) Trap initialization (not implemented for ARM)
25) Set up kernel memory allocators
26) Set up the scheduler prior to starting any interrupts (such as the timer interrupt). Full topology setup happens at smp_init() time, but meanwhile we still have a functioning scheduler.
==== Disable preemption ====
27) initialize idr cache (Small id to pointer translation service.)
28) Initialize performance events core
29) Initialize RCU (Read-Copy Update mechanism for mutual exclusion)
30) Initialize radix tree
31) Early IRQ init (init some links before init_ISA_irqs())
32) IRQ init
33) Initialize priority search tree (A clever mix of heap and radix trees forms a radix priority search tree which is useful for storing intervals.)
34) Init timers
35) Init HR timers
36) Init soft IRQ
37) Initialize the clocksource and common timekeeping values
38) Set machine timer as a system one and initialize it
39) Initialize simple kernel profiler
40) Register CPUs going up/down notifiers
====Enable IRQ====
41) Late initialization of the kmem cache
42) Initialize console
43) Panic here if an earlier stage requested it (the panic is deferred until the console is available)
44) Run lock dependency validator
45) Run locking API test suite
46) Check initrd was not overwritten (if needed)
47) Initialize page cgroup
48) Enable debug page allocation
49) Initialize debug memory objects (Called after the kmem_caches are functional to set up a dedicated cache pool, which has the SLAB_DEBUG_OBJECTS flag set. This flag prevents the debug code from being called on kmem_cache_free() for the debug tracker objects, to avoid recursive calls.)
50) Initialize kmemleak (Kmemleak provides a way of detecting possible kernel memory leaks in a way similar to a tracing garbage collector, with the difference that the orphan objects are not freed but only reported via /sys/kernel/debug/kmemleak. A similar method is used by the Valgrind tool (memcheck --leak-check) to detect memory leaks in user-space applications.)
51) Allocate per cpu pagesets and initialize them.
52) Numa policy initialization (Non Uniform Memory Access policy)
53) Run late time init if provided (Machine specific ?)
54) Initialize schedule clock
55) Calibrating delay loop
56) Initialize pid map
57) anon_vma_init (?)
58) initialise the credentials stuff
59) Initialize fork (Allocate space for task structures )
60) Prepare proc caches (allocate memory for fork)
61) Allocate kernel buffer
62) Initialize the key management state.
63) Initialize security framework
64) Late kgdb initialization
65) Initialize VFS caches
66) Initialize signals
67) Initialize page write-back
68) Initialize proc FS (if enabled)
69) Initialize cgroups (Register cgroup filesystem and /proc file, and initialize any subsystems that didn't request early init.)
70) Initialize top_cpuset and the cpuset internal file system
71) Early initialization of taskstats (Export per-task statistics to userland)
72) Initialize per-task delay accounting
73) Check write buffer bugs
74) Early initialization of ACPI
75) Late initialization of SFI (Simple Firmware Interface)
76) Ftrace initialization
77) Do the rest non-__init'ed, we're now alive (Create bunch of threads and call schedule to get things moving) Code in init/main.c:rest_init() routine.
   a) Create kernel init thread.
   b) Create kthreadd thread
   c) Prepare scheduler
==== Enable preemption ====
   d) call schedule()
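
Step 11 above keeps two copies of the command line because parsing happens in place. The sketch below illustrates why, under simplified assumptions: parameters are separated by single spaces, and the parser overwrites separators with NUL bytes so that names and values point into the buffer itself. keep_two_copies() and parse_in_place() are hypothetical names, not the kernel's own routines.

```c
#include <stddef.h>
#include <string.h>

/* Copy the boot command line twice: 'saved' stays untouched for later
 * reference, 'work' is the copy that the parser is allowed to modify. */
void keep_two_copies(const char *boot, char *saved, char *work, size_t size)
{
    if (size == 0)
        return;
    strncpy(saved, boot, size - 1);
    saved[size - 1] = '\0';
    strncpy(work, boot, size - 1);
    work[size - 1] = '\0';
}

/* Split "name=value name2 ..." in place. Spaces and the first '=' of each
 * parameter are overwritten with NULs, so names[i] and values[i] point
 * into buf. values[i] is NULL for parameters without '='.
 * Returns the number of parameters found (at most max). */
size_t parse_in_place(char *buf, const char **names, const char **values,
                      size_t max)
{
    size_t n = 0;
    char *p = buf;

    while (*p != '\0' && n < max) {
        while (*p == ' ')
            p++;                      /* skip leading spaces */
        if (*p == '\0')
            break;
        names[n] = p;
        values[n] = NULL;
        while (*p != '\0' && *p != ' ') {
            if (*p == '=' && values[n] == NULL) {
                *p = '\0';            /* terminate the name */
                values[n] = p + 1;
            }
            p++;
        }
        if (*p == ' ')
            *p++ = '\0';              /* terminate the value */
        n++;
    }
    return n;
}
```

After parsing, the working copy reads as just the first parameter name when printed as a string, which is why an untouched copy is needed for anything that wants to show the full command line later.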

Late kernel initialization 

kthreadd() thread

1) Setup a clean context for our children to inherit
2) If kthread_create_list is empty, just reschedule
3) Create a kernel thread for every task in kthread_create_list
4) Go back to step 2  
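
Steps 2 to 4 above can be modelled in user space as one pass over a list of pending requests. This is a single-threaded sketch under stated assumptions: there is no locking or sleeping, "creating a thread" only marks the request as serviced, and struct create_request and service_create_list() are hypothetical names rather than the kernel's own types.

```c
#include <stddef.h>

/* One pending request to create a kernel thread (hypothetical model). */
struct create_request {
    int (*threadfn)(void *data);  /* function the new thread would run */
    void *data;                   /* argument for threadfn */
    int started;                  /* set once the request has been serviced */
    struct create_request *next;
};

/* Stand-in for thread creation: this model only marks the request as
 * serviced. The real kernel spawns a new task that later runs threadfn. */
static void create_thread(struct create_request *req)
{
    req->started = 1;
}

/* One pass of the daemon loop. Returns 0 when the list is empty (where the
 * real daemon would reschedule), otherwise services every queued request
 * and returns how many were handled. */
size_t service_create_list(struct create_request **list)
{
    size_t handled = 0;

    while (*list != NULL) {
        struct create_request *req = *list;

        *list = req->next;        /* unlink before servicing */
        req->next = NULL;
        create_thread(req);
        handled++;
    }
    return handled;
}
```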

kernel_init() thread

1) Wait for kernel thread daemon initialization completion
2) Setup init permissions:
    a) init can allocate pages on any node
    b) init can run on any cpu
3) Prepare CPUs for smp
4) Do pre SMP init calls
5) Init lockup detector
6) Enable SMP
7) Initialize SMP support in scheduler
8) Initialize devices in init/main.c:do_basic_setup() (Ok, the machine is now initialized. None of the devices have been touched yet, but the CPU subsystem is up and running, and memory and process management works. Now we can finally start doing some real work..)
    a) Finish top cpuset after cpu, node maps are initialized
    b) Initialize user mode helper
    c) Initialize shmem
    d) Initialize drivers
    e) Initialize /proc/irq handling code
    f) Call all constructor functions linked into the kernel
    g) usermodehelper_enable - allow new helpers to be started again
    h) Do init calls
9) Open /dev/console on the rootfs
10) Check if there is an early userspace init. If yes, let it do all the work
11) Run late init (Ok, we have completed the initial bootup, and we're essentially up and running. Get rid of the initmem segments and start the user-mode stuff..)
    a) finish all async __init code before freeing the memory
    b) Free init memory
    c) Mark readonly data as RO
    d) Set system state to Running
    e) Try to execute userspace init command
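
The last step, trying to execute the userspace init command, can be sketched as a search through candidates. The default list below (/sbin/init, /etc/init, /bin/init, /bin/sh) matches kernels of this era. Falling back to the defaults when a requested init cannot be run is an assumption of this sketch, since some kernel versions panic instead. pick_init() and can_run() are hypothetical names, and "can be executed" is modelled as a lookup in a list of paths on a pretend root filesystem.

```c
#include <stddef.h>
#include <string.h>

/* Default candidates, tried in this order when no usable init is requested. */
static const char *const default_inits[] = {
    "/sbin/init", "/etc/init", "/bin/init", "/bin/sh",
};

/* Model of "can this program be executed?": a lookup in the list of paths
 * that exist on a pretend root filesystem. */
static int can_run(const char *path, const char *const *present, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(path, present[i]) == 0)
            return 1;
    return 0;
}

/* Return the program that would become init, or NULL if none can be run.
 * 'requested' models the init= command line parameter and may be NULL. */
const char *pick_init(const char *requested, const char *const *present,
                      size_t n)
{
    if (requested != NULL && can_run(requested, present, n))
        return requested;

    for (size_t i = 0; i < sizeof default_inits / sizeof default_inits[0]; i++)
        if (can_run(default_inits[i], present, n))
            return default_inits[i];

    return NULL;   /* the kernel panics at this point: no init found */
}
```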
