System Call Table in Linux

System call table is an array of function pointers. It is defined in kernel space as variable sys_call_table and it contains pointers to functions which implement system calls. Index of each function pointer in the array is the system call number for that syscall. These are denoted by NR_* macros in header files, such as /usr/include/asm/unistd_64.h for x86_64.

On x86 systems, when a user mode program makes a system call it puts the system call number in RAX register and calls sysenter assembly instruction. This instruction switches CPU from user mode into kernel mode. It sets instruction pointer RIP to the value stored in SYSENTER_EIP_MSR register and stack pointer RSP to the value stored in SYSENTER_ESP_MSR register. MSR is short for Model Specific Register.  These are registers which are present on specific models of Intel processors, such as 64-bit processors only. The above mentioned MSRs are set up by Linux kernel to contain addresses of system_call() kernel function for RIP and kernel mode stack belonging to the process which started the system call for RSP (yes each process Рor more specifically thread Рhas a kernel mode stack in addition to user mode stack).

system_call() function is like a multiplexer for syscalls. It saves hardware context on stack, performs some checks, e.g. whether the process is being syscall-traced in which case it needs to notify the tracer, and if all checks pass, ultimately jump into function pointed to by the pointer at syscall number index inside system call table. Return from syscall happens with sysexit assembly instruction. Upon return, the hardware context is restored and execution continues in user-space code which usually is a libc wrapper routine.

Linux Kernel Symbols

Kernel symbols are names of functions and variables. Global symbols are those
which are available outside the file they are declared in. Global symbols
in the Linux kernel currently running on a system are available through
`/proc/kallsyms` file. This includes symbols defined inside kernel modules
currently loaded.

Global symbols are of two types:

  1. those explicitly exported through EXPORT_SYMBOL_GPL and EXPORT_SYMBOL
    macros, and
  2. those which are not declared with `static` C keyword and hence visible to
    code which is statically linked with the kernel itself and may be available
    outside the kernel image.

The first type, explicitly exported ones, are denoted with capital letter in
output of `cat /proc/kallsyms` – e.g. T if the symbol is in text section, i.e.
a function name. The second type are denoted with small letter – e.g. t for a
function which isn’t exported via EXPORT_SYMBOL_GPL or EXPORT_SYMBOL.

Inside kernel code, we can access symbols which are exported explicity by
simply using them like other variables, e.g. by calling printk() function.

For global symbols which aren’t explicitly exported, but are still available,
we can attempt to access them by calling kallsyms_lookup_name() function,
defined in kernel/kallsyms.c:

unsigned long kallsyms_lookup_name(const char *name);

This takes symbol name as argument and returns its address in memory, i.e. a
pointer to it. The calling code can dereference the pointer to make use of that
symbol. If the symbol isn’t found, the function returns NULL.