Demystifying the Linux Kernel: It's Just a Program After All

Most books and courses introduce Linux through shell commands, leaving the kernel as a mysterious black box doing magic behind the scenes. This perception obscures a fundamental truth: the Linux kernel is just a binary that you can build and run directly. In this article, we'll conduct a series of experiments to demystify the kernel and build a mental model of how Linux systems work at their core.

What Exactly is a Kernel?

Computers are complex assemblies of components—CPUs, memory, video cards, network cards, keyboards, displays, and countless other peripherals. These devices come from different manufacturers, have varying capabilities, and require different programming approaches. The operating system kernel provides a unified interface to use these devices conveniently and securely.

Without a kernel, programs would likely be incompatible across different computers, we couldn't run multiple programs simultaneously, and multiple users couldn't share the same machine efficiently. The kernel essentially:

  • Provides APIs to interact with hardware through a unified interface
  • Manages how programs can use the computer's CPU and memory
  • Offers additional features like user management, permissions, process isolation, and security mechanisms

From a software development perspective, the kernel functions as a runtime for the entire computer system.

Locating the Kernel

On most Linux distributions, you'll find the kernel in the /boot directory. Let's explore what's there:

$ cd /boot
$ ls -1
System.map-6.12.43+deb13-amd64
System.map-6.12.48+deb13-amd64
config-6.12.43+deb13-amd64
config-6.12.48+deb13-amd64
efi
grub
initrd.img-6.12.43+deb13-amd64
initrd.img-6.12.48+deb13-amd64
vmlinuz-6.12.43+deb13-amd64
vmlinuz-6.12.48+deb13-amd64

The file we're interested in is vmlinuz-6.12.48+deb13-amd64. This single file is the kernel itself. The filename components have specific meanings:

  • vmlinuz: vm for virtual memory, linux, and z indicating compression
  • 6.12.48+deb13: the kernel version and distribution information (Debian 13)
  • amd64: the system architecture
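
To make the naming scheme concrete, here is a small Go sketch that splits such a filename into its three parts. The type kernelImage and the function parseKernelImageName are names made up for this illustration, and the splitting rule assumes the Debian-style layout shown above; other distributions name their images differently.

```go
package main

import (
	"fmt"
	"strings"
)

// kernelImage holds the three parts of a Debian-style kernel image name.
// (Hypothetical type, introduced only for this example.)
type kernelImage struct {
	Prefix, Version, Arch string
}

// parseKernelImageName splits "prefix-version-arch" at the first and the
// last dash, so dashes inside the version part are preserved.
func parseKernelImageName(name string) (kernelImage, error) {
	first := strings.Index(name, "-")
	last := strings.LastIndex(name, "-")
	if first < 0 || first == last {
		return kernelImage{}, fmt.Errorf("unexpected kernel image name %q", name)
	}
	return kernelImage{
		Prefix:  name[:first],
		Version: name[first+1 : last],
		Arch:    name[last+1:],
	}, nil
}

func main() {
	img, err := parseKernelImageName("vmlinuz-6.12.48+deb13-amd64")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// prints: prefix=vmlinuz version=6.12.48+deb13 arch=amd64
	fmt.Printf("prefix=%s version=%s arch=%s\n", img.Prefix, img.Version, img.Arch)
}
```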

Experiment 1: Running the Kernel Directly

Our first experiment will involve copying this kernel to another directory and attempting to run it directly:

$ cd
$ mkdir linux-inside-out
$ cd linux-inside-out/
$ cp /boot/vmlinuz-6.12.48+deb13-amd64 .
$ ls -lh
total 12M
-rw-r--r-- 1 user user 12M Dec  1 09:44 vmlinuz-6.12.48+deb13-amd64

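A kernel is built to boot on a machine, not to be launched as an ordinary process, and experimenting with it on our host system would be risky. We'll therefore boot it inside QEMU, a machine emulator and virtualizer. First, let's install the necessary tools: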

$ sudo apt update
$ sudo apt install -y qemu-system-x86 qemu-utils

Now, let's attempt to start a virtual machine with our kernel:

$ qemu-system-x86_64 \
  -m 256M \
  -kernel vmlinuz-6.12.48+deb13-amd64 \
  -append "console=ttyS0" \
  -nographic

The output will include kernel initialization messages followed by an error (to leave QEMU afterwards, press Ctrl-A, then X):

[    2.179652] /dev/root: Can't open blockdev
[    2.180871] VFS: Cannot open root device "" or unknown-block(0,0): error -6
[    2.181038] Please append a correct "root=" boot option; here are the available partitions:
[    2.181368] List of all bdev filesystems:
[    2.181477]  fuseblk
[    2.181516]
[    2.181875] Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0)

This panic is actually expected. After initializing itself, the kernel attempts to mount a root filesystem and hand control to an init program. Since we didn't provide one, the kernel panics.
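
The decision the kernel makes at this point can be modeled roughly as follows. This is a simplified sketch in Go, not kernel code: pickInit and its parameters are hypothetical names, the fallback list mirrors the defaults in the kernel's init/main.c, and details such as a compiled-in default init path are left out. Filesystems are modeled as sets of executable paths, and a nil root filesystem stands for "no root device could be mounted".

```go
package main

import (
	"errors"
	"fmt"
)

// fallbackInits are the paths tried on the root filesystem when no
// init= option was given on the kernel command line.
var fallbackInits = []string{"/sbin/init", "/etc/init", "/bin/init", "/bin/sh"}

// pickInit models which program becomes PID 1 (simplified).
// rdinit and init correspond to the rdinit= and init= boot options.
func pickInit(rdinit, init string, initramfs, rootfs map[string]bool) (string, error) {
	if rdinit == "" {
		rdinit = "/init" // default when rdinit= is not given
	}
	if initramfs[rdinit] {
		return rdinit, nil // found in the initramfs: no root device needed
	}
	if rootfs == nil {
		return "", errors.New("panic: unable to mount root fs")
	}
	if init != "" {
		if rootfs[init] {
			return init, nil
		}
		return "", fmt.Errorf("panic: requested init %s failed", init)
	}
	for _, p := range fallbackInits {
		if rootfs[p] {
			return p, nil
		}
	}
	return "", errors.New("panic: no working init found")
}

func main() {
	// Experiment 1: no initramfs and no root device.
	_, err := pickInit("", "", nil, nil)
	fmt.Println("without initramfs:", err)

	// With an initramfs that contains /init, the lookup succeeds.
	path, _ := pickInit("/init", "", map[string]bool{"/init": true}, nil)
	fmt.Println("with initramfs: run", path)
}
```

In this model, the panic above corresponds to the second branch: with nothing to run from memory, the kernel falls back to mounting a root device, and none exists.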

Experiment 2: Creating a Custom Init Program

Let's create a simple init program in Go that the kernel can execute. First, install Go and set up a module for the program:

$ sudo apt -y install golang
$ mkdir init
$ cd init
$ go mod init init

Now, create a simple program that will serve as our init process:

package main

import (
    "fmt"
    "os"
    "time"
)

func main() {
    fmt.Println("Hello from Go init!")
    fmt.Println("PID:", os.Getpid()) // printing the PID (process ID)

    for i := 0; ; i++ { // every two seconds printing the text "tick {tick number}"
        fmt.Println("tick", i)
        time.Sleep(2 * time.Second)
    }
}

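Build the program and run it on the host. Setting CGO_ENABLED=0 ensures a statically linked binary, which matters later because the filesystem we boot from will contain no shared libraries: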

$ CGO_ENABLED=0 go build -o init .
$ ./init
Hello from Go init!
PID: 3086
tick 0
tick 1

This is a regular program that received a process ID (PID) and prints output. Nothing special about it yet.

Experiment 3: Creating a Minimal Initramfs

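A kernel on its own is not enough to boot: it needs a filesystem that contains the init program. Distribution kernels also ship many storage and filesystem drivers as loadable modules rather than built in, so they often cannot reach the disk unaided. Both problems are solved by an initial RAM filesystem (initramfs), a small archive that is loaded into memory alongside the kernel and unpacked at boot. Let's create a minimal one: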

$ cd ../
$ mkdir -p rootfs/{proc,sys,dev}
$ cp ./init/init rootfs/init
$ sudo mknod rootfs/dev/console c 5 1
$ sudo mknod rootfs/dev/null c 1 3

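The mknod commands create special device files, which programs use to talk to devices through the kernel. The c marks a character device, and the two numbers are its major and minor device numbers (5, 1 for the console and 1, 3 for the null device). Our directory structure now looks like: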

rootfs/
|-- dev
|   |-- console
|   `-- null
|-- init
|-- proc
`-- sys
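
A device file contains no data of its own; what matters is the device number stored in its metadata. As an illustration, the Go sketch below packs a major/minor pair into a single number using the same bit layout as glibc's makedev macro. The function name makedev is ours, and nothing in the following steps depends on this code.

```go
package main

import "fmt"

// makedev combines a major and a minor number into one device number,
// following the bit layout of glibc's makedev macro: the low 12 bits of
// the major and the low 8 bits of the minor sit next to each other, and
// any higher bits are moved further up.
func makedev(major, minor uint32) uint64 {
	return (uint64(major)&0xfff)<<8 |
		uint64(minor)&0xff |
		(uint64(major)&^0xfff)<<32 |
		(uint64(minor)&^0xff)<<12
}

func main() {
	// The two nodes created with mknod above.
	fmt.Printf("/dev/console (5, 1): %#x\n", makedev(5, 1)) // prints 0x501
	fmt.Printf("/dev/null    (1, 3): %#x\n", makedev(1, 3)) // prints 0x103
}
```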

Now, let's package these files into an initramfs image:

$ ( cd rootfs && find . | cpio -H newc -o ) > initramfs.img
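
The resulting image is a plain archive with a simple layout. To show what cpio's newc format looks like, here is a minimal Go sketch that builds such an archive in memory: each entry is a 110-byte ASCII header (the magic 070701 followed by thirteen 8-digit hexadecimal fields), then the NUL-terminated name and the file data, each padded to a multiple of four bytes, and the archive ends with an entry named TRAILER!!!. The helpers writeEntry, pad4, and buildArchive are hypothetical names. The sketch stores a single file and skips the directories and device nodes, so it illustrates the format rather than replacing the command above.

```go
package main

import (
	"bytes"
	"fmt"
)

// pad4 appends NUL bytes until the buffer length is a multiple of four.
func pad4(buf *bytes.Buffer) {
	for buf.Len()%4 != 0 {
		buf.WriteByte(0)
	}
}

// writeEntry appends one newc record: header, name, and data.
func writeEntry(buf *bytes.Buffer, name string, mode uint32, data []byte) {
	fields := []uint32{
		0,                     // ino
		mode,                  // mode (file type and permissions)
		0, 0,                  // uid, gid
		1,                     // nlink
		0,                     // mtime
		uint32(len(data)),     // filesize
		0, 0,                  // devmajor, devminor
		0, 0,                  // rdevmajor, rdevminor
		uint32(len(name) + 1), // namesize, including the trailing NUL
		0,                     // check (unused in this format variant)
	}
	buf.WriteString("070701") // newc magic
	for _, f := range fields {
		fmt.Fprintf(buf, "%08X", f)
	}
	buf.WriteString(name)
	buf.WriteByte(0)
	pad4(buf)
	buf.Write(data)
	pad4(buf)
}

// buildArchive returns an archive holding one executable file named "init".
func buildArchive(initBinary []byte) []byte {
	var buf bytes.Buffer
	writeEntry(&buf, "init", 0100755, initBinary) // regular file, rwxr-xr-x
	writeEntry(&buf, "TRAILER!!!", 0, nil)        // end-of-archive marker
	return buf.Bytes()
}

func main() {
	// Placeholder bytes stand in for the real init binary.
	archive := buildArchive([]byte("abc"))
	fmt.Println("archive size:", len(archive)) // prints 244
	fmt.Println("magic:", string(archive[:6])) // prints 070701
}
```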

Experiment 4: Booting the Kernel with Our Initramfs

Let's try booting the kernel again, this time with our custom initramfs:

$ qemu-system-x86_64 \
  -m 256M \
  -kernel vmlinuz-6.12.48+deb13-amd64 \
  -initrd initramfs.img \
  -append "console=ttyS0 rdinit=/init" \
  -nographic

This time, the kernel should boot successfully and execute our init program:

[    2.555150] x86/mm: Checked W+X mappings: passed, no W+X pages found.
[    2.557822] tsc: Refined TSC clocksource calibration: 2903.977 MHz
[    2.558399] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x29dbf0142be, max_idle_ns: 440795300983 ns
[    2.565700] clocksource: Switched to clocksource tsc
[    2.672446] Run /init as init process
Hello from Go init!
PID: 1
tick 0
tick 1
tick 2
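
The proc and sys directories we created are still empty at this point, because nothing has mounted the kernel's pseudo-filesystems there. As a possible next step, the sketch below extends the init program to do so and then prints the kernel version from /proc/version. It is an untested variation rather than part of the experiment: the mount type and the earlyMounts helper are hypothetical names, syscall.Mount is Linux-only, and the mounts are attempted only when the program runs as PID 1.

```go
package main

import (
	"fmt"
	"os"
	"syscall"
	"time"
)

// mount describes one pseudo-filesystem to attach at boot.
type mount struct {
	Source, Target, FSType string
}

// earlyMounts lists the kernel-provided filesystems for the empty
// /proc and /sys directories in our rootfs.
func earlyMounts() []mount {
	return []mount{
		{"proc", "/proc", "proc"},
		{"sysfs", "/sys", "sysfs"},
	}
}

func main() {
	fmt.Println("Hello from Go init!")
	if os.Getpid() != 1 {
		// On the host we are an ordinary process: leave its mounts alone.
		fmt.Println("not PID 1, skipping mounts")
		return
	}
	for _, m := range earlyMounts() {
		if err := syscall.Mount(m.Source, m.Target, m.FSType, 0, ""); err != nil {
			fmt.Println("mount", m.Target, "failed:", err)
		}
	}
	if v, err := os.ReadFile("/proc/version"); err == nil {
		fmt.Print("kernel: ", string(v))
	}
	// PID 1 must never exit, otherwise the kernel panics.
	for {
		time.Sleep(time.Hour)
	}
}
```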

Key Insights from Our Experiments

Several important concepts emerge from successfully booting our minimal Linux system:

  1. Our Go program received PID 1: The first user-space process started by the kernel always gets PID 1 and is called the init process. On a real system, its job is to start every other program.

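  2. Kernel space vs. user space: Every message up to "Run /init as init process" came from code running in kernel space. Our init program is the first code to run in user space; from then on the kernel keeps running underneath, serving system calls and interrupts.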

  3. Linux distributions simplified: We've essentially built a minimal Linux distribution from just two pieces, the kernel and our init program (plus two device nodes). A Linux distribution is fundamentally just a kernel plus a collection of programs and configuration files.

  4. The kernel as a program: We've demonstrated that the kernel is just a binary that a bootloader (here, QEMU's -kernel option) can load and run directly, not some mystical entity.
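
To illustrate the first point, here is a hedged Go sketch of an init that starts one other program and waits for it. The helper runChild and the path /bin/hello are hypothetical: the initramfs built in this article contains no second program, so inside our VM the sketch would only report that the child could not be started.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"os/exec"
	"time"
)

// runChild starts one program, waits for it to finish, and returns its
// exit code. It returns -1 and an error if the program cannot be started.
func runChild(path string, args ...string) (int, error) {
	cmd := exec.Command(path, args...)
	cmd.Stdin, cmd.Stdout, cmd.Stderr = os.Stdin, os.Stdout, os.Stderr
	err := cmd.Run()
	if err == nil {
		return 0, nil
	}
	var exitErr *exec.ExitError
	if errors.As(err, &exitErr) {
		return exitErr.ExitCode(), nil // the child ran and reported failure
	}
	return -1, err // the child never started
}

func main() {
	// Hypothetical child: add a real static binary to rootfs to try this.
	code, err := runChild("/bin/hello")
	if err != nil {
		fmt.Println("could not start child:", err)
	} else {
		fmt.Println("child exited with code", code)
	}
	// When running as PID 1 we must never return: if init exits,
	// the kernel panics.
	if os.Getpid() == 1 {
		for {
			time.Sleep(time.Hour)
		}
	}
}
```

A real init system does the same thing on a larger scale: it starts services, waits for children that exit, and stays alive for as long as the system runs.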

What We've Learned

Through these experiments, we've gained several crucial insights into Linux systems:

  • The Linux kernel is a single file stored on your disk, about 12 MB in our case
  • A Linux distribution consists of a kernel plus additional programs and configuration files
  • A process is simply a program under execution
  • PID (Process ID) identifies each running process
  • Kernel code runs in kernel space, while ordinary programs run in user space
  • The init process is the first user-space process, and everything else is started from it

This hands-on approach reveals that the Linux kernel, while complex in its implementation, is fundamentally accessible. By directly interacting with it, we've demystified what often remains an abstract concept in technical education.

Understanding these core concepts provides a solid foundation for further exploration into Linux internals, system programming, and operating system design.


Source: https://serversfor.dev/linux-inside-out/the-linux-kernel-is-just-a-program/