Have you ever wondered what’s really happening under the hood, from the CLI down to the kernel level, when you type the ‘ls’ command? Let’s figure it out!
The very first thing that allows you to start typing in the shell is a prompt. In most of the Linux distros, you’ll see it set as
I by default in the
The moment you start typing, each key press raises a keyboard interrupt, which invokes a key handler that displays the characters in the shell.
Once whatever you typed has been displayed on the screen, let’s suppose you wrote
ls (for the sake of this post) and pressed Enter. The shell reads the command from the
STDIN data stream and stores the input in a buffer as a string.
The buffer is filled from
STDIN in blocks of up to a given size until the whole line has been read.
Now, the string is broken into tokens by splitting it on whitespace (suppose you wrote
ls *.c; the tokens are ls and *.c). The tokens are stored in an array of strings.
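The tokenizing step can be sketched as follows. This is a minimal illustration that assumes plain whitespace splitting; a real shell also handles quoting and escapes, which are ignored here. The function name tokenize is hypothetical.

```python
def tokenize(line):
    """Split a command line into tokens on runs of whitespace."""
    # str.split() with no arguments strips leading/trailing whitespace
    # and treats any run of spaces, tabs, or newlines as one separator.
    return line.split()


if __name__ == "__main__":
    print(tokenize("ls   *.c\n"))  # prints ['ls', '*.c']
```

Note that *.c is still a literal token at this point; expanding it into matching file names is a separate step.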
Now, the shell checks whether any token has an alias defined. If there is one, it replaces the token with the alias value. The next step is to check whether the command is a built-in function, since built-ins are executed by the shell itself rather than as separate programs. For example, cd, echo, and help are all built-in commands.
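Here is a rough sketch of the alias and built-in checks. The alias table, the built-in set, and both function names are hypothetical and exist only for illustration; the sketch also assumes that only the command name (the first token) is eligible for alias replacement.

```python
# Hypothetical alias table and built-in list, for illustration only.
ALIASES = {"ll": "ls -l"}
BUILTINS = {"cd", "echo", "help"}


def expand_alias(tokens, aliases=ALIASES):
    """Replace the command name with its alias value, if one is defined."""
    if tokens and tokens[0] in aliases:
        # The alias value may itself contain several words.
        return aliases[tokens[0]].split() + tokens[1:]
    return tokens


def is_builtin(tokens, builtins=BUILTINS):
    """Return True when the command name is handled by the shell itself."""
    return bool(tokens) and tokens[0] in builtins
```

With these definitions, expand_alias(["ll", "/tmp"]) gives ["ls", "-l", "/tmp"], and is_builtin(["ls"]) is False, so ls moves on to the next step.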
If it’s not a built-in function, the shell looks up the
PATH environment variable, which holds the absolute paths of the directories that contain executable binary files. The locations specified in the
PATH variable are separated using the delimiter
: and the shell searches them in order, appending the command to the end of each path. For example,
/usr/bin will be searched by checking for
/usr/bin/ls. The search is not recursive: only the listed directories are checked, from left to right, and the first match wins. The same lookup is used for all other external commands.
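The lookup can be sketched like this. It is a simplified illustration, not the shell's actual code: the function name find_in_path is hypothetical, and empty PATH entries are simply skipped.

```python
import os


def find_in_path(command, path=None):
    """Return the first executable match for command in PATH, or None."""
    if path is None:
        path = os.environ.get("PATH", "")
    for directory in path.split(os.pathsep):  # os.pathsep is ":" on Linux
        if not directory:
            continue  # empty entries are skipped in this sketch
        candidate = os.path.join(directory, command)
        # Accept only regular files that the current user may execute.
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None
```

Calling find_in_path("ls") returns the full path of the first ls found in your PATH, or None if there is no match.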
Once it finds the binary for
ls, the shell makes the system call
fork(). This creates a child process, a copy of the shell that will go on to run
ls, and the shell will be the parent process. The call returns
0 to the child process so it knows it has to act as a child and returns the
PID of the child to the parent process (i.e. the shell).
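The two return values of fork() can be seen in a small sketch. It uses Python's os.fork(), a thin wrapper around the system call; the function name fork_demo is hypothetical, and the child here exits straight away instead of running a program.

```python
import os


def fork_demo():
    """Fork once; return (child_pid, exit_status) as seen by the parent."""
    pid = os.fork()
    if pid == 0:
        # Child: fork() returned 0. Exit immediately with status 0,
        # using os._exit() so no parent cleanup handlers run here.
        os._exit(0)
    # Parent: fork() returned the child's PID. Wait for the child
    # so that it does not linger as a zombie.
    _, status = os.waitpid(pid, 0)
    return pid, os.WEXITSTATUS(status)
```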
The child process then executes the system call
execve(), which replaces its address space with a brand new one containing the program that it has to run. Now,
ls can start running its program. The
ls utility uses a function to read the directories and files from the disk by consulting the underlying filesystem.
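As a rough illustration of the directory-reading step, the sketch below lists a directory with Python's os.scandir(). It is not how ls itself is written; the function name list_directory is hypothetical, and the plain sorted() ordering is an assumption that ignores locale rules.

```python
import os


def list_directory(path="."):
    """Return the visible entry names in path, sorted by name."""
    with os.scandir(path) as entries:
        # Like plain ls, hide names that begin with a dot.
        names = [e.name for e in entries if not e.name.startswith(".")]
    return sorted(names)
```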
You can run ls under
strace to dig deeper and see which system calls are being executed.
adeel@pycen:~/foo$ strace ls
execve("/bin/ls", ["ls"], [/* 30 vars */]) = 0

adeel@pycen:/usr/src/bash-4.0/bash-4.0$ find . | xargs grep -n "execve ("
./builtins/exec.def:201: shell_execve (command, args, env);
./execute_cmd.c:4323: 5) execve ()
./execute_cmd.c:4466: exit (shell_execve (command, args, export_env));
./execute_cmd.c:4577: return (shell_execve (execname, args, env));
./execute_cmd.c:4653:/* Call execve (), handling interpreting shell scripts, and handling
./execute_cmd.c:4656:shell_execve (command, args, env)
./execute_cmd.c:4665: execve (command, args, env);
Once the ls process is done executing, it will call the
_exit() system call with the integer
0, which denotes a normal execution, and the kernel will free up its resources.
The shell, which has been waiting for the child to finish, frees the memory it allocated for the command and re-prompts the user for input.
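The whole fork, exec, and wait cycle described above can be sketched in a few lines. This is a simplified illustration rather than what bash does internally: the function name run_command is hypothetical, it assumes the full path of the binary has already been resolved, and the 127 and 128+signal statuses follow common shell convention.

```python
import os


def run_command(argv, env=None):
    """Fork, exec argv[0] in the child, and return the child's exit status."""
    if env is None:
        env = dict(os.environ)
    pid = os.fork()
    if pid == 0:
        try:
            # Child: replace this process image with the requested program.
            os.execve(argv[0], argv, env)
        except OSError:
            # execve() only returns on failure; 127 is the conventional
            # "command not found" status.
            os._exit(127)
    # Parent (the "shell"): block until the child terminates.
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    # The child was killed by a signal.
    return 128 + os.WTERMSIG(status)
```

For example, run_command(["/bin/ls", "-l"]) prints the listing and returns 0 on success, after which a shell would show its prompt again.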