Top 10 Ways to Crash Your Linux System

I recently made my collection of personal tools public on GitHub under the name of EUtils. These are real tools that I do use in real life to do somewhat useful things. But, about midway through making one of them, I realised just how many of them can be used to crash a Linux box…

I don’t know if there are actually ten different ways in there, but it does make a good title.

Allocate too much memory

I was writing a program to buffer a load of random numbers from /dev/urandom when I realised that, if the user cranked the amount to buffer up really high, they could allocate more memory than the system can handle, making the whole thing freeze up until the kernel finally figures out what is going on and kills the program responsible. Basically, I had accidentally created a memory bomb to rival tail /dev/zero. So, I made a program for this and only this purpose. It is called mdest, or Memory Destroyer. It will, basically, repeatedly call malloc until (again) the stupid kernel actually does its job and kills the program responsible for ruining the system’s performance. If you really want to suffer, you can run the program as root and the kernel will be even slower to figure out what is going on, because “the root is always correct!”.
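To see the general shape of the thing, here is a minimal sketch of a malloc-until-death loop (a throwaway of mine, not the actual mdest source). Note that it has to write to each allocation: Linux overcommits memory, so pages you never touch never really get taken away from the system.

    #include <stdlib.h>
    #include <string.h>

    #define CHUNK (64 * 1024 * 1024) /* 64 MiB per iteration */

    int main(void) {
        for (;;) {
            void *p = malloc(CHUNK);
            if (p == NULL)
                continue;           /* keep hammering; the OOM killer ends this eventually */
            memset(p, 0xff, CHUNK); /* touch every page so it actually gets committed */
        }
    }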

In any case, running this program will always result in one of three scenarios:

  • Your system goes down after the idiot kernel fails to figure out what is going on before complete OOM and paging deadlock, causing a likely kernel panic or complete softlock
  • All your programs crash or get swapped out to disk under the memory pressure. I have known some programs to start dereferencing NULL pointers because they never check whether malloc actually succeeded, and others just deadlock forever waiting on a blocking IO operation that itself needs memory
  • mdest receives signal 9 (SIGKILL), instructing it to exit immediately. Sadly, SIGKILL cannot be caught, so the C runtime never gets a chance to clean anything up; instead the kernel has to tear down every page the process touched (much of it now sitting in swap), so even this can mean a system-wide stall or a painfully slow exit

Basically, this program really sucks and you shouldn’t run it unless you are testing OOM scenarios or you really hate whoever owns the computer you are running it on. But, let’s be real: it’s probably the second one.

Fork too many times

If you weren’t satisfied with memory bombs, how about automated fork bombs? Well, kind of. You see, in UNIX, the way you create a child process is the fork-exec sequence: your program forks into two copies of itself, and the second copy replaces itself with a new program using the exec system call. However, creating a new process costs CPU time, IO time and memory. A process descriptor is not cheap for the kernel: the Linux task struct is one of the largest data structures in the entire kernel, before you even count the overhead of things like page tables. So, what if we were to fork ourselves over a million times and just start sched_yielding for the rest of time? Well, eventually you arrive at either an unresponsive system or an OOM scenario like before. This is boring - next horrible idea…
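For the curious, the whole horrible idea fits in a few lines of C (a sketch of mine, not anything from eutils): fork as many children as the kernel will hand out and have every process spend the rest of its life calling sched_yield.

    #include <sched.h>
    #include <unistd.h>

    int main(void) {
        for (long i = 0; i < 1000000; i++) {
            pid_t pid = fork();
            if (pid < 0)
                break;                  /* out of PIDs or memory - close enough */
            if (pid == 0)
                for (;;) sched_yield(); /* child: pester the scheduler forever */
        }
        for (;;) sched_yield();         /* parent joins in on the fun */
    }

In practice you will probably hit a process limit (pid_max or RLIMIT_NPROC) long before a million, but the scheduler will already be thoroughly miserable by then.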

/dev/random

For some ungodly reason, some programs out there still use /dev/random in actual production code. /dev/random is a horribly designed interface which completely misunderstands how entropy and cryptographically secure randomness actually work. It is a lot of fun to abuse, though.

You see, /dev/random will block if there is not enough entropy in the kernel entropy pool to return a truly random answer (at least on older kernels; since Linux 5.6 it only blocks until the pool has been initialised once). So, if a program wants a load of random data, getting another program to hog all the randomness will cause them all to start behaving strangely.

In eutils, there is a program called rand which will just return as many random numbers as you ask for from the best available source (or whatever source you specify) at any given time. By default, it uses the standard C runtime’s rand call, which is horrible and just seeded from your system clock. However, for most shell programs or quick pipeline scripts, it works fine. So, it’s a useful utility. What it is also useful for, however, is intentionally crashing your Linux system. The program takes in one argument, which is the count: how many random numbers to return. So, if you just shove some enormous number in (I like to use 2^64 for fun) and point it at /dev/random, you can make some dark magic happen. One of the best things to try out is creating a GPG key pair at the same time as running rand in this manner. You will notice that both programs start to have a massive scrap over who gets the random data. Meanwhile, the poor kernel is having a heart attack trying to give both programs what they are asking for: loads of random data it doesn’t have, because rand stole it all.
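The core of the trick is nothing more than a blocking read loop on /dev/random. A minimal sketch (mine, not the actual rand source) looks something like this:

    #include <stdio.h>
    #include <stdint.h>
    #include <fcntl.h>
    #include <unistd.h>

    int main(void) {
        int fd = open("/dev/random", O_RDONLY);
        if (fd < 0)
            return 1;
        uint64_t n;
        for (;;) {
            /* each read drains the entropy estimate; on older kernels it
             * blocks when the pool runs dry - along with everyone else */
            if (read(fd, &n, sizeof n) != (ssize_t)sizeof n)
                break;
            printf("%llu\n", (unsigned long long)n);
        }
        close(fd);
    }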

In the end, what tends to happen is that most of the applications that require random data just freeze up forever. Luckily, there is a quick fix for this one: wiggle your mouse about loads, launch a load of programs and mash random keys on your keyboard. The kernel’s entropy pool is fed from things like the timing of keyboard, mouse and disk interrupts. So, all the programs will, eventually, end up with the data they asked for and everybody is happy again.

So, does this count as a crash? Well, eventually the IO operations in the programs waiting for data might time out. Either that, or some threaded operation in them (this one tends to apply to web browsers) also requests this data, causing a mutual exclusion deadlock and an eventual crash. So, yeah - I’m counting this as a crash anyway.

Oversleeping

The sleep system call is incredibly useful for many applications. One thing it is much less suited to, however, is actually being accurate with time: a single call involves a load of context switches, and when you get woken up again is entirely up to the scheduler. So, if you are relying on the sleep system call for accurate timekeeping, you may as well give up.

So, naturally, that is what I chose to build the eutils stopwatch implementation on! Well, to be fair, this one is less egregious, because the program is multi-threaded using Go’s cool threading implementation, so the Go runtime is more in charge of sleeping threads. However, a similar situation to the normal sleep syscall applies: it is not designed for accuracy.

The only issue is that sleeping lots of times (in my case, once every millisecond) results in some funky behaviour. In particular, the increased CPU load feeds a vicious feedback loop: as the CPU stress increases, the scheduler compensates by making fewer context switches. With fewer context switches, each sleep runs longer than asked for, so the program needs even more of them in a given time frame to keep its timekeeping up to date. And, the cycle continues. In the end, you end up with a dreadfully slow stopwatch which is out of sync with real time by at least a whole second.
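You can watch the error pile up with a little experiment (my own sketch, nothing to do with the eutils stopwatch): count 1 ms sleeps as ticks and compare the total against a monotonic clock. Every wake-up comes back a little late, and the error only ever grows.

    #include <stdio.h>
    #include <time.h>

    static double now(void) {
        struct timespec ts;
        clock_gettime(CLOCK_MONOTONIC, &ts);
        return ts.tv_sec + ts.tv_nsec / 1e9;
    }

    int main(void) {
        const struct timespec ms = { .tv_sec = 0, .tv_nsec = 1000000 };
        double start = now();
        for (long ticks = 1; ; ticks++) {
            nanosleep(&ms, NULL); /* asks for 1 ms, always gets a little more */
            if (ticks % 1000 == 0)
                printf("counted %lds, real %.2fs\n", ticks / 1000, now() - start);
        }
    }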

Luckily, my implementation does not suffer from this, thanks to a specific little piece of threading genius that I used (thanks Rob Pike), and the stopwatch will be accurate to within a few milliseconds (I can’t guarantee the runtime won’t wake it a tiny bit later than promised, because that part is out of my hands).
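If you are not writing Go, the usual way to get the same sort of accuracy in C is to sleep towards absolute deadlines rather than for relative intervals, so a late wake-up never accumulates into drift. This sketch is mine and is not how the eutils stopwatch actually does it:

    #include <stdio.h>
    #include <time.h>

    int main(void) {
        struct timespec deadline;
        clock_gettime(CLOCK_MONOTONIC, &deadline);
        for (long ticks = 1; ; ticks++) {
            /* push the deadline forward by exactly 1 ms and sleep until then */
            deadline.tv_nsec += 1000000;
            if (deadline.tv_nsec >= 1000000000) {
                deadline.tv_nsec -= 1000000000;
                deadline.tv_sec++;
            }
            clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &deadline, NULL);
            if (ticks % 1000 == 0)
                printf("%lds\n", ticks / 1000);
        }
    }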

So, basically, sleeping in on Linux is not just being lazy - it actually takes more effort overall.

Ethan Marshall

A programmer who, to preserve his sanity, took refuge in electrical engineering. What an idiot.


First Published 2021-09-29

Categories: [ Old Blog ]