A quick one for today, sparked by recent events at work. It can pretty much be summed up in this even quicker question: do you know what your program does when it's out of resources? Out of RAM, out of disk space, out of address space, out of time? Computers are indeed powerful beasts these days, and there are plenty of people who would like you to believe that they are effectively infinitely powerful. But observing your code working with limited resources, even if those limitations are artificially imposed, can tell you a lot of things you mightn't have known previously. I promised you this one would be short, so without further ado, here's my grab bag of stuff you should check for:
What happens when your program writes (or reads) a file greater than 2GB? 2GB is a magical number (it is where a signed 32-bit file offset runs out), and you might have good reason to believe that your program just won't ever have to deal with that much data. But can it? The time to find out is in the comfort of your office, running your code in the debugger. I made the unfortunate discovery this Sunday (!) that a popular open-source IPS tool segfaults in this situation. Not cool. It's OK for your program to be unable to deal with large files, but segfaulting is not the answer. Open another file, throw an error, whatever. Just don't segfault!
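You don't need to generate 2GB of data to rehearse the "file got too big" failure. Here's a minimal sketch, assuming a POSIX system: it lowers the soft RLIMIT_FSIZE limit and ignores SIGXFSZ, so a write past the limit fails with EFBIG instead of killing the process. The helper names cap_file_size and write_file are mine, purely for illustration.

```c
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <sys/resource.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: lower the soft file-size limit for this process.
 * SIGXFSZ is ignored so that writes past the limit return -1 with
 * errno == EFBIG instead of terminating the program. */
static int cap_file_size(rlim_t max_bytes)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_FSIZE, &rl) != 0)
        return -1;
    rl.rlim_cur = max_bytes;            /* soft limit only; hard limit untouched */
    if (setrlimit(RLIMIT_FSIZE, &rl) != 0)
        return -1;
    return signal(SIGXFSZ, SIG_IGN) == SIG_ERR ? -1 : 0;
}

/* Hypothetical helper: write a whole buffer to a new file.
 * Returns 0 on success, -1 with errno set on failure. Short writes are
 * retried, which is what eventually surfaces EFBIG (or ENOSPC). */
static int write_file(const char *path, const void *buf, size_t len)
{
    const char *p = buf;
    int rc = 0;
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);

    if (fd < 0)
        return -1;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;               /* interrupted, try again */
            rc = -1;
            break;
        }
        p += n;
        len -= (size_t)n;
    }
    if (rc != 0) {
        int saved = errno;              /* keep the write error, not close's */
        close(fd);
        errno = saved;
        return -1;
    }
    return close(fd);
}
```

Call cap_file_size() with some tiny number early in a test build, run your normal workload, and see whether every write path reports the error or whether something quietly carries on with a truncated file.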
What happens when you run out of disk space? The guarantee you want to give your users is that you might lose work since the last check-point, but nothing more than that. Can your program start when there is no disk space at all? An answer of "no" is ok on this one, but make sure you've got a good error message if that's the case.
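One common way to get the "lose nothing but the work since the last check-point" guarantee is to write the new check-point to a temporary file, flush it, and only then rename it over the old one. The sketch below assumes a POSIX filesystem where rename() replaces the target atomically; save_checkpoint and the ".tmp" suffix are made-up names for illustration, not anything from a particular tool.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: replace `path` with `len` bytes from `buf`.
 * If anything fails (ENOSPC included), the previous check-point at
 * `path` is left untouched. Returns 0 on success, -1 with errno set. */
static int save_checkpoint(const char *path, const void *buf, size_t len)
{
    char tmp[4096];
    const char *p = buf;
    int saved, fd;
    int n = snprintf(tmp, sizeof tmp, "%s.tmp", path);

    if (n < 0 || (size_t)n >= sizeof tmp) {
        errno = ENAMETOOLONG;
        return -1;
    }
    fd = open(tmp, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;

    while (len > 0) {
        ssize_t w = write(fd, p, len);
        if (w < 0) {
            if (errno == EINTR)
                continue;
            goto fail;                  /* e.g. ENOSPC: disk is full */
        }
        p += w;
        len -= (size_t)w;
    }
    if (fsync(fd) != 0)                 /* make sure the data reached the disk */
        goto fail;
    if (close(fd) != 0) {
        fd = -1;
        goto fail;
    }
    fd = -1;
    if (rename(tmp, path) != 0)         /* atomically swap in the new file */
        goto fail;
    return 0;

fail:
    saved = errno;                      /* report the original error */
    if (fd >= 0)
        close(fd);
    unlink(tmp);                        /* don't leave a half-written file */
    errno = saved;
    return -1;
}
```

A stricter version would also fsync() the containing directory after the rename; I've left that out to keep the sketch short.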
What happens when malloc() actually does return NULL? This is one of those things that falls under the heading of building testing into your code. Abuse LD_PRELOAD to build a malloc replacement that fails randomly 1 in 100 times and run your program at full load. Does it operate at 99% of its previous performance level, or does it crash? Does it run?
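For the curious, here's roughly what such a malloc replacement might look like. Treat it as a sketch under assumptions: it relies on glibc-style dlsym(RTLD_NEXT, ...) to find the real malloc, it only interposes malloc (calloc and realloc pass straight through), and the MALLOC_FAIL_ONE_IN environment variable is a name I invented for this example.

```c
#define _GNU_SOURCE                     /* for RTLD_NEXT */
#include <dlfcn.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

static void *(*real_malloc)(size_t);    /* the allocator we are wrapping */
static int fail_one_in = -1;            /* -1: environment not read yet; 0: never fail */
static unsigned int lcg_state = 12345u; /* private generator, so the host's rand() is untouched */

/* Decide whether this call should fail. MALLOC_FAIL_ONE_IN is a
 * hypothetical variable name; unset or 0 means "never fail". */
static int should_fail(void)
{
    if (fail_one_in < 0) {
        const char *s = getenv("MALLOC_FAIL_ONE_IN");
        int n = s ? atoi(s) : 0;
        fail_one_in = n > 0 ? n : 0;
    }
    if (fail_one_in == 0)
        return 0;
    lcg_state = lcg_state * 1103515245u + 12345u;
    return (lcg_state >> 16) % (unsigned int)fail_one_in == 0;
}

/* Interposed malloc: fails roughly one call in N, otherwise forwards
 * to the real allocator. The lazy initialisation is not strictly
 * thread-safe, but every racing thread computes the same values. */
void *malloc(size_t size)
{
    if (real_malloc == NULL)
        real_malloc = (void *(*)(size_t))dlsym(RTLD_NEXT, "malloc");
    if (real_malloc == NULL || should_fail()) {
        errno = ENOMEM;
        return NULL;
    }
    return real_malloc(size);
}
```

Assuming you save that as failmalloc.c (again, a made-up name), something like `gcc -shared -fPIC -o failmalloc.so failmalloc.c -ldl` followed by `MALLOC_FAIL_ONE_IN=100 LD_PRELOAD=./failmalloc.so ./your_program` should get you going. Expect library code you don't control to fall over too; that's useful information in itself.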
What if you're just plain outta RAM? Write a program which, every 20 seconds, allocates a megabyte of memory, and run it in parallel with your code. You'll need to actually paint that memory, because Linux (and likely other OSes) has a feature called over-commit, which lets you ask for more memory than the system actually has. The memory is only really allocated when you touch it. Granted, sooner or later you'll trigger the OOM killer, but memory pressure has a bunch of manifestations that'll turn up long before then. Running with less RAM is also a good way to find memory leaks faster -- they manifest as crashers much sooner when there's less RAM.
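A memory hog along those lines fits in a few lines of C. This is only a sketch; grab_and_paint and hog are names I picked, and the writes go through a volatile pointer so an optimising compiler can't decide the painting is pointless and throw it away.

```c
#include <stddef.h>
#include <stdlib.h>
#include <unistd.h>

#define CHUNK_BYTES ((size_t)1024 * 1024)   /* one megabyte per step */

/* Allocate `bytes` and write to every byte. With over-commit the
 * allocation alone costs nothing; touching the pages is what forces
 * the kernel to back them with real memory. */
static void *grab_and_paint(size_t bytes)
{
    volatile unsigned char *p = malloc(bytes);

    if (p == NULL)
        return NULL;
    for (size_t i = 0; i < bytes; i++)
        p[i] = 0xA5;
    return (void *)p;
}

/* Deliberately leak `chunks` megabytes, pausing between steps.
 * Returns how many chunks were actually obtained. */
static size_t hog(size_t chunks, unsigned int interval_seconds)
{
    size_t got = 0;

    for (size_t i = 0; i < chunks; i++) {
        if (grab_and_paint(CHUNK_BYTES) == NULL)
            break;                      /* malloc finally said no */
        got++;
        sleep(interval_seconds);
    }
    return got;
}
```

Calling hog((size_t)-1, 20) from main() gives the megabyte-every-20-seconds behaviour described above; shorten the interval if you're impatient.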
Out of lock-slices? -- OK, so I don't know if "lock slices" is part of the common engineering lexicon, but they're the arbitrary discrete unit of measurement I've adopted here to talk about lock contention. Basically, if you force your program into a situation where your locks are getting absolutely hammered, and you throw your hardest workload at the program, what happens?
I've espoused the virtues of static memory pools here before, but static pools hand back to you a concurrency problem that otherwise belongs to the memory allocator (or so everyone pretends) -- that of lock contention on memory acquisition and release. There's a bunch of pros and cons here that I won't go into. I'll only say this: if you do go with a static pool allocation system, build into your code the ability to configure the size of those pools, even if you never expose that configurability to your users. Then you can (or rather, must) set these pools to an absurdly small size and hammer your code.
This strategy is 100% awesome in my opinion. We tried it today and it uncovered a deadlock in 10 seconds that a whole weekend of pushing packets had missed. Money just can't buy that kind of happiness.
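To make the configurable-pool idea concrete, here's a minimal sketch of a fixed-size buffer pool whose size is a run-time parameter. It is not the allocator we use at work; the struct and function names are invented for illustration, it assumes POSIX threads, and it returns NULL when the pool is empty rather than blocking, so the caller's "pool exhausted" path gets exercised too.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical fixed-size buffer pool. All memory is allocated once in
 * pool_init(); afterwards acquire/release only push and pop pointers
 * under a single mutex, which is exactly where contention will show up. */
struct pool {
    pthread_mutex_t lock;
    char  *storage;                     /* nbufs * buf_size bytes, allocated once */
    void **free_list;                   /* stack of currently available buffers */
    size_t free_count;
};

/* nbufs is the knob to turn down to something absurd (2, say) in testing.
 * Note: buffers are only as aligned as buf_size makes them. */
static int pool_init(struct pool *p, size_t nbufs, size_t buf_size)
{
    if (nbufs == 0 || buf_size == 0)
        return -1;
    p->storage = calloc(nbufs, buf_size);       /* calloc checks for overflow */
    p->free_list = calloc(nbufs, sizeof *p->free_list);
    if (p->storage == NULL || p->free_list == NULL ||
        pthread_mutex_init(&p->lock, NULL) != 0) {
        free(p->storage);
        free(p->free_list);
        return -1;
    }
    for (size_t i = 0; i < nbufs; i++)
        p->free_list[i] = p->storage + i * buf_size;
    p->free_count = nbufs;
    return 0;
}

/* Returns a buffer, or NULL if the pool is exhausted. */
static void *pool_acquire(struct pool *p)
{
    void *buf = NULL;

    pthread_mutex_lock(&p->lock);
    if (p->free_count > 0)
        buf = p->free_list[--p->free_count];
    pthread_mutex_unlock(&p->lock);
    return buf;
}

/* Hands a buffer obtained from pool_acquire() back to the pool. */
static void pool_release(struct pool *p, void *buf)
{
    pthread_mutex_lock(&p->lock);
    p->free_list[p->free_count++] = buf;
    pthread_mutex_unlock(&p->lock);
}

static void pool_destroy(struct pool *p)
{
    pthread_mutex_destroy(&p->lock);
    free(p->free_list);
    free(p->storage);
}
```

In production you might call pool_init(&p, 4096, 2048); in the torture test, pool_init(&p, 2, 2048) with the same workload. The numbers are placeholders, the point is that the same code runs in both cases.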
I won't harp on this one too much longer. It's just one of those things you should try before you push out a release. It's also good to keep track of results between releases to guard against resource-creep. Understanding all the failure modes of your software is an absolute must if you expect to support it. Get this stuff right and that's a whole realm of problems you'll never have to explain to your customers. As they say in those oft-parodied credit card ads: priceless.