What if?


Then one morning shit hits the fan, and it's all over the place. Something goes terribly wrong and production grinds to a halt. It takes ages to file a new order. The intranet page just won't load. Local shares won't mount. Remote Desktop connections time out. SSH does not work… OK, this is not good, by any measure. And yes, this is an exaggeration: you probably won't see all of the above happening at once unless someone blew up your data centers or you've had a very, very malicious attack where someone erased the lot.

One way or another you have to get access to your devices, be it through iLO, KVM or good old RS-232, because some serious digging into your logs (which are hopefully collected) has to be done. Following your incident response plan (you have one, right?) you call in experts in several fields to help you chase traces of faults and interpret the various logs from all your systems. These experts might be in-house specialists, or you could buy the expertise from consultants. The point is that you just won't manage to log in to all of your boxes and look for suspicious lines of logged events in a reasonable time, even if your devices only number in the tens. So get some help.

By now you are at the investigation level. The likelihood of logs being missing, or not containing enough data, is probably closer to 1 than to 0. You decide that logging would have been a nice thing to have and start configuring devices and applications to log, at least locally, and you are slowly starting to embrace the thought of collecting the logs in some kind of central repository. Let's be straight and honest: log management is a must-have. It doesn't matter how you do it, but you need to configure logging on your systems. If you can store the logs centrally, investigations will be so much easier.
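To make "log locally, and centrally if you can" a bit more concrete, here is a minimal sketch using Python's standard `logging` module. The function name `build_logger`, the log format and the collector address in the usage note are my own assumptions for illustration; your applications and devices will have their own ways of doing this.

```python
import logging
import logging.handlers


def build_logger(name, local_path, central=None):
    """Return a logger that always writes to a local file and, if a
    (host, port) tuple is given, also sends records to a syslog collector.

    `central` is a hypothetical collector address, e.g. ("192.0.2.10", 514).
    """
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.handlers.clear()  # avoid duplicate handlers on repeated calls

    fmt = logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s")

    # Local copy: still available if the network or the collector is down.
    local = logging.FileHandler(local_path)
    local.setFormatter(fmt)
    logger.addHandler(local)

    # Central copy: syslog over UDP, the default transport of SysLogHandler.
    if central is not None:
        remote = logging.handlers.SysLogHandler(address=central)
        remote.setFormatter(fmt)
        logger.addHandler(remote)

    return logger
```

Something like `build_logger("orders", "/var/log/orders.log", ("192.0.2.10", 514))` would then keep a local file and forward the same events to the collector. Keeping the local handler is a deliberate choice: UDP syslog gives no delivery guarantee, so the local file is your fallback.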

With that in mind, and with a serious incident that cost the company a lot of money as a very good incentive, you go ahead and ask for money to invest in some kind of log collecting gadget that also has rudimentary capabilities to parse the log data automatically. Of course, it is possible that the incident wasn't serious enough for your investment board to consider investing, or that your boss is reluctant to buy something that will jeopardize his budget (which of course will affect his bonus). If that is the case, there are options that are free to download. Some are open source, and some vendors offer free versions of their commercial programs. These are typically limited in volume, while the functionality is usually the same as in the commercial product. Welcome to level four, reporting!
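What "rudimentary parsing" might look like can be sketched in a few lines of Python. The line layout assumed here (timestamp, host, level, message, separated by whitespace) and the names `LINE_RE` and `summarize` are hypothetical; real products handle many formats and do a lot more than count.

```python
import re
from collections import Counter

# Assumed line layout: "<timestamp> <host> <LEVEL> <free-text message>"
LINE_RE = re.compile(
    r"^(?P<ts>\S+)\s+(?P<host>\S+)\s+(?P<level>[A-Z]+)\s+(?P<msg>.*)$"
)


def summarize(lines):
    """Count events per (host, level) and report how many lines did not parse."""
    counts = Counter()
    unparsed = 0
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            unparsed += 1  # keep track of what the parser could not read
            continue
        counts[(match.group("host"), match.group("level"))] += 1
    return counts, unparsed
```

Feeding it the collected lines gives a small table of which host produced how many errors and warnings, which is about the simplest report you can build. Counting the unparsed lines matters too: a parser that silently drops what it doesn't understand will hide exactly the events you are looking for.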
