
5 bad habits in system administration


This article is aimed at newcomers to the dusty field of IT rather than at experts.

This text is meant to raise the overall level of professionalism in the industry. Because of the nature of my work, I’ve inherited all kinds of cloud-based hell, which I’m now cleaning up, optimizing, and making beautifully transparent.

These habits illustrate the worst-case scenario in system administration and should be prevented at every level.

Of course, we could discuss the reasons behind these habits forever: deadlines, laws, the speed of business development and, mmmm… plain lack of brainpower in the end. But I pursue a different goal: I want to start a constructive discussion, and its results are the goal in themselves.

Meet the habits:

1. Manual system management/configuration by administrators.

What does it mean?

I think this is the most common and at the same time the most dangerous habit of all, especially when it is combined with the others. The key problem here is that “people tend to mess things up”. And they do mess up. Of course, you think that you will never do something stupid in a trivial situation, but isn’t it better to prevent these situations from the start?

What can you do about it?

The easiest thing is to not log in to a server over SSH by hand. Never! Get familiar with configuration management systems such as Opscode Chef, Puppet and CFEngine. The introductory material, available in many languages, is more than enough for a general understanding and successful use.
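What these tools have in common is idempotence: you describe the state you want, and the system is touched only if it differs from that state. Here is a minimal Python sketch of that idea. It is not how Chef, Puppet or CFEngine are implemented, and `ensure_line` and the sample setting are hypothetical names chosen for illustration.

```python
import tempfile
from pathlib import Path


def ensure_line(path: Path, line: str) -> bool:
    """Make sure `line` is present in the file at `path`.

    Returns True if the file had to be changed, False if it was
    already in the desired state.
    """
    existing = path.read_text().splitlines() if path.exists() else []
    if line in existing:
        return False  # desired state already reached, do nothing
    existing.append(line)
    path.write_text("\n".join(existing) + "\n")
    return True


if __name__ == "__main__":
    # Demo on a throwaway file; the setting is only an example.
    config = Path(tempfile.mkdtemp()) / "sshd_config"
    print(ensure_line(config, "PermitRootLogin no"))  # first run changes the file
    print(ensure_line(config, "PermitRootLogin no"))  # second run is a no-op
```

Running the same description twice is safe, which is exactly what a hand typing commands over ssh cannot promise.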

2. Third party components that confuse system updates.

I’m almost 100% sure that every system administrator who has worked with Ruby but hadn’t yet discovered rvm / rbenv has, at least once in a lifetime, compiled it from source on a server and then used it in production.

And now a question: you urgently need to update Ruby on 16 front-end servers because a patch has been released that fixes a critical vulnerability allowing remote root access (it’s hypothetical, but you know life is unpredictable). Will you compile it on every server by hand? Or will you build an update package on a test machine and then update all the servers centrally, using the software from the first section? I hope you know the answer.

3. Standardization is missing

What is it and what can you do about it?

This is either a cause or a consequence of the first two habits. Imagine a zoo of 16 front-end servers running different versions of Debian, CentOS and Gentoo, with non-standard repositories of dubious origin enabled. Imagined? Terrified? That’s good.

But it’s very easy to give up this habit. Write a guideline and follow it through.
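A guideline is easier to follow when a script can check it. Below is a rough sketch, assuming you keep a simple inventory of hosts as Python dictionaries. The allowed OS release, the repository list and the host names are all hypothetical and stand in for whatever your own guideline says.

```python
from __future__ import annotations

# Hypothetical guideline: the only OS release and repositories we allow.
ALLOWED_OS = {("debian", "12")}
ALLOWED_REPOS = {"deb.debian.org", "security.debian.org"}


def violations(host: dict) -> list[str]:
    """Return human-readable guideline violations for one host record."""
    problems = []
    if (host["os"], host["version"]) not in ALLOWED_OS:
        problems.append(
            f"{host['name']}: unsupported OS {host['os']} {host['version']}"
        )
    # Sort so the report is stable from run to run.
    for repo in sorted(set(host["repos"]) - ALLOWED_REPOS):
        problems.append(f"{host['name']}: non-standard repository {repo}")
    return problems


# Made-up inventory, for illustration only.
inventory = [
    {"name": "front-01", "os": "debian", "version": "12",
     "repos": ["deb.debian.org"]},
    {"name": "front-02", "os": "gentoo", "version": "2.14",
     "repos": ["example.invalid"]},
]

if __name__ == "__main__":
    for host in inventory:
        for problem in violations(host):
            print(problem)
```

An empty report means the zoo is gone; anything else is a to-do list.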

4. Lack of monitoring and notifications

What is it and what can you do about it?

It’s strange, but I have run into this in 50% of companies. If you don’t even have Nagios and Monit collecting numerous metrics and sending joyful e-mails to your operations team in case of emergency, it’s guaranteed that The Day will come when you spend 24 or even 48 hours straight at work, feeling your hair turn grey.

Fighting this habit is easy, and only your imagination limits you. You can use Nagios; Zabbix + Cacti perform well too. If you are a fan of SaaS, then try Circonus and/or New Relic.

And there is a nice tool: PagerDuty. With it you can convert your e-mail alerts to SMS, it can build an on-call schedule, and overall it is cool and flexible.
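Whatever tool you pick, a check boils down to the same thing: measure a value, compare it with a threshold, and tell someone when it is exceeded. Here is a minimal sketch of a disk-space check in Python. The function names and the 90% threshold are my own assumptions, and a real setup would hand the message to Nagios, Zabbix or a similar system rather than just return it.

```python
import shutil
from typing import Optional


def disk_alert(mount: str, used: int, total: int,
               threshold: float = 0.9) -> Optional[str]:
    """Return an alert message if usage reaches `threshold`, else None."""
    fraction = used / total
    if fraction >= threshold:
        return f"CRITICAL: {mount} is {fraction:.0%} full"
    return None


def check_mount(mount: str = "/") -> Optional[str]:
    """Measure real disk usage for `mount` and evaluate it."""
    usage = shutil.disk_usage(mount)  # named tuple: total, used, free
    return disk_alert(mount, usage.used, usage.total)
```

Keeping the threshold logic apart from the measurement makes it easy to test the alert without filling up a real disk.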

5. Lack of file modification tracking.

Yesterday I modified a configuration file. And one more as well. And then my shift replacement came in and modified a bit more. And today both of us were called into the office at 3 am. Sounds familiar?

But Linus created Git in 2005, and before Git there were other VCSs. Making a Git commit after you’ve changed something is a one-second job, but this exact habit will wash your problems away when you have to perform an emergency configuration rollback. Together with configuration management systems (see point 1), it becomes a basic and most important skill in your everyday work.

I don’t know about drinking more water and flossing, but this is the habit that you should establish for sure.
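If you want to make that one-second job even harder to forget, wrap it in a small helper. This is a sketch that assumes the configuration directory is already a Git repository and that `git` is installed. The function names are hypothetical and the helper only shells out to ordinary `git add` and `git commit`.

```python
import subprocess
from pathlib import Path


def commit_commands(repo: Path, message: str) -> list:
    """Build the git commands that record every change under `repo`."""
    git = ["git", "-C", str(repo)]  # -C runs git as if started in `repo`
    return [
        git + ["add", "--all"],
        git + ["commit", "--message", message],
    ]


def commit_config(repo: Path, message: str) -> None:
    """Run the commands; raises CalledProcessError if git fails,
    which includes the case where there is nothing to commit."""
    for command in commit_commands(repo, message):
        subprocess.run(command, check=True)
```

After that, `git log` answers the 3 am question of who changed what, and `git revert` undoes it.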


The best result of this article, as I see it, would be that everyone takes a close look at their work setup and fixes the things that have been waiting to be fixed for months, fights procrastination and this time makes everything as it should be.

And as I mentioned at the beginning, my list of bad habits does not end here, and it would be good to continue it in the comments.

The original article is here
