Unix server “dissection” with the “htop” command

A client of mine had problems with his hosting. I dug into it for about a month but found nothing. It was a "virtual dedicated server": the server we could see was a separate "jail" inside a bigger shared server. "Cloud computing", you know... Finally, with the help of a student of mine, we saw the screen below, which uncovered the problem.


Those who work with Unix systems will recognize the "htop" command. It shows the processes on the server in real time. But there is a hidden "piggy"... Let me explain:

The rows numbered 1-4 are the processor's 4 cores. They say the server is very busy: all the cores are 100% loaded.
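If you want the number behind those bars, here is a minimal sketch. It assumes a Linux host that exposes per-core counters in /proc/stat (the post does not say which Unix this was). The helper name `cpu_busy_fraction` and the sample lines are invented for illustration, and htop's own calculation may differ in detail.

```python
def cpu_busy_fraction(before, after):
    """Fraction of time a core was busy between two /proc/stat samples.

    Each argument is one line such as "cpu0 100 0 50 800 50 0 0 0".
    The first eight counters are user, nice, system, idle, iowait,
    irq, softirq and steal; idle and iowait count as "not busy".
    """
    def split(line):
        values = [int(v) for v in line.split()[1:9]]
        idle = values[3] + (values[4] if len(values) > 4 else 0)
        return sum(values), idle

    total_before, idle_before = split(before)
    total_after, idle_after = split(after)
    total_delta = total_after - total_before
    if total_delta <= 0:
        return 0.0  # no time elapsed between the two samples
    return (total_delta - (idle_after - idle_before)) / total_delta


# Illustrative samples only (not taken from the server in this post).
sample_before = "cpu0 100 0 50 800 50 0 0 0"
sample_after = "cpu0 200 0 100 800 50 0 0 0"
print(f"core busy: {cpu_busy_fraction(sample_before, sample_after):.0%}")
```

On a real machine you would read the `cpu0`, `cpu1`, ... lines from /proc/stat twice, about a second apart, and pass each pair to the function.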

The "Mem" row is memory. It is a bit more than half full - not busy at all. The next row, "Swp", stands for "swap" - a memory extension on the disk. It is practically free.
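The same two figures can be checked without htop. This is a minimal sketch, assuming a Linux host where memory counters are available as text in /proc/meminfo. The function names and the sample values are hypothetical; they are not the numbers from the server in this post.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (kB)."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields:
            info[key.strip()] = int(fields[0])
    return info


def used_fraction(total, free):
    """Share of `total` that is not free; 0.0 when there is nothing to use."""
    return 0.0 if total == 0 else (total - free) / total


def memory_summary(info):
    """Used fractions for RAM and swap, roughly what the two bars show."""
    # MemAvailable is missing on old kernels, so fall back to MemFree.
    available = info.get("MemAvailable", info.get("MemFree", 0))
    return {
        "mem_used": used_fraction(info.get("MemTotal", 0), available),
        "swap_used": used_fraction(info.get("SwapTotal", 0),
                                   info.get("SwapFree", 0)),
    }


# Made-up sample; on a real host pass open("/proc/meminfo").read() instead.
SAMPLE = """MemTotal:       4000000 kB
MemFree:         500000 kB
MemAvailable:   1800000 kB
SwapTotal:      2000000 kB
SwapFree:       1990000 kB
"""
summary = memory_summary(parse_meminfo(SAMPLE))
print(f"memory used: {summary['mem_used']:.0%}, swap used: {summary['swap_used']:.1%}")
```

A swap figure near zero, as in the sample, is what "practically free" looks like in numbers.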

The rows on the right: they show that the server has 83 tasks - a very small number, usual for an idle server. 9 tasks are active at this moment, shown as "running". The next row shows the load average of the server as 3 numbers: the averages over the last 1, 5 and 15 minutes. The values are 2.86, 1.15 and 0.80 - this means the hardware is mostly free, since with all 4 cores 100% busy the value should be at least 4 (4 cores x 100%).
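The "4 cores x 100%" rule is easy to turn into arithmetic: divide each load figure by the number of cores, and anything well below 1.0 means spare capacity. A small sketch with the three values from the screen; the function names are invented here, and the text format assumed is the Linux /proc/loadavg one.

```python
def parse_loadavg(text):
    """Return the three load averages from /proc/loadavg-style text."""
    return tuple(float(value) for value in text.split()[:3])


def load_per_core(loads, cores):
    """Scale each load average by the core count (1.0 = fully loaded)."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return tuple(load / cores for load in loads)


# The three values read from the htop screen, on a 4-core server.
loads = parse_loadavg("2.86 1.15 0.80")
for label, value in zip(("first", "second", "third"), load_per_core(loads, 4)):
    print(f"{label} value per core: {value:.2f}")
```

On a live Unix system, Python's `os.getloadavg()` returns the same three numbers and `os.cpu_count()` gives the core count.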

So only the processor is doing some big job, but it does not show on the other components: memory is free, swap space is free, and the load is low and has been for a long time.

There is only one case when this is possible - when there is massive reading or writing on the disk. Find the "S" column in the next (bright) row; it shows the "process state". "R" means "running" (one of the 9 tasks I mentioned already), "S" means "sleeping" (inactive), and "D" means uninterruptible sleep, which usually means waiting on a disk operation. But there is no process doing a disk operation here. There are hidden processes, since not all of them fit on the screen, but such a process would have to be at the top of the list, as it loads the processors. We sorted the list by processor load ("CPU%" is highlighted), so it should be first in the list. But there is none.

In this case there is only one possibility for disk activity: the server is "swapping" - parts of memory are saved to the disk, then loaded back while other parts are saved there. This way the system uses the hard disk as "spare memory". It is a very intensive job that eats processor time, but it is the only way for the server to keep running, even slowly, when memory is full. But it is not... Both memory and swap are free.
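Since not every process fits on one htop screen, the hunt for a "D" process can also be done by counting states over the whole process list. This is a rough sketch, assuming a Linux host where each process has a /proc/<pid>/stat line; the helper names and the sample lines are made up for illustration.

```python
from collections import Counter


def state_from_stat(line):
    """Extract the one-letter state from a /proc/<pid>/stat line.

    The format is "pid (command) state ...". The command may contain
    spaces or parentheses, so split after the LAST closing parenthesis.
    """
    rest = line[line.rfind(")") + 1:]
    return rest.split()[0]


def count_states(stat_lines):
    """Count how many processes are in each state (R, S, D, ...)."""
    return Counter(state_from_stat(line) for line in stat_lines)


# Made-up sample lines; on a real host read every /proc/<pid>/stat file.
SAMPLE_LINES = [
    "1 (init) S 0 1 1 0",
    "812 (php-cgi) R 1 812 812 0",
    "813 (php-cgi) R 1 813 813 0",
    "990 (odd (name) here) S 1 990 990 0",
]
counts = count_states(SAMPLE_LINES)
print("running:", counts["R"], "sleeping:", counts["S"], "disk wait:", counts["D"])
```

A "disk wait" count of zero across all processes, hidden ones included, is the situation described above.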

Houston, we have a problem... Misleading info. But "htop" cannot lie - it doesn't know how.

Result: faulty processor. We moved the site to another hosting provider.
