Wednesday, 30 December 2015

How to change iLO hostname on HP ProLiant servers - SOLVED and EXPLAINED

If you own or administer HP ProLiant servers, you probably know what iLO is. iLO is short for Integrated Lights-Out. These servers have a small chip where iLO lives. So what is iLO? It is a hardware monitoring and management web GUI. iLO is always running, but to reach it you need its IP address. The default username and password are printed on a small paper card that comes with the server. The default username is Administrator and the password is a random 8-character combination of letters and numbers. Remember that the password is case sensitive. It is wise NOT to change the password of the Administrator user. Create a new user instead.

What is the iLO hostname?

[Image: iLO login page with the iLO hostname marked by a red square]

The name in the red square is the iLO hostname. Basically, it is the name of the iLO chip. You can name it Chip if you like, but it is wise to make the iLO hostname the same as the server hostname. Why? Well, think of this situation. You need to reboot a server called server1 whose iLO address is 30.30.30.31. You open that address and see the picture above, which shows the iLO firmware version and the iLO hostname. Does the iLO hostname ILOFAKENAME mean anything to you? Usually it does not. So you log in and reboot the server. Twenty seconds later your CTO calls and asks why server2 is down. Yes, you rebooted the wrong server. That is why it is wise to change the iLO hostname to the server name.

OK, let's change the iLO hostname on an HP ProLiant server.
Step 1. Log in to iLO. You will see something like this:
[Image: iLO page shown after login]


Step 2. Go to Network in the left menu.
[Image: Network entry in the iLO left menu]

Step 3. Change iLO Subsystem Name (Host Name) to a name you will recognize. For this test we will change it to myserver.
[Image: iLO Subsystem Name (Host Name) field set to myserver]

For these changes to take effect you have to reset the iLO chip. This has absolutely no effect on the server. THIS WILL NOT REBOOT THE SERVER, ONLY THE iLO CHIP.

Step 4. Wait for iLO to reset and check the change.
[Image: iLO showing the new hostname]
That is it. The change takes only a few minutes but can save you a lot of time.
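
The wrong-reboot story above can also be guarded against with a tiny check before you press the reset button. This is only a sketch: confirm_ilo_target is a made-up helper, not an iLO feature, and it assumes you type in both the server you intend to reboot and the iLO hostname you see on the login page.

```shell
#!/bin/sh
# Hypothetical pre-reboot guard (not part of iLO): compare the server you
# intend to reboot with the iLO hostname shown on the login page.
confirm_ilo_target() {
    # $1 = server you intend to reboot, $2 = iLO hostname you are looking at
    # Lowercase both names so "MyServer" and "myserver" compare equal.
    intended=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    shown=$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]')
    if [ "$intended" = "$shown" ]; then
        echo "OK: iLO '$2' matches '$1'"
        return 0
    fi
    echo "STOP: iLO '$2' does not match '$1'" >&2
    return 1
}
```

For example, confirm_ilo_target server1 ILOFAKENAME returns a non-zero status, so a wrapper script could refuse to continue until the names agree.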


Tuesday, 22 September 2015

High CPU load because of ksoftirqd processes caused by leap second issue - SOLVED

 High CPU, ksoftirqd, LEAP second - SOLVED EXPLAINED

If your server starts to act funny (slow response times, high CPU load, etc.), the first thing you do is run the top command. If your top output looks something like this

[Image: top output showing very high CPU load]

you know that something strange is going on. The CPU load is very, very high! My first guess was that something was wrong with the Java application running on this server. But when I noticed that other servers had the same issue (very high CPU load), I knew that something strange was happening, because those servers were in different locations, on different platforms (hardware, virtual), ran different OSes (but all Linux), and ran different applications (DB, Java, etc.). I also noticed these ksoftirqd processes. They did not use much CPU, but it is strange that they were so high in the top CPU list.

[Image: top output with ksoftirqd processes high in the CPU list]

So all the affected servers had the same problem: the ksoftirqd processes are causing this! What is ksoftirqd?

ksoftirqd is a per-cpu kernel thread that runs when the machine is under heavy soft-interrupt load. Soft interrupts are normally serviced on return from a hard interrupt, but it's possible for soft interrupts to be triggered more quickly than they can be serviced.

So these soft interrupts are, for some reason, causing other processes (like the Java process) to use too much CPU.
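
If you want to see which kind of soft interrupt is busy, Linux exposes per-CPU counters in /proc/softirqs. Here is a minimal sketch that adds the counters up per type. The function name sum_softirqs is my own, and I did not use this during the incident, so treat it as an illustration only.

```shell
#!/bin/sh
# Sum each softirq type across all CPUs.
# $1 = file in /proc/softirqs format (defaults to /proc/softirqs itself).
sum_softirqs() {
    awk 'NR > 1 {                    # line 1 is the CPU0 CPU1 ... header
        name = $1
        sub(/:$/, "", name)          # "TIMER:" -> "TIMER"
        total = 0
        for (i = 2; i <= NF; i++)    # one column per CPU
            total += $i
        printf "%s %.0f\n", name, total
    }' "${1:-/proc/softirqs}"
}
```

The counters only ever grow, so a single run tells you little. Running sum_softirqs | sort -k2,2nr twice, a few seconds apart, and comparing the numbers should show which type is climbing fastest.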

What caused this? And why? To make things worse, I rebooted one server with high CPU load because of ksoftirqd processes, but this did not help!


After a few hours of searching the Internet for a possible cause, I remembered seeing a news item mentioning that "tonight" (the night before the problem started) there was going to be one extra second, and that this is normal: an extra second is added every few years. I continued to search for a solution, and then on one forum I read someone mentioning a "leap" second. What is a leap second?

From Wikipedia:
A leap second is a one-second adjustment that is occasionally applied to Coordinated Universal Time (UTC) in order to keep its time of day close to the mean solar time, or UT1. Without such a correction, time reckoned by Earth's rotation drifts away from atomic time because of irregularities in the Earth's rate of rotation. 
 
The NTP packet includes a leap second flag, which informs the user that a leap second is imminent. This, among other things, allows the user to distinguish between a bad measurement that should be ignored and a genuine leap second that should be followed. It has been reported that never, since the monitoring began in 2008 and whether or not a leap second should be inserted, have all NTP servers correctly set their flags on a December 31 or June 30. This is one reason many NTP servers broadcast the wrong time for up to a day after a leap second insertion.
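
The leap second flag mentioned in the quote is the 2-bit Leap Indicator (LI) in the top two bits of the first byte of an NTP packet. A small sketch of how to decode it, assuming you already have that first byte as a number (capturing the packet is not covered here, and ntp_leap_indicator is a name I made up):

```shell
#!/bin/sh
# Decode the Leap Indicator (LI) from the first byte of an NTP packet.
# $1 = first byte as a number, e.g. 0x64 or 100 (hypothetical input).
ntp_leap_indicator() {
    li=$(( ($1 >> 6) & 3 ))    # LI is the top two bits of the byte
    case "$li" in
        0) echo "LI=0: no warning" ;;
        1) echo "LI=1: last minute of the day has 61 seconds" ;;
        2) echo "LI=2: last minute of the day has 59 seconds" ;;
        3) echo "LI=3: clock not synchronized" ;;
    esac
}
```

For example, a first byte of 0x64 is binary 01100100, so the top two bits are 01 and the function reports LI=1, the value a server sends when a leap second is about to be inserted.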

So I started to search for a fix for this leap second issue as the possible cause of my problem, because it was 1.7.2015, the day after a leap second was added at the end of 30 June 2015.
How do you solve the leap second issue that causes high CPU load because of ksoftirqd processes?

# date -s now

As soon as I executed this, the CPU load started to drop! Within 2 minutes the CPU load was back to normal.
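
If you have to do this on many servers, it helps to check the load with a script instead of staring at top. Below is a rough sketch that compares the 1-minute load average with the number of CPUs. The function names and the "load above CPU count means high" threshold are my assumptions, so pick whatever limit fits your servers.

```shell
#!/bin/sh
# Return success (0) when the load average exceeds the CPU count.
# $1 = 1-minute load average, $2 = number of CPUs.
# The "load > CPUs" threshold is an assumption, not a rule.
load_is_high() {
    awk -v load="$1" -v cpus="$2" 'BEGIN { exit !(load > cpus) }'
}

# Report on the local machine; only meaningful on Linux with /proc mounted.
report_load() {
    load=$(cut -d ' ' -f 1 /proc/loadavg)    # first field = 1-minute average
    cpus=$(getconf _NPROCESSORS_ONLN)        # CPUs currently online
    if load_is_high "$load" "$cpus"; then
        echo "load $load is still above $cpus CPUs"
    else
        echo "load $load is back under $cpus CPUs"
    fi
}
```

Calling report_load a couple of minutes after the date command should tell you whether the load is coming down.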