Friday, 7 October 2016


If you are an AIX admin, you probably have experience with the Hardware Management Console, or HMC for short. You can do all HMC tasks via the web application or via the command line.

The web application is a Java application. If you recently updated your web browser or Java on your PC, you may find that you can no longer access the HMC web application because of browser or Java security restrictions. What then?

Reboot LPAR by using HMC CLI

Then you connect to the HMC via PuTTY. It is a regular SSH connection! Your user must have admin privileges!

Today I could not connect to the HMC via the web application, and the paging space on an LPAR was very, very low, so the LPAR stopped responding. I needed to reboot it! How do you do that in the HMC CLI?

There are two things you need to know: the name of the machine where the LPAR is located, and the name of the LPAR.

#:lssyscfg -r sys -F name:state

OK, so we have two machines.

#:lssyscfg -m MACHINE_NAME1 -r lpar -F name:state

OK, now we have the LPAR names on the machine.
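If the machine hosts a lot of LPARs, you can filter that list on your workstation instead of reading it by eye. This is only a minimal sketch: it assumes one name:state pair per line (which is what -F name:state asks for), and the function name lpars_in_state, the host hmc.example.com and the user hscroot are placeholders of mine, not something from the HMC itself.

```shell
# Print the names of all LPARs whose state equals the first argument.
# Reads "name:state" lines (the -F name:state format) on stdin.
lpars_in_state() {
    awk -F: -v want="$1" '$2 == want { print $1 }'
}

# Hypothetical usage from a workstation (host and user are placeholders):
#   ssh hscroot@hmc.example.com 'lssyscfg -m MACHINE_NAME1 -r lpar -F name:state' \
#       | lpars_in_state "Running"
```

The filtering is done locally, so nothing extra runs on the HMC.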

If you need to restart LPAR1 on MACHINE1 immediately, here is the command:
chsysstate -r lpar -m MACHINE1 -o shutdown --immed --restart -n LPAR1
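Since an immediate restart is not something you want to fire at the wrong LPAR, it can help to build the command first and look at it before running it. A small sketch, assuming machine and LPAR names without spaces or shell special characters; reboot_cmd, the host and the user are hypothetical names for illustration only.

```shell
# Build the immediate-restart command for one LPAR and print it.
# Nothing is executed here, so the line can be reviewed before it is run.
reboot_cmd() {
    machine="$1"
    lpar="$2"
    if [ -z "$machine" ] || [ -z "$lpar" ]; then
        echo "usage: reboot_cmd MACHINE LPAR" >&2
        return 1
    fi
    printf 'chsysstate -r lpar -m %s -o shutdown --immed --restart -n %s\n' "$machine" "$lpar"
}

# Hypothetical usage (host and user are placeholders):
#   reboot_cmd MACHINE1 LPAR1                                  # review the line first
#   ssh hscroot@hmc.example.com "$(reboot_cmd MACHINE1 LPAR1)"
```

The function only prints the same chsysstate line shown above; sending it over ssh is a separate, deliberate step.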


Thursday, 18 August 2016

WARNING: hpasmd: System Overheating (Zone 4, Location CPU, Temperature 85C) - SOLVED and EXPLAINED

If you find this message in your server log, you are probably looking for the reason why your HP server sometimes reboots itself for no apparent reason.

WARNING: hpasmd: System Overheating (Zone 4, Location CPU, Temperature 85C)

So the reason is overheating! As you know, every machine has an operating temperature range. The hardware manufacturer always states it in that boring PDF that you never read. If the temperature goes outside that range, the machine can malfunction. To avoid that, some manufacturers build in protective mechanisms. In the case of HP servers, that mechanism is a server reboot.

HP servers System Overheating reason


Reasons for system overheating vary, from very high load on a particular CPU to too much dust blocking the cooling fans. Overheating can be a rare event (once every few months) or happen every day. It all depends.

You have to rule out the possible causes one by one. Monitor the temperatures and try to correlate them with CPU load. HP has a tool called hpasm with which you can monitor various things. For temperatures, enter the hpasm CLI and use show temp:

hpasmcli> show temp
Sensor   Location              Temp       Threshold
------   --------              ----       ---------
#1       CPU#1                 46C/114F   85C/185F
#2       CPU#2                 43C/109F   85C/185F
#3       CPU#3                 48C/118F   85C/185F
#4       CPU#4                 50C/122F   85C/185F
#5       I/O_ZONE              37C/98F    60C/140F
#6       AMBIENT               22C/71F    40C/104F
#7       SYSTEM_BD             39C/102F   60C/140F
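Reading this table by hand gets old quickly, so here is a rough sketch of a filter that prints only the sensors that are getting close to their threshold. It assumes the column layout shown above (sensor, location, temp, threshold) and that your hpasmcli version accepts -s to run a single command non-interactively; the function name hot_sensors and the default margin of 10 degrees are my own choices.

```shell
# Read "show temp" output on stdin and print every sensor whose Celsius
# reading is within MARGIN degrees of its threshold (default margin: 10).
# Rows without a numeric threshold are skipped.
hot_sensors() {
    awk -v margin="${1:-10}" '
        /^#[0-9]+/ {
            temp = $3 + 0     # "46C/114F" -> 46 (awk keeps the leading number)
            limit = $4 + 0    # "85C/185F" -> 85
            if (limit > 0 && temp >= limit - margin)
                print $2, temp "C", "(threshold " limit "C)"
        }
    '
}

# Hypothetical usage, for example from cron:
#   hpasmcli -s "show temp" | hot_sensors 10
```

With the readings above nothing is printed, because every sensor is more than 10 degrees below its threshold. If you log the output together with the system load, it becomes easier to see whether the hot moments match the busy ones.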

If you have ruled out high load as the cause of the overheating, then it's probably an airflow issue. How to fix this? Shut down the server, open it and start the vacuum cleaner. :) No, I am not joking. You may find something like this:
[Image: WARNING: hpasmd: System Overheating]

On the left is the CPU heat sink before vacuum cleaning and on the right is the CPU heat sink after vacuum cleaning. As you can see, there was a wall of dust blocking normal air flow. This was the reason for the system overheating and the server reboots.

If you still think I am joking, take a look at this time graph of CPU temperatures.

As you can see, there is a significant drop in CPU temperatures after the dust removal!

The point is: keep your servers clean :)