Thursday 8 March 2018

Checking for softnofiles=4096; softnofiles=1024. Failed - installation check issue with limits.conf SOLVED and EXPLAINED

If you have ever tried to install some Oracle software, you know you have to set up hard and soft limits first. This is done in /etc/security/limits.conf. After changing these settings, you need to log off and back on for them to take effect.


Checking for softnofiles=4096; softnofiles=1024. Failed

So even though you set your user's soft nofile limit to the preferred value, the change did not take effect. Why?

This is on Red Hat Enterprise Linux Server 7.4; I suppose it is the same in other distros.

These parameters CANNOT be greater than the value applied to the super-user root! So if root's soft limit is 1024, then your non-super-user's limit cannot be greater than 1024.

How to fix this?

In limits.conf add this

root - nofile 100000
user soft nofile 4096

and DO NOT CLOSE YOUR SESSION, BECAUSE IF THERE IS ANY ISSUE YOU WILL NOT BE ABLE TO LOG BACK IN WITH PUTTY! Just start a new session and check these parameters there.
Also, remember this - hard limits should be greater than or equal to soft limits!
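A quick way to verify is the shell's own ulimit built-in. Open a new session and compare the values with what you put in limits.conf:

```shell
# Check the limits the new session actually got.
ulimit -Sn   # soft nofile limit - this is what the Oracle installer checks
ulimit -Hn   # hard nofile limit - must be >= the soft limit
```

If the soft value is still the old one, check root's limits first, as explained above.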


Friday 2 March 2018

SSH X11 forwarding "Error: Can't open display:" issue on RHEL 7.x SOLVED and EXPLAINED

There is no need to install an Xvnc server or a desktop environment on your server just because someone needs a GUI for installing applications like Oracle DB or some web application.

You can use SSH X11 forwarding for GUI stuff.

What is SSH X11 forwarding?

First you need to know what X11 is. Simply said - X11 is the GUI. What is SSH? SSH is short for Secure SHell. You can connect to a server over SSH by using PuTTY or some other program.
An SSH session is very "light" and it is used for CLI access to the server.

SSH X11 forwarding will enable the GUI through an SSH session. Meaning? You use the GUI environment of your client machine to do work on the server that needs to be done from a GUI!

Why is this probably better than installing a complete GUI environment on your server? Because a GUI environment can be pretty big and, depending on your server resources, can consume much-needed CPU and memory.

Checklist for enabling SSH X11 forwarding

To enable X11 forwarding you need to install a few packages and make some configuration adjustments!

Step 1 - installing the needed packages

Because I am doing this on RHEL 7.4, the package names may differ slightly on other distros.

Packages are:

xorg-x11-xauth
xorg-x11-fonts-* 
xorg-x11-utils

It is wise to install xorg-x11-utils as well because it provides xclock, a simple GUI clock that can be used to test whether your SSH X11 forwarding is working.

So:

yum install xorg-x11-xauth xorg-x11-fonts-* xorg-x11-utils

Step 2 - enabling X11 forwarding in sshd_config

Check if X11 forwarding is enabled. This is checked in the SSH server configuration file:

# cat /etc/ssh/sshd_config |grep X11Forwarding
X11Forwarding yes

It has to be yes if you want this to work.
Step 3 - setting AddressFamily

This step is needed for RHEL 7.X and CentOS 7.X.

In the sshd_config file there is a parameter named AddressFamily. This parameter is set to any by default.

cat /etc/ssh/sshd_config |grep any
#AddressFamily any

Even though this parameter is commented out, it is active, because the default value is any. What does this mean anyway? AddressFamily refers to accepting connections from IPv4 or IPv6 networks. Any means it can be both, with the OS default coming first. In older Linux distributions, the default for any was the IPv4 address family. In newer distros such as RHEL 7.X or CentOS 7.x, any resolves to the IPv6 address family.
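The usual fix is to force sshd to IPv4 by setting AddressFamily to inet and restarting sshd. Here is a sketch; the helper name `set_address_family` is my own invention, and you may want to try it on a copy of the file first:

```shell
# Sketch: set AddressFamily to inet (IPv4 only) in an sshd_config-style file.
# set_address_family is a hypothetical helper name, not a standard tool.
set_address_family() {
    cfg=$1
    # Replace the line whether it is commented out ("#AddressFamily any") or not.
    sed -i 's/^#\{0,1\}AddressFamily.*/AddressFamily inet/' "$cfg"
}

# On the real server (as root), then restart the daemon:
#   set_address_family /etc/ssh/sshd_config
#   systemctl restart sshd
```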

The funny thing is that if this parameter is set to any, you can connect to the server with a regular SSH session (no X11 forwarding), but if you connect over SSH with X11 forwarding enabled on your client, you will not be able to start an X11 session. Your error will be

<program you want to run> Error: Can't open display:

In my case this is

# xclock
Error: Can't open display:
 
If you did not set this right, your X11 SSH session will not have the DISPLAY variable set.

After all this is set, you can start your SSH X11 forwarding session!
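From a Linux client you would connect with `ssh -X user@server` (in PuTTY, tick Connection > SSH > X11 > Enable X11 forwarding). Here is a tiny sketch you can use to fail fast when forwarding is broken; `check_display` is my own name for it:

```shell
# Sketch: verify the session actually has a DISPLAY before launching GUI tools.
check_display() {
    if [ -z "$DISPLAY" ]; then
        echo "DISPLAY is not set - X11 forwarding is not working" >&2
        return 1
    fi
    echo "DISPLAY=$DISPLAY"
}

# Usage in a forwarded session:
#   check_display && xclock
```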




Tuesday 6 February 2018

How to remove LUNZ from LINUX - EXPLAINED and SOLVED

What are LUNZ in the first place?

When you attach storage to a server but haven't yet assigned any LUNs from the storage to that server, the server sees something called LUNZ. You can think of a LUNZ as a dummy (ghost) LUN. How many LUNZ would you have? That depends on how many paths you have to your storage device. Each LUNZ represents one path. So if you have 4 paths to storage, you will have 4 LUNZ; if you have 8 paths, you will have 8 LUNZ, etc.

This applies to EMC storage systems (CLARiiON, Symmetrix, VNX, Unity...).

Are LUNZ a good or a bad thing?

Seeing LUNZ is generally a good thing. It means that your server has a connection to the storage system. Where can you check whether you can see LUNZ?

#cat /proc/scsi/scsi
.
.
Host: scsi1 Channel: 00 Id: 05 Lun: 2
  Vendor: DGC      Model: LUNZ            Rev: 8301
  Type:   Direct-Access                    ANSI  SCSI revision: 06
Host: scsi1 Channel: 00 Id: 04 Lun: 2
  Vendor: DGC      Model: LUNZ            Rev: 8301
  Type:   Direct-Access                    ANSI  SCSI revision: 06
Host: scsi2 Channel: 00 Id: 05 Lun: 2
  Vendor: DGC      Model: LUNZ            Rev: 8301
  Type:   Direct-Access                    ANSI  SCSI revision: 06
Host: scsi2 Channel: 00 Id: 04 Lun: 2
  Vendor: DGC      Model: LUNZ            Rev: 8301
  Type:   Direct-Access                    ANSI  SCSI revision: 06


All disks are shown to the system as SCSI devices, and the same applies to LUNZ. Each path to the storage device is represented as one LUNZ.
What can we see from the above output? We can see that we have 4 paths to storage and that we are connected to a storage system with Rev number 8301.

Why is this Rev number important? When you are troubleshooting, the Rev number is unique per storage system, so if you are not sure which storage system your server is attached to, log in to a serverA that already has LUNs attached, execute cat /proc/scsi/scsi and check the Rev number. If it is the same on server and serverA, we can conclude that they are attached to the same storage system.
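Both checks - the path count and the Rev number - are easy to script. A small sketch; `lunz_info` is my own helper name, and it expects `/proc/scsi/scsi`-style input:

```shell
# Sketch: summarise LUNZ entries from a /proc/scsi/scsi-style file.
# lunz_info is a hypothetical helper name, not a standard tool.
lunz_info() {
    scsi=$1
    # Each "Model: LUNZ" line is one path to the storage system.
    echo "LUNZ paths: $(grep -c 'Model: LUNZ' "$scsi")"
    # The unique Rev numbers identify the storage system(s) behind those paths.
    grep 'Model: LUNZ' "$scsi" | sed 's/.*Rev: */Rev: /' | sort -u
}

# Usage on a live box:
#   lunz_info /proc/scsi/scsi
```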


Should you keep LUNZ on your system?

The only answer to this question is NO. Why? They will do no harm, but once you assign a LUN to your server and rescan your SCSI host, proper SCSI devices will show up and things can get a little too crowded. Even though every LUN has a unique SCSI ID multiplied by the number of paths to storage (if you have 4 paths to the storage system, one LUN is represented as 4 SCSI devices; with 8 paths, 8 devices, ...), it's always wise to keep your system clean of junk. And LUNZ are junk.

How to remove LUNZ?

First list your scsi devices.
#cat /proc/scsi/scsi
.
.
Host: scsi1 Channel: 00 Id: 05 Lun: 2
  Vendor: DGC      Model: LUNZ            Rev: 8301
  Type:   Direct-Access                    ANSI  SCSI revision: 06
Host: scsi1 Channel: 00 Id: 04 Lun: 2
  Vendor: DGC      Model: LUNZ            Rev: 8301
  Type:   Direct-Access                    ANSI  SCSI revision: 06
Host: scsi2 Channel: 00 Id: 05 Lun: 2
  Vendor: DGC      Model: LUNZ            Rev: 8301
  Type:   Direct-Access                    ANSI  SCSI revision: 06
Host: scsi2 Channel: 00 Id: 04 Lun: 2
  Vendor: DGC      Model: LUNZ            Rev: 8301
  Type:   Direct-Access                    ANSI  SCSI revision: 06


From here we can see where the LUNZ are located as SCSI devices.
All the info you need is here:
Host: scsi1 Channel: 00 Id: 05 Lun: 2
 Vendor: DGC      Model: LUNZ            Rev: 8301
 Type:   Direct-Access                    ANSI  SCSI revision: 06


cd /sys/class/scsi_device

Here is location of all scsi devices including LUNZ.

 #cd 1\:0\:5\:2/

is the location of one LUNZ. How do we know this?
Host: scsi1 Channel: 00 Id: 05 Lun: 2

These numbers map directly to the directory name 1:0:5:2 (host:channel:id:lun).

#cd device
#:/sys/class/scsi_device/1:0:5:2/device # ls
block           evt_media_change  modalias              rescan        state
bsg             generic           model                 rev           subsystem
delete          iocounterbits     power                 scsi_device   timeout
device_blocked  iodone_cnt        queue_depth           scsi_disk     type
dh_state        ioerr_cnt         queue_ramp_up_period  scsi_generic  uevent
driver          iorequest_cnt     queue_type            scsi_level    vendor


Let's delete the SCSI device:
#echo 1 > delete

And the LUNZ is deleted. Now check the SCSI device list with cat /proc/scsi/scsi. You should see one LUNZ less than before.

Do the same thing for the rest of the LUNZ.
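Doing this by hand for every path gets tedious. Here is a sketch that loops over all SCSI devices and deletes only the ones whose model is LUNZ. `remove_lunz` and the `SYSFS_ROOT` override are my own inventions; the override exists so you can try the logic against a fake directory tree before pointing it at the real /sys:

```shell
#!/bin/sh
# Sketch: delete every SCSI device whose model is LUNZ.
# SYSFS_ROOT is a hypothetical override so the logic can be dry-run on a fake tree.
SYSFS_ROOT=${SYSFS_ROOT:-/sys/class/scsi_device}

remove_lunz() {
    for dev in "$SYSFS_ROOT"/*/device; do
        [ -f "$dev/model" ] || continue
        # The model file is space-padded, e.g. "LUNZ            ", so strip spaces.
        model=$(tr -d ' ' < "$dev/model")
        if [ "$model" = "LUNZ" ]; then
            echo "removing $dev"
            echo 1 > "$dev/delete"   # same as the manual 'echo 1 > delete' step
        fi
    done
}

# On a real box (as root):
#   remove_lunz
```

Be careful: on the real /sys this actually removes devices, so double-check what it matches first.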

P.S. You should always be careful with removing SCSI devices. One mistake can be fatal for your server.

Until you assign a LUN to your server, LUNZ will reappear every time you rescan your SCSI host.

WARNING: Re-reading the partition table failed with error 22: Invalid argument. - SOLVED and EXPLAINED

In case you get this WARNING while creating a partition on a multipath device

 WARNING: Re-reading the partition table failed with error 22: Invalid argument. 

don't panic.

 
# fdisk /dev/mapper/mpatha
Partition number (1-4): 1
   First cylinder (1-19074, default 1):
   Using default value 1
   Last cylinder or +size or +sizeM or +sizeK (1-19074, default 19074):
   Using default value 19074

   Command (m for help): p

   Disk /dev/mapper/mpatha: 156.8 GB, 156892397568 bytes
   255 heads, 63 sectors/track, 19074 cylinders
   Units = cylinders of 16065 * 512 = 8225280 bytes

                            Device Boot      Start         End      Blocks   Id  System
   /dev/mapper/mpathap1               1       19074   153211873+  83  Linux

   Command (m for help): w
   The partition table has been altered!

   Calling ioctl() to re-read partition table.

   WARNING: Re-reading the partition table failed with error 22: Invalid argument.  <------
   The kernel still uses the old table.
   The new table will be used at the next reboot.
   Syncing disks.

Depending on your Linux distro, there can also be a suggestion to reboot the server or to use parted or kpartx so that the new partition table gets used.

In case you get this error, first check whether kpartx sees the newly created partition:

kpartx -l /dev/mapper/mpatha
mpatha1 : 0 33552384 /dev/mapper/mpatha 2048

This means that kpartx can see the partition on the multipath device mpatha normally. In other words, the new partition table is already applied.

P.S. Don't mind the numbers in the last command.

Thursday 1 February 2018

Migrating data from LUN to LUN by using LVM or How to transfer data to new storage online by using LVM - SOLVED and EXPLAINED

So it's time to migrate your servers to a new storage system, or you just want to replace local hard drives. You can do this by calling your users and telling them they have to stop the application, the DB or something else. Usually, users are spoiled brats and they will go on about downtime, etc. If you are a system admin, I bet you have had these kinds of conversations. But an admin's gotta do what an admin's gotta do...

Also, even though we are talking about LUNs here, because that is what remote disks on storage systems are called, the story is the same if you add a new local hard drive. In the example below, I tested this with Ubuntu 16 on my VirtualBox. The logic is completely the same; the only difference is the name of the disk devices. LUNs are named emcpowerX if you are using PowerPath as multipathing software, while local drives are named sdX.

The "offline" (call users to stop applications) method to replace LUNs


For the purpose of this, let's assume we have a Linux OS with no LVM. Also, we won't be dealing here with connecting to the new storage, adding new LUNs and so on. We will assume all that is already done.

This goes something like this:
1. OS must see the new LUN
2. stop the application
3. copy data from old_LUN to new_LUN
4. replace mount points so that the mount point points to new_LUN
5. start the application, check it
6. remove old_LUN

This can take time. The bigger the disks, the longer users have to wait, the longer the application downtime, the unhappier your boss, etc.
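The copy step (step 3) is the slow part. A minimal sketch of it; `copy_lun_data` is my own name, and I use cp -a here only to keep it dependency-free (rsync -aHAX would be the more usual admin choice):

```shell
# Sketch: copy everything from the old LUN's mount point to the new one,
# preserving permissions, ownership and symlinks (-a = archive mode).
copy_lun_data() {
    src=$1 dst=$2
    cp -a "$src/." "$dst/"
}

# Hypothetical usage, with the new LUN already mounted:
#   copy_lun_data /mnt/old_LUN /mnt/new_LUN
```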

The "online" method - users don't even know that you moved applications to the new storage system, because you are using LVM

On my second day at work, 10 years ago, my mentor Tatjana told me: "Always use LVM because it will be easier later to do things." So I listened to her. This post is not about what LVM is, the logic behind LVM, what a PV is, what a VG is and what an LV is. I will assume that you already know all that. In case you don't (I have to write this in capital bold letters):

LEARN WHAT LVM IS AND START TO USE IT

Why? Read what Tatjana told me: "it will be easier later to do things".

And it did make things a lot easier!
How to do this? When you understand the logic of LVM, it's quite logical to do data migration from disk to disk this way. Of course, there are some restrictions. Actually, there is only one restriction:

THE NEW DISK MUST (well, should) NOT BE SMALLER THAN THE OLD DISK


Why? Because we are not touching the file system layer or the logical volume layer. We are only touching the physical volume and volume group layers.


As I said before, for my testing purposes I used Ubuntu 16 running on VirtualBox on my PC. As I also said, there is no difference in the procedure whether you are migrating data from LUN to LUN or from local disk to local disk. The only difference is the name of the disk devices.

How does this work? The basic logic is this:

1. create a PV on the new disk
2. extend the VG with that new disk
3. pvmove the old disk to the new disk
4. reduce the VG

So let's begin!
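These steps can be condensed into one sketch. `migrate_pv` and the `DRY_RUN` switch are my own inventions; with DRY_RUN=1 it only prints the commands it would run, which is a cheap way to double-check the device names before doing it for real:

```shell
# Sketch: migrate all data in a VG from one PV to another, online.
# DRY_RUN=1 just echoes the commands instead of executing them.
migrate_pv() {
    vg=$1 old=$2 new=$3
    run() { if [ "${DRY_RUN:-0}" = "1" ]; then echo "$@"; else "$@"; fi; }
    run pvcreate "$new"          # 1. new disk/partition becomes a PV
    run vgextend "$vg" "$new"    # 2. add it to the volume group
    run pvmove "$old" "$new"     # 3. move every extent off the old PV
    run vgreduce "$vg" "$old"    # 4. drop the old PV from the VG
    run pvremove "$old"          #    and wipe its LVM label
}

# Dry run first, then for real (as root):
#   DRY_RUN=1 migrate_pv ubuntu16-vg /dev/sda5 /dev/sdb1
#   migrate_pv ubuntu16-vg /dev/sda5 /dev/sdb1
```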

The Ubuntu VM had one disk added. This was sda. Disk device sda has a few partitions (sda1, sda2 and sda5). sda5 was in LVM. I added a new disk device, sdb.

1. You can skip this step if you like, but I like to create a partition on the disk device even though it's not really necessary when the entire disk is used by LVM.
This step is about creating a new partition on the new device sdb.
root@ubuntu16:~# fdisk /dev/sdb

Welcome to fdisk (util-linux 2.27.1).
Changes will remain in memory only, until you decide to write them.
Be careful before using the write command.

Device does not contain a recognized partition table.
Created a new DOS disklabel with disk identifier 0xa2b893f9.

Command (m for help): n
Partition type
   p   primary (0 primary, 0 extended, 4 free)
   e   extended (container for logical partitions)
Select (default p): p
Partition number (1-4, default 1):
First sector (2048-20971519, default 2048):
Last sector, +sectors or +size{K,M,G,T,P} (2048-20971519, default 20971519):

Created a new partition 1 of type 'Linux' and of size 10 GiB.

Command (m for help): w
The partition table has been altered.
Calling ioctl() to re-read partition table.
Syncing disks.

2. List all disk devices to see all partitions
root@ubuntu16:~# fdisk -l
Disk /dev/sda: 10 GiB, 10737418240 bytes, 20971520 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0xb80d799b

Device     Boot   Start      End  Sectors  Size Id Type
/dev/sda1  *       2048   999423   997376  487M 83 Linux
/dev/sda2       1001470 20969471 19968002  9.5G  5 Extended
/dev/sda5       1001472 20969471 19968000  9.5G 8e Linux LVM

Disk /dev/sdb: 10 GiB, 10737418240 bytes, 20971520 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0xa2b893f9

Device     Boot Start      End  Sectors Size Id Type
/dev/sdb1        2048 20971519 20969472  10G 83 Linux


Disk /dev/mapper/ubuntu16--vg-root: 8.5 GiB, 9126805504 bytes, 17825792 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes


Disk /dev/mapper/ubuntu16--vg-swap_1: 1020 MiB, 1069547520 bytes, 2088960 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes

3. List VG
root@ubuntu16:~# vgs
  VG          #PV #LV #SN Attr   VSize VFree
  ubuntu16-vg   1   2   0 wz--n- 9.52g 24.00m

The number in the #PV column shows how many PVs this VG is using.

4. Create a PV on the new partition of the new disk
root@ubuntu16:~# pvcreate /dev/sdb1
  Physical volume "/dev/sdb1" successfully created

5. Extend the VG. By doing this you are adding that new partition to the VG
root@ubuntu16:~# vgextend ubuntu16-vg /dev/sdb1
  Volume group "ubuntu16-vg" successfully extended
 
6. List VG again
root@ubuntu16:~# vgs
  VG          #PV #LV #SN Attr   VSize  VFree
  ubuntu16-vg   2   2   0 wz--n- 19.52g 10.02g

Compare the #PV and VFree numbers with the numbers in step 3.

7. Copy data from sda5 to sdb1. This actually copies the data from one disk to the other without any interaction with the file system. While this is running, the OS and all applications keep working completely normally.
The magic command is pvmove

root@ubuntu16:~# pvmove /dev/sda5 /dev/sdb1
  /dev/sda5: Moved: 0.0%
  /dev/sda5: Moved: 2.6%
  /dev/sda5: Moved: 4.6%
  ...
  /dev/sda5: Moved: 89.5%
  /dev/sda5: Moved: 100.0%
 

This can take some time depending on how big your partitions are, how fast your disks are, and the I/O on the file system. Moving this 10GB partition from sda5 to sdb1 took about 20 minutes to complete.
8. List the VG
root@ubuntu16:~# vgs
  VG          #PV #LV #SN Attr   VSize  VFree
  ubuntu16-vg   2   2   0 wz--n- 19.52g 10.02g

This should be the same as in step 6.
9. Remove the old disk sda5 from the VG
root@ubuntu16:~# vgreduce ubuntu16-vg /dev/sda5
  Removed "/dev/sda5" from volume group "ubuntu16-vg"

10. List the PVs
root@ubuntu16:~# pvs
  PV         VG          Fmt  Attr PSize  PFree
  /dev/sda5              lvm2 ---   9.52g   9.52g
  /dev/sdb1  ubuntu16-vg lvm2 a--  10.00g 512.00m

As you can see here, only sdb1 is associated with ubuntu16-vg, and the size is not 19.52GB anymore but 10GB. sda5 is not part of any VG.

11. Remove sda5 from LVM
root@ubuntu16:~# pvremove /dev/sda5
  Labels on physical volume "/dev/sda5" successfully wiped
 
Partition sda5 is no longer part of LVM.

12. List PV
root@ubuntu16:~# pvs
  PV         VG          Fmt  Attr PSize  PFree
  /dev/sdb1  ubuntu16-vg lvm2 a--  10.00g 512.00m 

After this you can proceed with removing the disk or LUN from the OS. All this time, the OS and all applications were up and running!

 

Monday 29 January 2018

Fdisk and parted - creating partitions larger than 2TB - SOLVED

In case you don't know what parted is, I can only assume that you have never worked with disks larger than 2TB. Good for you.

If you do know what parted is, then I can also assume that you have come across problems with fdisk and disks larger than 2TB.

So what is parted?

The simple answer to this question: it is just another disk partitioning tool. What is so special about parted? With parted you can create partitions larger than 2TB. Why? Because parted can write a GPT partition table, while fdisk (at least the older versions) only writes an MBR table, which simply cannot describe a partition beyond 2TiB.

Can you use parted for partitions smaller than 2TB? Yes, you can! Will you use parted for partitions smaller than 2TB if you don't have any larger ones? I don't think so. Why? The answer is pretty simple: big disks are still expensive.

Fdisk limitations for large partitions 


Fdisk's limitation is that it can't create partitions larger than 2TB.
Example: you have a disk sdb that is 3TB. Do all the things you need to create a new partition on it.

fdisk /dev/sdb
(new primary partition, number 1, first block on disk, last block on disk)

When you are finished creating the partition, list info about it

fdisk -l /dev/sdb

You will see that it is only 2TB big. Even if you selected the last block on the disk, you can't get a partition bigger than 2TB.
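The limit is just 32-bit arithmetic: an MBR partition entry stores the sector count in a 32-bit field, and sectors are 512 bytes, so the ceiling works out to:

```shell
# 2^32 sectors * 512 bytes/sector = the famous 2TiB ceiling of MBR.
echo $(( 4294967296 * 512 ))   # 2199023255552 bytes = 2 TiB
```

GPT uses 64-bit values, which is why parted with a GPT label doesn't have this problem.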

 

Can I still make partitions larger than 2TB with fdisk?


Well, you can, but... you will need to use LVM and make some tricks.
How to do it? Pretty simple, really. Let's show this in an example.

If you have a 3TB disk on your server, let's say it is sdb, you first create one partition. Then you create a second partition on the same disk. So you will have sdb1 and sdb2 partitions that are 3TB in total.

Now, implement LVM on them.

Create PV on them:
pvcreate /dev/sdb1
pvcreate /dev/sdb2

Check PV
# pvs
  PV         VG     Fmt  Attr PSize   PFree
  /dev/sda2  rootvg lvm2 a-    40.00g     0
  /dev/sdb2  data_vg lvm2 a-   2.00t     0
  /dev/sdb1  data_vg lvm2 a-   1.00t  0

 

Create VG on them:
vgcreate data_vg /dev/sdb1
vgextend data_vg /dev/sdb2

You must use vgextend for the second partition because you can't have two VGs with the same name.

So when you execute the vgs command, you will see something like this

 # vgs
  VG      #PV #LV #SN Attr   VSize  VFree
  data_vg   2   1   0 wz--n-  3.00t     0
  rootvg    1   3   0 wz--n- 90.00g 28.00g

Now you can create an LV on VG data_vg that is 3TB big.

lvcreate -L3T -n data_lv data_vg

 # lvs
  LV      VG      Attr   LSize  Origin Snap%  Move Log Copy%  Convert
  data_lv data_vg -wi-ao  3.00t
  rootlv  rootvg  -wi-ao 20.00g
  varlv   rootvg  -wi-ao 10.00g
 
The rest is easy: create a file system on /dev/mapper/data_vg-data_lv and mount it somewhere.

 

Doing all this with parted


Same disk - sdb

# parted /dev/sdb
GNU Parted 2.3
Using /dev/sdb
Welcome to GNU Parted! Type 'help' to view a list of commands.
(parted) mklabel GPT
Warning: The existing disk label on /dev/sdb will be destroyed and all data on this disk will be lost. Do you want to continue?
Yes/No? Yes 
(parted) mkpart primary 65535s 100% 
(parted) q
Information: You may need to update /etc/fstab.
# parted /dev/sdb print
Model: Unknown (unknown)
Disk /dev/sdb: 3299GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt

Number  Start   End     Size    File system  Name     Flags
 1      33.6MB  3299GB  3298GB               primary 
 
And now you can create a file system on sdb1, or you can put LVM on it.