Tuesday 6 February 2018

How to remove LUNZ from LINUX - EXPLAINED and SOLVED

What are LUNZ in the first place?

When you attach storage to a server but haven't yet assigned any LUNs from the storage to that server, the server sees something called LUNZ. You can think of LUNZ as a dummy (ghost) LUN. How many LUNZ will you have? That depends on how many paths you have to your storage device. Each LUNZ represents one path. So if you have 4 paths to the storage, you will have 4 LUNZ; if you have 8 paths, you will have 8 LUNZ, etc.

This applies to EMC storage systems (CLARiiON, Symmetrix, VNX, Unity...).

Are LUNZ a good or a bad thing?

Seeing LUNZ is generally a good thing: it means that your server has a connection to the storage system. Where can you check whether you can see LUNZ?

#cat /proc/scsi/scsi
.
.
Host: scsi1 Channel: 00 Id: 05 Lun: 2
  Vendor: DGC      Model: LUNZ            Rev: 8301
  Type:   Direct-Access                    ANSI  SCSI revision: 06
Host: scsi1 Channel: 00 Id: 04 Lun: 2
  Vendor: DGC      Model: LUNZ            Rev: 8301
  Type:   Direct-Access                    ANSI  SCSI revision: 06
Host: scsi2 Channel: 00 Id: 05 Lun: 2
  Vendor: DGC      Model: LUNZ            Rev: 8301
  Type:   Direct-Access                    ANSI  SCSI revision: 06
Host: scsi2 Channel: 00 Id: 04 Lun: 2
  Vendor: DGC      Model: LUNZ            Rev: 8301
  Type:   Direct-Access                    ANSI  SCSI revision: 06


All disks are shown to the system as SCSI devices, and the same applies to LUNZ: each path to the storage device is represented as one LUNZ.
What can we see from the output above? We have 4 paths to the storage, and we are connected to a storage system with Rev number 8301.
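Since each LUNZ entry corresponds to one path, counting them tells you how many paths you have. A small sketch, assuming the legacy /proc/scsi/scsi interface is available:

```shell
# Count LUNZ entries; each one corresponds to one path to the storage array.
# (/proc/scsi/scsi may be absent on kernels built without this legacy interface.)
if [ -r /proc/scsi/scsi ]; then
    count=$(grep -c 'Model: LUNZ' /proc/scsi/scsi || true)  # grep -c still prints 0
else
    count=0
fi
echo "LUNZ entries (paths to storage): $count"
```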

Why is this Rev number important? When you are troubleshooting, the Rev number identifies the storage system, so if you are not sure which storage system your server is attached to, log in to a serverA that already has LUNs attached, execute cat /proc/scsi/scsi and check the Rev number. If the Rev numbers on server and serverA are the same, we can conclude that both are attached to the same storage system.


Should you keep LUNZ on your system?

The only answer to this question is NO. Why? LUNZ will do no harm, but once you assign LUNs to your server and rescan your SCSI hosts, the proper SCSI devices will appear and things can get a little too crowded. Even though every LUN has a unique SCSI ID, multiplied by the number of paths to the storage (with 4 paths to the storage system, one LUN is represented as 4 SCSI devices; with 8 paths, as 8; and so on), it is always wise to keep your system clean of junk. LUNZ are junk.

How to remove LUNZ?

First, list your SCSI devices.
#cat /proc/scsi/scsi
.
.
Host: scsi1 Channel: 00 Id: 05 Lun: 2
  Vendor: DGC      Model: LUNZ            Rev: 8301
  Type:   Direct-Access                    ANSI  SCSI revision: 06
Host: scsi1 Channel: 00 Id: 04 Lun: 2
  Vendor: DGC      Model: LUNZ            Rev: 8301
  Type:   Direct-Access                    ANSI  SCSI revision: 06
Host: scsi2 Channel: 00 Id: 05 Lun: 2
  Vendor: DGC      Model: LUNZ            Rev: 8301
  Type:   Direct-Access                    ANSI  SCSI revision: 06
Host: scsi2 Channel: 00 Id: 04 Lun: 2
  Vendor: DGC      Model: LUNZ            Rev: 8301
  Type:   Direct-Access                    ANSI  SCSI revision: 06


From this output we can see where the LUNZ are located as SCSI devices.
All the info you need is here:
Host: scsi1 Channel: 00 Id: 05 Lun: 2
 Vendor: DGC      Model: LUNZ            Rev: 8301
 Type:   Direct-Access                    ANSI  SCSI revision: 06


#cd /sys/class/scsi_device

This is the location of all SCSI devices, including LUNZ.

#cd 1\:0\:5\:2/

is the location of our LUNZ. How do we know this? From this line:
Host: scsi1 Channel: 00 Id: 05 Lun: 2

The numbers host 1, channel 00, Id 05, Lun 2 give the SCSI address of the device: 1:0:5:2.

#cd device
#:/sys/class/scsi_device/1:0:5:2/device # ls
block           evt_media_change  modalias              rescan        state
bsg             generic           model                 rev           subsystem
delete          iocounterbits     power                 scsi_device   timeout
device_blocked  iodone_cnt        queue_depth           scsi_disk     type
dh_state        ioerr_cnt         queue_ramp_up_period  scsi_generic  uevent
driver          iorequest_cnt     queue_type            scsi_level    vendor


Let's delete the SCSI device:
#echo 1 > delete

And the LUNZ is deleted. Now check the SCSI device list again with cat /proc/scsi/scsi; you should see one LUNZ fewer than before.

Do the same thing for the rest of the LUNZ.
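Rather than repeating the cd/echo dance per device, the removal can be scripted as a loop over sysfs. A sketch, assuming the sysfs layout shown above and root privileges; double-check the model match before trusting it on a production box:

```shell
# Walk all SCSI devices in sysfs and delete the ones whose model reads LUNZ.
# Requires root; on a system with no LUNZ it reports 0 and changes nothing.
removed=0
for dev in /sys/class/scsi_device/*/device; do
    if grep -q 'LUNZ' "$dev/model" 2>/dev/null; then
        echo "Removing LUNZ at $dev"
        echo 1 > "$dev/delete"          # same as the manual 'echo 1 > delete' step
        removed=$((removed + 1))
    fi
done
echo "LUNZ devices removed: $removed"
```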

P.S. Always be careful when removing SCSI devices. One mistake can be fatal for your server.

Until you assign a LUN to your server, the LUNZ will reappear every time you rescan your SCSI hosts.

WARNING: Re-reading the partition table failed with error 22: Invalid argument. - SOLVED and EXPLAINED

In case you get this WARNING while creating a partition on a multipath device

 WARNING: Re-reading the partition table failed with error 22: Invalid argument. 

don't panic.

 
# fdisk /dev/mapper/mpatha
Partition number (1-4): 1
   First cylinder (1-19074, default 1):
   Using default value 1
   Last cylinder or +size or +sizeM or +sizeK (1-19074, default 19074):
   Using default value 19074

   Command (m for help): p

   Disk /dev/mapper/mpatha: 156.8 GB, 156892397568 bytes
   255 heads, 63 sectors/track, 19074 cylinders
   Units = cylinders of 16065 * 512 = 8225280 bytes

                            Device Boot      Start         End      Blocks   Id  System
   /dev/mapper/mpathap1               1       19074   153211873+  83  Linux

   Command (m for help): w
   The partition table has been altered!

   Calling ioctl() to re-read partition table.

   WARNING: Re-reading the partition table failed with error 22: Invalid argument.  <------
   The kernel still uses the old table.
   The new table will be used at the next reboot.
   Syncing disks.

Depending on your Linux distro, there may also be a suggestion to reboot the server or to use parted or kpartx so that the new partition table is used.

In case you get this error, first check whether kpartx sees the newly created partition:

kpartx -l /dev/mapper/mpatha
mpatha1 : 0 33552384 /dev/mapper/mpatha 2048

This means that kpartx can see the partition on the multipath device mpatha. In other words, the new partition table has already been applied.

P.S. Don't mind that the numbers in the last command don't match the fdisk output above; it was taken from a different session.
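If kpartx does not list the new partition, you can ask it to (re)create the device-mapper mappings with kpartx -a. A guarded sketch, assuming the multipath device is named mpatha as above:

```shell
# Re-create partition mappings only if the multipath device actually exists;
# on a machine without multipath this just reports that there is nothing to do.
if [ -b /dev/mapper/mpatha ]; then
    kpartx -a /dev/mapper/mpatha        # add device-mapper maps for its partitions
    result="mappings refreshed"
else
    result="skipped (no /dev/mapper/mpatha present)"
fi
echo "kpartx step: $result"
```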

Thursday 1 February 2018

Migrating data from LUN to LUN by using LVM or How to transfer data to new storage online by using LVM - SOLVED and EXPLAINED

So it's time to migrate your servers to a new storage system, or you just want to replace local hard drives. You can do this by calling your users and telling them that they have to stop the application, the DB or something else. Usually, users are spoiled brats and they will complain about downtime, etc. If you are a system admin, I bet you have had these kinds of conversations. But, an admin gotta do what an admin gotta do...

Also, even though we are talking about LUNs, because that is what remote disks on storage systems are called, the story is the same if you add a new local hard drive. In the example below, I tested this with Ubuntu 16 in VirtualBox. The logic is completely the same; the only difference is the naming of the disk devices: LUNs are named emcpowerX if you are using PowerPath as your multipathing software, while local drives are named sdX.

"Offline" as call users to stop applications method to replace LUN's


For this purpose, let's assume that we have a Linux OS without LVM. Also, we won't be dealing here with connecting to the new storage, adding new LUNs, and so on. We will assume that all that is already done.

This goes something like this:
1. OS must see the new LUN
2. stop the application
3. copy data from old_LUN to new_LUN
4. replace the mount points so that the mount point points to new_LUN
5. start the application, check it
6. remove old_LUN
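Scripted, the offline method might look like the sketch below. Everything here is an assumption for illustration: the myapp unit, the device and mount-point names, and the CONFIRM_OFFLINE_COPY guard variable are all made up, and the script refuses to do anything until that guard is set:

```shell
# Offline migration sketch. The myapp unit, device and mount points are made up.
NEW=${NEW:-/dev/sdc1}; MNT=${MNT:-/data}
ran=no
if [ "${CONFIRM_OFFLINE_COPY:-no}" != yes ]; then
    echo "refusing to run: set CONFIRM_OFFLINE_COPY=yes (and NEW/MNT) first"
else
    systemctl stop myapp              # 2. stop the application
    mkfs.ext4 "$NEW"                  # put a filesystem on the new LUN
    mkdir -p /mnt/new
    mount "$NEW" /mnt/new
    rsync -aHAX "$MNT"/ /mnt/new/     # 3. copy the data, preserving ACLs and xattrs
    umount /mnt/new "$MNT"
    mount "$NEW" "$MNT"               # 4. repoint the mount (update /etc/fstab too)
    systemctl start myapp             # 5. start the application and check it
    ran=yes
fi
```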

This can take time. The bigger the disks, the longer users have to wait, the longer the application downtime, the unhappier your boss, etc.

"Online" as users don't know that you moved applications to new storage system because you are using LVM

On my second day at work, 10 years ago, my mentor Tatjana told me: "Always use LVM because it will be easier later to do things". So I listened to her. This post is not about what LVM is, the logic behind LVM, what a PV is, what a VG is and what an LV is. I will assume that you already know all that. In case you don't (I have to write this in capital bold letters):

LEARN WHAT LVM IS AND START TO USE IT

Why? Read what Tatjana told me: "it will be easier later to do things".

And it did make things a lot easier!
How to do this? When you understand the logic of LVM, it's quite logical to do data migration from disk to disk this way. Of course, there are some restrictions. Actually, there is only one restriction:

THE NEW DISK MUST (should) NOT BE SMALLER THAN THE OLD DISK


Why? Because we are not touching the file system layer or the logical volume layer. We are only touching the physical volume and volume group layers.


As I said before, for my testing purposes I used Ubuntu 16 running on VirtualBox on my PC. As I also said, there is no difference in the procedure whether you are migrating data from LUN to LUN or from local disk to local disk. The only difference is the naming of the disk devices.

How does this work? The basic logic is this:

1. create a PV on the new disk
2. extend the VG with that new disk
3. pvmove the old disk to the new disk
4. reduce the VG

So let's begin!
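As a sketch, the whole migration reduces to a handful of LVM commands. The VG and device names below match the walkthrough that follows, but treat them as assumptions for your system; the script only echoes the commands unless CONFIRM_MIGRATE=yes is set (a made-up guard variable, not an LVM feature):

```shell
# LVM migration sketch. VG/OLD/NEW are assumed names; adjust them to your system.
VG=${VG:-ubuntu16-vg}; OLD=${OLD:-/dev/sda5}; NEW=${NEW:-/dev/sdb1}
run() {
    echo "+ $*"                                    # show the command
    if [ "${CONFIRM_MIGRATE:-no}" = yes ]; then    # execute only when confirmed
        "$@"
    fi
}
run pvcreate "$NEW"               # 1. make the new partition an LVM PV
run vgextend "$VG" "$NEW"         # 2. add it to the volume group
run pvmove "$OLD" "$NEW"          # 3. move the extents to the new PV, online
run vgreduce "$VG" "$OLD"         # 4. drop the old PV from the VG
run pvremove "$OLD"               #    and wipe its LVM label
```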

The Ubuntu VM had one disk, sda, with a few partitions (sda1, sda2 and sda5). sda5 was in LVM. I added a new disk device, sdb.

1. You can skip this step if you like, but I like to create a partition on the disk device even though it's not really necessary if the entire disk is used by LVM.
 This step creates a new partition on the new device sdb.
root@ubuntu16:~# fdisk /dev/sdb

Welcome to fdisk (util-linux 2.27.1).
Changes will remain in memory only, until you decide to write them.
Be careful before using the write command.

Device does not contain a recognized partition table.
Created a new DOS disklabel with disk identifier 0xa2b893f9.

Command (m for help): n
Partition type
   p   primary (0 primary, 0 extended, 4 free)
   e   extended (container for logical partitions)
Select (default p): p
Partition number (1-4, default 1):
First sector (2048-20971519, default 2048):
Last sector, +sectors or +size{K,M,G,T,P} (2048-20971519, default 20971519):

Created a new partition 1 of type 'Linux' and of size 10 GiB.

Command (m for help): w
The partition table has been altered.
Calling ioctl() to re-read partition table.
Syncing disks.

2. List all disk devices to see all partitions
root@ubuntu16:~# fdisk -l
Disk /dev/sda: 10 GiB, 10737418240 bytes, 20971520 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0xb80d799b

Device     Boot   Start      End  Sectors  Size Id Type
/dev/sda1  *       2048   999423   997376  487M 83 Linux
/dev/sda2       1001470 20969471 19968002  9.5G  5 Extended
/dev/sda5       1001472 20969471 19968000  9.5G 8e Linux LVM

Disk /dev/sdb: 10 GiB, 10737418240 bytes, 20971520 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0xa2b893f9

Device     Boot Start      End  Sectors Size Id Type
/dev/sdb1        2048 20971519 20969472  10G 83 Linux


Disk /dev/mapper/ubuntu16--vg-root: 8.5 GiB, 9126805504 bytes, 17825792 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes


Disk /dev/mapper/ubuntu16--vg-swap_1: 1020 MiB, 1069547520 bytes, 2088960 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes

3. List VG
root@ubuntu16:~# vgs
  VG          #PV #LV #SN Attr   VSize VFree
  ubuntu16-vg   1   2   0 wz--n- 9.52g 24.00m

The number in the #PV column shows how many PVs this VG is using.

4. Create a PV on the new partition of the new disk
root@ubuntu16:~# pvcreate /dev/sdb1
  Physical volume "/dev/sdb1" successfully created

5. Extend the VG. By doing this you are adding that new partition to the VG
root@ubuntu16:~# vgextend ubuntu16-vg /dev/sdb1
  Volume group "ubuntu16-vg" successfully extended
 
6. List VG again
root@ubuntu16:~# vgs
  VG          #PV #LV #SN Attr   VSize  VFree
  ubuntu16-vg   2   2   0 wz--n- 19.52g 10.02g

Compare the #PV, VSize and VFree values with those in step 3.

7. Copy the data from sda5 to sdb1. This actually copies the data from one disk to another without any interaction with the file system. While this runs, the OS and all applications keep working completely normally.
The magic command is pvmove:

root@ubuntu16:~# pvmove /dev/sda5 /dev/sdb1
  /dev/sda5: Moved: 0.0%
  /dev/sda5: Moved: 2.6%
  /dev/sda5: Moved: 4.6%
  /dev/sda5: Moved: 6.6%
  /dev/sda5: Moved: 8.6%
  /dev/sda5: Moved: 10.7%
  /dev/sda5: Moved: 12.7%
  /dev/sda5: Moved: 14.7%
  /dev/sda5: Moved: 16.8%
  /dev/sda5: Moved: 19.0%
  /dev/sda5: Moved: 21.0%
  /dev/sda5: Moved: 23.8%
  /dev/sda5: Moved: 25.7%
  /dev/sda5: Moved: 27.3%
  /dev/sda5: Moved: 29.1%
  /dev/sda5: Moved: 30.4%
  /dev/sda5: Moved: 31.8%
  /dev/sda5: Moved: 33.3%
  /dev/sda5: Moved: 34.8%
  /dev/sda5: Moved: 36.4%
  /dev/sda5: Moved: 37.6%
  /dev/sda5: Moved: 38.3%
  /dev/sda5: Moved: 39.0%
  /dev/sda5: Moved: 39.9%
  /dev/sda5: Moved: 41.3%
  /dev/sda5: Moved: 44.0%
  /dev/sda5: Moved: 45.8%
  /dev/sda5: Moved: 48.0%
  /dev/sda5: Moved: 50.1%
  /dev/sda5: Moved: 52.3%
  /dev/sda5: Moved: 54.5%
  /dev/sda5: Moved: 56.5%
  /dev/sda5: Moved: 58.7%
  /dev/sda5: Moved: 60.9%
  /dev/sda5: Moved: 63.0%
  /dev/sda5: Moved: 65.5%
  /dev/sda5: Moved: 67.5%
  /dev/sda5: Moved: 69.6%
  /dev/sda5: Moved: 71.3%
  /dev/sda5: Moved: 73.3%
  /dev/sda5: Moved: 75.2%
  /dev/sda5: Moved: 77.3%
  /dev/sda5: Moved: 79.5%
  /dev/sda5: Moved: 81.4%
  /dev/sda5: Moved: 83.5%
  /dev/sda5: Moved: 86.8%
  /dev/sda5: Moved: 88.9%
  /dev/sda5: Moved: 89.5%
  /dev/sda5: Moved: 100.0%
 

This can take some time, depending on how big your partitions are, how fast your disks are, and the I/O on the file system. Moving this 10 GB partition from sda5 to sdb1 took about 20 minutes to complete.
8. List VG
root@ubuntu16:~# vgs
  VG          #PV #LV #SN Attr   VSize  VFree
  ubuntu16-vg   2   2   0 wz--n- 19.52g 10.02g

This should be the same as in step 6.
9. Remove the old partition sda5 from the VG
root@ubuntu16:~# vgreduce ubuntu16-vg /dev/sda5
  Removed "/dev/sda5" from volume group "ubuntu16-vg"

10. List PV and VG
root@ubuntu16:~# pvs
  PV         VG          Fmt  Attr PSize  PFree
  /dev/sda5              lvm2 ---   9.52g   9.52g
  /dev/sdb1  ubuntu16-vg lvm2 a--  10.00g 512.00m

As you can see, only sdb1 is now associated with ubuntu16-vg, and the size is no longer 19.52 GB but 10 GB. sda5 is not part of any VG.

11. Remove sda5 from LVM
root@ubuntu16:~# pvremove /dev/sda5
  Labels on physical volume "/dev/sda5" successfully wiped
 
Partition sda5 is no longer part of LVM.

12. List PV
root@ubuntu16:~# pvs
  PV         VG          Fmt  Attr PSize  PFree
  /dev/sdb1  ubuntu16-vg lvm2 a--  10.00g 512.00m 

After this, you can proceed with removing the disk or LUN from the OS. All this time, the OS and all applications were up and running!

Here is a video of how this is done!