Live migration steps
====================

See http://libvirt.org/migration.html for migration details using libvirt.

This page walks through a live migration between two hypervisors, hv1 (source)
and hv2 (destination).
We move a VM named "TestVM" from hv1 to hv2.
The VM's disk is an LVM logical volume, "/dev/vg/TestVM".

Each step comes with the shell commands needed to reproduce it.

In the transcripts, it-test-16.lab.fr.lan is the source hypervisor (where the VM
runs) and it-test-17.lab.fr.lan is the destination hypervisor (where the VM is
migrated to). The VM used there is named "TestVMDisk-small-16" and its disk is
"/dev/vg/TestVMDisk-small-16".

[...] in command output means that some lines were removed.

TODO schematics

1 - Pause VM
------------

We must pause the VM while we manipulate the device mapper, because I/O errors
would occur otherwise::

    virsh suspend TestVM
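
When these steps are scripted with the libvirt Python bindings used later on
this page, the pause can be wrapped so that the VM is resumed again even if a
step fails. The following is a minimal sketch, not part of the original
procedure: ``paused`` is a hypothetical helper, and ``domain`` is assumed to be
a ``libvirt.virDomain`` object, of which only ``suspend()`` and ``resume()``
are used::

    from contextlib import contextmanager


    @contextmanager
    def paused(domain):
        """Suspend `domain` for the duration of the with-block, then resume it."""
        domain.suspend()
        try:
            yield domain
        finally:
            # Resume even if the device-mapper manipulation raised.
            domain.resume()

The DRBD setup of the next step would then go inside a
``with paused(domain):`` block.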

2 - Set up DRBD
---------------

Reload the DRBD kernel module with the proper parameters on both hypervisors::

    it-test-16.lab.fr.lan ~ 2 # modprobe drbd minor_count=100 usermode_helper=/bin/true
    it-test-16.lab.fr.lan ~ 0 #

    it-test-17.lab.fr.lan ~ 2 # modprobe drbd minor_count=100 usermode_helper=/bin/true
    it-test-17.lab.fr.lan ~ 0 # 

Create the DRBD device metadata. It is kept external, on a dedicated LV.
See http://www.drbd.org/users-guide-emb/ch-internals.html#s-external-meta-data
for more information.

First, compute the metadata size we need from the capacity of the VM disk::

    it-test-16.lab.fr.lan ~ 130 # ipython
    Python 2.7.3rc2 (default, Apr 22 2012, 22:30:17) 
    Type "copyright", "credits" or "license" for more information.

    IPython 0.12 -- An enhanced Interactive Python.
    ?         -> Introduction and overview of IPython's features.
    %quickref -> Quick reference.
    help      -> Python's own help system.
    object?   -> Details about 'object', use 'object??' for extra details.

    In [1]: import libvirt

    In [2]: c = libvirt.open('qemu:///system')

    In [4]: c.listDomainsID()
    Out[4]: [6, 7]

    In [7]: d = c.lookupByID(7)

    In [8]: d.name()
    Out[8]: 'TestVMDisk-small-16'

    In [9]: d.XMLDesc(0)
    Out[9]: "<domain type='kvm' id='7'>\n
    [...]
    <disk type='block' device='disk'>\n      <driver name='qemu' type='raw'/>\n      <source dev='/dev/vg/TestVMDisk-small-16'/>\n      <target dev='hda' bus='ide'/>\n      <alias name='ide0-0-0'/>\n      <address type='drive' controller='0' bus='0' target='0' unit='0'/>\n    </disk>\n    <controller type='usb' index='0'>\n      <alias name='usb0'/>\n      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/>\n    </controller>\n    <controller type='ide' index='0'>\n      <alias name='ide0'/>\n      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/>\n    </controller>\n
    [...]
    "

    In [10]: s = c.storagePoolLookupByName('vg')

    In [13]: s.listVolumes()
    Out[13]: ['TestVMDisk-16', 'TestVMDisk-small-16', 'TestVMDisk-17']

    In [15]: v = c.storageVolLookupByPath('/dev/vg/TestVMDisk-small-16')

    In [16]: v.name()
    Out[16]: 'TestVMDisk-small-16'

    In [18]: _, capacity, _ = v.info()

    In [19]: capacity
    Out[19]: 209715200L

    In [20]: drbd_meta_size = max(capacity / 32768 + 4 * 2 ** 20, 128 * 2 ** 20)

    In [21]: drbd_meta_size
    Out[21]: 134217728
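
The expression typed at ``In [20]`` can be kept as a small function. This is a
sketch that only restates the sizing rule used in the session above (it is this
page's rule, not necessarily the exact formula from the DRBD guide);
``drbd_meta_size`` and ``MIB`` are names introduced here. It uses ``//`` so
that the result stays an integer on Python 3 as well::

    MIB = 2 ** 20


    def drbd_meta_size(capacity):
        """Return the metadata LV size in bytes for a disk of `capacity` bytes."""
        # Same expression as in the session above, with explicit integer division.
        return max(capacity // 32768 + 4 * MIB, 128 * MIB)


    print(drbd_meta_size(209715200))  # 134217728, i.e. 128 MiB

For a disk this small the 128 MiB floor always wins; the first term only
matters for very large disks.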

Create the LV for the VM on the destination hypervisor::

    it-test-17.lab.fr.lan ~ 0 # lvcreate -n TestVMDisk-small-16 -L 200m vg
      Logical volume "TestVMDisk-small-16" created
    it-test-17.lab.fr.lan ~ 0 # lvs
      LV                  VG   Attr   LSize   Origin Snap%  Move Log Copy%  Convert
      TestVMDisk-small-16 vg   -wi-a- 200.00m   


We create the metadata LV with the appropriate size on both hypervisors::

    it-test-16.lab.fr.lan ~ 5 # lvcreate -v -n TestVMDisk-small-16.drbdmeta -L 134217728b vg
        Setting logging type to disk
        Finding volume group "vg"
        Archiving volume group "vg" metadata (seqno 4).
        Creating logical volume TestVMDisk-small-16.drbdmeta
        Creating volume group backup "/etc/lvm/backup/vg" (seqno 5).
        Found volume group "vg"
        activation/volume_list configuration setting not defined, checking only host tags for vg/TestVMDisk-small-16.drbdmeta
        Creating vg-TestVMDisk--small--16.drbdmeta
        Loading vg-TestVMDisk--small--16.drbdmeta table (253:3)
        Resuming vg-TestVMDisk--small--16.drbdmeta (253:3)
        Clearing start of logical volume "TestVMDisk-small-16.drbdmeta"
        Creating volume group backup "/etc/lvm/backup/vg" (seqno 5).
      Logical volume "TestVMDisk-small-16.drbdmeta" created
    it-test-16.lab.fr.lan ~ 0 # lvs
      LV                           VG   Attr   LSize   Origin Snap%  Move Log Copy%  Convert
      [...]
      TestVMDisk-small-16          vg   -wi-ao 200.00m                                      
      TestVMDisk-small-16.drbdmeta vg   -wi-a- 128.00m 

    it-test-17.lab.fr.lan ~ 3 # lvcreate -v -n TestVMDisk-small-16.drbdmeta -L 134217728b vg
        Setting logging type to disk
        Finding volume group "vg"
        Archiving volume group "vg" metadata (seqno 2).
        Creating logical volume TestVMDisk-small-16.drbdmeta
        Creating volume group backup "/etc/lvm/backup/vg" (seqno 3).
        Found volume group "vg"
        activation/volume_list configuration setting not defined, checking only host tags for vg/TestVMDisk-small-16.drbdmeta
        Creating vg-TestVMDisk--small--16.drbdmeta
        Loading vg-TestVMDisk--small--16.drbdmeta table (253:1)
        Resuming vg-TestVMDisk--small--16.drbdmeta (253:1)
        Clearing start of logical volume "TestVMDisk-small-16.drbdmeta"
        Creating volume group backup "/etc/lvm/backup/vg" (seqno 3).
      Logical volume "TestVMDisk-small-16.drbdmeta" created

Create a device-mapper device with a different name (.copy) that has the same
table as the LV::

    it-test-16.lab.fr.lan ~ 0 # dmsetup table --showkeys /dev/vg/TestVMDisk-small-16
    0 409600 linear 9:126 8300544
    it-test-16.lab.fr.lan ~ 0 #  dmsetup table --showkeys /dev/vg/TestVMDisk-small-16 | dmsetup create vg-TestVMDisk--small--16.copy 

    it-test-17.lab.fr.lan ~ 0 # dmsetup table --showkeys /dev/vg/TestVMDisk-small-16
    0 409600 linear 9:126 2048
    it-test-17.lab.fr.lan ~ 0 # dmsetup table --showkeys /dev/vg/TestVMDisk-small-16  | dmsetup create vg-TestVMDisk--small--16.copy
    it-test-17.lab.fr.lan ~ 0 # dmsetup table --showkeys /dev/vg/TestVMDisk-small-16
    0 409600 linear 9:126 2048
    it-test-17.lab.fr.lan ~ 0 # dmsetup table --showkeys 
    vg-TestVMDisk--small--16: 0 409600 linear 9:126 2048
    vg-TestVMDisk--small--16.copy: 0 409600 linear 9:126 2048
    vg-TestVMDisk--small--16.drbdmeta: 0 262144 linear 9:126 411648
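
The device-mapper names above follow LVM's naming scheme: the VG and LV names
are joined with a single hyphen, and hyphens inside either name are doubled.
A short sketch of that rule, useful when the ``.copy`` name has to be built in
a script; ``dm_name`` is a hypothetical helper, not an LVM or libvirt
function::

    def dm_name(vg, lv):
        """Return the device-mapper name LVM uses for logical volume vg/lv."""
        # A hyphen separates VG and LV, so hyphens inside the names are doubled.
        return "%s-%s" % (vg.replace("-", "--"), lv.replace("-", "--"))


    print(dm_name("vg", "TestVMDisk-small-16") + ".copy")
    # vg-TestVMDisk--small--16.copy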

Wipe and initialize the DRBD metadata::

    it-test-16.lab.fr.lan ~ 20 # drbdmeta --force /dev/drbd0 v08 /dev/vg/TestVMDisk-small-16.drbdmeta 0 wipe-md
    There appears to be no drbd meta data to wipe out?
    it-test-16.lab.fr.lan ~ 0 # drbdmeta --force /dev/drbd0 v08 /dev/vg/TestVMDisk-small-16.drbdmeta 0 create-md
    md_offset 0
    al_offset 4096
    bm_offset 36864

    Found some data

     ==> This might destroy existing data! <==

    Do you want to proceed?
    *** confirmation forced via --force option ***
    Writing meta data...
    initializing activity log
    NOT initialized bitmap
    New drbd meta data block successfully created.

    it-test-17.lab.fr.lan ~ 0 # drbdmeta --force /dev/drbd0 v08 /dev/vg/TestVMDisk-small-16.drbdmeta 0 wipe-md
    There appears to be no drbd meta data to wipe out?
    it-test-17.lab.fr.lan ~ 0 # drbdmeta --force /dev/drbd0 v08 /dev/vg/TestVMDisk-small-16.drbdmeta 0 create-md
    md_offset 0
    al_offset 4096
    bm_offset 36864

    Found some data

     ==> This might destroy existing data! <==

    Do you want to proceed?
    *** confirmation forced via --force option ***
    Writing meta data...
    initializing activity log
    NOT initialized bitmap
    New drbd meta data block successfully created.

Create the DRBD device from the LV copy (.copy) and the metadata LV::

    it-test-16.lab.fr.lan ~ 0 # drbdsetup /dev/drbd0 disk /dev/mapper/vg-TestVMDisk--small--16.copy /dev/vg/TestVMDisk-small-16.drbdmeta 0 --create-device
    it-test-16.lab.fr.lan ~ 0 # ls /dev/drb*
    /dev/drbd0

    it-test-17.lab.fr.lan ~ 0 # drbdsetup /dev/drbd0 disk /dev/mapper/vg-TestVMDisk--small--16.copy /dev/vg/TestVMDisk-small-16.drbdmeta 0 --create-device
    it-test-17.lab.fr.lan ~ 2 # ls /dev/drb*
    /dev/drbd0

Connect the two DRBD devices to each other::

    it-test-16.lab.fr.lan ~ 0 # drbdsetup /dev/drbd0 net 0.0.0.0:7788 192.168.32.209:7788 C -m -S 10000000 &
    [1] 9439
    [1]+  Done                    drbdsetup /dev/drbd0 net 0.0.0.0:7788 192.168.32.209:7788 C -m -S 10000000
    it-test-16.lab.fr.lan ~ 0 # netstat -t
    Active Internet connections (w/o servers)
    Proto Recv-Q Send-Q Local Address           Foreign Address         State      
    [...]
    tcp        0      0 it-test-16.lab.fr.:7788 it-test-17.lab.fr:53443 ESTABLISHED
    tcp        0      0 it-test-16.lab.fr:54046 it-test-17.lab.fr.:7788 ESTABLISHED

    it-test-17.lab.fr.lan ~ 0 # drbdsetup /dev/drbd0 net 0.0.0.0:7788 192.168.32.208:7788 C -m -S 10000000 &
    [1] 8113
    [...]
    [1]+  Done                    drbdsetup /dev/drbd0 net 0.0.0.0:7788 192.168.32.208:7788 C -m -S 10000000
    it-test-17.lab.fr.lan ~ 0 # netstat -t
    Active Internet connections (w/o servers)
    Proto Recv-Q Send-Q Local Address           Foreign Address         State      
    [...]
    tcp        0      0 it-test-17.lab.fr.:7788 it-test-16.lab.fr:54046 ESTABLISHED
    tcp        0      0 it-test-17.lab.fr:53443 it-test-16.lab.fr.:7788 ESTABLISHED
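
The two commands differ only in the peer address: each side listens on port
7788 and dials the other hypervisor. A sketch that derives both command lines
from one address map follows. ``net_commands`` is a hypothetical helper, the
options are copied verbatim from the transcript, and the host-to-IP mapping is
inferred from it (it-test-16 dials 192.168.32.209, so that address is assumed
to belong to it-test-17)::

    def net_commands(addresses, device="/dev/drbd0", port=7788):
        """Return {host: argv} where each host connects to the other one's IP."""
        (host_a, ip_a), (host_b, ip_b) = sorted(addresses.items())

        def argv(peer_ip):
            # Positional arguments and options copied from the transcript above.
            return ["drbdsetup", device, "net", "0.0.0.0:%d" % port,
                    "%s:%d" % (peer_ip, port), "C", "-m", "-S", "10000000"]

        return {host_a: argv(ip_b), host_b: argv(ip_a)}


    # Addresses inferred from the transcript: each host dials the other's IP.
    commands = net_commands({
        "it-test-16.lab.fr.lan": "192.168.32.208",
        "it-test-17.lab.fr.lan": "192.168.32.209",
    })
    for host in sorted(commands):
        print(host, " ".join(commands[host]))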


Set synchronization rate::

    it-test-16.lab.fr.lan ~ 0 # drbdsetup /dev/drbd0 syncer -r 50000

    it-test-17.lab.fr.lan ~ 0 # drbdsetup /dev/drbd0 syncer -r 50000

Make sure DRBD is connected to its peer::

    it-test-16.lab.fr.lan ~ 0 # drbdsetup /dev/drbd0 wait-connect -t 60 -d 60 -o 60
    it-test-16.lab.fr.lan ~ 0 # 

    it-test-17.lab.fr.lan ~ 0 # drbdsetup /dev/drbd0 wait-connect -t 60 -d 60 -o 60
    it-test-17.lab.fr.lan ~ 0 # 

Set roles for DRBD (source is primary and destination is secondary)::

    it-test-16.lab.fr.lan ~ 0 # drbdsetup /dev/drbd0 primary -o
    it-test-16.lab.fr.lan ~ 0 #

    it-test-17.lab.fr.lan ~ 0 # drbdsetup /dev/drbd0 secondary
    it-test-17.lab.fr.lan ~ 0 #

Wait for synchronisation to complete::

    it-test-16.lab.fr.lan ~ 0 # drbdsetup /dev/drbd0 wait-sync
    it-test-16.lab.fr.lan ~ 0 # drbdsetup /dev/drbd0 status
    <resource minor="0" cs="Connected" ro1="Primary" ro2="Secondary" ds1="UpToDate" ds2="UpToDate" />

    it-test-17.lab.fr.lan ~ 0 # drbdsetup /dev/drbd0 wait-sync
    it-test-17.lab.fr.lan ~ 0 # drbdsetup /dev/drbd0 status
    <resource minor="0" cs="Connected" ro1="Secondary" ro2="Primary" ds1="UpToDate" ds2="UpToDate" />
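
When this step is scripted, the one-line XML printed by ``drbdsetup status``
can be checked instead of read by eye. A minimal sketch using only the
attributes visible in the output above; ``is_synced`` is a hypothetical
helper, and treating "Connected" plus "UpToDate" on both sides as the success
condition is an assumption::

    import xml.etree.ElementTree as ET


    def is_synced(status_xml):
        """Return True if the status line reports a connected, up-to-date pair."""
        resource = ET.fromstring(status_xml)
        return (resource.get("cs") == "Connected"
                and resource.get("ds1") == "UpToDate"
                and resource.get("ds2") == "UpToDate")


    status = ('<resource minor="0" cs="Connected" ro1="Primary" '
              'ro2="Secondary" ds1="UpToDate" ds2="UpToDate" />')
    print(is_synced(status))  # True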

Set DRBD in primary-primary mode by promoting the destination as well (see
http://www.drbd.org/users-guide/ch-admin.html#s-roles and
http://www.drbd.org/users-guide/s-enable-dual-primary.html
for more information)::

    it-test-17.lab.fr.lan ~ 1 # drbdsetup /dev/drbd0 primary -o
    it-test-17.lab.fr.lan ~ 0 #

Point the original device-mapper device at the DRBD device (i.e. takeover)::

    it-test-16.lab.fr.lan ~ 0 # python -c 'print "0 %d linear /dev/drbd0 0" % (134217728 / 512)' | dmsetup load /dev/vg/TestVMDisk-small-16
    it-test-16.lab.fr.lan ~ 0 # dmsetup table
    vg-TestVMDisk--17: 0 8192000 linear 9:126 8710144
    vg-TestVMDisk--16: 0 8298496 linear 9:126 2048
    vg-TestVMDisk--small--16: 0 409600 linear 9:126 8300544
    vg-TestVMDisk--small--16.copy: 0 409600 linear 9:126 8300544
    vg-TestVMDisk--small--16.drbdmeta: 0 262144 linear 9:126 16902144
    it-test-16.lab.fr.lan ~ 0 # dmsetup suspend /dev/vg/TestVMDisk-small-16
    it-test-16.lab.fr.lan ~ 0 # dmsetup resume /dev/vg/TestVMDisk-small-16
    it-test-16.lab.fr.lan ~ 0 # dmsetup table
    vg-TestVMDisk--17: 0 8192000 linear 9:126 8710144
    vg-TestVMDisk--16: 0 8298496 linear 9:126 2048
    vg-TestVMDisk--small--16: 0 262144 linear 147:0 0
    vg-TestVMDisk--small--16.copy: 0 409600 linear 9:126 8300544
    vg-TestVMDisk--small--16.drbdmeta: 0 262144 linear 9:126 16902144

    it-test-17.lab.fr.lan ~ 0 # python -c 'print "0 %d linear /dev/drbd0 0" % (134217728 / 512)' | dmsetup load /dev/vg/TestVMDisk-small-16
    it-test-17.lab.fr.lan ~ 0 # dmsetup table
    vg-TestVMDisk--small--16: 0 409600 linear 9:126 2048
    vg-TestVMDisk--small--16.copy: 0 409600 linear 9:126 2048
    vg-TestVMDisk--small--16.drbdmeta: 0 262144 linear 9:126 411648
    it-test-17.lab.fr.lan ~ 0 # dmsetup suspend /dev/vg/TestVMDisk-small-16
    it-test-17.lab.fr.lan ~ 0 # dmsetup resume /dev/vg/TestVMDisk-small-16
    it-test-17.lab.fr.lan ~ 0 # dmsetup table
    vg-TestVMDisk--small--16: 0 262144 linear 147:0 0
    vg-TestVMDisk--small--16.copy: 0 409600 linear 9:126 2048
    vg-TestVMDisk--small--16.drbdmeta: 0 262144 linear 9:126 411648
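
The table line piped into ``dmsetup load`` has the form
``<start> <length> linear <device> <offset>``, with start and length counted in
512-byte sectors. The Python 2 one-liner above can be written as the following
sketch, which also runs on Python 3; ``linear_table`` and ``SECTOR_SIZE`` are
names introduced here. Note that 134217728 in the transcript is the value
computed earlier for the metadata LV and yields 262144 sectors, whereas the
200 MiB data volume (209715200 bytes) spans 409600 sectors, so the intended
length should be double-checked before the command is reused::

    SECTOR_SIZE = 512


    def linear_table(size_bytes, target, offset=0):
        """Return a device-mapper 'linear' table line covering size_bytes."""
        if size_bytes % SECTOR_SIZE:
            raise ValueError("size must be a multiple of %d bytes" % SECTOR_SIZE)
        return "0 %d linear %s %d" % (size_bytes // SECTOR_SIZE, target, offset)


    print(linear_table(209715200, "/dev/drbd0"))  # 0 409600 linear /dev/drbd0 0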

3 - Resume VM
-------------

FIXME ?

::

    virsh resume TestVMDisk-small-16

4 - Wait for sync
-----------------

FIXME ?

Wait for DRBD to synchronise. Synchronisation status can be checked with the
following command::

    it-test-16.lab.fr.lan ~ 0 # drbdsetup /dev/drbd0 status
    <resource minor="0" cs="Unconfigured" />

When done, set up DRBD in primary-primary mode.

5 - Migrate
-----------

::

    it-test-16.lab.fr.lan ~ 0 # ipython
    Python 2.7.3rc2 (default, Apr 22 2012, 22:30:17) 
    Type "copyright", "credits" or "license" for more information.

    IPython 0.12 -- An enhanced Interactive Python.
    ?         -> Introduction and overview of IPython's features.
    %quickref -> Quick reference.
    help      -> Python's own help system.
    object?   -> Details about 'object', use 'object??' for extra details.

    In [1]: import libvirt

    In [2]: c = libvirt.open('qemu:///system')

    In [3]: dest_c = libvirt.open('qemu+tcp://it-test-17.lab.fr.lan:1234/system')

    In [5]: c.listDomainsID()
    Out[5]: [6, 7]

    In [8]: flags = libvirt.VIR_MIGRATE_LIVE | libvirt.VIR_MIGRATE_PEER2PEER | libvirt.VIR_MIGRATE_TUNNELLED | libvirt.VIR_MIGRATE_PERSIST_DEST | libvirt.VIR_MIGRATE_UNDEFINE_SOURCE

    In [9]: flags
    Out[9]: 31

    In [11]: d = c.lookupByID(7)

    In [12]: d.name()
    Out[12]: 'TestVMDisk-small-16'

    In [20]: d.migrate(dest_c, flags, None, 'qemu+tcp://it-test-17.lab.fr.lan:1234/system', 0)
    Out[20]: <libvirt.virDomain instance at 0x7fdbe40f1c68>

If the VM's disks are not configured with cache set to 'none', the migration
fails with the following error::

    In [13]: d.migrate(dest_c, flags, None, 'qemu+tcp://it-test-17.lab.fr.lan:1234/system', 0)
    libvir: QEMU error : Unsafe migration: Migration may lead to data corruption if disks use cache != none
    ---------------------------------------------------------------------------
    libvirtError                              Traceback (most recent call last)
    /root/<ipython-input-13-cedc63d7d342> in <module>()
    ----> 1 d.migrate(dest_c, flags, None, 'qemu+tcp://it-test-17.lab.fr.lan:1234/system', 0)

    /usr/lib/python2.7/dist-packages/libvirt.pyc in migrate(self, dconn, flags, dname, uri, bandwidth)
        820         else: dconn__o = dconn._o
        821         ret = libvirtmod.virDomainMigrate(self._o, dconn__o, flags, dname, uri, bandwidth)
    --> 822         if ret is None:raise libvirtError('virDomainMigrate() failed', dom=self)
        823         __tmp = virDomain(self,_obj=ret)
        824         return __tmp

    libvirtError: Unsafe migration: Migration may lead to data corruption if disks use cache != none

You may force the migration by adding the VIR_MIGRATE_UNSAFE flag (libvirt >=
0.9.11)::

    In [14]: flags |= libvirt.VIR_MIGRATE_UNSAFE
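
The flags are a bitmask, which is why the session above prints 31 for the five
flags combined. The sketch below reproduces that arithmetic without libvirt
installed. ``MigrateFlag`` is a stand-in defined here, its values are assumed
to mirror libvirt's ``virDomainMigrateFlags``, and the value given for
``UNSAFE`` in particular should be checked against
``libvirt.VIR_MIGRATE_UNSAFE``::

    from enum import IntFlag


    class MigrateFlag(IntFlag):
        # Assumed to match the VIR_MIGRATE_* constants of the same name.
        LIVE = 1
        PEER2PEER = 2
        TUNNELLED = 4
        PERSIST_DEST = 8
        UNDEFINE_SOURCE = 16
        UNSAFE = 512


    base = (MigrateFlag.LIVE | MigrateFlag.PEER2PEER | MigrateFlag.TUNNELLED
            | MigrateFlag.PERSIST_DEST | MigrateFlag.UNDEFINE_SOURCE)
    print(int(base))  # 31, as in the session above

    forced = base | MigrateFlag.UNSAFE
    print(int(forced))  # 543
    print([flag.name for flag in MigrateFlag if flag in forced])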

When done, set up DRBD in secondary-primary mode.

6 - Pause VM
------------

We must pause the VM while we manipulate the device mapper, because I/O errors
would occur otherwise.

7 - Remove DRBD
---------------

Remove DRBD::

    it-test-17.lab.fr.lan ~ 0 # drbdsetup /dev/drbd0 disconnect
    it-test-17.lab.fr.lan ~ 0 # drbdsetup /dev/drbd0 detach
    /dev/drbd0: State change failed: (-2) Need access to UpToDate data
    it-test-17.lab.fr.lan ~ 17 # drbdsetup /dev/drbd0 down
    it-test-17.lab.fr.lan ~ 0 #

    it-test-16.lab.fr.lan ~ 0 # drbdsetup /dev/drbd0 disconnect
    it-test-16.lab.fr.lan ~ 0 # drbdsetup /dev/drbd0 detach
    /dev/drbd0: State change failed: (-2) Need access to UpToDate data
    it-test-16.lab.fr.lan ~ 17 # drbdsetup /dev/drbd0 down
    it-test-16.lab.fr.lan ~ 0 #

Restore the original device-mapper tables, so that the LV no longer goes
through DRBD::

    it-test-17.lab.fr.lan ~ 0 # echo '0 409600 linear 9:126 2048' | dmsetup load /dev/vg/TestVMDisk-small-16
    it-test-17.lab.fr.lan ~ 0 # dmsetup table
    vg-TestVMDisk--small--16: 0 262144 linear 147:0 0
    vg-TestVMDisk--small--16.copy: 0 409600 linear 9:126 2048
    vg-TestVMDisk--small--16.drbdmeta: 0 262144 linear 9:126 411648
    it-test-17.lab.fr.lan ~ 0 # dmsetup suspend /dev/vg/TestVMDisk-small-16
    it-test-17.lab.fr.lan ~ 0 # dmsetup resume /dev/vg/TestVMDisk-small-16
    it-test-17.lab.fr.lan ~ 0 # dmsetup table
    vg-TestVMDisk--small--16: 0 409600 linear 9:126 2048
    vg-TestVMDisk--small--16.copy: 0 409600 linear 9:126 2048
    vg-TestVMDisk--small--16.drbdmeta: 0 262144 linear 9:126 411648

    it-test-16.lab.fr.lan ~ 0 # echo '0 409600 linear 9:126 8300544' | dmsetup load /dev/vg/TestVMDisk-small-16
    it-test-16.lab.fr.lan ~ 0 # dmsetup table
    vg-TestVMDisk--17: 0 8192000 linear 9:126 8710144
    vg-TestVMDisk--16: 0 8298496 linear 9:126 2048
    vg-TestVMDisk--small--16: 0 262144 linear 147:0 0
    vg-TestVMDisk--small--16.copy: 0 409600 linear 9:126 8300544
    vg-TestVMDisk--small--16.drbdmeta: 0 262144 linear 9:126 16902144
    it-test-16.lab.fr.lan ~ 0 # dmsetup suspend /dev/vg/TestVMDisk-small-16
    it-test-16.lab.fr.lan ~ 0 # dmsetup resume /dev/vg/TestVMDisk-small-16
    it-test-16.lab.fr.lan ~ 0 # dmsetup table
    vg-TestVMDisk--17: 0 8192000 linear 9:126 8710144
    vg-TestVMDisk--16: 0 8298496 linear 9:126 2048
    vg-TestVMDisk--small--16: 0 409600 linear 9:126 8300544
    vg-TestVMDisk--small--16.copy: 0 409600 linear 9:126 8300544
    vg-TestVMDisk--small--16.drbdmeta: 0 262144 linear 9:126 16902144
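
Because the ``.copy`` device was created from the LV's original table, that
table can be read back from it rather than typed by hand, which avoids mixing
up the offsets of the two hypervisors (2048 on it-test-17, 8300544 on
it-test-16). A sketch, with ``table_of`` as a hypothetical helper that only
parses ``dmsetup table`` output such as the listings above::

    def table_of(dmsetup_output, name):
        """Return the table of device `name` from `dmsetup table` output."""
        for line in dmsetup_output.splitlines():
            device, separator, table = line.partition(": ")
            if separator and device.strip() == name:
                return table.strip()
        raise KeyError(name)


    # Output of `dmsetup table` on it-test-17, taken from this page.
    output = (
        "vg-TestVMDisk--small--16: 0 262144 linear 147:0 0\n"
        "vg-TestVMDisk--small--16.copy: 0 409600 linear 9:126 2048\n"
        "vg-TestVMDisk--small--16.drbdmeta: 0 262144 linear 9:126 411648\n"
    )
    print(table_of(output, "vg-TestVMDisk--small--16.copy"))
    # 0 409600 linear 9:126 2048

The returned line is what would be piped into ``dmsetup load`` for the
original device on that host.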

Remove the .copy device-mapper device::

    it-test-16.lab.fr.lan ~ 0 # dmsetup remove /dev/mapper/vg-TestVMDisk--small--16.copy
    it-test-16.lab.fr.lan ~ 0 #

    it-test-17.lab.fr.lan ~ 0 # dmsetup remove /dev/mapper/vg-TestVMDisk--small--16.copy
    it-test-17.lab.fr.lan ~ 0 #

Remove metadata LV::

    it-test-17.lab.fr.lan ~ 5 # lvremove /dev/vg/TestVMDisk-small-16.drbdmeta
    Do you really want to remove active logical volume TestVMDisk-small-16.drbdmeta? [y/n]: y
      Logical volume "TestVMDisk-small-16.drbdmeta" successfully removed
    it-test-17.lab.fr.lan ~ 0 #

    it-test-16.lab.fr.lan ~ 5 # lvremove /dev/vg/TestVMDisk-small-16.drbdmeta
    Do you really want to remove active logical volume TestVMDisk-small-16.drbdmeta? [y/n]: y
      Logical volume "TestVMDisk-small-16.drbdmeta" successfully removed
    it-test-16.lab.fr.lan ~ 0 #

Remove VM LV on source hypervisor::

    it-test-16.lab.fr.lan ~ 0 # lvremove /dev/vg/TestVMDisk-small-16 
    Do you really want to remove active logical volume TestVMDisk-small-16? [y/n]: y
      Logical volume "TestVMDisk-small-16" successfully removed
    it-test-16.lab.fr.lan ~ 0 #

8 - Resume VM
-------------