Moving XG within Azure

18 September 2018 | Azure, Sophos XG

While there are many technicalities to consider when moving an XG virtual appliance from one tenant to another in Microsoft Azure, it is important not to overlook one essential detail that ensures the environment keeps working after the migration.

First, let’s take into consideration the following facts:

  1. The XG uses Standard SKU public IPs, which cannot be moved between accounts (https://docs.microsoft.com/en-us/azure/azure-resource-manager/resource-group-move-resources#pip-limitations)
  2. You cannot move a VM that has a plan attached (such as the BYOL licensing for XG)

An appliance move can fail for a few other reasons, but these two, together with the use of managed disks, are the most common.

These situations require a rollback to the previous tenant because connectivity to the appliance fails after it is moved.
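The blockers listed above can be checked before attempting a move. The following is a minimal sketch in Python, assuming the VM and its public IPs have already been exported as dictionaries shaped like the JSON that Azure tooling returns (a ‘plan’ field on the VM, ‘storageProfile.osDisk.managedDisk’ for the disk, ‘sku.name’ on each public IP). The function name and the sample values are hypothetical and are used only for illustration.

```python
def find_move_blockers(vm, public_ips):
    """Return a list of reasons why moving this VM is likely to fail.

    `vm` and `public_ips` are plain dictionaries; their shape is an
    assumption modelled on Azure resource JSON, not a confirmed schema.
    """
    blockers = []

    # A marketplace plan (for example BYOL licensing) attached to the VM.
    if vm.get("plan"):
        blockers.append("VM has a plan attached")

    # A managed OS disk.
    os_disk = (vm.get("storageProfile") or {}).get("osDisk") or {}
    if os_disk.get("managedDisk"):
        blockers.append("VM uses a managed OS disk")

    # Standard SKU public IPs.
    for ip in public_ips:
        sku_name = (ip.get("sku") or {}).get("name", "")
        if sku_name.lower() == "standard":
            blockers.append("Public IP %s uses the Standard SKU" % ip.get("name", "?"))

    return blockers


# Hypothetical sample data, for illustration only.
sample_vm = {
    "name": "xg-firewall",
    "plan": {"name": "byol", "product": "example-product", "publisher": "example-publisher"},
    "storageProfile": {"osDisk": {"managedDisk": {"id": "example-disk-id"}}},
}
sample_ips = [
    {"name": "xg-wan-ip", "sku": {"name": "Standard"}},
    {"name": "legacy-ip", "sku": {"name": "Basic"}},
]

for reason in find_move_blockers(sample_vm, sample_ips):
    print(reason)
```

An empty result does not guarantee that the move will succeed; it only rules out the blockers discussed here.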

Here, we are going to examine the log messages from Azure related to a crash right after reboot:

FIRMWARE LOADER

Starting 17_0_8_209.

Warning: Detected Disk [/dev/hda] [size:8388608]

Warning: Detected Disk [/dev/hdb] [size:209715200]

Warning: Detected Disk [/dev/sda] [size:8388608]

Warning: Detected Disk [/dev/sdb] [size:209715200]

Warning: Detected Disk [/dev/sdc] [size:167772160]

Warning: using primary disk [/dev/sda] , auxiliary disk : [/dev/sdc]

Loading configuration

Performing automated file system integrity checks. It will take some time before your system is available.

Examining Config partition…..

Examining Signature partition…..

Examining Report partition…..

 

### System Detail ###

 

Number of cores:                4

Total RAM:                      6976 MB

Total Number of interfaces:     2

Total Primary Disk:             8 GB

Total Auxiliary Disk:           80 GB

 

#####################

 

mkdir: can’t create directory ‘/sdisk/conan_new’: File exists

Password: 2018/09/03 10:00:28.xxxxx INFO Azure Linux Agent Version:2.1.3

2018/09/03 10:00:29.216731 INFO OS: sfos 17

2018/09/03 10:00:29.219252 INFO Python: 3.5.1

2018/09/03 10:00:29.219510 INFO Run daemon

2018/09/03 10:00:29.220392 INFO Detect protocol endpoints

2018/09/03 10:00:29.221389 INFO WireServer endpoint is not found. Rerun dhcp handler

2018/09/03 10:00:29.221491 INFO Send dhcp request

2018/09/03 10:00:29.329677 INFO Configure routes

2018/09/03 10:00:29.343346 INFO Gateway:xxx.xxx.xxx.xxx

2018/09/03 10:00:29.359037 INFO Routes:None

2018/09/03 10:00:29.373068 INFO Request to install route: 0 0 xxx.xxx.xxx.xxx

2018/09/03 10:00:29.385071 INFO Wire server endpoint:xxx.xxx.xxx.xxx

2018/09/03 10:00:29.615998 INFO Fabric preferred wire protocol version:2015-04-05

2018/09/03 10:00:29.637200 INFO Wire protocol version:2012-11-30

2018/09/03 10:00:29.658134 WARNING Server prefered version:2015-04-05

2018/09/03 10:00:34.667711 INFO Start env monitor service.

2018/09/03 10:00:34.668910 INFO Configure routes

2018/09/03 10:00:34.669032 INFO Gateway:xxx.xxx.xxx.xxx

2018/09/03 10:00:34.669906 INFO Routes:None

2018/09/03 10:00:34.670010 INFO Request to install route: 0 0 xxx.xxx.xxx.xxx

2018/09/03 10:00:34.687595 ERROR run cmd ‘pidof udhcpc’ failed

2018/09/03 10:00:34.688913 ERROR Error Code:1

The log shows that Azure’s DHCP service cannot deliver an IP address when it is expected to: the last two lines report that ‘pidof udhcpc’ failed, meaning no DHCP client process was found on the appliance. Usually, this is because a static IP is assigned to the WAN ‘B’ port. Even if the static address is identical to the one Azure would deliver, the communications and security infrastructure upstream of the XG is very strict about this and expects an ‘OK received’ from the appliance. It might sound trivial, but the failure repeats in a continuous loop that breaks communications.
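Spotting this failure in a long serial log can be automated. The sketch below is one possible approach in Python, assuming the log has been saved as text and that agent lines follow the timestamp, level, message layout seen in the excerpt above. The function name is hypothetical, and the sample text is copied from that excerpt.

```python
import re

# Timestamp, level, message, as seen in the Azure Linux Agent lines above.
# search() is used because a line may carry a prefix such as "Password: ".
LOG_LINE = re.compile(
    r"(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}\.\w+) (INFO|WARNING|ERROR) (.*)"
)


def dhcp_client_missing(log_text):
    """Return (flag, errors): flag is True when an ERROR line mentions
    the failed 'pidof udhcpc' check; errors lists all ERROR messages."""
    errors = []
    for line in log_text.splitlines():
        match = LOG_LINE.search(line)
        if match and match.group(2) == "ERROR":
            errors.append(match.group(3))
    flag = any("pidof udhcpc" in message for message in errors)
    return flag, errors


SAMPLE = """\
2018/09/03 10:00:34.669906 INFO Routes:None
2018/09/03 10:00:34.670010 INFO Request to install route: 0 0 xxx.xxx.xxx.xxx
2018/09/03 10:00:34.687595 ERROR run cmd ‘pidof udhcpc’ failed
2018/09/03 10:00:34.688913 ERROR Error Code:1
"""

missing, error_messages = dhcp_client_missing(SAMPLE)
print(missing)
print(error_messages)
```

A positive result only indicates that the agent could not find a running DHCP client; the WAN port configuration still has to be checked on the appliance itself.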

To avoid this problem altogether, make sure your WAN port is in DHCP mode before the move. It is a simple detail, but overlooking it can cause complex problems.

That said, the simplest approach is to deploy a new XG machine in the new subscription and import the configuration from the old XG node.