I’ve run into a bit of frustration in Azure when working in multi-domain environments. Our organization uses four different DNS domains, only one of which is the Active Directory domain. This makes it annoying to configure domain-joined Linux VMs whose names aren’t in the primary Active Directory domain. That’s a little abstract, so let me give an example.
In the Contoso organization, there are several business units. Among them are business unit A (which uses domain businessA.net) and business unit B (which uses domain businessB.net). But there is a single Active Directory domain, businessA.net. Joining a Linux machine to businessA.net is no problem. But joining the Linux machine to businessB.net is a little more involved when the machine is built in Azure, because of how cloud-init works. Luckily, it’s possible to tell cloud-init to do what we want, but it has to be done _at the time of provisioning_ or it won’t work.
Under normal circumstances, you’d update your netplan file to include the correct search domain. But in Azure, your netplan file is overwritten on every boot by cloud-init. So unless cloud-init knows what you want in your netplan file, your changes will be wiped out on every reboot. You’d also have to set your hostname to the FQDN (e.g. hostname.businessB.net) that you are registering in DNS.
So, how to accomplish this? First, you will need to set the custom data in your Azure VM deployment (under the Advanced tab) to a valid cloud-init config file (see https://cloud-init.readthedocs.io). Ensure its first line is #cloud-config or it won’t be processed. The following config file works for businessB.net:
#cloud-config
preserve_hostname: false
fqdn: hostname.businessb.net
prefer_fqdn_over_hostname: true
manage_resolv_conf: true
resolv_conf:
  nameservers:
    - 127.0.0.53
  domain: businessb.net
  searchdomains:
    - businessb.net
Some quick notes about this. First, note the use of preserve_hostname: false and prefer_fqdn_over_hostname: true. The first allows cloud-init to overwrite the existing hostname (i.e. the Azure VM name), and the second tells it to use the FQDN as the hostname.
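Since a file without the #cloud-config first line won’t be processed, it can be worth checking the file before pasting it into the portal. Here is a minimal sketch of such a check; the function name has_cloud_config_header and the file name user-data.yaml are my own placeholders, not anything cloud-init or Azure provides.

```shell
#!/bin/sh
# Hypothetical helper (not part of cloud-init): succeeds only if the
# first line of the given file is exactly "#cloud-config".
has_cloud_config_header() {
    # Read just the first line of the file passed as $1.
    first_line=$(head -n 1 "$1")
    # Exact string comparison; leading spaces or a different comment fail.
    [ "$first_line" = "#cloud-config" ]
}
```

You could then run something like has_cloud_config_header user-data.yaml && echo "header ok" before deploying. This only checks the header line, not whether the rest of the file is valid YAML.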
If only it were so simple! After booting the machine for the first time, you will see that the hostname is indeed hostname.businessb.net, but resolv.conf is still the default resolv.conf created by Azure. It’s like it didn’t read our resolv_conf configuration at all! In fact, this is exactly what has happened. The resolv_conf module is not enabled by default in /etc/cloud/cloud.cfg, and must be added to the cloud_config_modules section.
But even that is not enough. On Ubuntu, the resolv_conf module is not verified, so you also have to tell cloud-init that you’re ok with that (it seems to do what I’d expect…). You can do so by adding a section unverified_modules underneath cloud_final_modules, with a single entry, resolv_conf. (Ensure it’s valid YAML!)
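To double-check both edits to /etc/cloud/cloud.cfg before re-running anything, something like the following sketch may help. The function name section_lists_module is hypothetical, and the awk logic assumes the usual cloud.cfg layout where each section is a top-level key followed by "- name" list entries; it is a simple text check, not a YAML parser.

```shell
#!/bin/sh
# Expected shape of the edited /etc/cloud/cloud.cfg (existing entries omitted):
#   cloud_config_modules:
#     - resolv_conf
#   unverified_modules:
#     - resolv_conf

# Hypothetical helper: succeeds if the top-level list SECTION in FILE
# contains an entry MODULE.  Usage: section_lists_module FILE SECTION MODULE
section_lists_module() {
    awk -v section="$2" -v module="$3" '
        # Any line starting with a key name in column 1 opens a new section;
        # remember whether it is the one we are looking for.
        /^[A-Za-z_]/ { in_section = ($0 ~ "^" section ":") }
        # Inside the wanted section, match list entries of the form "- module".
        in_section && $1 == "-" && $2 == module { found = 1 }
        END { exit(found ? 0 : 1) }
    ' "$1"
}
```

For example, section_lists_module /etc/cloud/cloud.cfg cloud_config_modules resolv_conf and section_lists_module /etc/cloud/cloud.cfg unverified_modules resolv_conf should both succeed once the file has been edited as described.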
At this point, the system should be ready to accept the complete configuration from cloud-init, but the initial setup has already run. We will need to tell cloud-init to rerun the initial configuration, by executing the following commands in order:
# cloud-init clean --logs
# cloud-init init --local
# cloud-init init
# cloud-init modules
WARNING! This will regenerate your system SSH keys, so make sure you’re prepared for that. Do this immediately after you boot for the first time.
At this point, resolv.conf should have the correct contents:
nameserver 127.0.0.53
domain businessb.net
search businessb.net
It will also contain a comment noting that cloud-init won’t write the file again. Note that we’re using 127.0.0.53, the address systemd-resolved listens on. Feel free to point this at other nameservers and disable systemd-resolved.
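If you’d rather script the check than eyeball the file, here is a small sketch. The function name resolv_conf_searches is made up for this post; it only looks for the domain on a "search" line and defaults to /etc/resolv.conf when no file is given.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if DOMAIN appears on a "search" line.
# Usage: resolv_conf_searches DOMAIN [FILE]   (FILE defaults to /etc/resolv.conf)
resolv_conf_searches() {
    awk -v domain="$1" '
        # A search line may list several domains; compare each field.
        $1 == "search" { for (i = 2; i <= NF; i++) if ($i == domain) found = 1 }
        END { exit(found ? 0 : 1) }
    ' "${2:-/etc/resolv.conf}"
}
```

After the re-run, resolv_conf_searches businessb.net should succeed, and hostname -f is a quick way to confirm the FQDN as well.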
That’s a bit about cloud-config, and a problem common enough for me to write about.