VM as a file in Ansible
3/Jul 2025
Everything is a file: we all know this good old principle from Unix. What if we apply it to our IaC repo?
Ansible is a popular tool for managing infrastructure. In typical setups, infrastructure is divided into multiple environments, e.g. Dev, Test, and of course, Prod. The very first step when creating an Ansible repo is to design a clear structure that supports this environment separation. We assume Ansible is the only tool to manage VM resources and configuration.
Creating VMs with Ansible
Ansible offers modules to create and manage cloud VMs, just like other IaC tools. For Azure, for example, there is a module called azure.azcollection.azure_rm_virtualmachine.
This module supports many parameters, and as a developer you want your code to stay reusable and scalable.
Now imagine you have hundreds of VMs, each with dozens of variables. The standard Ansible approach with host_vars/group_vars can make your code unreadable.
To handle this complexity, we will revisit a classic Unix concept: everything is a file.
VM as a file
The concept is that every VM is a file in the repo. The file is in YAML format and describes all parameters of the VM in a readable, clear way. This allows anyone to create a VM by adding a file to a specific folder.
An example may look like this:
name: prd-vm-tomcat-01
shape: Standard_D2s_v4
os:
  type: linux
os_disk:
  size: 128
  type: Premium_LRS
auth: # auth information
  admin_username: superadmin
  admin_ssh_keys_pub:
    - "ssh-rsa AAA..."
tags: # Azure tags
  application: tomcat
  environment: prd
ports: # Inbound ports in Network Security Group (optional)
  - port: 22
  - port: 80
  - port: 443
This file provides all the information needed to create a VM in Azure. The content may be extended, of course.
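Because anyone can add such a file, a quick sanity check before Ansible runs may be worthwhile. The sketch below is only an illustration: the validate_vm helper and the choice of required keys are assumptions based on the example above, not part of the original setup. It works on the already-parsed file (a plain Python dict).

```python
# Hypothetical helper: the required keys are an assumption drawn from the example file.
REQUIRED_KEYS = ("name", "shape", "os", "auth")


def validate_vm(vm):
    """Return a list of human-readable problems found in a parsed VM file."""
    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in vm]
    # 'ports' is optional, but every entry must carry a valid port number.
    for entry in vm.get("ports", []):
        port = entry.get("port")
        if not isinstance(port, int) or not 1 <= port <= 65535:
            problems.append(f"invalid port: {port!r}")
    return problems


example = {
    "name": "prd-vm-tomcat-01",
    "shape": "Standard_D2s_v4",
    "os": {"type": "linux"},
    "auth": {"admin_username": "superadmin"},
    "ports": [{"port": 22}, {"port": 80}, {"port": 443}],
}
print(validate_vm(example))  # an empty list means no problems were found
```

A check like this could run in the pull request pipeline, so a malformed file is caught before it reaches the playbook.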
Environment as a directory
…but generally it is also a file, since a directory is a type of file in Unix systems.
In a large-scale enterprise architecture, we normally have more than one environment: apart from production, there may be Dev, Stage, Test, etc. Best practice is to keep them isolated, so we isolate the files as well by keeping them in different directories. In the following example, our VMs directory contains three environments: stage, dev, and prod (and a meta.yml file, which we’ll talk about later).
virtual_machines
├── stage
│   ├── stg-vm-tomcat-01.yml
│   └── stg-vm-tomcat-02.yml
├── dev
│   └── dev-vm-tomcat-01.yml
└── prod
    ├── prd-vm-tomcat-01.yml
    ├── prd-vm-tomcat-02.yml
    └── prd-vm-tomcat-03.yml
Thus, every env directory contains files that describe virtual machines.
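To make the layout concrete, here is a minimal sketch that walks such a tree and lists the VM files per environment. The function name vm_files_by_env is hypothetical, and the sketch assumes that only .yml/.yaml files directly inside an environment directory count as VM definitions.

```python
from pathlib import Path


def vm_files_by_env(root):
    """Map each environment directory under root to its sorted VM file names."""
    result = {}
    for env_dir in sorted(p for p in Path(root).iterdir() if p.is_dir()):
        result[env_dir.name] = sorted(
            f.name for f in env_dir.iterdir()
            if f.is_file() and f.suffix.lower() in (".yml", ".yaml")
        )
    return result


# Usage, assuming the script runs from the repo root:
#   for env, files in vm_files_by_env("virtual_machines").items():
#       print(env, files)
```

Files that sit next to the environment directories are ignored here, because only directories are treated as environments.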
Using group vars
Every environment has a set of variables that are shared across it (virtual networks, region, common tags, etc.). Following the DRY principle, it is better to keep them separate from the VM descriptions and merge them into every VM definition when needed.
Ansible gives us group_vars files for this. In our case, a group_vars file contains these common vars; in a way, it is metainformation about the environments.
These common vars may be overridden in the VM file: as long as our Ansible code merges them in the right order, parameters in the VM file take priority.
...
azure_network:
  dev:
    rg: dev-rg-vnet-01
    vnet: dev-vnet-01
    subnet: dev-vnet-01-sbnet-01
  stg:
    rg: stg-rg-vnet-01
    vnet: stg-vnet-01
    subnet: stg-vnet-01-sbnet-01
  prd:
    rg: prd-rg-vnet-01
    vnet: prd-vnet-01
    subnet: prd-vnet-01-sbnet-01
image:
  subscription_id: 12345-6789-12342323
  rg: prd-rg-galery-01
  gallery: prd-galery-01
os:
  linux:
    default: ubuntu-24.04-lts
    stable: ubuntu-24.04-lts
    latest: ubuntu-25.04
  windows:
    default: windows-10
    stable: windows-10
    latest: windows-11
...
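The merge rule described above (shared defaults first, VM file on top) can be sketched in plain Python. This is an illustration under assumptions, not the code from the repo: merge_vm is a hypothetical helper, and the sample values (location, owner) are made up. Inside a playbook, Ansible's combine filter with recursive=True gives a similar result.

```python
import copy


def merge_vm(defaults, vm):
    """Recursively merge two dicts; values from vm win over defaults."""
    merged = copy.deepcopy(defaults)
    for key, value in vm.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are dicts: merge them key by key.
            merged[key] = merge_vm(merged[key], value)
        else:
            # Scalars and lists from the VM file replace the default.
            merged[key] = copy.deepcopy(value)
    return merged


# Hypothetical environment defaults and a VM file that overrides one tag.
env_defaults = {"location": "westeurope", "tags": {"environment": "prd", "owner": "platform"}}
vm_file = {"name": "prd-vm-tomcat-01", "tags": {"owner": "tomcat-team"}}

print(merge_vm(env_defaults, vm_file))
```

Note that lists are replaced rather than concatenated in this sketch, so a VM file that defines ports would fully replace any default ports.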
Lookup plugin
To load the VM information from the files into variables, we will use a lookup plugin. This is going to be a simple Python script. We’ll put it in a directory called lookup_plugins and name it vms.py; this name will be used in the Ansible code later.
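from os import getcwd, listdir
from os.path import isfile, join

from ansible.plugins.lookup import LookupBase
from yaml import FullLoader, load


class LookupModule(LookupBase):
    # Directory (relative to the working directory) that holds the env folders
    vms_location_dir = 'virtual_machines'

    def run(self, terms, **kwargs):
        # The first term is the environment name, e.g. "dev" or "prod"
        env = terms[0]
        vms_dir = join(getcwd(), self.vms_location_dir, env)
        vms_files = [join(vms_dir, f)
                     for f in listdir(vms_dir) if isfile(join(vms_dir, f))]
        vms_list = []
        for vm_file in vms_files:
            # Only YAML files are treated as VM definitions
            if vm_file.lower().endswith(('.yml', '.yaml')):
                with open(vm_file, 'r') as data:
                    vm_data = load(data, Loader=FullLoader)
                    vms_list.append(vm_data)
        return vms_list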
Now we can get a VM list by calling this plugin:
- name: Print the names of all existing VMs in the environment
  debug:
    msg: "{{ item.name }}"
  with_vms: "{{ env }}"
Pay attention to the with_vms property. This is where we instruct Ansible to call the lookup plugin: each item in the loop is the parsed content of one VM file.
Create the VMs
Now that we can read all the info about the VMs, we can create the virtual machines. To keep it simple, we put all tasks related to the azure.azcollection.azure_rm_virtualmachine module into the create_vms.yml file. Calling this code then looks like this:
- name: VM | Create VMs
  include_tasks: create_vms.yml  # the bare "include" action was removed in recent ansible-core releases
  with_vms: "{{ env }}"
Idempotency
Another important topic is idempotency. We already discussed what to do with new VMs, but what if an engineer wants to remove one? They simply remove the related file from the repo. Ansible then needs to pull the list of all VMs that exist in the cloud, compare it to the list defined in the code, and remove the VMs that no longer have a file.
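- name: Existing VMs list
  azure_rm_virtualmachine_info:
  register: existing_vms

# Build name-keyed dicts so both sides can be compared with dictdifference
- name: Existing VM names
  set_fact:
    existing_vm_names: "{{ existing_vm_names | default({}) | combine({item.name: item.name}) }}"
  loop: "{{ existing_vms.vms }}"

- name: Desired VMs list
  set_fact:
    desired_vms: "{{ desired_vms | default({}) | combine({item.name: item.name}) }}"
  with_vms: "{{ env }}"

- name: VMs to destroy
  set_fact:
    destroy_vms: "{{ existing_vm_names | default({}) | dictdifference(desired_vms | default({})) }}"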
In this code, dictdifference is a custom filter plugin, located in the filter_plugins/dictdifference.py file:
def dictdifference(dict1, dict2):
    """Return the items of dict1 that are not present in dict2."""
    res = {}
    # Set difference on (key, value) pairs; values must be hashable
    diff = dict1.items() - dict2.items()
    for k, v in diff:
        res[k] = v
    return res


class FilterModule(object):
    def filters(self):
        return {
            'dictdifference': dictdifference
        }
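The comparison itself boils down to a set difference on VM names. The following stand-alone sketch shows the idea outside Ansible; plan_changes is a hypothetical helper, and the VM names (including prd-vm-tomcat-04) are made up for illustration.

```python
def plan_changes(existing_names, desired_names):
    """Return (to_create, to_destroy) as sorted lists of VM names."""
    existing, desired = set(existing_names), set(desired_names)
    to_create = sorted(desired - existing)   # file in the repo, no VM in the cloud yet
    to_destroy = sorted(existing - desired)  # VM in the cloud, file removed from the repo
    return to_create, to_destroy


# Hypothetical state: tomcat-02 was deleted from the repo, tomcat-04 was added.
existing = ["prd-vm-tomcat-01", "prd-vm-tomcat-02", "prd-vm-tomcat-03"]
desired = ["prd-vm-tomcat-01", "prd-vm-tomcat-03", "prd-vm-tomcat-04"]
print(plan_changes(existing, desired))
```

Comparing by name only keeps the check simple. It does not detect a VM whose parameters changed, which is left to the module that creates the VMs.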
End user perspective
For an end user who knows little about Ansible or Azure, you provide the ability to easily request or remove a VM. It is as easy as putting a file (created from a template) in the repo, or deleting one, and opening a pull request. Your team reviews the PR, and as soon as it is merged, Ansible does all the work for you.
Summary
By structuring your Ansible repo around VM-as-file and environment-as-directory concepts, you can achieve a scalable, maintainable way to manage cloud VMs.
This approach keeps your infrastructure code organized, easy to understand, and adaptable, crucial for managing complex environments at scale.