Monday 29 April 2024

YAML Ain't Markup Language (YAML)




The YAML file format is used to represent data, much like other data formats such as XML or JSON.

This table shows the same data, represented in all three formats:
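
As a rough illustration of that comparison, the Python sketch below renders one small record in each of the three formats. The record, the `server` root element name and the hand-formatted YAML lines are assumptions made up for this example; no YAML library is used.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical record, invented for this illustration.
server = {"name": "Server1", "owner": "John"}

# JSON: braces, with quoted keys and values.
as_json = json.dumps(server)

# XML: one child element per property under an assumed <server> root.
root = ET.Element("server")
for key, value in server.items():
    ET.SubElement(root, key).text = value
as_xml = ET.tostring(root, encoding="unicode")

# YAML: one "key: value" line per property (formatted by hand).
as_yaml = "\n".join(f"{key}: {value}" for key, value in server.items())

print(as_json)
print(as_xml)
print(as_yaml)
```

The YAML form carries the same information with the least punctuation, which is a large part of its appeal for configuration files.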


Files in YAML format can have the extension .yaml or .yml.

Key-Value Pair


Data in its simplest form is a key-value pair, and that is how it is defined in YAML: a key and a value separated by a colon (the colon must be followed by a space).

name: Server1

The key above is name.
The value is Server1.
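
To make the role of the colon and the space concrete, here is a minimal Python sketch that splits one simple line into its key and value. The helper `split_pair` is hypothetical and is not a YAML parser; it only looks for the colon-plus-space separator.

```python
def split_pair(line):
    """Split a simple 'key: value' line; return None without a ': ' separator."""
    if ": " not in line:
        return None
    key, value = line.split(": ", 1)
    return key.strip(), value.strip()


print(split_pair("name: Server1"))  # ('name', 'Server1')
print(split_pair("name:Server1"))   # None, because the space after the colon is missing
```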


A YAML file is basically a collection of one or more key-value pairs, where the key is a string (quotation marks are not needed) and the value is either a string or a more complex data structure such as a list, a dictionary, or a list of dictionaries.


Array/List


The array name is followed by a colon, and then each item goes on its own line with a dash in front:


Servers:
- Server1
- Server2
...
- ServerN


The dash indicates that it's an element of an array.


The example above is actually a key-value pair where the value is a list. As a YAML document is a collection of key-value pairs, we can have a file like this:

foo.yaml:

Servers:
- Server1
- Server2

DataCentres:
- London
- Frankfurt
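
For readers who think in code, the sketch below shows the in-memory structure a typical YAML loader would be expected to produce for foo.yaml: a dictionary whose values are lists. The variable name `foo` is an assumption, and the structure is written by hand rather than loaded from the file.

```python
import json

# Assumed in-memory equivalent of foo.yaml: a dictionary whose values are lists.
foo = {
    "Servers": ["Server1", "Server2"],
    "DataCentres": ["London", "Frankfurt"],
}

# Each key gives access to its list; list items are reached by position.
print(foo["Servers"][0])
print(foo["DataCentres"][1])

# The same data written out as JSON, for comparison with the YAML above.
print(json.dumps(foo, indent=2))
```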

Dictionary (Map)


A dictionary is a set of properties grouped together under an item.

Technically, the example below is a key-value pair where the key is the name of the dictionary and the value is the dictionary itself:

Server1:
    name: Server1
    owner: John
    created: 123456
    status: active

Notice the blank space before each property. There must be an equal number of blank spaces before the properties of a single item so they are all aligned together, meaning that they are all siblings of their parent, which is key Server1.

The number of indentation spaces before these properties does not matter, as long as it is the same for all of them, because they are siblings.

A YAML file uses spaces for indentation. You can use 2 or 4 spaces, but not tabs. In other words, tab indentation is forbidden. [php - A YAML file cannot contain tabs as indentation - Stack Overflow]

Why does YAML forbid tabs?

Tabs have been outlawed since they are treated differently by different editors and tools. And since indentation is so critical to proper interpretation of YAML, this issue is just too tricky to even attempt. Indeed Guido van Rossum of Python has acknowledged that allowing TABs in Python source is a headache for many people and that were he to design Python again, he would forbid them. [YAML Ain't Markup Language]
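
Because a tab and a run of spaces can look identical in an editor, it can help to check a file before handing it to a parser. The following is a minimal Python sketch of such a check; the function name `lines_with_tab_indent` and the sample text are assumptions made up for this example.

```python
def lines_with_tab_indent(text):
    """Return 1-based numbers of lines whose leading whitespace contains a tab."""
    bad = []
    for number, line in enumerate(text.splitlines(), start=1):
        # Everything before the first non-whitespace character is indentation.
        indent = line[: len(line) - len(line.lstrip())]
        if "\t" in indent:
            bad.append(number)
    return bad


# The third line is indented with a tab instead of spaces.
sample = "Server1:\n    name: Server1\n\towner: John\n"
print(lines_with_tab_indent(sample))  # [3]
```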

The number of spaces before each property is what indicates that these key-value pairs fall within Server1.

Let's suppose that we have:

Server1:
    name: Server1
    owner: John
       created: 123456
    status: active

In this case created has more spaces on the left than owner, so it is treated as a child of the owner property instead of being its sibling, which is incorrect.

These properties must also be indented with more spaces than their parent, which is Server1.

What if we had extra spaces for created and status?  

Server1:
    name: Server1
    owner: John
       created: 123456
       status: active

Then they would fall under owner and thus become properties of owner. This results in a syntax error saying that mapping values are not allowed here, because owner already has a value set, which is John.

The value of a key-value pair can be either a direct value or a hash map, but not both. That is why the number of spaces before each property is crucial in YAML.
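
The rule can be sketched as a small Python check that flags any line indented deeper than a preceding line which already has an inline value. The function `find_bad_nesting` is hypothetical and only understands simple "key: value" block mappings like the examples above; it is not a YAML validator.

```python
def find_bad_nesting(text):
    """Return 1-based numbers of lines nested under a key that already has a value."""
    problems = []
    prev_indent = None
    prev_has_value = False
    for number, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and full-line comments
        indent = len(line) - len(line.lstrip(" "))
        # Deeper indentation is only allowed when the previous key has no value yet.
        if prev_indent is not None and indent > prev_indent and prev_has_value:
            problems.append(number)
        _key, sep, value = stripped.partition(":")
        prev_has_value = bool(sep) and bool(value.strip())
        prev_indent = indent
    return problems


good = "Server1:\n    name: Server1\n    owner: John\n    created: 123456\n"
bad = "Server1:\n    name: Server1\n    owner: John\n       created: 123456\n       status: active\n"
print(find_bad_nesting(good))  # []
print(find_bad_nesting(bad))   # [4]
```

Only line 4 is reported for the second sample, because status is aligned with created and the problem starts where the indentation first deepens.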


Complex Data Types


A list containing dictionaries 


Here we have a key-value pair where the value is a list. Each element of that list is itself a key-value pair whose value is a dictionary, so we can say that we have a list of dictionaries where each dictionary has a name:

Servers:
- Server1:
    name: Server1
    owner: John
    created: 123456
    status: active
- Server2:
    name: Server2
    owner: Jack
    created: 789012
    status: shutdown


Here we have a list of servers whose elements are the key-value pairs Server1 and Server2. Their values are dictionaries containing the server information.

We can also have a list of unnamed dictionaries, where each element of the list is not a key-value pair but a dictionary itself:

Servers:
- name: Server1
  owner: John
  created: 123456
  status: active
- name: Server2
  owner: Jack
  created: 789012
  status: shutdown
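
The practical difference between the two forms shows up when reading the data. The Python sketch below writes out, by hand, the structures a typical loader would be expected to produce for each form (trimmed to two properties per server for brevity); the variable names `named` and `unnamed` are assumptions.

```python
# Named form: each list element is a one-key dictionary wrapping the details.
named = [
    {"Server1": {"name": "Server1", "owner": "John"}},
    {"Server2": {"name": "Server2", "owner": "Jack"}},
]

# Unnamed form: each list element is the details dictionary itself.
unnamed = [
    {"name": "Server1", "owner": "John"},
    {"name": "Server2", "owner": "Jack"},
]

# The named form needs one extra lookup by server name.
print(named[0]["Server1"]["owner"])  # John
print(unnamed[0]["owner"])           # John
```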

A list containing dictionaries containing lists


Servers:
- Server1:
    name: Server1
    owner: John
    created: 123456
    status: active
    applications:
       - web server
       - authentication database
- Server2:
    name: Server2
    owner: Jack
    created: 789012
    status: shutdown
    applications:
       - caching database
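
To show how such a nested structure is walked, here is a minimal Python sketch that collects the applications of each server. The `servers` value is a hand-written, trimmed equivalent of the example above, and `apps_by_server` is a name introduced for this illustration.

```python
# Hand-written equivalent of the example above (owner and created omitted).
servers = [
    {"Server1": {"name": "Server1", "status": "active",
                 "applications": ["web server", "authentication database"]}},
    {"Server2": {"name": "Server2", "status": "shutdown",
                 "applications": ["caching database"]}},
]

# Outer loop: list elements. Inner loop: the single named entry in each element.
apps_by_server = {}
for entry in servers:
    for server_name, details in entry.items():
        apps_by_server[server_name] = details["applications"]

print(apps_by_server)
```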

When to use a list, a dictionary, and a list of dictionaries?


Use a dictionary if you need to represent information about, or multiple properties of, a single object.

A dictionary is a collection of key-value pairs grouped together:

name: Server1
owner: John
created: 123456
status: active

In case we need to split the owner further into name and surname, we could then represent this as a dictionary within another dictionary.

name: Server1
owner: 
   name: John
   surname: Smith
created: 123456
status: active

In this case the single value of owner is replaced by a small dictionary with two properties, name and surname. So this is a dictionary within another dictionary.
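
In code, a nested dictionary is read one level at a time. The sketch below is a hand-written Python equivalent of the example above; it assumes a loader would read created as an integer, which may differ between YAML libraries.

```python
# Hand-written equivalent of the nested example above.
server = {
    "name": "Server1",
    "owner": {"name": "John", "surname": "Smith"},
    "created": 123456,  # assumed to load as an integer
    "status": "active",
}

# First look up owner, then a property inside it.
full_name = f'{server["owner"]["name"]} {server["owner"]["surname"]}'
print(full_name)  # John Smith
```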


Use a list/array to represent multiple items of the same type of object, for example strings.

Here we have a key-value pair where the value is a list of strings:

Servers:
- Server1
- Server2

What if we would like to store all the information about each server? We expand each item in the array and replace the name with a dictionary. This way we can represent all the information about multiple servers in a single YAML file using a list of dictionaries.

Here we have a key-value pair where the value is a list of dictionaries:

Servers:
- name: Server1
  owner: John
  created: 123456
  status: active
- name: Server2
  owner: Jack
  created: 789012
  status: shutdown


When does the order of items matter?



A dictionary is an unordered collection.
Lists/arrays are ordered collections.

The dictionary:

name: Server1
owner: John

is the same as:

owner: John
name: Server1

But the list:

- Server1
- Server2

is not the same as the list:

- Server2
- Server1
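
Python's built-in types follow the same rule when compared, which makes for a quick way to see it in action. The variable names below are assumptions for this illustration.

```python
# Dictionaries: same pairs in a different order still compare equal.
first = {"name": "Server1", "owner": "John"}
second = {"owner": "John", "name": "Server1"}
print(first == second)  # True

# Lists: same items in a different order are different lists.
servers_a = ["Server1", "Server2"]
servers_b = ["Server2", "Server1"]
print(servers_a == servers_b)  # False
```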


Comments in YAML

Any line beginning with a hash (#) is treated as a comment and is ignored.

# List of servers
- Server1
- Server2
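
As a small illustration of what ignoring a comment means, the Python sketch below drops full-line comments from a piece of text. The helper `drop_comment_lines` is hypothetical and only handles lines that start with a hash (after optional leading spaces); it is not a YAML parser.

```python
def drop_comment_lines(text):
    """Return the lines of text that are not full-line comments."""
    return [line for line in text.splitlines()
            if not line.lstrip().startswith("#")]


sample = "# List of servers\n- Server1\n- Server2\n"
print(drop_comment_lines(sample))  # ['- Server1', '- Server2']
```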


