YAML is a file format used to represent data, like other data serialization formats such as XML and JSON.
This table shows the same data, represented in all three formats:
Files that use the YAML format can have the extension .yaml or .yml.
Key-Value Pair
Data in its simplest form is a key-value pair, and that is how it is defined in YAML: a key and a value separated by a colon (the colon must be followed by a space).
name: Server1
The key above is name.
The value is Server1.
A YAML file is basically a collection of one or more key-value pairs, where the key is typically a string (quotation marks are not required) and the value is a string or a more complex data structure such as a list, a dictionary, or a list of dictionaries.
Sequence/Array/List
The array name is followed by a colon, and then each item goes on its own line with a dash in front:
Servers:
- Server1
- Server2
...
- ServerN
The dash indicates that it's an element of an array.
In the example above we actually have a key-value pair where the value is a list. As a YAML document is a collection of key-value pairs, we can have a file like this:
foo.yaml:
Servers:
- Server1
- Server2
DataCentres:
- London
- Frankfurt
This style of writing sequences is called block style. Sequences can also be written in flow style, where the elements are separated by commas and enclosed in square brackets:
Servers: [Server1, Server2]
DataCentres: [London, Frankfurt]
Dictionary (Map)
A dictionary is a set of properties grouped together under an item.
Technically, the example below is a key-value pair where the key is the name of the dictionary and the value is the dictionary itself:
Server1:
  name: Server1
  owner: John
  created: 123456
  status: active
Notice the blank space before each property. There must be an equal number of spaces before the properties of a single item so that they are all aligned, meaning that they are siblings of each other and children of their parent, which is the key Server1.
The number of indentation spaces before these properties doesn't matter, as long as there is at least one. But that number must be the same for all of them, as they are siblings.
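As a minimal sketch of that rule, the fragment below indents one dictionary by two spaces and another by four. Both are valid because the siblings within each dictionary line up. Server2 and its values are hypothetical, added only for illustration.

```yaml
# Two spaces per level: all siblings share the same indentation.
Server1:
  name: Server1
  owner: John

# Four spaces per level: equally valid, as long as the siblings line up.
Server2:
    name: Server2
    owner: Jack
```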
Why does YAML forbid tabs?
Tabs have been outlawed since they are treated differently by different editors and tools. And since indentation is so critical to proper interpretation of YAML, this issue is just too tricky to even attempt. Indeed Guido van Rossum of Python has acknowledged that allowing TABs in Python source is a headache for many people and that were he to design Python again, he would forbid them. [YAML Ain't Markup Language]
The number of spaces before each property indicates that these key-value pairs fall within Server1.
Let's suppose that we have:
Server1:
  name: Server1
  owner: John
    created: 123456
  status: active
In this case created has more spaces on the left than owner, so it is now a child of the owner property instead of being its sibling, which is incorrect.
Also, these properties must be indented by more spaces than their parent, which is Server1.
What if we had extra spaces for created and status?
Server1:
  name: Server1
  owner: John
    created: 123456
    status: active
Then they will fall under owner and thus become properties of owner. This results in a syntax error telling you that mapping values are not allowed here, because owner already has a value set, which is John.
For the value of a key-value pair we can set either a direct value or a hash map, but not both. This is why the number of spaces before each property is key in YAML.
Complex Data Types
A list containing dictionaries
We have here a key-value pair where the value is a list of key-value pairs, each of which has a dictionary as its value (we can say that we have a list of dictionaries where each dictionary has a name):
Servers:
  - Server1:
      name: Server1
      owner: John
      created: 123456
      status: active
  - Server2:
      name: Server2
      owner: Jack
      created: 789012
      status: shutdown
We have here a list of servers and the elements of the list are key-value pairs Server1 and Server2.
Their values are dictionaries containing server information.
We can also have a list of (unnamed) dictionaries, where each element of the list is not a key-value pair but a dictionary itself:
Servers:
  - name: Server1
    owner: John
    created: 123456
    status: active
  - name: Server2
    owner: Jack
    created: 789012
    status: shutdown
A list containing dictionaries that contain lists
Servers:
  - Server1:
      name: Server1
      owner: John
      created: 123456
      status: active
      applications:
        - web server
        - authentication database
  - Server2:
      name: Server2
      owner: Jack
      created: 789012
      status: shutdown
      applications:
        - caching database
When to use a list, dictionary and list of dictionaries?
Use a dictionary if you need to represent information about, or multiple properties of, a single object.
A dictionary is a collection of key-value pairs grouped together:
name: Server1
owner: John
created: 123456
status: active
In case we need to split the owner further into name and surname, we could then represent this as a dictionary within another dictionary.
name: Server1
owner:
  name: John
  surname: Smith
created: 123456
status: active
In this case the single value of owner is now replaced by a small dictionary with two properties, name and surname. So this is a dictionary within another dictionary.
Use a list/array to represent multiple items of the same type of object.
For example, that type could be a string.
We have here a key-value pair where the value is a list of strings:
Servers:
- Server1
- Server2
What if we would like to store all the information about each server? We'll expand each item in the array and replace the name with a dictionary. This way we are able to represent all the information about multiple servers in a single YAML file using a list of dictionaries.
We have here a key-value pair where the value is a list of dictionaries:
Servers:
  - name: Server1
    owner: John
    created: 123456
    status: active
  - name: Server2
    owner: Jack
    created: 789012
    status: shutdown
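When does the order of items matter?
A dictionary is an unordered collection.
Lists/arrays are ordered collections.
A dictionary is the same as another dictionary that holds the same key-value pairs (such as owner: John) in a different order.
But a list is not the same as a list that holds the same items in a different order.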
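As a sketch of this ordering rule, the fragment below places two equal dictionaries next to two unequal lists. The key names first, second, list_a and list_b are hypothetical and exist only to hold the examples.

```yaml
# Same key-value pairs in a different order: these two dictionaries are equal.
first:
  name: Server1
  owner: John
second:
  owner: John
  name: Server1

# Same items in a different order: these two lists are NOT equal.
list_a:
  - Server1
  - Server2
list_b:
  - Server2
  - Server1
```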
Comments in YAML
Any line beginning with a hash (#) is treated as a comment and ignored by the parser. A hash that follows whitespace partway through a line also starts a comment.
# List of servers
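A short sketch of where comments can appear, assuming a parser that follows the YAML specification. The label key and its value are hypothetical, included only to show a hash that is not treated as a comment.

```yaml
# A full-line comment
Servers:
  - Server1  # an inline comment: the hash is preceded by a space
  - Server2
label: Server#1  # the first hash has no space before it, so it is part of the value
```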
How to break long string values into multiple lines?
Use:
- the vertical bar (|), or pipe character, which preserves the line breaks
- the greater-than character (>), which folds the line breaks and converts them into spaces
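The two styles can be sketched as follows. The key names literal_example and folded_example are hypothetical, and the resulting values shown in the comments assume the default handling of the trailing line break (a single final newline is kept).

```yaml
# Literal style (|): line breaks are preserved.
# Resulting value: "First line\nSecond line\n"
literal_example: |
  First line
  Second line

# Folded style (>): single line breaks are converted into spaces.
# Resulting value: "First line Second line\n"
folded_example: >
  First line
  Second line
```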
References: