Learn how to parse and manipulate YAML files more efficiently using yq, a command-line utility, and this simple cheat sheet.
Today, YAML is used to configure almost anything (for better or worse), so whether you are a DevOps engineer working with Kubernetes or Ansible, or a developer configuring logging in Python or CI/CD with GitHub Actions, you have to deal with YAML files at least from time to time. Being able to query and manipulate YAML efficiently is therefore a necessary skill for all of us engineers. The best way to learn it is by mastering a YAML processing tool such as yq, which can make you more efficient at many daily tasks, from simple lookups to complex manipulations. So let’s go through everything yq has to offer, including traversing, selecting, sorting, reducing and more!
Before we start using yq, we first have to install it. When you google yq, however, you will find two projects/repositories. The first of them, at https://github.com/kislyuk/yq, is a wrapper around jq, the JSON processor. If you are already familiar with jq, you may want to grab this one and use the syntax you already know. In this article, however, we use the other, slightly more popular project at https://github.com/mikefarah/yq. This version does not match jq syntax 100%, but its advantage is that it is dependency-free (it does not depend on jq). For more information on the differences, see the GitHub issue that compares the two.
To install it, go to the documentation and choose the appropriate installation method for your system. Make sure you install version 4, as that is what we work with here. You may also want to set up shell completion, which is described at https://mikefarah.gitbook.io/yq/commands/shell-completion.
Now that it is installed, we also need some YAML file or document to test our commands against. For this we use a file that has all the common things you would find in YAML: attributes (regular and nested), arrays, and different value types (strings, integers, and booleans).
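As a stand-in for such a file, here is a minimal sketch. The file name sample.yaml and every key and value in it are assumptions made for illustration, not the article's original document.

```shell
# sample.yaml is a hypothetical test document with regular and nested
# attributes, an array, and string, integer and boolean values.
cat > sample.yaml <<'EOF'
user:
  name: John Doe
  age: 30
  active: true
  address:
    street: Some Street
    number: 123
  orders:
    - Book
    - Laptop
    - Coffee
logging:
  level: info
EOF

cat sample.yaml
```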
With that out of the way, let’s learn the basics!
All the commands we will run start with the same base, yq eval, followed by a quoted expression and your YAML file. One exception is pretty-printing a YAML file, in which case you omit the expression, for example yq eval some.yaml. If you are familiar with jq, this is the equivalent of running cat some.json | jq . on a JSON file.
We can also add some extra flags. Some of the more useful ones are -C to force colored output, -I n to set the output indent to n spaces, and -P for pretty-printing.
There are many things we can do with basic expressions, but the most common is traversing YAML, in other words looking up some key in a YAML document. This is done using the . (dot) operator, chaining one key after another.
In addition to basic map navigation, you often want to retrieve a specific element from an array (using an index in square brackets, for example [0]).
And finally, you may also find the splat operator useful, which flattens maps/arrays (note how its output differs from plain traversal to the same key).
Beyond basic traversal, you may also want to explore selecting, which lets you filter by boolean expressions. For this we use select(. == "some-pattern-here"). A simple use is filtering array items by a leading pattern. Such a query also makes use of the pipe (|): we first navigate to the part of the document we want to filter and then pass it on to select.
So far we used == to find elements that are equal to the pattern, but you can also use != to match those that are not equal. In addition, you can leave out the select function entirely, in which case you get only the match results (true or false) instead of the values.
Whether you are brand new to yq or have been using it for a while, you are bound to run into situations where you have no idea why your query isn’t returning what you want. In these situations you can use the -v flag to produce verbose output, which may give you a hint as to why the query behaves the way it does.
The previous section covered the basics, which are often sufficient for quick lookups and filtering, but sometimes you may want more advanced functions and operators, for example when automating tasks that involve YAML input and/or output. So let’s look at a few more things yq has to offer.
Sometimes it can be useful to sort the keys in a document, for example if you version your YAML files in git, or just for general readability. It is also very handy if you want to diff 2 YAML files. For this we can use sort_keys.
If the YAML document you receive is dynamic and you’re not sure which keys it contains, it may make sense to first check for their presence using has("key"). You may also need to obtain the list of keys dynamically before performing certain operations on a document, for which you can use keys.
Checking the length of a value may be necessary to filter or validate inputs, or to ensure that a value does not exceed some predefined limit. This is done using length.
For automation tasks with parameterized inputs, you need to be able to pass environment variables into yq queries. You can of course use ordinary shell variable expansion, but you end up with cumbersome and hard-to-read quoting. It may therefore be better to use the env() function instead.
To simplify the handling of some fields or arrays, you can also use string functions such as split and join, which break text apart or concatenate it.
The last and probably most complex example in this section is data transformation using ireduce. To have a good reason to use this function you would need a fairly complicated YAML document, which I don’t want to dump here. So instead, to at least give you an idea of how the function works, we use it to implement a “poor man’s” version of join from the previous example.
This is not as self-explanatory as the previous ones, so let’s break it down a bit. The first part of the query (.user.orders[] as $item ireduce) takes some iterable field (a sequence) from the YAML and assigns each of its elements to a variable, in this case $item. The second part defines the initial value, "" (an empty string), and an expression that is evaluated for each $item. Here that is the previous value joined with a space ((. + " ")), followed by the item currently being iterated over ($item).
Most of the time you only need to look up and filter existing documents, but from time to time you may also need to manipulate YAML files and create new ones. yq offers a couple of operators for such tasks, so let’s take a brief look at them with a few examples.
The simplest of these is the union operator, which is really just a , (comma). It lets us combine the results of multiple queries. This can be useful if you need to extract several parts of a YAML document at the same time but cannot do it with a single query.
Another fairly common use case is appending an element to an array or concatenating 2 arrays. This is done with the + (plus) operator.
Another handy one is the update operator (=), which (surprise, surprise) updates a field. A very simple use is updating the log level in the sample YAML.
It is important to note that by default the result is sent to standard output and not written to the original file. To update the file in place, you must use the -i flag.
There are a few more operators available, but they aren’t particularly useful (most of the time), so rather than showing you a bunch of examples that probably won’t help you, I’ll just point you to the yq operator documentation in case you want to dig a little deeper.
Now that we know the theory, let’s look at some examples and handy commands that you can incorporate into your workflow right away.
For obvious reasons we’ll start with Kubernetes, as it is probably the most popular project that uses YAML for configuration. The simplest but very useful thing yq can help us with is pretty-printing Kubernetes resources or querying specific sections of a manifest.
Another thing we can do is list resource names together with a specific attribute. This can be useful for finding or dumping the listening ports of all Services, for example, or for going through each Pod in a namespace.
Note that we have to go through .items here, because when you get all instances of a resource, the returned kind is List, with the individual resources nested under items.
For resources such as Pods, Deployments, or Services, which often have many instances in each namespace, you may not want to dump all of them to the console and filter them by hand. Instead, you can filter them by some attribute, for example listing the name and listening port only for Services exposed on a specific port.
As Kubernetes YAML engineers, we sometimes find it hard to remember all the fields of a particular resource, so why not just query for all of its keys?
Moving on from Kubernetes, how about some docker-compose? Maybe you need to temporarily remove some section, such as healthcheck. Well, here you go (this is destructive, so be careful).
In the same way, you can also delete a task from an Ansible playbook. Speaking of which, what about changing remote_user throughout an Ansible playbook?
I hope this “crash course” helps you get started with yq, but like any tool, you only learn to use it by practicing and doing real-life tasks with it. So the next time you need to look something up in a YAML file, don’t just dump it into the terminal, but rather write a yq query to do the work for you. Also, if you are struggling to come up with a query for your particular task and a Google search doesn’t turn up anything useful, try searching for a solution that uses jq instead. The query syntax is almost the same, and you may have better luck finding jq solutions, as it is the more popular and commonly used tool.