I must admit, build scripts are not my favorite part of a project. They tend to be poorly maintained, long and tricky to debug locally. However, with a little bit of patience, perseverance, a little bit of sweat, a generous dose of prayer, and some trial and error, CI/CD scripts can be clean and beautiful. The examples here are from Azure’s build system, but the principles should apply to most build scripts.
How it all began
In the good old days of monolithic architecture, teams would write all their features, APIs and millions and millions of lines of code in a single repository. There was one piece of code to rule them all: the build script. Life was good. The DevOps team (or its early equivalent) made infrequent changes, sipping coffee and discussing episodes of Dallas while watching the code slowly build on the server.
Fast forward to 2020, and things couldn’t be more different. Nobody has heard of the show Dallas and the most popular streamed show is some Korean drama. Space and cloud computing are so cheap and so easy that creating and deploying a new service takes a few clicks and less than two minutes. In the new microservice architecture, each feature or set of APIs can be its own group of server instances. In agile, feature-driven software development, the ease of cloning or spawning new services can multiply the number of services in a very short time. It’s not long before you have a large number of repositories, each with its own build needs to manage.
How did we get here
We’ve all been here. It is a Friday afternoon. Several large features from two teams must be deployed before the weekend; next week’s large marketing promo depends on it. And so your life depends on it. Toni, who regularly does deployments, has conveniently taken leave today. So the honor has gone to you. You got this! There is a little catch: the CTO’s golf buddy has offered free space on their new cloudspace.com. So you need to modify the scripts. You look at one and panic begins to seep in.
Alone and forgotten
How’d it get so messy, when all the rest of the code (let’s just assume, but I hear you readers laughing) is so pretty and beautifully formatted? Well, build scripts often lack developer attention and linter love. They are quickly copied from some template at the beginning so that projects can be built and spun up as fast as possible. As the project grows, environments are added and removed, and build stages and triggers get tacked on, modified, and copied and pasted multiple times. The result is a Frankensteinian script that most devs stay away from, not wanting to risk breaking the build.
Figure 1 above illustrates a build script. It typically includes environment definitions, OS versions, triggers and all the steps required to build and deploy the project. In general, a build follows these steps:
- Trigger on some event, then check out the project
- Get dependencies
- Do some linting/formatting
- Run unit tests
- Run the project build
- Push image to repository
- Done and Reporting
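To make the steps above concrete, here is a rough sketch of how they often end up in a single Azure pipeline file. It assumes a Go project and a hypothetical image name; the commands are illustrative only, not taken from a real project.

```yaml
# Sketch of a single-file pipeline (hypothetical Go service).
trigger:
  branches:
    include:
      - main

pool:
  vmImage: ubuntu-latest

variables:
  imageName: example.azurecr.io/my-service   # hypothetical registry/image

steps:
  - checkout: self

  - script: go mod download
    displayName: Get dependencies

  - script: go vet ./...
    displayName: Lint

  - script: go test ./...
    displayName: Run unit tests

  - script: go build ./...
    displayName: Build the project

  # Assumes the agent is already logged in to the registry.
  - script: |
      docker build -t $(imageName):$(Build.BuildId) .
      docker push $(imageName):$(Build.BuildId)
    displayName: Push image to repository
```

Multiply this by every environment and trigger, and it is easy to see how the file balloons.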
All the steps above will often need to be run in multiple environments. You will definitely need to run in a development environment, probably one or several staging environments for testing and qualification, and certainly a production environment. Invariably, all this is stuffed into one large yaml called
Fortunately, there are a few tools available to help check our bloated scripts. Or perhaps, in unspoken acknowledgement that we need help, these tools were created.
Azure has a good extension by Microsoft for the Visual Studio Code editor that helps with the YAML. The Visual Studio Code extension marketplace offers plenty of choices in addition to the official one. Bitbucket and GitLab have web interfaces for their YAML validators. Just copy and paste your YAML and it will (hopefully) detect errors in your script.
GitHub is a bit unique in this instance. Their developer-friendly marketplace has plenty of free automation in the form of GitHub Actions, plus many applications that you can use to scan, test, build and deploy your project (have I mentioned they are free?).
There is a better way
Is there a better way? The answer is a loud and emphatic yes! As shown in Figure 1, a lot of the steps are repeated for each environment or trigger. It seems logical to group the repeated steps into separate scripts.
Broken down by logical function, your steps may distill into a set of smaller scripts stored in the project’s build/ directory, like so:
Most pipelines can load other scripts. In Azure you can use the template feature. In GitLab you can use the include keyword to reference external files.
pipelines.yml may now look something like:
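Here is a minimal sketch for Azure. The template file names under build/ are assumptions borrowed from the script names used later in this article, not a listing from a real project.

```yaml
# pipelines.yml (sketch): each template file holds its own "steps:" list.
trigger:
  - main

pool:
  vmImage: ubuntu-latest

steps:
  - template: build/getDependencies.yml   # hypothetical file names
  - template: build/doTests.yml
  - template: build/doBuild.yml
  - template: build/doBuildDocker.yml
```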
Your main pipeline is now much simpler, shorter and more readable. You can easily see the flow of builds and steps without scrolling through screens. “But wait!” I can hear you say, “nothing is that simple, what happens to all the variations?” Figure 3 is admittedly cleaned up for clarity. In actuality, you can send all the different variations through parameters or variables so that your test and build scripts become modular. On Azure it may look like:
- template: azure/go/doBuild.yml
Therefore, you can build generic scripts that cater to all the different build requirements you may have.
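As a sketch of passing variations into a template on Azure: the parameter names environment and goVersion below are hypothetical, and each would need to be declared in the template itself.

```yaml
steps:
  - template: azure/go/doBuild.yml
    parameters:
      environment: staging   # hypothetical parameter
      goVersion: '1.21'      # hypothetical parameter
```

The same template can then be reused for production by changing only the parameter values.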
In GitLab, use the include keyword to include external YAML files in your CI/CD configuration. You can break down one long .gitlab-ci.yml into multiple files to increase readability, or reduce duplication of the same configuration in multiple places.
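A small sketch of what that could look like in GitLab, assuming hypothetical files under build/ that each define their own jobs:

```yaml
# .gitlab-ci.yml (sketch): the included file names are hypothetical.
include:
  - local: '/build/getDependencies.yml'
  - local: '/build/doTests.yml'
  - local: '/build/doBuild.yml'

stages:
  - test
  - build
```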
Any changes you need to make, for instance to your docker build parameters, can be performed just once in its file, doBuildDocker.yml, and the changes will propagate through all the environments. Life is good!
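For illustration, a parameterized doBuildDocker.yml might look roughly like this on Azure. The parameter names and the docker command line are assumptions, not the contents of a real file.

```yaml
# doBuildDocker.yml (sketch): a step template with declared parameters.
parameters:
  - name: imageName        # hypothetical parameter
    type: string
  - name: tag              # hypothetical parameter
    type: string
    default: latest
  - name: dockerfile       # hypothetical parameter
    type: string
    default: Dockerfile

steps:
  - script: |
      docker build -f ${{ parameters.dockerfile }} -t ${{ parameters.imageName }}:${{ parameters.tag }} .
    displayName: Build Docker image ${{ parameters.imageName }}
```

Changing the build command here changes it for every pipeline and environment that references the template.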
In a versioned repository, when a change needs to be made, you would first pull the latest version from the main branch, then create a new branch, make your modifications, push your updates and ask for a code review before finally requesting a merge back to the main branch. When your changes are infrequent or limited to just a few repositories, this is not a problem.
When you need to make large-scale changes, say to move providers, multiple pipelines need updating. The above process of pulls, pushes, code reviews and merges, involving a large number of resources, then becomes very disruptive and very slow. Even just 10 instances is not something for a late Friday afternoon.
An even better way
What if you could remove the build scripts completely from the main project repository? That way, changes to the build steps would not require changing and committing to the project repository. This also protects the build scripts from unintentional modifications that could cause breakage.
Your build scripts may be located in github/yourorganization/pipelines, with a structure something like:
│ ├── doBuildDocker.yml
│ └── doLoginDocker.yml
│ ├── doBuild.yml
│ ├── doGenericScript.yml
│ ├── doTests.yml
│ └── getDependencies.yml
│ ├── doBuild.yml
│ ├── doGenericScript.yml
│ └── doTest.yml
Then, in azure_pipeline.yml, you need to declare the external repository you want to use. In Azure it is a resource of type
- repository: repositoryname
Then it is similar to using templates, with an additional
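Put together, a hedged sketch of the declaration and its use on Azure could look like the following. The organization name, the service connection name and the branch are hypothetical; the alias after the @ must match the repository alias declared under resources.

```yaml
# Sketch: pull step templates from an external GitHub repository.
resources:
  repositories:
    - repository: repositoryname          # alias referenced after the @ below
      type: github
      name: yourorganization/pipelines    # hypothetical org/repo
      endpoint: my-github-connection      # hypothetical service connection name
      ref: refs/heads/main                # optional: branch to read templates from

steps:
  - template: azure/go/doBuild.yml@repositoryname
```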
In GitLab, you can mirror your external repository into GitLab itself. Then use
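On the GitLab side, a sketch using include with a project reference could look like this. The project path, branch and file path are all hypothetical.

```yaml
# .gitlab-ci.yml (sketch): include a file from another project on the same GitLab instance.
include:
  - project: 'yourgroup/pipelines'      # hypothetical mirrored project
    ref: main                           # hypothetical branch
    file: '/gitlab/go/doBuild.yml'      # hypothetical path inside that project
```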
Setup External Template
How do you set up an external resource for use in an Azure pipeline? The steps are simple, but the documentation is a bit lacking here.
- The first step is to create a service connection in your project. Click settings in the Project tab on the bottom menu of the left panel.
- Click on the service connection
- On the New Service dialog (see Figure 7), click the provider where your external templates are located. For instance, click GitHub if they are on GitHub.
- For GitHub, you need to generate a Personal Access Token to allow Azure CI pipelines to access the repository.
- Finally, note the endpoint name assigned to your new service. This name will be used in the pipeline declaration under
Note: Please declare your external resource/repository before creating your pipeline. Otherwise, the pipeline does not resolve the external templates.
The last word
By moving all build-related scripts to an external repository, and keeping a single generic script with the project, you have a single point from which to manage the build for all the services. You can guarantee no changes to the original project repository because all build changes are done elsewhere. A single update also updates all your services down the line. Time to brew that coffee and sit back.