There are common DevOps industry issues, though we all want the same thing: to deliver value to our customers. Developers do this by implementing creative solutions in code. Ops teams deliver value by ensuring that the environment is always running, regardless of what the dev team does.
Your quality assurance team uses the product in ways no one ever thought of. Your security team makes sure that your customers are protected from everyone.
Everyone seems to be at odds with each other, blaming one another for problems that arise. If we all want the same thing, shouldn’t we be working together in a more cohesive way?
I don’t think anyone starts out like this. We just sort of find ourselves here. Nothing describes the problems better than the book The Phoenix Project.
Using Automation to Solve Common DevOps Industry Issues
Why do we automate?
We implement automation in software delivery for multiple reasons. A shortlist includes:
Accountability
A common phrase heard in software shops is “it works on my machine.” The problem with this is that we are not shipping your machine to the customer. We need it to work when it isn’t on your machine.
Once we know what environment enables it to run on your machine, we can set up other machines (other developers’ machines, the test environment, and the running production environment) in the same way.
If things break because you introduced a new requirement, we can detect this change in the automated build process, know whose build failed because of it, and determine how to automate this new requirement into production.
NOTE: none of this is about assigning blame; it is about everyone taking ownership of the product. Accountability means knowing who to ask for more information, instead of wasting time holding a meeting.
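To make this concrete, here is a rough sketch of a build gate that runs the checks and, when one fails, reports who made the last change so you know who to ask. It assumes a git repository and a pytest suite; the step names and commands are placeholders, not a prescribed setup.

```python
"""Minimal build gate: run the checks and, on failure, report who to ask.

Assumes a git repository and a pytest suite; adjust the commands to your stack.
"""
import subprocess
import sys


def last_commit_author() -> str:
    # Ask git who made the most recent change; this is who to talk to, not who to blame.
    result = subprocess.run(
        ["git", "log", "-1", "--pretty=format:%an <%ae>"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def run_step(name: str, command: list[str]) -> None:
    print(f"=== {name} ===")
    completed = subprocess.run(command)
    if completed.returncode != 0:
        print(f"Step '{name}' failed. Last change was by {last_commit_author()}.")
        sys.exit(completed.returncode)


if __name__ == "__main__":
    run_step("unit tests", [sys.executable, "-m", "pytest", "-q"])
    run_step("packaging", [sys.executable, "-m", "build"])  # assumes the 'build' package is installed
    print("Build succeeded.")
```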
Reduce DevOps Industry Issues Caused by Human Error
Humans are prone to error, especially when doing things that have lots of steps. Deployments can be complex processes, with many moving pieces that require precise orchestration.
In an effort to reduce cycle times (how long it takes to know whether a change produces value), we want to get as many changes to the customer as fast as possible. Organizations such as Netflix deploy at least three times a day.
Doing this multiple times a day is not feasible when humans are asked to do every step. Humans are good at solving complex problems. Computers are good at doing what they’re told. Let’s use our strengths.
Repeatability/Predictability
When a process is laid out, it can be followed the same way, every time, and we know what the outcome should be each time. Once established, we can fire and forget, only to be notified when it finishes, an error occurs, or some other human intervention is required.
Computers and machines are best equipped for doing the same thing over and over, and can do it faster, more efficiently, and more reliably than humans can. Get the most out of your organization by relieving your smart people of dumb tasks.
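A minimal sketch of what "laid out once, followed the same way every time" can look like: the steps are written down as data, every run executes them in the same order, and a human only hears about it when the run finishes or fails. The step functions and notify() hook are placeholders for your real build, test, deploy, and alerting pieces.

```python
"""A repeatable process: the same steps, in the same order, every run.

The steps and notify() below are placeholders; swap in your real build,
test, and deploy actions and your real alerting (email, chat webhook, ...).
"""


def build() -> None:
    print("building artifact...")


def run_tests() -> None:
    print("running test suite...")


def deploy() -> None:
    print("deploying to the target environment...")


def notify(message: str) -> None:
    # Placeholder: send to email, chat, a pager, etc.
    print(f"NOTIFY: {message}")


STEPS = [("build", build), ("test", run_tests), ("deploy", deploy)]

if __name__ == "__main__":
    for name, step in STEPS:
        try:
            step()
        except Exception as exc:  # a failure is the only time a human gets involved
            notify(f"step '{name}' failed: {exc}")
            raise
    notify("pipeline finished successfully")
```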
Lower Cycle Times
How long it takes to know whether a change will provide value is an important metric. Many organizations don’t know if a change will provide value until three years after it was started. Or if they do know, they can’t start receiving the benefits until the project is finished. This is typically observed in waterfall-style delivery, where nothing is deployed until it is complete.
The thing about software is that it is never complete. The Agile software process, while many balk at it, is meant to break work down into the smallest possible changes and deploy them as quickly as possible. Many people try to work in an agile fashion but deploy with a waterfall policy. This causes grief and leads people to blame agile. To really get the benefit, we need to implement an agile work policy and a rapid deployment pipeline that gets those agile changes into production.
Other Solutions for DevOps Industry Issues
Deployment is just part of the equation. We want everyone to own the product. From the developers to the operations team. There are many overlapping areas and concepts that help achieve this and ensure that your product is stable and predictable. Remember, your product is more than just the code. It is also the environment in which it is running.
- Configuration/Infrastructure as Code (CaC/IaC)
- Source Control
- Security
- Database concerns
- Quality Assurance (QA) and Automated Testing
- Test Driven Development (TDD)
- Virtualization/Containerization
Configuration as Code (CaC) & Infrastructure as Code (IaC)
We need to realize and understand that our product is more than the executable and database running in production. Our product is also the environment in which the executable runs. This includes the virtual machines, cloud resources, load balancers, on-premise integrations, and any other hardware (whether on-prem, in the cloud, or a combination of the two).
If needed, are you able to completely stand up your production environment? Do you have any doubts about whether it will be configured correctly? How long would it take? When was the last time you practiced?
These questions lead to the idea that the infrastructure is just as important as the code it runs. And if we needed to run an earlier version of the code, could we recreate the same infrastructure and configuration it used?
The ability to define your infrastructure as files that machines can interpret and use to create what you need brings all the benefits of reliability and predictability mentioned earlier. The same is true for your configuration.
You can even version these files and tie them to releases, so if you need a previous version, you also have the infrastructure to support it. Couple this with your deployment tools running these CaC/IaC files, and you can reliably ensure you have the infrastructure you need every time.
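As one concrete flavor of this, tools such as Pulumi let you describe infrastructure in an ordinary programming language and keep those files in source control next to your code. The sketch below is a minimal Pulumi program in Python; it assumes an AWS account and the pulumi and pulumi-aws packages, and the bucket (and its release tag) is just a stand-in for whatever resources your product actually needs.

```python
"""Infrastructure as Code: a minimal Pulumi program.

Assumes AWS credentials and the pulumi / pulumi-aws packages; run with
`pulumi up`. The file lives in source control and is versioned with the
release it supports."""
import pulumi
import pulumi_aws as aws

# A stand-in resource: an S3 bucket tagged with the (illustrative) release it belongs to.
artifact_bucket = aws.s3.Bucket(
    "artifact-bucket",
    tags={"release": "1.4.2", "managed-by": "pulumi"},
)

# Export the generated name so deployment tooling can find it later.
pulumi.export("artifact_bucket_name", artifact_bucket.id)
```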
Source Control
Anything can happen to your computer. It can be stolen, damaged by a power surge, or just a general hardware failure. Source control allows you to back up your changes on a remote machine.
Another added benefit is that it becomes easier to share your code and changes with others. When working with a group, each person has a copy of what is on the remote machine. They work locally and submit their changes to the server, where everyone can pull them down and work on them.
There are many ways to manage these changes, and I recommend reading about the different branching strategies. Feature branching is the most commonly recommended. However, don’t get caught up on best practices. Find what works best for your organization and policies, or supports your goals the best.
Security
Security is something that is often lacking in the technology sector. As things become more and more convenient, we pay for it with the lack of security that convenience introduces; the tradeoff of convenience is security. When implementing new and innovative ideas, we need to ask how an attacker could use them to their advantage. Your product may not be the piece that holds the bank account, but it might be the one that grants access to it. Security is everyone’s responsibility.
DevOps Industry Issues Caused by Database Deployments
Databases in production are fickle creatures. They cannot be completely blown away and replaced in the same manner as every other piece in the pipeline. Because of this, care must be taken not to destroy live data.
It is a good idea to practice deploying/upgrading your database components in a staging environment housing a sanitized backup of your production data. Taking a backup, sanitizing it, and then restoring it to a staging environment should be done as part of your stage-specific deployment process. This allows you to test and find issues in your upgrade process before they can negatively affect your production data.
Some tools to look at include DbUp and Redgate. Tools like these allow you to treat your database updates as standard code files, allowing you to follow the same peer-review process as you do for any other code file. This lets developers of all backgrounds participate in the database development lifecycle.
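The core idea behind these tools can be sketched in a few lines: migration scripts are plain files, applied in a fixed order, with a journal table recording which ones have already run. The toy runner below uses SQLite so it is self-contained; real tools such as DbUp or Redgate's offerings add transactions, rollbacks, and multi-engine support.

```python
"""A toy migration runner: apply .sql scripts in order, record what has run.

Uses SQLite so it is self-contained; real tools (DbUp, Redgate, etc.)
add transactions, rollbacks, and multi-engine support.
"""
import sqlite3
from pathlib import Path


def apply_migrations(db_path: str, scripts_dir: str) -> None:
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_journal ("
        "script TEXT PRIMARY KEY, applied_at TEXT DEFAULT CURRENT_TIMESTAMP)"
    )
    already_applied = {row[0] for row in conn.execute("SELECT script FROM schema_journal")}

    # Scripts are ordered by filename, e.g. 001_create_users.sql, 002_add_email.sql
    for script in sorted(Path(scripts_dir).glob("*.sql")):
        if script.name in already_applied:
            continue
        print(f"applying {script.name}")
        conn.executescript(script.read_text())
        conn.execute("INSERT INTO schema_journal (script) VALUES (?)", (script.name,))
        conn.commit()
    conn.close()


if __name__ == "__main__":
    apply_migrations("app.db", "migrations")
```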
Quality Assurance & Automated Testing
Quality Assurance is an important part of your development and deployment team. These people typically know the software better than anyone else. They know the nuances and unique behaviors of the software, and they can answer almost any question about how it is used.
Let’s not waste their insights on running the same suite of tests every time a new build is ready for release. As we discussed, we want to deploy small changes, as many times as possible. Even a few times a day.
We also want our changes to be stable and not introduce catastrophic failure, let alone re-introduce a bug that was already fixed. We can achieve this by using the computer’s strengths. Computers can run any test and analyze the results faster than any human. Get the computer to run the tests for each build.
Let the developer know when they re-introduce a bug that had already been fixed. When a new bug is fixed, the developer needs to write an automated test to ensure it doesn’t show its nasty head again.
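A regression test for a fixed bug can be as small as the pytest-style sketch below; the parse_price function and the bug number are hypothetical.

```python
# test_pricing.py - a regression test pinned to a fixed bug (names are hypothetical).
from pricing import parse_price


def test_parse_price_handles_thousands_separator():
    """Bug #123: '1,299.99' used to be parsed as 1.0. Keep it fixed."""
    assert parse_price("1,299.99") == 1299.99
```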
As for QA, let them be creative. Given the same access a user would have, how can they get the software to respond in undesirable ways? Desired behavior is a business decision. Any newly discovered behavior could be viewed as a feature and a new marketing point for the sales team. Just because it was unexpected when discovered doesn’t mean more work needs to be injected. Product owners should be the ones to make that decision.
Test Driven Development (TDD)
Test Driven Development takes our automated testing to the next level. It is a concept where, before any code is written, a test is written to validate the functionality of the new code. This means that goals need to be determined and set before any work on the functionality begins.
Many tests can be written and viewed as milestones, and the more tests covering the functionality, the better. Tests run relatively quickly, especially when automated. The resources spent writing the tests upfront will save you more than the resources spent trying to debug and fix the code when it inevitably breaks in the future.
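In practice, the rhythm is: write the test, watch it fail, then write just enough code to make it pass, then refactor. A hypothetical example (shown in one file for brevity; in a real project the test and the code would live in separate files):

```python
# Step 1: the test is written first. At that point it fails, because slugify() does not exist yet.
def test_slugify_replaces_spaces_and_lowercases():
    assert slugify("Hello World") == "hello-world"


# Step 2: write just enough code to make the test pass, then refactor with the test as a safety net.
def slugify(title: str) -> str:
    return title.strip().lower().replace(" ", "-")
```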
Virtualization & Containerization
Virtualization is when you run computers inside other computers. A virtual machine is an entire computer, with its own CPUs, memory, and storage drawn from a pool shared with other virtual machines (and the host). This is great and useful when you need multiple computers to do different things.
An example is one virtual machine for a database server and another for the web server. These machines have different roles and, thus, different resource requirements (along with different security and other ops concerns). The ability to have this infrastructure pool resources is made possible by virtualization.
Recently, this idea has been taken to the next level with containerization. Containerization still pools hardware resources, but it also shares some software resources: the host operating system kernel, file system, network ports, and other kernel-level resources. The benefit is that containers are very fast to spin up and down. We are no longer booting an entire computer; we are just starting an isolated process on an operating system that is already running.
Most containers are slimmed-down versions of a Linux distribution, meaning they are light and fast, though Windows containers do exist. The speed at which these containers start allows for rapid failover, fault recovery, and dynamic horizontal scaling (adding more instances to process requests).
Containers are also useful because they are self-contained environments that can be built and deployed the same way in every stage of our deployment. This way you know that what was tested is the same that is being deployed.
Docker is the recommended toolchain to use when implementing containerization. There is a large ecosystem surrounding Docker, especially around orchestrating automated failover and rolling deployments. A popular orchestration tool is Kubernetes (K8s).
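As a small illustration of how quickly a container comes up, the sketch below uses Docker's Python SDK to start a throwaway Alpine container, run one command, and clean it up. It assumes Docker is installed and running locally and that the docker package is installed; the image tag is just an example.

```python
"""Spin up a throwaway container (assumes Docker is running locally and the
`docker` Python package is installed: pip install docker)."""
import time

import docker

client = docker.from_env()

start = time.perf_counter()
# Runs `echo` inside a small Alpine Linux image and removes the container afterwards.
# Note: the very first run also pulls the image, which adds time.
output = client.containers.run("alpine:3.19", ["echo", "hello from a container"], remove=True)
elapsed = time.perf_counter() - start

print(output.decode().strip())
print(f"container ran and exited in {elapsed:.2f} seconds")
```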
Some Tools to Solve Common DevOps Industry Issues
Tools used to automate software builds:
Tools used to automate software deployments:
Final Thoughts on DevOps Industry Issues
This was written to give someone new to the software industry context about some of the problems found in it, along with some of the tools used to help implement solutions, policies, and concepts. It is intended for someone who has little to no experience in this industry.
Remember, software development is like an organism, always growing and changing. What works for one may not work for another (even within the same organization). Each project is different.
Find what fits your needs, goals, and processes the best, and change them as needed. Everyone owns the process. Change is inevitable, so don’t fear change, practice change.