The history of poisoning wells in times of conflict is an established one. Whether by cutting off access to wells or using them as a force multiplier for spreading disease, the town well has always been a significant attack vector.
In modern times, we can draw the analogy of a well to a script or API endpoint that initiates an automation that drives change into infrastructure, applications, and digital services. Most organizations—78% according to our upcoming State of Application Strategy 2023 report—employ a rich set of automations across IT to do just that. That should be no surprise given the prevalence of automation to drive changes into complex, hyperscale systems operated by Facebook, Twitter, and Amazon, among others.
That’s because, like the shared well of olden days, a single script can affect thousands of systems in a matter of minutes. In the before times, manual changes affecting the same number of systems might have taken days or even weeks. Automation is a force multiplier, allowing operations of all kinds to scale in ways that human beings could never achieve. It is the cornerstone of scaling processes, practices, and the business. Indeed, one can argue—as we did in Enterprise Architecture for Digital Business—that an organization cannot become a digital business without automation. It is one of the six key capabilities organizations need to build in order to successfully capitalize on data, adopt SRE operations, and infuse digital services with the ability to adapt through modern app delivery.
But the thing about automation is that, well, it’s automatic. Speed of change is one of the drivers for automation, after all, and once begun, the cascading changes driven across such systems are difficult, if not impossible, to intercept or stop.
You’d have to be living off-grid to not have heard about automation propagating unintended changes that, ultimately, impacted large swaths of the Internet. A bad parameter pushed into a script is nearly impossible to recall once the Enter key is pressed or the API endpoint is invoked. Once executed, the well has been poisoned.
This is not the first time I’ve raised the alarm with respect to the security of IT automation. It is an overlooked and underexplored attack vector that will, eventually, be exploited. And even if ‘eventually’ is decades away, the more immediate threat of human error remains extant. According to the latest Uptime Institute research, “nearly 40% of organizations have suffered a major outage caused by human error over the past three years.”
This is where AI—more correctly, ML—enters the room.
Machine learning is particularly adept at uncovering patterns and relationships between data points. Today, most of the market is focusing on the application of machine learning to solving security and operational challenges. This includes identifying whether a user is a bot or a human, recognizing attacks, and even predicting imminent outages.
An area often unexplored is the use of AI and ML to protect app infrastructure. For example, machine learning can be used to understand how operators and admins interact with critical systems and to immediately notice when an interaction deviates from the norm. This is useful for detecting attackers attempting to access directories they shouldn’t or invoke commands with parameters outside normal usage.
Read that last part again. Invoke commands with parameters outside normal usage.
Ah, there it is. There is nothing peculiar to security in the ability of AI, and machine learning in general, to detect anomalous parameters or an attempt to execute an unusual command. Which means this technology could just as easily be applied to IT automation to catch either human error or intentionally malicious commands.
Assuming the right level of access to target systems, such a machine learning solution could certainly offer a path to protecting systems against occasional bad parameters, lateral communication attempts, or any other attack. Ransomware, anyone?
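To make the idea concrete, here is a minimal sketch in Python of a guard that sits in front of an automation and holds any invocation that falls outside normal usage. It is not a trained machine learning model. It swaps in a simple statistical baseline (median and median absolute deviation) as a stand-in for one. The class name AutomationGuard, the scale_pool and drop_pool commands, the replicas parameter, and the 3.5 threshold are all hypothetical, chosen only for illustration.

```python
from collections import defaultdict
from statistics import median


class AutomationGuard:
    """Toy baseline of 'normal' automation calls (hypothetical, not a real product API)."""

    def __init__(self, threshold=3.5):
        self.threshold = threshold        # assumed cutoff for the robust z-score
        self.commands = set()             # commands seen during the learning period
        self.history = defaultdict(list)  # (command, parameter) -> past numeric values

    def learn(self, command, params):
        """Record one known-good invocation and its numeric parameters."""
        self.commands.add(command)
        for name, value in params.items():
            self.history[(command, name)].append(float(value))

    def check(self, command, params):
        """Return reasons to hold the change; an empty list means proceed."""
        if command not in self.commands:
            return [f"unusual command: {command}"]
        reasons = []
        for name, value in params.items():
            past = self.history.get((command, name))
            if not past:
                reasons.append(f"unseen parameter: {name}")
                continue
            mid = median(past)
            mad = median([abs(v - mid) for v in past])
            distance = abs(float(value) - mid)
            if mad == 0:
                # Every past value was identical, so any change is a deviation.
                anomalous = distance > 0
            else:
                # 0.6745 rescales the MAD so the score reads like a z-score.
                anomalous = 0.6745 * distance / mad > self.threshold
            if anomalous:
                reasons.append(f"anomalous parameter: {name}={value}")
        return reasons


guard = AutomationGuard()
for replicas in [4, 5, 6, 5, 4, 6, 5]:   # made-up history of normal runs
    guard.learn("scale_pool", {"replicas": replicas})

print(guard.check("scale_pool", {"replicas": 5}))    # within normal usage
print(guard.check("scale_pool", {"replicas": 500}))  # the fat-finger case
print(guard.check("drop_pool", {}))                  # a command never seen before
```

The design choice that matters is where the check runs: before the script executes or the API endpoint is invoked, because that is the only point at which a bad parameter can still be recalled. A real solution would learn from far richer signals than a single numeric parameter, but the shape of the decision (compare to the norm, then hold or proceed) is the same.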
Infrastructure—for apps, app delivery, and automation—is still an attractive attack vector. As organizations move to adopt more automation—and they are—they need to simultaneously consider the ramifications—accidental or intentional—of the use of that automation. From there, it’s necessary to consider how to protect it against the inevitable fat finger or malicious keystroke.
Automation is a force multiplier. Full stop. That means it’s useful for both intended and malicious use cases. Which implies a need to protect it. Machine learning may be one way to integrate AI with ops to protect the infrastructure that remains a vital component of a digital business.