musing at the confluence of data, software and security
by Earl Chen
CrowdStrike - Is It Code?
18 August 2024
There has already been considerable discussion about the CrowdStrike incident on 19 July 2024. There is finger pointing by CrowdStrike customers. Finger pointing between Microsoft and CrowdStrike. There is finger pointing by governments. Finger pointing back at governments. And sadly, also finger pointing by CrowdStrike and Microsoft at their customers. Laudably, CrowdStrike published a detailed root cause analysis and implemented a series of changes to mitigate the impact of future issues.
Unfortunately, the current discourse and even CrowdStrike’s own reporting have overlooked or downplayed a fundamental issue that requires more thoughtful consideration. On that fateful day in July, was the update that crashed 8.5 million Windows devices content, or was it code?
Effectively managing software development requires formalized processes for code changes, code testing, code deployment, and code retirement. The priority and rigor of these processes must scale with the size and reach of the deployments. Categorizing a system artifact as code brings those formalized processes into play. Categorizing the artifact as content relaxes or even removes the required rigor.
CrowdStrike calls this type of update Rapid Response Content. By using the word content in the name, CrowdStrike is categorizing the update as something other than code. CrowdStrike explicitly states “Rapid Response Content is configuration data; it is not code or a kernel driver.” Digging into the documentation, we find that Rapid Response Content consists of multiple Template Instances. Template Instances consist of regex content. CrowdStrike even tacks the word content onto the term regex to emphasize the content-over-code distinction. CrowdStrike does not explain the difference between regex content and a regex, so we will ignore the CrowdStrike term of art and stick with the widely used single word regex and its plural form, regexes.
Regular expressions (regexes) reside in the intersection of content and code. Like both content and code, regexes are text. Unlike some code, e.g. C, C++, Swift, regexes are not compiled into an executable prior to use. However, unlike content, regexes must be interpreted by a regular expression engine that reads input and produces output based on the regex. This is the same way interpreted code, e.g. JavaScript, Python, is run.
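To make the point concrete, here is a minimal Python sketch using the standard library `re` module. The event string, the patterns, and the `run` helper are all hypothetical, chosen only for illustration; nothing here reflects CrowdStrike's actual engine or data. It shows that a regex is inert text until an engine interprets it, and that a malformed regex fails much like code with a syntax error.

```python
import re

# Hypothetical event text; not taken from any real sensor.
event = "user=alice pid=4242"

# Each pattern is plain text until a regex engine interprets it.
patterns = {
    "user": r"user=(\w+)",
    "pid": r"pid=(\d+)",
}

def run(pattern_text, data):
    """Hand the pattern to the engine and return the first captured group, or None."""
    compiled = re.compile(pattern_text)  # the engine parses the pattern text here
    match = compiled.search(data)
    return match.group(1) if match else None

# Same engine, same input: the pattern alone decides what comes out.
results = {name: run(text, event) for name, text in patterns.items()}
print(results)

# A malformed pattern is rejected by the engine, much like a syntax error in code.
try:
    run(r"pid=(\d+", event)  # missing closing parenthesis
    malformed_rejected = False
except re.error:
    malformed_rejected = True
print(malformed_rejected)
```

The input never changes in this sketch. Only the pattern text does, and with it the behavior of the program.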
Whether a regex is content or code is the fundamental issue because reclassifying code as content can be used to bypass more rigorous code pipeline controls. Reading through CrowdStrike’s mitigations:
- Add runtime input array bounds checks
- Increase test coverage
- Create additional checks
- Update content configuration test procedures
- Add additional deployment layers and acceptance checks
- Provide customer control over deployment
These are all code and code pipeline improvements. Despite what CrowdStrike has written in its documentation, these mitigations clearly indicate that Rapid Response Content and, more significantly, the way it is used within the CrowdStrike environment are code, not content. Thankfully, it will now be treated as such, but what other instances of code masquerading as content exist in CrowdStrike’s platform?
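The first mitigation in the list, a runtime bounds check on the input array, is easy to picture as code. What follows is a rough Python sketch under stated assumptions: the `match_fields` function, the use of `None` as a wildcard, and the sample values are all hypothetical and are not CrowdStrike's implementation. In Python an out-of-range read raises an exception anyway; in C or C++ running in the kernel, an unchecked read can crash the machine, which is why the explicit check matters there.

```python
import re

def match_fields(field_patterns, inputs):
    """Return True if every non-wildcard pattern fully matches its input field.

    field_patterns: list of regex strings, or None for a wildcard (hypothetical convention)
    inputs: list of strings supplied by the caller
    """
    # Bounds check: refuse to evaluate rules that reference fields the caller never supplied.
    if len(field_patterns) > len(inputs):
        raise ValueError(
            f"rule expects {len(field_patterns)} fields, got {len(inputs)}"
        )
    for pattern, value in zip(field_patterns, inputs):
        if pattern is None:  # wildcard: skip this field
            continue
        if re.fullmatch(pattern, value) is None:
            return False
    return True

# Usage: two fields supplied, two patterns defined.
ok = match_fields([r"\d+", None], ["42", "anything"])

# Usage: the rule defines one more field than the caller supplied.
try:
    match_fields([r"\d+", r"\w+"], ["42"])
    mismatch_caught = False
except ValueError:
    mismatch_caught = True
```

A check like this is ordinary defensive programming, and it only makes sense for something that is executed.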
tags: software