We’re happy to announce a new series of blog posts that sheds more light on how Twistlock uses machine learning (ML) to provide deeper protection to its customers. ML is a broad topic with many concepts to cover, so each post in this series will focus on a different example of how we use it in Twistlock.
At Twistlock, we standardize on whitelisting and container immutability as security armor for your apps. That is, we actively scan your images, containers, hosts and cluster configurations and build a highly granular security profile that fits the runtime behavior of each of your apps like a glove. The profile contains a significant amount of information about the expected runtime behavior, including which processes should run, which files your app accesses and modifies, which system calls it uses and much more. The whitelisted profile is built using an extensive set of generalized business rules, which we continuously update using our intelligence feed, which in turn is constantly updated based on the work of our security researchers. In practice, this security model substantially reduces the attack surface and provides a strong defense against common attacks, including advanced persistent threats (APTs).
Nevertheless, there is a natural limit to the granularity that can be achieved with customized business rules. As tight as the whitelisted profile may be, we want to be able to detect attacks even when they stay inside the whitelist boundaries. To illustrate the problem and our solution, let’s discuss a real-life scenario.
Let’s assume we are running an nginx container. By analyzing the container, its image, metadata, etc., we can learn multiple whitelist features, e.g., which processes are expected to run (nginx), which system calls are used (e.g., socket and read), and how the container interacts with the filesystem. Practically every deviation from this whitelist model will automatically trigger a simple audit event. Even if the container is compromised, the whitelisted profile makes it very hard for an attacker to further manipulate the container and move laterally within the cluster. However, even in this case, we would like to raise an alert and react to the threat as fast as possible, even if the threat uses artifacts from within the whitelist (e.g., executes an allowed process). To solve this problem, we apply a machine learning technique described below.
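As a rough illustration of the whitelist check described above, the sketch below compares observed runtime events against a profile. The `Profile` class, the `audit` function and the profile layout are hypothetical names chosen for this example, not Twistlock's actual data model; the nginx process and the socket and read system calls come from the scenario above.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Profile:
    """Hypothetical whitelist profile for one container image."""
    processes: frozenset = field(default_factory=frozenset)
    syscalls: frozenset = field(default_factory=frozenset)


def audit(profile, kind, name):
    """Return an audit message if the event deviates from the profile, else None."""
    allowed = {"process": profile.processes, "syscall": profile.syscalls}[kind]
    if name in allowed:
        return None
    return f"audit: unexpected {kind} '{name}'"


# Profile values taken from the nginx scenario in the text.
nginx_profile = Profile(
    processes=frozenset({"nginx"}),
    syscalls=frozenset({"socket", "read"}),
)

print(audit(nginx_profile, "process", "nginx"))  # allowed, so no audit event
print(audit(nginx_profile, "process", "bash"))   # deviation from the whitelist
```

A check like this only sees whether an artifact is on the list, so it cannot flag an attacker who sticks to whitelisted processes and system calls.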
A naive approach to detecting data breaches or attacks might use standard supervised learning, with app behavior statistics as features and historical or fake data for the labels, especially the negative ones. Using historical data is problematic since it is, by definition, built from known malware, and usually only a very small portion of it. As a result, the learned model’s ability to detect behavior that exploits a zero-day vulnerability is very limited, which manifests as an unacceptably high false negative rate.
Twistlock’s Approach to Machine Learning
One of Twistlock’s approaches to machine learning is to build a statistical model that describes the state of the system (app) before and after an event. Once the model is built, we can use it to decide whether a new state that the system moved into via a new event is valid or not. To enhance the accuracy and robustness of the learning process, we use the cluster and container metadata to learn events from multiple nodes simultaneously. The fact that each container encapsulates a single application helps us easily reduce the environmental noise from other apps when collecting new events.
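To make the idea of states and events concrete, here is a minimal sketch that replays event logs from several nodes running the same container and collects the (state before, state after) pairs a model could be trained on. Representing a state as the set of running processes, the `start`/`exit` event format, and the node and process names are all assumptions made for this illustration only.

```python
def apply_event(state, event):
    """Return the state reached from `state` after `event`.

    A state is modeled (as an assumption) as a frozenset of running process
    names; an event is a hypothetical (action, process) tuple.
    """
    action, process = event
    if action == "start":
        return state | {process}
    if action == "exit":
        return state - {process}
    raise ValueError(f"unknown action: {action}")


def collect_transitions(node_logs):
    """Replay each node's event log and gather the observed state transitions."""
    transitions = set()
    for events in node_logs.values():
        state = frozenset()  # every container starts with no processes
        for event in events:
            next_state = apply_event(state, event)
            transitions.add((state, next_state))
            state = next_state
    return transitions


# Hypothetical logs from two nodes running the same nginx container.
node_logs = {
    "node-a": [("start", "nginx")],
    "node-b": [("start", "nginx"), ("start", "nginx-worker"), ("exit", "nginx-worker")],
}

observed = collect_transitions(node_logs)
print(len(observed))
```

Because both nodes run the same container, their logs contribute to one shared set of transitions, and duplicates seen on more than one node collapse into a single entry.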
Given the data, we learn whether the transition from state Si, given event e (e.g., a new process), to state Sj is valid. To build the classifier from positive data only, we use a method called a one-class support vector machine. Given a set of valid (allowed) states, this technique lets us classify whether a new event transitions the system into a new valid state or not. Let’s use a simple visualization to illustrate the process.
Assume that we map each state in the system to a unique number in the range [-6,6]. We use the x-axis to plot the current state and the y-axis to plot the next state, so each observed transition is a point in the plane. Given this training data, we create a one-class classifier that defines the region of valid transitions. Now, given a new event, we can determine whether it transitions the system to a valid state.
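The visualization described above can be approximated with scikit-learn's `OneClassSVM`. This is a minimal sketch, not Twistlock's implementation: the training data is synthetic (we assume valid transitions move the state up by at most one step), and the kernel, `gamma` and `nu` values are illustrative choices.

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)

# Synthetic training data (an assumption): each row is (current state, next state),
# and valid transitions keep the next state within one step above the current one.
current = rng.uniform(-6, 5, size=300)
nxt = current + rng.uniform(0, 1, size=300)
train = np.column_stack([current, nxt])

# Train on valid (positive) examples only. nu is an upper bound on the fraction
# of training points treated as outliers; gamma sets the width of the RBF kernel.
model = OneClassSVM(kernel="rbf", gamma=0.5, nu=0.05).fit(train)


def is_valid_transition(current_state, next_state):
    """Return True if the classifier places the transition inside the learned region."""
    return bool(model.predict([[current_state, next_state]])[0] == 1)


accepted = float(np.mean(model.predict(train) == 1))
print(f"training transitions accepted: {accepted:.0%}")
print("jump from -5 to 5 valid?", is_valid_transition(-5, 5))
```

In a real system the states would be richer than a single number, but the pattern is the same: fit on known-good transitions, then score each new transition as it arrives.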
As mentioned above, this approach is one example of the different ways Twistlock applies machine learning. In future blog posts, we will show how to use more advanced techniques to score, correlate, and combine multiple events. If you find this content interesting, please subscribe to our blog, follow @TwistlockTeam on Twitter, or contact us for a demo.