Serverless technology continues to gain ground among development and DevOps teams. To better understand how to configure serverless functions properly and what threats they face, I decided to write a two-part series exploring the leading serverless platforms. In this first part, I’ll review the serverless model in general and include a security review of Google’s serverless implementation. The second part will review Microsoft Azure Functions. Stay tuned to our @TwistlockLabs Twitter channel for that.
The Shift from Cloud Computing to Serverless Computing
Until a few years ago, when you wanted to run an app in the cloud, you had to start a VM, run your code on it and pay for the machine’s run time. The payment included the machine’s idle time, when it was not productive: not serving requests or executing computations.
Serverless computing transfers the allocation of execution time to the cloud provider. The cloud provider decides how many machines are running and then splits the machines’ run time between its clients’ loads. Each client writes their own serverless function, which in turn runs on one of those machines.
Each client pays only for the actual run time and resources the function has consumed. The client doesn’t pay for the time the server is idle or used by other functions.
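To make the billing difference concrete, here is a minimal sketch comparing the two models. All prices and workload figures are made-up assumptions for illustration, not any provider's real rates, and real pricing usually also depends on memory size and request count.

```javascript
// Hypothetical prices, chosen only for illustration (not real provider rates).
const VM_PRICE_PER_HOUR = 0.05;            // always-on VM, billed even when idle
const FUNCTION_PRICE_PER_SECOND = 0.00002; // billed only while the function runs

// Cost of keeping a VM up for a whole month, busy or idle.
function vmMonthlyCost(hoursInMonth) {
  return VM_PRICE_PER_HOUR * hoursInMonth;
}

// Cost of the same workload when only actual execution time is billed.
function functionMonthlyCost(invocations, secondsPerInvocation) {
  return FUNCTION_PRICE_PER_SECOND * invocations * secondsPerInvocation;
}

const vmCost = vmMonthlyCost(720);               // 30 days * 24 hours
const fnCost = functionMonthlyCost(100000, 0.2); // 100k requests, 200 ms each
console.log(`VM: $${vmCost.toFixed(2)}, function: $${fnCost.toFixed(2)}`);
```

With these assumed numbers the function only accrues cost for about 20,000 seconds of execution, while the VM is billed for all 720 hours regardless of load.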
The serverless model is event-driven: your code runs in response to an event, and its purpose is to handle that event.
Serverless became a commercial service in 2014, when Amazon introduced AWS Lambda. Since then, more players have joined the game: Microsoft introduced Azure Functions, and Google, which had launched App Engine back in 2008, released Google Functions.
The main difference between the services is the development environment. For example, Google Functions supports Node.js, while AWS Lambda supports Java, Python and others.
Serverless functions have some practical use cases:
- Analyze log files and send notifications using the Slack API
- Create periodic backups of a database
- Process uploaded videos and convert them into a standard file format
Common execution procedure
Each execution of a function consists of the following steps:
- A trigger occurs, such as an incoming HTTP request, the creation of a new file in a linked storage system, the delivery of an email, etc.
- Once the trigger arrives, a sandbox is created for the function, isolating it from the host it’s running on.
- Once the sandbox is ready, the function code is uploaded to it, unpacked and executed. The function receives the relevant metadata of the incoming trigger via a known interface.
- Once the function finishes executing or reaches the runtime timeout, the sandbox is destroyed.
This model may vary between different Serverless cloud providers.
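The steps above can be sketched as a small, synchronous simulation. This is only a mental model: the names (runInvocation, sandbox, steps) are hypothetical and not part of any provider's API, and the timeout enforcement is left out for brevity.

```javascript
// Simplified model of one invocation; not how any provider implements it.
function runInvocation(trigger, userFunction) {
  const steps = [];

  // 1. A trigger arrives (HTTP request, new file, email, ...).
  steps.push(`trigger:${trigger.type}`);

  // 2. A fresh sandbox is created to isolate the function from the host.
  const sandbox = { alive: true };
  steps.push("sandbox:created");

  let result;
  try {
    // 3. The code runs and receives the trigger metadata as its argument.
    steps.push("function:started");
    result = userFunction(trigger.metadata);
    steps.push("function:finished");
  } finally {
    // 4. The sandbox is destroyed whether the function succeeded or failed.
    sandbox.alive = false;
    steps.push("sandbox:destroyed");
  }

  return { result, steps, sandboxAlive: sandbox.alive };
}

const outcome = runInvocation(
  { type: "http", metadata: { path: "/resize", method: "POST" } },
  (meta) => `handled ${meta.method} ${meta.path}`
);
console.log(outcome.steps.join(" -> "));
```

The point of the try/finally is that nothing from one invocation is meant to survive into the next, which is the property the security review below keeps coming back to.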
The Serverless model raises new security concerns in each of the execution steps:
- Needless to say, the application code itself can be vulnerable to attack.
- The trigger can be abused, leading to a possible DDoS or to malformed input reaching the function.
- The trigger mechanism is written and managed by the cloud provider. Any flaw in it can allow an attacker to take over the proxy server or parts of the infrastructure.
- An insecure sandbox can allow code to escape from a function and interfere with the host or with other functions.
- Some cloud providers (such as Amazon) use a cache to deploy functions faster and save computing power. The cache can be abused, allowing an attacker to store persistent malicious code and files inside it. This attack was previously demonstrated by Rich Jones on AWS Lambda.
- The security of the infrastructure is handled entirely by the service provider. If there are security issues in the infrastructure, there is usually nothing developers can do about them.
- The libraries and packages in use, whether they are part of the deployed sandbox environment or imported manually, can contain vulnerabilities.
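Of the concerns above, malformed trigger input is the one developers can address directly in their own code. Below is a minimal sketch of validating a payload before acting on it. The payload shape ({ url, width }), the function name and the MAX_WIDTH limit are all assumptions made up for this example.

```javascript
// Hypothetical payload for an image-resize function: { url: string, width: number }.
const MAX_WIDTH = 4096; // assumed upper bound; pick one that fits your workload

function validateResizeRequest(body) {
  // Reject anything that is not a plain JSON object up front.
  if (body === null || typeof body !== "object" || Array.isArray(body)) {
    return { ok: false, errors: ["body must be a JSON object"] };
  }
  const errors = [];
  if (typeof body.url !== "string" || !/^https:\/\//.test(body.url)) {
    errors.push("url must be an https URL");
  }
  if (!Number.isInteger(body.width) || body.width < 1 || body.width > MAX_WIDTH) {
    errors.push(`width must be an integer between 1 and ${MAX_WIDTH}`);
  }
  return { ok: errors.length === 0, errors };
}

console.log(validateResizeRequest({ url: "https://example.com/a.png", width: 800 }));
console.log(validateResizeRequest({ url: "file:///etc/passwd", width: 0 }));
```

Rejecting bad input at the top of the function keeps malformed events from reaching libraries deeper in the call chain, where a parsing bug is harder to spot.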
Let’s dig into Google Functions and review it with respect to the security concerns mentioned above.
Google Functions supports writing functions in Node.js. The entry point of the function is in the index.js file. On creation, you can upload a ZIP file with index.js and the rest of your code and libraries.
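As a rough illustration, a minimal index.js for an HTTP-triggered function might look like the sketch below. HTTP functions on the Node.js runtime receive Express-style request and response objects. The function name helloHttp is hypothetical and has to match the entry point configured for the function.

```javascript
// index.js - minimal HTTP-triggered function (sketch).
// "helloHttp" is a hypothetical name chosen for this example.
function helloHttp(req, res) {
  // Only answer the method we expect; refuse everything else.
  if (req.method !== "GET") {
    res.status(405).send("Method Not Allowed");
    return;
  }
  res.status(200).send("Hello from a serverless function");
}

// Export the handler so the platform can find it by name.
exports.helloHttp = helloHttp;
```

The handler deliberately returns a fixed string instead of echoing request data back, which avoids reflecting attacker-controlled input in the response.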
The Node.js code executes in response to one of the following events:
- Cloud Pub/Sub – a notification delivered from any of Google’s cloud services (Cloud Logs, Cloud API, Compute Dataflow, etc.)
- Cloud Storage Bucket – a new file was uploaded, or a similar filesystem event occurred
- HTTP Trigger – an HTTP request to the function’s URL. Once this trigger is selected, a unique HTTPS URL is attached to the function.
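For the Pub/Sub trigger, the payload of a message arrives base64-encoded in its data field. The helper below is a small sketch of decoding it. How the message object reaches your function (for example, nested inside an event argument) depends on the runtime version, so that part is left out here, and the helper name and sample message are made up for the example.

```javascript
// Decode the payload of a Pub/Sub message. In the JSON form of a message,
// the "data" field holds the payload as a base64 string.
function decodePubSubData(message) {
  if (!message || typeof message.data !== "string") {
    return null; // nothing to decode
  }
  return Buffer.from(message.data, "base64").toString("utf8");
}

// Hypothetical message, built here only to exercise the helper.
const sampleMessage = {
  data: Buffer.from("backup finished").toString("base64"),
  attributes: { source: "cron" },
};
console.log(decodePubSubData(sampleMessage));
```

The decoded string is still untrusted input and should be validated like any other trigger data.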
HTTP Trigger naming convention
The function’s HTTP trigger URL is assigned automatically (but can be changed manually) and follows a predictable format.
Because the format is predictable, an attacker can guess a function’s trigger address and probe it for weaknesses.
The capabilities inside the sandbox are pretty much the same as in a standard Linux environment, which makes me question whether the function runs inside a container at all. I don’t think it does, and I’m fairly sure that Google has modified the kernel to run this small VM. Some syscalls appear to have been changed or removed. For example, standard TCP and UDP sockets work properly, whereas raw sockets return “Function not implemented”, a common error when the kernel is built without certain syscalls.
The sandbox is “batteries included”: it already contains relevant libraries such as ImageMagick (used for converting image formats). Make sure to verify that such libraries are up to date and not vulnerable (for example, to ImageTragick, CVE-2016-3714).
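One way to act on that advice is to compare the version a bundled library reports against the first patched version listed in the relevant advisory. The sketch below only does the comparison. How you obtain the installed version (for ImageMagick, for instance, by parsing the output of convert -version) is left out, and the version strings used are placeholders rather than authoritative values.

```javascript
// Split a version such as "6.9.3-10" into its numeric parts: [6, 9, 3, 10].
function parseVersion(version) {
  return version.split(/[.-]/).map((part) => parseInt(part, 10) || 0);
}

// True if `installed` is the same as or newer than `minimum`.
function isAtLeast(installed, minimum) {
  const a = parseVersion(installed);
  const b = parseVersion(minimum);
  for (let i = 0; i < Math.max(a.length, b.length); i++) {
    const x = a[i] || 0; // missing parts count as 0
    const y = b[i] || 0;
    if (x !== y) return x > y;
  }
  return true; // identical versions
}

// Placeholder values; take the real minimum from the advisory you are checking.
const MINIMUM_PATCHED = "6.9.3-10";
const installedVersion = "6.9.3-9";
console.log(isAtLeast(installedVersion, MINIMUM_PATCHED) ? "ok" : "needs update");
```

A check like this could run at the start of a function or in a deployment pipeline, so that an outdated library in the sandbox image is noticed before it processes untrusted files.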
The kernel is up to date and not vulnerable to common exploits.
The sandbox doesn’t suffer from cache leaks, as the whole VM is destroyed when the function ends.
Networking inside the sandbox
UDP and TCP are the only kinds of sockets that work. The rest are blocked, which is why ping doesn’t work (it sends ICMP through raw sockets). Listing the network interfaces is not possible, as /proc/net/dev is missing. There are two accessible local addresses: the metadata address and appengine.googleapis.internal. I assume that the VM has a localhost interface, an internet interface and another local interface on the 169.254.169.x subnet to communicate with the metadata and appengine addresses.
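The 169.254.169.x addresses fall inside the IPv4 link-local block 169.254.0.0/16, which is not routed to the internet. As a small sketch, a helper like the hypothetical one below could be used by function code to recognize, and for example log, outbound requests aimed at such internal addresses.

```javascript
// True if the IPv4 address lies in the link-local block 169.254.0.0/16.
function isLinkLocalIPv4(address) {
  const parts = address.split(".");
  if (parts.length !== 4) return false;
  // Each part must be 1-3 digits; anything else becomes NaN and is rejected.
  const octets = parts.map((p) => (/^\d{1,3}$/.test(p) ? Number(p) : NaN));
  if (octets.some((o) => Number.isNaN(o) || o > 255)) return false;
  return octets[0] === 169 && octets[1] === 254;
}

console.log(isLinkLocalIPv4("169.254.169.254"));
console.log(isLinkLocalIPv4("127.0.0.1"));
```

The check works on literal IPv4 addresses only. Hostnames would have to be resolved first, which this sketch does not attempt.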
The default service account that is assigned to the function has access to the following scopes:
- https://www.googleapis.com/auth/bigquery
- https://www.googleapis.com/auth/cloud-platform
- https://www.googleapis.com/auth/compute
- https://www.googleapis.com/auth/datastore
- https://www.googleapis.com/auth/devstorage.read_only
- https://www.googleapis.com/auth/devstorage.read_write
- https://www.googleapis.com/auth/logging.write
- https://www.googleapis.com/auth/monitoring
- https://www.googleapis.com/auth/userinfo.email
(Full list of API scopes)
The scopes are self-explanatory. Pay attention to them: if someone takes over your function, these scopes allow persistence and lateral movement.
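A quick way to review a service account is to filter its scope list for the ones that grant wide access. The sketch below does that for the scopes listed above. Which scopes count as "broad" is a judgment call made for this example, and the names BROAD_SCOPES and findBroadScopes are hypothetical.

```javascript
const SCOPE_PREFIX = "https://www.googleapis.com/auth/";

// Which scopes count as "broad" is an assumption made for this sketch.
const BROAD_SCOPES = new Set(["cloud-platform", "compute", "devstorage.read_write"]);

// Return the short names of granted scopes that appear in BROAD_SCOPES.
function findBroadScopes(scopeUrls) {
  return scopeUrls
    .map((url) => (url.startsWith(SCOPE_PREFIX) ? url.slice(SCOPE_PREFIX.length) : url))
    .filter((name) => BROAD_SCOPES.has(name));
}

// The scopes observed on the default service account.
const granted = [
  "https://www.googleapis.com/auth/bigquery",
  "https://www.googleapis.com/auth/cloud-platform",
  "https://www.googleapis.com/auth/compute",
  "https://www.googleapis.com/auth/datastore",
  "https://www.googleapis.com/auth/devstorage.read_only",
  "https://www.googleapis.com/auth/devstorage.read_write",
  "https://www.googleapis.com/auth/logging.write",
  "https://www.googleapis.com/auth/monitoring",
  "https://www.googleapis.com/auth/userinfo.email",
];
console.log(findBroadScopes(granted));
```

Anything this flags is a candidate for replacement with a narrower scope or a dedicated service account, so that a compromised function has less to work with.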
I won’t go into detail about AWS Lambda, as it was reviewed thoroughly in 2016 and 2017:
- Gone in 60 milliseconds @ CCC
- Hacking Serverless Runtimes: Profiling AWS Lambda, Azure Functions, and More @ BlackHat 17
Serverless technology allows DevOps teams to deploy apps faster, but it also poses new kinds of security issues. I have laid out an initial list of security concerns, but there are others that were not discussed here. In our upcoming Twistlock release, we provide productized support for vulnerability management of serverless apps. By checking your functions against known vulnerabilities in their libraries, you can better ensure that your serverless apps are secure. Still, we recommend being extra cautious when deploying serverless code: do your own research.
For further reading, I suggest the linked articles. Keep up with the container security world by following us on Twitter @twistlocklabs.