2021 will be remembered as the year the world took ransomware seriously. Reported ransomware attacks doubled compared with 2020, and they cost companies an estimated $2 billion in recovery costs and, in some cases, paid ransoms.
The global shift to containerized applications has only complicated matters for companies. While containers have accelerated application development and deployment, their security implications are often murky for auditors and security teams. Here are the top 10 things customers should be focused on to protect against not only ransomware, but all intrusions.
1. Proper OS security on worker nodes
- Install host-based security software (HBSS): file integrity monitoring, anti-virus, and application control.
- Harden to an acceptable benchmark like DoD STIG or CIS Benchmarks.
- Scan for vulnerabilities on the host regularly.
Ransomware is a type of malware, so the first lines of defense for containerized applications are proper host hardening and a solid host-based security system (HBSS) with frequent signature updates. One common point of confusion is whether the containers themselves need anti-virus protection. The answer is no: containers use a union mount file system, so the containerized apps inherit the protections applied to the hosts. It is important to use agent-based vulnerability scanners (rather than remote scanners that connect over SSH or SMB), as this allows you to completely close remote access to your worker nodes. Access to hosts can still be delegated using AWS Systems Manager Session Manager. Session Manager does not require any open inbound ports and can integrate with external identity providers (IdPs) to enforce corporate access policies such as strong MFA and device posture checks.
It is recommended that companies use an immutable deployment strategy for updating their worker nodes: do not patch and update in place! To accelerate security updates, use an image pipeline to automate the creation of the golden worker node Amazon Machine Image (AMI) on a monthly basis. For customers using Amazon Web Services (AWS), EC2 Image Builder is a good candidate for this automation, as it allows you to write custom recipes using SSM and build a schedule for the creation of the golden AMI. The new AMI can then be rolled out to your worker nodes each month. Amazon Elastic Kubernetes Service (EKS) has built-in features to accelerate worker node updates, including cordoning nodes and rescheduling pods. The infographic below provides an overview of the automated pipeline.
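The monthly rotation described above can be sketched as a simple age check. This is a minimal illustration only: the 30-day window, the node names, and the helper functions are assumptions made for the example, and a real pipeline would read image creation dates from the AWS APIs rather than from a hard-coded dictionary.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical rotation window; the text suggests a monthly rebuild.
MAX_AMI_AGE = timedelta(days=30)


def ami_is_stale(created_at: datetime, now: datetime,
                 max_age: timedelta = MAX_AMI_AGE) -> bool:
    """Return True when the image a node runs is older than the rotation window."""
    return now - created_at > max_age


def nodes_to_replace(node_ami_dates: dict, now: datetime) -> list:
    """List nodes (by name) whose AMI should be replaced rather than patched in place."""
    return sorted(
        name for name, created in node_ami_dates.items()
        if ami_is_stale(created, now)
    )


if __name__ == "__main__":
    now = datetime(2022, 3, 1, tzinfo=timezone.utc)
    fleet = {  # hypothetical node names and AMI creation dates
        "node-a": datetime(2022, 2, 20, tzinfo=timezone.utc),
        "node-b": datetime(2022, 1, 10, tzinfo=timezone.utc),
    }
    print(nodes_to_replace(fleet, now))
```

Nodes flagged this way would be drained and replaced with instances launched from the new golden AMI, not updated in place.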
2. Deploy HIDS such as Falco.
While most host-based security software these days includes application control, which allows only approved applications to run, this might not be enough protection. Depending on how the solution is implemented (hash-based or heuristic-based), effectively restricting which applications can run can be problematic for many workloads, especially container-based ones. This is why it is important to implement an additional host-based intrusion detection system (HIDS) such as Falco.
Falco can be deployed as a binary on the worker node itself (in which case it would be included in the image pipeline above) or as a DaemonSet. A DaemonSet schedules a Falco pod on every node, including nodes added by autoscaling, so there is no need to pre-bake Falco into the worker node AMI.
Falco uses either kernel modules or the newer eBPF to monitor every process’ interaction with the host’s kernel. With Falco’s fully extensible DSL, you can write custom rules to alert on any suspicious behavior.
Here is an example rule for notifying if a shell is opened inside of a container.
- rule: shell_in_container
  desc: notice shell activity within a container
  condition: evt.type = execve and evt.dir=< and container.id != host and proc.name = bash
  output: shell in a container (user=%user.name container_id=%container.id container_name=%container.name shell=%proc.name parent=%proc.pname cmdline=%proc.cmdline)
  priority: WARNING
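To make the rule's condition easier to read, here is a rough Python mirror of it applied to a simplified event. The dictionary-based event and the `shell_in_container` function are assumptions made for illustration; Falco evaluates conditions in its own engine against live system call events.

```python
def shell_in_container(event: dict) -> bool:
    """Mirror of the rule condition above, applied to a simplified event dict."""
    return (
        event.get("evt.type") == "execve"          # a new program was executed
        and event.get("evt.dir") == "<"            # exit side of the system call
        and event.get("container.id") != "host"    # "host" marks non-container processes
        and event.get("proc.name") == "bash"       # the program is a bash shell
    )


if __name__ == "__main__":
    sample = {  # hypothetical event values
        "evt.type": "execve",
        "evt.dir": "<",
        "container.id": "3ad7b26ded6d",
        "proc.name": "bash",
    }
    print(shell_in_container(sample))
```

Note that the condition matches only `bash`; other shells would need to be added to the rule to be detected.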
Here is another example of an alert when a process writes to sensitive system directories.
- macro: open_write
  condition: >
    (evt.type=open or evt.type=openat) and
    fd.typechar='f' and
    (evt.arg.flags contains O_WRONLY or
    evt.arg.flags contains O_RDWR or
    evt.arg.flags contains O_CREAT or
    evt.arg.flags contains O_TRUNC)

- macro: package_mgmt_binaries
  condition: proc.name in (dpkg, dpkg-preconfigu, rpm, rpmkey, yum)

- macro: bin_dir
  condition: fd.directory in (/bin, /sbin, /usr/bin, /usr/sbin)

- rule: write_binary_dir
  desc: an attempt to write to any file below a set of binary directories
  condition: evt.dir = < and open_write and not package_mgmt_binaries and bin_dir
  output: "File below a known binary directory opened for writing (user=%user.name command=%proc.cmdline file=%fd.name)"
  priority: WARNING
Falco supports integration with Amazon CloudWatch Logs and AWS Security Hub for proper tracking and processing of detected anomalies. stackArmor ThreatAlert® ConMon Lens provides native alerting and monitoring of Falco data along with response automation.
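Before alerts are forwarded to a tracking system, it can help to filter them by severity. The sketch below assumes Falco is configured for JSON output, where each alert is one JSON object per line with fields such as `rule`, `priority`, and `output`. The `alerts_at_or_above` helper is hypothetical, and priority names are compared case-insensitively because their exact spelling in the output is assumed here.

```python
import json

# Falco priority levels, from most to least severe.
PRIORITIES = ["emergency", "alert", "critical", "error",
              "warning", "notice", "informational", "debug"]
RANK = {name: index for index, name in enumerate(PRIORITIES)}


def alerts_at_or_above(lines, minimum="warning"):
    """Parse JSON alert lines and keep those at or above the minimum priority."""
    threshold = RANK[minimum.lower()]
    kept = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            alert = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not JSON alerts
        if not isinstance(alert, dict):
            continue
        rank = RANK.get(str(alert.get("priority", "")).lower())
        # Keep alerts with an unrecognized priority rather than silently drop them.
        if rank is None or rank <= threshold:
            kept.append(alert)
    return kept


if __name__ == "__main__":
    sample = [  # hypothetical alert lines
        '{"rule": "shell_in_container", "priority": "Warning", "output": "shell in a container"}',
        '{"rule": "noisy_rule", "priority": "Debug", "output": "low-value event"}',
    ]
    for alert in alerts_at_or_above(sample):
        print(alert["rule"], alert["priority"])
```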
3. Control egress traffic flow.
Ransomware gets into a network because it was either downloaded from a malicious location or uploaded by a malicious user. To protect against malicious downloads, controlling egress traffic is essential. Some companies use egress gateways in a service mesh (like Istio) to craft policies for traffic to external services. However, that is insufficient: egress policies apply only to containers that are part of the service mesh, so any container or process outside the mesh would not respect the egress gateway policies. In EKS, egress traffic can be controlled in three ways:
Egress to internet – In today’s cloud-first world, truly air-gapped systems are becoming increasingly rare. At some point, your system will need to download a patch, an anti-virus update, or a container image from the internet. It is important to use a firewall to regulate the flow of traffic between your EKS cluster and the internet. AWS Network Firewall is a managed firewall service that can provide protection against malicious domains used to host malware.
It provides protection using the following mechanisms:
Stateless Rules – Used to limit which ports and protocols are allowed out to the internet, typically 80/TCP (HTTP), 443/TCP (HTTPS), 123/UDP (NTP), 53/UDP (DNS), and 53/TCP (DNS). Malware can be hosted on ports and protocols not typically used over the internet, such as 389/TCP (LDAP). The infamous Log4Shell vulnerability was exploited using this method.
Stateful Rules – Can be used to control TLS and HTTP traffic to remote domains. By inspecting the SNI extension of the TLS Client Hello (or the host header in un-encrypted HTTP), the firewall can block egress traffic to external domains. This can be done either by using allow lists or deny lists.
Managed Rule Groups – AWS also provides managed rule groups that can be used as deny lists for malicious domains, several of which are relevant to ransomware. For the full list, please visit: https://docs.aws.amazon.com/network-firewall/latest/developerguide/aws-managed-rule-groups-list.html
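The stateless and stateful checks described above can be modeled in a few lines. This is a conceptual sketch, not AWS Network Firewall's implementation: the allowed domains are hypothetical, and the leading-dot entry (which here matches the domain and all of its subdomains) is an assumption modeled on common domain-list conventions.

```python
# Ports and protocols the text lists as typically allowed out to the internet.
ALLOWED_PORTS = {("tcp", 80), ("tcp", 443), ("udp", 123), ("udp", 53), ("tcp", 53)}

# Hypothetical allow list. A leading dot matches the domain and its subdomains.
ALLOWED_DOMAINS = [".amazonaws.com", "registry.example.com"]


def port_allowed(protocol: str, port: int) -> bool:
    """Stateless check: only the protocol and destination port are considered."""
    return (protocol.lower(), port) in ALLOWED_PORTS


def domain_allowed(server_name: str, allow_list=ALLOWED_DOMAINS) -> bool:
    """Stateful check: match the TLS SNI (or HTTP Host header) against the allow list."""
    name = server_name.lower().rstrip(".")
    for entry in allow_list:
        entry = entry.lower()
        if entry.startswith("."):
            if name == entry[1:] or name.endswith(entry):
                return True
        elif name == entry:
            return True
    return False


def egress_allowed(protocol: str, port: int, server_name=None) -> bool:
    """Combine both checks; web traffic must also name an approved domain."""
    if not port_allowed(protocol, port):
        return False
    if protocol.lower() == "tcp" and port in (80, 443):
        return server_name is not None and domain_allowed(server_name)
    return True


if __name__ == "__main__":
    print(egress_allowed("tcp", 389))                          # LDAP egress
    print(egress_allowed("tcp", 443, "s3.amazonaws.com"))      # approved domain
    print(egress_allowed("tcp", 443, "malware.example.net"))   # unapproved domain
```

An allow list like this is stricter than a deny list: anything not explicitly approved is dropped, including domains that no threat feed knows about yet.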
Access to AWS Services – While AWS does monitor accounts for abuse, it is possible that ransomware packages could be hosted in AWS services, particularly S3. When an application requires access to AWS services, adding a VPC endpoint with a strict endpoint policy adds a layer of protection and prevents workloads from accessing unapproved AWS resources and, ultimately, ransomware.
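As an illustration of a strict endpoint policy, the sketch below builds an IAM-style policy document that allows reads only from approved S3 buckets. The bucket name and the `s3_endpoint_policy` helper are hypothetical, and the set of actions a real workload needs is an assumption to adjust per application.

```python
import json


def s3_endpoint_policy(bucket_names, actions=("s3:GetObject",)):
    """Build an endpoint policy that allows the given actions on approved buckets only."""
    resources = []
    for bucket in bucket_names:
        resources.append(f"arn:aws:s3:::{bucket}")      # the bucket itself
        resources.append(f"arn:aws:s3:::{bucket}/*")    # objects in the bucket
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowApprovedBucketsOnly",
                "Effect": "Allow",
                "Principal": "*",
                "Action": list(actions),
                "Resource": resources,
            }
        ],
    }


if __name__ == "__main__":
    # "approved-artifacts" is a placeholder bucket name.
    policy = s3_endpoint_policy(["approved-artifacts"])
    print(json.dumps(policy, indent=2))
```

With a policy like this attached to the S3 endpoint, requests through the endpoint to any other bucket are not allowed, even if the workload's own IAM permissions would permit them.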
DNS Lookups – DNS is one of the fundamental protocols of the internet – there’s a reason why every engineer and administrator has shouted “IT’S ALWAYS DNS” at some point. This has made DNS an interesting attack vector. An attacker can register a malicious domain (e.g., ransomware123.com), so that DNS queries for that domain from anywhere on the internet are forwarded to the attacker’s name servers. Any information included as a “subdomain” in a DNS query can be logged by the attacker, which makes DNS a vector for data exfiltration, command and control, and more. Data exfiltration is not “traditional” ransomware, but sensitive data leaked this way could certainly be used as leverage to extort an organization. The protection against this vector is called DNS sinkholing, and Amazon Route 53 Resolver DNS Firewall fits the bill. A sinkhole inspects each DNS query and evaluates whether it targets a malicious domain; if it does, the sinkhole returns a dummy response, so the query is never forwarded to the malicious name servers. DNS Firewall allows you to create specific allow lists and deny lists, or to use AWS managed lists of known malicious domains.
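The sinkhole behavior can be sketched as follows. This is a conceptual model only: the `resolve` function, the `NXDOMAIN` string used as the dummy response, and the injected `upstream` resolver are assumptions made for the example, not the Route 53 Resolver DNS Firewall API.

```python
# Deny list using the example domain from the text.
DENY_LIST = {"ransomware123.com"}


def is_denied(qname: str, deny_list=DENY_LIST) -> bool:
    """Return True if the queried name or any of its parent domains is on the deny list."""
    labels = qname.lower().rstrip(".").split(".")
    return any(".".join(labels[i:]) in deny_list for i in range(len(labels)))


def resolve(qname: str, upstream, deny_list=DENY_LIST) -> str:
    """Answer denied names locally; forward everything else to the upstream resolver."""
    if is_denied(qname, deny_list):
        # Dummy response: the query never reaches the attacker's name servers,
        # so data encoded in the subdomain labels is not delivered.
        return "NXDOMAIN"
    return upstream(qname)


if __name__ == "__main__":
    def fake_upstream(name: str) -> str:
        # Stand-in for a real recursive resolver; returns a documentation address.
        return "192.0.2.10"

    print(resolve("c2VjcmV0LWRhdGE.ransomware123.com", fake_upstream))
    print(resolve("example.org", fake_upstream))
```

Matching parent domains matters here because the exfiltrated data sits in the subdomain labels, which differ on every query.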
4. Limit access to control plane.
Ransomware can be deployed as a container; consequently, anyone with access to the control plane of your cluster can deploy ransomware (wittingly or unwittingly).
Controlling which authenticated users have access through the aws-auth ConfigMap in Kubernetes is important, but so is controlling network access to the control plane itself. As is standard best practice with AWS access, avoid using AWS IAM users for accessing the control plane; rely on IAM roles, which use short-lived credentials.
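A quick review of the aws-auth mappings can surface the issues mentioned above. The sketch assumes the ConfigMap's `mapRoles` and `mapUsers` entries have already been parsed into lists of dictionaries; the `audit_aws_auth` helper and the example ARNs are hypothetical.

```python
def audit_aws_auth(map_roles, map_users):
    """Flag IAM user mappings and any mapping that grants the system:masters group."""
    findings = []
    for user in map_users:
        # IAM users rely on long-lived credentials; roles are preferred.
        findings.append(f"IAM user mapped: {user.get('userarn', '<unknown>')}")
    for entry in list(map_roles) + list(map_users):
        if "system:masters" in entry.get("groups", []):
            arn = entry.get("rolearn") or entry.get("userarn") or "<unknown>"
            findings.append(f"cluster-admin via system:masters: {arn}")
    return findings


if __name__ == "__main__":
    roles = [  # placeholder account ID and role names
        {"rolearn": "arn:aws:iam::111122223333:role/eks-node",
         "username": "system:node:{{EC2PrivateDNSName}}",
         "groups": ["system:bootstrappers", "system:nodes"]},
        {"rolearn": "arn:aws:iam::111122223333:role/platform-admin",
         "username": "platform-admin",
         "groups": ["system:masters"]},
    ]
    users = [
        {"userarn": "arn:aws:iam::111122223333:user/alice",
         "username": "alice",
         "groups": ["system:masters"]},
    ]
    for finding in audit_aws_auth(roles, users):
        print(finding)
```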
A recent study found that approximately 380k Kubernetes clusters have their Kubernetes API, and thus the control plane, exposed to the internet. While the API servers may still require authenticated access, credentials can be leaked or stolen. This exposure can easily be mitigated by limiting which public IPs can access the API server or, even better, by not exposing it to the internet at all and requiring a VPN.
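Limiting which public IPs can reach the API server amounts to a CIDR allow list. A minimal sketch of that check, using documentation address ranges as placeholders and a hypothetical `api_access_allowed` helper:

```python
import ipaddress

# Placeholder CIDR blocks (documentation ranges), e.g. an office network and a VPN exit.
ALLOWED_CIDRS = ["203.0.113.0/24", "198.51.100.7/32"]


def api_access_allowed(client_ip: str, allowed_cidrs=ALLOWED_CIDRS) -> bool:
    """Return True only if the client address falls inside an approved CIDR block."""
    ip = ipaddress.ip_address(client_ip)
    return any(ip in ipaddress.ip_network(cidr) for cidr in allowed_cidrs)


if __name__ == "__main__":
    print(api_access_allowed("203.0.113.25"))   # inside the approved /24
    print(api_access_allowed("192.0.2.50"))     # outside every approved block
```

The allow list complements authentication rather than replacing it: a stolen credential is far less useful if it cannot be presented from the attacker's network.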
5. Least privilege in containers.
Another way ransomware can be introduced to a Kubernetes cluster is via one of the containers in the cluster. Therefore, it is important to implement proper runtime security on the containers themselves. By limiting the privileges of your containers, you limit their ability to be used maliciously. Some common options for implementing container runtime security:
- Read-only root file system.
- Run container as a user with limited access to host.
- Deny privileged containers (unless specifically required) – using Admission Controllers such as OPA Gatekeeper.
- OPA Gatekeeper can also be leveraged to prevent the scheduling of containers from unapproved registries into your cluster.
- Restrict Linux kernel capabilities – such as NET_ADMIN and SYS_CHROOT.
- Restrict hostPath volume mounts to non-sensitive directories on host – can be enforced using Admission Controllers.
- Restrict kernel access using AppArmor or seccomp profiles.
- Use IAM Roles for Service Accounts (IRSA) for access to AWS APIs, rather than relying on an overly permissive EC2 instance profile.
This is part one of a two-part blog series. A subsequent post will cover the remaining five ways to protect Kubernetes clusters.