Author: Matt Venne, Solutions Director, stackArmor, Inc.
One of the biggest challenges that cloud architects and security professionals face is protecting “sensitive” data. This challenge is multiplied when that sensitive data must move between different systems for analysis and consumption. Data security is difficult in such a dynamic scenario, which requires special tooling and techniques to prevent the data from leaving its designated areas. Typically, these tools and techniques fall into the category of Data Loss Prevention, or DLP for short. The marketplace has no shortage of DLP solutions. They can be network-based, examining data in flight at central egress points such as firewalls, or agent-based, installed on a device such as a workstation to examine data at rest and programmatically identify which data is sensitive. Often the two are used in conjunction: agents identify sensitive data, and network firewalls block the data identified by the agents from leaving.
Data Loss Prevention (DLP)
Data Loss Prevention (DLP) solutions, whether custom or third-party, must be carefully evaluated for cost, compliance, performance and solution fit. DLP is a difficult problem to solve given the wide variety of scenarios. For example, a malicious insider can simply insert a USB drive into their laptop and exfiltrate the sensitive data, bypassing all DLP mechanisms on the server in the process. Countering this generally requires a layered approach to DLP, which at times degrades the customer experience because the controls are overly restrictive and seriously impact day-to-day productivity. Also, many DLP tools were simply not designed for the cloud. An additional dimension is making sure DLP solutions are acceptable to auditors and meet compliance requirements. Many of our customers operate in highly regulated markets that require FedRAMP, FISMA/RMF, StateRAMP or CMMC 2.0 compliance. Any DLP solution implemented in such environments must comply with these requirements, which restricts the set of available solutions. Specifically, the SC-7 (10) BOUNDARY PROTECTION | PREVENT UNAUTHORIZED EXFILTRATION requirement for a FedRAMP High baseline explicitly calls out the need for DLP.
Balancing usability and implementing strong security controls is a predicament faced by many of our customers in highly regulated markets. One of our fintech/financial services customers needed a strong DLP solution that met regulatory scrutiny. The customer had developed a Financial Data Analysis SaaS product hosted on AWS. Many of their clients were high-profile financial institutions that needed the data analysis that our client provided but had valid security concerns about their data and how it would be handled. But given the nature of how their product would be used, traditional DLP solutions did not seem to fit the bill – so a custom solution was required.
This blog post details our implementation of a zero trust based data diode pattern on AWS.
Summary of Client Requirements for DLP and Access Control
In order to implement an effective DLP solution one must start with defining the data flows and understanding the requirements.
- The product was designed to run on a Windows VM.
- The DLP solution has to be user-friendly without the need for installing third-party clients.
- Preference for HTTPS for communication.
- Strict multi-factor and endpoint authentication.
- Strong encryption in transit and at rest at all levels.
- Dedicated client environments – i.e. no shared disks, no shared VMs.
- 24/7 monitoring of access to the environment.
The first question that must be asked is: where is the sensitive data allowed to exist and where is it NOT allowed to exist? For our client, we concluded that the sensitive data was allowed to exist at rest and in transit in the following three areas:
- On the client’s workstations for pre-processing – This seems self-evident but bears mentioning given that end-point security is extremely critical. Even with enforcement of multi-factor authentication on access to upload and download data, we had to ensure that data could only be downloaded to approved corporate endpoints. This would prevent customers and our clients’ personnel from downloading sensitive data to personal laptops that could not be monitored and secured by their corporate security.
- In the data staging environment – When customers uploaded their data for processing, the pre-processing location of that data needed to support multifactor and endpoint authentication. Also, we wanted to support only HTTPS for file transfer to start. Many customers block egress SFTP and FTPS, so we did not want to rely on those protocols, but we wanted a solution that could integrate with them to give clients flexibility should they desire it.
- In the data processing environment – The data must reside on the actual designated VM during data processing.
Data was required to flow only in these designated areas. We had to ensure that sensitive data was only allowed to flow in approved directions to approved locations, so a “data diode” pattern was selected. Any other path for data access had to be explicitly blocked. Strong identity controls also had to be implemented to tie each transaction to a specific user authorized to perform the action.
Our Data Diode Solution for DLP with Zero Trust Access Control
stackArmor’s cloud and security engineering specialists designed a custom solution that ensured the security of the data and allowed for adequate monitoring. Key elements of the solution are described in this section.
Since the solution was hosted on Windows Server VMs, customers had to be able to access their environment using Remote Desktop Protocol (RDP). However, RDP uses port 3389/TCP for communication. It is not best practice to expose 3389 to the internet even when whitelisting IP addresses, which we were. Malicious actors and port scanners are always looking for open privileged access ports such as 3389 and 22 (for SSH) to attempt brute force attacks. We implemented a Remote Desktop Gateway (RDGW) to get around this: the Remote Desktop client encapsulates the RDP traffic in HTTPS (443/TCP), and the RDGW decapsulates the traffic and forwards it as 3389/TCP once inside the AWS VPC. This allowed us to expose only 443/TCP to the internet. It also allowed BDS to grant their customers access without requiring a VPN client installed on their corporate device.
Data Staging Environment
The only solution that met all our requirements was Amazon S3. It was accessible over the public internet (no VPN) but could still require MFA and device authentication using IAM roles and SAML authentication via an identity service like Okta or Cisco Duo. S3 supports HTTPS, SFTP (via AWS Transfer Family), and fine-grained bucket policies to control access.
A dedicated AWS KMS customer managed key was created for each customer. This key was used to encrypt all of that customer’s EBS volumes and their data in S3.
Authentication and Authorization
We used Cisco Duo Beyond for multifactor and endpoint authentication, with identities tied to an in-boundary, VPC-hosted AWS Managed AD. Since the solution was designed to run on a Windows VM, each customer received a dedicated EC2 instance which was joined to the AD domain. AD security groups and Group Policies allowed us to centrally control access: corporate users would only be allowed to access their dedicated EC2 instance. The Duo RDP MFA agent was installed on all VMs, requiring users to present valid MFA credentials to establish an RDP connection. Industry-standard methods of MFA were supported: push notification on a mobile device, YubiKey, hardware TOTP token, etc.
When customers wanted to upload their data to S3, the data-staging environment, they used SAML Authentication with Cisco Duo acting as the SAML Identity Provider. Cisco Duo would authenticate the user using MFA and the device using mTLS – the device had to have a pre-installed certificate in its trusted store to present to Cisco Duo. If the device does not have this certificate, it is not allowed to authenticate even if valid MFA credentials are presented. After authentication, Cisco Duo would grant the customers a dedicated IAM role with specific IAM policies that controlled the following:
- Only allowed upload to a single staging bucket via Public Internet
- Could only download the data from that bucket when the data is accessed via a VPC endpoint – i.e. the customer had to be RDP’d into their dedicated instance to download the data from the bucket. Customers would access the S3 browser from inside the VPC using a web browser, assuming the same role via Duo SAML as in the first item.
- Could only upload to a post-processing bucket via the VPC endpoint. The report generated by the BDS analysis tool was what was uploaded in this stage.
- Could only download data from that post-processing bucket over the internet.
- Ability to encrypt and decrypt with dedicated KMS Key.
Additionally, specific bucket policies were applied to each bucket that ensured the above flow. GetObject and PutObject actions had to originate from a specific VPC endpoint or were denied from VPC endpoints depending on the use case.
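The flow above could be expressed with bucket policies along the following lines. This is a minimal sketch, not the policies that were deployed: the bucket names and the VPC endpoint ID are hypothetical placeholders, and the helper functions are introduced only for illustration. It relies on the `aws:SourceVpce` condition key, which is present only on requests that arrive through a VPC endpoint.

```python
import json

# Hypothetical placeholders: the real bucket names and endpoint ID are not given in this post.
STAGING_BUCKET = "example-client-staging"
REPORTS_BUCKET = "example-client-reports"
VPC_ENDPOINT_ID = "vpce-0123456789abcdef0"


def deny_unless_from_endpoint(bucket, action, vpce_id):
    """Deny `action` unless the request arrives through the given VPC endpoint."""
    return {
        "Sid": "Deny" + action.split(":")[1] + "OutsideVpce",
        "Effect": "Deny",
        "Principal": "*",
        "Action": action,
        "Resource": f"arn:aws:s3:::{bucket}/*",
        "Condition": {"StringNotEquals": {"aws:SourceVpce": vpce_id}},
    }


def deny_if_from_any_endpoint(bucket, action):
    """Deny `action` when the request arrives through any VPC endpoint.

    aws:SourceVpce exists only on requests made through a VPC endpoint,
    so a Null check of "false" (key is present) matches exactly those requests.
    """
    return {
        "Sid": "Deny" + action.split(":")[1] + "ViaVpce",
        "Effect": "Deny",
        "Principal": "*",
        "Action": action,
        "Resource": f"arn:aws:s3:::{bucket}/*",
        "Condition": {"Null": {"aws:SourceVpce": "false"}},
    }


def bucket_policy(statements):
    """Wrap statements in a policy document."""
    return {"Version": "2012-10-17", "Statement": statements}


# Staging bucket: upload from the internet, download only from inside the VPC.
staging_policy = bucket_policy([
    deny_if_from_any_endpoint(STAGING_BUCKET, "s3:PutObject"),
    deny_unless_from_endpoint(STAGING_BUCKET, "s3:GetObject", VPC_ENDPOINT_ID),
])

# Post-processing bucket: upload only from inside the VPC, download from the internet.
reports_policy = bucket_policy([
    deny_unless_from_endpoint(REPORTS_BUCKET, "s3:PutObject", VPC_ENDPOINT_ID),
    deny_if_from_any_endpoint(REPORTS_BUCKET, "s3:GetObject"),
])

if __name__ == "__main__":
    print(json.dumps(staging_policy, indent=2))
```

These statements only deny; the matching allows would come from the IAM role described in the list above. A production policy would likely also need carve-outs for administrative and scanning roles, which are omitted here.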
When accessing a remote system over RDP, the client can redirect resources on the local machine, such as folders, clipboards and USB devices, to the remote machine. This redirection was explicitly blocked by Group Policies on all computers in the domain, so no one could copy a file over RDP from their secure data processing machine to their local machine.
Network Traffic Policies
All EC2 instances received a VPC Security Group that prevented East-West communication between instances in the environment. Even though users only had credentials to access their approved instances, the VPC security groups prevented access at the network layer, in addition to the application layer, which was controlled by Active Directory.
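One way to get this East-West isolation is to give the instance security group no self-referencing rule, so that the only permitted RDP source is the gateway. The sketch below assumes that arrangement, which the post does not spell out. The security group IDs are hypothetical, the rule uses the `IpPermissions` shape accepted by the EC2 `AuthorizeSecurityGroupIngress` API, and `is_ingress_allowed` is a simplified local check rather than anything AWS provides.

```python
RDP_PORT = 3389

# Hypothetical IDs for illustration only.
RDGW_SG_ID = "sg-0aaaaaaaaaaaaaaaa"      # security group of the Remote Desktop Gateway
WORKLOAD_SG_ID = "sg-0bbbbbbbbbbbbbbbb"  # security group attached to client instances


def workload_ingress_rules(rdgw_sg_id):
    """Ingress rules for client instances: RDP from the gateway's group only.

    There is deliberately no rule referencing the workload group itself,
    so instances sharing this group cannot reach each other.
    """
    return [
        {
            "IpProtocol": "tcp",
            "FromPort": RDP_PORT,
            "ToPort": RDP_PORT,
            "UserIdGroupPairs": [{"GroupId": rdgw_sg_id}],
        }
    ]


def is_ingress_allowed(rules, source_sg_id, port):
    """Simplified check: does any TCP rule admit `source_sg_id` on `port`?"""
    for rule in rules:
        if rule["IpProtocol"] != "tcp":
            continue
        if not rule["FromPort"] <= port <= rule["ToPort"]:
            continue
        if any(pair["GroupId"] == source_sg_id for pair in rule["UserIdGroupPairs"]):
            return True
    return False


rules = workload_ingress_rules(RDGW_SG_ID)
```

With these rules, `is_ingress_allowed(rules, RDGW_SG_ID, 3389)` is true, while a peer instance in `WORKLOAD_SG_ID` is refused on every port.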
AWS Network Firewall was deployed to filter all egress traffic and only allow communication to approved domains. This list comes down to essentially Cisco Duo, AWS, Windows Update and anti-virus update domains. It was carefully curated to ensure that file-sharing, web-based email, and other potential vectors of exfiltration were not allowed. All ingress and egress traffic in the secure VPC environment was centrally routed via Transit Gateway to this central egress point for inspection by the Network Firewall. Network Firewall can filter domains for both HTTP and HTTPS traffic. Since HTTP is not encrypted, the firewall simply inspects the layer 7 HTTP Host header. With HTTPS these headers are encrypted, so the Network Firewall instead inspects the Server Name Indication (SNI) field in the TLS Client Hello. Other protocols are not supported for domain filtering, because FQDNs are sometimes not visible in the protocol (only IP addresses are) and because of the complexity of codifying the inspection of each protocol. To get around this, only NTP (123/UDP), HTTP (80/TCP) and HTTPS (443/TCP) traffic is allowed to traverse the firewall via stateless inspection. To put it simply, only approved ports and approved domains are reachable from the secure enclave. This was deployed via a custom CDK package written by stackArmor to ensure repeatability and testing of the deployment prior to implementation.
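A domain allowlist of this kind maps onto a Network Firewall stateful rule group whose source is a domain list. The sketch below is not the stackArmor CDK package. It only builds the `RuleGroup` body in the shape the `CreateRuleGroup` API accepts, and the three targets are illustrative stand-ins for the curated list. `hostname_is_allowed` is a hypothetical local approximation of the match, useful for reasoning about which hostnames a list would admit.

```python
# Illustrative targets only; the curated list used in the deployment is not reproduced here.
ALLOWED_DOMAINS = [
    ".amazonaws.com",
    ".duosecurity.com",
    ".windowsupdate.com",
]


def domain_allowlist_rule_group(targets):
    """Build a stateful rule group body that allows only the listed domains.

    TLS_SNI covers HTTPS (server name in the Client Hello),
    HTTP_HOST covers plain HTTP (Host header).
    """
    return {
        "RulesSource": {
            "RulesSourceList": {
                "Targets": list(targets),
                "TargetTypes": ["TLS_SNI", "HTTP_HOST"],
                "GeneratedRulesType": "ALLOWLIST",
            }
        }
    }


def hostname_is_allowed(hostname, targets):
    """Approximate the allowlist match locally (an assumption, for reasoning only).

    A target with a leading dot is treated as the domain plus all subdomains;
    any other target must match the hostname exactly.
    """
    host = hostname.lower().rstrip(".")
    for target in targets:
        target = target.lower()
        if target.startswith("."):
            if host == target[1:] or host.endswith(target):
                return True
        elif host == target:
            return True
    return False


rule_group = domain_allowlist_rule_group(ALLOWED_DOMAINS)
```

Matching on the leading dot rather than a bare suffix matters: a lookalike such as `evil-amazonaws.com` does not end in `.amazonaws.com` and is rejected.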
The only vector that did not flow through the Network Firewall was traffic to Amazon S3 from inside the VPC. We leveraged a Gateway VPC Endpoint to capture all traffic to S3 from the secure enclave. We then applied a specific VPC endpoint policy, which allows fine-grained zero-trust access control to S3 from our VPC. The VPC endpoint policy blocked all traffic to buckets not owned by the client. This ensured that even if someone was able to use AWS IAM credentials not brokered by Duo to access the AWS API, the VPC endpoint would prevent them from exfiltrating data to AWS accounts they owned. Additionally, since all S3 buckets followed a strict naming convention, we could control which S3 API actions were allowed on their respective buckets, as outlined in the Authentication and Authorization section above.
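An endpoint policy built on a naming convention might look like the sketch below. The `<prefix>-staging` and `<prefix>-reports` convention is invented for illustration, since the post does not state the real one, and `is_allowed` is a deliberately simplified evaluator (Allow statements only, no conditions) that shows the implicit deny for any bucket outside the convention.

```python
from fnmatch import fnmatchcase


def s3_endpoint_policy(client_prefix):
    """Gateway endpoint policy that allows only the client's own buckets.

    Assumes a hypothetical naming convention: <prefix>-staging and <prefix>-reports.
    Anything not explicitly allowed here is implicitly denied at the endpoint.
    """
    staging = f"arn:aws:s3:::{client_prefix}-staging"
    reports = f"arn:aws:s3:::{client_prefix}-reports"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DownloadStagedData",
                "Effect": "Allow",
                "Principal": "*",
                "Action": ["s3:GetObject"],
                "Resource": [f"{staging}/*"],
            },
            {
                "Sid": "UploadReports",
                "Effect": "Allow",
                "Principal": "*",
                "Action": ["s3:PutObject"],
                "Resource": [f"{reports}/*"],
            },
            {
                "Sid": "ListClientBuckets",
                "Effect": "Allow",
                "Principal": "*",
                "Action": ["s3:ListBucket"],
                "Resource": [staging, reports],
            },
        ],
    }


def is_allowed(policy, action, resource_arn):
    """Simplified evaluation: allowed only if some Allow statement matches."""
    for statement in policy["Statement"]:
        if statement["Effect"] != "Allow":
            continue
        if action not in statement["Action"]:
            continue
        if any(fnmatchcase(resource_arn, pattern) for pattern in statement["Resource"]):
            return True
    return False


policy = s3_endpoint_policy("example-client")
```

Under this policy a `PutObject` to `arn:aws:s3:::attacker-bucket/data.csv` matches no statement and is refused at the endpoint, regardless of which credentials signed the request.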
Alerting on Potential Data Loss
Since we treated all data inside the VPC as sensitive to start, there was no need to implement a DLP agent on servers. The flow of the data was controlled by other mechanisms. Because bucket policies prevented PutObject actions to S3 via VPC endpoints on all but approved buckets, the only way sensitive data could be exfiltrated to S3 would be if it was uploaded to these approved buckets, which were intended to hold only reports (not raw data). On every PutObject action on these buckets, a targeted Amazon Macie scan would be triggered, which would decrypt the files (encrypted with dedicated KMS keys) and scan them for patterns of known sensitive data such as credit card numbers and social security numbers. Each instance of sensitive data in a file would be scored on a confidence level, i.e. how confident Macie is that it is actually sensitive data and not something that just looks like it (e.g. 111-11-1111 is probably a social security number, but we can be more confident if the letters SSN or Soc. Sec. occur within 50 characters of that string). If Macie detected sensitive data above a certain threshold, an alert was sent to a Slack channel for manual inspection. Other alerts that were sent to Slack: all file uploads to customer buckets, all AWS Console log-on events, and all RDP log-on events.
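The keyword-proximity idea in the SSN example can be illustrated with a few lines of Python. This is not Macie's algorithm or API, only a toy model of the reasoning: the regular expression, the keyword list, the two confidence values and the alert threshold are all assumptions chosen for the example, while the 50-character window comes from the text above.

```python
import re

SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
KEYWORDS = ("ssn", "soc. sec.", "social security")  # assumed keyword list
PROXIMITY = 50          # characters on either side of the match, as in the example above
BASE_CONFIDENCE = 0.5   # assumed score for a bare pattern match
BOOSTED_CONFIDENCE = 0.9  # assumed score when a keyword is nearby


def score_ssn_candidates(text):
    """Return (match, confidence) for every SSN-shaped string in `text`."""
    lowered = text.lower()
    results = []
    for match in SSN_PATTERN.finditer(text):
        start = max(0, match.start() - PROXIMITY)
        end = min(len(text), match.end() + PROXIMITY)
        window = lowered[start:end]
        nearby_keyword = any(keyword in window for keyword in KEYWORDS)
        confidence = BOOSTED_CONFIDENCE if nearby_keyword else BASE_CONFIDENCE
        results.append((match.group(), confidence))
    return results


def needs_alert(text, threshold=0.8):
    """True when any candidate meets the (assumed) alerting threshold."""
    return any(confidence >= threshold for _, confidence in score_ssn_candidates(text))
```

For instance, `needs_alert("SSN: 111-11-1111")` is true, while `needs_alert("order ref 111-11-1111")` is false because no keyword appears near the match.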
If you have a cloud security or compliance question or project, please visit us at stackArmor.com or complete the form to schedule a technical discussion.