Forcepoint

By Kunal Sharma, Senior Solutions Architect – AWS
By Mehmet Bakkaloglu, Principal Solutions Architect – AWS
By Jaimen Hoopes, VP (Product Management) – Forcepoint

Over the past year, many customers have built generative AI proofs-of-concept and solutions. But now, as customers look to promote these proofs-of-concept into production and roll out these solutions across their organizations, they are increasingly becoming conscious of how their sensitive data is used and protected.

A common method of implementing a generative AI-powered solution is Retrieval-Augmented Generation (RAG), where an organization’s internal knowledge base is referenced to contextualize and optimize the output of a large language model (LLM). In this scenario, loss of sensitive data or personally identifiable information (PII) can occur through the prompts supplied by users, the knowledge base, or the data used to customize the underlying model.

Forcepoint, an AWS Partner, offers a range of products in AWS Marketplace, delivering modern cybersecurity by proactively safeguarding critical data and IP. Forcepoint utilizes a broad set of AWS services such as Amazon Elastic Compute Cloud (Amazon EC2), Amazon Simple Storage Service (Amazon S3), Amazon Redshift, Amazon OpenSearch Service, and Amazon Relational Database Service (Amazon RDS).

In this blog post, we cover how Forcepoint Data Loss Prevention (DLP) provides a necessary layer of protection on top of the guardrails and comprehensive security features provided by AWS services, helping improve the security posture of customers implementing generative AI solutions.

Forcepoint Data Loss Prevention

Forcepoint DLP enables users to discover, classify, prioritize, protect and monitor their sensitive information effectively while maintaining a seamless user experience without disrupting normal business operations.

Figure 1: Data Security Lifecycle

The key advantages of Forcepoint DLP are:

  • Breadth and depth by industry and geography – 1,700+ built-in classifiers and pre-defined templates for industries like healthcare and finance, covering 80+ countries. These enable the creation of reusable policies across organizational user groups, with rules that automatically adapt to specific industry or regional regulations.
  • Unified management across all data channels – Create data security policies once and apply them to web traffic, SaaS applications, email, and endpoints. This includes real-time audit and investigation capabilities based on risk by user, category, and channel of policy violation.
  • Broad content coverage – 900+ file types, 300+ NLP scripts, and 130+ indicators of behavior, protecting PII and protected health information (PHI), company financials, trade secrets, credit card data, and other sensitive customer data, even in images.
  • Consistent enforcement – whether devices are online or offline.
  • Real-time inline and API protection with risk-adaptive data protection based on individual user’s risk level.

Forcepoint DLP Components

The Management Server hosts Forcepoint Security Manager, core DLP components, and provides advanced analysis capabilities. The Forcepoint App Data Security API secures custom applications such as GenAI applications being developed internally and enables organizations to analyze file and data traffic within these applications. It allows the application of DLP policies, making sure that sensitive information is protected while maintaining operational efficiency.

Endpoint Agents are software installed on end-user devices to enable data loss prevention. They monitor and enforce policies across data channels, operate both online and offline, analyze content for sensitive information, and can encrypt data, block actions, and generate alerts, even when the device is disconnected from the internet or corporate network.

Let’s look at how the Forcepoint App Data Security API can allow, block, or quarantine the data supplied to your AWS GenAI solutions based on DLP recommendations.
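As a minimal sketch of how an application might call such an inspection service, the snippet below posts content to a DLP endpoint and maps the verdict to an action. The endpoint URL and the request/response field names are illustrative assumptions, not the actual Forcepoint App Data Security API contract; consult the Forcepoint documentation for the real schema and authentication scheme.

```python
import json
import urllib.request

# Hypothetical endpoint; the real App Data Security API URL and
# payload shape come from your Forcepoint deployment.
DLP_API_URL = "https://dlp.example.com/api/v1/inspect"

ACTIONS = {
    "allow": "forward content to the model",
    "block": "reject the request and notify the user",
    "quarantine": "hold the content for security review",
}

def decide(verdict: str) -> str:
    """Map a DLP verdict to the action the GenAI app should take,
    failing closed (block) on anything unexpected."""
    return ACTIONS.get(verdict, ACTIONS["block"])

def inspect(text: str, user: str) -> str:
    """Send content to the DLP service and return its verdict."""
    payload = json.dumps({"content": text, "user": user}).encode()
    req = urllib.request.Request(
        DLP_API_URL, data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp).get("verdict", "block")
```

Failing closed on an unknown or missing verdict is a deliberate choice here: if the DLP service is unreachable or returns something unexpected, the application treats the content as blocked rather than letting it through uninspected.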

Figure 2: Forcepoint Data Loss Prevention (DLP) Solution Overview

AWS Generative AI Services and Guardrails

AWS offers a range of services for customers to build generative AI applications securely. AWS App Studio allows users to build enterprise-grade applications using natural language. Amazon Bedrock is a fully managed AWS service that offers a choice of high-performing foundation models (FMs) from leading AI companies like Anthropic, Cohere, Meta, and Amazon through a single API, along with a broad set of capabilities such as Amazon Bedrock Agents to build generative AI applications. With Amazon SageMaker, customers can either develop their own FMs or choose from the 400+ models available in Amazon SageMaker JumpStart and deploy them to a single-tenant endpoint.

All of these services benefit from the comprehensive security, monitoring, and audit features provided by AWS. Note that customer data is not shared with third-party model providers.

Focusing specifically on data loss, Amazon Bedrock Guardrails can evaluate user input, detect sensitive content such as PII, and either reject or redact it. These guardrails can also be used with Amazon Bedrock models, third-party models, self-hosted models, and Amazon SageMaker models. Customers can also use third-party guardrails such as Llama Guard, which is available through Amazon SageMaker JumpStart.
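The sketch below shows one way to evaluate text against an existing guardrail with the Bedrock `ApplyGuardrail` API via boto3. The guardrail ID and version are placeholders you would replace with those of a guardrail you have created with a PII filter; the response-shape details (a redacted text in `outputs` when the guardrail intervenes with an anonymize action) reflect our reading of the API and should be verified against the current Bedrock documentation.

```python
GUARDRAIL_ID = "gr-example123"   # placeholder: your guardrail's ID
GUARDRAIL_VERSION = "1"          # placeholder: a published guardrail version

def summarize(resp: dict, original: str) -> dict:
    """Reduce an ApplyGuardrail response to an intervened flag plus text.

    When a PII filter is configured to anonymize, the response's
    `outputs` carry the redacted text; otherwise the original text
    passes through unchanged.
    """
    intervened = resp.get("action") == "GUARDRAIL_INTERVENED"
    outputs = resp.get("outputs") or []
    text = outputs[0]["text"] if intervened and outputs else original
    return {"intervened": intervened, "text": text}

def check_text(text: str, source: str = "INPUT") -> dict:
    """Evaluate text against the guardrail (source="OUTPUT" for responses)."""
    import boto3  # imported here so summarize() stays dependency-free
    runtime = boto3.client("bedrock-runtime")
    resp = runtime.apply_guardrail(
        guardrailIdentifier=GUARDRAIL_ID,
        guardrailVersion=GUARDRAIL_VERSION,
        source=source,
        content=[{"text": {"text": text}}],
    )
    return summarize(resp, text)
```

Because `ApplyGuardrail` takes raw text rather than a model ID, the same call can screen inputs to, and outputs from, models hosted anywhere, which is what makes the guardrail reusable across Bedrock, SageMaker, and self-hosted deployments.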

Forcepoint DLP with Amazon Bedrock and Amazon SageMaker

Amazon Bedrock Guardrails provides extensive protection for your GenAI applications, but when combined with Forcepoint DLP’s capabilities, customers benefit even more from:

  • Centralized control and guardrails for their data – web, email, network, cloud, endpoint, and GenAI applications.
  • Capability to dynamically adapt policy enforcement based on user risk.
  • In-depth identification and tracking of sensitive data.
  • Automated incident response workflows.
  • Compliance adherence and reporting for various regulations.

When building and deploying applications powered by FMs from Amazon Bedrock or Amazon SageMaker, enforcing guardrails becomes essential when the applications are made available to internal or external users.

With customized FMs, the risk is heightened: because the model has been fine-tuned with customer data, its responses can lead to data leakage. The burden of enforcing and maintaining compliance with guardrails therefore falls on the development team.

With pre-defined templates and 1,700+ classifiers, developers can create DLP policies once and deploy them many times. They can invoke these policies through the API both before inference and after the response is generated, offering better protection for their AI applications.

Figure 3: Forcepoint DLP Policy templates and rules

In Figure 3, we see the Forcepoint Security Manager creating and managing data usage policies. The outlined section shows rules designed to protect your organization’s internal data from being exfiltrated to or from AI systems.

In this case, your developers might be using AI to refine and debug their code. To address this, Forcepoint DLP offers out-of-the-box classifiers, such as the Python source code classifiers and fingerprinting technology. These classifiers are designed to identify specific company-based intellectual property and other proprietary identifiers.

By leveraging these pre-built classifiers, organizations can help prevent sensitive code from being entered into AI systems.

The figure also illustrates how specific actions can be implemented to block, allow, or audit data transfers based on these classifications. This approach helps maintain data security while allowing for controlled use of AI tools in the development process.

Figure 4: Amazon Bedrock & Forcepoint DLP integration high-level overview

In Figure 4, a Bedrock user initiates the process by sending a prompt to a custom application that integrates Amazon Bedrock’s models and knowledge base. The application first triggers an ingress inspection by forwarding the prompt to Forcepoint’s App Data Security API within their DLP system. Here, the DLP Protector scrutinizes the prompt against predefined policies and conditions. Based on this analysis, the system decides to either allow the prompt, block it, or request additional action or justification from the user. In this scenario, the user is prompted to provide justification, which is subsequently logged in a database for compliance and audit purposes.

Once cleared, the prompt is processed by the Bedrock model, generating a response. This response then undergoes a similar egress inspection process through Forcepoint’s DLP. The system again evaluates the content against security policies, determining whether to allow, block, or redact the response. Here, the DLP identifies sensitive information and enforces a redaction policy. Consequently, the user receives a redacted version of the response, with sensitive or unauthorized information removed while still preserving the permissible content.

This dual-inspection approach helps ensure comprehensive data protection and compliance throughout the entire interaction with the AI system.
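The dual-inspection flow can be sketched as a thin wrapper around the model call. The `inspect` and `invoke` callables are injected rather than hard-coded, since the exact Forcepoint API contract and the Bedrock client wiring are deployment-specific; the verdict vocabulary ("allow", "block", "redact") is an assumption for illustration.

```python
def guarded_invoke(prompt, inspect, invoke):
    """Ingress inspection -> model inference -> egress inspection.

    inspect(text, direction) returns a (verdict, text) pair, where
    verdict is "allow", "block", or "redact" (redact carries cleaned
    text); invoke(prompt) calls the model, e.g. via the Bedrock
    Converse API.
    """
    verdict, cleaned = inspect(prompt, "ingress")
    if verdict == "block":
        return "Request blocked by data security policy."
    answer = invoke(cleaned)
    verdict, cleaned = inspect(answer, "egress")
    if verdict == "block":
        return "Response withheld by data security policy."
    return cleaned  # redacted text when verdict is "redact", else unchanged
```

In a real deployment, `invoke` might wrap a `boto3` `bedrock-runtime` Converse call, and `inspect` the App Data Security API; keeping them as parameters also makes the policy flow straightforward to unit-test with fakes.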

Forcepoint DLP Pre-defined Policy Highlights

Forcepoint’s Protected Health Information (PHI) Predefined Policies and Classifiers exemplify how they help customers automatically protect their PHI with minimal effort. These policies and classifiers encompass PHI formats and items for multiple countries, including the US, UK, Sweden, and India to identify patterns in patient profiles, forms, names, etc.

With Financial Regulations Predefined Policies and Classifiers customers get predefined policies for major worldwide regulations such as EU Finance, FCRA, FFIEC, FSA SYSC, NYSE, and SEC, among others. The policies incorporate rules designed to detect sensitive financial data, including account numbers, passwords, and magnetic credit card tracks. Furthermore, Forcepoint has enhanced these policies with additional rules that can identify combinations of PII, such as credit card details paired with identification numbers.

The complete list of Forcepoint DLP Predefined Policies and Classifiers can help address your organization’s security requirements with minimal effort. Additionally, customers can edit these or implement their own DLP policies and classifiers.

Customer Success Stories

Forcepoint provides robust cybersecurity solutions to thousands of companies across industries worldwide, including manufacturing, education, government, banking, and oil and gas. For more information on the wide range of customers that trust Forcepoint to secure their sensitive data, see Forcepoint’s customer stories.

Forcepoint DLP is recognized by industry analysts. Forrester named them a Leader in the Forrester Wave: Data Security Platforms, Q1 2023 report while Radicati calls them a Top Player in their Data Loss Prevention (DLP) Market Quadrant for 2024.

Conclusion

Forcepoint DLP offers a simple API for customers to implement data loss prevention for their generative AI solutions in a consistent manner, taking advantage of 1,700+ classifiers and pre-defined templates. Forcepoint DLP, coupled with the guardrails and comprehensive security features offered by AWS, can help you implement generative AI securely and safely across your organization.

To find out more and receive an assessment of your generative AI readiness across data security, connect with Forcepoint.


Forcepoint – AWS Partner Spotlight

Forcepoint is an AWS Partner that simplifies security for global businesses. Forcepoint’s all-in-one, truly cloud-native platform makes it easy to adopt Zero Trust and prevent the theft or loss of sensitive data and intellectual property no matter where people are working.

Contact Forcepoint | Partner Overview | AWS Marketplace
