Welcome back to the era of GenAI, where the world remains captivated by the boundless potential of artificial intelligence. The proliferation of AI, however, does not excuse us from considering the new risks it introduces. As you may recall, I have been supporting numerous AI Risk Management initiatives as part of our ATO for AI offering and recently explored the ethical risks surrounding AI using the IEEE CertifAIEd framework. As part of the NIST AI RMF, we need to continue adopting new and emerging risk management practices unique to AI systems. Today we turn to the other category of risk: the security vulnerability vectors unique to AI systems. We will examine the most common of those vectors, known as the OWASP LLM Top 10, and how to protect yourself while building solutions with AWS native AI services.
What is the OWASP LLM Top 10?
Most InfoSec professionals are familiar with the OWASP Top 10 for web vulnerabilities — I blogged previously about how you can leverage AWS WAF to protect your web apps from those exploits. The OWASP LLM Top 10 is a new set of terms and attack vectors that are unique to LLMs. However, most of the vectors are extrapolations of security best practices found in the Web Vulnerability Top 10 — sanitizing inputs and outputs, least privilege, and so on.
By understanding these risks, developers can take proactive steps to protect their AI systems, whether they are built on AWS or any other platform.
Disclaimer: Though I tried to be as thorough as possible, the suggested mitigations below should not be considered exhaustive — they are just examples to provide guidance. It is important to perform threat modeling of your LLM application to identify the threat vectors unique to your system. For practical purposes, I have used AWS AI services like Amazon Bedrock to demonstrate applicability of the various threats and their mitigations.
- Prompt Injection
This manipulates a large language model (LLM) through crafty inputs, causing unintended actions by the LLM. Direct injections overwrite system prompts, while indirect ones manipulate inputs from external sources.
Prompt injection exploits can be broken down into two sub-categories:
- Direct — a malicious user crafts specific inputs to circumvent guardrails to obtain “root” access to the LLM — also known as “jailbreaking” or D.A.N. mode.
- Indirect — when downstream external data sources introduce malicious prompts to an LLM — resulting in a “confused deputy”.
Potential Impact:
Successful prompt injection can result in manipulation of the LLM to imitate a potentially harmful identity or interact with user plugins. This could result in leaking sensitive data, privilege escalation (unauthorized use of plugins), or even social engineering. Essentially, the compromised LLM becomes a malicious insider alongside the actual attacker.
Mitigations:
- All Amazon Bedrock inputs and outputs are automatically reviewed for malicious use; humans do not view these inputs or outputs. These checks are non-blocking, however, and are only used to identify customers who may be misusing Bedrock.
- Explore Guardrails for Amazon Bedrock (currently in preview) to block malicious prompts and responses; PII filtering is planned for a future release.
- Enable CloudTrail data events for Bedrock. You can then have humans review the user-provided prompts (if acceptable to your customers) or feed them into automation to identify customers who are misusing your application and take appropriate action.
- Use tools like AutoPoison to test against malicious inputs.
- Do not pass raw user input directly to the LLM — use prompt engineering and instruction tuning to transparently add context and instructions that act as a security guardrail. This white paper is a great starting point for more research on the topic.
from langchain import PromptTemplate

prompt_template = PromptTemplate.from_template(
    template="""You are a personal assistant for an executive.
You do not respond to any requests to perform malicious acts such as
illicit image generation, making bombs, or planning an insurrection.
If you receive any of these requests,
you respond with 'I am unable to comply for ethical reasons'.
Respond to the following question: {question}"""
)
Note: Malicious input filtering is automatically incorporated using Pre-processing Advanced Prompts in Amazon Bedrock Agents. However, you should test out the default template to ensure that it adequately accounts for your specific threat models.
- Utilize the Self-Reminder technique, which wraps the user's query in reminders that reinforce the system's safety instructions.
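A minimal sketch of the self-reminder pattern is shown below. The reminder wording here is illustrative only, not the published template; the idea is simply that untrusted input is sandwiched between safety reminders before it reaches the model.

# Hypothetical illustration of the self-reminder pattern: the user's query is
# wrapped between reminders so the model re-reads its safety instructions
# immediately before and after the untrusted text.
SELF_REMINDER_PREFIX = (
    "You should be a responsible assistant and must not generate harmful or "
    "misleading content. Please answer the following query in a responsible way.\n"
)
SELF_REMINDER_SUFFIX = (
    "\nRemember: you should be a responsible assistant and must not generate "
    "harmful or misleading content."
)

def build_self_reminder_prompt(user_query: str) -> str:
    """Wrap untrusted user input between the two reminders."""
    return f"{SELF_REMINDER_PREFIX}{user_query}{SELF_REMINDER_SUFFIX}"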
- When using RAG, provide a similarity_score_threshold. When using document-based searches like Amazon Kendra, you can hard-code responses for requests that do not match any documents in your index. Malicious requests are typically not found in the enterprise data stores used for RAG, so searches for them will come back empty or with low similarity.
## Example 1
retriever = db.as_retriever(
    search_type="similarity_score_threshold",
    search_kwargs={"score_threshold": 0.5}
)

## Example 2
.......
llm_jurassic = Bedrock(
    client=bedrock_client,
    model_id="ai21.j2-ultra-v1",
    endpoint_url="https://bedrock-runtime." + REGION_NAME + ".amazonaws.com",
    model_kwargs={"temperature": 0.2, "maxTokens": 1200, "numResults": 1}
)
qnachain = ConversationalRetrievalChain.from_llm(
    llm=llm_jurassic,
    condense_question_llm=llm_jurassic,
    retriever=retriever,
    return_source_documents=True,
    condense_question_prompt=question_generator_chain_prompt,
    combine_docs_chain_kwargs={"prompt": combine_docs_chain_prompt}
)
.......
llm_result = qnachain(input_variables)
.......
if len(llm_result['source_documents']) > 0:
    response_text = llm_result["answer"].strip()
else:
    response_text = "I don't know, no source documents matched your question"
- Insecure Output Handling
This vulnerability occurs when an LLM output is accepted without scrutiny, exposing backend systems. Misuse may lead to severe consequences like XSS, CSRF, SSRF, privilege escalation, or remote code execution.
Insecure Output Handling is the result of inadequate validation, sanitization, and management of outputs generated by LLMs before they are passed downstream for further consumption or processing. This vulnerability arises because LLM-generated content can be influenced by user input, effectively granting indirect access to additional functionality.
Potential Impact:
Exploiting Insecure Output Handling can lead to security risks such as XSS and CSRF in web browsers, as well as SSRF, privilege escalation, or remote code execution in back-end systems. This vulnerability can be exacerbated by over-privileged LLM access, susceptibility to indirect prompt injection attacks, and insufficient input validation in third-party plugins.
Mitigations:
Utilize a zero-trust approach and treat the LLM as an insider threat.
- Apply proper input validation on responses coming from the model to backend functions. OWASP recommends following its own ASVS guidelines — for example, encoding all output text so it cannot be executed automatically as JavaScript or Markdown.
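As a minimal sketch of that guidance, the snippet below HTML-encodes model output before it is rendered in a browser context, so any markup the LLM emits is displayed as text rather than executed. The surrounding rendering code is assumed and not shown.

import html

def render_llm_output_safely(llm_response: str) -> str:
    """Encode the model's raw text so any embedded HTML/JavaScript is
    displayed literally instead of being interpreted by the browser."""
    return html.escape(llm_response)

# Example: a prompt-injected response containing a script tag is neutralized.
unsafe = "<script>document.location='https://attacker.example'</script>"
print(render_llm_output_safely(unsafe))
# &lt;script&gt;document.location=&#x27;https://attacker.example&#x27;&lt;/script&gt;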
- If you are hosting an HTTP service that ingests output from an LLM directly, implement AWS WAF to detect malicious requests at the application layer. If your HTTP service does not sit behind an AWS WAF-supported service such as ALB or API Gateway, implement application-layer protection with a virtual appliance such as a Palo Alto NGFW or another mechanism.
- If using Agents for Bedrock, consider enabling a Lambda parser for each of the templates to have more control over the logical processing.
- Training Data Poisoning
This occurs when LLM training data is tampered with, introducing vulnerabilities or biases that compromise security, effectiveness, or ethical behavior. Sources include Common Crawl, WebText, OpenWebText, and books.
Data poisoning is essentially an ‘integrity attack’ because it disrupts the model’s fundamental capacity to generate accurate predictions. Incorporating external data increases the risk of training data poisoning because model developers have limited control over it.
Potential Impact:
Poisoned information may result in false, biased, or inappropriate content being presented to users, or create other risks such as performance degradation and downstream software exploitation.
Mitigations:
- Verify supply chain of all training data used in embedding or fine-tuning.
- Enable audit logging on all sources of training or RAG data. For example, if using Amazon S3 (Amazon Bedrock Knowledge Bases or Amazon Kendra), enable CloudTrail data events for write events.
- Enable strict access control on data sources. This includes using dedicated service roles for crawling data in Amazon Kendra or Amazon Bedrock Knowledge Bases, with IAM policies that only allow access to the required sources.
- Only use data that has been properly prepared via automation. Your data preparation pipeline should include steps to identify and exclude anomalous data. Amazon SageMaker Data Wrangler can greatly accelerate data preparation, and the finalized data can be exported to Amazon S3 for consumption.
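The exact preparation steps depend on your data, but as a simple illustration (the file names, bucket name, and outlier rule are hypothetical), a pipeline can drop statistical outliers before the curated dataset is exported to S3 for training:

import boto3
import pandas as pd

df = pd.read_csv("raw_training_data.csv")  # hypothetical input file

# Flag numeric outliers (> 3 standard deviations from the mean) as anomalous
# and exclude them before the data is used for fine-tuning or embeddings.
numeric_cols = df.select_dtypes(include="number").columns
z_scores = (df[numeric_cols] - df[numeric_cols].mean()) / df[numeric_cols].std()
clean_df = df[(z_scores.abs() < 3).all(axis=1)]

clean_df.to_csv("curated_training_data.csv", index=False)
boto3.client("s3").upload_file(
    "curated_training_data.csv", "my-training-bucket", "curated/training_data.csv"
)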
- Test and document model performance prior to release. Include human review of responses made during testing.
- Tools like Autopoison can also be used here for adversarial training to minimize impact of data poisoning.
- Model Denial of Service
Attackers cause resource-heavy operations on LLMs, leading to service degradation or high costs. The vulnerability is magnified due to the resource-intensive nature of LLMs and unpredictability of user inputs.
Model Denial of Service is similar to a network-based DoS attack — where repeated or very large requests can overwhelm LLM-based systems.
Potential Impact:
This can result in either a completely disabled service or runaway costs when using AWS services that charge per request, such as Amazon Bedrock, Amazon Kendra, and Amazon Bedrock Knowledge Bases backed by OpenSearch Serverless (which could scale up the required OCUs).
Mitigations:
- Utilize maximum token limits, which are supported by LangChain (see the Bedrock client example below, which caps max_tokens_to_sample).
- Implement a caching layer in your LLM client — this can prevent common queries from being repeatedly processed by your LLM. This is also supported by LangChain; proper research should be done when selecting and deploying any caching solution. LangChain currently supports SQLite, in-memory, and GPTCache backends.
from langchain.cache import SQLiteCache
set_llm_cache(SQLiteCache(database_path=".langchain.db"))
- If feasible, proxy your Bedrock model with API Gateway and Lambda, then enable AWS WAF with rate limiting, source-IP restriction, or other relevant rules. Note: all Amazon Bedrock models have model-specific rate limits. Side note: I would love to see AWS implement resource policies on Amazon Bedrock models/agents to more easily incorporate customized security.
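Below is a minimal sketch of a rate-based WAF rule that could front such an API Gateway proxy. The web ACL name, rule name, and request limit are illustrative, and the step that associates the web ACL with your API Gateway stage is not shown.

import boto3

wafv2 = boto3.client("wafv2")

# Block any single source IP that exceeds 500 requests in a 5-minute window.
wafv2.create_web_acl(
    Name="llm-proxy-acl",          # hypothetical name
    Scope="REGIONAL",              # REGIONAL scope for API Gateway/ALB
    DefaultAction={"Allow": {}},
    Rules=[
        {
            "Name": "rate-limit-per-ip",
            "Priority": 0,
            "Statement": {
                "RateBasedStatement": {"Limit": 500, "AggregateKeyType": "IP"}
            },
            "Action": {"Block": {}},
            "VisibilityConfig": {
                "SampledRequestsEnabled": True,
                "CloudWatchMetricsEnabled": True,
                "MetricName": "rate-limit-per-ip",
            },
        }
    ],
    VisibilityConfig={
        "SampledRequestsEnabled": True,
        "CloudWatchMetricsEnabled": True,
        "MetricName": "llm-proxy-acl",
    },
)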
- Monitor usage closely using Amazon CloudWatch with anomaly detection and create alarms (a sketch follows the client example below).
from langchain.llms.bedrock import Bedrock
import boto3

bedrock_client = boto3.client('bedrock-runtime')

# print('Initializing Anthropic Claude v2.1')
llm_anthropic_claude21 = Bedrock(
    client=bedrock_client,
    model_id="anthropic.claude-v2:1",
    endpoint_url="https://bedrock-runtime." + REGION_NAME + ".amazonaws.com",
    model_kwargs={"temperature": 0.25, "max_tokens_to_sample": 1000}
)
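To act on the monitoring recommendation above, an anomaly detection alarm can be placed on Bedrock invocation counts. This is a minimal sketch assuming the AWS/Bedrock CloudWatch namespace and its Invocations metric; the alarm name and SNS topic ARN are placeholders.

import boto3

cloudwatch = boto3.client("cloudwatch")

# Alarm when Bedrock invocation volume rises above the anomaly detection band.
cloudwatch.put_metric_alarm(
    AlarmName="bedrock-invocations-anomaly",   # hypothetical name
    ComparisonOperator="GreaterThanUpperThreshold",
    EvaluationPeriods=3,
    ThresholdMetricId="band",
    Metrics=[
        {
            "Id": "invocations",
            "MetricStat": {
                "Metric": {"Namespace": "AWS/Bedrock", "MetricName": "Invocations"},
                "Period": 300,
                "Stat": "Sum",
            },
            "ReturnData": True,
        },
        {
            "Id": "band",
            "Expression": "ANOMALY_DETECTION_BAND(invocations, 2)",
            "ReturnData": True,
        },
    ],
    AlarmActions=["arn:aws:sns:us-east-1:111122223333:security-alerts"],  # placeholder topic
)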
- Supply Chain Vulnerabilities
The LLM application lifecycle can be compromised by vulnerable components or services, leading to security attacks. Using third-party datasets, pre-trained models, and plugins can add vulnerabilities.
Vulnerabilities within the supply chain of LLMs can pose significant risks, potentially compromising the integrity of training data, machine learning models, and deployment platforms. Such vulnerabilities may result in biased outcomes, security breaches, or even catastrophic system failures. While conventional security concerns often center on software components, Machine Learning introduces a distinct dimension by incorporating pre-trained models and training data supplied by third parties, making them susceptible to tampering and poisoning attacks. This highlights the need for robust security measures in the context of machine learning supply chains.
Potential Impact:
Using third-party software or components does not absolve anyone from responsibility when those third-party components introduce risk. Understanding the risks introduced by third-party components is critical for implementing proper security. This is true in standard software and AI-based systems.
Mitigations:
- Vet any third-party provider and request all relevant xBOMs (bills of materials) — i.e., HBOM, SBOM, SaaS-BOM, ML-BOM — for transparency into how the model was built. All of the foundation models in Bedrock have vendor-provided service cards.
- Follow all SDLC best practices — patching, SBOMs, code and model signing, etc.
- Only enable the approved Bedrock LLMs.
- Review and document AWS Service Cards for all AWS AI services utilized by your stack, and document any relevant risks or limitations. Note: Service Cards currently exist only for the non-LLM AI services.
- If developing your own LLM using Amazon SageMaker, generate ML-BOMs or model cards for your LLM to track its lifecycle.
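As a sketch of the model card suggestion, the SageMaker Model Cards API can record a card alongside the model. The card name is a placeholder, and the content keys shown are an assumed minimal subset; validate them against the published model card JSON schema.

import json
import boto3

sagemaker = boto3.client("sagemaker")

# Register a lightweight model card for a custom LLM; the content below is a
# minimal, hypothetical subset of the model card schema.
sagemaker.create_model_card(
    ModelCardName="internal-llm-v1",
    ModelCardStatus="Draft",
    Content=json.dumps(
        {
            "model_overview": {
                "model_description": "Domain-tuned LLM for internal knowledge search.",
            },
            "intended_uses": {
                "intended_uses": "Internal Q&A over approved document stores only.",
            },
        }
    ),
)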
- Sensitive Information Disclosure
LLMs may inadvertently reveal confidential data in their responses, leading to unauthorized data access, privacy violations, and security breaches. It's crucial to implement data sanitization and strict user policies to mitigate this.
Data privacy is a critical concern when dealing with AI and machine learning systems. Ensure that data stored and processed by AWS AI services comply with data privacy regulations and use encryption and access controls to protect sensitive information.
Potential Impact:
Anytime sensitive data is leaked, whether through LLMs or other software solutions, it has the potential to harm any of the parties involved.
Mitigations:
- Do not use sensitive data to fine-tune models unless it is explicitly necessary. LLMs currently have no way to apply access control to data once it has been trained into the model.
- Limit sensitive data to RAG systems that support user-based access control. Amazon Kendra supports token-based user access when searching documents, which ensures only information the user has access to is used when generating responses.
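A minimal sketch of token-based filtering with Amazon Kendra is shown below: the caller's identity token is passed with each query so results are limited to documents that user can access. The index ID and JWT variable are placeholders.

import boto3

kendra = boto3.client("kendra")

def query_as_user(index_id: str, question: str, user_jwt: str) -> dict:
    """Pass the end user's token so Kendra enforces document-level ACLs
    before any passages are handed to the LLM for answer generation."""
    return kendra.query(
        IndexId=index_id,
        QueryText=question,
        UserContext={"Token": user_jwt},
    )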
- If using Amazon Lex and you do not wish to store user utterances (e.g., a system used by children), enable the “childDirected” setting (COPPA compliance) so that no user utterances are stored in the logs.
- If you are logging utterances and LLM queries in CloudWatch, consider enabling CloudWatch Log Data Protection which automatically masks identified sensitive data in the log streams.
- Limit access to data stores and vector databases. Remember vectorization/embedding is not encryption! Vectorized data can be susceptible to analytical attacks to potentially reverse engineer the original data.
- If using Agents with Bedrock, consider enabling a Post-Processing template to add more logic in crafting your final response to the user. This can act as a final safeguard for stripping unnecessary/sensitive information.
- Insecure Plugin Design
LLM plugins can have insecure inputs and insufficient access control. This lack of application control makes them easier to exploit and can result in consequences like remote code execution.
Plugins are essentially a tool belt of processes that a model can invoke through agents, often transparently to the user, to help respond to complex queries more accurately.
Potential Impact:
Plugins can be exploited to extend the reach of a malicious attack. Any plugin accessible by a model agent is susceptible to misuse, opening you up to remote code execution, data leakage, or privilege escalation.
Mitigations:
- Parameterize inputs to plugins as much as possible — similar to protecting against SQLi and XSS.
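As an illustrative sketch (the parameter names and allow-list are hypothetical), arguments coming out of the model can be validated against an explicit schema before the plugin runs, rather than passing free-form strings through:

# Hypothetical validation layer between the LLM/agent and a plugin.
ALLOWED_REPORT_TYPES = {"expense", "travel", "timesheet"}

def validate_report_plugin_args(args: dict) -> dict:
    """Accept only known, typed parameters; reject anything else."""
    report_type = args.get("report_type")
    if report_type not in ALLOWED_REPORT_TYPES:
        raise ValueError(f"Unsupported report_type: {report_type!r}")

    year = args.get("year")
    if not isinstance(year, int) or not 2000 <= year <= 2100:
        raise ValueError("year must be an integer between 2000 and 2100")

    # Only the validated, parameterized values are passed to the plugin.
    return {"report_type": report_type, "year": year}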
- Sanitize and validate any input/output provided to/from a plugin — especially if it is third-party. This can be done manually in the agent used to invoke your model, or in Amazon Bedrock Agents using advanced prompt parsing. Follow the OWASP ASVS guidelines.
- Excessive Agency
LLM-based systems may undertake actions leading to unintended consequences. The issue arises from excessive functionality, permissions, or autonomy granted to the LLM-based systems.
The entire value proposition of AI is to streamline production with minimal human interaction, but the more you off-load to the AI, the more access it requires. This is a balancing act, and it will only become more complicated and more difficult to control as the capabilities of AI grow in the coming years.
Potential Impact:
Excessive agency can result in a wide range of impacts depending on the AI implementation — more specifically, on what tools and data sets it has access to.
Mitigations:
Incorporate least privilege at all layers.
- Only grant access to RAG systems or Knowledge Bases required for the specific process that is calling it.
- Do not rely on your LLM for authorization; enforce authentication and authorization on all APIs accessed by the LLM — third-party and internal.
- Avoid open-ended/general purpose functions.
- In Amazon Lex, avoid relying on the FallbackIntent. Codify all approved intents directly.
- When using Agents for Bedrock, create dedicated API schemas for each Action Group. Create dedicated Lambda functions with dedicated roles for each Action Group.
- Use adequate descriptions in your API schema so your agent can properly identify the correct method to pass to the Lambda function.
- Only add tools and plugins that are necessary.
- Limit what tools your LLM can access. This includes specifying only the required Lambda functions in your Agents service role. Your Lambda function should also include execution logic for specific paths. Consider the following example from this AWS blog post:
def lambda_handler(event, context):
    responses = []
    for prediction in event['actionGroups']:
        response_code = ...
        action = prediction['actionGroup']
        api_path = prediction['apiPath']
        if api_path == '/claims':
            body = open_claims()
        elif api_path == '/claims/{claimId}/identify-missing-documents':
            parameters = prediction['parameters']
            body = identify_missing_documents(parameters)
        elif api_path == '/send-reminders':
            body = send_reminders()
        else:
            body = {"{}::{} is not a valid api, try another one.".format(action, api_path)}
- Over-reliance
Systems or people overly depending on LLMs without oversight may face misinformation, miscommunication, legal issues, and security vulnerabilities due to incorrect or inappropriate content generated by LLMs.
Everyone who has played around with GenAI in any capacity has run into times when the AI was just flat-out wrong. It does not have the information, tools, or context to provide the correct response, so it simply makes one up, a so-called “hallucination”. It is important to apply sanity checks on outputs. This differs from Insecure Output Handling, which is mainly about properly typing your outputs and making sure they do not introduce exploits downstream; over-reliance is focused on validating the actual content generated by the AI.
Potential Impact:
Over-reliance on AI increases the likelihood that hallucinations will affect your company or your users. This can include code generated by an AI that is automatically executed, or natural language responses being fed to downstream processes or to users. This is why it is so important to maintain a “human in the loop” for sanity checks during an AI system's use and development.
Mitigations:
- Regularly review model outputs for consistency and accuracy.
- Test prompts multiple times to ensure the prompt always gives you the desired response.
- When using RAG, use similarity filters.
- If using Amazon Kendra, add metadata to your documents and tune your searches with metadata weighting and relevancy parameters. Side note: can’t recommend this enough — do not just crawl your docs blindly.
- If using Amazon Kendra, enrich your documents by removing unnecessary content or adding new content.
- Always look to optimize your model and app — fine-tuning, prompt engineering, adversarial testing, updating vector databases and external document stores.
- Break down complex tasks into individual chains or agents.
- Always make the user aware they are interacting with an AI — transparency is a core principle of ethical AI use.
- Optimize randomness parameters like Temperature, Top P, and Top K.
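For example, with an Anthropic model on Bedrock these parameters map to model_kwargs in the LangChain client. The values shown are illustrative starting points, not recommendations.

from langchain.llms.bedrock import Bedrock
import boto3

# Lower temperature and tighter top_p/top_k reduce randomness, which can make
# factual responses more repeatable; the values here are illustrative only.
llm_low_randomness = Bedrock(
    client=boto3.client("bedrock-runtime"),
    model_id="anthropic.claude-v2:1",
    model_kwargs={
        "temperature": 0.1,
        "top_p": 0.5,
        "top_k": 50,
        "max_tokens_to_sample": 500,
    },
)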
- Enabling Guardrails for Bedrock can minimize the risk that your AI generates inappropriate content. (You do not want to rely on your AI to do this by itself.)
- Add checks against trusted external sources for LLM outputs.
- Model Theft
This involves unauthorized access, copying, or exfiltration of proprietary LLM models. The impact includes economic losses, compromised competitive advantage, and potential access to sensitive information.
Protect your intellectual property! Model theft is not just about keeping the actual code of your model safe; it also entails protecting against attackers who send a high volume of carefully crafted prompts to partially reverse engineer your model — e.g., the weights and parameters used, training data sources, foundation models, etc.
Potential Impact:
If a competitor gets access to your model, it can potentially put you out of business, either by offering the same product for less or by impersonating you in some way and causing reputational damage.
Mitigations:
- Use a centralized ML model inventory or registry for ML models. Amazon SageMaker has a model registry that enforces access control.
- Develop and deploy models solely using ML Ops Pipelines. This includes training data, testing data, and RAG Source Data. This helps limit the identities that require access.
- Apply resource policies on your model registry to prevent access from unauthorized accounts. Monitor updates to these resource policies.
- If using fine-tuned models in Amazon Bedrock, put bucket policies on the training data to ensure only authorized individuals and processes can access it. Monitor updates to these bucket policies.
- Implement a watermark.
- Monitor the query frequency of individual users and alert on anomalies. If using Amazon Lex, make sure you are receiving the userId in the logs. Even then, it may require enrichment with third-party APIs when alerting — e.g., mapping userId = 383j10d to an actual user jsmith@domain.com.
- Enable Amazon CloudTrail data events for read events on your S3 buckets with proprietary data (training data, RAG sources, etc.). Ingest these events into your SIEM to identify unauthorized access.
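A minimal sketch of turning on read data events for a bucket of proprietary training data follows; the trail name and bucket ARN are placeholders.

import boto3

cloudtrail = boto3.client("cloudtrail")

# Log S3 object-level read events for the bucket holding training/RAG data so
# access can be forwarded to a SIEM for review.
cloudtrail.put_event_selectors(
    TrailName="organization-trail",  # hypothetical existing trail
    EventSelectors=[
        {
            "ReadWriteType": "ReadOnly",
            "IncludeManagementEvents": True,
            "DataResources": [
                {
                    "Type": "AWS::S3::Object",
                    "Values": ["arn:aws:s3:::my-proprietary-training-data/"],
                }
            ],
        }
    ],
)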
- Use Service Control Policies to ensure that Amazon Bedrock is only used in approved accounts in your organization.
A sample architecture with hotspots where different vulnerabilities can be introduced that must be accounted for in any implementation.
Conclusion
As AI becomes increasingly integrated into our digital lives, the security of AI systems is of paramount importance. Cloud service providers like AWS offer a robust set of native AI services, but it is crucial to be aware of the security challenges and risks that come with implementing and configuring real-world solutions on these services. As a solutions specialist, it is also important to stay abreast of emerging standards and practices in AI risk management. The OWASP LLM Top 10 provides valuable insight into potential threats and vulnerabilities that should not be overlooked.
If you are a Federal or DoD agency looking to implement AI Risk Management practices, contact us to schedule a free briefing on how our open ATO for AI™ offering can help with your compliance effort to deliver safe and secure AI services.