Need for Security in AI/ML Applications
While traditional application security measures focus on securing source code, third-party dependencies, and runtime environments, AI/ML applications bring unique challenges that require a different security approach. Chatbots and agentic workflow platforms that use foundation models for workflow automation increasingly expand the attack surface and put the security posture of AI/ML systems at risk.
Traditional Application Security vs AI/ML Application Security
When it comes to traditional application security programs, enterprises invest in point security solutions like SAST, DAST, SCA, endpoint security, RASP, perimeter security, and so on. On top of these tools sit platforms for CSPM, ASPM, CNAPP, and other platforms with various alphabet combinations, depending on the maturity and type of applications used by the enterprise.
As applications using LLMs rapidly evolve, understanding their security implications and securing them takes on added urgency. Let me briefly cover the foundations of traditional AppSec before jumping into the common themes and differences between traditional applications and AI model-based applications.
Traditional Application Security Practices
Software development and delivery of traditional applications involves:
- Using third-party software as libraries or services – from npm, GitHub, or Maven repos, for example
- Building business-specific logic with first-party software
- Securing the toolchain for building and packaging the software, including first-party and third-party components, into deployable units such as containers, Deb, or RPM packages
- Deploying application services, including dependent services, into a runtime environment to serve the business purpose – for example, in a public cloud environment behind a load balancer
Extending Application Security to AI/ML Applications
AI/ML or foundation model-based applications follow the same pattern, with some changes:
- Using third-party models – from Hugging Face, for example (a provenance-check sketch follows this list)
- Building business-specific logic with first-party software
- Securing the toolchain for packaging the software, including first-party and third-party components, into deployable units such as containers, Deb, or RPM packages for smaller models
- Deploying application services, including dependent services, into a runtime environment to serve the business purpose – for example, in a public cloud environment behind a load balancer
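One difference worth calling out in the first step is provenance: third-party model weights should be verified before they enter the build. Below is a minimal sketch in Python, assuming the model provider (or your internal artifact registry) publishes a SHA-256 digest for the artifact; the file path and digest shown are hypothetical placeholders.

```python
# Verify a downloaded third-party model artifact against a pinned digest before
# packaging it. The path and digest are hypothetical values for illustration.
import hashlib
from pathlib import Path

EXPECTED_SHA256 = "replace-with-the-digest-published-by-the-model-provider"

def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream the file so large model weights never need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_model_artifact(path: Path) -> None:
    actual = sha256_of(path)
    if actual != EXPECTED_SHA256:
        raise RuntimeError(f"{path} failed integrity check (got {actual})")

if __name__ == "__main__":
    verify_model_artifact(Path("models/summarizer.safetensors"))  # hypothetical path
```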
As we can see, the development and delivery of AI model-based applications follow patterns similar to traditional applications and can be protected using traditional security defense measures. However, runtime security for these models requires newer methods of securing the applications.
LLMs trained on proprietary data carry implications for data security and privacy. Traditionally, data stores can be protected with RBAC and access controls that segregate confidential and proprietary information from public data. Because an LLM stores data in a form to which traditional data controls cannot be applied at retrieval time, an additional layer is needed to evaluate the model's responses for PII or proprietary information.
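As a concrete illustration of that additional layer, here is a minimal sketch in Python of an output filter that screens LLM responses for common PII patterns before they reach the caller. The regex patterns and redaction policy are illustrative assumptions, not a complete PII detector; production systems typically combine pattern matching with classification-based detection.

```python
# Screen LLM responses for common PII patterns and redact matches before the
# response is returned. The patterns shown are illustrative, not exhaustive.
import re

PII_PATTERNS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "credit_card": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
}

def redact_pii(response: str) -> str:
    """Replace anything matching a PII pattern with a labeled redaction marker."""
    for label, pattern in PII_PATTERNS.items():
        response = pattern.sub(f"[REDACTED {label.upper()}]", response)
    return response

# Usage: wrap every model call so responses pass through the filter.
# safe_text = redact_pii(llm_client.generate(prompt))  # llm_client is hypothetical
```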
Model theft and DoS attacks are other areas where LLM-based applications differ from traditional applications. Because the interface allows free-form queries that elicit responses from data stored in the model, additional measures to validate inputs and outputs must be put in place.
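A minimal sketch of such pre-model input checks is shown below: it caps prompt length and throttles per-client query volume, which blunts both resource-exhaustion attacks and high-volume extraction attempts. The limits are illustrative assumptions and would need tuning to the model's actual context window and traffic profile.

```python
# Admission checks that run before a prompt reaches the model: cap prompt size
# and throttle per-client query rate. The limits below are illustrative defaults.
import time
from collections import defaultdict, deque

MAX_PROMPT_CHARS = 8_000       # assumed budget well below the model's context window
MAX_QUERIES_PER_MINUTE = 30    # assumed per-client throttle

_recent_queries: dict[str, deque] = defaultdict(deque)

def admit(client_id: str, prompt: str) -> bool:
    """Return True only if the prompt is within limits and the client is under quota."""
    if len(prompt) > MAX_PROMPT_CHARS:
        return False
    now = time.monotonic()
    window = _recent_queries[client_id]
    while window and now - window[0] > 60:   # drop entries older than one minute
        window.popleft()
    if len(window) >= MAX_QUERIES_PER_MINUTE:
        return False
    window.append(now)
    return True
```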
The OWASP Top 10 for LLM Applications lists the top 10 LLM security vulnerabilities. These are not all unique to LLMs; looking closely at each attack vector shows which ones are genuinely new threats that require specialized techniques to defend against.
OWASP Top 10 Vulnerabilities for LLMs
The top 10 attack vectors for LLMs, compared with traditional application threat vectors:
- Prompt Injection – A threat actor's ability to modify the prompt given to the LLM from a web page or interface that is supposed to use built-in prompts to perform actions. A modified prompt could result in exfiltration of private data, deletion of proprietary data, or incorrect results to queries such as a risk analysis. This can be prevented using traditional defense mechanisms.
- Insecure Output Handling – When LLM output drives automation, for example via exec or plugins, threat actors can craft inputs that make that automation exfiltrate private data, delete proprietary data, or create a backdoor by connecting to third-party servers. This can be prevented with traditional defense mechanisms that validate LLM output before it is acted on (see the action-dispatch sketch after this list).
- Training Data Poisoning – Similar to data poisoning in classical ML: feeding invalid data to the model, or gaining access to model training so that unsuspecting users supply incorrect or deliberately malicious training data. This is a form of supply chain security and chain of custody addressed by traditional defense mechanisms.
- Model Denial of Service – A threat actor can send poisoned queries that trigger recursive resource usage, query streams that exceed the context window, repetitive long inputs, or large volumes of variable-length inputs, causing high resource consumption as each query fills up its context window. This is specific to LLMs and needs to be handled by the model and its serving layer.
- Supply Chain Vulnerabilities – Using models that are no longer supported, poisoned crowd-sourced data, or vulnerable pre-trained models causes the model to produce incorrect or skewed results that threat actors can take advantage of in production. Traditional chain-of-custody controls and validation of source and provenance address this type of vulnerability.
- Sensitive Information Disclosure – Incomplete or improper filtering of LLM responses, overfitting, or skewed training data can cause unintended disclosure of PII: well-crafted queries by a user with knowledge of the training set's metadata, or another user's PII surfacing in a legitimate query. This is specific to LLMs and needs to be handled by monitoring responses for PII and private data.
- Insecure Plugin Design – Plugin invocation and plugin inputs are driven by the LLM. A crafted prompt can therefore generate plugin inputs that lead to data exfiltration, data deletion, or granting system access to malicious users. This is similar to insecure output handling and is handled by traditional prevention techniques: strict parameterized inputs with validation by the plugin, vulnerability checks on plugins, ASVS practices, SAST and DAST checks, and requiring manual authorization for any sensitive plugins.
- Excessive Agency – The ability to interface with other systems and take actions in response to prompts can lead to unintended consequences due to hallucination, a compromised plugin, or a poorly performing model. This is similar to insecure plugin design and can be prevented by bounding the authorizations granted to automation based on the actor and validating the commands coming from the LLM.
- Overreliance – LLMs can sound authoritative even when hallucinating, and bugs in generated code can be hard to detect, causing problems downstream when LLM output is used without proper validation. This is specific to LLMs and needs to be handled by monitoring responses.
- Model Theft – An unauthorized user gaining access to the enterprise network, an insider threat, or model extraction through carefully crafted queries issued in large numbers. This is prevented with a combination of traditional defense mechanisms and LLM-specific techniques that make the model robust to adversarial queries.
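To make the "treat LLM output as untrusted" advice from the Insecure Output Handling and Insecure Plugin Design items concrete, here is a minimal Python sketch of an action dispatcher: rather than passing model output to exec, it parses the output into a structured action and only executes actions on an allowlist with validated parameters. The action names and argument schema are hypothetical.

```python
# Treat LLM output as untrusted: parse it into a structured action and dispatch
# only through an allowlist with validated arguments, never through exec.
import json

# Hypothetical allowlist mapping action names to the argument types they accept.
ALLOWED_ACTIONS = {
    "lookup_order": {"order_id": str},
    "send_summary": {"recipient": str, "report_id": str},
}

def dispatch(llm_output: str) -> dict:
    """Validate the model's proposed action before any side effect is allowed to run."""
    action = json.loads(llm_output)          # expected shape: {"name": ..., "args": {...}}
    spec = ALLOWED_ACTIONS.get(action.get("name"))
    if spec is None:
        raise ValueError(f"Action not allowed: {action.get('name')!r}")
    args = action.get("args", {})
    if set(args) != set(spec) or not all(isinstance(args[k], t) for k, t in spec.items()):
        raise ValueError("Action arguments failed validation")
    return action  # hand the validated action to the real plugin layer
```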
As can be seen above, only Model Denial of Service, Sensitive Information Disclosure, and Model Theft require genuinely new prevention techniques. MITRE's Adversarial Threat Landscape for Artificial-Intelligence Systems (ATLAS) is based on real-world attack observations that can be mapped to the techniques exploiting the vulnerabilities detailed by OWASP. Garak is an interesting project we are working with to provide a comprehensive security posture for AI applications. Stay tuned for updates on AI security and other exciting enhancements at https://www.opsmx.com/secure-software-delivery/.
Conclusion
Extending an AppSec program to AI/ML applications involves adapting and enhancing traditional practices while addressing new challenges unique to AI systems. By building on proven methods and integrating AI-specific security measures, enterprises can confidently secure their AI/ML workflows.
Unique Security Challenges with AI/ML-based Applications
- Protecting against prompt injection in LLMs.
- Implementing measures to prevent model theft and DoS attacks.
- Ensuring responsible AI/ML outputs to mitigate issues like overreliance and excessive agency.
Upcoming: The second part of this blog series will share practical ways of securing your LLMs with the help of OpsMx and open source tools. Stay tuned!
About OpsMx
OpsMx is a leading innovator and thought leader in the Secure Continuous Delivery space. Leading technology companies such as Google, Cisco, and Western Union, among others, rely on OpsMx to ship better software faster.
OpsMx Secure CD is the industry’s first CI/CD solution designed for software supply chain security. With built-in compliance controls, automated security assessment, and policy enforcement, OpsMx Secure CD can help you deliver software quickly without sacrificing security.
OpsMx Delivery Shield adds DevSecOps capabilities to enterprise deployments by providing Application Security Posture Management (ASPM), unified visibility, compliance automation, and security policy enforcement to your existing application lifecycle.
Frequently Asked Questions on AI/ML Security Challenges and LLM Security Measures
1. What are the main security challenges in AI/ML applications?
The main security challenges in AI/ML applications are listed below; they are also covered extensively in the blog above:
- Data Privacy and Security
- Adversarial Attacks
- Model Theft and IP Protection
- Flawed and Biased Datasets
- Model Integrity and Model Denial of Service
- Compliance
- Provenance Checks
2. How can traditional security measures be applied to AI/ML systems?
There are a few measures that can be taken to ensure the security of AI/ML systems, such as:
- Access Control – restricting access to training data, models, and APIs
- Vulnerability Testing – regularly testing the model, data, and system for vulnerabilities
- Authentication – requiring strong authentication techniques when accessing AI/ML resources
Besides these, generic security measures such as Data Encryption and Monitoring can be used to secure AI/ML systems.
3. What are the OWASP Top 10 vulnerabilities for LLMs?
According to the OWASP project, the OWASP Top 10 vulnerabilities for Large Language Model (LLM) Applications are:
- LLM01: Prompt Injection
- LLM02: Insecure Output Handling
- LLM03: Training Data Poisoning
- LLM04: Model Denial of Service
- LLM05: Supply Chain Vulnerabilities
- LLM06: Sensitive Information Disclosure
- LLM07: Insecure Plugin Design
- LLM08: Excessive Agency
- LLM09: Overreliance
- LLM10: Model Theft
4. How can I protect my AI models from theft?
You can protect your AI models from theft by implementing the following security measures:
- Access Control – By implementing strong authentication, RBAC, and IP whitelisting
- Encryption – By encrypting models both at rest and in transit (API communication)
- Obfuscation – By using techniques like model watermarking or parameter encryption to make theft less valuable
- API Security – By implementing rate limiting, authentication, and logging for API endpoints
- Monitoring – By continuously monitoring for unauthorized access, unauthorized downloads, and unusual or suspicious access patterns (see the sketch after this list)
- Prompt Injection – By performing input validation so threat actors cannot use crafted prompts to probe or extract the model
- Insecure Output Handling – By validating LLM output before it drives any automation (exec calls or plugins), preventing crafted queries from exfiltrating private data, deleting proprietary data, or creating backdoors to third-party servers
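On the monitoring point above, here is a minimal Python sketch that flags clients whose query volume or repetition rate looks like model-extraction probing. The thresholds are illustrative assumptions, and the alerting hook would plug into whatever logging or SIEM stack you already run.

```python
# Flag clients whose query behavior resembles model-extraction probing: very high
# volume, or a large share of repeated queries. Thresholds are illustrative.
import hashlib
from collections import Counter, defaultdict

QUERY_ALERT_THRESHOLD = 500   # assumed per-client daily query ceiling
REPEAT_ALERT_RATIO = 0.6      # assumed share of repeated queries that triggers an alert

_query_log: dict[str, Counter] = defaultdict(Counter)

def record_query(client_id: str, prompt: str) -> list[str]:
    """Record one query and return any alerts raised for this client."""
    fingerprint = hashlib.sha1(prompt.strip().lower().encode()).hexdigest()[:12]
    counts = _query_log[client_id]
    counts[fingerprint] += 1
    total = sum(counts.values())
    alerts = []
    if total > QUERY_ALERT_THRESHOLD:
        alerts.append(f"{client_id}: query volume {total} exceeds threshold")
    if total >= 20 and counts.most_common(1)[0][1] / total > REPEAT_ALERT_RATIO:
        alerts.append(f"{client_id}: repetitive query pattern suggests probing")
    return alerts
```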
5. What is the importance of securing AI/ML systems?
Securing AI/ML systems is necessary because compromised models or training data can result in biased or flawed responses. Key reasons include:
- Preventing unauthorized access to training data ensures privacy and prevents biases
- Preventing model tampering or theft maintains reliability and trust
- Guarding against adversarial AI attacks can ensure reliable model outputs
- Maintaining continued operations by preventing disruptions from malicious attacks
- AI agents and agentic workflows are gaining prominence, and if they are compromised, it could lead to critical vulnerabilities in the system