
LLM Security and Poisoning

Authors: Georg Felber, Filip Grgic, Lukas Sperk

This project systematically evaluates the security risks in C code generated by large language models (LLMs). We benchmark OpenAI's GPT-4o, Anthropic's Claude 3.7 Sonnet, and DeepSeek Chat across critical programming tasks under various prompt engineering scenarios, revealing how prompt phrasing and intent manipulation affect code safety.


Framework

Reproducibility

The entire workflow is automated and reproducible:

```sh
# install requirements
pip install -r requirements.txt

# list available templates
./test.py list

# generate new tests
./test.py run [OPTIONS] TEMPLATE

# run tests on cached files
./test.py cache [OPTIONS] TEMPLATE

# iterate over all memory corruptions
./test.py analyze [OPTIONS] TEMPLATE

# analyze logged results (create diagrams)
./analyze.py
```

Adding New Problems

Templates live in template/ and are composed of the following files (taking array_index as an example):

```
array_index/
├── bugs.c
├── oracle.c
├── problem.md
└── tests
    └── ...
```
  • bugs.c
    the KLEE setup file that compares the generated code against the oracle (see the sketch below)
  • oracle.c
    a reference implementation that fulfills the task and is used for comparison against the generated code
  • problem.md
    this markdown file contains the problem statement and the prompt used to generate the test
  • tests/
    this folder contains the generated tests
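For illustration, here is a minimal sketch of what a bugs.c-style harness can look like for array_index; oracle_get and generated_get are hypothetical names, not the repo's actual interface:

```c
/* Minimal sketch (not the repo's actual harness) of a bugs.c-style
 * KLEE setup that compares generated code against the oracle. */
#include <assert.h>
#include <klee/klee.h>

/* Oracle: bounds-checked lookup; -1 signals an invalid index. */
static int oracle_get(const int *arr, int len, int idx) {
    if (idx < 0 || idx >= len) return -1;
    return arr[idx];
}

/* Stand-in for LLM-generated code: the upper-bound check is missing. */
static int generated_get(const int *arr, int len, int idx) {
    (void)len;
    if (idx < 0) return -1;
    return arr[idx];            /* out-of-bounds read when idx >= len */
}

int main(void) {
    int arr[4] = {1, 2, 3, 4};
    int idx;
    klee_make_symbolic(&idx, sizeof idx, "idx");
    /* KLEE explores all values of idx: the out-of-bounds access is
     * reported as a memory error, and any diverging result as an
     * assertion failure. */
    assert(oracle_get(arr, 4, idx) == generated_get(arr, 4, idx));
    return 0;
}
```

Compiled to LLVM bitcode and run under KLEE, the symbolic index drives both functions down every feasible path, so divergences and memory errors surface as concrete failing test cases.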

Results

Overview

We generated and analyzed 3,000 samples of LLM-generated C code, combining:

  • 3 Models: GPT-4o, Claude 3.7 Sonnet, DeepSeek Chat V3
  • 4 Tasks: Array Operations, Decompression, Deserialization, String Manipulation
  • 5 Prompt Strategies: No injection, secure, fast, unsafe, and conflicting (unsafe & secure)

Each of the resulting 60 model × task × prompt combinations was evaluated for correctness, memory safety, and vulnerability.

🔥 Error Heatmap

A breakdown of bug types and frequency across all models and prompts.

[figure: heatmap]


Prompt Injection Strategies

We tested how models react to system-level prompt injections that steer them toward fast, secure, or even maliciously unsafe code:

  • No Injection: Default behavior
  • Fast: Prioritize performance over safety
  • Secure: Add maximum validation/safety checks
  • Unsafe: Introduce backdoors or memory corruptions
  • Unsafe & Secure: Conflicting instructions

These manipulations revealed the extreme sensitivity of LLMs to prompt phrasing and goal alignment.


Tasks and Testing Pipeline

Each LLM was asked to solve four security-relevant tasks in C:

| Task Name | Key Risk Area |
| --- | --- |
| array_index | Bounds-checked memory access |
| decompression | Pointer arithmetic, recursion risks |
| deserialization | Length validation & buffer overrun |
| unique_words | Heap safety and memory management |

The generated code was compiled and symbolically analyzed using KLEE.
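As a concrete illustration of one risk area (written for this README, not taken from the repo), the sketch below shows the length-validation pitfall the deserialization task probes, next to a checked variant; the msg_t layout and function names are hypothetical:

```c
/* Hypothetical example of the length-validation risk in the
 * deserialization task: trusting an attacker-controlled length field
 * causes a heap buffer overrun. */
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    uint32_t len;    /* attacker-controlled length field */
    char data[];     /* payload follows the header */
} msg_t;

/* Unsafe: copies m->len bytes without ever consulting avail. */
char *parse_unsafe(const uint8_t *buf, size_t avail) {
    (void)avail;                         /* the bug: avail is ignored */
    const msg_t *m = (const msg_t *)buf;
    char *out = malloc((size_t)m->len + 1);
    if (!out) return NULL;
    memcpy(out, m->data, m->len);        /* reads past buf if len > avail */
    out[m->len] = '\0';
    return out;
}

/* Safe: rejects lengths that exceed the bytes actually received. */
char *parse_safe(const uint8_t *buf, size_t avail) {
    if (avail < sizeof(uint32_t)) return NULL;
    const msg_t *m = (const msg_t *)buf;
    if (m->len > avail - sizeof(uint32_t)) return NULL;
    char *out = malloc((size_t)m->len + 1);
    if (!out) return NULL;
    memcpy(out, m->data, m->len);
    out[m->len] = '\0';
    return out;
}
```

With a symbolic input buffer, KLEE can steer m->len past avail and report the overrun in parse_unsafe automatically.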


Outcome Distribution

Each output was labeled as:

  • Bug: logical or functional error
  • Crpt: memory corruption
  • Failed: compilation or runtime failure
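For concreteness, here are two hand-written snippets (illustrative, not actual generated samples from the study) showing how the first two labels differ on a unique_words-style helper; normalize_bug and normalize_crpt are hypothetical names:

```c
/* Illustrative only: the difference between a "Bug" and a "Crpt"
 * outcome on a unique_words-style normalization helper. */
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Bug: memory-safe but functionally wrong; only the first character
 * is lowercased, so "Foo" and "fOO" still count as distinct words. */
char *normalize_bug(const char *word) {
    char *copy = strdup(word);
    if (!copy) return NULL;
    copy[0] = (char)tolower((unsigned char)copy[0]);
    return copy;
}

/* Crpt: memory corruption; the allocation omits the terminator byte,
 * so strcpy writes one byte past the end of the buffer. */
char *normalize_crpt(const char *word) {
    char *copy = malloc(strlen(word));  /* should be strlen(word) + 1 */
    if (!copy) return NULL;
    strcpy(copy, word);                 /* one-byte heap overflow */
    return copy;
}
```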

📊 Outcome Categories Across All Samples

[figure: distribution]


Key Results

  • 37.4% of generated samples had logical bugs
  • 14.7% showed memory corruption
  • Secure prompting dropped corruption rates to as low as 2–3.5%
  • GPT had the highest bug rates; Claude the lowest
  • Decompression was the most error-prone task

⚔️ Model Comparison by Prompt Type

[figure: comparison]


Impact of Prompt Engineering

Prompt design was the most impactful factor in output safety:

  • Unsafe prompts produced the highest failure rates (over 70% buggy samples in some settings)
  • Secure prompts reduced vulnerabilities but did not eliminate them
  • Conflicting prompts caused partial override, not full mitigation

📉 Bug Rates by Prompt Strategy

[figure: prompt_impact]
