Deep1994

🎯

Focusing

Ding Peng Deep1994

Make the change

41 followers · 27 following

NJU(Nanjing University)
Nanjing, China
09:11 (UTC +08:00)
https://deep1994.github.io/

Deep1994/README.md

👋 Hi, I'm Peng Ding (丁鹏)!

🏫 I’m pursuing a PhD in Computer Science at Nanjing University, supervised by Prof. Shujian Huang.
🔬 I’m currently interested in LLM safety (jailbreak attacks and defenses, interpretability, etc.).
📚 My blog: https://deep1994.github.io
🤝 Contact me: dingpeng@smail.nju.edu.cn

Pinned

NJUNLP/ReNeLLM Public

The official implementation of our NAACL 2024 paper "A Wolf in Sheep’s Clothing: Generalized Nested Jailbreak Prompts can Fool Large Language Models Easily".

Python · 150 stars · 16 forks
NJUNLP/Hallu-PI Public

The code and datasets of our ACM MM 2024 paper "Hallu-PI: Evaluating Hallucination in Multi-modal Large Language Models within Perturbed Inputs".

11 stars
NJUNLP/SAGE Public

The official implementation of our ACL 2025 paper "Why Not Act on What You Know? Unleashing Safety Potential of LLMs via Self-Aware Guard Enhancement".

Python · 8 stars · 1 fork
NJUNLP/SDGO Public

The code and datasets of our EMNLP 2025 paper "SDGO: Self-Discrimination-Guided Optimization for Consistent Safety in Large Language Models".

Jupyter Notebook · 7 stars
NJUNLP/ISA Public

The code and datasets of our paper "Friend or Foe: How LLMs' Safety Mind Gets Fooled by Intent Shift Attack".

Python · 3 stars · 1 fork