Forked from LetterLiGo's repo "letterli-arxiv-daily"

chengyiqiu1121/letterli-arxiv-daily

Updated on 2024.11.05

Usage instructions: here

Table of Contents
  1. Text-to-Image Safety
  2. LLM Security & Privacy
  3. LLM Agent Security & Privacy
  4. Audio Deepfake

Text-to-Image Safety

Publish Date Title Authors PDF Code
2022-11-10 Red-Teaming the Stable Diffusion Safety Filter Javier Rando et.al. 2210.04610 null

LLM Security & Privacy

Publish Date Title Authors PDF Code
2024-10-28 Systematically Analyzing Prompt Injection Vulnerabilities in Diverse LLM Architectures Victoria Benjamin et.al. 2410.23308 null
2024-10-22 DEAN: Deactivating the Coupled Neurons to Mitigate Fairness-Privacy Conflicts in Large Language Models Chen Qian et.al. 2410.16672 link
2024-10-20 Jailbreaking and Mitigation of Vulnerabilities in Large Language Models Benji Peng et.al. 2410.15236 null
2024-10-18 From Solitary Directives to Interactive Encouragement! LLM Secure Code Generation by Natural Language Prompting Shigang Liu et.al. 2410.14321 null
2024-10-17 Large Language Models are Easily Confused: A Quantitative Metric, Security Implications and Typological Analysis Yiyi Chen et.al. 2410.13237 null
2024-10-07 Aligning LLMs to Be Robust Against Prompt Injection Sizhe Chen et.al. 2410.05451 link
2024-10-06 Towards Secure Tuning: Mitigating Security Risks Arising from Benign Instruction Fine-Tuning Yanrui Du et.al. 2410.04524 null
2024-10-04 MoJE: Mixture of Jailbreak Experts, Naive Tabular Classifiers as Guard for Prompt Attacks Giandomenico Cornacchia et.al. 2409.17699 null
2024-10-03 PathSeeker: Exploring LLM Security Vulnerabilities with a Reinforcement Learning-Based Jailbreak Approach Zhihao Lin et.al. 2409.14177 null
2024-10-01 Extracting Memorized Training Data via Decomposition Ellen Su et.al. 2409.12367 null
2024-10-19 Securing Large Language Models: Addressing Bias, Misinformation, and Prompt Attacks Benji Peng et.al. 2409.08087 null
2024-09-06 Recent Advances in Attack and Defense Approaches of Large Language Models Jing Cui et.al. 2409.03274 null
2024-10-11 Safety Layers in Aligned Large Language Models: The Key to LLM Security Shen Li et.al. 2408.17003 null
2024-08-27 Investigating Coverage Criteria in Large Language Models: An In-Depth Study Through Jailbreak Attacks Shide Zhou et.al. 2408.15207 null
2024-09-06 LLM-PBE: Assessing Data Privacy in Large Language Models Qinbin Li et.al. 2408.12787 null
2024-08-21 Against All Odds: Overcoming Typology, Script, and Language Confusion in Multilingual Embedding Inversion Attacks Yiyi Chen et.al. 2408.11749 link
2024-07-11 Virtual Context: Enhancing Jailbreak Attacks with Special Token Injection Yuqi Zhou et.al. 2406.19845 null
2024-06-26 Natural Language but Omitted? On the Ineffectiveness of Large Language Models' privacy policy from End-users' Perspective Shuning Zhang et.al. 2406.18100 null
2024-06-24 Noisy Neighbors: Efficient membership inference attacks against LLMs Filippo Galli et.al. 2406.16565 null
2024-06-18 Can We Trust Large Language Models Generated Code? A Framework for In-Context Learning, Security Patterns, and Code Evaluations Across Diverse LLMs Ahmad Mohsin et.al. 2406.12513 null
2024-06-17 Self and Cross-Model Distillation for LLMs: Effective Methods for Refusal Pattern Alignment Jie Li et.al. 2406.11285 null
2024-06-16 garak: A Framework for Security Probing Large Language Models Leon Derczynski et.al. 2406.11036 link
2024-06-06 AutoJailbreak: Exploring Jailbreak Attacks and Defenses through a Dependency Lens Lin Lu et.al. 2406.03805 null
2024-05-25 FastQuery: Communication-efficient Embedding Table Query for Private LLM Inference Chenqi Lin et.al. 2405.16241 null
2024-05-24 Hacc-Man: An Arcade Game for Jailbreaking LLMs Matheus Valentim et.al. 2405.15902 null
2024-06-13 SecureLLM: Using Compositionality to Build Provably Secure Language Models for Private, Sensitive, and Secret Data Abdulrahman Alabdulkareem et.al. 2405.09805 link
2024-05-03 LLM Security Guard for Code Arya Kavian et.al. 2405.01103 link
2024-04-19 CyberSecEval 2: A Wide-Ranging Cybersecurity Evaluation Suite for Large Language Models Manish Bhatt et.al. 2404.13161 link
2024-04-16 Private Attribute Inference from Images with Vision-Language Models Batuhan Tömekçe et.al. 2404.10618 null
2024-04-16 Unveiling the Misuse Potential of Base Large Language Models via In-Context Learning Xiao Wang et.al. 2404.10552 null
2024-04-12 Subtoxic Questions: Dive Into Attitude Change of LLM's Response in Jailbreak Attempts Tianyu Zhang et.al. 2404.08309 null
2024-03-20 Mapping LLM Security Landscapes: A Comprehensive Stakeholder Risk Assessment Proposal Rahul Pankajakshan et.al. 2403.13309 null
2024-03-23 Ensuring Safe and High-Quality Outputs: A Guideline Library Approach for Language Models Yi Luo et.al. 2403.11838 link
2024-03-13 Tastle: Distract Large Language Models for Automatic Jailbreak Attack Zeguan Xiao et.al. 2403.08424 link
2024-03-14 On Protecting the Data Privacy of Large Language Models (LLMs): A Survey Biwei Yan et.al. 2403.05156 null
2024-02-28 A New Era in LLM Security: Exploring Security Concerns in Real-World LLM-based Systems Fangzhou Wu et.al. 2402.18649 null
2024-06-10 Making Them Ask and Answer: Jailbreaking Large Language Models in Few Queries via Disguise and Reconstruction Tong Liu et.al. 2402.18104 link
2024-06-04 ASETF: A Novel Method for Jailbreak Attack on LLMs through Translate Suffix Embeddings Hao Wang et.al. 2402.16006 null
2024-06-18 Is the System Message Really Important to Jailbreaks in Large Language Models? Xiaotian Zou et.al. 2402.14857 null
2024-05-17 A Comprehensive Study of Jailbreak Attack versus Defense for Large Language Models Zihao Xu et.al. 2402.13457 link
2024-09-25 StruQ: Defending Against Prompt Injection with Structured Queries Sizhe Chen et.al. 2402.06363 link
2024-10-31 Fight Back Against Jailbreaking via Prompt Adversarial Tuning Yichuan Mo et.al. 2402.06255 link
2024-06-05 Text Embedding Inversion Security for Multilingual Language Models Yiyi Chen et.al. 2401.12192 link
2023-12-18 A Comprehensive Survey of Attack Techniques, Implementation, and Mitigation Strategies in Large Language Models Aysan Esmradi et.al. 2312.10982 null
2024-03-20 A Survey on Large Language Model (LLM) Security and Privacy: The Good, the Bad, and the Ugly Yifan Yao et.al. 2312.02003 null
2023-10-16 Prompt Packer: Deceiving LLMs through Compositional Instruction with Hidden Attacks Shuyu Jiang et.al. 2310.10077 null
2024-05-06 Beyond Memorization: Violating Privacy Via Inference with Large Language Models Robin Staab et.al. 2310.07298 link
2023-09-04 Baseline Defenses for Adversarial Attacks Against Aligned Language Models Neel Jain et.al. 2309.00614 null
2023-11-01 Multi-step Jailbreaking Privacy Attacks on ChatGPT Haoran Li et.al. 2304.05197 link

LLM Agent Security & Privacy

Publish Date Title Authors PDF Code
2024-10-28 Systematically Analyzing Prompt Injection Vulnerabilities in Diverse LLM Architectures Victoria Benjamin et.al. 2410.23308 null
2024-10-30 InjecGuard: Benchmarking and Mitigating Over-defense in Prompt Injection Guardrail Models Hao Li et.al. 2410.22770 link
2024-10-29 Embedding-based classifiers can detect prompt injection attacks Md. Ahsan Ayub et.al. 2410.22284 link
2024-10-28 FATH: Authentication-based Test-time Defense against Indirect Prompt Injection Attacks Jiongxiao Wang et.al. 2410.21492 link
2024-10-28 Fine-tuned Large Language Models (LLMs): Improved Prompt Injection Attacks Detection Md Abdur Rahman et.al. 2410.21337 null
2024-10-27 LLM Robustness Against Misinformation in Biomedical Question Answering Alexander Bondarenko et.al. 2410.21330 link
2024-10-28 Palisade -- Prompt Injection Detection Framework Sahasra Kokkula et.al. 2410.21146 null
2024-10-22 Breaking ReAct Agents: Foot-in-the-Door Attack Will Get You In Itay Nakash et.al. 2410.16950 null
2024-10-21 SMILES-Prompting: A Novel Approach to LLM Jailbreak Attacks in Chemical Synthesis Aidan Wong et.al. 2410.15641 link
2024-10-18 Making LLMs Vulnerable to Prompt Injection via Poisoning Alignment Zedian Shao et.al. 2410.14827 link
2024-10-18 Backdoored Retrievers for Prompt Injection Attacks on Retrieval Augmented Generation of Large Language Models Cody Clop et.al. 2410.14479 null
2024-10-09 Prompt Infection: LLM-to-LLM Prompt Injection within Multi-Agent Systems Donghyun Lee et.al. 2410.07283 null
2024-10-07 Aligning LLMs to Be Robust Against Prompt Injection Sizhe Chen et.al. 2410.05451 link
2024-10-07 A test suite of prompt injection attacks for LLM-based machine translation Antonio Valerio Miceli-Barone et.al. 2410.05047 link
2024-10-03 Agent Security Bench (ASB): Formalizing and Benchmarking Attacks and Defenses in LLM-based Agents Hanrong Zhang et.al. 2410.02644 link
2024-09-29 GenTel-Safe: A Unified Benchmark and Shielding Framework for Defending Against Prompt Injection Attacks Rongchang Li et.al. 2409.19521 null
2024-10-10 System-Level Defense against Indirect Prompt Injection Attacks: An Information Flow Control Perspective Fangzhou Wu et.al. 2409.19091 link
2024-09-23 PROMPTFUZZ: Harnessing Fuzzing Techniques for Robust Testing of Prompt Injection in LLMs Jiahao Yu et.al. 2409.14729 link
2024-09-20 Applying Pre-trained Multilingual BERT in Embeddings for Improved Malicious Prompt Injection Attacks Detection Md Abdur Rahman et.al. 2409.13331 null
2024-08-08 FDI: Attack Neural Code Generation Systems through User Feedback Channel Zhensu Sun et.al. 2408.04194 link
2024-09-09 A Study on Prompt Injection Attack Against LLM-Integrated Mobile Robotic Systems Wenxiao Zhang et.al. 2408.03515 null
2024-08-01 WHITE PAPER: A Brief Exploration of Data Exfiltration using GCG Suffixes Victor Valbuena et.al. 2408.00925 null
2024-07-23 Prompt Injection Attacks on Large Language Models in Oncology Jan Clusmann et.al. 2407.18981 null
2024-07-12 A Survey of Attacks on Large Vision-Language Models: Resources, Advances, and Future Trends Daizong Liu et.al. 2407.07403 link
2024-06-20 Prompt Injection Attacks in Defended Systems Daniil Khomsky et.al. 2406.14048 null
2024-07-18 AgentDojo: A Dynamic Environment to Evaluate Attacks and Defenses for LLM Agents Edoardo Debenedetti et.al. 2406.13352 link
2024-06-11 Knowledge Return Oriented Prompting (KROP) Jason Martin et.al. 2406.11880 null
2024-09-05 SelfDefend: LLMs Can Defend Themselves against Jailbreaking in a Practical Manner Xunguang Wang et.al. 2406.05498 null
2024-09-25 Ranking Manipulation for Conversational Search Engines Samuel Pfrommer et.al. 2406.03589 link
2024-07-19 Are you still on track!? Catching LLM Task Drift with Activations Sahar Abdelnabi et.al. 2406.00799 link
2024-06-06 Exfiltration of personal information from ChatGPT via prompt injection Gregory Schwartzman et.al. 2406.00199 null
2024-05-31 Preemptive Answer "Attacks" on Chain-of-Thought Reasoning Rongwu Xu et.al. 2405.20902 null
2024-09-24 Goal-guided Generative Prompt Injection Attack on Large Language Models Chong Zhang et.al. 2404.07234 null
2024-07-29 Fine-Tuning, Quantization, and LLMs: Navigating Unintended Outcomes Divyanshu Kumar et.al. 2404.04392 null
2024-08-24 Optimization-based Prompt Injection Attack to LLM-as-a-Judge Jiawen Shi et.al. 2403.17710 null
2024-03-20 Defending Against Indirect Prompt Injection Attacks With Spotlighting Keegan Hines et.al. 2403.14720 null
2024-03-14 Scaling Behavior of Machine Translation with Large Language Models under Prompt Injection Attacks Zhifan Sun et.al. 2403.09832 link
2024-03-07 Automatic and Universal Prompt Injection Attacks against Large Language Models Xiaogeng Liu et.al. 2403.04957 link
2024-05-02 Neural Exec: Learning (and Learning from) Execution Triggers for Prompt Injection Attacks Dario Pasquini et.al. 2403.03792 link
2024-02-16 The AI Security Pyramid of Pain Chris M. Ward et.al. 2402.11082 null
2024-02-15 AbuseGPT: Abuse of Generative AI ChatBots to Create Smishing Campaigns Ashfak Md Shibli et.al. 2402.09728 null
2024-09-25 StruQ: Defending Against Prompt Injection with Structured Queries Sizhe Chen et.al. 2402.06363 link
2024-02-08 In-Context Learning Can Re-learn Forbidden Tasks Sophie Xhonneux et.al. 2402.05723 null
2024-01-31 An Early Categorization of Prompt Injection Attacks on Large Language Models Sippo Rossi et.al. 2402.00898 null
2024-10-15 Mitigating the Influence of Distractor Tasks in LMs with Prior-Aware Decoding Raymond Douglas et.al. 2401.17692 null
2024-01-15 Signed-Prompt: A New Approach to Prevent Prompt Injection Attacks Against LLM-Integrated Applications Xuchen Suo et.al. 2401.07612 null
2024-01-02 A Novel Evaluation Framework for Assessing Resilience Against Prompt Injection Attacks in Large Language Models Daniel Wankit Yip et.al. 2401.00991 null
2024-01-08 Jatmo: Prompt Injection Defense by Task-Specific Finetuning Julien Piet et.al. 2312.17673 link
2024-03-08 Benchmarking and Defending Against Indirect Prompt Injection Attacks on Large Language Models Jingwei Yi et.al. 2312.14197 link
2023-12-12 Maatphor: Automated Variant Analysis for Prompt Injection Attacks Ahmed Salem et.al. 2312.11513 null
2024-05-25 Assessing Prompt Injection Risks in 200+ Custom GPTs Jiahao Yu et.al. 2311.11538 link
2023-11-02 Tensor Trust: Interpretable Prompt Injection Attacks from an Online Game Sam Toyer et.al. 2311.01011 null
2024-06-01 Formalizing and Benchmarking Prompt Injection Attacks and Defenses Yupei Liu et.al. 2310.12815 link
2023-11-25 Evaluating the Instruction-Following Robustness of Large Language Models to Prompt Injection Zekun Li et.al. 2308.10819 link
2023-07-03 From ChatGPT to ThreatGPT: Impact of Generative AI in Cybersecurity and Privacy Maanak Gupta et.al. 2307.00691 null
2024-03-02 Prompt Injection attack against LLM-integrated Applications Yi Liu et.al. 2306.05499 null

Audio Deepfake

Publish Date Title Authors PDF Code
2024-10-13 Prompt Tuning for Audio Deepfake Detection: Computationally Efficient Test-time Domain Adaptation with Limited Target Dataset Hideyuki Oiso et.al. 2410.09869 link
2024-10-09 Toward Robust Real-World Audio Deepfake Detection: Closing the Explainability Gap Georgia Channing et.al. 2410.07436 null
2024-10-09 Learn from Real: Reality Defender's Submission to ASVspoof5 Challenge Yi Zhu et.al. 2410.07379 null
2024-09-24 Representation Loss Minimization with Randomized Selection Strategy for Efficient Environmental Fake Audio Detection Orchid Chetia Phukan et.al. 2409.15767 null
2024-09-21 Strong Alone, Stronger Together: Synergizing Modality-Binding Foundation Models with Optimal Transport for Non-Verbal Emotion Recognition Orchid Chetia Phukan et.al. 2409.14221 null
2024-09-14 SafeEar: Content Privacy-Preserving Audio Deepfake Detection Xinfeng Li et.al. 2409.09272 link
2024-09-09 Continuous Learning of Transformer-based Audio Deepfake Detection Tuan Duy Nguyen Le et.al. 2409.05924 null
2024-08-20 Does Current Deepfake Audio Detection Model Effectively Detect ALM-based Deepfake Audio? Yuankun Xie et.al. 2408.10853 link
2024-08-14 WavLM model ensemble for audio deepfake detection David Combei et.al. 2408.07414 null
2024-08-13 Temporal Variability and Multi-Viewed Self-Supervised Representations to Tackle the ASVspoof5 Deepfake Challenge Yuankun Xie et.al. 2408.06922 null
2024-09-12 ADD 2023: Towards Audio Deepfake Detection and Analysis in the Wild Jiangyan Yi et.al. 2408.04967 null
2024-07-26 SLIM: Style-Linguistics Mismatch Model for Generalized Audio Deepfake Detection Yi Zhu et.al. 2407.18517 null
2024-07-10 Targeted Augmented Data for Audio Deepfake Detection Marcella Astrid et.al. 2407.07598 null
2024-07-01 Deepfake Audio Detection Using Spectrogram-based Feature and Ensemble of Deep Learning Models Lam Pham et.al. 2407.01777 null
2024-06-24 One-Class Learning with Adaptive Centroid Shift for Audio Deepfake Detection Hyun Myung Kim et.al. 2406.16716 null
2024-06-12 Codecfake: An Initial Dataset for Detecting LLM-based Deepfake Audio Yi Lu et.al. 2406.08112 null
2024-06-18 RawBMamba: End-to-End Bidirectional State Space Model for Audio Deepfake Detection Yujie Chen et.al. 2406.06086 link
2024-06-12 Harder or Different? Understanding Generalization of Audio Deepfake Detection Nicolas M. Müller et.al. 2406.03512 null
2024-06-09 Generalized Source Tracing: Detecting Novel Audio Deepfake Algorithm with Real Emphasis and Fake Dispersion Strategy Yuankun Xie et.al. 2406.03240 null
2024-08-13 Towards Robust Audio Deepfake Detection: A Evolving Benchmark for Continual Learning Xiaohui Zhang et.al. 2405.08596 link
2024-05-15 The Codecfake Dataset and Countermeasures for the Universally Detection of Deepfake Audio Yuankun Xie et.al. 2405.04880 link
2024-07-01 Training-Free Deepfake Voice Recognition by Leveraging Large-Scale Pre-Trained Models Alessandro Pianese et.al. 2405.02179 null
2024-04-24 CLAD: Robust Audio Deepfake Detection Against Manipulation Attacks with Contrastive Learning Haolin Wu et.al. 2404.15854 link
2024-04-23 Retrieval-Augmented Audio Deepfake Detection Zuheng Kang et.al. 2404.13892 null
2024-04-19 Enhancing Generalization in Audio Deepfake Detection: A Neural Collapse based Sampling and Training Approach Mohammed Yousif et.al. 2404.13008 null
2024-09-20 Cross-Domain Audio Deepfake Detection: Dataset and Analysis Yuang Li et.al. 2404.04904 null
2024-03-31 Heterogeneity over Homogeneity: Investigating Multilingual Speech Pre-Trained Models for Detecting Audio Deepfake Orchid Chetia Phukan et.al. 2404.00809 link
2024-03-21 Exploring Green AI for Audio Deepfake Detection Subhajit Saha et.al. 2403.14290 link
2024-03-04 A robust audio deepfake detection system via multi-view feature Yujie Yang et.al. 2403.01960 null
2023-12-15 What to Remember: Self-Adaptive Continual Learning for Audio Deepfake Detection Xiaohui Zhang et.al. 2312.09651 null
2024-01-10 Audio Deepfake Detection with Self-Supervised WavLM and Multi-Fusion Attentive Classifier Yinlin Guo et.al. 2312.08089 null
2023-10-05 Securing Voice Biometrics: One-Shot Learning Approach for Audio Deepfake Detection Awais Khan et.al. 2310.03856 null
2023-09-15 HM-Conformer: A Conformer-based audio deepfake detection system with hierarchical pooling and multi-level classification token aggregation methods Hyun-seo Shin et.al. 2309.08208 link
2024-06-12 Towards generalisable and calibrated synthetic speech detection with self-supervised representations Octavian Pascu et.al. 2309.05384 null
2023-09-06 FSD: An Initial Chinese Dataset for Fake Song Detection Yuankun Xie et.al. 2309.02232 link
2023-08-29 Audio Deepfake Detection: A Survey Jiangyan Yi et.al. 2308.14970 null
2023-08-22 Complex-valued neural networks for voice anti-spoofing Nicolas M. Müller et.al. 2308.11800 null
2023-08-20 The DKU-DUKEECE System for the Manipulation Region Location Task of ADD 2023 Zexin Cai et.al. 2308.10281 null
2023-06-27 TranssionADD: A multi-frame reinforcement based sequence tagging model for audio deepfake detection Jie Liu et.al. 2306.15212 null
2023-05-30 Pseudo-Siamese Network based Timbre-reserved Black-box Adversarial Attack in Speaker Identification Qing Wang et.al. 2305.19020 null
2023-05-25 Betray Oneself: A Novel Audio DeepFake Detection Model via Mono-to-Stereo Conversion Rui Liu et.al. 2305.16353 link
2023-05-23 ADD 2023: the Second Audio Deepfake Detection Challenge Jiangyan Yi et.al. 2305.13774 null
2023-06-10 Defense Against Adversarial Attacks on Audio DeepFake Detection Piotr Kawa et.al. 2212.14597 link
2022-10-12 SpecRNet: Towards Faster and More Accessible Audio DeepFake Detection Piotr Kawa et.al. 2210.06105 link
2022-08-02 Audio Deepfake Detection Based on a Combination of F0 Information and Real Plus Imaginary Spectrogram Features Jun Xue et.al. 2208.01214 null
2022-07-21 Attack Agnostic Dataset: Towards Generalization and Stabilization of Audio DeepFake Detection Piotr Kawa et.al. 2206.13979 link
2024-08-27 Does Audio Deepfake Detection Generalize? Nicolas M. Müller et.al. 2203.16263 null
2022-03-03 The Vicomtech Audio Deepfake Detection System based on Wav2Vec2 for the 2022 ADD Challenge Juan M. Martín-Doñas et.al. 2203.01573 null
2024-07-02 ADD 2022: the First Audio Deep Synthesis Detection Challenge Jiangyan Yi et.al. 2202.08433 null
2021-11-04 WaveFake: A Data Set to Facilitate Audio Deepfake Detection Joel Frank et.al. 2111.02813 link
2021-06-26 Generalized Spoofing Detection Inspired from Audio Generation Artifacts Yang Gao et.al. 2104.04111 null
