ARKAI Research Lab | AI Safety, Robustness & Accountability

Gen AI | Unlearning

PreUnlearn: Auditing Collateral Knowledge Damage Before Large Language Model Unlearning

Bo Su, Ankit Shah, Thai Le

arXiv preprint arXiv:2606.18473

Pre-print

Security | Gen AI | Model Steering

Adversarial Robustness of Activation Steering in Large Language Models

Kien Le, Thai Le

arXiv preprint arXiv:2606.07696

Pre-print

Gen AI | Model Steering

Beyond Linear Activation Steering: Invertible Latent Transformations for Controlling LLM Behavior

Tuc Nguyen, Thai Le

arXiv preprint arXiv:2606.08454

Pre-print

Gen AI | Model Steering

ATLAS: Verifier-Guided Adaptive Latent Activation Steering for Efficient LLM Reasoning

Tuc Nguyen, Thai Le

arXiv preprint arXiv:2601.03093

Pre-print

Gen AI

ShareChat: A Dataset of Chatbot Conversations in the Wild

Yueru Yan, Tuc Nguyen, Bo Su, Melissa Lieffers, Thai Le

arXiv preprint arXiv:2512.17843

Pre-print

Gen AI | Unlearning

The Limits of Obliviate: Evaluating Unlearning in LLMs via Stimulus-Knowledge Entanglement-Behavior Framework

Aakriti Shah, Thai Le

arXiv preprint arXiv:2510.25732

Pre-print

Security | Audio LM | Deepfake | Explainable AI

SARA: Stress Test Reasoning in Audio Deepfake Detection

Binh Nguyen, Charles Fleming, Thai Le

arXiv preprint arXiv:2601.03615

Pre-print

Security

Verification-Guided Falsification for Safe RL via Explainable Abstraction and Risk-Aware Exploration

Tuan Le, Risal Shefin, Debashis Gupta, Thai Le, Sarra Alqahtani

Journal of Autonomous Agents and Multi-Agent Systems (JAAMAS) 2026

Journal

Security | Explainable AI

Interpretable Failure Analysis in Multi-Agent Reinforcement Learning Systems

Risal Shahriar Shefin, Debashis Gupta, Thai Le, Sarra Alqahtani

International Conference on Autonomous Agents and Multiagent Systems (AAMAS) 2026

Conference

Survey | Topological Data Analysis

Unveiling Topological Structures from Language: A Survey of Topological Data Analysis Applications in NLP

Adaku Uchendu, Thai Le

KDD Explorations, June 2026

Journal

Security | Gen AI

PARASITE: Conditional System Prompt Poisoning to Hijack LLMs

Viet Pham, Thai Le

Annual Meeting of the Association for Computational Linguistics (ACL) 2026

Conference

Gen AI

CtrlAct: Grounding LLMs to Bridge the Gap between Embodied Instruction and Action

Qingyang Xiao, Bo Su, Ling Sun, Zhu Zhu, Thai Le

NeurIPS Workshop on Foundation Models for Embodied AI (FMEA) 2025

Workshop

Security | Social Media | Gen AI

Manipulating LLM Web Agents with Indirect Prompt Injection Attack via HTML Accessibility Tree

Sam Johnson, Viet Pham, Thai Le

Conference on Empirical Methods in Natural Language Processing (EMNLP) 2025 (Demo)

Conference

Security | Audio LM | Deepfake

What You Read Isn’t What You Hear: Linguistic Sensitivity in Deepfake Speech Detection

Binh Nguyen, Shuju Shi, Ryan Ofman, Thai Le

Conference on Empirical Methods in Natural Language Processing (EMNLP) 2025

Conference

Privacy | Gen AI | Authorship

Unraveling Interwoven Roles of Large Language Models in Authorship Privacy: Obfuscation, Mimicking, and Verification

Tuc Nguyen, Yifan Hu, Thai Le

Conference on Empirical Methods in Natural Language Processing (EMNLP) 2025

Conference

Security | Gen AI | Unlearning

Harry Potter is Still Here! Probing Knowledge Leakage in Targeted Unlearned Large Language Models

Bang Trinh Tran To, Thai Le

Conference on Empirical Methods in Natural Language Processing (EMNLP) 2025 (Findings)

Conference

Security | Deepfake

Turing’s Echo: Investigating Linguistic Sensitivity of Deepfake Voice Detection via Gamification

Binh Nguyen, Thai Le

Interspeech (INTERSPEECH) 2025 (Demo)

Demo

Security | Social Media

NoisyHate: Benchmarking Content Moderation Machine Learning Models with Human-Written Perturbations Online

Yiran Ye, Thai Le, Dongwon Lee

AAAI International Conference on Web and Social Media (ICWSM) 2025

Conference

Gen AI | Deepfake

PlagBench: Exploring the Duality of Large Language Models in Plagiarism Generation and Detection

Jooyoung Lee, Toshini Agrawal, Adaku Uchendu, Thai Le, Jinghui Chen, Dongwon Lee

Annual Conference of the Nations of the Americas Chapter of the ACL (NAACL) 2025

Conference

Security

xSRL: Safety-Aware Explainable RL - Safety as a Product of Explainability

Risal Shahriar Shefin, Md Asifur Rahman, Thai Le, Sarra Alqahtani

International Conference on Autonomous Agents and Multiagent Systems (AAMAS) 2025

Conference

Explainable AI

NoMatterXAI: Generating “No Matter What” Alterfactual Examples for Explaining Black-Box Text Classification Models

Tuc Nguyen, James Michels, Hua Shen, Thai Le

AAAI Conference on Artificial Intelligence (AAAI) 2025

Conference

Security

Adapters Mixup: Mixing Parameter-Efficient Adapters to Enhance the Adversarial Robustness of Fine-tuned Pre-trained Text Classifiers

Tuc Nguyen, Thai Le

Conference on Empirical Methods in Natural Language Processing (EMNLP) 2024

Conference

Gen AI | Topological Data Analysis | Deepfake | Authorship

Topology-Aware Authorship Attribution of Deepfake Texts with Diverse Writing Styles

Adaku Uchendu, Thai Le, Dongwon Lee

European Conference on Artificial Intelligence (ECAI) 2024

Conference

Security

Generalizability of Mixture of Domain-Specific Adapters from the Lens of Signed Weight Directions and its Application to Effective Model Pruning

Tuc Nguyen, Thai Le

Annual Meeting of the Association for Computational Linguistics (ACL) 2024

Conference

Security

A Curious Case of Searching for the Correlation between Training Data and Adversarial Robustness of Transformer Textual Models

Dang Cao Cuong, Dung D. Le, Thai Le

Findings of the Annual Meeting of the Association for Computational Linguistics (ACL) 2024

Conference

Security | Deepfake

A Ship of Theseus: Curious Cases of Paraphrasing in LLM-Generated Texts

Nafis Irtiza Tripto, Saranya Venkatraman, Dominik Macko, Robert Moro, Ivan Srba, Adaku Uchendu, Thai Le, Dongwon Lee

Annual Meeting of the Association for Computational Linguistics (ACL) 2024

Conference

Gen AI

The Unexpected Effects of Google Smart Compose on Open-Ended Writing Tasks

Robert Cummings, Sijan Shrestha, Carrie Smith, Thai Le

International Conference on Artificial Intelligence in Education (AIED) 2024

Conference

Security | Authorship

ALISON: Fast and Effective Stylometric Authorship Obfuscation

Eric Xin, Saranya Venkatraman, Thai Le, Dongwon Lee

AAAI Conference on Artificial Intelligence (AAAI) 2024

Conference

Social Media

The Strange Case of Jekyll and Hyde: Analysis of r/ToastMe and r/RoastMe Users on Reddit

Wooyong Jung, Nishant Asati, Lucy Phuong Doan, Thai Le, Aiping Xiong, Dongwon Lee

AAAI International Conference on Web and Social Media (ICWSM) 2024

Conference

Social Media

Enhancing Brand Affinity in the Face of Political Controversy: the Role of Disclosing AI Moderator on Social Media Platforms

Marie Ozanne, Maria Molina, Thai Le

American Academy of Advertising (AAA) 2024

Conference

Security

“Are Your Explanations Reliable?” Investigating the Stability of LIME in Explaining Text Classifiers by Marrying XAI and Adversarial Attack

Christopher Burger, Lingwei Chen, Thai Le

Conference on Empirical Methods in Natural Language Processing (EMNLP) 2023

Conference

Gen AI | Deepfake

MULTITuDE: Large-Scale Multilingual Machine-Generated Text Detection Benchmark

Dominik Macko, Robert Moro, Adaku Uchendu, Jason Lucas, Michiharu Yamashita, Matúš Pikuliak, Ivan Srba, Thai Le, Dongwon Lee, Jakub Simko, Maria Bielikova

Conference on Empirical Methods in Natural Language Processing (EMNLP) 2023

Conference

Security | Privacy | Authorship

UPTON: Preventing Authorship Leakage from Public Text Release via Data Poisoning

Ziyao Wang, Thai Le, Dongwon Lee

Conference on Empirical Methods in Natural Language Processing (EMNLP) 2023 (Findings)

Conference

Gen AI | Audio LM

HANSEN: Human and AI Spoken Text Benchmark for Authorship Analysis

Nafis Irtiza Tripto, Adaku Uchendu, Thai Le, Mattia Setzu, Fosca Giannotti, Dongwon Lee

Conference on Empirical Methods in Natural Language Processing (EMNLP) 2023 (Findings)

Conference

Gen AI | Human

Does Human Collaboration Enhance the Accuracy of Identifying Deepfake Texts?

Adaku Uchendu, Jooyoung Lee, Hua Shen, Thai Le, Ting-Hao Huang, Dongwon Lee

AAAI Conference on Human Computation and Crowdsourcing (HCOMP) 2023

Conference

Gen AI | Deepfake

Catch Me If You GAN: Generation, Detection, and Obfuscation of Deepfake Texts

Adaku Uchendu, Thai Le, Dongwon Lee

The ACM Web Conference (WWW) 2023 (Tutorial)

Conference

Gen AI

Do Language Models Plagiarize?

Jooyoung Lee, Thai Le, Jinghui Chen, Dongwon Lee

The ACM Web Conference (WWW) 2023

Conference

Security

CryptText: Interactive Discovery and Visualization of Human-Written Text Perturbations in the Wild

Thai Le, Ye Yiran, Yifan Hu, Dongwon Lee

IEEE International Conference on Data Engineering (ICDE) 2023

Conference

Security

SHIELD: Defending Textual Neural Networks against Black-Box Adversarial Attacks with Stochastic Multi-Expert Patcher

Thai Le, Noseong Park, Dongwon Lee

Annual Meeting of the Association for Computational Linguistics (ACL) 2022

Conference

Security

Perturbations in the Wild: Leveraging Human-Written Text Perturbations for Realistic Adversarial Attack and Defense

Thai Le, Jooyoung Lee, Kevin Yen, Yifan Hu, Dongwon Lee

Findings of the Annual Meeting of the Association for Computational Linguistics (ACL) 2022

Conference

Explainable AI

CAPS: Comprehensible Abstract Policy Summaries for Explaining Reinforcement Learning Agents

Joe McCalmon, Thai Le, Sarra Alqahtani, Dongwon Lee

International Conference on Autonomous Agents and Multiagent Systems (AAMAS) 2022

Conference

Social Media | Security

Socialbots on Fire: Modeling Adversarial Behaviors of Socialbots via Multi-Agent Hierarchical Reinforcement Learning

Thai Le, Dongwon Lee

The ACM Web Conference (WWW) 2022

Conference

Security

A Sweet Rabbit Hole by DARCY: Using Honeypots to Detect Universal Triggers Adversarial Attacks

Thai Le, Noseong Park, Dongwon Lee

Annual Meeting of the Association for Computational Linguistics (ACL) 2021

Conference

Gen AI | Deepfake

TURING-BENCH: A Benchmark Environment for Turing Test in the Age of Neural Text Generation

Adaku Uchendu, Zeyu Ma, Thai Le, Rui Zhang, Dongwon Lee

Conference on Empirical Methods in Natural Language Processing (EMNLP) 2021 (Findings)

Conference

Weak Supervision

CHECKER: Detecting Clickbait Thumbnails with Weak Supervision and Co-Teaching

Tianyi Xie, Thai Le, Dongwon Lee

European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML-PKDD) 2021

Conference

Social Media

Does Clickbait Actually Attract More Clicks? Three Clickbait Studies You Must Read

Maria D. Molina, S. Shyam Sundar, Md Main Uddin Rony, Naeemul Hassan, Thai Le, Dongwon Lee

ACM Conference on Human Factors in Computing Systems (CHI) 2021

Conference

Security | Social Media | Gen AI

MALCOM: Generating Malicious Comments to Attack Neural Fake News Detection Models

Thai Le, Suhang Wang, Dongwon Lee

IEEE International Conference on Data Mining (ICDM) 2020

Conference

Explainable AI

GRACE: Generating Concise and Informative Contrastive Sample to Explain Neural Network Models Prediction

Thai Le, Suhang Wang, Dongwon Lee

ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD) 2020

Conference

Gen AI | Deepfake | Authorship

Authorship Attribution for Neural Text Generation

Adaku Uchendu, Thai Le, Kai Shu, Dongwon Lee

Conference on Empirical Methods in Natural Language Processing (EMNLP) 2020

Conference

Social Media | Gen AI

5 Sources of Clickbaits You Should Know! Using Synthetic Clickbaits to Improve Prediction and Distinguish between Bot-Generated and Human-Written Headlines

Thai Le, Kai Shu, Maria D Molina, Dongwon Lee, S Shyam Sundar, Huan Liu

IEEE/ACM International Conference on Advances in Social Network Analysis and Mining (ASONAM) 2019

Conference

Social Media

Reading, Commenting and Sharing of Fake News: How Online Bandwagons and Bots Dictate User Engagement

Maria Molina, Jinping Wang, S. Shyam Sundar, Thai Le, Carlina DiRusso

Communication Research 2019

Journal

Social Media

How Gullible Are You? Predicting Susceptibility to Fake News

Tracy Jia Shen, Robert Cowell, Aditi Gupta, Thai Le, Amulya Yadav, Dongwon Lee

ACM Conference on Web Science (WebSci) 2019

Conference

Survey

Natural Language Processing Methods for the Study of Protein-Ligand Interactions

James Michels, Ramya Bandarupalli, Amin Ahangar Akbari, Thai Le, Hong Xiao, Jing Li, Erik F. Y. Hom

Journal of Chemical Information and Modeling (JCIM) 2025

Journal

Social Media

Fake News is Not Simply False Information: A Concept Explication and Taxonomy of Online Content

Maria D Molina, S Shyam Sundar, Thai Le, Dongwon Lee

American Behavioral Scientist 2019

Journal

Signal Processing

A Machine Learning Framework for Automating Well Log Depth Matching

Thai Le, Lin Liang, Timon Zimmermann, Smaine Zeroug, Denis Helio

Journal of Petrophysics 2019

Journal

Explainable AI

A Policy-Graph Approach to Explain Reinforcement Learning Agents: A Novel Policy-Graph Approach with Natural Language and Counterfactual Abstractions for Explaining Reinforcement Learning Agents

Tongtong Liu, Thai Le

Journal of Autonomous Agents and Multi-Agent Systems (JAAMAS) 2023

Journal

Recommendations

PathFinder: Graph-based Itemset Embedding for Learning Course Recommendation and Beyond

Jiasheng Zhang, Thai Le, Yiming Liao, Dongwon Lee

IEEE International Conference on Data Mining (ICDM) 2019

Conference

Others

Large-Scale Data-Driven Airline Market Influence Maximization

Duanshun Li, Jing Liu, Jinsung Jeon, Seoyoung Hong, Thai Le, Noseong Park, Dongwon Lee

ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD) 2021

Conference
