ARKAI Research Lab | AI Safety, Robustness & Accountability

Gen AI | Model Steering

ATLAS: Adaptive Test-Time Latent Steering with External Verifiers for Enhancing LLMs Reasoning

Tuc Nguyen, Thai Le

arXiv preprint arXiv:2601.03093

Pre-print

Gen AI

ShareChat: A Dataset of Chatbot Conversations in the Wild

Yueru Yan, Tuc Nguyen, Bo Su, Melissa Lieffers, Thai Le

arXiv preprint arXiv:2512.17843

Pre-print

Gen AI | Unlearning

The Limits of Obliviate: Evaluating Unlearning in LLMs via Stimulus-Knowledge Entanglement-Behavior Framework

Aakriti Shah, Thai Le

arXiv preprint arXiv:2510.25732

Pre-print

Security | Audio LM | Deepfake | Explainable AI

Analyzing Reasoning Shifts in Audio Deepfake Detection under Adversarial Attacks: The Reasoning Tax versus Shield Bifurcation

Binh Nguyen, Thai Le

arXiv preprint arXiv:2601.03615

Pre-print

Survey | TDA

Unveiling Topological Structures from Language: A Survey of Topological Data Analysis Applications in NLP

Adaku Uchendu, Thai Le

arXiv preprint arXiv:2411.10298

Survey

Security

Verification-Guided Falsification for Safe RL via Explainable Abstraction and Risk-Aware Exploration

Tuan Le, Risal Shefin, Debashis Gupta, Thai Le, Sarra Alqahtani

Journal of Autonomous Agents and Multi-Agent Systems (JAAMAS) 2026

Journal

Security | Explainable AI

Interpretable Failure Analysis in Multi-Agent Reinforcement Learning Systems

Risal Shefin, Thai Le, Sarra Alqahtani

International Conference on Autonomous Agents and Multiagent Systems (AAMAS) 2026

Conference

Security | Gen AI

SPECTRE: Conditional System Prompt Poisoning to Hijack LLMs

Viet Pham, Thai Le

Annual Meeting of the Association for Computational Linguistics (ACL) 2026

Conference

Gen AI

CtrlAct: Grounding LLMs to Bridge the Gap between Embodied Instruction and Action

Qingyang Xiao, Bo Su, Ling Sun, Zhu Zhu, Thai Le

NeurIPS Workshop on Foundation Models for Embodied AI (FMEA) 2025

Workshop

Security | Social Media | Gen AI

Manipulating LLM Web Agents with Indirect Prompt Injection Attack via HTML Accessibility Tree

Sam Johnson, Viet Pham, Thai Le

Conference on Empirical Methods in Natural Language Processing (EMNLP) 2025 (Demo)

Conference

Security | Audio LM | Deepfake

What You Read Isn’t What You Hear: Linguistic Sensitivity in Deepfake Speech Detection

Binh Nguyen, et al., Thai Le

Conference on Empirical Methods in Natural Language Processing (EMNLP) 2025

Conference

Privacy | Gen AI | Authorship

Unraveling Interwoven Roles of Large Language Models in Authorship Privacy: Obfuscation, Mimicking, and Verification

Tuc Nguyen, Yifan Hu, Thai Le

Conference on Empirical Methods in Natural Language Processing (EMNLP) 2025

Conference

Security | Gen AI | Unlearning

Harry Potter is Still Here! Probing Knowledge Leakage in Targeted Unlearned Large Language Models

Bang Trinh Tran To, Thai Le

Conference on Empirical Methods in Natural Language Processing (EMNLP) 2025 (Findings)

Conference

Security | Deepfake

Turing’s Echo: Investigating Linguistic Sensitivity of Deepfake Voice Detection via Gamification

Binh Nguyen, Thai Le

Interspeech (INTERSPEECH) 2025 (Demo)

Demo

Security | Social Media

NoisyHate: Benchmarking Content Moderation Machine Learning Models with Human-Written Perturbations Online

Yiran Ye, Thai Le, Dongwon Lee

AAAI International Conference on Web and Social Media (ICWSM) 2025

Conference

Gen AI | Deepfake

PlagBench: Exploring the Duality of Large Language Models in Plagiarism Generation and Detection

Jooyoung Lee, Thai Le

Annual Conference of the Nations of the Americas Chapter of the ACL (NAACL) 2025

Conference

Security

xSRL: Safety-Aware Explainable RL - Safety as a Product of Explainability

Risal Shefin, Thai Le

International Conference on Autonomous Agents and Multiagent Systems (AAMAS) 2025

Conference

Explainable AI

NoMatterXAI: Generating “No Matter What” Alterfactual Examples for Explaining Black-Box Text Classification Models

Tuc Nguyen, Thai Le

AAAI Conference on Artificial Intelligence (AAAI) 2025

Conference

Security

Adapters Mixup: Mixing Parameter-Efficient Adapters to Enhance the Adversarial Robustness of Fine-tuned Pre-trained Text Classifiers

Tuc Nguyen, Thai Le

Conference on Empirical Methods in Natural Language Processing (EMNLP) 2024

Conference

Gen AI | Deepfake | Authorship

Topology-Aware Authorship Attribution of Deepfake Texts with Diverse Writing Styles

Adaku Uchendu, Thai Le

European Conference on Artificial Intelligence (ECAI) 2024

Conference

Security

Generalizability of Mixture of Domain-Specific Adapters from the Lens of Signed Weight Directions and its Application to Effective Model Pruning

Tuc Nguyen, Thai Le

Annual Meeting of the Association for Computational Linguistics (ACL) 2024

Conference

Security

A Curious Case of Searching for the Correlation between Training Data and Adversarial Robustness of Transformer Textual Models

Dang Cao Cuong, Thai Le

Findings of the Annual Meeting of the Association for Computational Linguistics (ACL) 2024

Conference

Security | Deepfake

A Ship of Theseus: Curious Cases of Paraphrasing in LLM-Generated Texts

Nafis Tripto, et al., Thai Le

Annual Meeting of the Association for Computational Linguistics (ACL) 2024

Conference

Gen AI

The Unexpected Effects of Google Smart Compose on Open-Ended Writing Tasks

Robert Cummings, Thai Le

International Conference on Artificial Intelligence in Education (AIED) 2024

Conference

Security | Authorship

ALISON: Fast and Effective Stylometric Authorship Obfuscation

Eric Xin, Thai Le

AAAI Conference on Artificial Intelligence (AAAI) 2024

Conference

Social Media

The Strange Case of Jekyll and Hyde: Analysis of r/ToastMe and r/RoastMe Users on Reddit

Wooyong Jung, Thai Le

AAAI International Conference on Web and Social Media (ICWSM) 2024

Conference

Social Media

Enhancing Brand Affinity in the Face of Political Controversy: the Role of Disclosing AI Moderator on Social Media Platforms

Marie Ozanne, Maria Molina, Thai Le

American Academy of Advertising (AAA) 2024

Conference

Security

“Are Your Explanations Reliable?” Investigating the Stability of LIME in Explaining Text Classifiers by Marrying XAI and Adversarial Attack

Christopher Burger, Thai Le

Conference on Empirical Methods in Natural Language Processing (EMNLP) 2023

Conference

Gen AI | Deepfake

MULTITuDE: Large-Scale Multilingual Machine-Generated Text Detection Benchmark

Dominik Macko, et al., Thai Le

Conference on Empirical Methods in Natural Language Processing (EMNLP) 2023

Conference

Security | Privacy | Authorship

UPTON: Preventing Authorship Leakage from Public Text Release via Data Poisoning

Ziyao Wang, Thai Le

Conference on Empirical Methods in Natural Language Processing (EMNLP) 2023 (Findings)

Conference

Gen AI | Audio LM

HANSEN: Human and AI Spoken Text Benchmark for Authorship Analysis

Nafis Tripto, Thai Le

Conference on Empirical Methods in Natural Language Processing (EMNLP) 2023 (Findings)

Conference

Gen AI | Human

Does Human Collaboration Enhance the Accuracy of Identifying Deepfake Texts?

Adaku Uchendu, Thai Le

AAAI Conference on Human Computation and Crowdsourcing (HCOMP) 2023

Conference

Gen AI | Deepfake

Catch Me If You GAN: Generation, Detection, and Obfuscation of Deepfake Texts

Adaku Uchendu, Thai Le

The ACM Web Conference (WWW) 2023 (Tutorial)

Conference

Gen AI

Do Language Models Plagiarize?

Jooyoung Lee, Thai Le

The ACM Web Conference (WWW) 2023

Conference

Security

CryptText: Interactive Discovery and Visualization of Human-Written Text Perturbations in the Wild

Thai Le, Dongwon Lee

IEEE International Conference on Data Engineering (ICDE) 2023

Conference

Security

SHIELD: Defending Textual Neural Networks against Black-Box Adversarial Attacks with Stochastic Multi-Expert Patcher

Thai Le, Noseong Park, Dongwon Lee

Annual Meeting of the Association for Computational Linguistics (ACL) 2022

Conference

Security

Perturbations in the Wild: Leveraging Human-Written Text Perturbations for Realistic Adversarial Attack and Defense

Thai Le, Dongwon Lee

Findings of the Annual Meeting of the Association for Computational Linguistics (ACL) 2022

Conference

Explainable AI

CAPS: Comprehensible Abstract Policy Summaries for Explaining Reinforcement Learning Agents

Joe McCalmon, Thai Le

International Conference on Autonomous Agents and Multiagent Systems (AAMAS) 2022

Conference

Social Media | Security

Socialbots on Fire: Modeling Adversarial Behaviors of Socialbots via Multi-Agent Hierarchical Reinforcement Learning

Thai Le, Dongwon Lee

The ACM Web Conference (WWW) 2022

Conference

Security

A Sweet Rabbit Hole by DARCY: Using Honeypots to Detect Universal Triggers Adversarial Attacks

Thai Le, Dongwon Lee

Annual Meeting of the Association for Computational Linguistics (ACL) 2021

Conference

Gen AI | Deepfake

TURING-BENCH: A Benchmark Environment for Turing Test in the Age of Neural Text Generation

Adaku Uchendu, Thai Le

Conference on Empirical Methods in Natural Language Processing (EMNLP) 2021 (Findings)

Conference

Weak Supervision

CHECKER: Detecting Clickbait Thumbnails with Weak Supervision and Co-Teaching

Tianyi Xie, Thai Le

European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML-PKDD) 2021

Conference

Social Media

Does Clickbait Actually Attract More Clicks? Three Clickbait Studies You Must Read

Maria Molina, Thai Le, Dongwon Lee

ACM Conference on Human Factors in Computing Systems (CHI) 2021

Conference

Security | Social Media | Gen AI

MALCOM: Generating Malicious Comments to Attack Neural Fake News Detection Models

Thai Le, Dongwon Lee

IEEE International Conference on Data Mining (ICDM) 2020

Conference

Explainable AI

GRACE: Generating Concise and Informative Contrastive Sample to Explain Neural Network Models Prediction

Thai Le, Dongwon Lee

ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD) 2020

Conference

Gen AI | Deepfake | Authorship

Authorship Attribution for Neural Text Generation

Adaku Uchendu, Thai Le

Conference on Empirical Methods in Natural Language Processing (EMNLP) 2020

Conference

Social Media | Gen AI

5 Sources of Clickbaits You Should Know! Using Synthetic Clickbaits to Improve Prediction and Distinguish between Bot-Generated and Human-Written Headlines

Thai Le, Kai Shu, Dongwon Lee

IEEE/ACM International Conference on Advances in Social Network Analysis and Mining (ASONAM) 2019

Conference

Social Media

Reading, Commenting and Sharing of Fake News: How Online Bandwagons and Bots Dictate User Engagement

Maria Molina, Thai Le

Communication Research 2019

Journal

Social Media

How Gullible Are You? Predicting Susceptibility to Fake News

Jia Shen, Thai Le, Dongwon Lee

ACM Conference on Web Science (WebSci) 2019

Conference

Survey

Natural Language Processing Methods for the Study of Protein-Ligand Interactions

James Michels, Thai Le

Journal of Chemical Information and Modeling (JCIM) 2025

Journal

Social Media

Fake News is Not Simply False Information: A Concept Explication and Taxonomy of Online Content

Maria Molina, Thai Le

American Behavioral Scientist 2019

Journal

Signal Processing

A Machine Learning Framework for Automating Well Log Depth Matching

Thai Le, Lin Liang, Denis Helio

Journal of Petrophysics 2019

Journal

Explainable AI

A Policy-Graph Approach to Explain Reinforcement Learning Agents: A Novel Policy-Graph Approach with Natural Language and Counterfactual Abstractions for Explaining Reinforcement Learning Agents

Tongtong Liu, Thai Le

Journal of Autonomous Agents and Multi-Agent Systems (JAAMAS) 2023

Journal

Recommendations

PathFinder: Graph-based Itemset Embedding for Learning Course Recommendation and Beyond

Thai Le, et al.

IEEE International Conference on Data Mining (ICDM) 2019

Conference

Others

Large-Scale Data-Driven Airline Market Influence Maximization

Duanshun Li, Thai Le, et al.

ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD) 2021

Conference
