Thai Le

welcome!

Welcome to ARKAI Research Lab (Accountable, Resilient, and Kind AI). Our research lab is on a mission to enhance the robustness, safety, and transparency of AI-related technologies in various sociotechnical contexts, ensuring that society and netizens can harness their power with confidence and clarity. We hope to associate AI with kindness not to humanize AI, but to emphasize the much-needed responsible and ethical development and use of AI, and to encourage AI applications that nurture humans and society.

try out our demo!

Linguistic Sensitivity of Deepfake Voice Detectors

Attacking LLM-enabled Web Autonomous Agents

research projects

How can we keep AI systems safe from security and privacy threats? This research focuses on protecting AI models from being tricked or manipulated. It explores ways to prevent attacks that exploit weaknesses, ensuring AI makes reliable and secure decisions. [ ACL'21, ACL'22a, ACL'22b, EMNLP'23c, ICDE'23, AAAI'24, ACL'24b, ACL'24c, EMNLP'24 ]
How can we improve and detect AI-generated artifacts? This research focuses on improving AI’s ability to generate natural, meaningful language and speech while also developing methods to detect AI-written and AI-spoken content, ensuring authenticity and preventing misuse. [ EMNLP'20, EMNLP'21, EMNLP'23b, EMNLP'23d, HCOMP'23, WWW'23a, WWW'23b, SIGKDD Explorations'23, NAACL'24, ACL'24a, ECAI'24, NAACL'25, INTERSPEECH'25 ]
How to explain AI's decisions to end-users in an easy-to-understand and secure way? This research explores making AI decisions more transparent while addressing vulnerabilities in explanations. It aims to ensure AI insights are trustworthy, resistant to manipulation, and easy for end users to understand, contributing to safer and more reliable AI systems. [ KDD'20, AAMAS'22, AAMASJ'23, EMNLP'23a, AAAI'25, AAMAS'25 ]
How to effectively apply NLP to solve pressing social and scientific problems? We apply our methodology to critical NLP applications, including detecting online fakes and hate speech, enhancing education, analyzing social media, and, more recently, biology. [ ICDM'20, CHI'21, WWW'22, AAAI-ICWSM'24, AIED'24, JCIM'25, ICWSM'25 ]

From 2025

07/2024 - Manipulating LLM Web Agents with Indirect Prompt Injection Attack via HTML Accessibility Tree demo is available!

05/2025 - Congrats to Tuc, who was awarded the CS Graduate Research Award 2025!

05/2025 - Congrats to Sam, who graduated with his MS degree in Data Science and is joining Advanced Space in Colorado!

05/2025 - New work on Finding Hidden Harry Potter - Knowledge Leakage in LLMs Unlearning Algorithms is available!

05/2025 - New work on What You Read Isn't What You Hear: Linguistic Sensitivity of Deepfake Speech Detectors is available!

05/2025 - New work on Can LLMs be Weaponized for Information Manipulation? is available!

05/2025 - New work on Triple Roles of LLMs in Authorship Privacy: Obfuscation, Mimicking and Verification is available!

05/2025 - New work on Explainable Falsification for Safe Reinforcement Learning is available!

05/2025 - Turing's Echo: Can you fool deepfake voice detectors better than AI? is accepted to INTERSPEECH 2025! Please try out our game demo!

04/2025 - Congrats to research intern Cuong Nguyen, who just accepted a PhD position at Virginia Tech starting Fall 2025!

04/2025 - NoisyHate: Benchmarking Hatespeech Detection with Natural Perturbations is accepted to ICWSM 2025!

02/2025 - PlagBench: Duality of LLMs in Plagiarism Generation and Detection is accepted to NAACL 2025!

01/2025 - Safety-Aware Explainable Reinforcement Learning is accepted to AAMAS 2025!

01/2025 - "No Matter What" Alterfactual Examples is accepted to AAAI 2025!

01/2025 - Happy New Year!

From 2024

11/2024 - Survey of TDA Methods for NLP is released!

09/2024 - Adapters Mixup is accepted to EMNLP 2024!
We use mixup with adapters to fine-tune pre-trained language models, enhancing their adversarial robustness against unknown future attacks.

07/2024 - Deepfake Text Detection with Topological Analysis is accepted to ECAI 2024!

06/2024 - We delivered our tutorial on Deepfake Text Detection and Obfuscation at NAACL 2024 (Mexico City) with Penn State, MIT Lincoln Lab and GPT-Zero

06/2024 - Beyond Individual Facts for GPT Models is available
We investigate the knowledge locality of GPT models for not just one fact but a group of conceptually related facts.

06/2024 - PlagBench is available.
We collect data and benchmark LLMs' dual behaviors: (1) plagiarism generation via summarization and (2) plagiarism detection.

06/2024 - XAI Similarity Metrics is available.
We investigate a variety of similarity measures designed for text-based adversarial attacks on Explainable AI.

06/2024 - A Ship of Theseus: Curious Cases of Paraphrasing in LLM-Generated Texts (Nafis, Penn State) is accepted to ACL 2024

06/2024 - Searching for the Correlation between Training Data and Adversarial Robustness (Cuong, FPT) is accepted to ACL 2024

06/2024 - Generalizability of Mixture of Domain-Specific Adapters from the Lens of Signed Weight Directions (Tuc, IUB) is accepted to ACL 2024

05/2024 - Mentee Jun (Oxford High School) was awarded 3rd place (SPECIAL AWARD by NSA) at ISEF 2024. Congratulations!

05/2024 - Lisette accepted a research-based MSc program at Wake Forest University with a full-ride scholarship. Congratulations!

04/2024 - Chris passed his doctorate prospectus. Congratulations!

03/2024 - Our paper The Unexpected Effects of Google Smart Compose on Open-Ended Writing Tasks is accepted to AIED 2024

01/2024 - Happy New Year!

From 2023

12/2023 - Rishabh successfully defended his Master's thesis. Congratulations!

12/2023 - Our paper ALISON: Fast and Effective Stylometric Authorship Obfuscation accepted to AAAI'24
This paper proposes a simple, fast, and effective algorithm to hide or mask the true authorship of texts, even ChatGPT-generated ones.
Congratulations to Eric (PhD at UMD) and Saranya (PhD at PSU)!

11/2023 - Our paper ``Enhancing Brand Affinity in the Face of Political Controversy: the Role of Disclosing AI Moderator on Social Media Platforms'' accepted to AAA'24.
This is a collaborative effort with Maria (Michigan State University) and Marie (Cornell University).
We designed a user interface to evaluate the effects of AI moderation disclosure on brand affinity.

11/2023 - PC Member: ACL Rolling Review (EACL'24)

11/2023 - Our paper ``The Strange Case of Jekyll and Hyde: Analysis of r/ToastMe and r/RoastMe Users on Reddit'' accepted to ICWSM'24, FINALLY!
This paper analyzes why some people frequently joined both r/ToastMe and r/RoastMe subreddits at the same time.
Congratulations to the team, especially Wooyong (Penn State)!

10/2023 - Our papers are accepted to EMNLP'23. Congratulations to
KInIT and Pike Lab (multilingual neural text detection),
Jason (UMD, preventing privacy leakage in texts on social media),
Nafis (Penn State, written vs. spoken neural texts) and
Chris (Ole Miss, fooling XAI explanations)!

10/2023 - Our tutorial on deepfake text detection and obfuscation accepted at NAACL'24

09/2023 - Dr. Le participated in the LEVEL UP workshop organized by CRA in Atlanta

08/2023 - ``Does Human Collaboration Enhance the Accuracy of Identifying Deepfake Texts?'' accepted to AAAI HCOMP 2023

08/2023 - Welcome new lab members: Hoang Tran and Tuc Nguyen

07/2023 - Sijan and Chris successfully defended their Master's theses. Congratulations!

07/2023 - Our tutorial session on deepfake text detection accepted to 2023 NSF Cybersecurity Summit

07/2023 - PC Member: AAAI'24

07/2023 - Invited guest-lecturer at REU@PSU. Happy to see old/new faces at Penn State!

07/2023 - Paper on Explaining RL agents via policy-graph+NLP+counterfactual abstractions is accepted to AAMAS Journal

06/2023 - Preprint of XAIFooler - fooling AI explanations on text classifiers is available.

05/2023 - Congratulations to mentees Jason Zijao (PhD at UMD), Eric Xing (PhD at WashU), and Jayesh Khandelwal (MS at NYU) on their graduate admissions.

03/2023 - Preprint of NoisyHate - an adversarial toxic texts dataset with human-written perturbations is available

01/2023 - Paper on the plagiarism behaviors of LLMs is accepted to WWW'23

01/2023 - Preprint on Unattributable Authorship Text is available

From 2022

2022 - Tutorial ``Catch Me If You GAN: Generation, Detection, and Obfuscation of Deepfake Texts'' accepted at WWW'23, with Prof. Dongwon Lee and Adaku Uchendu.

2022 - Survey paper on Authorship Detection of Deepfake Texts will be published in SIGKDD Explorations

2022 - Demo paper on perturbations in the wild is accepted at ICDE'23

2022 - PC Member: PKDD'22, EMNLP'22, WSDM'23, AAAI'23, WWW'23

2022 - Dr. Le accepted a tenure-track faculty position at the University of Mississippi

2022 - Dr. Le received the IST Ph.D. Student Award for Research Excellence, College of IST, PSU

2022 - Our papers on adversarial texts are accepted at ACL'22

2022 - Paper on RL-based Adversarial Socialbots is accepted at WWW'22.

2022 - Paper on Explainable RL is accepted at AAMAS'22.

2022 - Invited talk ``AI & Machine Learning for Communication Researchers'' at Michigan State University