Forthcoming / Pre-prints
ATLAS: Adaptive Test-Time Latent Steering with External Verifiers for Enhancing LLMs Reasoning
arXiv preprint arXiv:2601.03093
ShareChat: A Dataset of Chatbot Conversations in the Wild
arXiv preprint arXiv:2512.17843
The Limits of Obliviate: Evaluating Unlearning in LLMs via Stimulus-Knowledge Entanglement-Behavior Framework
arXiv preprint arXiv:2510.25732
SPECTRE: Conditional System Prompt Poisoning to Hijack LLMs
arXiv preprint arXiv:2505.16888
Analyzing Reasoning Shifts in Audio Deepfake Detection under Adversarial Attacks: The Reasoning Tax versus Shield Bifurcation
arXiv preprint arXiv:2601.03615
Unveiling Topological Structures from Language: A Survey of Topological Data Analysis Applications in NLP
arXiv preprint arXiv:2411.10298
2026
Verification-Guided Falsification for Safe RL via Explainable Abstraction and Risk-Aware Exploration
Journal of Autonomous Agents and Multi-Agent Systems (JAAMAS) 2026
Interpretable Failure Analysis in Multi-Agent Reinforcement Learning Systems
International Conference on Autonomous Agents and Multiagent Systems (AAMAS) 2026
2025
CtrlAct: Grounding LLMs to Bridge the Gap between Embodied Instruction and Action
NeurIPS Workshop on Foundation Models for Embodied AI (FMEA) 2025
Manipulating LLM Web Agents with Indirect Prompt Injection Attack via HTML Accessibility Tree
Conference on Empirical Methods in Natural Language Processing (EMNLP) 2025 (Demo)
What You Read Isn’t What You Hear: Linguistic Sensitivity in Deepfake Speech Detection
Conference on Empirical Methods in Natural Language Processing (EMNLP) 2025
Unraveling Interwoven Roles of Large Language Models in Authorship Privacy: Obfuscation, Mimicking, and Verification
Conference on Empirical Methods in Natural Language Processing (EMNLP) 2025
Harry Potter is Still Here! Probing Knowledge Leakage in Targeted Unlearned Large Language Models
Conference on Empirical Methods in Natural Language Processing (EMNLP) 2025 (Findings)
Turing’s Echo: Investigating Linguistic Sensitivity of Deepfake Voice Detection via Gamification
Interspeech (INTERSPEECH) 2025 (Demo)
NoisyHate: Benchmarking Content Moderation Machine Learning Models with Human-Written Perturbations Online
AAAI International Conference on Web and Social Media (ICWSM) 2025
PlagBench: Exploring the Duality of Large Language Models in Plagiarism Generation and Detection
Annual Conference of the Nations of the Americas Chapter of the ACL (NAACL) 2025
xSRL: Safety-Aware Explainable RL - Safety as a Product of Explainability
International Conference on Autonomous Agents and Multiagent Systems (AAMAS) 2025
NoMatterXAI: Generating “No Matter What” Alterfactual Examples for Explaining Black-Box Text Classification Models
AAAI Conference on Artificial Intelligence (AAAI) 2025
2024
Adapters Mixup: Mixing Parameter-Efficient Adapters to Enhance the Adversarial Robustness of Fine-tuned Pre-trained Text Classifiers
Conference on Empirical Methods in Natural Language Processing (EMNLP) 2024
Topology-Aware Authorship Attribution of Deepfake Texts with Diverse Writing Styles
European Conference on Artificial Intelligence (ECAI) 2024
Generalizability of Mixture of Domain-Specific Adapters from the Lens of Signed Weight Directions and its Application to Effective Model Pruning
Annual Meeting of the Association for Computational Linguistics (ACL) 2024
A Curious Case of Searching for the Correlation between Training Data and Adversarial Robustness of Transformer Textual Models
Findings of the Annual Meeting of the Association for Computational Linguistics (ACL) 2024
A Ship of Theseus: Curious Cases of Paraphrasing in LLM-Generated Texts
Annual Meeting of the Association for Computational Linguistics (ACL) 2024
The Unexpected Effects of Google Smart Compose on Open-Ended Writing Tasks
International Conference on Artificial Intelligence in Education (AIED) 2024
ALISON: Fast and Effective Stylometric Authorship Obfuscation
AAAI Conference on Artificial Intelligence (AAAI) 2024
The Strange Case of Jekyll and Hyde: Analysis of r/ToastMe and r/RoastMe Users on Reddit
AAAI International Conference on Web and Social Media (ICWSM) 2024
Enhancing Brand Affinity in the Face of Political Controversy: The Role of Disclosing AI Moderator on Social Media Platforms
American Academy of Advertising (AAA) 2024
2023
“Are Your Explanations Reliable?” Investigating the Stability of LIME in Explaining Text Classifiers by Marrying XAI and Adversarial Attack
Conference on Empirical Methods in Natural Language Processing (EMNLP) 2023
MULTITuDE: Large-Scale Multilingual Machine-Generated Text Detection Benchmark
Conference on Empirical Methods in Natural Language Processing (EMNLP) 2023
UPTON: Preventing Authorship Leakage from Public Text Release via Data Poisoning
Conference on Empirical Methods in Natural Language Processing (EMNLP) 2023 (Findings)
HANSEN: Human and AI Spoken Text Benchmark for Authorship Analysis
Conference on Empirical Methods in Natural Language Processing (EMNLP) 2023 (Findings)
Does Human Collaboration Enhance the Accuracy of Identifying Deepfake Texts?
AAAI Conference on Human Computation and Crowdsourcing (HCOMP) 2023
Catch Me If You GAN: Generation, Detection, and Obfuscation of Deepfake Texts
The ACM Web Conference (WWW) 2023 (Tutorial)
Do Language Models Plagiarize?
The ACM Web Conference (WWW) 2023
CryptText: Interactive Discovery and Visualization of Human-Written Text Perturbations in the Wild
IEEE International Conference on Data Engineering (ICDE) 2023
2022 & Earlier
SHIELD: Defending Textual Neural Networks against Black-Box Adversarial Attacks with Stochastic Multi-Expert Patcher
Annual Meeting of the Association for Computational Linguistics (ACL) 2022
Perturbations in the Wild: Leveraging Human-Written Text Perturbations for Realistic Adversarial Attack and Defense
Findings of the Annual Meeting of the Association for Computational Linguistics (ACL) 2022
CAPS: Comprehensible Abstract Policy Summaries for Explaining Reinforcement Learning Agents
International Conference on Autonomous Agents and Multiagent Systems (AAMAS) 2022
Socialbots on Fire: Modeling Adversarial Behaviors of Socialbots via Multi-Agent Hierarchical Reinforcement Learning
The ACM Web Conference (WWW) 2022
A Sweet Rabbit Hole by DARCY: Using Honeypots to Detect Universal Trigger’s Adversarial Attacks
Annual Meeting of the Association for Computational Linguistics (ACL) 2021
TURING-BENCH: A Benchmark Environment for Turing Test in the Age of Neural Text Generation
Conference on Empirical Methods in Natural Language Processing (EMNLP) 2021 (Findings)
CHECKER: Detecting Clickbait Thumbnails with Weak Supervision and Co-Teaching
European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML-PKDD) 2021
Does Clickbait Actually Attract More Clicks? Three Clickbait Studies You Must Read
ACM Conference on Human Factors in Computing Systems (CHI) 2021
MALCOM: Generating Malicious Comments to Attack Neural Fake News Detection Models
IEEE International Conference on Data Mining (ICDM) 2020
GRACE: Generating Concise and Informative Contrastive Sample to Explain Neural Network Model’s Prediction
ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD) 2020
Authorship Attribution for Neural Text Generation
Conference on Empirical Methods in Natural Language Processing (EMNLP) 2020
5 Sources of Clickbaits You Should Know! Using Synthetic Clickbaits to Improve Prediction and Distinguish between Bot-Generated and Human-Written Headlines
IEEE/ACM International Conference on Advances in Social Network Analysis and Mining (ASONAM) 2019
Reading, Commenting and Sharing of Fake News: How Online Bandwagons and Bots Dictate User Engagement
Communication Research 2019
How Gullible Are You? Predicting Susceptibility to Fake News
ACM Conference on Web Science (WebSci) 2019
Natural Language Processing Methods for the Study of Protein-Ligand Interactions
Journal of Chemical Information and Modeling (JCIM) 2025
Fake News is Not Simply False Information: A Concept Explication and Taxonomy of Online Content
American Behavioral Scientist 2019
A Machine Learning Framework for Automating Well Log Depth Matching
Journal of Petrophysics 2019
A Policy-Graph Approach to Explain Reinforcement Learning Agents: A Novel Policy-Graph Approach with Natural Language and Counterfactual Abstractions for Explaining Reinforcement Learning Agents
Journal of Autonomous Agents and Multi-Agent Systems (JAAMAS) 2023
PathFinder: Graph-based Itemset Embedding for Learning Course Recommendation and Beyond
IEEE International Conference on Data Mining (ICDM) 2019
Large-Scale Data-Driven Airline Market Influence Maximization
ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD) 2021