Forthcoming / Pre-prints
ATLAS: Adaptive Test-Time Latent Steering with External Verifiers for Enhancing LLMs Reasoning
arXiv preprint arXiv:2601.03093
ShareChat: A Dataset of Chatbot Conversations in the Wild
arXiv preprint arXiv:2512.17843
The Limits of Obliviate: Evaluating Unlearning in LLMs
arXiv preprint arXiv:2510.25732
Verification-Guided Falsification for Safe RL
arXiv preprint arXiv:2501.03093
SPECTRE: Conditional System Prompt Poisoning to Hijack LLMs
arXiv preprint arXiv:2505.16888
Reasoning Shifts in Audio Deepfake Detection
arXiv preprint arXiv:2601.03615
Topological Data Analysis Applications in NLP
arXiv preprint arXiv:2411.10298
Similarity Measures in Text-based Explainable AI
arXiv preprint arXiv:2406.15839
2026
Style-First Authorship Verification
AAAI-26 (SAPP)
2025
CtrlAct: Grounding LLMs to Bridge the Gap between Embodied Instruction and Action
NeurIPS 2025 Workshop FMEA
Manipulating LLM Web Agents with Indirect Prompt Injection
EMNLP 2025 (Demo)
Authorship Privacy in Large Language Models
EMNLP 2025
Probing Knowledge Leakage in Unlearned LLMs
EMNLP 2025 (Findings)
Turing's Echo: Deepfake Voice Detection Gamification
INTERSPEECH 2025 (Demo)
NoisyHate Benchmark
ICWSM 2025
PlagBench: Plagiarism Duality in LLMs
NAACL 2025
xSRL: Safety-Aware Explainable RL
AAMAS 2025
NoMatterXAI: Alterfactual Examples
AAAI 2025
2024
Adapters Mixup for Adversarial Robustness
EMNLP 2024
Topology-Aware Authorship Attribution
ECAI 2024
Training Data vs Adversarial Robustness
ACL 2024 (Findings)
Google Smart Compose Effects on Writing
AIED 2024
r/ToastMe and r/RoastMe Users on Reddit
ICWSM 2024
2023
Stability of LIME in Text Classifiers
EMNLP 2023
MULTITuDE Multilingual Detection
EMNLP 2023
UPTON: Preventing Authorship Leakage
EMNLP 2023 (Findings)
HANSEN: Spoken Text Benchmark
EMNLP 2023 (Findings)
Human Collaboration in Identifying Deepfakes
HCOMP 2023
Tutorial on Deepfake Texts
WWW 2023 (Tutorial)
Do Language Models Plagiarize?
WWW 2023
CryptText: Interactive Discovery
ICDE 2023
2022 & Earlier
SHIELD: Defending Textual Networks
ACL 2022
Perturbations in the Wild
ACL 2022 (Findings)
CAPS: Abstract Policy Summaries
AAMAS 2022
A Sweet Rabbit Hole by DARCY
ACL 2021
TURINGBENCH Benchmark
EMNLP 2021 (Findings)
CHECKER: Clickbait Detection
ECML-PKDD 2021
Clickbait Attraction Study
CHI 2021
MALCOM: Malicious Comments
ICDM 2020
Authorship Attribution for Neural Text
EMNLP 2020
5 Sources of Clickbaits
ASONAM 2019
Reading, Commenting and Sharing of Fake News
Comm. Research 2019
Susceptibility to Fake News
WebSci 2019
NLP for Protein-Ligand Interactions
JCIM 2025
Fake News Concept Explication
ABS 2019
Machine Learning Well Log Depth Matching
Journal of Petrophysics 2019
Policy-Graph Approach to Explain RL Agents
AAMASJ 2023
PathFinder: Course Recommendation
IEEE 2021