Publication | NEU PEACH Lab

Privacy Leakage Overshadowed by Views of AI: A Study on Human Oversight of Privacy in Language Model Agent

Zhiping Zhang, Bingcan Guo, Tianshi Li

Preprint

Language model (LM) agents can boost productivity in personal tasks like replying to emails but pose privacy risks. We present the first study (N=300) on people’s ability to oversee LM agents’ privacy implications in asynchronous communication. Participants sometimes preferred agent-generated responses with greater privacy leakage, increasing harmful disclosures from 15.7% to 55.0%. We identified six privacy profiles reflecting different concerns, trust, and preferences. Our findings inform the design of agentic systems that support privacy-preserving interactions and better align with users’ privacy expectations.

arXiv

Rescriber: Smaller-LLM-Powered User-Led Data Minimization for LLM-Based Chatbots

Jijie Zhou, Eryue Xu, Yaoyao Wu, Tianshi Li

CHI 2025

The rise of LLM-based conversational agents has led to increased disclosure of sensitive information, yet current systems lack user control over privacy-utility tradeoffs. We present Rescriber, a browser extension that enables user-led data minimization by detecting and sanitizing personal information in prompts. In a study (N=12), Rescriber reduced unnecessary disclosures and addressed user privacy concerns. Users rated the Llama3-8B-powered system comparably to GPT-4o. Trust was shaped by the tool’s consistency and comprehensiveness. Our findings highlight the promise of lightweight, on-device privacy controls for enhancing trust and protection in AI systems.

arXiv Video

The Obvious Invisible Threat: LLM-Powered GUI Agents' Vulnerability to Fine-Print Injections

Chaoran Chen, Zhiping Zhang, Bingcan Guo, Shang Ma, Ibrahim Khalilov, Simret A Gebreegziabher, Yanfang Ye, Ziang Xiao†, Yaxing Yao†, Tianshi Li†, Toby Jia-Jun Li†

Preprint

A Large Language Model (LLM) powered GUI agent is a specialized autonomous system that performs tasks on the user’s behalf according to high-level instructions. It does so by perceiving and interpreting the graphical user interfaces (GUIs) of relevant apps, often visually, inferring the necessary sequences of actions, and then interacting with the GUIs by executing actions such as clicking, typing, and tapping. To complete real-world tasks, such as filling forms or booking services, GUI agents often need to process and act on sensitive user data. However, this autonomy introduces new privacy and security risks. Adversaries can inject malicious content into the GUIs that alters agent behaviors or induces unintended disclosures of private information. These attacks often exploit the discrepancy between visual saliency for agents and human users, or the agent’s limited ability to detect violations of contextual integrity in task automation. In this paper, we characterized six types of such attacks and conducted an experimental study to test these attacks with six state-of-the-art GUI agents, 234 adversarial webpages, and 39 human participants. Our findings suggest that GUI agents are highly vulnerable, particularly to contextually embedded threats. Moreover, human users are also susceptible to many of these attacks, indicating that simple human oversight may not reliably prevent failures. This misalignment highlights the need for privacy-aware agent design. We propose practical defense strategies to inform the development of safer and more reliable GUI agents.

arXiv

Toward a Human-centered Evaluation Framework for Trustworthy LLM-powered GUI Agents

Chaoran Chen*, Zhiping Zhang*, Ibrahim Khalilov, Bingcan Guo, Simret A Gebreegziabher, Yanfang Ye, Ziang Xiao†, Yaxing Yao†, Tianshi Li†, Toby Jia-Jun Li†

CHI 2025 HEAL Workshop

The rise of LLM-powered GUI agents has advanced automation but introduced significant privacy and security risks due to limited human oversight. This position paper identifies three key risks unique to GUI agents, highlights gaps in current evaluation practices, and outlines five challenges in integrating human evaluators. We advocate for a human-centered evaluation framework that embeds risk assessments, in-context consent, and privacy and security considerations into GUI agent design.

arXiv

Secret Use of Large Language Model (LLM)

Zhiping Zhang, Chenxinran Shen, Bingsheng Yao, Dakuo Wang, and Tianshi Li

CSCW 2025

The advancements of Large Language Models (LLMs) have decentralized the responsibility for the transparency of AI usage. Specifically, LLM users are now encouraged or required to disclose the use of LLM-generated content for varied types of real-world tasks. However, an emerging phenomenon, users’ secret use of LLMs, raises challenges in ensuring that end users adhere to the transparency requirement. Our mixed-methods study combined an exploratory survey (125 reported real-world secret use cases) with a controlled experiment among 300 users to investigate the contexts and causes behind the secret use of LLMs. We found that such secretive behavior is often triggered by certain tasks, transcending demographic and personality differences among users. Task types were found to affect users’ intentions to engage in secretive behavior, primarily by influencing perceived external judgment regarding LLM usage. Our results yield important insights for future work on designing interventions to encourage more transparent disclosure of LLM/AI use.

arXiv

PrivacyLens: Evaluating Privacy Norm Awareness of Language Models in Action

Yijia Shao, Tianshi Li, Weiyan Shi, Yanchen Liu, and Diyi Yang

NeurIPS 2024

As language models (LMs) gain agency in personal communication, ensuring privacy compliance is critical but hard to evaluate. We introduce PrivacyLens, a framework that expands privacy-sensitive seeds into vignettes and trajectories for multi-level privacy risk assessment. Testing GPT-4 and Llama-3-70B, we find sensitive information leakage in 25.68% and 38.69% of cases, respectively, even with privacy prompts. PrivacyLens also supports dynamic red-teaming through trajectory variations.

arXiv Code

“It's a Fair Game”, or Is It? Examining How Users Navigate Disclosure Risks and Benefits When Using LLM-Based Conversational Agents

Zhiping Zhang, Michelle Jia, Hao-Ping (Hank) Lee, Bingsheng Yao, Sauvik Das, Ada Lerner, Dakuo Wang, and Tianshi Li

CHI 2024

The widespread use of Large Language Model (LLM)-based conversational agents (CAs), especially in high-stakes domains, raises many privacy concerns. Building ethical LLM-based CAs that respect user privacy requires an in-depth understanding of the privacy risks that concern users the most. However, existing research, primarily model-centered, does not provide insight into users’ perspectives. To bridge this gap, we analyzed sensitive disclosures in real-world ChatGPT conversations and conducted semi-structured interviews with 19 LLM-based CA users. We found that users are constantly faced with trade-offs between privacy, utility, and convenience when using LLM-based CAs. However, users’ erroneous mental models and the dark patterns in system design limited their awareness and comprehension of the privacy risks. Additionally, the human-like interactions encouraged more sensitive disclosures, which complicated users’ ability to navigate the trade-offs. We discuss practical design guidelines and the need for paradigmatic shifts to protect the privacy of LLM-based CA users.

arXiv Video

Human-centered privacy research in the age of large language models

Tianshi Li, Sauvik Das, Hao-Ping (Hank) Lee, Dakuo Wang, Bingsheng Yao, and Zhiping Zhang

CHI 2024 SIG

The rise of large language models (LLMs) in user-facing systems has raised significant privacy concerns. Existing research largely focuses on model-centric risks like memorization and inference attacks. We call for more human-centered research on how LLM design impacts user disclosure, privacy preferences, and control. To build usable and privacy-friendly systems, we aim to spark discussion around research agendas, methods, and collaborations across usable privacy, human-AI interaction, NLP, and related fields. This Special Interest Group (SIG) invites diverse researchers to share insights and chart future directions.

arXiv