Publications
Autonomy Matters: A Study on Personalization-Privacy Dilemma in LLM Agents
Zhiping Zhang, Yi Evie Zhang, Freda Shi, Tianshi Li
Preprint Large Language Model (LLM) agents require personal information for personalization in order to better act on users' behalf in daily tasks, but this raises privacy concerns and a personalization-privacy dilemma. Agent autonomy introduces both risks and opportunities, yet its effects remain unclear. To better understand this, we conducted a 3×3 between-subjects experiment (N=450) to study how an agent's autonomy level and personalization influence users' privacy concerns, trust, and willingness to use, as well as the underlying psychological processes. We find that personalization without considering users' privacy preferences increases privacy concerns and decreases trust and willingness to use. Autonomy moderates these effects: intermediate autonomy flattens the impact of personalization compared to the no-autonomy and full-autonomy conditions. Our results suggest that rather than aiming for perfect model alignment in output generation, balancing agent autonomy and user control offers a promising path to mitigating the personalization-privacy dilemma.
Operationalizing Data Minimization for Privacy-Preserving LLM Prompting
Jijie Zhou, Niloofar Mireshghallah, Tianshi Li
Preprint The rapid deployment of large language models (LLMs) in consumer applications has led to frequent exchanges of personal information. To obtain useful responses, users often share more than necessary, increasing privacy risks via memorization, context-based personalization, or security breaches. We present a framework to formally define and operationalize data minimization: for a given user prompt and response model, quantifying the least privacy-revealing disclosure that maintains utility, and we propose a priority-queue tree search to locate this optimal point within a privacy-ordered transformation space. We evaluated the framework on four datasets spanning open-ended conversations (ShareGPT, WildChat) and knowledge-intensive tasks with single-ground-truth answers (CaseHold, MedQA), quantifying achievable data minimization with nine LLMs as the response model. Our results demonstrate that larger frontier LLMs can tolerate stronger data minimization than smaller open-source models while maintaining task quality (85.7% redaction for GPT-5 vs. 19.3% for Qwen2.5-0.5B). By comparing with our search-derived benchmarks, we find that LLMs struggle to predict optimal data minimization directly, showing a bias toward abstraction that leads to oversharing. This suggests not just a privacy gap, but a capability gap: models may lack awareness of what information they actually need to solve a task.
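The core of the framework above is a search for the minimally revealing prompt that still gets the task done. Below is a minimal, illustrative sketch of a priority-queue (best-first) tree search over a privacy-ordered transformation space; the function names, scoring interface, and monotone-utility assumption are mine for illustration, not the paper's released implementation.

```python
import heapq

def minimize_disclosure(original_prompt, children_of, reveals, keeps_utility):
    """Best-first search over a tree of progressively redacted/abstracted prompts.

    children_of(p)   -> variants of p that disclose less information than p
    reveals(p)       -> numeric privacy score (lower = less revealing)
    keeps_utility(p) -> True if the response model still solves the task given p

    Assumes utility only degrades as more information is removed, so a branch
    is abandoned once a candidate no longer keeps utility.
    """
    frontier = [(reveals(original_prompt), original_prompt)]  # min-heap by disclosure
    seen = {original_prompt}
    best = None  # least-revealing utility-preserving prompt found so far

    while frontier:
        score, prompt = heapq.heappop(frontier)
        if not keeps_utility(prompt):
            continue  # under the monotonicity assumption, its descendants fail too
        if best is None or score < best[0]:
            best = (score, prompt)
        for child in children_of(prompt):  # e.g., redact or abstract one more detail
            if child not in seen:
                seen.add(child)
                heapq.heappush(frontier, (reveals(child), child))

    # Fall back to the original prompt if no transformation keeps utility.
    return original_prompt if best is None else best[1]
```

In practice, `children_of` would apply redaction or abstraction operators to one piece of personal information at a time, and `keeps_utility` would query the response model and check its answer against the task's quality bar.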
PrivacyMotiv: Speculative Persona Journeys for Empathic and Motivating Privacy Reviews in UX Design
Zeya Chen, Jianing Wen, Ruth Schmidt, Yaxing Yao, Toby Jia-Jun Li, Tianshi Li
Preprint UX professionals routinely conduct design reviews, yet privacy concerns are often overlooked, not only because of limited tools but, more critically, because of low intrinsic motivation. Limited privacy knowledge, weak empathy for unexpectedly affected users, and low confidence in identifying harms make it difficult to address risks. We present PrivacyMotiv, an LLM-powered system that supports privacy-oriented design diagnosis by generating speculative personas with UX user journeys centered on individuals vulnerable to privacy risks. Drawing on narrative strategies, the system constructs relatable and attention-drawing scenarios that show how ordinary design choices may cause unintended harms, expanding the scope of privacy reflection in UX. In a within-subjects study with professional UX practitioners (N=16), we compared participants' self-proposed methods with PrivacyMotiv across two privacy review tasks. Results show significant improvements in empathy, intrinsic motivation, and perceived usefulness. This work contributes a promising privacy review approach that addresses the motivational barriers in privacy-aware UX.
Not My Agent, Not My Boundary? Elicitation of Personal Privacy Boundaries in AI-Delegated Information Sharing
Bingcan Guo, Eryue Xu, Zhiping Zhang, Tianshi Li
Preprint Aligning AI systems with human privacy preferences requires understanding individuals' nuanced disclosure behaviors beyond general norms. Yet eliciting such boundaries remains challenging due to the context-dependent nature of privacy decisions and the complex trade-offs involved. We present an AI-powered elicitation approach that probes individuals' privacy boundaries through a discriminative task. We conducted a between-subjects study that systematically varied communication roles and delegation conditions, resulting in 1,681 boundary specifications from 169 participants for 61 scenarios. We examined how these contextual factors and individual differences influence boundary specification. Quantitative results show that communication roles influence individuals' acceptance of detailed and identifiable disclosure; that AI delegation and individuals' need for privacy heighten sensitivity to disclosed identifiers; and that AI delegation results in less consensus across individuals. Our findings highlight the importance of situating privacy preference elicitation within real-world data flows. We advocate using nuanced privacy boundaries as an alignment goal for future AI systems.
Why am I seeing this: Democratizing End User Auditing for Online Content Recommendations
Chaoran Chen, Leyang Li, Luke Cao, Yanfang Ye, Tianshi Li, Yaxing Yao, Toby Jia-jun Li
UIST 2025 Personalized recommendation systems tailor content based on user attributes, which are either provided or inferred from private data. Research suggests that users often hypothesize about the reasons behind the content they encounter (e.g., “I see this jewelry ad because I am a woman”), but they lack the means to confirm these hypotheses due to the opaqueness of these systems. This hinders informed decision-making about privacy and system use and contributes to the lack of algorithmic accountability. To address these challenges, we introduce a new interactive sandbox approach. This approach creates sets of synthetic user personas and corresponding personal data that embody realistic variations in personal attributes, allowing users to test their hypotheses by observing how a website’s algorithms respond to these personas. We tested the sandbox in the context of targeted advertising. Our user study demonstrates its usability, usefulness, and effectiveness in empowering end-user auditing of targeted ads.
Privacy Leakage Overshadowed by Views of AI: A Study on Human Oversight of Privacy in Language Model Agent
Zhiping Zhang, Bingcan Guo, Tianshi Li
Preprint Language model (LM) agents can boost productivity in personal tasks like replying to emails but pose privacy risks. We present the first study (N=300) on people’s ability to oversee LM agents’ privacy implications in asynchronous communication. Participants sometimes preferred agent-generated responses with greater privacy leakage, increasing harmful disclosures from 15.7% to 55.0%. We identified six privacy profiles reflecting different concerns, trust, and preferences. Our findings inform the design of agentic systems that support privacy-preserving interactions and better align with users’ privacy expectations.
Rescriber: Smaller-LLM-Powered User-Led Data Minimization for LLM-Based Chatbots
Jijie Zhou, Eryue Xu, Yaoyao Wu, Tianshi Li
CHI 2025 The rise of LLM-based conversational agents has led to increased disclosure of sensitive information, yet current systems lack user control over privacy-utility tradeoffs. We present Rescriber, a browser extension that enables user-led data minimization by detecting and sanitizing personal information in prompts. In a study (N=12), Rescriber reduced unnecessary disclosures and addressed user privacy concerns. Users rated the Llama3-8B-powered system comparably to GPT-4o. Trust was shaped by the tool’s consistency and comprehensiveness. Our findings highlight the promise of lightweight, on-device privacy controls for enhancing trust and protection in AI systems.
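At a high level, Rescriber's user-led minimization is a detect-review-sanitize loop: a small local model flags personal information in the prompt, the user decides what to redact, and approved spans are replaced with placeholders before the prompt leaves the device. The sketch below illustrates that loop only; the prompt wording, placeholder format, and function names are hypothetical, not the extension's actual code.

```python
import json

DETECT_PROMPT = (
    "List every piece of personal information (names, emails, addresses, IDs) "
    "in the following text as a JSON array of strings.\nText:\n{text}"
)

def detect_pii(prompt_text, local_llm):
    """Ask a small on-device model (e.g., a Llama3-8B endpoint) to list PII spans."""
    raw = local_llm(DETECT_PROMPT.format(text=prompt_text))
    try:
        spans = json.loads(raw)
    except json.JSONDecodeError:
        return []
    # Keep only spans that actually appear in the prompt.
    return [s for s in spans if isinstance(s, str) and s in prompt_text]

def sanitize(prompt_text, spans, approved):
    """Replace the spans the user approved for redaction with numbered placeholders."""
    for i, span in enumerate(spans):
        if approved(span):  # the approval step belongs to the user, making minimization user-led
            prompt_text = prompt_text.replace(span, f"[REDACTED_{i}]")
    return prompt_text
```

Because detection runs on a smaller local model, the raw prompt only reaches the cloud chatbot in its sanitized form.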
The Obvious Invisible Threat: LLM-Powered GUI Agents' Vulnerability to Fine-Print Injections
Chaoran Chen, Zhiping Zhang, Bingcan Guo, Shang Ma, Ibrahim Khalilov, Simret A Gebreegziabher, Yanfang Ye, Ziang Xiao†, Yaxing Yao†, Tianshi Li†, Toby Jia-Jun Li†
Preprint A Large Language Model (LLM) powered GUI agent is a specialized autonomous system that performs tasks on the user’s behalf according to high-level instructions. It does so by perceiving and interpreting the graphical user interfaces (GUIs) of relevant apps, often visually, inferring necessary sequences of actions, and then interacting with GUIs by executing actions such as clicking, typing, and tapping. To complete real-world tasks, such as filling forms or booking services, GUI agents often need to process and act on sensitive user data. However, this autonomy introduces new privacy and security risks. Adversaries can inject malicious content into GUIs that alters agent behaviors or induces unintended disclosures of private information. These attacks often exploit the discrepancy between visual saliency for agents and human users, or the agent’s limited ability to detect violations of contextual integrity in task automation. In this paper, we characterized six types of such attacks and conducted an experimental study to test these attacks with six state-of-the-art GUI agents, 234 adversarial webpages, and 39 human participants. Our findings suggest that GUI agents are highly vulnerable, particularly to contextually embedded threats. Moreover, human users are also susceptible to many of these attacks, indicating that simple human oversight may not reliably prevent failures. This misalignment highlights the need for privacy-aware agent design. We propose practical defense strategies to inform the development of safer and more reliable GUI agents.
Toward a Human-centered Evaluation Framework for Trustworthy LLM-powered GUI Agents
Chaoran Chen*, Zhiping Zhang*, Ibrahim Khalilov, Bingcan Guo, Simret A Gebreegziabher, Yanfang Ye, Ziang Xiao†, Yaxing Yao†, Tianshi Li†, Toby Jia-Jun Li†
CHI 2025 HEAL Workshop The rise of LLM-powered GUI agents has advanced automation but introduced significant privacy and security risks due to limited human oversight. This position paper identifies three key risks unique to GUI agents, highlights gaps in current evaluation practices, and outlines five challenges in integrating human evaluators. We advocate for a human-centered evaluation framework that embeds risk assessments, in-context consent, and privacy and security considerations into GUI agent design.
Secret Use of Large Language Model (LLM)
Zhiping Zhang, Chenxinran Shen, Bingsheng Yao, Dakuo Wang, and Tianshi Li
CSCW 2025 The advancements of Large Language Models (LLMs) have decentralized the responsibility for the transparency of AI usage. Specifically, LLM users are now encouraged or required to disclose the use of LLM-generated content for varied types of real-world tasks. However, an emerging phenomenon, users’ secret use of LLMs, raises challenges in ensuring end users adhere to the transparency requirement. Our study used a mixed-methods approach, combining an exploratory survey (125 real-world secret use cases reported) with a controlled experiment among 300 users, to investigate the contexts and causes behind the secret use of LLMs. We found that such secretive behavior is often triggered by certain tasks, transcending demographic and personality differences among users. Task types were found to affect users’ intentions to use LLMs secretly, primarily by influencing perceived external judgment regarding LLM usage. Our results yield important insights for future work on designing interventions to encourage more transparent disclosure of LLM/AI use.
PrivacyLens: Evaluating Privacy Norm Awareness of Language Models in Action
Yijia Shao, Tianshi Li, Weiyan Shi, Yanchen Liu, and Diyi Yang
NeurIPS 2024 As language models (LMs) gain agency in personal communication, ensuring privacy compliance is critical but hard to evaluate. We introduce PrivacyLens, a framework that expands privacy-sensitive seeds into vignettes and trajectories for multi-level privacy risk assessment. Testing GPT-4 and Llama-3-70B, we find sensitive information leakage in 25.68% and 38.69% of cases, even with privacy prompts. PrivacyLens also supports dynamic red-teaming through trajectory variations.
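Conceptually, the PrivacyLens expansion turns an abstract privacy-sensitive seed into increasingly concrete evaluation material: a vignette, then an agent trajectory, then a leakage judgment on the agent's final action. The following is a simplified, hypothetical sketch of that multi-level idea; the seed fields, prompts, and helper names are assumptions for illustration, not the released benchmark code.

```python
from dataclasses import dataclass

@dataclass
class Seed:
    data_type: str     # e.g., "a colleague's health condition"
    data_subject: str  # whose information it is
    source: str        # where the agent encounters it (e.g., an email thread)
    recipient: str     # who should not receive it

def expand_to_vignette(seed, llm):
    """Level 1: turn the seed into a concrete scenario the agent will act in."""
    return llm(
        f"Write a short scenario where {seed.source} mentions {seed.data_subject}'s "
        f"{seed.data_type}, and the user asks an assistant to write to {seed.recipient}."
    )

def leaked(final_action, seed, judge_llm):
    """Final level: judge whether the agent's action reveals the sensitive detail."""
    verdict = judge_llm(
        f"Does the following message reveal {seed.data_subject}'s {seed.data_type}? "
        f"Answer yes or no.\n\n{final_action}"
    )
    return verdict.strip().lower().startswith("yes")
```

The intermediate level, omitted here, would construct the agent trajectory from the vignette; varying that trajectory is what enables the dynamic red-teaming the abstract mentions.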
“It’s a Fair Game”, or Is It? Examining How Users Navigate Disclosure Risks and Benefits When Using LLM-Based Conversational Agents
Zhiping Zhang, Michelle Jia, Hao-Ping (Hank) Lee, Bingsheng Yao, Sauvik Das, Ada Lerner, Dakuo Wang, and Tianshi Li
CHI 2024 The widespread use of Large Language Model (LLM)-based conversational agents (CAs), especially in high-stakes domains, raises many privacy concerns. Building ethical LLM-based CAs that respect user privacy requires an in-depth understanding of the privacy risks that concern users the most. However, existing research, primarily model-centered, does not provide insight into users’ perspectives. To bridge this gap, we analyzed sensitive disclosures in real-world ChatGPT conversations and conducted semi-structured interviews with 19 LLM-based CA users. We found that users are constantly faced with trade-offs between privacy, utility, and convenience when using LLM-based CAs. However, users’ erroneous mental models and the dark patterns in system design limited their awareness and comprehension of the privacy risks. Additionally, the human-like interactions encouraged more sensitive disclosures, which complicated users’ ability to navigate the trade-offs. We discuss practical design guidelines and the need for paradigmatic shifts to protect the privacy of LLM-based CA users.
Human-centered privacy research in the age of large language models
Tianshi Li, Sauvik Das, Hao-Ping (Hank) Lee, Dakuo Wang, Bingsheng Yao, and Zhiping Zhang
CHI 2024 SIG The rise of large language models (LLMs) in user-facing systems has raised significant privacy concerns. Existing research largely focuses on model-centric risks like memorization and inference attacks. We call for more human-centered research on how LLM design impacts user disclosure, privacy preferences, and control. To build usable and privacy-friendly systems, we aim to spark discussion around research agendas, methods, and collaborations across usable privacy, human-AI interaction, NLP, and related fields. This Special Interest Group (SIG) invites diverse researchers to share insights and chart future directions.