Sixth AAAI Workshop on Privacy-Preserving Artificial Intelligence

Scope and Topics

The rise of machine learning, optimization, and Large Language Models (LLMs) has created new paradigms for computing, but it has also ushered in complex privacy challenges. The intersection of AI and privacy is not merely a technical dilemma but a societal concern that demands careful consideration.
In its sixth edition, the AAAI Workshop on Privacy-Preserving Artificial Intelligence (PPAI-25) will provide a platform for researchers, AI practitioners, and policymakers to discuss technical and societal issues and present solutions related to privacy in AI applications. The workshop will focus on both the theoretical and practical challenges related to the design of privacy-preserving AI systems and algorithms and will have strong multidisciplinary components, including soliciting contributions about policy, legal issues, and societal impact of privacy in AI. The emphasis will be placed on: Policy considerations and legal frameworks for privacy; Broader implications of privacy in LLMs; and The societal impact of privacy within AI.

Topics

We invite three categories of contributions: technical (research) papers, position papers, and systems descriptions on these subjects:

Differential privacy: Applications
Differential privacy: Theory
Contextual integrity
Attacks on data confidentiality
Privacy and Fairness interplay
Legal frameworks and privacy policies
Privacy-centric machine learning and optimization
Benchmarking: test cases and standards
Ethical considerations of LLMs on users' privacy
The impact of LLMs on personal privacy in various applications like chatbots, recommendation systems, etc.
Case studies on real-world privacy challenges and solutions in deploying LLMs
Privacy-aware evaluation metrics and benchmarks specifically for LLMs
Interdisciplinary perspectives on AI applications, including sociological and economic views on privacy
Evaluating models to audit and/or minimize data leakages

Finally, the workshop will welcome papers that describe the release of privacy-preserving benchmarks and data sets that can be used by the community to solve fundamental problems of interest, including in machine learning and optimization for health systems and urban networks, to mention but a few examples.

Format

The workshop will be a one-day meeting. It will include a number of technical sessions; a poster session where presenters can discuss their work, with the aim of fostering further collaborations; multiple invited speakers covering crucial challenges for the field of privacy-preserving AI applications, including policy and societal impacts; and a number of tutorial talks. It will conclude with a panel discussion.

Important Dates

November 29, 2024 – Submission Deadline
December 11, 2024 – NeurIPS/AAAI Fast Track Submission Deadline
December 31, 2024 – Acceptance Notification
March 3, 2025 – Workshop Date

Attendance

Attendance is open to all. At least one author of each accepted submission must be present at the workshop.

Submission Information

Submission URL: https://cmt3.research.microsoft.com/PPAI2025

Submission Types

Technical Papers: Full-length research papers of up to 9 pages (excluding references and appendices) detailing high-quality work in progress or work that could potentially be published at a major conference. References and appendices should be included in the same file as the main paper, following the main content.
Extended Abstracts: Position papers or brief descriptions of initial work (1 page, excluding references and appendices), or descriptions of the release of privacy-preserving benchmarks and datasets on the topics of interest.

NeurIPS/AAAI Fast Track (Rejected NeurIPS/AAAI papers)

Rejected NeurIPS/AAAI papers with average scores of at least 4.5 may be submitted directly to PPAI along with their previous reviews. These submissions may go through a light review process or be accepted directly if the provided reviews are judged to meet the workshop standard.

All papers must be submitted in PDF or Word format, using one of the following templates.

Submissions should NOT include any author's name(s), affiliations, or email addresses. Submissions will be refereed on the basis of technical quality, novelty, significance, and clarity. Each submission will be thoroughly reviewed by at least two program committee members.

NeurIPS/AAAI fast track papers are subject to the same page limits as standard submissions. Fast track papers should be accompanied by their reviews, submitted as supplemental material.

For questions about the submission process, contact the workshop chairs.

PPAI-25 scholarship application

PPAI is pleased to announce a student scholarship program for 2025. The program provides partial travel support for students who are enrolled full-time as undergraduate or graduate students at colleges and universities and who have submitted papers to the workshop program or letters of recommendation from their faculty advisor.

Preference will be given to participating students presenting papers at the workshop or to students from underrepresented countries and communities.

To participate, please fill in the Student Scholarship Program application form.

Deadline: February 10, 2025

Registration

Link for registration: https://aaai.org/conference/aaai/aaai-25/registration/

Program

March 3, 2025
All times are in Eastern Standard Time (UTC-5).

Time	Session
8:50	Introductory remarks
9:00	Invited Talk by Aaron Roth
9:30	Invited Talk by Alexis Shore Ingber
10:00	Contributed Talks
	Talk 1: Understanding and Mitigating the Impacts of Differentially Private Census Data on State Level Redistricting
	Talk 2: Fairness Issues and Mitigations in (Private) Socio-demographic Data Processes
	Talk 3: Privacy-Preserving Retrieval Augmented Generation with Differential Privacy
	Talk 4: Hacking the CRC Archive: Evaluating empirical privacy metrics on deidentified data
10:30	Break
11:00	Contributed Talks
	Talk 5: "LLM on the wall, who now, is the appropriate one of all?": Contextual Integrity Evaluation of LLMs
	Talk 6: Understanding Memorization In Generative Models Through A Geometric Framework
	Talk 7: Streaming Private Continual Counting via Binning
	Talk 8: Laplace Transform Interpretation of Differential Privacy
11:30	Tutorial by Eugene Bagdasarian
12:15	Poster Session (by the registration desk)
13:30	Lunch (on your own)
14:45	Invited Talk by Amy Cyphert
15:15	Panel Discussion
15:45	Break
16:15	Invited Talk by Rachel Cummings
16:45	Concluding Remarks

Accepted Papers

Oral Presentations

Privacy-Preserving Retrieval Augmented Generation with Differential Privacy
Tatsuki Koga (UC San Diego), Ruihan Wu (UC San Diego), Kamalika Chaudhuri (UC San Diego)
"LLM on the wall, who *now*, is the appropriate one of all?": Contextual Integrity Evaluation of LLMs
Yan Shvartzshnaider (York University), Vasisht Duddu (University of Waterloo)
Laplace Transform Interpretation of Differential Privacy
Rishav Chourasia (School of Computing, National University of Singapore), Uzair Javaid (Betterdata), Biplab Sikdar (National University of Singapore)
Understanding and Mitigating the Impacts of Differentially Private Census Data on State Level Redistricting
Christian Cianfarani (University of Chicago), Aloni Cohen (University of Chicago)
Understanding Memorization In Generative Models Through A Geometric Framework
Dongjae Jeon (Yonsei University), Dueun Kim (Yonsei University), Albert No (Yonsei University)
Fairness Issues and Mitigations in (Private) Socio-demographic Data Processes
Joonhyuk Ko (University of Virginia), Juba Ziani (Georgia Institute of Technology), Saswat Das (University of Virginia), Matt Williams (RTI International), Ferdinando Fioretto (University of Virginia)
Streaming Private Continual Counting via Binning
Joel Daniel Andersson (University of Copenhagen), Rasmus Pagh (University of Copenhagen)
Hacking the CRC Archive: Evaluating empirical privacy metrics on deidentified data
Gary Howarth (National Institute of Standards and Technology), Christine Task (Knexus Research), Damon Streat (Knexus Research), Karan Bhagat (Knexus Research)

Poster Presentations

Differentially Private Iterative Screening Rules for Linear Regression
Amol Khanna (Booz Allen Hamilton), Fred Lu (Booz Allen Hamilton), Edward Raff (Booz Allen Hamilton)
OPA: One-shot Private Aggregation with Single Client Interaction and its Applications to Federated Learning
Harish Karthikeyan (JP Morgan), Antigoni Polychroniadou (JP Morgan Chase)
Decentralized Group Privacy in Cross-Silo Federated Learning
Virendra Marathe (Oracle Labs), Dave Dice (Oracle Labs), Wei Jiang (Oracle Labs), Pallika Kanani (Oracle Labs), Hamid Mozaffari (Oracle Labs)
A Stochastic Optimization Framework for Private and Fair Learning From Decentralized Data
Devansh Gupta (University of Southern California), A.S. Poornash (Indian Institute of Technology, Patna), Andrew Lowy (University of Wisconsin-Madison), Meisam Razaviyayn (University of Southern California)
Efficient and Private: Memorisation under differentially private parameter-efficient fine-tuning in language models
Olivia Ma (Imperial College London), Jonathan Passerat-Palmbach (Imperial College London/Flashbots), Dmitrii Usynin (Imperial College London)
Practical Privacy for Correlated Data
Liyue Fan (UNC Charlotte)
Risk and Response in Large Language Models: Evaluating Key Threat Categories
Bahareh Harandizadeh (University of Southern California), Abel Salinas (University of Southern California), Fred Morstatter (University of Southern California)
A Benchmark of Unlearnable Examples for Medical Images
Hyunjee Nam (Chung-Ang University), Minjung Kang (Chung-Ang University), Sunghwan Park (Chung-Ang University), Il-Youp Kwak (Chung-Ang University), Jaewoo Lee (Chung-Ang University)
Privacy Preservation for Synthetic Data Generation Using Large Language Models: A Case Study of Genomic Data
Reem Al-Saidi (University of Windsor), Ziad Kobti (University of Windsor), Thorsten Strufe (Karlsruhe Institute of Technology)
Shared DReLUs for Private Inference
Amir Jevnisek (Tel-Aviv University), Yakir Gorski (Tel-Aviv University), Shai Avidan (Tel-Aviv University)
ObfuscaTune: Obfuscated Offsite Finetuning and Inference of Proprietary LLMs on Private Datasets
Ahmed Frikha (Huawei), Nassim Walha (Huawei), Ricardo Mendes (Huawei), Krishna Kanth Nakka (Huawei), Xue Jiang (Huawei), Xuebing Zhou (Huawei)
Differential privacy in the clean room: Copyright protections for generative AI
Aloni Cohen (University of Chicago)
Private Fine-Tuning of Language Models using Secure Multiparty Computation
Arisa Tajima (UMass), Wei Jiang (Oracle), Virendra Marathe (Oracle), Adam Pocock (Oracle)
Click Without Compromise: Online Advertising Measurement via Per User Differential Privacy
Yingtai Xiao (TikTok Inc.), Jian Du (TikTok Inc.), Shikun Zhang (TikTok Inc.), Wanrong Zhang (TikTok Inc.), Qiang Yan (TikTok Inc.), Danfeng Zhang (Duke University), Daniel Kifer (Pennsylvania State University)
Testing Credibility of Public and Private Surveys Through The Lens of Regression
Debabrota Basu (Equipe Scool, Univ. Lille, Inria, CNRS, Centrale Lille, UMR 9189 - CRIStAL), Sourav Chakraborty (Indian Statistical Institute, Kolkata), Debarshi Chanda (Indian Statistical Institute, Kolkata), Buddhadev Das (Indian Statistical Institute, Kolkata), Arijit Ghosh (Indian Statistical Institute, Kolkata), Arnab Ray (Indian Statistical Institute, Kolkata)
FLIPHAT: Joint Differential Privacy for High Dimensional Sparse Linear Bandits
Sunrit Chakraborty (University of Michigan), Saptarshi Roy (University of Michigan), Debabrota Basu (Inria)
The Data Generation Chain: The Risk of Non-Maleficence and Differential Privacy
Amy Russ
PrivacyGAN: Robust Generative Image Privacy
Mariia Zameshina (ESIEE Paris), Marlene Careil (LTCI, Telecom Paris, Institut Polytechnique de Paris), Olivier Teytaud (TAO, CNRS - INRIA - LRI), Laurent Najman (Univ Gustave Eiffel, CNRS, LIGM)
Towards Privacy-Enhanced Language Models: Named Entity Recognition for Blinded Recruitment Tools
Gonzalo Mancera (Universidad Autónoma de Madrid), Aythami Morales (Universidad Autónoma de Madrid), Julian Fierrez (Universidad Autónoma de Madrid), Ruben Tolosana (Universidad Autónoma de Madrid), Alejandro Peña (Universidad Autónoma de Madrid), Francisco Jurado (Universidad Autónoma de Madrid), Alvaro Ortigosa (Universidad Autónoma de Madrid)
Exploring Audio Editing Features as User-Centric Privacy Defenses Against Emotion Inference Attacks
Mohd. Farhan Israk Soumik (Southern Illinois University Carbondale), W.K.M Mithsara (Southern Illinois University Carbondale), Abdur Rahman Bin Shahid (Southern Illinois University Carbondale), Ahmed Imteaj (Southern Illinois University Carbondale)
Targeted Data Protection for Diffusion Model by Matching Training Trajectory
Hojun Lee (Xperty Corp.), Mijin Koo (Seoul National University), Yeji Song (Seoul National University), Nojun Kwak (Seoul National University)
Flash: A Hybrid Private Inference Protocol for Deep CNNs with High Accuracy and Low Latency on CPU
Hyeri Roh (Seoul National University), Jinsu Yeo (Seoul National University), Woo-Seok Choi (Seoul National University)
Adaptive PII Mitigation Framework for Large Language Models
Shubhi Asthana (IBM Research - Almaden), Ruchi Mahindru (IBM Research), Bing Zhang (IBM Research), Jorge Sanz (IBM Research)
Seeding with Differentially Private Network Information
Amin Rahimian (University of Pittsburgh), Fang-Yi Yu (George Mason University), Yuxin Liu (University of Pittsburgh), Carlos Hurtado (University of Pittsburgh)
Conversational Privacy Attacks Against Agentic LLMs
Saswat Das (University of Virginia), Joseph Moretto (University of Virginia), David Evans (University of Virginia), Ferdinando Fioretto (University of Virginia)
Eliciting Data Subject's Privacy-Accuracy Preferences for Differentially-Private Deployments
Priyanka Nanayakkara (Northwestern University), Jayshree Sarathy (Northeastern University), Mary Anne Smart (Purdue University), Rachel Cummings (Columbia University), Gabriel Kaptchuk (University of Maryland), Elissa Redmiles (Georgetown University)
GLOCALFAIR: Jointly Improving Global and Local Group Fairness in Federated Learning
Syed Irfan Ali Meerza (University of Tennessee Knoxville), Luyang Liu (Google Research), Jiaxin Zhang (Intuit AI Research), Jian Liu (University of Tennessee Knoxville)
LegalGuardian: A Privacy-Preserving Framework for Secure Integration of Large Language Models in Legal Practice
Mikail M Demir (University at Albany SUNY), Hakan Otal (University at Albany SUNY), M Abdullah Canbaz (University at Albany SUNY)
Coincidental Generation
Jordan Suchow (Stevens Institute of Technology), Necdet Gürkan (University of Missouri, St. Louis)
Differentially Private Sequential Learning
Yuxin Liu (University of Pittsburgh), Amin Rahimian (University of Pittsburgh)
End to End Collaborative Synthetic Data Generation
Sikha Pentyala (University of Washington, Tacoma), Geetha Sitaraman (University of Washington, Tacoma), Trae Claar (University of Washington, Tacoma), Martine De Cock (University of Washington, Tacoma)
Entropy-Guided Attention for Private LLMs
Nandan Kumar Jha (New York University), Brandon Reagen (New York University)
Privacy Amplification Through Synthetic Data: Insights from Linear Regression
Clément Pierquin (Craft AI), Matthieu Boussard (Craft AI), Aurélien Bellet (INRIA), Marc Tommasi (INRIA)
Benchmarking Geolocation Privacy in the Era of Vision Language Models
Neel Jay (Apart Research), Hieu Minh Nguyen (Apart Research), Trung Dung Hoang (Apart Research), Jacob Haimes (Apart Research)

Invited Talks & Tutorials

Talk 1: What Should We Trust in Trustworthy Machine Learning? (Aaron Roth)
Abstract: Machine learning is impressive but imperfect: it makes errors. When we use ML predictions to take action, especially in high-stakes settings, we want to be cognizant of this fact and take into account our probabilistic uncertainty. There are many ways of quantifying uncertainty, but what are they good for? We take the position that probabilistic predictions should be "trustworthy" in the sense that downstream decision makers should be guaranteed that acting as if the probabilistic predictions are correct yields them high-utility outcomes relative to anything else they could do with the predictions. We give algorithms for doing this and show a number of applications.
Talk 2: Privacy by Design—or by AI? What Humans Can Teach Machines About Privacy (Alexis Shore Ingber)
Abstract: AI systems' governance over human privacy raises questions around how privacy is being defined and decided. Building on an established framework for human privacy management, this talk explores how AI is reshaping privacy through its impact on interpersonal dynamics, platform design, and law and policy. It will then apply this perspective to the deployment of emotion AI in hiring. Findings from an experimental investigation showcase how studying human interaction with AI systems can provide insights into how privacy might be designed. Ultimately, this talk calls for a reimagining of AI privacy governance—one that aligns with human practices and experiences.
Talk 3: Privacy Law is Inadequate to Regulate AI (Amy Cyphert)
Abstract: The United States has not enacted comprehensive federal privacy legislation, though it does have a patchwork of sector-specific laws (like HIPAA and FERPA) and more than a dozen states have privacy laws. This absence of comprehensive federal privacy laws complicates the already difficult process of regulating AI. Data is the fuel of AI, and data privacy laws are an important first step to mapping out AI regulations. Even where privacy laws do exist in the United States, they do not currently contribute meaningfully enough to AI regulation for a variety of reasons, including the absence of private rights of action in some privacy laws, and the difficult causation/standing issues litigants face in some privacy actions.
Talk 4: Differential privacy beyond algorithms: Challenges for successful deployment (Rachel Cummings)
Abstract: Differential privacy (DP) has been hailed as the gold standard of privacy-preserving data analysis, by providing strong privacy guarantees while still enabling use of potentially sensitive data. Formally, DP gives a mathematically rigorous worst-case bound on the maximum amount of information that can be learned about an individual's data from the output of a computation. In the past two decades, the privacy community has developed DP algorithms that satisfy this privacy guarantee and allow for accurate data analysis for a wide variety of computational problems and application domains. We have also begun to see a number of high-profile deployments of DP systems in practice, both at large technology companies and government entities. Despite the promise and success of DP thus far, there are a number of critical challenges left to be addressed before DP can be easily deployed in practice, including: mapping the mathematical privacy guarantees onto protection against real-world threats, developing explanations of its guarantees and tradeoffs for non-technical users, integration with other privacy & security tools, preventing misuse, and more.
Tutorial: Context Is Key: Evaluating Privacy and Security in AI Assistants (Eugene Bagdasarian)
Abstract: Privacy of model inputs is a new challenge when deploying AI assistants powered by large language models (LLMs) and agentic capabilities. Operating on sensitive user data, these assistants must distinguish between different contexts when deciding what information to share. In this tutorial, we will define the problem domain and introduce the connection to principles of contextual integrity—a theory defining appropriate information flows. Leveraging this theory, we will outline how AI assistants are fundamentally vulnerable to attacks exploiting context ambiguities. We will walk through our experiences studying safety mechanisms for these agents, along with the hard lessons we learned. To evaluate new agents, we will discuss creating synthetic data and user profiles, as well as obtaining appropriate norms and metrics. Finally, we will explain why blindly applying traditional operating-system methods requires caution and discuss how to adapt them for AI assistants.

Invited Speakers

Aaron Roth

University of Pennsylvania

Rachel Cummings

Columbia University

Amy Cyphert

West Virginia University

Alexis Shore Ingber

University of Michigan

Eugene Bagdasarian

University of Massachusetts, Amherst

PPAI-25 Panel:

“Understanding and regulating the privacy risks in an embodied agents world”

Annette Zimmermann

University of Wisconsin-Madison

Lauryn P. Gouldin

Syracuse University

Eugene Bagdasarian

University of Massachusetts, Amherst

Aloni Cohen

University of Chicago

Code of Conduct

PPAI 2025 is committed to providing an atmosphere that encourages freedom of expression and exchange of ideas. It is the policy of PPAI 2025 that all participants will enjoy a welcoming environment free from unlawful discrimination, harassment, and retaliation.

Harassment will not be tolerated in any form, including but not limited to harassment based on gender, gender identity and expression, sexual orientation, disability, physical appearance, race, age, religion or any other status. Harassment includes the use of abusive, offensive or degrading language, intimidation, stalking, harassing photography or recording, inappropriate physical contact, sexual imagery and unwelcome sexual advances. Participants asked to stop any harassing behavior are expected to comply immediately.

Violations should be reported to the workshop chairs. All reports will be treated confidentially. The conference committee will deal with each case separately. Sanctions include, but are not limited to, exclusion from the workshop, removal of material from the online record of the conference, and referral to the violator’s university or employer. All PPAI 2025 attendees are expected to comply with these standards of behavior.