Scope and Topics

The rise of machine learning, optimization, and Large Language Models (LLMs) has created new paradigms for computing, but it has also ushered in complex privacy challenges. The intersection of AI and privacy is not merely a technical dilemma but a societal concern that demands careful consideration.
The Privacy Preserving AI workshop, in its 5th edition, will provide a multi-disciplinary platform for researchers, AI practitioners, and policymakers to focus on the theoretical and practical aspects of designing privacy-preserving AI systems and algorithms. Emphasis will be placed on policy considerations, the broader implications of privacy in LLMs, and the societal impact of privacy within AI.

Topics

We invite three categories of contributions: technical (research) papers, position papers, and systems descriptions on these subjects:
  • Differential privacy applications
  • Privacy and Fairness interplay
  • Legal frameworks and privacy policies
  • Privacy-centric machine learning and optimization
  • Benchmarking: test cases and standards
  • Ethical considerations of LLMs on users' privacy
  • The impact of LLMs on personal privacy in various applications like chatbots, recommendation systems, etc.
  • Case studies on real-world privacy challenges and solutions in deploying LLMs
  • Privacy-aware evaluation metrics and benchmarks specifically for LLMs
  • Interdisciplinary perspectives on AI applications, including sociological and economic views on privacy
  • Evaluating models to audit and/or minimize data leakages
  • Privacy and causality
  • Privacy-preserving optimization and machine learning

Finally, the workshop will welcome papers that describe the release of privacy-preserving benchmarks and data sets that can be used by the community to solve fundamental problems of interest, including in machine learning and optimization for health systems and urban networks, to mention but a few examples.

Format

The workshop will be a one-day meeting. It will include a number of technical sessions; a poster session where presenters can discuss their work, with the aim of further fostering collaborations; multiple invited talks covering crucial challenges for the field of privacy-preserving AI applications, including policy and societal impacts; and a number of tutorial talks. It will conclude with a panel discussion.


Important Dates

  • November 22, 2023 – Submission Deadline [Extended]
  • December 12, 2023 – NeurIPS/AAAI Fast Track Submission Deadline
  • December 22, 2023 – Acceptance Notification
  • January 15, 2024 – Student Scholarship Program Deadline [Extended]
  • February 26, 2024 – Workshop Date

Submission Information

Submission URL: https://cmt3.research.microsoft.com/PPAI2024

Submission Types

  • Technical Papers: Full-length research papers of up to 7 pages (excluding references and appendices) detailing high-quality work in progress or work that could potentially be published at a major conference.
  • Short Papers: Position or short papers of up to 4 pages (excluding references and appendices) that describe initial work or the release of privacy-preserving benchmarks and datasets on the topics of interest.

NeurIPS/AAAI Fast Track (Rejected NeurIPS/AAAI papers)

Rejected NeurIPS/AAAI papers with *average* scores of at least 4.0 may be submitted directly to PPAI along with their previous reviews. These submissions may go through a light review process or be accepted directly if the provided reviews are judged to meet the workshop standard.

All papers must be submitted in PDF format, using the AAAI-24 author kit. Submissions should include the name(s), affiliations, and email addresses of all authors.
Submissions will be refereed on the basis of technical quality, novelty, significance, and clarity. Each submission will be thoroughly reviewed by at least two program committee members.

NeurIPS/AAAI fast track papers are subject to the same page limits as standard submissions. Fast track papers should be accompanied by their reviews, submitted as supplemental material.

For questions about the submission process, contact the workshop chairs.


PPAI-24 scholarship application

PPAI is pleased to announce a student scholarship program for 2024. The program provides partial travel support for students who are full-time undergraduate or graduate students at colleges and universities and who have submitted papers to the workshop program or letters of recommendation from their faculty advisor.

Preference will be given to participating students presenting papers at the workshop or to students from underrepresented countries and communities.

To participate, please fill in the Student Scholarship Program application form.

Deadline: January 10, 2024.


Registration

Registration for each workshop is required of all active participants and is also open to all interested individuals. The early registration deadline is January 6th. For more information, please refer to the AAAI-24 Workshop page.

Program

February 26, 2024
All times are in Pacific Standard Time (UTC-8).

Time Talk / Presenter
8:50 Introductory remarks
9:00 Invited Talk: Everything Looks Like a Nail by Matthew Jagielski
9:30 Poster Session
10:30 Break
11:00 Tutorial: Why do we care about privacy? by Katherine Lee
11:45 Roundtable discussions
12:30 Lunch Break
14:00 Invited Talk: Navigating Privacy Risks in (Large) Language Models: Strategies and Solutions by Peter Kairouz
14:30 Spotlight Talk: Text Sanitization Beyond Specific Domains: Zero-Shot Redaction & Substitution with Large Language Models
14:40 Spotlight Talk: Epsilon*: Privacy Metric for Machine Learning Models
14:50 Spotlight Talk: De-amplifying Bias from Differential Privacy in Language Model Fine-tuning
15:00 Invited Talk: Unregulated? Think Again: Unpacking all the meaningful ways in which data privacy law regulates Generative AI by Gabriela Zanfir-Fortuna
15:30 Break
16:00 Invited Talk: Enforcing Right to Explanation: Algorithmic Challenges and Opportunities by Himabindu Lakkaraju
16:30 Panel Discussion: Privacy in Generative AI - a technical and policy discussion. Where are we and what are we missing?
17:15 Concluding Remarks

Accepted Papers

Spotlight Presentations
  • Text Sanitization Beyond Specific Domains: Zero-Shot Redaction & Substitution with Large Language Models
    Federico Albanese (University of Buenos Aires)*; Daniel Ciolek (National University of Quilmes); Nicolas D'Ippolito (ASAPP)
  • Epsilon*: Privacy Metric for Machine Learning Models
    Diana Negoescu (LinkedIn Corporation)*; Humberto Gonzalez (LinkedIn Corporation); Saad Eddin Al Orjany (LinkedIn); Jilei Yang (LinkedIn Corporation); Yuliia Lut (LinkedIn Corporation); Rahul Tandra (LinkedIn Corporation); Xiaowen Zhang (LinkedIn Corporation); Xinyi Zheng (University of Michigan, Ann Arbor, Michigan); Zach Douglas (LinkedIn Corporation); Vidita Nolkha (LinkedIn Corporation); Parvez Ahammad (LinkedIn); Gennady Samorodnitsky (Cornell University)
  • De-amplifying Bias from Differential Privacy in Language Model Fine-tuning
    Sanjari Srivastava (Stanford University)*; Piotr Mardziel (Independent); Zhikun Zhang (CISPA Helmholtz Center for Information Security); Archana Ahlawat (Princeton University); Anupam Datta (Carnegie Mellon University); John Mitchell (Stanford University)
Poster Presentations
  • Towards Machine Unlearning Benchmarks: Forgetting the Personal Identities in Facial Recognition Systems
    Dasol Choi (Kyunghee University); Dongbin Na (POSTECH)*
  • Randomized algorithms for precise measurement of differentially-private, personalized recommendations
    Allegra Laro (Apple)*; Yanqing Chen (Apple); Hao He (Apple); Babak Aghazadeh (Apple)
  • Differentially Private and Adversarially Robust Machine Learning: An Empirical Evaluation
    Janvi Thakkar (Imperial College London)*; Giulio Zizzo (IBM Research); Sergio Maffeis (Imperial College London)
  • Differentially Private Neural Tangent Kernels for Privacy-Preserving Data Generation
    Yilin Yang (University of British Columbia)*; Mi Jung Park (UBC)
  • RQP-SGD: Differential Private Machine Learning through Noisy SGD and Randomized Quantization
    Ce Feng (Lehigh University)*; Parv Venkitasubramaniam (Lehigh University)
  • Evaluating Privacy Leakage in Split Learning
    Xinchi Qiu (University of Cambridge)*; Ilias Leontiadis (Samsung AI); Luca Melis (Meta); Alexandre Sablayrolles (Facebook AI Research); Pierre Stock
  • Don't Forget Private Retrieval: Distributed Private Similarity Search for Large Language Models
    Guy Zyskind (MIT); Tobin South (MIT)*; Alex 'Sandy' Pentland (MIT)
  • Empirical Privacy Trade-Off Curves: Understanding the Gap between Theoretical and Practical Privacy Guarantees
    Mohammad Yaghini (University of Toronto & Vector Institute)*; Lukas Wutschitz (Microsoft); Santiago Zanella-Beguelin (Microsoft Research)
  • Why Does Differential Privacy with Large $\varepsilon$ Defend Against Practical Membership Inference Attacks?
    Andrew Lowy (USC)*; Zhuohang Li (Vanderbilt University); Jing Liu (MERL); Toshiaki Koike-Akino (Mitsubishi Electric Research Laboratories); Kieran Parsons (Mitsubishi Electric Research Laboratories); Ye Wang (Mitsubishi Electric Research Laboratories)
  • Group Decision-Making among Privacy-Aware Agents
    Marios Papachristou (Cornell)*; Amin Rahimian (University of Pittsburgh)
  • Differentially Private Training of Mixture of Experts Models
    Pierre Tholoniat (Columbia University)*; Huseyin Inan (Microsoft Research); Janardhan Kulkarni (Microsoft Research); Robert Sim (Microsoft Research)
  • An Efficient and Accurate Gated RNN for Execution under TFHE
    Rickard Brännvall (RISE Research Institutes of Sweden)*; Andrei Stoian (ZAMA)
  • How To Filter Out Malicious Encrypted Gradients in Federated Learning
    Jordan Barkin (Harvard University); Ratip Emin Berker (Harvard University); Sílvia Casacuberta (Harvard University)*; Janet Li (Harvard University)
  • Qrlew: Query Rewriting for Differential Privacy
    Nicolas Grislain (Sarus)*; Paul Roussel (Sarus); Victoria de Sainte Agathe (Sarus)
  • Applying Directional Noise to Deep Learning
    Pedro Faustini (Macquarie University)*; Natasha Fernandes (Macquarie University); Shakila Tonni (Macquarie University); Annabelle McIver (Macquarie University); Mark Dras (Computing Department, Macquarie University, NSW 2109)
  • Brave: Byzantine-Resilient and Privacy-Preserving Peer-to-Peer Federated Learning
    Zhangchen Xu (University of Washington); Fengqing Jiang (University of Washington); Luyao Niu (University of Washington)*; Jinyuan Jia (The Pennsylvania State University); Radha Poovendran (University of Washington)
  • Differentially Private Prediction of Large Language Models
    James Flemings (University of Southern California)*; Murali Annavaram (University of Southern California); Meisam Razaviyayn (University of Southern California)
  • Tuning Differential Privacy Mechanisms Using Causal Models of Contextual Integrity
    Sebastian Benthall (International Computer Science Institute)*; Rachel Cummings (Columbia University)
  • Training Differentially Private Ad Prediction Models with Semi-Sensitive Features
    Badih Ghazi (Google)*; Amer Sinha (Google); Avinash Varadarajan (Google AI Healthcare); Chiyuan Zhang (Google); Lynn Chua (Google); Charlie Harrison (Google); Qiliang Cui (Google); Krishna Giri Narra (University of Southern California); Pasin Manurangsi (Google); Pritish Kamath (Google Research); Walid Krichene (Google); Ravi Kumar (Google)
  • Benchmarking Private Population Data Release Mechanisms: Synthetic Data vs. TopDown
    Aadyaa Maddi (Carnegie Mellon University)*; Swadhin Routray (Carnegie Mellon University); Alexander Goldberg (Carnegie Mellon University); Giulia Fanti (CMU)
  • Prεεmpt: Sanitizing Sensitive Prompts for LLMs
    Amrita Roy Chowdhury (UCSD)*; David Glukhov (University of Toronto); Divyam Anshumaan (University of Wisconsin-Madison); Prasad Chalasani (XaiPient); Nicolas Papernot (University of Toronto and Vector Institute); Somesh Jha (University of Wisconsin-Madison and XaiPient)
  • Memory Triggers: Unveiling Memorization in Text-To-Image Generative Models through Word-Level Duplication
    Ali Naseh (University of Massachusetts Amherst)*; Jaechul Roh (University of Massachusetts Amherst); Amir Houmansadr (University of Massachusetts Amherst)
  • High Epsilon Synthetic Data Vulnerabilities in MST and PrivBayes
    Steven Golob (University of Washington Tacoma)*; Sikha Pentyala (University of Washington, Tacoma); Anuar Maratkhan (University of Washington Tacoma); Martine De Cock (University of Washington Tacoma)
  • I Can’t See It But I Can Fine-tune It: On Encrypted Fine-tuning of Transformers using Fully Homomorphic Encryption
    Prajwal Panzade (Georgia State University)*; Daniel Takabi (Old Dominion University); Zhipeng Cai (Georgia State University)
  • Performance Fairness in Differentially Private Federated Learning
    Saber Malekmohammadi (University of Waterloo)*; Afaf Taik (Mila - Quebec AI Institute, Université de Montréal); Golnoosh Farnadi (McGill University, Mila, Université de Montréal, Google)
  • Trust the Process: Zero-Knowledge Machine Learning to Enhance Trust in Generative AI Interactions
    Bianca-Mihaela Ganescu (Imperial College London); Jonathan Passerat-Palmbach (Imperial College London / Flashbots)
  • Achieving Certified Fairness with Differential Privacy
    Khang Tran (New Jersey Institute of Technology); Ferdinando Fioretto (University of Virginia); Issa Khalil (Qatar Computing Research Institute); My T. Thai (University of Florida); Hai Phan (New Jersey Institute of Technology)
  • DPFedSub: A Uncompromisingly Differentially Private Federated Learning Scheme with Randomized Subspace Descent Optimization
    Huiwen Wu (Zhejiang Lab); Cen Chen (East China Normal University); Chuan Ma (Zhejiang Lab)*; TianFang Wang (Zhejiang Lab); Zhe Liu (Zhejiang Lab)

Tutorial

Why do we care about privacy?

by Katherine Lee (Google DeepMind).

Abstract: “Privacy” means something different to every person. And “privacy” means something different in different jurisdictions. Together, we’ll explore different notions of privacy, and impacts of different privacy legislation. We’ll discuss some policies set by governments and products, and brainstorm how we might meet those notions of privacy.

Invited Talks

Everything Looks Like a Nail

by Matthew Jagielski (Google DeepMind)

Abstract:
Good definitions are one of the greatest achievements of privacy research; in machine learning, frameworks ("hammers") such as differential privacy and machine unlearning have spurred successful research programs and practical applications. In the wake of the recent explosion in interest in large, general purpose models, it is natural to begin applying our existing tools to these more modern applications. In this talk, I will discuss scenarios that make our existing hammers break down, which I hope will motivate creative future research.

Navigating Privacy Risks in (Large) Language Models: Strategies and Solutions

by Peter Kairouz (Google)

Abstract:
The new wave of large language models (LLMs) is spurring a suite of exciting opportunities: from content generation to question answering and information retrieval. However, the process of training, fine-tuning, and deploying these models comes with several important risks, including privacy, the focus of this talk.
Building on a comprehensive taxonomy for privacy threats, we demonstrate privacy vulnerabilities in LLMs fine-tuned on user data. We then show how these risks can be mitigated using a combination of techniques such as federated learning and user-level differential privacy, albeit with increased computational demands. We finally demonstrate how even moderate user-level differential privacy can completely mitigate risks against many realistic threat models. We do so by presenting a novel method that is capable of accurately estimating the privacy leakage in a one-shot fashion (i.e. in a single training run).

Unregulated? Think Again: Unpacking all the meaningful ways in which data privacy law regulates Generative AI

by Gabriela Zanfir-Fortuna (Future of Privacy Forum)

Abstract:
The Italian Privacy Commission banned ChatGPT in the country for a month in March 2023 due to several alleged breaches of the General Data Protection Regulation (GDPR). This ban rang the alarm just as Generative AI was experiencing unprecedented growth in the realm of consumer applications. Soon after, Data Protection Authorities and Privacy Commissioners in Canada, South Korea, Japan, Brazil, Poland and the US announced that they had started investigations into ChatGPT under their privacy and data protection legal frameworks. In October last year, the Global Privacy Assembly, an international organization bringing together more than 130 privacy and data protection commissioners from around the world, adopted a “Resolution on Generative Artificial Intelligence Systems”. It lays out how the decades-old fair information practice principles, such as data minimization and purpose limitation, which are encoded in data privacy laws, must be observed by developers and deployers of Generative AI systems. All of this is happening under the radar, as the public is paying attention to prominent Summits on AI and other intergovernmental initiatives that debate, under the spotlight, the best choices for vague future AI governance frameworks. This talk will explore all the meaningful and extremely concrete ways in which data protection law is applicable, now, to Generative AI in particular, and AI systems more generally. It will also explain why exactly data privacy is so relevant to how these technologies interact with the data of their users, or the data in their training datasets.

Enforcing Right to Explanation: Algorithmic Challenges and Opportunities

by Himabindu Lakkaraju (Harvard University)

Abstract:
As predictive and generative models are increasingly being deployed in various high-stakes applications in critical domains including healthcare, law, policy and finance, it becomes important to ensure that relevant stakeholders understand the behaviors and outputs of these models so that they can determine if and when to intervene. To this end, several techniques have been proposed in recent literature to explain these models. In addition, multiple regulatory frameworks (e.g., GDPR, CCPA) introduced in recent years also emphasized the importance of enforcing the key principle of “Right to Explanation” to ensure that individuals who are adversely impacted by algorithmic outcomes are provided with an actionable explanation. In this talk, I will discuss the gaps that exist between regulations and state-of-the-art technical solutions when it comes to explainability of predictive and generative models. I will then present some of our latest research that attempts to address some of these gaps. I will conclude the talk by discussing bigger challenges that arise as we think about enforcing right to explanation in the context of large language models and other large generative models.

Roundtable Discussions

Roundtable discussions are informal conversations in which ALL workshop attendees are asked to participate actively on topics of interest, fostering a collaborative environment for exchanging ideas, experiences, and insights.

Privacy in Generative AI

moderated by TBA.

Privacy and Policy

moderated by Golnoosh Farnadi (McGill University).

Invited Speakers

Katherine Lee

Google DeepMind

Peter Kairouz

Google

Matthew Jagielski

Google DeepMind

Gabriela Zanfir-Fortuna

Future of Privacy Forum

Himabindu Lakkaraju

Harvard University

PPAI-24 Panel:

Privacy in Generative AI - a technical and policy discussion. Where are we and what are we missing?

Confirmed Panelists

Gary S. Howarth

Physical Scientist, NIST

Katherine Lee

Google DeepMind

Ayaz Minhas

Privacy Policy Manager, Artificial Intelligence at Meta

Seth Neel

Assistant Professor, Harvard

Tina M. Park

Head of Inclusive Research and Design at Partnership on AI


Sponsors

Diamond

Google

Gold

Georgia Tech

Silver

OpenDP
NJIT



Interested in being a PPAI sponsor? Take a look at our Sponsor Tiers.

Code of Conduct

PPAI 2024 is committed to providing an atmosphere that encourages freedom of expression and exchange of ideas. It is the policy of PPAI 2024 that all participants will enjoy a welcoming environment free from unlawful discrimination, harassment, and retaliation.

Harassment will not be tolerated in any form, including but not limited to harassment based on gender, gender identity and expression, sexual orientation, disability, physical appearance, race, age, religion or any other status. Harassment includes the use of abusive, offensive or degrading language, intimidation, stalking, harassing photography or recording, inappropriate physical contact, sexual imagery and unwelcome sexual advances. Participants asked to stop any harassing behavior are expected to comply immediately.

Violations should be reported to the workshop chairs. All reports will be treated confidentially. The conference committee will deal with each case separately. Sanctions include, but are not limited to, exclusion from the workshop, removal of material from the online record of the conference, and referral to the violator’s university or employer. All PPAI 2024 attendees are expected to comply with these standards of behavior.


Program Committee

  • Abdullatif Mohammed Albaseer - HBKU
  • Ajinkya Mulay - Purdue University
  • Ali Ghafelebashi - University of Southern California
  • Amin Rahimian - University of Pittsburgh
  • Antti Koskela - Nokia Bell Labs
  • Audra McMillan - Apple
  • Aurélien Bellet - INRIA
  • Catuscia Palamidessi - Laboratoire d'informatique de l'École polytechnique
  • Clément Canonne - University of Sydney
  • Difang Huang - University of Hong Kong
  • Diptangshu Sen - Georgia Institute of Technology
  • Elette Boyle - IDC Herzliya
  • Ero Balsa - Cornell Tech
  • Fan Mo - Imperial College London
  • Gautam Kamath - University of Waterloo
  • Gharib Gharibi - TripleBlind
  • Graham Cormode - University of Warwick
  • Hongyan Chang - National University of Singapore
  • Ivoline Ngong - Konya Technical University
  • James Flemings - University of Southern California
  • Jason Mancuso - Cape Privacy
  • Jie Huang - University of Illinois at Urbana-Champaign
  • Karthick Prasad Gunasekaran - Amazon
  • Keegan Harris - Carnegie Mellon University
  • Krishna Acharya - Georgia Institute of Technology
  • Krishna Sri Ipsit Mantri - Purdue University
  • Krystal Maughan - University of Vermont
  • Lucas Rosenblatt - New York University
  • Ludmila Glinskih - Boston University
  • Luyao Zhang - Duke Kunshan University
  • Marco Romanelli - New York University
  • Mohammad Naseri - University College London
  • Moshe Shenfeld - Hebrew University
  • Muhammad Habib ur Rehman - King's College London
  • Pierre Tholoniat - Columbia University
  • Pratiksha Thaker - Carnegie Mellon University
  • Rakshit Naidu - Georgia Institute of Technology
  • Ranya Aloufi - Imperial College
  • Robert Mahari - Massachusetts Institute of Technology
  • Rojin Rezvan - University of Wisconsin-Madison
  • Roozbeh Yousefzadeh - Lightmatter
  • Rui-Jie Yew - Brown University
  • Sahib Singh - Ford Research (R&A)
  • Sankarshan Damle - International Institute of Information Technology, Hyderabad
  • Saswat Das - University of Virginia
  • Seohui Bae - LG AI Research
  • Sina Sajadmanesh - Sony AI
  • Stacey Truex - Denison University
  • Stephen Casper - Massachusetts Institute of Technology
  • Tao Lin - Harvard University
  • Tianhao Wang - University of Virginia
  • Vahid Behzadan - University of New Haven
  • Vandy Tombs - Oak Ridge National Laboratory
  • Vasanta Chaganti - Swarthmore College
  • Vincent Tao Hu - University of Amsterdam
  • Wanrong Zhang - Harvard University
  • Xi He - University of Waterloo
  • Yeojoon Youn - Georgia Institute of Technology
  • Yijun Bian - University of Science and Technology of China
  • Yongjun Zhao - TikTok
  • Yulu Jin - University of California, Davis

Workshop Chairs

Ferdinando Fioretto

University of Virginia

fioretto@virginia.edu

Niloofar Mireshghallah

University of Washington

niloofar@cs.washington.edu

Christine Task

Knexus Research Corporation

christine.task@knexusresearch.com

Pascal Van Hentenryck

Georgia Institute of Technology

pvh@isye.gatech.edu

Juba Ziani

Georgia Institute of Technology

juba.ziani@isye.gatech.edu