2021 IEEE International Symposium on Information Theory

12-20 July 2021 • Melbourne, Victoria, Australia

IEEE Information Theory Society

Institute of Electrical and Electronics Engineers (IEEE)

All Dates/Times are Australian Eastern Standard Time (AEST)

Technical Program

Paper Detail

Paper ID	D2-S1-T4.2
Paper Title	A Linear Reduction Method for Local Differential Privacy and Log-lift
Authors	Ni Ding, University of Melbourne, Australia; Yucheng Liu, University of Newcastle, Australia; Farhad Farokhi, University of Melbourne, Australia
Session	D2-S1-T4: Local Differential Privacy
Chaired Session:	Tuesday, 13 July, 22:00 - 22:20
Engagement Session:	Tuesday, 13 July, 22:20 - 22:40
Abstract	This paper considers the problem of publishing data $X$ while protecting the correlated sensitive information $S$. We propose a linear method to generate the sanitized data $Y$ with the same alphabet $\mathcal{Y} = \mathcal{X}$ that attains local differential privacy (LDP) and log-lift at the same time. It is revealed that both LDP and log-lift are inversely proportional to the statistical distance between the conditional probability $P_{Y|S}(x|s)$ and the marginal probability $P_{Y}(x)$: the closer the two probabilities are, the more private $Y$ is. Specifying $P_{Y|S}(x|s)$ that linearly reduces this distance, $|P_{Y|S}(x|s) - P_Y(x)| = (1-\alpha)|P_{X|S}(x|s) - P_X(x)|, \forall s,x$ for some $\alpha \in (0,1]$, we study the problem of how to generate $Y$ from the original data $S$ and $X$. The Markov randomization/sanitization scheme $P_{Y|X}(x|x') = P_{Y|S,X}(x|s,x')$ is obtained by solving linear equations. The optimal non-Markov sanitization, the transition probability $P_{Y|S,X}(x|s,x')$ that depends on $S$, can be determined by maximizing the data utility subject to linear equality constraints on data privacy. We compute the solution for two linear utility functions: the expected distance and the total variation distance. It is shown that the non-Markov randomization significantly improves data utility and that the marginal probability $P_X(x)$ remains the same after the linear sanitization method: $P_Y(x) = P_X(x), \forall x \in \mathcal{X}$.
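
The linear reduction described in the abstract can be illustrated numerically. The following is a minimal sketch, not the authors' implementation. It assumes the sign-preserving form $P_{Y|S}(x|s) = P_X(x) + (1-\alpha)(P_{X|S}(x|s) - P_X(x))$, which satisfies the stated distance equation. The toy distribution, all function names, and the mixing channel $P_{Y|X}(y|x') = (1-\alpha)\mathbb{1}\{y = x'\} + \alpha P_X(y)$ are hypothetical. The mixing channel is only one Markov channel that happens to meet the linear constraints, and it is not necessarily the scheme derived in the paper.

```python
import math

def marginal(p_s, p_cond):
    """P(x) = sum_s P_S(s) * P(x|s); p_cond[s][x] is a row-stochastic table."""
    n_x = len(p_cond[0])
    return [sum(p_s[s] * p_cond[s][x] for s in range(len(p_s)))
            for x in range(n_x)]

def linear_reduction(p_s, p_x_given_s, alpha):
    """Assumed target P_{Y|S}: shrink each deviation from P_X by (1 - alpha)."""
    p_x = marginal(p_s, p_x_given_s)
    return [[p_x[x] + (1.0 - alpha) * (row[x] - p_x[x])
             for x in range(len(p_x))]
            for row in p_x_given_s]

def max_abs_log_lift(p_s, p_cond):
    """Largest |log(P(x|s) / P(x))| over all (s, x); needs positive entries."""
    p = marginal(p_s, p_cond)
    return max(abs(math.log(row[x] / p[x]))
               for row in p_cond for x in range(len(p)))

def mixing_channel(p_x, alpha):
    """Hypothetical Markov channel: keep x' w.p. (1 - alpha), else resample from P_X."""
    n = len(p_x)
    return [[(1.0 - alpha) * (1.0 if y == x else 0.0) + alpha * p_x[y]
             for y in range(n)]
            for x in range(n)]

def push_through(p_x_given_s, channel):
    """P_{Y|S}(y|s) = sum_x' P_{X|S}(x'|s) * P_{Y|X}(y|x')."""
    return [[sum(row[x] * channel[x][y] for x in range(len(row)))
             for y in range(len(channel[0]))]
            for row in p_x_given_s]

# Hypothetical toy distribution: 2 sensitive values, 3 data symbols.
p_s = [0.4, 0.6]
p_x_given_s = [[0.7, 0.2, 0.1],
               [0.1, 0.3, 0.6]]
alpha = 0.5

p_x = marginal(p_s, p_x_given_s)
p_y_given_s = linear_reduction(p_s, p_x_given_s, alpha)
p_y = marginal(p_s, p_y_given_s)

lift_before = max_abs_log_lift(p_s, p_x_given_s)
lift_after = max_abs_log_lift(p_s, p_y_given_s)

# The mixing channel induces the same P_{Y|S} as the linear target.
induced = push_through(p_x_given_s, mixing_channel(p_x, alpha))

print("P_X          :", p_x)
print("P_Y          :", p_y)
print("max |log-lift| before / after:", lift_before, lift_after)
```

In this toy case the marginal is unchanged and the largest absolute log-lift shrinks, in line with the abstract's claims. The sketch does not cover the utility-maximizing non-Markov sanitization, which would need a linear program over $P_{Y|S,X}$.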