Title :
Enabling dynamic linkage of linguistic census data at Statistics Canada (extended abstract)
Author :
Casteigts, Arnaud ; Chomienne, Marie-Helene ; Bouchard, Louise ; Jourdan, Guy-Vincent
Author_Institution :
Univ. of Ottawa, Ottawa, ON, Canada
Abstract :
Research in population health studies the impact of various factors (determinants) on health, with the long-term objective of yielding better policies, programs, and services. Researchers of Official Language Minority Communities (OLMCs) focus specifically on determinants related to speaking a minority language, such as English in Quebec or French in the rest of Canada. Investigations of this type require the ability to associate health data with linguistic information. Unfortunately, the largest health databases in Ontario, held at the Institute for Clinical Evaluative Sciences (ICES), do not contain usable linguistic variables to date. High-quality language variables do exist, however, at Statistics Canada (2006 Census), and we are interested in enabling their linkage to ICES health data in a dynamic way. The linkage we consider is intrinsically transient and aggregated: it consists of allowing ICES to learn interactively how many Francophones are present in a given sample of individuals (sum queries). We suggest two possible privacy-preserving mechanisms to enable dynamic sum queries: 1) constraining the dataflow itself; 2) adapting recent results [1] to characterize what leakage is at play in our scenario and what parameters affect the tradeoff between leakage and utility. We rely on these results to argue that a safe exposure of linguistic data could indeed be envisioned and, beyond that, that similar techniques could be used to enrich provincial health databases in general with a range of federal census data, making it possible to perform fine-grained community-based studies in Canada.
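To make the notion of a sum query concrete, the following minimal Python sketch shows a census-side function that returns only the number of Francophones in a submitted sample, and why such answers can still leak individual values. It is an illustration under stated assumptions, not the mechanism proposed in the paper: the table `CENSUS_LANGUAGE`, the identifiers `p1` to `p5`, and the function `sum_query` are all hypothetical.

```python
from typing import Dict, Iterable

# Hypothetical census-side table (an assumption for illustration):
# person identifier -> 1 if Francophone, 0 otherwise.
CENSUS_LANGUAGE: Dict[str, int] = {
    "p1": 1, "p2": 0, "p3": 1, "p4": 0, "p5": 1,
}


def sum_query(sample: Iterable[str], table: Dict[str, int] = CENSUS_LANGUAGE) -> int:
    """Return only the count of Francophones in the sample, never per-person values."""
    # Deduplicate identifiers so that each individual is counted once.
    return sum(table[pid] for pid in set(sample))


# An aggregate answer for a sample of four individuals.
count_all = sum_query(["p1", "p2", "p3", "p4"])

# Differencing: two samples that differ by a single person reveal
# that person's value, even though each answer is an aggregate.
count_without_p3 = sum_query(["p1", "p2", "p4"])
leaked_p3 = count_all - count_without_p3
```

The difference between the two answers equals the language value of `p3`, which is the kind of leakage that any mechanism for dynamic sum queries has to account for when trading off leakage against utility.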
Keywords :
data privacy; health care; linguistics; medical information systems; natural language processing; query processing; English; Francophones; French; ICES; Institute for Clinical Evaluative Sciences; OLMC; Official Language Minority Communities; Ontario; Quebec; Statistics Canada; dynamic linkage; dynamic sum queries; health data; linguistic census data; population health; privacy-preserving mechanisms; provincial health databases;
Conference_Title :
Intelligence and Security Informatics (ISI), 2011 IEEE International Conference on
Conference_Location :
Beijing
Print_ISBN :
978-1-4577-0082-8
DOI :
10.1109/ISI.2011.5984777