Sahel Shariati Samani


Assessing and Managing the Disclosure Risks with Genomics Data


Sept 2012 – Sept 2015


Economic and Social Research Council +2 Studentship


PhD summary

Sharing and exploring genome sequences has the potential to improve medicine and public health.

A key component in this is the linkage of socioeconomic and genomic data, the potential of which is now being realised in UK cohort studies.

However, genomes are people’s unique genetic blueprints and are uniquely identifying. Genotypes can be translated into phenotypes, such as eye and hair colour, ethnicity, and risk of disease. This phenotypic information could then be linked to other resources and lead to re-identification. Therefore, before genomic data are disseminated or shared, their disclosure risk must be assessed.

Most statistical disclosure risk research makes the pragmatic assumption that there is no uncertainty in the data. However, this assumption is invariably false with real data, and will certainly be so if the attacker does not have access to the actual attributes of population units but infers them from other data sources.

For example, if an attacker has access to an individual’s genome data and some knowledge of genetics, the attacker might infer that there is a probability that this person comes from a Northern European background, a chance that the person has blue eyes, a chance that the person is obese and suffers from diabetes, and so on. We need to take account of such uncertainty or we risk making overly conservative decisions.

On the other hand, genomics data may be subject to different types of attack than those considered within the orthodox disclosure scenario framework, suggesting that a new framework is required. The main aim of this project is to investigate the design space for a framework that assesses the disclosure risk of genomics data realistically.


I did my bachelor’s degree in Electrical and Electronic Engineering at the University of Tehran in Iran.

In 2010, I came to Manchester to start a master’s degree in computer science at The University of Manchester, with a specialisation in artificial intelligence.

In 2012, I started my PhD at the Department of Computer Science in collaboration with The Cathie Marsh Centre for Census and Survey Research (CCSR) and NorthWest e-Health. At the start of my second year, I transferred to Social Statistics and found my new home in CCSR - now the Cathie Marsh Institute.

Contact details

Office: G45, Humanities Bridgeford Street