Proposed Guide on Synthetic Data Generation

Synthetic Data (SD) generation is an increasingly used Privacy Enhancing Technology (PET). A potentially significant use case is supporting the growth of AI/ML: SD allows AI models to be trained while protecting the underlying personal data. SD can also address data quantity and quality challenges in AI model training (e.g. insufficient or biased data) by augmenting training datasets.

This proposed guide aims to help organisations understand SD generation techniques and potential use cases, particularly for AI. The guide will be offered as a resource within the Privacy Enhancing Technology Sandbox, which also includes a checklist of good practices to adopt when generating SD to guard against the risk of re-identification.

To find out more, please refer to the Proposed Guide on Synthetic Data Generation and IMDA’s PET Sandbox.