Technical Blueprint for Operationalizing Privacy by Design
A lot has changed since 1995, when the concept of “privacy by design” first emerged.
Data now fuels entire industries, with devices and software constantly collecting user information. But public trust may be eroding, and privacy is increasingly tightly regulated worldwide.
In this environment, implementing privacy-by-design practices presents a huge opportunity. And software developers and engineers are uniquely positioned to contribute to—and benefit from—improved privacy practices.
This article explores technical approaches to operationalizing privacy by design throughout the systems development lifecycle (SDLC).
The Seven Principles of Privacy By Design
Before we look at how to operationalize privacy by design, here’s a reminder of the seven privacy-by-design principles:
- Proactive, Not Reactive; Preventative, Not Remedial
- Privacy as Default
- Embedding Privacy into Design
- Full Functionality — Positive-Sum, Not Zero-Sum
- End-to-End Security — Full Lifecycle Protection
- Visibility and Transparency — Keep It Open
- Respect User Privacy — Keep It User-Centric
System Architecture and Data Flow Analysis
A clear understanding of your system architecture and data flows is essential for compliance with privacy-by-design principles.
To meet the principles of privacy by design, you must:
- Only collect and use personal data for a clearly defined purpose.
- Assign retention periods to each type of personal data.
- Strictly control third-party access.
To do all this (and more), you’ll need clear and constant visibility over your data.
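As a minimal sketch of what those three requirements could look like in code, the example below models one entry in a personal-data inventory. The `DataRecord` class, its field names, and the sample values are all hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta


@dataclass
class DataRecord:
    """One entry in a hypothetical personal-data inventory."""
    category: str            # e.g. "email_address"
    purpose: str             # the clearly defined purpose for collecting it
    retention_days: int      # retention period assigned to this data type
    collected_on: date
    third_parties: list[str] = field(default_factory=list)  # approved recipients


def is_past_retention(record: DataRecord, today: date) -> bool:
    """Return True if the record has outlived its retention period."""
    return today > record.collected_on + timedelta(days=record.retention_days)


def unapproved_recipients(record: DataRecord, requested: list[str]) -> list[str]:
    """Return the requested recipients that are not on the approved list."""
    return [name for name in requested if name not in record.third_parties]


# Sample values are invented for illustration.
record = DataRecord(
    category="email_address",
    purpose="account_login",
    retention_days=365,
    collected_on=date(2023, 1, 1),
    third_parties=["email_provider"],
)
```

Calling `is_past_retention(record, date.today())` on a schedule is one simple way to flag data that is due for deletion, and `unapproved_recipients` can guard any code path that shares data externally.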
Techniques for Gaining Oversight of Your System Architecture and Data Flows
Data mapping helps achieve privacy-by-design goals by helping you answer questions such as:
- How and why do your products collect personal data?
- What types of personal and sensitive data do you collect?
- Which third parties can access the data you control?
Dynamic data maps can serve as a foundation for conducting privacy impact assessments (PIAs) and keeping accurate and up-to-date records of data-processing activities (RoPA).
Three interrelated concepts help define the data-mapping process:
- Data discovery: Learning what types of personal data exist in your systems, what you do with the data, and why.
- Data inventory: Organizing and labeling your data. Creating records of your data-processing activities to ensure accountability.
- Data flow mapping: Crafting diagrams to show how data moves into, throughout, and out of your organization.
Privacy violations can occur when third parties have unauthorized or inappropriate access to personal data. Creating data flow diagrams can reveal every touchpoint of user data within your systems.
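One way to reason about touchpoints is to treat a data flow map as a directed graph and ask which third parties a piece of data can eventually reach. The sketch below assumes a tiny, invented set of systems (`signup_form`, `user_db`, and so on). In practice, the graph would come from your own data-mapping output.

```python
from collections import deque

# Hypothetical data flow map: each key sends personal data to the listed systems.
DATA_FLOWS = {
    "signup_form": ["user_db"],
    "user_db": ["analytics_service", "email_provider"],
    "analytics_service": ["ad_network"],
    "email_provider": [],
    "ad_network": [],
}

# Systems operated by third parties (assumed for illustration).
THIRD_PARTIES = {"email_provider", "ad_network"}


def reachable_systems(flows, source):
    """Return every system that data entering at `source` can reach."""
    seen = set()
    queue = deque([source])
    while queue:
        current = queue.popleft()
        for target in flows.get(current, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


def third_party_touchpoints(flows, source, third_parties):
    """Return the third parties that data from `source` can reach, sorted by name."""
    return sorted(reachable_systems(flows, source) & third_parties)
```

Here, `third_party_touchpoints(DATA_FLOWS, "signup_form", THIRD_PARTIES)` surfaces `ad_network` even though the signup form never sends data there directly, which is the kind of indirect flow a diagram helps expose.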
Privacy by design (PbD) principle: Visibility and Transparency — Keep It Open: Visibility is crucial to ensuring privacy. Develop and maintain data maps to get a visual representation of your data processing practices.
Manual and Automated Data Mapping
Data mapping can be:
- Manual: Sending questionnaires to people in your organization to learn how they collect and use personal data.
- Automated: Using data discovery and classification software to automatically and continually scan your systems for personal data.
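To give a feel for the automated approach, here is a deliberately simplified sketch of pattern-based data discovery. The two regular expressions and the `classify_text` helper are assumptions made for illustration. Commercial discovery and classification tools rely on much richer detection than this.

```python
import re

# Deliberately simple patterns; real classifiers use far richer rules.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "ipv4": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
}


def classify_text(text):
    """Return a dict mapping each data type to the matches found in `text`."""
    findings = {}
    for label, pattern in PATTERNS.items():
        matches = pattern.findall(text)
        if matches:
            findings[label] = matches
    return findings


# Invented sample log line containing two kinds of personal data.
sample = "User jane@example.com logged in from 203.0.113.7"
findings = classify_text(sample)
```

Running the same function over log files, database exports, or source code gives consistent labels for each data type, which is one of the accuracy benefits of automation.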
Certain organizations may require manual data mapping in some contexts. However, where you can automate, you’ll likely see improvements, including:
- Time: Less time and fewer resources spent chasing people up, and reports generated automatically.
- Accuracy: Less risk of misunderstandings and mistakes, and consistent labels for different data types.
- Adaptability: Dynamic scaling as your business grows and changes.
PbD Principle: Embedding Privacy into Design: Integrate automated data flow mapping and code-scanning software into your development processes for total oversight of how your product processes user data.
Integrating Privacy Impact Assessments (PIAs) in the SDLC
Privacy by design means finding opportunities to assess and mitigate risk throughout the entire system development lifecycle (SDLC).
Privacy impact assessments (PIAs) help you understand your data processing activities and foresee and mitigate privacy risks.
What Is a PIA?
Fundamentally, a PIA is about balancing the benefits of a processing activity (like collecting a user’s email address at account setup) against the risks to individuals (such as exposing the email address to unauthorized access).
Conducting regular PIAs—and automating the process as far as possible—helps you build better, more secure, and more trustworthy products.
But conducting PIAs is not just good practice. The process is also a legal requirement under certain circumstances.
In US states such as California, Colorado, Connecticut, and Virginia, comprehensive privacy laws require you to conduct a risk assessment before certain data processing activities, including conducting targeted advertising and processing sensitive data.
The EU and UK General Data Protection Regulation (GDPR) describes the process of a Data Protection Impact Assessment (DPIA), which is mandatory before undertaking high-risk data processing activities—particularly if you’re using or developing new technologies.
How to Conduct a PIA
There are several ways to conduct a PIA, and you might need to comply with legal requirements under one of the laws mentioned above.
Most PIA models require you to consider the following:
- The nature, purpose, context, and scope of the processing: How and why you’ll be collecting and using personal data, and how much data you’ll collect about how many people.
- The types of personal data you’ll be collecting.
- The reasonable expectations of your users.
- Which third parties and service providers will have access to the data.
- The possible risks to people’s privacy and other rights.
- The steps you can take to mitigate those risks and protect the data.
You must document your PIA, and you should take a continuous approach to the assessment. Treat the PIA as a live process that you revisit, review, and amend as your product develops.
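One way to keep a PIA "live" is to store it as structured data alongside your code, so it can be reviewed and updated like anything else in the repository. The sketch below is only an illustration: the `PrivacyImpactAssessment` and `Risk` classes are hypothetical and do not follow any particular legal template.

```python
from dataclasses import dataclass, field


@dataclass
class Risk:
    description: str
    mitigation: str = ""  # empty string means no mitigation recorded yet


@dataclass
class PrivacyImpactAssessment:
    """Hypothetical PIA record mirroring the considerations listed above."""
    purpose: str
    data_types: list[str]
    third_parties: list[str]
    risks: list[Risk] = field(default_factory=list)

    def unmitigated_risks(self) -> list[Risk]:
        """Return the risks that still have no mitigation recorded."""
        return [r for r in self.risks if not r.mitigation]


# Invented example based on collecting an email address at account setup.
pia = PrivacyImpactAssessment(
    purpose="Collect an email address at account setup",
    data_types=["email_address"],
    third_parties=["email_provider"],
    risks=[
        Risk("Email address exposed to unauthorized access",
             mitigation="Encrypt at rest and restrict database access"),
        Risk("Users cannot easily delete their email address"),
    ],
)
open_risks = [r.description for r in pia.unmitigated_risks()]
```

A non-empty `open_risks` list is a prompt to revisit the assessment before the feature ships.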
Embedding Tools to Identify and Mitigate Risks
You can use your PIA to integrate tools that help identify privacy risks before they manifest.
- Embed privacy tools into the continuous integration and continuous deployment (CI/CD) pipeline.
- Automated code scanning can help identify privacy risks and feed information into your PIA processes.
- Fix problems as they arise, such as unnecessary third-party integrations and data leaks.
Think broader than data leaks.
“Privacy risks” can also include an inability to provide transparency or facilitate people’s rights (such as opting out of certain activities or accessing or deleting their data).
As explored above, a clear system architecture can help you facilitate people’s rights. The PIA process can identify areas where upholding a user’s rights might be difficult or impossible.
If your PIA finds risks you can’t mitigate, the project might require a more fundamental re-think.
Under the UK and EU GDPR, you must consult your data protection authority (DPA) if you still hope to proceed with the data processing despite unresolved risks.
PbD Principle: Proactive Not Reactive; Preventative Not Remedial: Embed PIAs and privacy assessment tools into your continuous integration and continuous deployment (CI/CD) pipelines to identify risks as early as possible.
Milestones and KPIs
Privacy by design benefits your organization, your products, and your users. But how do you know your approach to privacy by design is working?
Here are some milestones and key performance indicators (KPIs) that can quantify privacy enhancements:
- Data minimization: A more careful approach to data collection and sharing should mean you process less personal data altogether. Once your data-mapping processes are in place, you can reduce the amount of personal data in your systems.
- Reduced data breaches and privacy incidents: Heading off privacy risks early should reduce data breaches. If you’ve experienced data breaches or near-misses in the past, you should see these incidents reduce as your privacy program matures.
- Improved risk reporting: While privacy by design means privacy risks manifest less often, improved awareness of privacy risks among employees should mean they flag risks more often.
- Customer satisfaction: Privacy enhances customer trust. Privacy by design should help speed up data access and deletion requests, strengthen users’ control over their data, and ultimately reduce customer complaints.
PbD Principle: Respect User Privacy — Keep It User-Centric: When measuring your privacy program’s success, keep your users in mind. Everyone can benefit from greater privacy, but keeping your customers safe and happy should guide your decision-making.
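As a small worked example of one such KPI, the sketch below computes the average time taken to complete deletion requests. The request log and the `mean_days_to_complete` helper are invented for illustration. Your own figures would come from your request-handling system.

```python
from datetime import date
from statistics import mean

# Hypothetical log of deletion requests: (received, completed) dates.
deletion_requests = [
    (date(2024, 1, 2), date(2024, 1, 9)),
    (date(2024, 1, 10), date(2024, 1, 13)),
    (date(2024, 2, 1), date(2024, 2, 3)),
]


def mean_days_to_complete(requests):
    """Average number of days between receiving and completing a request."""
    return mean((done - received).days for received, done in requests)
```

Tracking this number over time gives you a simple way to check whether deletion requests are getting faster to handle.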
Integrating Privacy by Design within DevOps and Privacy Engineering
Before we conclude, let’s look at how privacy by design sits within the DevOps and privacy engineering methodologies.
DevOps provides many opportunities for improving privacy. The DevSecOps philosophy of “shifting security left” applies to privacy, too. And privacy engineering provides a range of tools to help you make that shift.
Elias Grünewald, a Privacy Engineering Research Associate at the Technical University of Berlin, uses the term DevPrivOps to describe the application of privacy by design principles through DevOps and privacy engineering methods.
PbD principle: End-to-End Security — Full Lifecycle Protection: Security and privacy are closely linked. Ensure strong data security by embedding automated privacy checks into DevOps processes such as code commits, build processes, and containerization.
Regular code commits help keep your code clean and simple, reducing the likelihood of vulnerabilities. Without them, software can become unnecessarily complex, making data leaks and vulnerabilities more likely.
Automated privacy code checks in local repositories can spot red flags during code commits, such as the presence of personal data, passwords, and third-party API keys.
But we can shift privacy further left. Using pre-commit hooks, you can ensure developers only commit code that meets predetermined privacy-centric criteria.
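A pre-commit check along these lines might look like the following sketch. The red-flag patterns and function names are assumptions for illustration, and the patterns are intentionally crude. The script reads staged changes with `git diff --cached` and relies on the fact that Git aborts a commit when the pre-commit hook exits with a non-zero status.

```python
import re
import subprocess

# Hypothetical red-flag patterns; tune these to your own codebase.
RED_FLAGS = {
    "email address": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "hard-coded secret": re.compile(
        r"(?i)(api_key|password|secret)\s*=\s*['\"][^'\"]+['\"]"
    ),
}


def find_red_flags(diff_text):
    """Return (label, line) pairs for added lines that match a red-flag pattern."""
    hits = []
    for line in diff_text.splitlines():
        # Only inspect added lines, skipping the "+++ b/file" header.
        if not line.startswith("+") or line.startswith("+++"):
            continue
        for label, pattern in RED_FLAGS.items():
            if pattern.search(line):
                hits.append((label, line[1:].strip()))
    return hits


def staged_diff():
    """Return the diff of staged changes using `git diff --cached`."""
    result = subprocess.run(
        ["git", "diff", "--cached", "--unified=0"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def main():
    """Return 1 (blocking the commit) if any red flag is staged, else 0."""
    hits = find_red_flags(staged_diff())
    for label, line in hits:
        print(f"Possible {label}: {line}")
    return 1 if hits else 0
```

To use this as a hook, you could save it as an executable `.git/hooks/pre-commit` script that ends with `raise SystemExit(main())`. Expect false positives from patterns this simple, so treat it as a starting point rather than a finished scanner.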
For manual code reviews, consider adopting a checklist approach to ensure reviewers cover everything required.
If you don’t run privacy checks during the pre-commit stage—perhaps because you don’t want to block developers from committing code—you have another opportunity to identify privacy risks before pull or merge requests.
And even during the build stage, the UK National Cyber Security Centre (NCSC) recommends a vigilant approach to identifying external dependencies:
“If third-party code is dynamically included into your product during the build or deployment process, can you ensure that it can't be maliciously modified?”
Automated tools can check for privacy and security vulnerabilities during the build process, including unresolved external dependencies, outdated libraries, and unwanted third-party tracking code.
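One common safeguard against third-party code being modified is to pin a cryptographic hash of each dependency and verify it at build time. The sketch below illustrates that general idea rather than a technique prescribed by the NCSC guidance. The `verify_dependency` helper and the sample artifact are hypothetical.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of `data` as a hex string."""
    return hashlib.sha256(data).hexdigest()


def verify_dependency(data: bytes, pinned_digest: str) -> bool:
    """Return True if the downloaded bytes match the pinned SHA-256 digest."""
    return sha256_hex(data) == pinned_digest


# Stand-in for a third-party file fetched during the build (invented content).
artifact = b"console.log('third-party widget');"

# In practice the pinned digest is recorded when the dependency is first reviewed.
pinned = sha256_hex(artifact)

unchanged_ok = verify_dependency(artifact, pinned)
tampered_ok = verify_dependency(artifact + b"// injected", pinned)
```

A build step can fail whenever `verify_dependency` returns `False`, so a changed dependency cannot slip into your product without a fresh review.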
Containers present unique privacy and security issues due to their potentially wide attack surface. Maintaining visibility over dynamic containerized environments can be a challenge, as can managing access controls.
Container image scanning should be a critical step in your CI/CD pipeline.
Most container scanning software can reveal known vulnerabilities matched to a database. Privacy code scanning goes further, revealing different types of personal data within containers and identifying third-party code and potential data leaks.
Every team within your company can contribute toward implementing privacy by design.
But developers and engineers are best placed to identify and deal with privacy risks before they become privacy harms.
Developers and engineers can use their technical know-how to implement privacy by design at every stage of a software product’s life via practices such as:
- Data mapping
- Privacy impact assessments (PIAs)
- Privacy engineering throughout the development lifecycle
To make privacy by design “business as usual”, automate where you can—and scan code for privacy risks at every opportunity.
Recommended reading material
Ann Cavoukian, Privacy By Design: The 7 Foundational Principles
Elias Grünewald, Cloud Native Privacy Engineering Through DevPrivOps
National Cyber Security Centre (NCSC), Secure development and deployment guidance
- How do you implement privacy by design?
A good first step to implement privacy by design is to conduct a data mapping exercise. This process helps determine how your organization collects, uses, and shares personal data.
Once you’ve created a data inventory and mapped your data flows, you can find opportunities to reduce the amount of data you collect, carry out privacy impact assessments (PIAs), and apply privacy-enhancing safeguards.
- When do you use privacy by design?
You use the privacy-by-design principles at every stage of the development lifecycle, from planning a new product to delivering and maintaining it. You can also apply privacy by design to your organization’s internal policies and third-party risk management operations.
- Why should we care about privacy by design?
You should care about privacy by design because, done right, it can benefit everyone: your organization, your customers, and society as a whole. Better privacy practices should lead to more efficiency, better customer relations, and better products.
- How do privacy by design and privacy engineering operate together?
Privacy by design and privacy engineering go hand in hand: privacy by design provides the principles for improving privacy within your company and its products, while privacy engineering provides the tools to make those principles a reality.
- What is privacy and security by design?
While “privacy by design” focuses on reducing risks to people’s privacy and other rights—for example, by collecting or sharing less personal data—“security by design” focuses on building products that are inherently less vulnerable to cyberattacks and other security threats.
You can’t properly implement privacy by design without also considering security by design. After all, end-to-end security across the full data lifecycle is the fifth privacy-by-design principle.
- What is privacy by design and privacy by default?
“Privacy by design” means applying privacy-protecting practices and safeguards throughout the entire system development lifecycle. “Privacy by default” means ensuring products have the most privacy-protective settings—such as keeping user data private—turned “on” by default.
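In code, privacy by default often comes down to the default values you choose. The sketch below assumes a hypothetical `PrivacySettings` class with invented setting names. Every option starts in its most protective state, and sharing only happens when a user explicitly opts in.

```python
from dataclasses import dataclass


@dataclass
class PrivacySettings:
    """Hypothetical per-user settings; every default is the most protective option."""
    profile_public: bool = False
    share_with_partners: bool = False
    personalized_ads: bool = False
    analytics_tracking: bool = False


# A new user gets the protective defaults without taking any action.
new_user = PrivacySettings()

# Anything less protective requires an explicit opt-in.
opted_in = PrivacySettings(personalized_ads=True)
```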
- What is a strategy for operationalizing privacy by design?
You can develop a strategy for operationalizing privacy by design around the core concepts of data mapping, privacy impact assessments (PIAs), and privacy engineering. These practices enable you to identify how you process personal data, assess the risks, and apply safeguards.
Robert is a writer covering privacy, security, and AI. He is a respected voice on privacy and has been working in and covering the field since 2017.