The past as a compass for the future


SMEs and the startup ecosystem in India share concerns about the (retracted) draft Data Protection Bill, 2021 - and the way forward for businesses

To understand the impact of the Draft Data Protection Bill (DPB) on Small and Medium Businesses (SMBs) and startups, Privacy Mode interviewed representatives across the industry. The interviewees shared their perspectives on how complying with the mandates and provisions of the Bill is likely to affect opportunities for innovation, investment and the costs of doing business in India.

This report provides a more nuanced discussion on data governance policies, especially regarding the regulation of data protection laws in India, and helps inform more consultations around data governance, data protection and rights.

The Personal Data Protection (PDP) Bill, 2019, was first introduced in the Lok Sabha by the Ministry of Electronics and Information Technology (MeitY) in December 2019. Its primary intent was to protect the privacy of individuals in relation to their digital data, acknowledging the right to privacy as a fundamental right and the protection of personal data as an essential facet of informational privacy. It also aimed to create a collective culture that fosters a free and fair digital economy, respecting the informational privacy of individuals, and ensuring empowerment, progress and innovation through digital governance and inclusion.

Cite this report: Dash, Sweta. “The past as a compass for the future - SMEs and the startup ecosystem in India share concerns about the (retracted) draft Data Protection Bill, 2021 - and the way forward for businesses” (2022), at https://hasgeek.com/PrivacyMode/dpb-survey-report/

Executive Summary

(The reference text of the Draft Data Protection Bill, 2021 is mentioned in the citations. You can also see the timeline, showing how the text and provisions of the Bill have evolved through various stages.)

According to the Bill, personal data is data about or relating to a natural person who is directly or indirectly identifiable, having regard to any characteristic, trait, attribute or any other feature of the identity of such natural person, whether online or offline, or any combination of such features with any other information. It also includes any inference drawn from such data for the purpose of profiling.

In 2019, the Union Government referred this Bill to a Joint Parliamentary Committee (JPC). The updated Draft Data Protection Bill (DPB), 2021 emerged from the JPC report tabled in 2021.

The Draft DPB changed the initial PDP Bill significantly, and drew mixed responses and concerns from stakeholders.1


Disclaimer

The survey was conducted and the report drafted prior to the withdrawal of the DPB on 3rd August, 2022. The intent of producing this report was to collect peer review from industry practitioners and compile it as feedback to be shared with MeitY and the JPC. We believe that this report remains relevant and timely because the findings presented here provide insights into industry concerns, which can be leveraged when the government drafts the next version of India’s privacy bill. For the data privacy of users to be genuinely achieved in India, privacy policies and laws must provide guidelines and direction to the industry without detailing operational requirements. Otherwise, compliance becomes a checkbox to tick, while privacy continues to be put on the backburner.2

Participant Profile Distribution

Visualization: Participant profile distribution (donut chart) - Founder: 50%; Senior Engineer: 33.3%; Product Manager: 12.5%; Architect: 4.2%.

Industry Domain Distribution

Visualization: Industry domain distribution (donut chart) - Fintech: 25%; AI Tech: 14.3%; Health Tech: 14.3%; Agritech: 7.1%; Cloud Tech: 7.1%; Software Development: 3.6%; B2B eCommerce: 3.6%; CRM: 3.6%; MLOps: 3.6%; IT Services & Consulting: 3.6%; OSS Products & Services: 3.6%; SSD Cloud: 3.6%; Cybersecurity Tech: 3.6%.
Summary of key concerns

  • Ambiguities about sensitive and personal data, and the addition of non-personal data (NPD) to the ambit of the DPB
  • Increase in compliance burden and costs, owing to provisions such as privacy by design and algorithmic fairness that are to be certified by the Data Protection Authority (DPA)
  • Restrictions on the cross-border flow of data, and the impact on innovation
  • Mandates for privacy by design and algorithmic fairness that are unviable and impractical to implement
  • Overreaching powers for the government that further increase unjustified surveillance

Top Concerns

Visualization: Top concerns (donut chart) - Ambiguities and uncertainties: 19.2%; Data localisation and cross-border data transfer: 19.2%; Compliance burdens: 19.2%; Privacy by design and algorithmic fairness: 17%; Overarching powers to the government: 17%; Mixing of personal and non-personal data: 8.4%.

Mixing of Personal and Non-Personal Data

While the JPC report recommended that both personal and non-personal data must be brought under the ambit of the same data protection law, or rather under “a single administration and regulatory authority”, respondents remain sceptical of the intent and implications of such a move. They said this transition from the PDP to the current DPB relegates users to the margins instead of putting them on the centrestage in the discourse on privacy.3

To them, the onus of the user’s privacy now shifts onto businesses. And since data aggregated by businesses is a mix of both personal and non-personal data, this increases their operational and compliance costs. Segregating this data into non-personal data, sensitive personal data, and critical personal data is a Herculean task for businesses, especially those who operate on a data-heavy model.4
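To make the segregation task concrete, here is a minimal sketch in Python of what field-level classification might look like. The category lists and field names are illustrative assumptions, not definitions from the Bill - the draft leaves several of these boundaries open, which is precisely the difficulty respondents describe.

# A minimal sketch of the three-way segregation the DPB would imply. The
# category lists are illustrative assumptions, not definitions from the
# Bill, which leaves several of these boundaries open.

SENSITIVE_FIELDS    = {"health_record", "biometric_id", "caste", "religion"}
CRITICAL_FIELDS     = {"biometric_id"}   # "critical" data is left for the government to notify
NON_PERSONAL_FIELDS = {"crop_yield", "soil_ph", "device_count"}

def classify_field(name: str) -> str:
    """Bucket one field into the (assumed) DPB categories."""
    if name in CRITICAL_FIELDS:
        return "critical_personal"
    if name in SENSITIVE_FIELDS:
        return "sensitive_personal"
    if name in NON_PERSONAL_FIELDS:
        return "non_personal"
    return "personal"   # default: anything else relating to an identifiable person

record = {"name": "A. Kumar", "health_record": "…", "soil_ph": 6.4}
print({f: classify_field(f) for f in record})
# {'name': 'personal', 'health_record': 'sensitive_personal', 'soil_ph': 'non_personal'}

Even this toy version shows the catch: a single record mixes all the categories, and fields classed as “non-personal” can still help re-identify the person they came from.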

📖 Read more about this key finding


Consent Management

On one hand, the DPB now allows non-consensual processing of data under several circumstances. That is concerning, because consent ought to be the foundation of a Bill on data protection, especially given that the DPB traces its lineage to the milestone Puttaswamy judgement.
Clause 13 of the DPB, for instance, permits non-consensual processing of data where it “can reasonably be expected by the Data Principal.” The next Clause then dispenses with user consent for purposes such as the operation of search engines and credit scoring.

On the other hand, the mechanisms for businesses to adhere to consent have become more cumbersome. With the requirements of consent managers and multiple levels of checks and balances, respondents are confused about what is expected of them. To them, this will eventually mean greater compliance costs for businesses and friction for end-users.
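As an illustration of the machinery that consent adherence implies, below is a minimal sketch of purpose-bound consent checking, assuming a simple in-memory store. The DPB left the actual consent-manager interfaces to future regulations, so everything here - class names, purposes, methods - is invented for illustration.

from dataclasses import dataclass, field
from datetime import datetime

# A minimal sketch of purpose-bound consent checking, assuming a simple
# in-memory store. The DPB left actual consent-manager interfaces to
# future regulations; names and purposes here are invented.

@dataclass
class ConsentRecord:
    user_id: str
    purposes: set = field(default_factory=set)          # e.g. {"kyc", "marketing"}
    updated_at: datetime = field(default_factory=datetime.now)

class ConsentLedger:
    def __init__(self):
        self._records: dict[str, ConsentRecord] = {}

    def grant(self, user_id: str, purpose: str) -> None:
        rec = self._records.setdefault(user_id, ConsentRecord(user_id))
        rec.purposes.add(purpose)
        rec.updated_at = datetime.now()

    def withdraw(self, user_id: str, purpose: str) -> None:
        if user_id in self._records:
            self._records[user_id].purposes.discard(purpose)

    def may_process(self, user_id: str, purpose: str) -> bool:
        rec = self._records.get(user_id)
        return rec is not None and purpose in rec.purposes

ledger = ConsentLedger()
ledger.grant("u42", "credit_scoring")
print(ledger.may_process("u42", "credit_scoring"))   # True
print(ledger.may_process("u42", "marketing"))        # False

Even this stripped-down version has to be consulted before every processing step; respondents’ worry is about the cost of doing this across real pipelines, with consent managers and audits layered on top.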

📖 Read more about this key finding


Data localization and cross border data flows

The draft DPB’s mandates on physical data storage and processing within the country’s jurisdictional borders are seen as a serious impediment to growth, investment, and innovation opportunities for businesses.
Additionally, the DPB’s differing standards for handling sensitive personal data and critical personal data add to compliance costs, because businesses find it difficult to understand what these will mean for their costs of operation. They also find it challenging to segregate three categories of data, and to invest in the resources needed to do so.
Cross-border transfer of data requires the explicit consent of the Data Principal, pursuant to a contract or intra-group scheme approved by the Data Protection Authority (DPA) in consultation with the Centre. This leaves businesses worried about extra approval mechanisms and audit systems - so much so that some said they might consider moving the base of their business to a different country instead.

📖 Read more about this key finding


Privacy by Design and Algorithmic Fairness

In principle, respondents welcomed the move to build mechanisms and processes for Privacy by Design and Algorithmic Fairness. They think it is time that privacy and fairness get their due recognition and importance in data businesses. However, they are concerned that these are theoretical concepts, not viable to implement and adhere to on a routine basis. Respondents said that it is difficult to have broad but uniform standards for these approaches, and that a blanket solution will not cater to the nuances of the data that each business operates with.

Respondents also shared that these are not measurable metrics - which means that:

  1. It will be difficult to comply with them and obtain certification from the DPA; and
  2. Some algorithms can always find roundabout ways to seem fair without actually being fair. Respondents felt that this defeats the purpose of this provision in the DPB.5

📖 Read more about this key finding


Problems with government exemptions; fear of data sharing with government and central agencies; anonymized datasets

The DPB 2021 provides exemptions for the government and central agencies (including the police, the Central Bureau of Investigation (CBI), the Enforcement Directorate (ED), the Research and Analysis Wing (RAW), the Intelligence Bureau (IB) and the Unique Identification Authority of India (UIDAI)) after the JPC report, with the insertion of a non-obstante provision in Clause 35.

Respondents remain fearful of such provisions, which grant overarching powers to the government and central agencies to process data without the user’s consent. Among other things, they describe these exemptions as “scary”, “unjustified”, and “unconstitutional”.

They are also worried that such unregulated data access by the State can pose security threats to their digital and proprietary information.

📖 Read more about this key finding


Way forward

The objective of this qualitative study was to understand the concerns that startups and SMBs had regarding the Draft DPB. At a time when startups and SMBs play such a crucial role in the digital economy of the country, and data itself holds the centrestage across sectors, it is imperative to hear from the individuals who have firsthand experiences that can inform more consultations around data governance, data protection and rights.

The interviews reveal that there is a strong need for:

  1. Clarifying the scope and intent of the DPB; and
  2. Including provisions for reasonable and proportional legal safeguards as part of the mandates drafted in the DPB. Without these, respondents worry that the ramifications will be fatal for innovation, growth and the security of data, among other things.

Now that the DPB has been withdrawn, and it is likely that the Government will table new data privacy legislation in the winter session of Parliament later this year, we hope that these concerns of SMBs and startups will be taken into account. We hope the report facilitates more interactions between practitioners and policymakers for future iterations of India’s privacy bill, and in turn informs policy directions and guidelines that can genuinely protect users’ digital data.


Conclusion

Four years after it was first tabled in Parliament, the Draft DPB was withdrawn in August 2022. The next version of data protection legislation is likely to be tabled in the winter session of Parliament later this year. It has been said that the DPB will be replaced by a more “comprehensive framework” that will be in alignment with “contemporary digital privacy laws”.6

It is worth remembering that robust legislation on digital data protection is indeed the need of the hour, and long overdue. The road to this legislation has a commendable history - one that stems from the Puttaswamy judgement, which acknowledged privacy as a right. That the country needs a reliable data protection law, especially in these times of digitization and of consensus on the importance of data, cannot be emphasized enough.

We do consider it a milestone that the State is finally invested in framing legislation meant to safeguard users’ data privacy and sovereignty, as well as to facilitate the growth and innovation of businesses dealing with digital data. Reports already suggest that certain concerning aspects of the DPB are likely to be addressed.7 Having said that, the fact remains that four years later, we are at square one again.

As we wait for a data protection law in India, we hope that the new legislation will cater to the on-ground voices of the businesses who will be affected by such laws. Besides, as SMBs and startups have had a lot of experience with regulations and compliance procedures for their specific businesses already, be it with the European Union’s General Data Protection Regulation (GDPR) or with sectoral laws and policies for their industry, they certainly do have useful insights on what data protection regimes can actually do to foster innovation while safeguarding privacy rights.

Below are some recommendations, drawn from the survey, which Privacy Mode advocates need to be considered in the new data protection legislation when it is next tabled in the Parliament.

Regarding mixing of personal and non-personal data.

The mixing of personal and non-personal data has given rise to a lot of confusion about the DPB, and adds more layers of compliance and operational costs for businesses.
Since non-personal data can be de-anonymized, it poses a privacy threat to the ecosystem. Even when data is held in an aggregated, non-identifiable form, respondents said that there is always the possibility of re-identification.
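The re-identification risk respondents describe can be shown with a minimal linkage-attack sketch: two datasets that are individually “anonymous” are joined on shared quasi-identifiers. The datasets, field names and quasi-identifiers below are invented for illustration.

# A linkage attack in miniature: join an "anonymised" dataset with a public
# one on shared quasi-identifiers. Datasets, fields and values are invented.

anonymised_health = [
    {"pincode": "560034", "birth_year": 1985, "gender": "F", "diagnosis": "diabetes"},
    {"pincode": "110001", "birth_year": 1990, "gender": "M", "diagnosis": "asthma"},
]

public_roll = [
    {"name": "R. Sharma", "pincode": "560034", "birth_year": 1985, "gender": "F"},
]

def reidentify(anon_rows, public_rows, keys=("pincode", "birth_year", "gender")):
    """Match rows across datasets when all quasi-identifiers agree."""
    matches = []
    for a in anon_rows:
        for p in public_rows:
            if all(a[k] == p[k] for k in keys):
                matches.append({"name": p["name"], "diagnosis": a["diagnosis"]})
    return matches

print(reidentify(anonymised_health, public_roll))
# [{'name': 'R. Sharma', 'diagnosis': 'diabetes'}]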
We recommend that non-personal data be left out of the DPB, and that it be governed through other frameworks. We also recommend that the government carry out consultations with stakeholders to decide how non-personal data can be regulated. It is also recommended that policymakers provide concrete definitions when notifying new categories of data as sensitive personal data, and not leave this as an arbitrary process.

Regarding data localization and cross border data flows.

It is imperative that the DPB does not mandate that personal data be stored, transferred, and processed within the borders of this country alone. Such a mandate would be a serious blow to the open nature of the internet and digital data.
While it is commendable that this provision is meant to assure the safety and privacy of personal data, these goals could very well be achieved without such restrictive measures. An environment ensuring the free flow of data - while guaranteeing privacy and reasonable safeguards for data sovereignty - will help promote an open and innovative society and economy.
In fact, the latest National Trade Estimate Report on Foreign Trade Barriers released by the US government in March 2022 also makes a strong case against such provisions in the DPB. It said that these provisions “would serve as significant barriers to digital trade between the United States and India. These requirements, if implemented, would raise costs for service suppliers that store and process personal information outside India by forcing the construction or use of unnecessary, redundant local data centres in India … (and) could serve as market access barriers, especially for smaller firms.”8
To assure privacy in the free flow of data across borders, the future version of a privacy bill for India must endeavour to provide adequate legal safeguards that will be beneficial to the user’s data and to the business’s success. 9 Additionally, ambiguous phrases like “public policy” and “State policy” must be defined in it.

Regarding Privacy by Design.

First, as the founder of an MLOps business said,

“But I think the way to do privacy by design is to create public goods, shared recipes, scripts, tools, methods, in steps to be followed, make it really easy for companies to think about privacy, right? But you will not have this until you have means, motive, and opportunity.” By means, the founder referred to the necessary background knowledge about the tools and scripts required. By motive, they referred to the creation of a general discourse on privacy in tech. And by opportunity, they meant that individuals who pursue privacy research and design ought to be given incentives and made to feel valued. “The bill addresses a little bit of the motivation, but we have a long way to go,” the founder said.

Second, respondents suggested that there should be clarity about what Privacy by Design even means in the context of DPB, and how the DPA hopes to certify and approve this for businesses.

Third, many respondents suggested that Privacy by Design policy should not be a mandatory compliance requirement that needs approval by the DPA. “It should come into picture when there is a dispute in terms of data protection, i.e., if there have been some issues in terms of data protection, data privacy or information security, then the privacy by design policy of the company can be scrutinized.”

Fourth, one respondent involved with an agri-tech business suggested easing of the consent management systems involved with Privacy by Design policy as prescribed in the provisions of the DPB. They suggested one waiver instead of multiple consent management checks that add more friction to the process for users and for businesses.

Regarding algorithmic fairness.

First, the provision needs clarity. Since this is a design and technology principle that is largely a theoretical concept, it will be useful to have defined boundaries regarding what the DPB means by algorithmic fairness.

An architect with a FinTech business said,

“I think the regulation needs to define what exactly it tries to achieve with looking at the whole fair AI algorithm. In my view, that basically comes to the question of specific vulnerable groups, for example, groups of women who do not have access to the formal financial system. So for people with low income or people who are on social benefits, and make sure that the algorithms are not discriminating against groups of people.”

Second, it is necessary to have use cases for this provision. In the words of the respondent cited earlier,

“This is what needs to be defined very well by the regulation: what specific use cases need to be addressed? Otherwise, we can always find, you know, a criteria on which certain algorithms won’t be fair or want to get to groups of customers in the same way. So it is a very, I would say delicate question, which needs specific use cases to be defined to make it very much practicable and enforceable, especially in the financial technology sector.”

Third, data and technology experts in particular recommended that this provision in a future version of India’s privacy bill can come closer to being practical only when measurability and accountability factors are clarified. Respondents said that it is essential to know what metrics the DPA hopes to use for algorithmic fairness.

Finally, this will require a team of auditors who are well-versed with data and algorithms, in ways that let them address the nuances and specificities of every business. The audit teams should also include neutral arbitrators “who can actually assess how fair the algorithms are in that particular context,” said one respondent.

Regarding overarching powers of the government.

To thwart the risks of the government’s overriding powers of access to data, respondents made the following recommendations.

The lack of clarity about what constitutes “necessary or expedient” grounds for broad data sharing with the government needs to be addressed.

“I think the Bill needs to specify what exactly is meant by fair requirements, and in what cases this actually needs to happen. Otherwise, what is left at the discretion of the government agencies might be interpreted in multiple ways. It is important to outline more concrete, specific use cases,” said an architect.

One of the respondents suggested that such demands for broad exemptions to the government and central agencies must be supported by “at least the High Courts or higher, and not even by the level of a magistrate or even SHO kind of thing.” Another respondent also echoed this recommendation,

“I think the exemptions need to have a process that the courts need to uphold, rather than the exemptions being blanket requests, which they can make at any time without any sort of checks and balances.”

It is worth noting that the earlier 2018 draft did require due authorization by law for such exemptions.10


Survey Design and Research Methodology

Participant Profile Distribution

Visualization
{
"height": "320",
"width": "480",
"autosize": {
 "type": "fit",
 "contains": "padding",
 "align": "centre"
},
 "data": {
 "values": [
   {"category": [" ","Architect"], "value": 4.2, "label": "4.2%"},
   {"category": "Product manager", "value": 12.5, "label": "12.5%"},
   {"category": ["Senior", "Engineer"], "value": 33.3, "label": "33.3%"},
   {"category": "Founder", "value": 50, "label": "50%"}
 ]
},
"mark": "arc",
"encoding": {
 "theta": {"field": "value", "type": "quantitative", "stack": true},
 "color": {"field": "category", "type": "nominal", "legend": null}
},
"layer": [
 {"mark": {"type": "arc", "outerRadius": 130, "innerRadius": 70, "padAngle": 0.01}
},
 {
   "mark": {"type": "text", "radius": 105, "fill": "#fff"
   },
   "encoding": {
     "text": {"field": "label", "type": "nominal"},
     "size": {"value": 12}
     }
 },
 {
   "mark": {"type": "text", "radius": 170
   },
   "encoding": {
     "text": {"field": "category", "type": "nominal"},
     "fill": {"value": "#000"},
     "size": {"value": 12}
     }
 }
]
}

This report has been created through semi-structured interviews with individuals in SMBs and startups.11
The Privacy Mode team identified and shortlisted business leaders, startup founders, Chief Executive Officers (CEOs), Chief Technology Officers (CTOs), security and compliance experts, product managers, and engineering heads from the Indian SMB and startup ecosystem. A total of 30 individuals were interviewed through June and July 2022. Domain diversity and the scale of operations of the startups were the two factors considered when shortlisting and contacting individuals and organisations to participate in this research.

The Privacy Mode team reached out to the interviewees with a primer on the DPB, an interview questionnaire, and an ethics and consent form prior to the interviews. See Appendices I and II for the primer and the questionnaire. The primer and background material were compiled so that respondents understood the nuances and trajectories of the DPB before the interview, and were in a position to respond to the questions with an informed opinion.


Credits and acknowledgements

We thank all the interviewees who participated in this research and have shared their views.

  • Sweta Dash is the Lead Researcher of this study. She is a researcher and independent journalist based in New Delhi.

  • Kalki Vundamati was the research assistant for the report.

  • Aditya Sujith Gudimetla drafted the interview questionnaire, which was finalized taking into account comments from reviewers, and based on the responses during initial interviews.

  • Neeta Subbiah drafted the primer, and participated in initial interviews.

  • Sankarshan Mukhopadhyay, editor at Privacy Mode, reviewed and provided critical feedback during various stages of this report’s preparation.

  • David Timethy is the project manager at Privacy Mode. He oversaw the completion and publication of this report.

  • Anish TP created the charts and visuals for the report.


Community participation and peer review

In keeping with Privacy Mode’s policy of peer review, interviews were conducted by the Lead Researcher and collaborators from the community. We thank the interviewers from the community for their active role in the research process, and for bringing a critical perspective to this report.

  • Dr. Akshay S Dinesh is a policy and ethics consultant at Weavez Technologies.
  • Joshina Ramakrishnan from Weavez Technologies is a software engineer and an entrepreneur with a decade of experience in inclusive technologies.
  • Kritika Bhardwaj is an advocate practising in Delhi.
  • Maansi Verma is a lawyer and public policy researcher.
  • Sameer Anja is co-founder at Arrka Privacy Management Platform.

Citations and references for additional reading

👉 Draft Data Protection Bill, 2021:

👉 Seetharaman, Bhavani: “Understanding innovation in the Indian tech ecosystem” published at Mozilla Open Innovation Project: Understanding Innovation in the Indian Tech Ecosystem . Specifically, see the chapter on the impact of policy on entrepreneurs in non-urban ecosystems - https://has.gy/ipSo

👉 Timeline of the Bill

👉 Appendix - 1 Primer

👉 Appendix - II Interview questionnaire

👉 Glossary


Footnotes


  1. Privacy Mode reviewed the changes introduced in this PDP Bill, and its likely impact on SMEs. This review was shared with the newly constituted JPC in September 2021. The review is published at: hasgeek.com/privacymode/pdp-bill.
    Also see Data Protection Bill will increase compliance cost for small companies: Hasgeek: Business Line. Sept 2021 ↩︎

  2. In the report on privacy practices in the Indian tech industry in 2020, Nadika Nadja and Anand Venkatnarayanan make the argument that compliance often becomes a checkbox to achieve instead of companies focussing genuinely on user data privacy. This particularly happens in heavily regulated sectors when leadership looks at compliance as an inconvenience that must be fulfilled, instead of paying attention to genuine user privacy issues. See - Privacy practices in the Indian technology ecosystem.
    Withdrawal of the DPB in August 2022:
    Government Withdraws Personal Data Protection Bill, Plans New Set of Legislations: The Wire. Aug 3rd 2022
    Explained: Why the Govt has withdrawn the Personal Data Protection Bill, and what happens now: The Indian Express. Aug 6th 2022 ↩︎

  3. In a review of Telangana state government’s agriculture data management policy, it has been pointed out that policymakers discount the fact that non-personal data (NPD) stems from personal data, and hence, focussing excessively on NPD poses risks for deanonymization of personal data. Review of Telangana state’s Agricultural Data Management Policy 2022: Privacy Mode. Aug 6th 2022 ↩︎

  4. See the summary of this public discussion on the internal and external organisational risks posed by NPD on businesses at India’s Non-Personal Data (NPD) framework: Privacy Mode.
    Justice K.S.Puttaswamy(Retd) ... vs Union Of India And Ors. on 24 August, 2017: Indian Kanoon ↩︎

  5. In a panel discussion on current industry practices around Privacy by Design, it was suggested that policies be made on a principle basis, rather than with very specific technological recommendations. The implementation of these policies should be left to broad industry discussions among tech and business communities. See the Privacy Best Practices Guide for a summary of the panel discussion: Privacy Mode. ↩︎

  6. Source: https://www.business-standard.com/article/economy-policy/70-respondents-want-data-protection-bill-to-drop-localisation-rule-survey-122082400325_1.html ↩︎

  7. See For better compliance, tech transfer, Govt to ease data localisation norms: Indian Express Aug 14 2022. Also see What MeitY Has Said On Upcoming IT Laws Since Withdrawing The Data Protection Bill: Medianama Aug 10 2022 ↩︎

  8. See USTR Releases 2022 National Trade Estimate Report on Foreign Trade Barriers: ustr.gov Mar 31 2022. ↩︎

  9. See Bhavani Seetharaman’s critique of the data localization provisions in the PDP Bill, and the potential loss to GDP that this clause would cause were it to be implemented: Privacy Mode. ↩︎

  10. See Ugly Sides of Data Protection Bill and Fallacies of JPC Report: News Click Dec 20 2021, There’s an expansion of state power in the domain of privacy: Indian Express Dec 18 2021, Sweeping powers to government under data protection Bill a step backwards, say experts: Economic Times Dec 11 2019 ↩︎

  11. According to the Government of India, small businesses are those with investment of up to ₹10 crore and turnover of up to ₹50 crore, and medium businesses those with investment of up to ₹50 crore and turnover of up to ₹250 crore. A business is recognised as a startup for up to 10 years from its date of incorporation, subject to a revenue threshold of Rs 100 crore. MSME Gazette of India 1, MSME Gazette of India 2 ↩︎



Proposed design and technological architecture changes - problems with Privacy by Design and algorithmic fairness requirements


Privacy by Design essentially approaches privacy from a design-thinking perspective. It suggests that privacy must not be an afterthought in the design architecture but should actually be incorporated into networked data systems and technologies by default, or at least at the outset. With that, privacy then remains proactive and embedded in the design architecture throughout the lifecycle of any product/service provided by the business.
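As a rough illustration of “privacy as the default setting”, here is a minimal sketch in which the collection layer itself refuses fields that the declared purpose does not require, rather than filtering them downstream. The purpose-to-field mappings are assumptions for the example, not anything prescribed by the DPB.

# "Privacy as the default": the intake layer keeps only the fields the
# declared purpose needs, instead of filtering later. The purpose-to-field
# mappings are assumptions for this example.

ALLOWED_FIELDS = {
    "delivery":  {"name", "address", "phone"},
    "analytics": {"order_value", "category"},   # no identifiers at all
}

def collect(purpose: str, submitted: dict) -> dict:
    """Drop everything the purpose does not require, at the point of collection."""
    allowed = ALLOWED_FIELDS.get(purpose, set())
    return {k: v for k, v in submitted.items() if k in allowed}

form = {"name": "Asha", "address": "…", "phone": "98xxxxxxx0",
        "order_value": 450, "category": "books"}
print(collect("analytics", form))
# {'order_value': 450, 'category': 'books'} - identifiers never enter the pipeline

The design point is that the narrowing happens at intake, so out-of-purpose identifiers never enter the pipeline in the first place.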

The DPB states that this must contain the measures for protection of privacy throughout processing from the point of collection to deletion of personal data; managerial, organisational, business practices and technical systems designed to anticipate, identify, and avoid harm to the Data Principal; the obligations of data fiduciaries; the technology used in the processing of personal data; and the legitimate interests of businesses.

The adoption of Privacy by Design is supposed to ensure transparent processing of digital data and help users know if and how their privacy is guaranteed by businesses.

In line with the GDPR, the DPB also states that Data Fiduciaries are required to prepare a Privacy by Design policy. The policy is to be submitted to the DPA “for certification within such period and in such manner as may be specified by regulations”. Thereafter, it is supposed to be published on the business’s website, as well as on that of the DPA.

Respondents of this survey anticipate many challenges with such a policy. Foremost, Privacy by Design is subjective, and it is not practical to implement it as mandated. The founder of a FinTech startup was worried about how the privacy practices of businesses can be measured. He said, “It is in the eye of the beholder. Our data must be of high quality to us. But if it refers to practices, data protection and privacy practices, then it makes sense. But again, the problem seems to be that it is very poorly defined or not defined at all. I mean, if somebody comes up with a really skewed metric of measuring quality, which is highly subjective, then that is going to be disastrous for everyone, the entire industry.”

An individual from a health-tech business said that this is after all a subjective policy. “When they are approving privacy by design policy, it cannot be a tick the box approach, right, because they also have to keep an open mind. And, they have to understand the nuances of each business. It can’t be like, you know, a one-shoe-fits-all approach. So they have to be quite open and actually understand while approving the policy that what works for one business may not work for the other.”1

To explain that privacy by design is a very abstract thing, the product manager of an AI Tech business said, “like PII or when you exchange tokens, you don’t really know the person’s email or any phone number or so on. But that still only partially covers some of the fairness things…because even in that case, I will have a lot of other information like, you know, the kind of phone, the IP or the zip code or whatever, which may not, which will not come in privacy by design. Privacy by design is like, in many cases is like I’m doing a hand check: all my key PII information is not shared. But I can still infer a lot of things, and maybe what I’m doing is still fundamentally unfair and biased, and that will still fly.” While he believed in the concept, he did not think that fairness and privacy could be ensured by making privacy by design a mandatory feature.

Second, respondents remained sceptical of such a certification and approval system involving the DPA.

“My primary question is why do you need this certification, right? An audit is always fine. But the certification seems to be like an approval flow,” said the Senior Engineer of a CRM business. “It is going to be just burdensome for the company, as well as for the DPA to actually get all the Privacy by Design policies approved. So I think that is on the compliance part, that is very heavy, and I think it is not a viable option to be done,” said product managers of a company that builds open source software products and services. They also shared that they already try to ensure privacy for the customer’s data but they are of the opinion that the set of procedures for this should remain confidential to the business. “So these things should be private to a company. It should not be actually shared with a DPA because it is upon the company to actually create a Privacy by Design approach.” They suggested instead that the approach can be shared with the DPA in exceptional situations like a data breach or a security threat.

Another senior engineer at a health tech business added what concerns her about such approval from the DPA. She said, “If you need a sign off from the authority on your Privacy by Design policy, right, there are no timelines specified within which the authorities will get back. I’d assume that the government is hopefully coming up with some guidelines on this, because otherwise this can hamper business if there is no cut-off date for them to respond.”

Third, respondents are concerned that this will strain businesses’ resources. Some of these respondents have already invested in Privacy by Design policies in their businesses because of their experience with GDPR compliance, but they still have concerns about the policy being a mandate.

“In my case, because I knew that I have to offer to customers outside India, I had internalised the cost, right? If I was building for India now, in most cases, we don’t have any funding. And the customers don’t pay and all those kinds of things, you know, this is all an afterthought. But, the point here is that in most companies people are seriously starved and resource starved. And for them, survival is a bigger thing than Privacy by Design,” said the founder of an MLOps business. Similar responses were echoed by other respondents that complying with the Privacy by Design requirements would add to the already stretched capacity - of knowledge and personnel.

The founder of a FinTech business said, “the cost of building technology... So the ROI of basically becoming a tech enabled business, at times seems like non justifiable, because you already have so much to adhere to, right, from the regulation to changing updates and all.” Another founder of an AgriTech startup said, “See, for example, if I am, sort of educated person having so much problem reading through this Bill and trying to understand what it means for our business, what compliance should I follow? I’m thinking, what happens to somebody who is just running a very small operation? Who is probably unaware of this. Though you have to keep your customers’ data private and that is important, but at the same time, I am just worried about the compliance angle of it, and the compliance angle becoming too cumbersome. That’s my main concern.”2

Fourth, there is a seeming lack of clarity about inclusion in the sandbox. Clause 40 of the DPB states that the Data Protection Authority shall create a sandbox for “encouraging innovation in artificial intelligence, machine-learning or any other emerging technology in the public interest” and may permit some “regulatory relaxations for a specified period of time” for the businesses that are included in the sandbox. In effect, a sandbox acts as a system to promote innovation with relaxed regulatory norms for the businesses that are included in it.3 But, respondents remained unsure of how the inclusion works and what relaxations might come into action.

One of our respondents said she is aware that certain businesses can apply for inclusion in the sandbox after an approval on their Privacy by Design policy by the DPA. She did express that her business is likely to apply for this inclusion. “So I think this is where they have protected the interests of innovators also.” But she, as well as all others, felt “Us being included in the Sandbox is a mere speculation at this time. And it can only be confirmed post regulation is released to this effect, for clarity.”

However, most respondents agreed that theoretically it is a great concept and it is about time that some efforts are made to incentivize embedding privacy into the technological and design architecture. “There is a ripe opportunity for some minds to think in terms of how to solve this problem from a more technical and product angle,” said one of them.

Algorithmic fairness is another emerging area of research in technological and design architecture. Lying at the intersection of Machine Learning (ML) and ethics, it aims to identify and correct causes of bias in algorithms and in the data used for automated decision-making.
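For a concrete sense of what “measuring” fairness can involve, below is a minimal sketch of one commonly used check, demographic parity: comparing an algorithm’s approval rates across groups. It is only one of many competing metrics - which is exactly the measurability gap respondents raise later - and the loan decisions are invented for illustration.

# One commonly used fairness check, demographic parity: compare approval
# rates across groups. Only one of many competing metrics; the loan
# decisions below are invented for illustration.

decisions = [
    {"group": "women", "approved": True},  {"group": "women", "approved": False},
    {"group": "women", "approved": False}, {"group": "men",   "approved": True},
    {"group": "men",   "approved": True},  {"group": "men",   "approved": False},
]

def approval_rate(rows, group):
    """Share of applicants in `group` that the algorithm approved."""
    matching = [r for r in rows if r["group"] == group]
    return sum(r["approved"] for r in matching) / len(matching)

rates = {g: approval_rate(decisions, g) for g in ("women", "men")}
print(rates)                                  # {'women': 0.33..., 'men': 0.66...}
print(f"parity gap: {rates['men'] - rates['women']:.2f}")   # 0.33; flag above a threshold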

Under Clause 23, the DPB prescribes that entities must share information on the “fairness of algorithm”. The JPC recommended this provision in order to ensure transparency in the algorithms that various entities use to process personal data, and to prevent their misuse.

Respondents’ concerns about algorithmic fairness mirrored their concerns about privacy by design.

Once again, first, they did not find enough clarity about what algorithmic fairness means in terms of implementation, and thought that this too is an impractical concept. As a founder of a FinTech company said, “It is wishful thinking. I don’t think it is possible to ensure fairness, because humans themselves can’t ensure fairness. How can algorithms do this?”

“In the US, I have friends who are doing algorithmic fairness and things like that, even in the US and elsewhere, that is slowing down. Because it’s just too hard to do mathematically,” said a founder of an MLOps business. An individual associated with a FinTech business said that while there are a lot of different methods to evaluate fairness of algorithms based on models like, “for instance, how it treats women versus men, or people of different ethnicities, gender and so on. We can test these models, but eventually I think, whatever decision we make, whether it is taken by an AI algorithm or whether the decisions are made by humans, there is always a bias.”

Respondents also said that the fairness of algorithms might not be enforceable, or even possible at all. One of them said, “I think if you are a little clever about it, you can appear to be fair based on a few judgments and yet be very unfair. And this is what happens a lot in the US where you are not allowed to use race as an example. But you can infer race with things as simple as name and obviously zip code and so on and so forth.”
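That point about proxies can be made concrete with a small sketch: the rule below never reads the protected attribute, yet outcomes split cleanly along it, because pincode correlates with group membership. All data and the favoured-pincode rule are invented for illustration.

# Proxy discrimination in miniature: the rule never reads "group", yet
# outcomes split along it because pincode correlates with group membership.
# All data and the favoured-pincode rule are invented.

applicants = [
    {"group": "A", "pincode": "560001"},
    {"group": "A", "pincode": "560001"},
    {"group": "B", "pincode": "110099"},
    {"group": "B", "pincode": "110099"},
]

FAVOURED_PINCODES = {"560001"}    # stands in for a proxy a model might learn

for a in applicants:
    a["approved"] = a["pincode"] in FAVOURED_PINCODES   # 'group' never used

by_group: dict = {}
for a in applicants:
    by_group.setdefault(a["group"], []).append(a["approved"])

print({g: sum(v) / len(v) for g, v in by_group.items()})
# {'A': 1.0, 'B': 0.0} - "blind" to the attribute, discriminatory in effect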

Another respondent said that some algorithms are meant to have a bias right from the start “because that’s what they want … that could be language bias because that is their market. For example, they need some language bias for Spain right, which gives certain weightage for certain particular attributes. Whereas if it is a German market, based on German as a language that is some other thing, are these unfair, you never know.”

Second, respondents remained sceptical of who is going to audit fairness of algorithms and how. “I’m not sure who’s going to audit your algorithmic fairness, because at least what I have seen is many of the audit agencies, they don’t understand the core of the algorithm, right? So who’s going to audit it, how it is going to be even determined, fair algorithm, non fair algorithm, and whether it is going to be random whether somebody can just raise an issue saying, “Oh, it is not treating me fairly.” And it’s on us the company to prove it a fair algorithm?” said one respondent.

Respondents also worried about the full scope of sharing algorithms, because algorithms themselves evolve over time. “So, what is the frequency of sharing? Every version when I change it - version one, version two, version 1.5, version 1.3 - should I share it?” asked one FinTech founder.

Another respondent worried about the safety and privacy of their algorithms after such audits. One agritech founder said she is not confident about the mechanisms for storing the algorithms. “Now, where is this data really sitting? Is it just with the government? Or is this getting exposed somewhere? Or are there mechanisms? Like, the IRCTC data also got leaked. So, how secure are their mechanisms for storing this data?” 4

Third, and most commonly in fact, all respondents unanimously agreed that this poses threats to their intellectual property rights. As one founder of a FinTech company said about sharing algorithms, “This is something that will affect absolutely every unicorn that is out there. That (algorithm) has the secret sauce (of business) that has, you know, data and what not.” “So if I share my algorithm, then how will I have my competitive advantages preserved? How is my IP going to be protected? So that will concern anyone. So that is not good. Definitely,” said the founder of an agritech business.

While a few of the businesses that rely on open source models think they would remain unaffected by IP concerns, some disagreed. One respondent said, “Our source code is open source. So it’s already hosted on GitHub. There is no problem over there. But in terms of even open source code, or open source code like MIT licences where there is customization done on an MIT licence, and some companies actually decided that the source code will be a proprietary source code, now, their intellectual property at some level is going to get impacted when this kind of disclosure or the level of disclosure is not mentioned.”

As with privacy by design, respondents appreciated the merit of this concept of fairness in algorithms. While they had concerns about implementation and compliances about this provision, they thought this could go a long way with some deliberation by the policymakers.


  1. In a review of the Amendments to the IT Rules 2022, Anwesha Sen points out a similar “one-size-fits-all” solution approach that policymakers have arrived at with the conception of the Grievance Appellate Committee (GAC). The role of the GAC is to review and redress complaints about content posted on social media platforms. The GAC is conceptualized as a single body for adjudicating on all complaints, across all platforms. This is problematic because the members of the GAC may not always understand the nuances of the industry verticals. Hence, their judgement may not only be questionable, but also flawed in many instances. See the table of review and recommendations for the Amendments to IT Rules 2022 at- Amendments to the IT Rules 2022: Impact on SMEs and SSMIs: Privacy Mode. ↩︎

  2. As we also found in an earlier survey, Privacy practices in the Indian technology ecosystem - A 2020 survey of the makers of products and services: “Small and medium organizations do not have the organization structure or capability to have specialised departments to handle risk and compliance, which – combined with their lack of training budgets for privacy, and lack of standard procedures to handle privacy concerns or risk – is a point of great concern and will have implications across the tech ecosystem.” Privacy Mode. ↩︎

  3. On the sandbox, see Section 40 of the PDP Bill (“Sandbox for encouraging innovation”); and India Business Law Journal, “The curious case of the privacy sandbox”. ↩︎

  4. See: User data of more than 900,000 leaked from IRCTC last year, resurfaces on dark web: Hindustan Times. ↩︎

