The past as a compass for the future


SMEs and the startup ecosystem in India share concerns about the (retracted) draft Data Protection Bill, 2021 - and the way forward for businesses

Sweta Dash

@sd93

Proposed design and technological architecture changes - problems with Privacy by Design and algorithmic fairness requirements

Submitted Aug 15, 2022

Privacy by Design approaches privacy from a design-thinking perspective. It suggests that privacy must not be an afterthought in the design architecture but should be incorporated into networked data systems and technologies by default, or at least at the outset. Privacy then remains proactive and embedded in the design architecture throughout the lifecycle of any product or service the business provides.

The DPB states that the Privacy by Design policy must contain: the measures for protecting privacy throughout processing, from the point of collection to the deletion of personal data; the managerial, organisational and business practices and technical systems designed to anticipate, identify, and avoid harm to the Data Principal; the obligations of Data Fiduciaries; the technology used in the processing of personal data; and the legitimate interests of businesses.

The adoption of Privacy by Design is supposed to ensure transparent processing of digital data and help users know if and how their privacy is guaranteed by businesses.
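
To make the idea concrete, here is a minimal, hypothetical sketch of what "privacy by default" can look like in code. The class and field names are illustrative assumptions, not anything prescribed by the DPB; the point is only that restrictive settings and bounded retention are fixed at the moment of collection rather than bolted on later.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta

# Hypothetical illustration: restrictive settings are the default, so the
# product behaves in a privacy-preserving way unless a user opts in.
@dataclass
class PrivacyDefaults:
    analytics_opt_in: bool = False             # no tracking unless explicitly enabled
    marketing_opt_in: bool = False             # no marketing use unless explicitly enabled
    retention: timedelta = timedelta(days=90)  # bounded retention, not "keep forever"

@dataclass
class PersonalRecord:
    email: str
    collected_at: datetime = field(default_factory=datetime.utcnow)
    settings: PrivacyDefaults = field(default_factory=PrivacyDefaults)

    def delete_after(self) -> datetime:
        # The deletion date is fixed at collection time, so purging is part of
        # the record's lifecycle rather than an afterthought.
        return self.collected_at + self.settings.retention

record = PersonalRecord(email="user@example.com")
print(record.settings.analytics_opt_in, record.delete_after())
```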

In line with the GDPR, the DPB also states that Data Fiduciaries are required to prepare a Privacy by Design policy. The policy is to be submitted to the DPA “for certification within such period and in such manner as may be specified by regulations”. Thereafter, it is supposed to be published on the business’s website and on that of the DPA as well.

Respondents of this survey anticipate many challenges with such a policy. Foremost, Privacy by Design is subjective, and it is not practical to implement it in this manner. The founder of a FinTech startup was worried about how the privacy practices of businesses can be measured. He said, “It is in the eye of the beholder. Our data must be of high quality to us. But if it refers to practices, data protection and privacy practices, then it makes sense. But again, the problem seems to be that it is very poorly defined or not defined at all. I mean, if somebody comes up with a really skewed metric of measuring quality, which is highly subjective, then that is going to be disastrous for everyone, the entire industry.”

An individual from a health-tech business said that this is after all a subjective policy. “When they are approving privacy by design policy, it cannot be a tick the box approach, right, because they also have to keep an open mind. And, they have to understand the nuances of each business. It can’t be like, you know, a one-shoe-fits-all approach. So they have to be quite open and actually understand while approving the policy that what works for one business may not work for the other.”1

To explain that privacy by design is a very abstract thing, the product manager of an AI Tech business said, “like PII or when you exchange tokens, you don’t really know the person’s email or any phone number or so on. But that still only partially covers some of the fairness things…because even in that case, I will have a lot of other information like, you know, the kind of phone, the IP or the zip code or whatever, which may not, which will not come in privacy by design. Privacy by design is like, in many cases is like I’m doing a hand check. all my key PII information is not shared. But I can still infer a lot of things, and maybe what I’m doing is still fundamentally unfair and biased, and that will still fly.” While he believed in the concept, he did not think that fairness and privacy could be ensured by making privacy by design a mandatory feature.
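
What the respondent describes can be sketched roughly as follows. This is a hypothetical illustration (the field names and salting scheme are assumptions, not any standard the Bill prescribes) of how direct identifiers can be tokenised while quasi-identifiers such as device model, IP prefix or PIN code still travel with the data and still support inferences.

```python
import hashlib

# Hypothetical illustration of the kind of token exchange the respondent
# describes: direct identifiers are replaced with opaque tokens, while
# quasi-identifiers pass through untouched.
def tokenise(value: str, salt: str = "per-tenant-secret") -> str:
    return hashlib.sha256((salt + value).encode()).hexdigest()[:12]

raw_event = {
    "email": "user@example.com",
    "phone": "+91-9800000000",
    "device": "budget-android-model",
    "ip_prefix": "103.27.9.x",
    "pin_code": "110045",
}

shared_event = {
    "user_token": tokenise(raw_event["email"]),  # "key PII" hidden
    "device": raw_event["device"],               # still present
    "ip_prefix": raw_event["ip_prefix"],         # still present
    "pin_code": raw_event["pin_code"],           # still present
}

# The email and phone number are gone, yet device model, IP prefix and PIN
# code together can still support inferences about income, locality or
# demographics -- which is why tokenisation alone does not guarantee fairness.
print(shared_event)
```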

Second, respondents remained sceptical of such a certification and approval system involving the DPA.

“My primary question is why do you need this certification, right? An audit is always fine. But the certification seems to be like an approval flow,” said the Senior Engineer of a CRM business. “It is going to be just burdensome for the company, as well as for the DPA to actually get all the Privacy by Design policies approved. So I think that is on the compliance part, that is very heavy, and I think it is not a viable option to be done,” said product managers of a company that builds open source software products and services. They also shared that they already try to ensure privacy for customers’ data, but they are of the opinion that the set of procedures for this should remain confidential to the business. “So these things should be private to a company. It should not be actually shared with a DPA because it is upon the company to actually create a Privacy by Design approach.” They suggested instead that the approach could be shared with the DPA in exceptional situations like a data breach or a security threat.

Another senior engineer at a health tech business shared her concerns about such approval from the DPA. She said, “If you need a sign off from the authority on your Privacy by Design policy, right, there are no timelines specified within which the authorities will get back. I’d assume that the government is hopefully coming up with some guidelines on this, because otherwise this can hamper business if there is no cut-off date for them to respond.”

Third, respondents are concerned that this will strain resources for businesses. Some of these respondents have already invested in Privacy by Design policies in their businesses because of their experience with GDPR compliance, but they still have these concerns about the policy being a mandate.

“In my case, because I knew that I have to offer to customers outside India, I had internalised the cost, right? If I was building for India now, in most cases, we don’t have any funding. And the customers don’t pay and all those kinds of things, you know, this is all an afterthought. But, the point here is that in most companies people are seriously starved and resource starved. And for them, survival is a bigger thing than Privacy by Design,” said the founder of an MLOps business. Other respondents echoed this, saying that complying with the Privacy by Design requirements would add to their already stretched capacity, both in knowledge and in personnel.

The founder of a FinTech business said, “the cost of building technology... So the ROI of basically becoming a tech enabled business, at times seems like non justifiable, because you already have so much to adhere to, right, from the regulation to changing updates and all.” Another founder of an AgriTech startup said, “See, for example, if I am, sort of educated person having so much problem reading through this Bill and trying to understand what it means for our business, what compliance should I follow? I’m thinking, what happens to somebody who is just running a very small operation? Who is probably unaware of this. Though you have to keep your customers’ data private and that is important, but at the same time, I am just worried about the compliance angle of it, and the compliance angle becoming too cumbersome. That’s my main concern.”2

Fourth, there is an apparent lack of clarity about inclusion in the sandbox. Clause 40 of the DPB states that the Data Protection Authority shall create a sandbox for “encouraging innovation in artificial intelligence, machine-learning or any other emerging technology in the public interest” and may permit some “regulatory relaxations for a specified period of time” for the businesses that are included in the sandbox. In effect, a sandbox acts as a system to promote innovation with relaxed regulatory norms for the businesses included in it.3 But respondents remained unsure of how inclusion works and what relaxations might come into effect.

One of our respondents said she is aware that certain businesses can apply for inclusion in the sandbox after the DPA approves their Privacy by Design policy. She did express that her business is likely to apply for this inclusion. “So I think this is where they have protected the interests of innovators also.” But she, like all the others, felt that “Us being included in the Sandbox is a mere speculation at this time. And it can only be confirmed post regulation is released to this effect, for clarity.”

However, most respondents agreed that, in theory, it is a great concept, and that it is about time some effort was made to incentivize embedding privacy into the technological and design architecture. “There is a ripe opportunity for some minds to think in terms of how to solve this problem from a more technical and product angle,” said one of them.

Algorithmic fairness is another emerging area of research in technological and design architecture. Lying at the intersection of Machine Learning (ML) and ethics, it aims to identify and correct causes of bias in the algorithms and data used for automated decision making.
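
One common, simple check in this area is demographic parity: comparing the rate of favourable outcomes across groups. The sketch below is a hypothetical illustration with made-up numbers, not a method the DPB mandates.

```python
from collections import defaultdict

# Hypothetical illustration of one common fairness check (demographic parity):
# compare the rate of favourable outcomes, e.g. loan approvals, across groups.
decisions = [
    {"group": "men",   "approved": True},
    {"group": "men",   "approved": True},
    {"group": "men",   "approved": False},
    {"group": "women", "approved": True},
    {"group": "women", "approved": False},
    {"group": "women", "approved": False},
]

totals, approvals = defaultdict(int), defaultdict(int)
for d in decisions:
    totals[d["group"]] += 1
    approvals[d["group"]] += int(d["approved"])

rates = {g: approvals[g] / totals[g] for g in totals}
gap = max(rates.values()) - min(rates.values())
print(rates, f"parity gap = {gap:.2f}")
# A large gap flags potential bias, but a small gap on this one metric does
# not prove the system is fair under every other definition of fairness.
```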

Under Clause 23, the DPB prescribes that entities must share information on the ‘fairness of algorithm’. The JPC recommended this provision in order to ensure transparency in the algorithms that various entities use to process personal data, and to prevent their misuse.

Respondents’ concerns about algorithmic fairness mirrored those they raised about Privacy by Design.

First, once again, they did not find enough clarity about what algorithmic fairness means in terms of implementation, and they thought that this too is an impractical concept. As the founder of a FinTech company said, “It is wishful thinking. I don’t think it is possible to ensure fairness, because humans themselves can’t ensure fairness. How can algorithms do this?”

“In the US, I have friends who are doing algorithmic fairness and things like that, even in the US and elsewhere, that is slowing down. Because it’s just too hard to do mathematically,” said a founder of an MLOps business. An individual associated with a FinTech business said that there are many different methods to evaluate the fairness of models: “for instance, how it treats women versus men, or people of different ethnicities, gender and so on. We can test these models, but eventually I think, whatever decision we make, whether it is taken by an AI algorithm or whether the decisions are made by humans, there is always a bias.”

Respondents also said that fairness of algorithms might not be enforceable, or even possible at all. One of them said, “I think if you are a little clever about it, you can appear to be fair based on a few judgments and yet be very unfair. And this is what happens a lot in the US where you are not allowed to use race as an example. But you can infer race with things as simple as name and obviously zip code and so on and so forth.”
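
The respondent’s point about proxies can be illustrated with a deliberately simplified, hypothetical example: a decision rule that never reads the protected attribute can still split outcomes along it when another field, such as a zip or PIN code, is strongly correlated with group membership.

```python
# Hypothetical, deliberately simplified illustration of the proxy problem:
# the decision rule never reads the protected attribute, but zip code does
# the same work because it is correlated with group membership in the data.
applicants = [
    {"zip": "A", "protected_group": 1},
    {"zip": "A", "protected_group": 1},
    {"zip": "B", "protected_group": 0},
    {"zip": "B", "protected_group": 0},
]

def approve(applicant: dict) -> bool:
    return applicant["zip"] == "B"   # group membership never appears here

for a in applicants:
    print(a["zip"], a["protected_group"], approve(a))
# Every approval lands in group 0 and every rejection in group 1, so the rule
# "appears" neutral while producing a perfectly skewed outcome.
```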

Another respondent said that some algorithms are meant to have a bias right from the start “because that’s what they want … that could be language bias because that is their market. For example, they need some language bias for Spain right, which gives certain weightage for certain particular attributes. Whereas if it is a German market, based on German as a language that is some other thing, are these unfair, you never know.”

Second, respondents remained sceptical of who is going to audit the fairness of algorithms, and how. “I’m not sure who’s going to audit your algorithmic fairness, because at least what I have seen is many of the audit agencies, they don’t understand the core of the algorithm, right? So who’s going to audit it, how it is going to be even determined, fair algorithm, non fair algorithm, and whether it is going to be random whether somebody can just raise an issue saying, “Oh, it is not treating me fairly.” And it’s on us the company to prove it a fair algorithm?” said one respondent.

Respondents also worried about the full scope of sharing algorithms, because algorithms themselves evolve over time. “So, what is the frequency of sharing? Every version when I change it, version one, version two, 1.5, version 1.3, should I share it?” asked one FinTech founder.

Another respondent worried about the safety and privacy of their algorithms after such audits. One agritech founder said she is not confident about the mechanisms for storing the algorithms. “Now, where is this data really sitting? Is it just with the government? Or is this getting exposed somewhere? Or are there mechanisms? Like, the IRCTC data also got leaked. So, how secure are their mechanisms for storing this data?”4

Third, and most commonly, respondents unanimously agreed that this poses a threat to their intellectual property rights. As one founder of a FinTech company said about sharing algorithms, “This is something that will affect absolutely every unicorn that is out there. That (algorithm) has the secret sauce (of business) that has, you know, data and what not.” “So if I share my algorithm, then how will I have my competitive advantages preserved? How is my IP going to be protected? So that will concern anyone. So that is not good. Definitely,” said the founder of an agritech business.

While a few of the businesses that rely on open source models think they would remain unaffected by IP concerns, others disagreed. One respondent said, “Our source code is open source. So it’s already hosted on GitHub. There is no problem over there. But in terms of even open source code, or open source code like MIT licences where there is customizations done on MIT licence, and some companies actually decided that the source code will be a proprietary source code, now, their intellectual property at some level is going to get impacted when this kind of disclosure or the level of disclosure is not mentioned.”

As with Privacy by Design, respondents appreciated the merit of the concept of fairness in algorithms. While they had concerns about the implementation of, and compliance with, this provision, they thought it could go a long way with some deliberation by the policymakers.


  1. In a review of the Amendments to the IT Rules 2022, Anwesha Sen points out a similar “one-size-fits-all” solution approach that policymakers have arrived at with the conception of the Grievance Appellate Committee (GAC). The role of the GAC is to review and redress complaints about content posted on social media platforms. The GAC is conceptualized as a single body for adjudicating on all complaints, across all platforms. This is problematic because the members of the GAC may not always understand the nuances of the industry verticals. Hence, their judgement may not only be questionable, but also flawed in many instances. See the table of review and recommendations for the Amendments to IT Rules 2022 at: Amendments to the IT Rules 2022: Impact on SMEs and SSMIs: Privacy Mode. ↩︎

  2. As we also found in an earlier survey, Privacy practices in the Indian technology ecosystem: A 2020 survey of the makers of products and services: “Small and medium organizations do not have the organization structure or capability to have specialised departments to handle risk and compliance, which – combined with their lack of training budgets for privacy, and lack of standard procedures to handle privacy concerns or risk – is a point of great concern and will have implications across the tech ecosystem.” See Privacy practices in the Indian technology ecosystem - A 2020 survey of the makers of products and services: Privacy Mode. ↩︎

  3. Sandbox: Section 40, PDP Bill - Sandbox for encouraging innovation; India Business Law Journal - The curious case of the privacy sandbox. ↩︎

  4. See: User data of more than 900,000 leaked from IRCTC last year, resurfaces on dark web: Hindustan Times. ↩︎
