A Primer on the Non-Personal Data (NPD) framework proposed for India
Data is often viewed as a risk factor that has to be governed to protect the individual’s privacy. Data of this kind is broadly known as personal data, i.e., data associated with a particular individual.
Non-personal data (NPD) is any data that is not personal. Traffic data is an example of NPD. This includes traffic patterns in the city, how long it takes to go from Point A to Point B, etc. A company like Uber collects this information on its platform because its users generate this data as they use the service. This aggregated data, or insights derived from aggregated data, is NPD. It gives Uber a competitive advantage over other players. Similar examples can be found in other domains such as e-commerce, food delivery, etc.
Some aspects of NPD are already present in the Personal Data Protection Bill (PDPB), under which the government can demand NPD for developmental and planning purposes. Companies must be cognizant of the NPD Report even before a dedicated NPD protection law is enacted. This is because NPD compliance will involve reengineering existing systems and business processes. NPD regulation will also impact the go-to-market strategies of startups, especially the ‘high-growth’ ones.
Categories of NPD:
- Public NPD - this is either created by the government or funded by taxpayer money in a way that society has greater claim over it. All the datasets available on the government’s open data portal data.gov.in fall under this category. By recognizing this data as public NPD, the government encourages other departments to open their datasets, which can then be used for fueling innovation.
- Private NPD - this belongs to companies and startups, which may be required to share it, in raw form, with competitors or with the government itself. The NPD Report is deliberately vague about raw and processed data. While algorithms and proprietary data don’t have to be shared under Private NPD, much information about companies can be gleaned even from raw data.
- Community NPD - Resident Welfare Associations (RWAs), tribes, people from a village, or a group of people affected by an event can demand NPD about themselves for developmental purposes. There are likely to be conflicts around purpose and access to this NPD by members of the same community. This will create confusing situations for companies figuring out how to serve the conflicting demands of each community.
The rationale for NPD regulation is to rein in big tech. The government hopes to create competitive advantage by forcing big companies to share their datasets and insights with smaller players, thereby ‘creating a level playing field’.
The other purpose of NPD regulation is to promote public good. Here, the anonymized aggregated datasets that large companies will open under NPD will be used by ‘communities’ for development. The objective of data sharing is to generate “economic benefits for citizens and communities in India” and ensure that the “benefits from processing NPD accrue not only to the organizations that collect such data but also to India and the community that typically produces the data that is being captured.”
However, the NPD Report also allows companies to protect their proprietary datasets under the Copyright Act, thereby reducing the number and quality of datasets that can actually be shared for public good.
NPD has traditionally not been regulated. Companies have processes to assess risk and regulate personal data to meet domestic and international compliance requirements. Now, companies will have to carry out risk assessment and compliance for NPD too.
NPD is freely used inside companies, and is available in the public domain. This makes compliance for NPD harder, and is something most companies are very poorly prepared for.
Below are the important aspects of NPD compliance:
- Data sharing: Companies will have to share their data with other companies, or with the government. The government, in turn, will decide which entities it will share NPD with. The NPD Report is unclear about when it wants the data, and when it wants the insights, to be shared with the government.
- Data Business registration requirement: All businesses above a certain size have to register as a Data Business in India. The thresholds for companies to register as a Data Business will be set on the basis of criteria such as gross revenue, number of consumers/households/devices handled, and percentage of revenue from consumer information.
To register as a Data Business, companies must provide business ID, business name, associated brand names, rough data traffic and cumulative data collected in terms of number of users, records and data; nature of Data Business, kinds of data collection, aggregation, processing, uses, selling, data-based services developed, etc.
Registering as a Data Business will increase the record-keeping burden: every company will have to track how much of its data must be made accessible under NPD sharing.
- Metadata Directory contribution: Data Businesses will have to share metadata on the data fields collected by them with the Non-Personal Data Authority (NPDA). This metadata directory will be available under open access.
- High Value Datasets (HVDs): These are datasets that serve the public good and will benefit the community. HVDs have predetermined data fields. A government or non-profit private organization, in its role as a Data Trustee, can request the creation of an HVD from the NPDA. The Report has suggested the granularity of NPD to be collected for creating an HVD at the raw, aggregate, and inferred data level.
- Data Trustees can be government or non-profit private organizations, including Section 8 companies. They are responsible for the creation, maintenance and sharing of HVDs.
A group of community members can also come together to create a Data Trustee and host an HVD.
Data trustees have a ‘duty of care’ to the concerned community, ensuring that HVDs are only used in the interests of the community, and no harm occurs due to re-identification of NPD.
- Data processors are companies that process NPD on behalf of a Data Custodian, e.g., CSPs and SaaS providers. Data processors are not obligated to share metadata belonging to their customers.
- Data requesters: Only public or private organizations registered in India can request Data Trustees for access to datasets contained in HVDs. Individual persons cannot be data requesters.
- Data Principal: The first iteration of the NPD Report defined the Data Principal as the corresponding person, entity or community to whom the data relates. The Data Principal is removed in the revised NPD Report, which says that once any personal data is anonymized, or if data pertains to information other than a person (traffic details, natural phenomena), there is no Data Principal associated. The community can exercise rights over NPD, revoking individual rights to NPD.
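For engineering teams, the registration and metadata directory requirements listed above can be pictured as a small record-keeping exercise. The sketch below is only illustrative. The NPD Report names the registration criteria but does not set threshold values, so every number in `THRESHOLDS` is a made-up placeholder. The class names, the field names, the rule that meeting any one criterion triggers registration, and the shape of the metadata payload are likewise assumptions, not anything prescribed by the Report or the NPDA.

```python
from dataclasses import dataclass, asdict

# Hypothetical values: the Report lists the criteria but leaves thresholds unset.
THRESHOLDS = {
    "gross_revenue_inr": 500_000_000,
    "consumers_handled": 1_000_000,
    "consumer_data_revenue_share": 0.25,
}


@dataclass
class Company:
    business_id: str
    name: str
    gross_revenue_inr: int
    consumers_handled: int
    consumer_data_revenue_share: float  # fraction of revenue from consumer information


@dataclass
class MetadataEntry:
    """Describes one collected data field, not the data itself."""
    field_name: str
    description: str
    data_type: str


def must_register(company: Company, thresholds: dict = THRESHOLDS) -> bool:
    # Assumption: crossing any single criterion is enough to require registration.
    return any(getattr(company, key) >= limit for key, limit in thresholds.items())


def metadata_directory_submission(company: Company, entries: list) -> dict:
    # Only field-level metadata is included; no underlying records are shared.
    return {
        "business_id": company.business_id,
        "business_name": company.name,
        "fields": [asdict(entry) for entry in entries],
    }


if __name__ == "__main__":
    ride_co = Company("B-002", "Example Rides", 900_000_000, 5_000, 0.05)
    if must_register(ride_co):
        entry = MetadataEntry("trip_duration_s", "Trip duration in seconds", "integer")
        print(metadata_directory_submission(ride_co, [entry]))
```

Keeping the thresholds in a single table means that a company could update its check once the actual criteria are notified, without touching the rest of its record-keeping code.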
The NPDA will facilitate legitimate data sharing requests, regulate and supervise data sharing arrangements, and address market failures. The NPDA will have representatives from industry, the Personal Data Protection Authority (DPA), the Competition Commission of India (CCI), etc. Sectoral regulators can build additional data regulations on top of those developed by the NPDA.
Erosion of competitive advantage and innovation:
NPD gives insights about how companies operate, and the advantages that the company has over its competitors. For example, Uber determines pricing structure for rides by looking at NPD. This competitive advantage will be eroded when governments mandate large companies to share their NPD with smaller players.
The role of trusts and government will increase in the data ecosystem - for access to data, as guarantor that the data is verified and indeed contributing to public good, and for resolving conflicts - leading to friction during enforcement. Under the PDP Bill, the role of trusts and government has already increased as intermediaries for similar issues, on the grounds of safeguarding privacy and security.
Further, it is difficult to distinguish between insights derived from data and the processes used to arrive at those insights, including use of the raw data itself. The NPD Report keeps shifting between insights and raw data without making clear which of the two it means.
The NPD Report mentions that companies will share data with the government, and the government will decide which data is to be shared with which entity. This gives the government a controlling and oversight role in the ecosystem.
The proposed NPD framework will therefore radically reshape the data ecosystem in India.
Impact of NPD on startups and innovation by Udbhav Tiwari, Public Policy Advisor at Mozilla: https://hasgeek.com/PrivacyMode/npd-impact-on-startups/
Impact of NPD on compliance and engineering processes by Sathish K S, VP of Engineering at Zeotap: https://hasgeek.com/PrivacyMode/impact-of-non-personal-data-npd-framework-on-engineering-processes/
Practical concerns about metadata directory and HVDs requirements in NPD: https://hasgeek.com/PrivacyMode/npd-week/sub/metadata-directory-and-high-value-datasets-hvds-JegscGrtuXzeNFo2Y8Z6vc
Definition of communities and public good in NPD, and what this means for participatory governance: https://hasgeek.com/PrivacyMode/npd-week/sub/interrogating-community-public-good-and-data-trust-DE1r1QQU7Wegr6sUmktxS4
Global data regulations and institutional mechanisms for regulating personal and NPD: https://hasgeek.com/PrivacyMode/npd-week/sub/personal-and-non-personal-data-regulations-globall-H3ZLzwHch2e599u2qNEgVW