Policies around data privacy and data governance often have a gap between declared intent and implementation. Sometimes this is because the teams drafting these texts are not included in the product/service development cycle. Unintended consequences, and much of the end consumer's experience, often stem from this gap.
This project examines the texts of policies to identify data governance practices. Its outcome will be commentary, opinion and feedback to improve the texts themselves, as well as to make them easier for readers to comprehend.
Review of Telangana state's Agricultural Data Management Policy 2022
The draft version of Telangana state’s Agricultural Data Management Policy 2022 is published at https://invest.telangana.gov.in/wp-content/uploads/2022/07/Draft-Telangana-Agriculture-Data-Management-Policy-2022-vEnglish.pdf
The policymakers invited suggestions to this draft. Privacy Mode submitted the following comments and suggestions.
Some of the key concerns with this draft are as follows:
- The draft does not specify the origins and sources of agriculture data. It is presumed that agri data exists everywhere/somewhere, and that over time, this data will be automatically funneled into databases. As a result, the processes by which consent should be obtained from Data Principals are weakly defined in this draft.
- The individual Data Principals i.e., farmers, marginal land owners, tenant farmers and other individual entities in the agri ecosystem should be at the center of this draft policy and its future iterations. The draft policy, in its current form, does not explicitly consider individuals as Data Principals and therefore overlooks the nuances of how consent should be obtained from individuals to process and access their data. Instead, the policy appears to be guided by gross assumptions about data management and data access that privilege AIPs which may often be institutions and corporate bodies rather than individuals.
- The lack of explanation of data sources also leads to a non-nuanced examination of what Non Personal Data (NPD) means and where it is likely to stem from. Consequently, the policy does not attend to the de-anonymization risks that NPD poses for Data Principals, or to how Data Principals will access their data in future. NPD fundamentally flows from personal data, and needs stronger systems and processes for storage, consent and access.
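The concern that NPD flows from personal data can be illustrated with a minimal sketch. The records, field names and values below are invented for illustration and do not come from the draft policy. The sketch assumes a dataset from which names have been removed but in which village, crop and landholding size are retained, and counts how many records are unique on that combination. A unique record can be linked back to an individual by anyone holding a secondary dataset with the same fields.

```python
from collections import Counter

# Hypothetical "non-personal" records: names removed, but village, crop and
# landholding size (quasi-identifiers) retained. All values are invented.
records = [
    {"village": "A", "crop": "paddy", "acres": 2},
    {"village": "A", "crop": "paddy", "acres": 2},
    {"village": "A", "crop": "cotton", "acres": 5},
    {"village": "B", "crop": "paddy", "acres": 1},
    {"village": "B", "crop": "paddy", "acres": 1},
    {"village": "B", "crop": "chilli", "acres": 3},
]

QUASI_IDENTIFIERS = ("village", "crop", "acres")


def group_sizes(rows, keys):
    """Count how many rows share each combination of quasi-identifier values."""
    return Counter(tuple(row[k] for k in keys) for row in rows)


def k_anonymity(rows, keys):
    """Smallest group size; k == 1 means at least one row is unique."""
    return min(group_sizes(rows, keys).values())


def unique_rows(rows, keys):
    """Rows whose quasi-identifier combination appears exactly once."""
    sizes = group_sizes(rows, keys)
    return [row for row in rows if sizes[tuple(row[k] for k in keys)] == 1]


k = k_anonymity(records, QUASI_IDENTIFIERS)
at_risk = unique_rows(records, QUASI_IDENTIFIERS)
print(f"k = {k}; {len(at_risk)} of {len(records)} rows are unique")
```

In this invented example, the only cotton grower in village A and the only chilli grower in village B are unique even though no names are stored. A check of this kind is only a first-pass indicator of re-identification risk, not a guarantee of anonymity.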
| Sr. No | Heading, Clause No., Page No. | Suggestions or Objections | Proposed Amendment |
|---|---|---|---|
| 1 | Principles of ADMP, Clause No. 4(1) (Notice), Page No. 6 | Suggestions: 1. Notices should be issued in as many regional languages as possible, instead of only in English. 2. Data Principals should be given at least one month to respond to notices. 3. Notices should be issued not just online, but also through other mediums such as print, local governance and village panchayat bodies, and as many other avenues as possible through which Data Principals can be reached and notified. | The draft policy should be explicit about: 1. Publishing notices in regional languages. 2. Making notices available via multiple mediums. 3. Specifying a notice and response period of at least one month for Data Principals. |
| 2 | Principles of ADMP, Clause No. 4(2) (Consent based processing and sharing of personal data), Page No. 6 | Suggestions: Assuming that Data Principals are individuals who, in many cases, may not be conversant or literate in English, the following is recommended: 1. Data Principals should be adequately informed and educated about how their personal data will be processed and shared, and the implications of different scenarios. 2. Data Principals should also be educated about the rights they have over their personal data, and what rights of refusal they have regarding collection and processing of their personal data. All this needs to be done in languages and mediums that Data Principals are comfortable and conversant with. | The policy must be clear about the nuances, and the resulting mandates and requirements, both when Data Principals are individuals and when Data Principals are collectives, groups and organizations. |
| 3 | Principles of ADMP, Clause No. 4(5) (Responsibility and accountability in processing agricultural data), Page No. 7 | Suggestions: 1. If the AIP is an individual, it is unclear how they are expected to appoint a compliance officer on their behalf. 2. Appointing compliance officers to fulfill the obligations of this policy may not be feasible for all AIPs and AISs, especially if these entities are individuals and small organizations. This adds to costs of compliance. | The policy must specify how data sharing and processing can take place when AIUs and AIPs are individuals and smaller entities who cannot appoint compliance officers to act on their behalf. There may also be capacity gaps, in that compliance officers who understand privacy in the context of agri-tech may not be readily available for hire or contracting. The policy needs to take cognizance of this gap, and provide general guidelines and directions rather than procedural details of compliance. |
| 4 | Principles of ADMP, Clause No. 4(7) (Consent based processing and sharing of personal data), Page No. 7 | Suggestions: 1. Data Principals should have access to their data at all times. Access cannot depend on sending written requests, especially if Data Principals are illiterate or are unable to communicate in written form owing to disabilities[1]. 2. Disclosure of use of the Data Principal’s data should always be proactive. The current policy requires the Data Principal to be vigilant about use of their data instead of providing proactive disclosures, whereas the Data Principal should be placed at the heart and center of this policy. | The policy guidelines and mandates should place the Data Principal at the heart and center, instead of focusing on organizations and giving primacy to the data collection process over the rights of the Data Principals. |
| 5 | Principles of ADMP, Clause No. 4(8) (Technological and operational safeguards for data security and privacy), Page No. 7 | Objection: The policy fails to take into account that the majority of the actors do not have data security and privacy-by-design practices in place. In the absence of these safeguards, AIS and DSPs will likely skirt the regulations and operate by paying fines, thereby leading to compliance as a checkbox rather than building the ability to genuinely protect users’ data. | |
| 6 | Principles of ADMP, Annexure II linked to Clause No. 7, Page No. 15 | Suggestions: 1. Where data will be used for General Agricultural Services, such as Innovation, there is a risk of adversarial attacks if large datasets from production are used for innovation and testing, especially when running experiments with AI/ML[2]. Therefore, stringent guidelines should be issued on how datasets will be used for testing and innovation. 2. Annexure II also gives the impression that large corporations and entities will benefit from use of agri-data. There is no clear indication of how this policy will benefit individuals, smaller collectives and small entities. The flow of data is from the bottom to the top, but the benefits of this data sharing don’t appear to flow from top to bottom. | 1. Place Data Principals, including individuals and collectives, at the center of this policy. 2. Include provisions and clarity on how Data Principals stand to benefit from data sharing under this policy. |
| 7 | Principles of ADMP, Clause No. 9(6), Page No. 10 - on de-identification and anonymization. | Objection: The division between personal and non-personal data is assumed to be straightforward, whereas the category of non-personal data is complex, and is not simply “that data which is not personal”. In stating this objection, it is implied that Non Personal Data (NPD), especially in the context of agriculture, can reveal a great deal about communities, including giving indications of geography and identities[3]. | Recognize the definite privacy risks - with de-anonymization and collection of secondary datasets that could lead to re-identification of individuals. |
| 8 | Principles of ADMP, Clause No. 9(6), Page No. 10 - on de-identification and anonymization. | Suggestion: Data governance frameworks adopted by different organizations - under the policy - can be shared with the long-term goal of building shared resources on how to govern the data better, and implement Privacy by Design principles in agri-tech services and applications. This can help in building a community of technology and design practitioners in the agri sector. | |
In summary, it is unclear who owns the data and therefore who will provide the data for the agriculture data management policy. The policy largely refers to data collection, without specifying where the data will originate from and who are the sources and owners of this data.
The policy is also very unclear about the Data Principal, and what it means for a farmer or a landowner to be a Data Principal. Therefore, there are concerns whether these Data Principals will be treated fairly under this policy, and how individual farmers and landowners as Data Principals will have access to their data and notifications around their data - and if this access will be in regional and local languages, and mediums easily accessible to them.
Farmers - and the nuances of their location in society and economy - need to be considered carefully when conceptualizing personal and non-personal data. Farmer data cannot be conceptualized simply as land records data and credit history data because tenancies and land ownership in agriculture are complex. Tenants are often not recognized and therefore cannot be “classified and categorized” in a straightforward manner in a digital database or under a data management policy. The policy must consider whether such data management will lead to exclusion of tenant farmers, and render them even more marginal.
Privacy Mode is a platform for practitioners - individuals, collectives and organizations - who are thinking about and/or implementing data privacy in their domains and verticals. Privacy Mode:
- Discovers interesting, novel ideas and approaches to solve for privacy, identity and such complex challenges that are emerging in an increasingly digital society.
- Provides a forum to share these ideas - via talks, policy reviews, Fellowships to document best practices, and such activities.
- Elevates ideas and practitioners by putting them before an audience and/or distributing their work to a wider audience of peers. The Fellowship is one such project. Other works include community responses to recent policies such as the CERT-In directives, IT Rules Amendments, etc.
- Assesses education, capacity building and skill validation needs that tech practitioners from SMEs and practice-based communities have around data privacy and data security.
Privacy Mode is actively engaged with the following verticals:
- Data governance
- Health - from data, technology and policy perspectives
- Digital IDs and identity systems - from the point of view of privacy and risk mitigation
- Internet governance
Hasgeek is a platform for building communities. Hasgeek believes that effective and sustainable communities are built in a modular manner, and with an underlying layer of infrastructure and services that enable communities to focus on the core of their work. Hasgeek provides this infrastructure, and the capabilities for communities to amplify their work and presence.
[1] In an analysis of AgriStack and how its design needs to be conceptualized from a farmer’s point of view, it was recommended that farmers’ data should be encrypted and that farmers alone should hold the encryption keys needed to access the data. This analysis is published at http://www.kisanswaraj.in/wp-content/uploads/FINAL-Response-to-IDEA-Consultation-Paper-with-signatories.pdf
[2] For more information about adversarial attacks and how the threats can be mitigated when running experiments using ML data, see https://hasgeek.com/fifthelephant/mlops-conference-july-2021/updates/summary-of-ama-on-privacy-attacks-in-ml-systems-wi-Vg4FQJPbcLhu4cZG3o47mD
[3] In a public discussion on Non Personal Data (NPD), held in January 2021, a data scientist and public interest technologist mentioned that data resides on a spectrum and that, outside of limited high-science concerns like astronomy or chemical reactions, “All data has very human connections. Care must be exercised when dealing with it in an abstract NPD or PII manner.” Noting that even robust systems of anonymization, such as differential privacy, can be degraded or weakened with additional or secondary datasets, the data scientist said that there needs to be a “risk” card associated with data collection, and that more people need to be made aware of it. Source: https://hasgeek.com/PrivacyMode/non-personal-data/sub/summary-of-panel-discussion-persisting-privacy-con-XX2ycnmWH2UE14c47zNDrk