Open Data Institute seeks to define responsible data stewardship

Report is latest attempt to develop a systemic approach to handling data.

The Open Data Institute has published a report on research aimed at defining responsible data stewardship.

It says: “The right kind of access to data is vital in tackling the big challenges we face in society – from the earlier detection and treatment of disease to reducing pollution in urban spaces. However, data and related technologies can also cause harm, including through automating decisions that need a human touch, or embedding existing biases and inequities.”

The ODI reckons data is too often used inconsistently and with varying depth of thought, and that its meaning and interpretation differ significantly based on geography and culture. It therefore set out to develop a more tangible definition of good data stewardship, cutting through the often anecdotal understanding of the subject.

Data science and social impact

It carried out the research alongside the Patrick J McGovern Foundation, an organization dedicated to “bridging the frontiers of artificial intelligence, data science, and social impact”.

The conclusion reached was that responsible data stewardship should be “an iterative, systemic process of ensuring that data is collected, used and shared for public benefit, mitigating the ways that data can produce harm, and addressing how it can redress structural inequalities”.

The report, Responsible data stewardship, sets out what the key terms within that definition mean.

  • Iterative – Responsible stewardship of data is too often treated as a box-ticking exercise. It needs to be a proactive process that consistently engages with examples of responsible practice from a variety of contexts, which means committing staff time and financial support to embedding good practices. Success should be judged on both activity and outcomes.
  • Systemic – “The impacts of data collection and use are rarely fully within the control of any one organisation. Organisations need to develop a systemic view of their data practices that links how choices made around data have impacts outside of the organisation,” says the report. This means there is a need to understand the data ecosystems in which organizations operate, and the interests, power and vulnerability of other actors.
  • Public benefit – Those dealing with data should make every effort to ensure it is used and shared for the benefit of others, not just the organization that holds it. This involves being proactive in exploring how data can be used to have a positive impact, and in involving and empowering data subjects and other stakeholders.
  • Harm – Conversely, responsible stewardship also involves identifying and reducing harmful impacts: for example, ensuring there are effective security systems to stop data falling into the hands of bad actors, or considering how data could be used to create products with harmful unintended consequences. This means going beyond legal requirements around privacy, security and transparency.
  • Redress structural inequalities – “The collection, use and sharing of data always occurs within a wider system of relationships, value exchanges and power imbalances. These have real-world consequences for data,” says the ODI. Data stewards therefore need to consider fundamental questions, such as whether data should be collected at all in certain circumstances. Consideration also needs to be given to new forms of communication and to alternative forms of governance.

The report notes that “the language of responsibility is used by lots of different actors” but that “there is no consistent definition of responsibility in regards to data. Instead, there are clusters of ideas around what it entails”.

Ethical concerns

Many use the terms ‘responsible’ and ‘ethical’ interchangeably, while some see ethics as a specific concern in a broader spectrum of responsibility. Others see responsibility as referring to an accessible way of thinking about data practices, or as simply adhering to a set of principles.

The issue with grounding responsibility in subjective concepts is that geographical and cultural context comes into play. One of the practitioners interviewed for the report said: “We have to understand people’s individual differences or socio-cultural values in which we are talking about responsibility. We have to consider people’s expectations … to understand people’s fears and concerns. The fears in Europe might not be the fears in Africa.”

Similarly, concepts such as privacy can be approached from a number of different angles. One interviewee said: “the framing of privacy over the years has been heavily influenced by GDPR but has taken a very individual approach like individual privacy rather than thinking of collective notions of things like privacy”.

The ODI also observes that “the language of responsibility has been adopted by large technology companies”, and that this is particularly the case “among the recent progress and discourse around large scale language models (LLM) and generative AI”. A number of the people interviewed warned of the dangers of ‘responsibility-washing’ in this context.

The full report is available on the ODI website.