This blog post documents the content and structure of one of OEDP’s two pilot Environmental Data Labs, held on August 16, 2023. OEDP hosted these labs to inform our broader data stewardship work and the emerging model for Community Data Hubs (CDH).
The goal of this workshop was to produce a list of questions to help start conversations on using legal mechanisms to govern environmental data. This list is a resource for stakeholders who could be involved in those conversations, including community data stewards, interested community members, legal or intermediary organizations working to support data projects, and research collaborators. The list is also designed for data projects that want to reflect on whether and how to use legal mechanisms to support data sharing and management.
Participants in the Lab included lawyers, data scientists, and researchers focused on data infrastructures, technology, and public policy. Participants were led through three activities: 1) a brainstorming warm-up activity, 2) a question generation activity, and 3) a “choose your own adventure” activity based on previously provided data scenarios.
Part I. Brainstorming
The brainstorming activity was designed with the following premise: You are trying to understand how a community wants to use their data, and whether (and which) legal mechanisms would be suited to their goals. What do you need to know from a community to start that conversation? What questions would you ask? The responses could be broken up into thematic categories:
questions about users’ goals and challenges,
the status of the current data governance framework (roles of and benefits afforded to stakeholders, data accessibility, decision making powers/processes, types of data, etc.),
and community capacity (e.g., financial, technical, organizational, political).
See the full conglomeration of virtual sticky notes below.
Part II. Question generation
In the second and third activities, participants were split into two groups to focus on different data scenarios. The two data scenarios were based loosely on existing community science projects, but were modified to increase anonymity and provide more information on aspects of data access and sharing. The first data scenario is based on a community science project that collects water quality data at the Salton Sea, while the second data scenario focuses on a community science project in Chicago that collects photographs of neighborhood stormwater flooding.
The second activity focused on generating questions that a third party would need to ask the community holding the data in order to understand how legal mechanisms could be useful in that community's specific context. Thematic categories were provided to guide the conversation, including accessibility, collaboration, scope of data use, and privacy and consent. A number of common questions arose in both data scenarios, such as:
How will the data be housed, will that change over time, and how long should the information be kept?
With whom and how will you share your data?
Will the data be made publicly available? What interfaces will be used to enable access?
Salton Sea Scenario
In the Salton Sea data scenario, a group of community scientists collect, monitor, and store water quality data sampled from the inland sea, which is contaminated by multiple-point source pollution from agricultural runoff. This scenario focuses on how data, managed in a collaborative setting, can validate a community’s experience of pollution and communicate it to researchers and policymakers at both local and state levels.
Participants in this breakout room brought up questions such as:
With whom do farmers and residents currently share data, and with whom do they need to share data?
How are decisions made about the data, and who is involved in making those decisions?
Is there interest among collaborators in monetizing the data?
What kind of data products would be most compelling to policymakers?
Do potential sharing partners need to share underlying data, or are they more comfortable with sharing a specific data product?
How can we represent the views of the wider community through data sharing?
How have similar policies been informed by legal mechanisms in the past?
How can you structure sets of data agreements that may change over time?
For example, when an individual collects data, they might use a data sharing agreement to share that information with an organization that aggregates data (e.g., Zooniverse). What happens when that organization signs another data sharing agreement for data that includes this original dataset? What level of transparency does an individual contributor have access to in order to understand how their contributed data might be used in the future? In other words, how are the conditions of original data sharing agreements met as datasets are added to other repositories?
Chicago Flooding Scenario
In the Chicago data scenario, the data comprises photographs documenting stormwater flooding taken both in public rights-of-way and on private properties. Data stewards are similarly concerned with validating experiences and advancing advocacy through the governance and sharing of data. However, the risks associated with sharing sensitive data are more pronounced, even though the data is still needed for advocacy efforts with government or NGO partners.
In this conversation, the participants homed in on questions such as:
What are the levels of accessibility? Do you have to contribute to a dataset in order to access the data?
What interface will be provided for data access?
Are there limits to data anonymization (e.g., are photos still useful if you can’t tell where the photo was taken)?
What happens if data usage changes over time (e.g., there are collaborations with different partners or different users need access)?
Does similar information live anywhere else (e.g., flood maps)?
How long should this information be kept?
See a selection of sticky notes from the Chicago breakout below.
Part III. Choose your own adventure
In the third activity, the same groups moved on to a “choose your own adventure” activity in which participants chose one or two questions from the previous activities and worked through the potential answers and the legal mechanisms that could be useful in each case (e.g., contracts, data sharing agreements, data trusts, or licenses).
The group discussing the Salton Sea data scenario focused largely on the question: What kind of data products are most compelling to policymakers and what does the community need in order to share underlying data? Participants then asked: What data products do neighbors and farmers need in order to speak with various stakeholders such as researchers, government, or other farmers? Answers to these questions led to different legal mechanisms that could be used for different audiences: when sharing data with other community members, data sharing agreements or trust arrangements could be utilized, whereas a different kind of contractual agreement (e.g., a conditional use agreement) might be necessary for sharing beyond their community.
The group discussing the Chicago data scenario focused on two related questions: With whom would you like to share public right-of-way data? And with whom would you like to share photos of personal property? While time only allowed for participants to focus on the first question, they similarly divided the answers to the question based on the intended audience: researchers, policymakers, other communities, and the private sector. A recurring question for many participants was when and where a contract was needed, or whether data sharing agreements were enough to prevent harm. Additionally, for each audience, there was discussion of the extent to which data needed to be pre-processed before sharing, and how data stewards could tailor the data product to the intended audience.
Insights and questions to revisit during the CDH Co-Design Process
These labs inform OEDP’s broader data stewardship work, including our Community Data Hubs model design process. Insights and design considerations from this Lab that can be carried forward to influence that process include:
We don't yet have substantial evidence or examples of how different legal mechanisms have been applied with community-generated data. Collecting different use cases that illustrate how legal mechanisms can influence and interact with environmental data governance, as well as their broader impacts (or lack thereof), will be essential in understanding potential opportunities and challenges in this realm. We will consider using the CDH Resource Library (currently being designed) to create and compile these use case studies to support broader understanding.
In this space, there is a great opportunity to reduce redundancies between community data projects. While each community has specific data stewardship and governance priorities and challenges, we hope to build frameworks and tools, including the output of this Lab, that can speak to the essential legal considerations of environmental data governance.
Activities that allow community members to use a “choose-your-own-adventure” or flowchart exercise can be useful for brainstorming priority considerations and compatible legal tools.