When I asked ChatGPT for a list of experts in cotton and water stewardship, the result consisted entirely of Western academics and climate NGOs. Another prompt, asking where its water-savings data was sourced, returned: “The training data is not evenly distributed globally. Indigenous, local, or unpublished farmer knowledge is under-represented.” (OpenAI, the US developer of ChatGPT, did not provide comment in time for publication.)
It’s not as simple as inviting traditional communities and Indigenous peoples to the table, either. Many don’t want their knowledge exploited by AI even if they are asked to participate. But if the underlying biases are left unaddressed, they have the potential to undercut progress toward sustainability as well as diversity and inclusion.
Who does AI benefit?
Taylor Sparklingeyes is a senior data sovereignty specialist (a field concerned with the collection, ownership, and use of data) for the environmental and community development consulting firm Shared Value Solutions, and a member of Goodfish Lake First Nation, part of Treaty 6 territory in Canada. After members of the Indigenous communities she works with began asking what AI is and whether they should be using it, Sparklingeyes enrolled in the Indigenous Pathfinders in AI program run by the Montreal AI research institute Mila, which is designed to empower First Nations, Inuit, and Métis participants to learn Indigenous-centered approaches to engaging with AI.
Sparklingeyes warns that the speed at which the technology is moving (it is the fastest-spreading technology in human history) risks the safety, security, and privacy of Indigenous communities being overlooked. “That’s one thing working with Indigenous communities: if you want to be a true ally, sometimes you have to not worry about time and expectations. It takes a long time to build those trusted relationships, and they should be the foundation of this work, whether it’s around the co-design of governance, of data, or what impact these systems will have on communities,” she says.
Some experts worry that AI’s bias is not only present but intentional. Deepak Varuvel Dennison, an AI researcher and PhD student at Cornell University, argues that AI platforms have a direct economic incentive to prioritize knowledge that reflects the majority of their paying user base over niche or underrepresented subjects. Reaffirming users’ biases is more likely to keep them on the platform, because their beliefs go unchallenged, and a user base concentrated in the Global North further fuels the “silicon gaze” and marginalizes Indigenous knowledge. “What’s economically valuable to the people in power gets promoted and [what isn’t] gets delegitimized,” Dennison says.
Reckoning with access
Complicating Indigenous representation in AI is a bigger question: whether traditional communities want the technology to have access to their data and insights at all. For many creators in the Global North, this is the first time they’ve reckoned with how their data is used and how to claw back ownership. For Indigenous communities, however, the fight for data sovereignty is nothing new.
“[Indigenous communities] all have unique experiences when it comes to the historical harms of knowledge and data extraction,” says Sparklingeyes, noting that many communities don’t even have access to data about themselves, because it is often extracted by force or under unfair and misleading terms. This data can range from maps to works of art, some of which may have been used to train AI if they appear online, in scientific journals, or in government databases, all of which are mined for training. The material is likely, however, to be removed from its original context and presented in Western papers and research materials, since freely accessible English-language research from high-income countries is over-represented, according to a ChatGPT context note on source materials.