Could machine learning help bring marginalized voices into historical archives?

This post was originally published by Kyle Wiggers at Venture Beat

Researchers at the Montreal AI Ethics Institute and Microsoft propose using machine learning to build comprehensive archives that could bridge gaps in cultural understanding, knowledge, and views. They assert that including more voices in archival processes — with the help of machine learning — can have positive effects on communities, particularly those archivists have historically marginalized.

Centralized and accessible directories of records often go unchallenged, even when they contain explicit or implicit biases. For instance, 10 years after South Africa ended years of white rule and racial segregation under apartheid, books used in the country’s schools still did not reflect the history people had experienced. Unfortunately, such archives have wide influence: we depend on them to craft public policies, preserve languages and culture, and shape self-identity, views, and values.

The study’s coauthors sought to explore how technology like AI can address challenges around community databases and archives and make them more widely useful. The team began by identifying areas where current archival practices fail to serve the needs and histories of underserved populations. They found that indigenous peoples, women, children, LGBTQIA2+, senior citizens, victims of genocides, racial minorities, cultural minorities, military veterans, and disabled populations often fall victim to oversights on the part of archival tools and historians.

“Vocal minorities continue to be less discoverable online and in part due to skews in the automated archiving process toward a biased and narrow subset of content creators, who know how to gamify online algorithms and increase their content’s visibility online,” the researchers wrote. “This skew in content discoverability has dramatic implications for what the systems identify as high-value archives.”

This is where thoughtfully applied AI comes into play. The coauthors say it stands to maximize the diversity of viewpoints within archives, for example by uncovering content beyond what indexes well on the internet and by enhancing the discoverability of low-visibility communities that self-document. AI chatbots could interact with knowledge seekers to bolster their ability to discover relevant artifacts, while at the same time helping excluded people develop better digital literacy skills and exposing them to diverse historical perspectives.

The coauthors don’t address the potential for bias within these systems themselves. In the nonprofit Partnership on AI’s first-ever research report last April, the team characterized AI now in use as unfit to automate the pretrial bail process, label people as high risk, or declare others low risk and fit for release from prison. Other ill-fated experiments to predict things like GPA, grit, eviction, job training, layoffs, and material hardship reveal the prejudicial nature of AI algorithms. A recent study that attempted to use AI to predict which college students would fail physics classes was less accurate for women.

Despite these concerns, the researchers maintain a positive view of AI and its potential to “provide a fuller picture” of cultures for those seeking to build a better understanding.

“Benefits of higher discoverability do not only accrue to marginalized communities; they also create positive knock-on effects for others who gain a better understanding of these cultures and are thus able to truly appreciate our shared cultural heritage in its entirety,” the coauthors wrote. “On the subject of comprehensiveness, collation of content from automated systems will enhance the available corpus in the archives … We find that modern AI-enabled approaches can create wider participation in shaping our shared cultural heritage while empowering minorities to have greater control over knowledge and artifacts that serve to represent their past and shape their present and future identities.”
