Combatting COVID-19 misinformation with machine learning

This post was originally published by VB Staff at Venture Beat

Presented by AWS Machine Learning


Misinformation around COVID-19 is driving human behavior across the world. Here in the information age, sensationalized clickbait headlines are crowding out actual fact-based content, and as a result, misinformation spreads virally. Conversations within small communities become the epicenter of false information, and that misinformation spreads as people talk, both online and off. As the number of misinformed people grows, so does this “infodemic.”

The spread of misinformation around COVID-19 is especially problematic, because it could overshadow the key messaging around safety measures from public health and government officials.

In an effort to counter misinformed narratives in Central and West Africa, Novetta Mission Analytics (NMA) is working with Africa CDC (Africa Centres for Disease Control and Prevention) to discover and identify narratives and behavior patterns around the disease, says David Cyprian, product owner at Novetta. And machine learning is key.

NMA supplies data that measures the acceptability, impact, and effectiveness of public health and social measures. In turn, Africa CDC’s analysis of that data enables it to generate tailored guidelines for each country.

“With all these different narratives out there, we can use machine learning to quantify which ones are really affecting the largest population,” Cyprian explains. “We uncover how quickly these things are spreading, how many people are talking about the issues, and whether anyone is actually criticizing the misinformation itself.”
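As a rough illustration of the kind of measurement Cyprian describes, the sketch below counts, for each narrative, how many distinct people are posting about it, how fast its volume is growing, and what share of posts criticize it. This is not Novetta’s code: the `Post` fields, the `summarize` function, the narrative names, and the sample data are all hypothetical, and the sketch assumes each post has already been assigned to a narrative upstream.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Post:
    author: str
    day: int         # day index within the observation window
    narrative: str   # hypothetical narrative label assigned upstream
    critical: bool   # True if the post pushes back on the narrative


def summarize(posts, split_day):
    """Per narrative: distinct authors, volume growth across split_day,
    and the share of posts that criticize the narrative."""
    stats = defaultdict(lambda: {"authors": set(), "early": 0, "late": 0, "critical": 0})
    for p in posts:
        s = stats[p.narrative]
        s["authors"].add(p.author)
        s["late" if p.day >= split_day else "early"] += 1
        s["critical"] += int(p.critical)

    summary = {}
    for name, s in stats.items():
        total = s["early"] + s["late"]
        summary[name] = {
            "reach": len(s["authors"]),                 # how many people are talking
            "growth": s["late"] / max(s["early"], 1),   # late volume relative to early volume
            "critical_share": s["critical"] / total,    # how much pushback there is
        }
    return summary


# Invented sample data, for illustration only.
posts = [
    Post("a", 0, "herbal_remedy", False),
    Post("b", 1, "herbal_remedy", False),
    Post("c", 2, "herbal_remedy", False),
    Post("d", 2, "herbal_remedy", True),
    Post("a", 3, "herbal_remedy", False),
    Post("e", 0, "herd_immunity", False),
    Post("f", 3, "herd_immunity", False),
]
summary = summarize(posts, split_day=2)
```

In this toy data the herbal remedy narrative reaches more authors and grows faster than the herd immunity one, which is the kind of comparison that would help decide where to focus a response.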

NMA uncovered trending phrases that indicate worry around the disease, mistrust about official messaging, and criticisms of local measures to combat the disease. They found that herbal remedies are becoming popular, as is the idea of herd immunity.

“We know all of these different narratives are changing behavior,” Cyprian says. “They’re causing people to make decisions that make it more difficult for the COVID-19 response community to be effective and implement countermeasures that are going to mitigate the effects of the virus.”

To identify these narrative threads, Novetta ingests publicly available social media at scale and pairs it with a collection of domestic and international news media. They process and analyze that raw social and traditional media content in their ML platform built on AWS to identify where people are talking about these things, and where events are happening that drive the conversations. They also use natural language processing for directed sentiment analysis to discover whether narratives are being driven by mistrust of a local government entity, the West, or international organizations. The same analysis identifies influencers who are engendering a lot of positive sentiment among users and building trust.
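Directed sentiment analysis asks not just whether a post is positive or negative, but toward whom. The toy sketch below shows the idea with a keyword lexicon and a fixed window of words around each mention of a target. It is a simplification for illustration: the word lists, the `directed_sentiment` function, and the sample sentence are hypothetical, and a production system would more likely rely on a trained NLP model than on word matching.

```python
import re

# Hypothetical, tiny lexicons for illustration only.
POSITIVE = {"trust", "helpful", "effective", "support"}
NEGATIVE = {"lie", "lies", "corrupt", "distrust", "useless"}


def directed_sentiment(text, targets, window=3):
    """Score each target by the sentiment words found within `window`
    tokens of its mention. Targets must be single lowercase words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    scores = {t: 0 for t in targets}
    for i, tok in enumerate(tokens):
        if tok in scores:
            nearby = tokens[max(0, i - window): i + window + 1]
            positive_hits = sum(w in POSITIVE for w in nearby)
            negative_hits = sum(w in NEGATIVE for w in nearby)
            scores[tok] += positive_hits - negative_hits
    return scores


# Invented example sentence.
text = "The government lies about the lockdown, but I trust the doctors at the clinic."
scores = directed_sentiment(text, ["government", "doctors"])
```

Here the same sentence yields a negative score for "government" and a positive one for "doctors", which is the distinction a single overall sentiment score would miss.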

Pieces of content are tagged as positive or negative to local and global pandemic measures and public entities, creating small human-labeled data sets about specific micronarratives for specific populations that might be trading in misinformation.

By fusing rapid ingestion with a human labeling process of just a few hundred artifacts, they’re able to kick off machine learning and apply it to the scale of social media. This allows them to train separate models for specific narratives, rather than rely on a single learning model for every problem set.
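The workflow of labeling a small sample by hand and then letting a model label the rest can be sketched in a few lines. The example below trains a minimal multinomial Naive Bayes classifier on a handful of labeled posts and applies it to unlabeled ones. The choice of Naive Bayes, the label names, and the sample posts are assumptions made for illustration; the article does not say which models Novetta uses.

```python
import math
import re
from collections import Counter


def tokenize(text):
    return re.findall(r"[a-z']+", text.lower())


def train(labeled):
    """Fit a multinomial Naive Bayes model from (text, label) pairs."""
    word_counts = {}        # label -> Counter of tokens seen under that label
    doc_counts = Counter()  # label -> number of labeled documents
    for text, label in labeled:
        doc_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(tokenize(text))
    vocab = {w for counts in word_counts.values() for w in counts}
    return word_counts, doc_counts, vocab


def predict(model, text):
    """Return the label with the highest log-probability for the text."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, -math.inf
    for label, counts in word_counts.items():
        denom = sum(counts.values()) + len(vocab)  # Laplace (add-one) smoothing
        score = math.log(doc_counts[label] / total_docs)
        for tok in tokenize(text):
            if tok in vocab:  # words never seen in training are ignored
                score += math.log((counts[tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Invented stand-in for the small human-labeled set.
labeled = [
    ("masks protect our community", "supports_measures"),
    ("the clinic vaccine campaign is helping", "supports_measures"),
    ("lockdown is a government trick", "opposes_measures"),
    ("herbal tea cures it so skip the clinic", "opposes_measures"),
]
model = train(labeled)

# Invented stand-in for the much larger unlabeled stream.
unlabeled = ["masks are helping our town", "the lockdown is a trick"]
predictions = [predict(model, t) for t in unlabeled]
```

Training one such model per narrative, each from its own small labeled set, is one way to avoid a single model for every problem set.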

“We don’t have a one-size-fits-all approach,” says Cyprian. “We’re always tuning and researching accuracy for specific narratives, and then we’re able to provide large, near-real-time insights into how these narratives are propagating or spreading in the field.”

Built on AWS, their machine learning architecture allows their development team to focus on what they do well, which is developing new applications and widgets to analyze this data.

They don’t need to worry about server management or scaling, since that’s taken care of for them with Amazon EC2 and S3. Their microservices architecture uses some additional features that Amazon offers, particularly Amazon Elastic Kubernetes Service (EKS) to orchestrate their services, and Amazon Elastic Container Registry (ECR) to store images and run vulnerability testing before they deploy.

Novetta’s approach is cross-disciplinary, bringing in domain experts from the health field, media analysts, machine learning research engineers, and software developers. They work in small teams to solve problems together.

“In my experience, that’s been the best way for machine learning to make a practical difference,” he says. “I would just urge folks who are facing these similar difficult problems to enable their people to do what people do well, and then have the machine learning engineers help to harden, verify, and scale those efforts so you can bring countermeasures to bear quickly.”
