A team of researchers from Princeton University today detailed a new way of finding malicious troll accounts based on their behavior, using a machine learning tool trained on previous misinformation efforts by China, Russia, and Venezuela.

Experts hope software engineers will run with the idea and build a real-time monitoring system to expose foreign influence in American politics, possibly including campaigns that may surface during the U.S. presidential election in November.

A paper about the project, “Content-Based Features Predict Social Media Influence Operations,” is published in the journal Science Advances.

“Online influence campaigns have some inherent limits if we are paying attention,” said Jacob Shapiro, professor of politics and international affairs at the Princeton School of Public and International Affairs, speaking to Newsweek about the study.

“They either have to say things which aren’t being said, or say a lot more of something that’s already in the conversation than would normally happen.

“Our study shows that it’s possible to find both of those activities in ways that protect user privacy and do not involve any subtle intelligence tradecraft.”

He added: “This means that if our government wanted to, it could provide the American people with regular estimates of both how much the Chinese, Russians, and others are doing to influence our politics, as well as what topics and ideas they are pushing.”

The U.S. got a taste of how disruptive a major misinformation campaign can be during the 2016 presidential election, when Russia launched an “influence campaign” that was designed to “undermine public faith in the U.S. democratic process.”

U.S. intelligence determined that Russia had a “clear preference” for President Donald Trump and spewed disinformation across the internet using paid social media users and bots. More recently, social media firms have grappled with fake information about COVID-19.

The Princeton team says its tool identified patterns from earlier campaigns by analyzing posts to Twitter and Reddit, and the hyperlinks or URLs included in them.

The troll data comprised publicly available posts from Chinese, Russian, and Venezuelan trolls, totaling 8,000 accounts and 7.2 million posts from late 2015 through 2019.

It was then combined with a large dataset of posts by “politically engaged” and average users collected by NYU’s Center for Social Media and Politics (CSMaP).

“We couldn’t have conducted the analysis without that baseline comparison dataset of ordinary tweets,” said Joshua Tucker, professor of politics at New York University and co-director of CSMaP. “We used it to train the model to distinguish between tweets from coordinated influence campaigns and those from ordinary users.”
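The study’s code isn’t reproduced here, but the core setup the researchers describe — training a model on posts from known influence campaigns plus a baseline of ordinary posts, then asking it to label new posts — can be sketched roughly as follows. The toy posts, the TF-IDF features, and the logistic regression classifier are illustrative assumptions, not the authors’ actual feature set or model.

```python
# A minimal sketch of the classification task described above: learn to
# separate known troll posts from a baseline of ordinary posts using
# content features alone. Toy data and model choices are assumptions.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

# Hypothetical posts: label 1 = post from a known troll campaign,
# label 0 = ordinary baseline post (stand-ins for the real datasets).
posts = [
    "BREAKING: local paper exposes corruption, share before it's deleted!",
    "Everyone is ignoring this story about the election. Wake up!",
    "Watched the debate last night; both candidates seemed tired.",
    "City council meeting moved to Thursday, per the local paper.",
]
labels = [1, 1, 0, 0]

X_train, X_test, y_train, y_test = train_test_split(
    posts, labels, test_size=0.5, random_state=0, stratify=labels
)

# Plain word/bigram content features; the paper's feature set is richer.
vectorizer = TfidfVectorizer(ngram_range=(1, 2))
clf = LogisticRegression(max_iter=1000)
clf.fit(vectorizer.fit_transform(X_train), y_train)

print(classification_report(y_test, clf.predict(vectorizer.transform(X_test))))
```

With real data, the meaningful test is performance on held-out campaigns or later time periods rather than a random split, which is the kind of evaluation described next.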

The team said the question was simple: was it possible to tell whether a post was part of a troll influence campaign based on a comparison with ordinary content?

Ultimately, the researchers conducted 463 different tests, with the machine learning method running five prediction tasks across four influence campaigns.

The university said across “almost all of the 463 different tests it was clear which posts were and were not part of an influence operation, meaning that content-based features can indeed help find coordinated influence campaigns on social media.”
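The paper’s test harness isn’t published here, but a grid of that shape — for each campaign, train on one time slice and predict on the next — might be organized like the minimal sketch below. The campaign list, monthly slicing, and the `train_and_score` hook are assumptions for illustration, not a reconstruction of the actual 463-test design.

```python
def run_test_grid(data, campaigns, months, train_and_score):
    """data maps (campaign, month) -> a labeled dataset; returns a score per test.

    train_and_score is a hypothetical hook: it fits a model on the earlier
    slice and returns its accuracy (or a similar metric) on the later one.
    """
    results = {}
    for campaign in campaigns:
        # Train on each time slice, then evaluate on the one that follows it.
        for train_month, test_month in zip(months, months[1:]):
            results[(campaign, train_month, test_month)] = train_and_score(
                data[(campaign, train_month)], data[(campaign, test_month)]
            )
    return results
```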

What proved key to the tool, however, was analysis of “meta-content”: how the content of a post related to information available at the time it was posted. “I think the biggest surprise was how important context was,” Shapiro told Newsweek.

“Our co-author Meysam Alizadeh had the great idea of building features that captured how the content of a post compared to things everyone else was saying. That turned out to be very important. Lots of people link to local news sources... but when Russian trolls did so they tended to mention a different set of users than normal people.

“That discrepancy was necessary… they had to make associations that weren’t already part of the discussion to shift attitudes. But in doing so they revealed themselves.”
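A “meta-content” feature in this spirit can be illustrated with a small sketch: score how unusual a post’s @mentions are, given which users everyone else mentions when linking to the same domain. The rarity score (a smoothed negative log frequency) and the toy baseline are assumptions for illustration, not the features used in the paper.

```python
# A minimal sketch of one possible meta-content feature: how unusual are a
# post's @mentions compared with what ordinary users mention when linking
# to the same domain? Higher score = rarer, more troll-like associations.
import math
import re
from collections import Counter

MENTION = re.compile(r"@(\w+)")

def mention_rarity(post_text, domain, baseline_mentions):
    """Average smoothed negative log frequency of the post's @mentions."""
    counts = baseline_mentions.get(domain, Counter())
    total = sum(counts.values())
    mentions = MENTION.findall(post_text)
    if not mentions:
        return 0.0
    # Laplace smoothing keeps never-seen mentions from dividing by zero.
    return sum(
        -math.log((counts[m] + 1) / (total + 2)) for m in mentions
    ) / len(mentions)

# Hypothetical baseline: who ordinary users mention alongside a local paper.
baseline = {"localnews.example": Counter({"mayor_smith": 40, "citydesk": 25})}
print(mention_rarity("Read this! @mayor_smith", "localnews.example", baseline))
print(mention_rarity("Read this! @patriot_1776", "localnews.example", baseline))
```

In this toy example the never-seen mention scores far higher than the familiar one, which is the kind of discrepancy Shapiro describes: trolls must introduce associations that aren’t already part of the conversation, and in doing so they stand out.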

Some countries, mainly Russia and China, were able to make their content appear more organic, but Venezuelan trolls were easier to detect because they circulated content tied to only a limited set of people and topics, the team found.

“The most interesting thing we learned was how dynamic [campaigns] are,” Shapiro said when asked what was unique about prior campaigns targeting the U.S.

“Content was constantly changing, so there was no one signature. But we also saw that content was almost always different from normal users in some way. Differences with normal users varied over time and between campaigns, but were always weird.”

While it is clearly a starting point, the researchers stressed that the machine learning tool won’t solve the problem or find every troll campaign out there.

For now, it requires that someone has already found recent influence activity to “learn” from. Judging the size of an influence campaign in real time requires distinguishing participating accounts or content from normal users, Shapiro noted.

What it does suggest is that while there is no fixed set of characteristics that defines influence efforts, troll content is “almost always” detectable in some way.

“What these results show is that if investigations regularly find a share of what influence campaigns are doing, then it should be possible to provide the American people with near real-time estimates of how much content foreign nations are putting into our social media and what they are talking about when they do,” Shapiro said.

“Between the networks of independent fact checkers out there and the work companies like Twitter are doing we get a good sense of what state-led information campaigns are doing every few weeks. What is lacking is the engineering to build an operational system that learns from those examples to provide widespread public awareness.”

“We are actively looking for resources to build a real-time dashboard which leverages the techniques we developed to help inform the American people and others around the world how external actors are trying to shape their politics,” Shapiro added.

“[Americans] deserve to understand how much is being done by foreign countries to influence our politics,” read a comment attributed to the researcher in a media release from the university. “Results suggest providing that knowledge is technically feasible. What we currently lack is the political will and funding, and that is a travesty.”