A Case Study of Russian Trolls
While investigating the tweets surrounding an event involving Donald Trump, I discovered nine accounts that all appear to be centrally managed for a common purpose.
These accounts all responded contemporaneously to tweets surrounding a single event. I was tipped off that they might be trolls because of a few factors that struck me as out of place:
- They were all primarily Russian-language accounts, yet they were responding in English to threads involving Trump.
- Most of the suspicious accounts had been opened within a couple of months of each other, and most had gone inactive around the same time.
- Most of the accounts had a similar number of tweets.
- The accounts all leaned heavily on retweeted content.
- Outside of the threads related to this event, the accounts almost never used English.
On its own, this isn’t sufficient to declare them related accounts, trolls, or bots. The next step was to compare some basic information on their patterns of behavior.
Not much was revealed there, except that tweeting in English was relatively rare for all of these accounts, and they all made fairly heavy use of retweets. (Note that the discrepancy between reported and found tweets is an artifact of Twitter’s habit of counting every tweet a user has ever made, including those later deleted or reported. For pussynavalny, however, the difference arises from Twitter’s 3,200-tweet limit when pulling a user’s timeline through the API.)
With these additional details, the accounts continued to look suspiciously similar, but there was no obvious smoking gun, and their output was not as retweet-heavy as I’d originally thought. Time to turn to graphs. The tweets were sorted into seven categories: replies, retweets, and original content, each in either Russian or English (six categories), plus one more for anything in any other language. (I relied on Twitter’s own assessment of the language, part of the data structure returned when you request tweet information through the API.) I assigned a color to each category, grouped the tweets by week, and stacked them into a bar for each week, in the order the tweets occurred.
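A minimal sketch of this bucketing, assuming each tweet is a simplified dict with `lang`, `text`, `in_reply_to`, and `created_at` fields (hypothetical names standing in for the richer objects the Twitter API actually returns; this is an illustration, not my actual pipeline):

```python
from collections import Counter
from datetime import datetime

def categorize(tweet):
    """Assign one of the seven categories: {ru,en} x {reply,retweet,original},
    plus 'other' for any other language. Field names are assumptions."""
    lang = tweet.get("lang")
    if lang not in ("ru", "en"):
        return "other"
    if tweet["text"].startswith("RT @"):
        kind = "retweet"
    elif tweet.get("in_reply_to"):
        kind = "reply"
    else:
        kind = "original"
    return f"{lang}_{kind}"

def weekly_counts(tweets):
    """Group tweets by ISO (year, week) and count categories per week,
    ready to be drawn as one stacked bar per week."""
    weeks = {}
    for t in tweets:
        created = datetime.strptime(t["created_at"], "%Y-%m-%d")
        key = tuple(created.isocalendar())[:2]  # (ISO year, ISO week)
        weeks.setdefault(key, Counter())[categorize(t)] += 1
    return weeks
```

Each weekly `Counter` then maps directly onto one colored, stacked bar.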
The results are dramatic. Five of the accounts immediately looked incredibly similar. Two more accounts looked quite a lot like each other. Of the final two accounts that I believe are bots, pussynavalny shares some details with other accounts, in particular a series of retweets in April 2015 that matches franksinator30r. And tan_ef, a much thinner account than the rest, still seemed to share features with some of the other accounts. (For this graph, additional methods were used on the pussynavalny account to obtain tweets past the 3,200-tweet limit; however, that method does not capture retweets. Only 123 additional tweets were found, and the remaining discrepancy may indicate a large number of early retweets, which would match the IgorTuchkov and ivannbogor accounts. Also, the very large red spike in five of the accounts is actually much larger than shown; it was truncated so the rest of the graphs would fit.)
A closer examination revealed a great deal of shared content across these accounts. The following chart shows how much each account has in common with the others. The two numbers in each cell are the number of retweets the pair shares, followed by the number of original (non-reply) tweets that were completely identical.
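A pairwise comparison like this might be computed as follows, again assuming simplified tweet dicts with hypothetical `text` and `in_reply_to` fields (a sketch of the technique, not my exact code):

```python
from itertools import combinations

def shared_counts(accounts):
    """For each pair of accounts, count (shared retweets, identical
    original non-reply tweets). `accounts` maps a screen name to a
    list of simplified tweet dicts."""
    def split(tweets):
        # Separate retweet texts from original (non-reply) texts.
        rts = {t["text"] for t in tweets if t["text"].startswith("RT @")}
        originals = {t["text"] for t in tweets
                     if not t["text"].startswith("RT @")
                     and not t.get("in_reply_to")}
        return rts, originals

    sets = {name: split(tws) for name, tws in accounts.items()}
    table = {}
    for a, b in combinations(sorted(sets), 2):
        (rt_a, or_a), (rt_b, or_b) = sets[a], sets[b]
        # Set intersections give the two numbers in each cell.
        table[(a, b)] = (len(rt_a & rt_b), len(or_a & or_b))
    return table
```

The resulting dict holds one `(shared retweets, identical originals)` pair per account pair, which is exactly the shape of the chart above.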
Again, I turned to graphing. This time I used exactly the same method, except that every tweet whose content exactly matched that of at least one other account (both retweets and original text) was plotted in light grey.
At this point, there’s no reasonable possibility that these nine accounts are not strongly linked. While the amount of individuation varies (e.g. the pussynavalny account is much more “custom” than the others) it’s clear that much of their content is identical. Other analysis may be possible (tweeting program, times of day, hashtags in common, etc.) but right now it seems unnecessary. They are all clearly run by the same person, group of people, or organization. For what purpose?
“Though someone loved in Russia”
The event that first drew my attention to these accounts was a set of responses on Twitter to realdonaldtrump after he returned from his Moscow trip in November 2013 for the Miss Universe pageant — the trip in which the Steele dossier claims that the Russians obtained compromising videos of Trump in the Moscow Ritz Carlton. These Russian bot/troll accounts were all activated on this occasion to respond in English to threads initiated by Trump himself.
There were a handful of responses to a few threads; three of them are detailed below.
So we know these are a set of related troll accounts because of their strong similarity. The uneven relationships between the accounts suggest that they are part of some larger collection, activated in smaller groups as needed. The timing of the tweets during this event suggests they were all written by a single person, or by a small group working together, but not for any of the typical uses of bots and trolls (i.e. promoting a tweet or hashtag, or sowing discord in discussions).
The accounts are all interacting with Trump and a number of other users who are all Russian. In English. It’s reasonable to conclude that the purpose of the trolling was for Trump to see these tweets. All of this occurred before anyone except a small inner circle knew that Trump was likely to run for president. It was before any Russian collusion scandal related to the presidency. It was before there was even an opportunity for Russia to simply hope to create chaos and confusion. The conversations occurred before any public claim of “pee tapes”.
Most of the tweets seem completely innocuous, but others raise eyebrows, like “though someone loved in Russia”. Then there’s “beautiful photos on the memory,” which sounds fine, except no photos were mentioned in the conversation. “show was amazing”, “all feel pleased”, and “thx for coming in Moscow” are all open for varying interpretation.
If you put any weight on the claim from the Steele dossier that Russia obtained compromising material on Trump (e.g. a “pee tape”) while he was there, it’s hard to avoid the conclusion that these tweets were intended to notify Trump that he’d been compromised. The tweets were phrased in a way that would be utterly meaningless (at that time) to anyone who was not “in on it”.
Ultimately I can’t prove for sure why these tweets appeared. But it’s one more piece in a growing body of circumstantial evidence pointing toward validation of the most salacious claim in the Steele dossier. (And there’s quite a bit more of that stacking up — too much to get into here, but Seth Abramson did a good Twitter thread on the topic that covers the evidence through October 2017.)
Aside from adding a bit to the circumstantial case, it’s perhaps more meaningful in simply establishing Russian interest. In March of 2017, a few months after the Steele dossier was first published, Vladimir Putin publicly mocked the idea of obtaining kompromat on Trump in 2013:
‘He wasn’t a politician, we didn’t even know about his political ambitions,’ Putin said. ‘Do they think that our special services are hunting for every U.S. billionaire?’
And yet a collection of Russian troll accounts, all of which had been doing their troll work for more than a year, was activated on this occasion to send messages to Trump. Not to push a hashtag, or to promote Trump’s tweet with retweets or likes. But to communicate. Clearly, there were Russians who were interested in Trump at this time.
Better Tools for the Battle
Where to go from here? I only analyzed a handful of accounts. I didn’t analyze any of the Russian accounts that answered in Russian at that time, although it’s definitely worth a look. (Note that two accounts I did analyze were dropped as having no statistical connection to the above set of trolls.) I didn’t widen my search beyond these threads, although it’s possible there’s more there to unearth.
Twitter has the data and the capability to, relatively quickly, use my starting point and find a whole new collection of previously undetected Russian bot/troll accounts. This could lead to related accounts that were not activated for this event, but Twitter can also find more accounts that are linked to this event — accounts such as cupkupku, which is apparently already “shadow-banned”. Its tweets do not appear in the above threads, nor can they be found by searching. I know of this account only because of (unfortunately incomplete) notes I made months ago. Only Twitter can tell how much more there is here.
And this brings me to a gripe about Twitter’s practices. Twitter’s work cleaning up bots and trolls is hardly praiseworthy. They delete or shadow-ban accounts without telling anyone. By doing this, Twitter is effectively aiding and abetting international criminals who are attacking us. We simply have to trust Twitter to report information to the government, with no oversight. It would be far better to sequester the data but keep it publicly reviewable in some form, not just by law enforcement and government, but by the community that Twitter is supposed to be creating.
In light of recent allegations about Facebook covering up their own misdeeds this is particularly troubling. At best, without this information, the community does not have the tools we need to make effective personal decisions about how to interact, and for those of us attempting to do research, Twitter is in effect working against us, preventing us from finding what we need. At worst, they could be concealing their own complicity in Russian trolling.