The work follows on from our successful collaboration with Nesta on the Political Futures Tracker, which analysed tweets in real-time in the run up to the UK General Election in 2015.
Unlike others, we do not try to predict the outcome of the referendum or answer the question of whether Twitter can be used as a substitute for opinion polls. Instead, our focus is on a more in-depth analysis of the referendum debate; the people and organisations who engage in those debates; what topics are discussed and opinion expressed, and who the top influencers are.
What does it do?
In more detail, the Brexit Analyser uses text analytics and opinion mining techniques from GATE, in order to identify tweets expressing voting intentions, the topics discussed within, and the sentiment expressed towards these topics. Watch this space!
The Data (So Far)We are collecting tweets based on a number of referendum related hashtags and keywords, such as #voteremain, #voteleave, #brexit, #eureferendum.
The volume of original tweets, replies, and re-tweets per day collected so far is shown below. On average, this is close to half a million tweets per day (480 thousand), which is 1.6 times the tweets on 26 March 2015 (300,000), when the Battle For Number 10 interviews took place, in the run up to the May 2015 General Elections.
Subsequent posts will examine the distribution of original tweets, re-tweets, and replies specifically in tweets expressing a remain/leave voting intention.
Hashtags: 1 million of those 1.9 million tweets contain at least one hashtag (i.e. 56.5% of all tweets have hashtags). If only original tweets are considered (i.e. all replies and retweets are excluded), then there are 319 thousand tweets with hashtags amongst the original 678 thousand tweets (i.e. 47% of original tweets are hashtag bearing).
Analysing hashtags used in a Twitter debate is interesting, because they indicate commonly discussed topics, stance taken towards the referendum, and also key influencers. As they are easy to search for, hashtags help Twitter users participate in online debates, including other users they are not directly connected to.
Below we show some common hashtags on June 16, 2016. As can be seen, most are associated directly with the referendum and voting intentions, while others refer to politicians, parties, media, places, and events:
URLs: Interestingly, amongst the 1.9 million tweets only 134 thousand contain a URL (i.e. only 7%). Amongst the 1.1 million re-tweets, 11% contain a URL, which indicates that tweets with URLs tend to be retweet more.
These low percentages suggest that the majority of tweets on the EU referendum are expressing opinions or addressing another user, rather than sharing information or providing external evidence.
@Mentions: Indeed, 90 thousand (13%) of the original 678 thousand tweets contain an username mention. The 50 most mentioned users in those tweets are shown below. The size of the user name indicates frequency, i.e. the larger the text the more frequently has this username been mentioned in tweets.
In subsequent posts we will provide information on the most frequently re-tweeted users and the most prolific Twitter users in the dataset.
So What Does This Tell Us?
Without a doubt, there is a heavy volume of tweets on the EU referendum, published daily. However, with only 6.8% of all tweets being replies and over 58% -- re-tweets, this resembles more an echo chamber, rather than a debate.
Pointers to external evidence/sources via URLs are scarce, as are user mentions. The most frequently mentioned users are predominantly media (e.g., BBC, Reuters, FT, the Sun, Huffington Post); politicians playing a prominent role in the campaign (e.g. David Cameron, Boris Johnson, Nigel Farage, Jeremy Corbyn); and campaign accounts created especially for the referendum (e.g. @StrongerIn, @Vote_Leave).