Making Twitter History: Some Methods to the Madness

Most historians recognize the need to embrace digital technology within the field. No doubt there will always be a few who are either resistant to change or simply more comfortable working through long-established methods. Yet increasing numbers endeavor to push forward. The use of professional networking sites (LinkedIn, for example) is already so common that young scholars who opt out might risk not being considered for some positions. A few have pushed the boundaries of online networking and engagement to social media sites like Twitter.

So what is a “twitterstorian”? Here at Michigan State University, we have two excellent examples in Dr. Peter Alegi and Dr. Jessica Johnson. Both professors use the site to promote scholarship in their respective fields and to network and collaborate with other scholars. Their level of activity, the quality of their content and the number of people they reach make them twitterstorians par excellence, in my humble opinion. But there are even more ways in which historians can use Twitter. Tweets themselves are records, and the archives that people produce when they tweet are valuable resources that historians can mine. Once collected, these tweets can be used just like any other document.



The great part about using tweets as a source is that there are a lot of them; this is also one of the challenges. Given the huge amount of data, it is virtually impossible to engage with Twitter as a source without using some kind of outside technology. I used Tweet Archivist. This program uses search terms to create a repository of tweets, so the results depend on the terms chosen. I used two: "RIP Mandela" and "Madiba." Other users might find it useful to search for specific hashtags, but this will narrow results or create duplicate entries. I began collection on February 15th and ended on April 5th. Every tweet that contained the words RIP and Mandela found its way into my first archive, and every tweet containing the word Madiba found its way into the second. This means that the second data set contained a few entries about people who are not Nelson Mandela but go by Madiba. Researchers should try to be as precise as possible when selecting their search terms and setting their collection dates. Otherwise their data set might get bogged down with unrelated tweets.
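The matching rule described above can be sketched in a few lines of Python. This is only an illustration of the logic, not Tweet Archivist's actual implementation: the function names, the dictionary layout of a tweet, the sample tweets and the year 2014 are all assumptions made for the example (the collection dates above give only the day and month).

```python
import re
from datetime import date

def matches_all(text, words):
    """Return True if every word appears in the text as a whole word, ignoring case."""
    return all(
        re.search(rf"\b{re.escape(word)}\b", text, re.IGNORECASE)
        for word in words
    )

def collect(tweets, words, start, end):
    """Keep tweets dated within [start, end] whose text contains every search word."""
    return [
        tweet for tweet in tweets
        if start <= tweet["date"] <= end and matches_all(tweet["text"], words)
    ]

# Made-up tweets for illustration only; the year is an assumption.
sample = [
    {"date": date(2014, 2, 20), "text": "RIP Nelson Mandela, you changed the world"},
    {"date": date(2014, 3, 1), "text": "Thinking of Madiba today"},
    {"date": date(2014, 3, 2), "text": "Great goal by Madiba in the local league!"},
    {"date": date(2014, 5, 1), "text": "RIP Mandela"},  # falls outside the window
]

start, end = date(2014, 2, 15), date(2014, 4, 5)
rip_archive = collect(sample, ["RIP", "Mandela"], start, end)
madiba_archive = collect(sample, ["Madiba"], start, end)
```

In this toy sample the Madiba archive also picks up the tweet about a footballer nicknamed Madiba, which is the kind of unrelated entry that a precise search term helps to avoid.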

Other than actually collecting the data, the Tweet Archivist tool offers relatively little to historians. The reports that the site generates are mainly meant to answer sociological and marketing questions. The user can see where people tend to tweet about a term, what hashtags they use and what the most common words in the text are. To analyze the text effectively, historians using Twitter will have to turn to other tools. Tweet Archivist does offer the information to subscribers through PDF and XLS downloads. I recommend that users frequently download XLS spreadsheets of their data to analyze and manipulate off site.
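Once the data is off site, even the hashtag report can be reproduced in a few lines. The sketch below assumes the spreadsheet has been re-saved as CSV and that the tweet text sits in a column named "text"; both the column name and the sample rows are hypothetical, so adjust them to match the actual export.

```python
import csv
import io
import re
from collections import Counter

def count_hashtags(rows, column="text"):
    """Count hashtags across all rows, treating #Madiba and #madiba as the same tag."""
    counts = Counter()
    for row in rows:
        counts.update(tag.lower() for tag in re.findall(r"#\w+", row[column]))
    return counts

# Stand-in for an exported file; with a real export, use open("export.csv", newline="").
export = io.StringIO(
    "user,text\n"
    "a,RIP Mandela #Madiba #RIPMandela\n"
    "b,Remembering #madiba today\n"
    "c,No hashtags here\n"
)

rows = list(csv.DictReader(export))
hashtag_counts = count_hashtags(rows)
print(hashtag_counts.most_common(5))
```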

Working in a spreadsheet allows users who are more adept at Excel to quantify their data. However, this process cannot replace actually reading the tweets. For the laborious task of reading and organizing tweets, I highly recommend Google Refine. Google Refine is a tool used to clean messy data in spreadsheets. It includes a text filter and a text facet function. The text filter simply lets the researcher sift through all of the tweets that mention a given word or series of words. For example, if a researcher working on African teams in the 2014 World Cup used Algeria and Cameroon as her search terms, she could then use the text filter to separate the tweets that mention each country individually. The text facet tool lists the labels that the researcher has assigned to tweets in a pane to the left of the screen and brings up the group of tweets under each one. Once all of the tweets are labeled, the researcher can examine clusters of tweets. Again, for a match between Algeria and Cameroon, the researcher can cluster the tweets under labels like “Algeria Fan” or “Cameroon Fan” and then examine how each set of fans wrote about the match.
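For readers who want to see the logic behind these two functions, here is a minimal Python sketch of the same idea. It does not use Google Refine itself; the function names, the sample tweets and the "Neutral" label are invented for illustration, and the labels stand in for a column that the researcher has filled in by hand.

```python
from collections import defaultdict

def text_filter(tweets, term):
    """Return the tweets whose text mentions the term, ignoring case."""
    return [tweet for tweet in tweets if term.lower() in tweet["text"].lower()]

def text_facet(tweets, column="label"):
    """Group tweet texts by the value the researcher entered in the label column."""
    groups = defaultdict(list)
    for tweet in tweets:
        groups[tweet[column]].append(tweet["text"])
    return dict(groups)

# Made-up tweets with hand-assigned labels.
tweets = [
    {"text": "Allez Algeria! What a game", "label": "Algeria Fan"},
    {"text": "Cameroon deserved better tonight", "label": "Cameroon Fan"},
    {"text": "Algeria and Cameroon both played well", "label": "Neutral"},
]

mentions_algeria = text_filter(tweets, "Algeria")
mentions_cameroon = text_filter(tweets, "Cameroon")
# Tweets that mention Algeria but not Cameroon.
algeria_only = [t for t in mentions_algeria if t not in mentions_cameroon]

facets = text_facet(tweets)
for label, texts in facets.items():
    print(label, len(texts))
```

The filter answers "which tweets mention this word?", while the facet answers "how do my labeled groups compare?", which is why the labeling has to happen first.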


Historians using Tweet Archivist and Google Refine, as I have, should be able to analyze and organize the texts of their tweets with these programs. Nevertheless, digital historians working with Twitter are limited by the fact that they have to begin their archive before the event that they wish to describe. On the other hand, historians are likely to find additional useful information that they did not plan to encounter. Twitter produces data at high volume, and the most obvious problem is that much of it is not very useful. But within these large data sets, historians can find useful information about events.


This site hosts two products that the method described above can yield. The first is an assessment of the text within a group of tweets. The research question was formed prior to collection, and I entered my analysis with a clear agenda. The second piece grew out of my analysis of the first data set. I did not intend to collect any data on Palestinian liberation, but my data set made it apparent that this was an important issue to South Africans tweeting about Nelson Mandela.

The Twitter data I collected served as a springboard, a point of departure that allowed me to locate other online sources and contextualize South Africa’s involvement in the international Boycott, Divestment and Sanctions campaign. Both pieces strive to be complete digital projects. All of the materials were found online, and all of the products, including the data sets used, are hosted online and are available for use. I used Twitter to contact Sarah Robinson, an activist who tweets about Palestine, and I have included her response to my query in its entirety.

The last part of my method is to write in a way that is open and accessible. Twitter is a public forum, and I am happy to make the information it produced more accessible. I hope that this project encourages historians to use Twitter as a source so that we might demand more access to this repository of the public consciousness in the future.

