Understanding the difference

Which words uniquely define Trump and Biden?

Both candidates and their campaigns claim that catastrophe will follow if the other is elected, and each focuses on distinct topics to make that claim. For instance, while Joe Biden states that democracy and decency are on the ballot, Trump suggests it is law and order. The question is: are their narratives sticking? Various natural language processing (NLP) approaches can be used to compare two text corpora, here what respondents recall about each candidate, and identify their differences.

To identify these differences, we measure the uniqueness of words: that is, we identify words that are more commonly used when talking about Trump than when talking about Biden. Using the weekly CNN/SSRS/S3MC survey responses, we construct two corpora: responses about what people have “read, seen or heard” about Trump, and responses about what they have “read, seen or heard” about Biden. A reliable measure of word uniqueness also requires a third, background corpus; here we use all responses. Having defined our corpora, we employ the approach introduced by Monroe et al. to find the words that uniquely define the two candidates according to our respondents.
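To make the idea concrete, here is a minimal Python sketch of one way such a score can be computed. It assumes the approach in question is the log-odds ratio with an informative Dirichlet prior described by Monroe et al., with the background corpus supplying the prior counts. The function names, the whitespace tokenizer, and the toy inputs in the usage note are illustrative assumptions, not the project's actual code or data.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def log_odds_z_scores(corpus_a, corpus_b, background):
    """Return {word: z-score}; positive values lean toward corpus_a.

    Assumption: the background corpus acts as the Dirichlet prior, so only
    words that appear in the background are scored.
    """
    counts_a = Counter(w for doc in corpus_a for w in tokenize(doc))
    counts_b = Counter(w for doc in corpus_b for w in tokenize(doc))
    prior = Counter(w for doc in background for w in tokenize(doc))

    n_a = sum(counts_a.values())
    n_b = sum(counts_b.values())
    n_prior = sum(prior.values())

    z_scores = {}
    for word, alpha in prior.items():
        y_a = counts_a[word]
        y_b = counts_b[word]
        # Log-odds of the word in each corpus, smoothed by the prior counts.
        log_odds_a = math.log((y_a + alpha) / (n_a + n_prior - y_a - alpha))
        log_odds_b = math.log((y_b + alpha) / (n_b + n_prior - y_b - alpha))
        delta = log_odds_a - log_odds_b
        # Approximate variance of the difference in log-odds.
        variance = 1.0 / (y_a + alpha) + 1.0 / (y_b + alpha)
        z_scores[word] = delta / math.sqrt(variance)
    return z_scores
```

Called as log_odds_z_scores(trump_responses, biden_responses, trump_responses + biden_responses), with each argument a list of response strings, the function returns positive scores for words that lean toward the first corpus and negative scores for words that lean toward the second. Dividing by the estimated standard deviation keeps very rare words from dominating the ranking simply because their raw ratios are extreme.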

In the graph below, the x-axis denotes the overall frequency of the word in the survey responses. The y-axis shows the z-score, which in this context indicates how strongly the word is associated with Trump (positive values) or Biden (negative values).

Frequency and z-scores of words used to describe Trump (positive z-scores) or Biden (negative z-scores)
July 5, 2020 – October 18, 2020

How does the party affect what people say about the candidates?

We can next use the same technique to identify the words that are unique to what Democrats and Republicans remember about the candidates. For each graph below, the two corpora compared are Republican and Democratic responses. The background corpus includes all responses (i.e., including those from Independents and others). The first chart shows this analysis for responses about Biden, and the second chart shows it for responses about Trump. A positive z-score here indicates words used more frequently by Republicans, while a negative z-score indicates words used more frequently by Democrats. For example, in the chart about Biden, the word “good” has a negative z-score, since a Democrat is more likely to use this word when describing Biden. In the second chart, however, the word “good” has a positive z-score, since a Republican is more likely to use this word when describing Trump.
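The corpus construction for the party comparison can be sketched as follows. This is only an illustration: the Response record, its field names, and the party labels are hypothetical, and the sketch assumes the background corpus is made up of all responses about the given candidate regardless of party.

```python
from collections import namedtuple

# Hypothetical record layout; the real survey data may be organized differently.
Response = namedtuple("Response", ["party", "candidate", "text"])


def build_party_corpora(responses, candidate):
    """Split responses about one candidate into the three corpora.

    Returns (republican_texts, democrat_texts, background_texts).
    Assumption: the background holds every response about the candidate,
    including those from Independents and other respondents.
    """
    about = [r for r in responses if r.candidate == candidate]
    republican = [r.text for r in about if r.party == "Republican"]
    democrat = [r.text for r in about if r.party == "Democrat"]
    background = [r.text for r in about]
    return republican, democrat, background
```

The first two lists are the corpora being compared, so a word's z-score is positive when it leans Republican and negative when it leans Democratic. Running the function once with "Biden" and once with "Trump" yields the inputs for the two charts.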

Frequency and z-scores of words used to describe Biden by Republicans (positive z-scores) or Democrats (negative z-scores)
July 5, 2020 – October 18, 2020

Frequency and z-scores of words used to describe Trump by Republicans (positive z-scores) or Democrats (negative z-scores)
July 5, 2020 – October 18, 2020

More information about the project methodology

The lead researcher on this specific analysis is Ceren Budak from the University of Michigan. To read a more detailed version of this methodology, visit her blog post about a previous text analysis here. The research team analyzing the results of this project consists of Lisa Singh and Jonathan Ladd from Georgetown University; Josh Pasek, Michael Traugott, Ceren Budak and Stuart Soroka from the University of Michigan; and Jennifer Agiesta and Grace Sparks from CNN.