the study of human identity
using large datasets
and computational methods

Winners of The Jason Jeffrey Jones Award for Excellence in Ipseology Visualization:

Ina Krapp, 2022-03-23

This graph shows the most common tokens in the longitudinal sample of the dataset for the study of identity at scale that the Nebraska Literary Lab sentiment dictionary identified as non-neutral. It depicts their prevalence over time on a logarithmic scale, because a few terms, particularly “love”, were very widespread in the sample while most words were far less common. Terms that were frequent in 2015 usually remained common over time, with relatively few words entering or leaving the set of highly prevalent tokens depicted in the graph. Only non-neutral tokens with a prevalence of at least 100 are included, and all of those were associated with positive sentiment, so the sentiment scale shows the strength of the sentiment, ranging from weakly positive to very positive. While the most common terms are relatively generic, many of the more specific ones refer to professions or activities, such as “artist”, “director”, “teacher”, “producer”, “founder” and “professional”. This might hint at the importance of Twitter for self-promotion or, alternatively, reflect people’s tendency to see or present themselves in a very positive light. This would explain why negative terms, though present in the sample, do not reach high prevalence and are therefore not depicted in the graph. [1–5]

1. Jones JJ. A dataset for the study of identity at scale: Annual Prevalence of American Twitter Users with specified Token in their Profile Bio 2015-2020. PLoS ONE. 2021; 16:e0260185. Epub 2021/11/18. doi: 10.1371/journal.pone.0260185 PMID: 34793578.
2. Slowikowski K. ggrepel. Automatically Position Non-Overlapping Text Labels with 'ggplot2'. 2021.
3. Wickham H, Averick M, Bryan J, Chang W, McGowan L, François R, et al. Welcome to the Tidyverse. JOSS. 2019; 4:1686. doi: 10.21105/joss.01686.
4. R Core Team. R. A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing; 2021.
5. Jockers ML. Syuzhet. Extract Sentiment and Plot Arcs from Text. 2015.
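
The selection step described in the caption above (keep tokens that are non-neutral and have a prevalence of at least 100, then view prevalence on a logarithmic scale) can be sketched roughly as follows. This is only an illustration in Python, not the code behind the winning graph, which according to its references was produced in R. The records, the sentiment scores, and the function names `select_tokens` and `log_prevalence` are all hypothetical and invented for this example.

```python
import math

# Hypothetical toy records: (token, year, prevalence, sentiment score).
# All values are invented for illustration; none come from the dataset.
RECORDS = [
    ("love", 2015, 1000.0, 0.75),
    ("artist", 2015, 150.0, 0.5),
    ("the", 2015, 5000.0, 0.0),    # neutral token: dropped
    ("sad", 2015, 20.0, -0.5),     # non-neutral but below the threshold: dropped
    ("teacher", 2015, 100.0, 0.25),
]

MIN_PREVALENCE = 100  # threshold named in the caption


def select_tokens(records, min_prevalence=MIN_PREVALENCE):
    """Keep records whose sentiment score is non-zero (non-neutral)
    and whose prevalence is at least min_prevalence."""
    return [
        record
        for record in records
        if record[3] != 0 and record[2] >= min_prevalence
    ]


def log_prevalence(records):
    """Replace each prevalence value with its base-10 logarithm,
    mirroring a logarithmic axis."""
    return [
        (token, year, math.log10(prevalence), score)
        for token, year, prevalence, score in records
    ]


if __name__ == "__main__":
    for token, year, log_prev, score in log_prevalence(select_tokens(RECORDS)):
        print(f"{token:10s} {year} log10(prevalence)={log_prev:.2f} sentiment={score:+.2f}")
```

In this toy data the one negative token falls below the prevalence threshold, so only positive tokens survive the filter. That is the same pattern the caption reports for the real sample.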

Ipseology is the new study of identity. It is the investigation of ipseity: personal identity, selfhood and the essential elements of identity.

This website is Ipseology Central - a place to explore ipseology data, methods, visualizations and publications. It is maintained by Dr. Jason Jeffrey Jones, director of the Computational Social Science of Emerging Realities Group.