The language that individuals use to express themselves contains rich psychological information. Recent advances in Natural Language Processing (NLP) and Deep Learning (DL), most notably transformers, have yielded large performance gains on tasks involving natural language understanding. However, these state-of-the-art methods have not yet been made easily accessible to psychology researchers, nor designed for human-level analyses. This tutorial introduces text (www.r-text.org), a new R package for analyzing and visualizing human language using transformers, the latest techniques from NLP and DL. The text package is both a modular solution for accessing state-of-the-art language models and an end-to-end solution catered to human-level analyses. Hence, text provides user-friendly functions tailored to testing hypotheses in the social sciences, for both relatively small and large datasets. This tutorial describes methods for analyzing text, providing functions with reliable defaults that can be used off the shelf, as well as a framework that advanced users can build on for novel pipelines. The reader learns about three core methods: textEmbed, to transform text into modern transformer-based word embeddings; textTrain and textPredict, to train predictive models with embeddings as input and to use those models for prediction; and textSimilarity, to compute semantic similarity scores between texts. The reader also learns about two extended methods: textSimilarityTest, to test the significance of differences in meaning between two sets of texts; and textProjection/textProjectionPlot, to examine and visualize text within the embedding space according to latent or specified construct dimensions (e.g., low to high rating-scale scores).
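A minimal sketch of the workflow the abstract describes, using the function names it lists (textEmbed, textTrain, textPredict, textSimilarity). The example data, the argument order, and the `$texts` element of the embedding object are illustrative assumptions about the package's interface, not a definitive usage guide; consult www.r-text.org for the actual API. Running it also requires the package's language-model backend to be installed.

```r
library(text)

# Hypothetical data: free-text responses paired with rating-scale scores
# (a real analysis would need many more observations to train a model).
responses <- c("I feel calm and content.",
               "I am worried about everything.",
               "Most days I am reasonably happy.")
ratings <- c(2, 9, 4)

# 1. Transform text into transformer-based word embeddings
#    (model choice is an assumption; the package supplies defaults).
embeddings <- textEmbed(responses)

# 2. Train a predictive model with embeddings as input, then use it
#    to predict rating-scale scores for new texts.
model <- textTrain(embeddings$texts, ratings)
new_embeddings <- textEmbed(c("I feel fine today."))
predictions <- textPredict(model, new_embeddings$texts)

# 3. Compute semantic similarity scores between two sets of texts.
similarity <- textSimilarity(embeddings$texts, new_embeddings$texts)
```

The same objects feed the extended methods: embeddings from textEmbed can be passed to textSimilarityTest to test differences in meaning between two sets of texts, or to textProjection/textProjectionPlot to visualize texts along construct dimensions.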