Frequency of use metrics for American English person descriptors: Extensions of Roivainen's internet search methodology
Date
2022-05-02
Authors
McDougald, Sarah
Condon, David M.
Publisher
PsyArXiv
Abstract
Personality traits are often measured using person-descriptive terms, but data on how frequently these terms are used in everyday language are limited. This project reports the relative frequency of usage for a large pool of American English terms (N = 18,240) using count estimates from search engine results and from books cataloged by Google. These estimates are based on the ngrams formed when each descriptor is combined with a common person-related noun (person, woman, man, girl, boy). Results for each noun form, along with a frequency index, are reported in an online database that can be sorted, searched, and downloaded. We report on associations among the different noun forms and data types, and propose recommendations for the use of these data in conjunction with other resources. In particular, we encourage collaborative approaches among research teams using large language models in psycholexical research related to personality structure.
Description
21 pages
Keywords
Language modeling, Ngrams, Personality descriptors, Personality structure, Psycholexical, Trait descriptive adjectives
Citation
Condon, D. M., & McDougald, S. (2022, May 3). Frequency of use metrics for American English person descriptors: Extensions of Roivainen's internet search methodology. https://doi.org/10.31234/osf.io/9gtj7