Please cite as: 
Mõttus, R., Wood, D., Condon, D. M., Back, M. D., Baumert, A., Costantini, G., Epskamp, S., Greiff, S.,
Johnson, W., Lukaszewski, A., Murray, A., Revelle, W., Wright, A. G. C., Yarkoni, T., Ziegler, M., &
Zimmermann, J. (2020). Descriptive, Predictive and Explanatory Personality Research: Different Goals,
Different Approaches, but a Shared Need to Move beyond the Big Few Traits. European Journal of
Personality, 34, 1175–1201. https://journals.sagepub.com/doi/full/10.1002/per.2311
Descriptive, predictive and explanatory personality research: Different goals,
different approaches, but a shared need to move beyond the Big Few traits
RENÉ MÕTTUS1,2*, DUSTIN WOOD3, DAVID M. CONDON4, MITJA D. BACK5, ANNA BAUMERT6, GIULIO
COSTANTINI7, SACHA EPSKAMP8, SAMUEL GREIFF9, WENDY JOHNSON1, AARON LUKASZEWSKI10, AJA
MURRAY1, WILLIAM REVELLE11, AIDAN G. C. WRIGHT12, TAL YARKONI13, MATTHIAS ZIEGLER14, and
JOHANNES ZIMMERMANN15
1University of Edinburgh, UK
2University of Tartu, Estonia
3University of Alabama, USA
4University of Oregon, USA
5University of Münster, Germany
6Max Planck Institute for Research on Collective Goods, Bonn, and TUM School of Education, Germany
7University of Milan-Bicocca, Italy
8University of Amsterdam, Netherlands
9University of Luxembourg, Luxembourg
10California State University, Fullerton, USA
11Northwestern University, USA
12University of Pittsburgh, USA
13University of Texas at Austin, USA
14Humboldt Universität zu Berlin, Germany
15University of Kassel, Germany
Abstract: We argue that it is useful to distinguish between three key goals of personality science – description, prediction and explanation – and that attaining them often requires different priorities and methodological approaches. We put forward specific recommendations such as publishing findings with minimum a priori aggregation and exploring the limits of predictive models without being constrained by parsimony and intuitiveness but instead maximising out-of-sample predictive accuracy. We argue that naturally-occurring variance in many decontextualized and multi-determined constructs that interest personality scientists may not have individual causes, at least as this term is generally understood and in ways that are human-interpretable, never mind intervenable. If so, useful explanations are narratives that summarize many pieces of descriptive findings rather than models that target individual cause-effect associations. By meticulously studying specific and contextualized behaviours, thoughts, feelings and goals, however, individual causes of variance may ultimately be identifiable, although such causal explanations will likely be far more complex, phenomenon-specific and person-specific than anticipated thus far. Progress in all three areas – description, prediction, and explanation – requires higher-dimensional models than the currently-dominant “Big Few” and supplementing subjective trait-ratings with alternative sources of information such as informant-reports and behavioural measurements. Developing a new generation of psychometric tools thus provides many immediate research opportunities.
Keywords: prediction; explanation; cause; hierarchy; personality
*Correspondence to: René Mõttus, 7 George Square EH8 9JZ Edinburgh, Scotland; rene.mottus@ed.ac.uk
This manuscript is based on an Expert Meeting jointly supported by European Association of Personality Psychology and European Association of Psychological
Assessment, and held from 6th to 8th September 2018 in Edinburgh, Scotland (https://osf.io/fn5pw). Authors are grateful to Tom Booth, Jaime Derringer, Ryne
Sherman and David Stillwell for their contributions to the Expert Meeting, and to Cornelia Wrzus and Samuel Henry for their comments on the manuscript. Not all
authors agree with all arguments put forward in this paper.
Description, prediction and explanation 2
Personality psychology has come a long way in describing how people differ in thinking, feeling, behaving, and wanting. This has been facilitated by agreement among researchers on a limited number of broad personality dimensions, organizing research and allowing observations to accumulate. The largely overlapping Big Five (Goldberg, 1990), Five-Factor Model (FFM; McCrae & John, 1992), and HEXACO domains (Ashton & Lee, 2020) have been particularly instrumental broad personality constructs, so much so that they have become the default way of operationalizing personality differences among people; we refer to them as the Big Few.
Yet it is not evident that the Big Few “carve nature at its joints”. They are useful for conveniently summarizing a variety of ways in which people can differ with a manageable number of dimensions. But there is little evidence that they are particularly good units for explaining behaviour or psychological processes underlying it (Baumeister et al., 2007; Wood et al., 2015; Jonas & Markon, 2016) or even that they are the best predictors of real-world outcomes (Mõttus & Seeboth, 2018; Elleman et al., 2020). The Big Few were formed by combining subjective perceptions of traits1 that statistically co-vary among people rather than based on models of processes that happen in individuals. Currently we do not know of many genetic variants, neurobiological systems, experiences, or developmental processes that specifically contribute to variance in certain Big Few domains such as Extraversion or Conscientiousness and set them apart from other domains such as Openness and Honesty-Humility or from traits allegedly beyond the Big Few such as motives, beliefs, or abilities. Moreover, the domains partly overlap and can be combined into even broader ones (DeYoung, 2006), but also broken into numerous more specific traits (McCrae & Sutin, 2018).
None of this is necessarily a problem. But it means that the variance found in typical personality measures can be described as a hierarchy of traits, that there are few reasons to automatically prefer any one of its levels over others, and that the mechanisms of the variance can be highly multiply determined. Researchers are also increasingly considering processes and related variance within individuals, besides differences between individuals; it is a crucial question how these variation levels are connected or whether they can be addressed with the same statistical and/or theoretical models at all. Likewise, there may be personality variance both between and within individuals (e.g., behavioural frequencies or relationship dynamics) that is not captured in the subjective perceptions commonly used for personality assessment.
As a result, particular models of personality may work better for some purposes than others. This leads to a central idea of this special issue generally and this article specifically: as our knowledge of personality grows and research questions become increasingly diverse, it may no longer be optimal for researchers to coalesce around a single or even a few ways of operationalizing personality (e.g., the Big Few). We distinguish between three broad aims of personality research – description, prediction, and explanation – and argue that these aims may entail disparate and sometimes even opposing research strategies. We advocate for the explicit articulation of these aims when designing, conducting, and reporting the results of personality research rather than defaulting to research practices that are widely used but may in fact be suboptimal for any given research project. For example, we propose that:
● Descriptive findings should be published in as much detail as possible (e.g., at the individual item level) besides being organized (e.g., according to attributes such as the strength of relations or the psychological modalities of the characteristics involved) or aggregated into broader constructs such as the Big Few. This offers more flexibility than the common practice of a priori aggregating findings for simplicity.
● Although traits’ predictive validity is often seen as a major reason for doing personality research in the first place, its robustness and ways of maximising it remain under-explored. The availability of large datasets and advanced statistical tools is beginning to improve this. Predictive models should always be independently cross-validated and should not depend on parsimony or consistency with researchers’ theoretical intuitions.
● Many phenomena that interest personality scientists such as broad patterns of naturally occurring individual differences (e.g., constructs in the personality trait hierarchy) may not have individually tractable causes, at least as this term is typically understood and/or in ways that are meaningfully interpretable and allow for targeted interventions. This is because the phenomena are inherently decontextualized and relative, and their indistinguishable levels can arise through many combinations of processes and may not result from unidirectional cause-effect associations, among other reasons. When this applies, useful explanations may be narratives that integrate many pieces of descriptive findings into broad principles rather than attempts to identify individual and potentially intervenable cause-effect associations. If so, for example, individual regression coefficients provide poor causal explanations. However, by defocusing from broader variability patterns and meticulously studying specific and contextualized behaviours, thoughts, feelings and goals, individual causes of variance may ultimately be identifiable in useful and potentially even controllable ways. Still, such causal explanations may be more complex, phenomenon-specific and person-specific than anticipated thus far.
● Progress in all three areas – description, prediction, and explanation – will likely require availability of far higher-dimensional models based on traits much more

1 Here, we define traits similarly to Baumert and colleagues (2017): a trait is a descriptive dimension of any kind of relatively stable psychological and behavioural differences between people, independent of its content and breadth.

specific than the Big Few, as well as supplementing typical subjective trait-ratings with alternative sources of information such as informant-reports and behavioural measurements (Rauthmann, 2020). Therefore, an area with immediate and immense opportunities is developing a new generation of psychometric tools that allow sampling the persome – the universe of variables capturing personality variability – more broadly than currently available measures do.

Descriptive personality science
Descriptive personality research explores associations between the measurements of personality constructs and/or their links with phenomena allegedly beyond the personality domain (e.g., demographic characteristics, experiences, and behavioural outcomes). The results can and do contribute to explanatory or predictive research, but they are also important in their own right and should not be constrained by theoretical models (purview of explanatory research) or attempts to maximise prediction (aim of predictive research). For example, there is ample evidence that individual differences in personality characteristics can be clustered into replicable groups such as the Big Few (Schmitt et al., 2007), are relatively stable over several years (Terracciano et al., 2006), are persistently correlated with a variety of life outcomes (Roberts et al., 2007; Soto, 2019), and perceived at least somewhat similarly by different observers (Connelly & Ones, 2010). Genetically related individuals resemble each other in personality characteristics, accounting for most of the similarity of family members (Briley & Tucker-Drob, 2014), although the specific genetic variants correlated with the characteristics have remained elusive (Lo et al., 2017). Changes in personality characteristics barely track with specific life experiences (Bleidorn et al., 2020; Denissen et al., 2019), are similarly distributed across geographically diverse regions (Allik et al., 2017), but vary systematically across genders (Lee et al., 2020). Recently, research has also started to describe systematic patterns of short-term variations in personality as another aspect of individual differences (e.g., Danvers, Wundrack, & Mehl, 2020; Horstmann & Ziegler, 2020; Lazarus et al., 2020; Sosnowska et al., 2020).2

The trait hierarchy

Within the descriptive kind, a lot of research has been carried out on the relations between (subjectively perceived) trait scores with the aim to reduce personality variation among people to as few broad trait dimensions as possible. Summarizing variance with a small number of traits has been a practical approach, both in terms of data collection and reporting. For example, accessing sufficient participant numbers and tabulating data can be burdensome, especially when each trait is measured with numerous items, and analyzing many-dimensional data and communicating findings that involve numerous statistical associations may seem overwhelming.
But these difficulties have recently become less relevant. Technological progress has made accessing participants and collecting data much easier, with sample sizes now routinely in the thousands (Gosling & Mason, 2015). Self-report scales have turned out to be more reliable than previously thought, with their many-dimensionality often mistaken for measurement error (e.g., because internal consistency systematically underestimates reliability; Cronbach & Shavelson, 2004; McCrae, 2015). This allows us to measure a broader selection of narrower traits with the same number of carefully selected items, because fewer conceptually interchangeable items are required for each trait (McCrae & Mõttus, 2019; Wood, Nye, & Saucier, 2010; Yarkoni, 2010). Improved computational power and accessible data analytic tools have eased working with many-dimensional data to efficiently summarize, communicate and compare association patterns (Costantini et al., 2015; Revelle, 2020; Elleman et al., 2020; Stachl et al., 2020).
Many researchers now agree that population-level personality variation is best represented as a hierarchy of increasingly specific traits, with no level uniquely representing nature carved at its joints (DeYoung, 2015; Markon et al., 2005; McCrae & Sutin, 2018). This hierarchy arises because most Big Few traits inter-correlate, suggesting a few very general super-traits such as Stability and Plasticity (DeYoung, 2006), although methodological artifacts may contribute to this (Bäckström, Björklund, & Larsson, 2009; Riemann & Kandler, 2010). The Big Few domains also break down into constituents that develop and correlate with life outcomes in distinct ways (Jang et al., 1998; Paunonen & Ashton, 2001). Some models have therefore delineated “aspects” (DeYoung et al., 2007) or “facets” (Costa & McCrae, 1992) for the Big Few. These are more than just different ways in which the Big Few can be expressed (Jang et al., 1998); moreover, the hypothesis that some traits such as the Big Few are more “core” or “temperamental” than other, ostensibly more “surface” traits such as facets, has found limited empirical support (Kandler, Zimmermann, & McAdams, 2014). However, there has been little systematic research yet to delineate an empirically based and comprehensive model of personality facets for researchers to coalesce around (Saucier & Iurino, in press).
Moreover, most personality questionnaire items contain unique personality variance beyond the Big Few domains, aspects and facets they were designed to measure. Therefore, even the most comprehensive of the current facet models (e.g., Costa & McCrae, 1992) can be broken further down into numerous yet more specific traits, or “nuances”. Empirically, nuances are every bit as trait-like as the Big Few domains or their aspects and facets, because even the unique variance in hundreds of items, reflecting the nuances but not facets, aspects and domains, has essential trait properties of stability over many years, transcendence across assessment method such as self- and informant-reports, and

2 These are all findings of descriptive research, even though correlations with life outcomes are sometimes called prediction and the correlation between genetic and phenotypic similarities are sometimes taken as the former explaining the latter.

heritability; item-specific variances also have distinct developmental trends and associations with life outcomes (Mõttus et al., 2017; Mõttus, Sinick, et al., 2019). Of the error-free variance of a typical Big Five item, less than half has been estimated to pertain to the domains and their facets, leaving at least a half for nuances (McCrae, 2015). There are also personality traits that are either in the peripheries of the Big Five domains, as commonly defined, or beyond them (e.g., competitiveness, loyalty, jealousy, humour, sexuality, or others; Bouchard, 2016; Paunonen & Jackson, 2000). These traits are often not well covered in currently popular personality measures. The true ubiquity and utility of nuance-like narrow personality traits are thus yet to be properly estimated, as available evidence is based on questionnaires carefully developed to assess little but the Big Few and their selected facets. This universe of narrow personality traits that forms the basis of the personality hierarchy has also been referred to as the persome (Mõttus, Bates et al., 2017; Revelle, Dworak, & Condon, 2017).
In principle, therefore, there are many ways for researchers to describe personality variation such as using different levels of the trait hierarchy. In practice, they often default to the Big Few, likely because these trait models appear intuitive and familiar, are already widely used and can be readily measured with existing instruments. Social pressure from peers, reviewers and editors may also play a role. Although these are legitimate practical reasons, there is no inherent scientific reason why this level of the trait hierarchy should be a priori and always preferred over others for each and every research purpose. In fact, this may often be counter-productive, in constraining research choices and inspiring potentially misleading generalizations.

What makes good descriptive research?
To select an appropriate way of representing personality variance for a descriptive research question, it helps to outline criteria for what would be a good descriptive account of whatever is being described in relation to particular personality constructs (e.g., other psychological constructs, different measurements of the same constructs, demographic variables or life outcomes). We illustrate this with how personality varies with age.
Information should be elaborate. Is a good descriptive account simple and parsimonious or comprehensive and detailed? The tension between these priorities can be alleviated by recognizing that parsimonious accounts can always be extracted from detailed ones containing more numerous and less aggregated variables. The reverse, however, is not possible (Saucier & Iurino, in press). With remarkable flexibility, many-dimensional findings can be subsequently zoomed into or summarized with fewer dimensions, such as for ease of interpretation and communication.
Being able to zoom in rather than a priori aggregating can pay off. For example, age differences in personality are often described using the Big Few traits, showing that older adults tend to be somewhat higher in Emotional Stability, Agreeableness, and Conscientiousness but slightly lower in Extraversion and Openness than younger adults. But some of the findings are specific to questionnaires (Costa et al., 2019), hinting that age differences are at least in part driven by narrower traits that are sampled in different proportions across instruments, and thereby potentially misrepresented by the broad trait domains. There is indeed ample evidence that facets of the same Big Few traits vary in their age differences (Terracciano et al., 2005; Jackson et al., 2009; Lucas & Donnellan, 2009). But even facets may not provide a full understanding, because items of the same facets – reflecting nuances within them – often vary in their age trends, conveying unique developmental information. For example, item-level analysis of the Assertiveness facet of the revised NEO Personality Inventory (NEO-PI-R; Costa & McCrae, 1992) showed that older people were more likely to take charge of situations but less likely to make others do things, and items of the Achievement Striving facet referring to hard work tended to increase with age while items referring to success-motivation trended downwards (Mõttus & Rozgonjuk, 2019). Such examples abound (Mõttus et al., 2015); for example, Mõttus and Rozgonjuk (2019) reported that items within half of the personality facets varied in the directions of their age differences, leading items to contain over 40% more age-sensitive information than facets and over twice as much as the Big Five domains. More nuanced investigations into how personality is linked with various life outcomes or varies across cultures have led to similar conclusions (Achaa-Amankwaa, Olaru, & Schroeders, 2020; Elleman et al., 2020; Seeboth & Mõttus, 2018; Wessels, Zimmermann, & Leising, 2020).
At which level of a personality hierarchy should descriptive findings stop? The answer will depend on the research questions under consideration, but the goal should be to represent descriptive findings such as age or gender differences or links between personality characteristics and other variables at the level from which going more detailed would not add further useful information. Technically, this means the level where the measurable constituents of the traits relate to the other variables alike, because traits’ associations should not depend on which indicators are used to operationalize them (Mõttus, 2016; Spearman, 1927; Gonzales, MacKinnon, & Muniz, 2020). Often this may mean levels from which we simply cannot go any more detailed, such as individual test items, given that personality is, and possibly will be for some time at least, most commonly assessed with questionnaires. On other occasions, broader traits such as the Big Few or their facets may turn out to be the most suitable levels of description, because their constituents follow the same association patterns. Following this simple principle makes choosing the appropriate level of the personality hierarchy a defensible empirical question rather than a matter of personal preference, peer pressure or editorial policy.
It is sometimes thought that theories should constrain research questions. For descriptive research (as well as for predictive, below), we argue the opposite: theory should be
used to expand rather than constrain the personality construct space and thereby descriptive findings. For example, theories of how personality may relate to the phenomenon of interest can be used to suggest items for our item pools to make them more comprehensive and sensitive to the topic at hand. If we only operationalize personality with the Big Few, we a priori exclude possibilities to uncover additional aspects of personality, and how they develop and co-vary with other phenomena.
But we can use theoretical models to help with organizing our findings (e.g., Bem & Funder, 1978). For example, Mõttus and Rozgonjuk (2019) described age differences in personality using 300 items (many reflecting unique personality nuances), but organized the associations according to the Big Five and their facets using a Manhattan plot (Revelle, 2020; Revelle, Dworak, & Condon, 2020). This allowed them to show the general organisation of age differences in personality (they were wide-spread across hundreds of items) and how they were distributed across particular Big Five domains and their facets (i.e., mean difference between domains and facets in age-trajectories), but also how the age differences deviated from the patterns expected under the Big Five model (i.e., items of the same domains/facets often substantially varied in age differences). Or, item-level findings can be organized according to the degrees to which the items represent affect, behaviour, cognition, or desires/motivation (ABCDs; Wilt & Revelle, 2015). For instance, a mental health variable could be most strongly linked with affective items, regardless of which Big Few domain or facet they belong to; a physical health outcome may be mostly linked with behavioural items; and other outcomes may predominantly track with other types of items. For a few more examples, findings could be organized according to the extents to which items reflect universal traits as opposed to contextual adaptations (Henry & Mõttus, 2020), social desirability (Wessels, Zimmermann, Biesanz, & Leising, 2020; Leising et al., 2020), visibility (Funder & Dobroth, 1987), social maturity (Caspi & Roberts, 2001), pathology (Vachon et al., 2013; Bleidorn et al., 2019) or stability, cross-method agreement, and associations with other variables (Mõttus, Sinick, et al., 2019). This way, we can use theory to expand association maps to hundreds of variables and still extract intelligible information from these, especially when we use suitable (e.g., interactive) visualization tools. Large samples and cross-validations are vital, but this is no longer an insurmountable barrier in the current data-centric age.
Patterns in how personality differences relate to the variables of interest can also be explored atheoretically. For example, item- or facet-level associations can be organized in the descending order of effect size to highlight the strongest associations and find commonalities in them (e.g., Achaa-Amankwaa et al., 2020; Elleman et al., 2020; Bem & Funder, 1978; Block, Block, & Gjerde, 1986; Block, Gjerde, & Block, 1991). In some fields such as genetics, recent progress has almost entirely resulted from atheoretically scanning association patterns rather than imposing theoretical constraints on the findings (e.g., Nagel et al., 2018; Plomin & von Stumm, 2018) and there is no reason why following suit could not help personality scientists.
More detailed findings can be aggregated into any trait construct, either at the time when they are first published or in subsequent research. This flexibility is especially useful, because most items represent several traits at different levels of the trait hierarchy or even at the same level; think of the International Personality Item Pool as an example of how items are “recycled” to measure disparate constructs (Goldberg, 1999). For example, to estimate how a (latent) trait correlates with a criterion from the correlations of k items with this criterion, the item-criterion correlations can be multiplied by the traits’ loadings on the items (which can be extracted from correlations among items) and the sum product divided by the sum of the squared factor loadings:3
$$ r(\mathrm{Trait}, \mathrm{Criterion}) = \frac{\sum_{i=1}^{k} r(X_i, \mathrm{Criterion}) \, r(X_i, \mathrm{Trait})}{\sum_{i=1}^{k} r(X_i, \mathrm{Trait})^2} $$
The same applies to facet-level findings, of course.
As a general rule for basic research, thus, comprehensive and detailed descriptions of personality-related phenomena are preferable to those that a priori impose parsimony. But this does not mean that each and every study should necessarily measure hundreds of constructs, nor that each paper reports many hundreds of associations. Instead, personality scientists should collectively (across studies) aim towards maximum comprehensiveness. This can be achieved if individual studies a) consider diverse constructs rather than focus all on the same trait model (e.g., a Big Few), thereby distributing the workload and pooling findings either in a directed co-ordination or spontaneously, and b) provide their findings at various levels of specificity and aggregation (including disaggregated, item-level findings). Also needed are accessible tools for integrating the findings of different studies (e.g., for meta-analysing findings for available constructs, collating and publicly depositing them). Individual research reports can then contribute to, and draw from, a central repository of descriptive findings. This is not the default modus operandi of current personality research although it is common in some other fields such as genetics and neuroscience.
Findings should not depend on methodologies. When we link something to personality constructs, we typically expect that the associations pertain to psychological characteristics that exist independently of how they happen to be assessed (Hilbig, Moshagen, & Zettler, 2016; Mõttus, 2016; Thielmann & Hilbig, 2016). When conclusions reliably differ, say, as a function of which personality questionnaire was used for

3 If the combinations of items ought to represent summary-traits rather than shared variance-based latent traits, principal component loadings can be used instead of factor loadings.

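
As a minimal sketch, the estimate of a trait-criterion correlation from k item-criterion correlations and the items' loadings on the trait, as described in this section, can be computed as follows. The function name and the three-item example values are hypothetical and purely illustrative; they are not taken from any study cited here.

```python
from typing import Sequence


def trait_criterion_correlation(
    item_criterion_r: Sequence[float],
    item_loadings: Sequence[float],
) -> float:
    """Estimate r(Trait, Criterion) from item-level correlations.

    Sum of products of each item's correlation with the criterion and its
    loading on the trait, divided by the sum of the squared loadings.
    """
    if len(item_criterion_r) != len(item_loadings):
        raise ValueError("need exactly one loading per item-criterion correlation")
    if len(item_loadings) == 0:
        raise ValueError("at least one item is required")
    numerator = sum(
        r * loading for r, loading in zip(item_criterion_r, item_loadings)
    )
    denominator = sum(loading * loading for loading in item_loadings)
    if denominator == 0:
        raise ValueError("all loadings are zero, so the trait is undefined")
    return numerator / denominator


# Hypothetical three-item scale (made-up numbers, for illustration only).
item_criterion_r = [0.10, 0.20, 0.15]  # r(X_i, Criterion)
item_loadings = [0.5, 0.8, 0.6]        # r(X_i, Trait)
estimate = trait_criterion_correlation(item_criterion_r, item_loadings)
print(round(estimate, 3))
```

One property worth noting: if every item related to the criterion only through the trait, so that each item-criterion correlation equalled the item's loading multiplied by a common value, the function would return exactly that common value. Per the footnote on summary-traits, principal component loadings could be passed in place of factor loadings.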
assessing the construct (e.g., the associations of Openness and Extraversion with age or that of Neuroticism with Body Mass Index vary across studies; Costa et al., 2019; Vainik, Dagher et al., 2019), this points to the association being driven by narrower traits that are captured to differing degrees across measurement tools. This implies labelling issues (or “jingle fallacies”; Block, 1995; Larsen & Bong, 2016), whereby investigators mean different things when invoking the same scale or construct name. If so, these narrower traits should be isolated, because generalizing associations beyond them is misleading. Reporting item-level associations in particular can help to reveal jingle as well as jangle fallacies.

Unless there are explicit reasons to the contrary, the associations should also generalize across assessment methods such as, most readily, self- and informant-reports (ideally, the aggregate ratings of multiple informants). Findings that self- and informant-reports are measurement invariant are consistent with this (e.g., Mõttus, Allik, et al., 2019). For some traits, self- and informant-ratings may in part measure different aspects of personality (Vazire, 2010; McAbee & Connelly, 2016), in which case discrepant findings may be expected, and even hint at what contributes to the observed associations in the first place. For example, associations between personality traits and age tend to be stronger in self- than informant-reports (Costa et al., 2019), possibly because people have clearer perceptions of their own changes than they do of changes in others, or because age differences in self-reports are inflated due to increasing socially desirable responding with age (Soubelet & Salthouse, 2011).

Researchers should explore the generality of associations across contexts and other potential moderators. We should routinely strive to replicate findings in multiple diverse cultures, clarifying the extents to which the observed associations characterise larger populations than our typical study participants (e.g., Henrich, Heine, & Norenzayan, 2010), or even humans in general. Some findings already have been replicated in this way. For example, age differences in personality are fairly robust across cultures (McCrae et al., 2005), even at the levels of facets and nuances (Mõttus, Sinick, et al., 2019). Other findings may vary systematically across contexts; in these cases, we should establish that the variabilities themselves are replicable and attempt to identify their sources (moderators). For example, the magnitudes (but not profiles across multiple traits) of gender differences vary systematically between cultures and we know how: gender differences are larger in more prosperous societies (Schmitt et al., 2008; Mac Giolla & Kajonius, 2019; Lee & Ashton, 2020). It has been reported that the timing of age trajectories may also systematically vary across cultures (Bleidorn et al., 2013), but these findings have not yet been successfully replicated (McCrae et al., in press).

One possible benefit of routine attempts to replicate findings across cultures is diversifying the range of researchers participating in personality research, including those from currently less represented regions and backgrounds.

Some recommendations for descriptive research

A new trait taxonomy and instruments for it. Besides the Big Few, we need a more encompassing trait taxonomy to be able to comprehensively describe associations of personality traits among themselves and with other phenomena, coupled with instruments for measuring these traits. In other words, we need to sample the persome more broadly than the available taxonomies allow for. This does not mean doing away with the Big Few, but developing a properly hierarchical model in which traits can be investigated at lower (nuance) levels as well as aggregated into increasingly broad traits, including the Big Few. It may also be that the Big Few models eventually require a revision to account for lower-level traits that are informative but do not easily fit into the current Big Few models (Saucier & Iurino, in press). Likewise, many lower-level traits may belong to more than one of the Big Few.

Such models are not unrealistic, nor impractical. For example, careful item selection – such as avoiding items with low retest reliability and excessive redundancy (Christensen, Golino, & Silvia, 2020; McCrae & Mõttus, 2019) – may allow measuring a usefully comprehensive pool of nuances with one or perhaps two items each. Remember: nuances are narrow, so no broad content sampling is required for them because measurement breadth comes from the pool of nuances collectively, not from items within individual nuances. If so, a, say, 100- or 200-item test can encompass around 100 nuances that can be aggregated into a few dozen facets, and still fewer aspects and domains. Common psychometric concerns about the use of short scales can be addressed. For example, the typical retest reliability of single items of existing questionnaires over a one-week or two-week interval is around .65 (e.g., Mõttus, Sinick et al., 2019; Henry & Mõttus, 2020), even though these instruments have rarely been constructed with item-level reliability in mind. Therefore, after careful item selection the majority of them can have reliabilities well above .60, with the average plausibly at about .70.⁴ This means that the retest reliability of most two-item scales can be notably higher, often above .80.⁵

Findings obtained with such multi-nuance tests can be interpreted at any one trait hierarchy level or at multiple levels at the same time, as appropriate for the goal at hand. For example, broad-trait associations can be qualified by which specific narrower traits drive them, in the likely case that the associations within the scale have meaningful heterogeneity. Importantly, the measurement of broader traits themselves could also improve as a result of their encompassing more lower-level traits because good measures of broad trait domains sample their content broadly. This is therefore a win-win scenario.

4 Retest-correlations over shorter testing intervals can be higher still (Lowman, Wood, Armstrong, Harms, & Watson, 2018) and may provide even more accurate reliability estimates.
5 For an example of creating a high-dimensional personality trait pool, see Saucier, Iurino, & Thalmayer (2020).
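The step from single-item retest reliabilities of about .60 to .70 to two-item scale reliabilities near or above .80 can be checked with a small calculation. The sketch below is illustrative only and rests on an assumption that the text does not spell out at this point: that the Spearman-Brown formula applies, that is, that the two items of a nuance behave as roughly parallel measures with equal reliability. The function name `spearman_brown` is a hypothetical helper, not part of any package.

```python
def spearman_brown(reliability: float, k: float = 2.0) -> float:
    """Predicted reliability of an aggregate of k parallel components.

    Assumption (ours): the components are parallel measures that share the
    same reliability, which is what the Spearman-Brown formula requires.
    """
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    if k <= 0:
        raise ValueError("k must be positive")
    return k * reliability / (1.0 + (k - 1.0) * reliability)


# Single-item retest reliabilities in the range discussed in the text.
for r in (0.60, 0.65, 0.70):
    print(f"single-item r = {r:.2f} -> two-item scale = {spearman_brown(r, 2):.2f}")
# single-item r = 0.60 -> two-item scale = 0.75
# single-item r = 0.65 -> two-item scale = 0.79
# single-item r = 0.70 -> two-item scale = 0.82
```

Under these assumptions, an average item reliability of about .70 corresponds to a two-item reliability of about .82, and a two-item scale reaches .80 once its items have reliabilities of roughly .67 or more. Items that are less than parallel would yield somewhat different values.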
Of course, several of the Big Few instruments already allow for the measurement of their facets, but few authors have provided comprehensive, empirical evidence-based facet taxonomies (but see MacCann et al., 2009; Roberts et al., 2005; Ziegler et al., 2019) and these facet models are by definition constrained to the Big Few that have been defined a priori. Little taxonomic research has yet simultaneously encompassed the Big Few, their aspects and facets as well as traits beyond them (Condon, 2018; McCrae & Costa, 1996), and there has been virtually no taxonomic research for nuances yet (but see Wood et al., 2010).

Being realistic, it may never be possible to devise the ultimate hierarchical model of personality variance that covers all narrow personality traits in the persome, as somehow carved out by nature. There may be too many of them, their boundaries are likely inherently as fuzzy as those of broader traits, and many might apply to only some individuals and thereby have limited variance across individuals. But it is almost certainly plausible to develop models that sample from among the universe of important traits far more comprehensively than the currently popular, Big Few-centric models do.

Additional sources of information. To validate findings based on self-reports and explore patterns that may not be accurately captured with self-reports, researchers should use alternative sources of information about personality variation, while also being mindful of the limitations of these.

For example, technological progress has provided new sources of information (Rauthmann, 2020). Several recent articles describe how personality and its associations with other variables can be assessed through objectively measured behaviour or digital traces of behaviour (e.g., Cooper et al., 2020; Wiernik et al., 2020; Hall & Matz, 2020; Stachl, Au et al., 2020). These approaches offer great potential for non-invasively collecting personality-related information about large numbers of people and possibly over extended periods of time, hence allowing measurement of short- and even longer-term changes in personality. But often these assessment methods have to be given personality-relevant interpretation in relation to subjectively rated personality traits before they become useful. For example, on their own, mobile phone sensor data do not have psychological meaning; they do once we know how they track with self-reported traits (Wiernik et al., 2020; Stachl, Pargent et al., 2020; Hall & Matz, 2020). As a result, these methods often approximate subjectively rated traits rather than provide entirely new information, and any issues with self-reports can spill over to their digital approximations (Tay et al., 2020). Currently, typical correlations between self-reported traits and their digital approximations are in the range from .30 to .40 (Tay et al., 2020; Stachl, Au et al., 2020), so the gap between them remains non-trivial. It may narrow as research progresses, though.

Likewise, many researchers may strive towards objective, laboratory measurements of personality traits such as asking people to persevere with tedious and boring tasks (e.g., Gniewosz, Ortner, & Scherndl, 2020) or keep their hand in cold water (e.g., Schmeichel & Vohs, 2009) to measure their self-control, or asking them to categorize adjectives to measure their implicit self-concept (Greenwald & Farnham, 2000). But despite circumventing the biases of subjective ratings, these methods may not always enable as comprehensive personality measurements as self-reports do. They may also lack inherent psychological meaning (face validity) comparable to typical questionnaire items. Also, the objective measurement approaches may often have poor convergent and discriminant validity (Dreves et al., 2020; Mazza et al., 2020; Schimmack, 2020), possibly in part due to low reliability (e.g., Egloff et al., 2010; Wood & Brumbaugh, 2009).

Measurements with likely greater face validity are direct observations of behaviour and temporal and cross-situational patterns in this. These may include in situ self- or informant-reports of behaviour (via experience sampling) and visual and/or audio recordings taken in labs or everyday settings (Breil et al., 2019; Geukes et al., 2019; Schmid, Gatica-Perez, Frauendorfer, Nguyen, & Choudhury, 2015; Wrzus & Mehl, 2015). Indeed, there is a long-standing tradition in personality science to call for greater use of behavioural observations (e.g., Baumeister, Vohs, & Funder, 2007; Back, 2020; Back, in press), and well-cited articles have discussed suitable methods for this (e.g., Furr, 2009). We fully join these calls and agree that personality psychology that relies exclusively on subjective ratings, especially self-ratings, can only provide understanding of subjectively perceived variations and inevitably ignores anything not detectable, or inaccurately detected, by subjective perceptions. However, direct observations of behaviour have remained comparatively rare in personality research, likely because they are harder to obtain for sufficiently large samples and broad domains of behaviour. We hope that recent technological advances, such as those described in a recent special issue of European Journal of Personality (Rauthmann, 2020), will improve the situation.

Combining self- and informant-reports. Objective and/or in situ measurements of personality variance are highly desirable and increasingly practical, without any doubt. But it is also likely that subjective and decontextualized ratings will remain among the cost-efficient and ecologically valid methods of measuring stable personality traits, all the more so because the Big Few-centric research strategies have not yet fully exhausted this method’s potential (e.g., Wood, Gardner, and Harms, 2015). A well-established but still underused way to improve the reliability and validity of subjective personality ratings is to supplement one rater (e.g., the self) with others (e.g., well-informed other people). With online testing, this is far easier than it was during the paper-and-pencil testing era (e.g., participants can nominate an informant for them, who is sent an automatic invitation to participate in the study).

Combining multiple raters can reduce systematic idiosyncrasies inherent in only one ratings source (McCrae et al., 2019; Vazire, 2006); indeed, such method-specific
effects may make up a large proportion of observed variance (McCrae, 2015). Self-ratings capture self-identity while informant-reports capture reputation; both are likely biased in their own ways, but what is shared between them is more likely to provide valid information. For example, most people have developed an implicit theory of which traits go together and adjust their self-ratings or ratings of someone else accordingly, which can lead to distorted correlations between data-points obtained with one rating source (McCrae et al., 2019). Combining ratings can also reduce random measurement error, especially in single- or few-item nuances where its proportion is higher than in broad aggregate traits.⁶ This in turn can result in stronger associations with other variables of interest (e.g., Wright et al., 2019). Of course, informants may have different or less information about their target than the targets themselves do and they may often be biased towards the targets because of being non-randomly selected (Wessels et al., 2020). Likewise, we rarely know how discrepancies between self- and informant ratings arise – from biases in the former, latter, or both – and thereby how to weigh them in the combined results. No single source of information is perfect – but, again, combining them is very likely to improve data quality in most cases.

Multiple sources of ratings can sometimes be “triangulated” to estimate associations a) with reduced single method effects while b) also accounting for imperfect agreement between raters due to different information, rating biases or error (e.g., Biesanz & West, 2004; Eid et al., 2008; Riemann & Kandler, 2010). For example, using cross-trait, cross-twin ratings and cross-trait, cross-time ratings, Mõttus and colleagues (2017) calculated bias-and-error-reduced estimates of heritability and rank-order stability of personality nuances and found that the average estimates were comparable to those of aggregate traits, defying the intuition that broad psychological traits are more “biological” than circumscribed behaviours, feelings, thoughts and motivations.

Combining test-retest data. The reliability of personality trait assessments and thereby their associations with other variables can also be substantially improved by measuring presumably enduring traits twice over reasonably short time intervals (e.g., two weeks); besides, the associations can then be corrected for unreliability. Again, with online testing, organizing two or more measurement occasions is not as taxing as it used to be when testing was done on paper and when many of our current assessment practices were set, including the one-assessment-only tradition. It is especially useful if multiple self-ratings can be supplemented with informant-ratings: combining multiple pieces of information allows breaking correlations between variables into several components such as the association net of single-rater and occasion-specific biases, rater-specific effects and occasion-specific effects (e.g., Koch et al., 2017).

6 For example, if 50% of an item’s variance is free of measurement error and single source method biases, then combining two raters yields a reliability of .67 for the aggregate, according to the Spearman-Brown formula.

Better use of already existing data. Researchers can help to describe the associations of personality constructs among themselves and their relations with other variables in more detail than has been typically done – in fact, with little additional effort and by using data already collected.

For this, we recommend routinely a) using facets of the Big Few and/or b) testing extents to which associations are driven by narrower-still traits such as nuances (e.g., single items). Where the associations are driven by particular facets or nuances, they should not be automatically generalized beyond these, including to broader domains. Faceted and nuanced association patterns can be as informative and hypothesis-generative as the comfortingly predictable association patterns typical of the Big Few – desirable traits all too often going with desirable outcomes and the other way around, with most “significant” correlations somewhere between .10 and .30. We recommend that facet- and/or item-level findings be routinely published in article supplementary materials; this costs very little to authors (calculation and tabulation of findings) or journals, but it adds transparency to findings and facilitates their subsequent re-analysis and (e.g., meta-analytic) integration. This is different from making raw data available, because calculating the correlations of interest from these can often be cumbersome, unless very easy-to-use statistical programming code is made available.

Some may think that item-level findings are notoriously unreliable. But as was discussed before, items often have retest reliabilities of .65 to .70 or higher (Lowman et al., 2019; Mõttus et al., 2019; Wood et al., 2010; Henry & Mõttus, 2020), which may be higher than many intuitively expect. Higher-than-assumed single item reliability is also consistent with findings that items out-predict scales for outcomes and other variables (Mõttus & Rozgonjuk, 2019; Seeboth & Mõttus, 2018; Vainik et al., 2015; Achaa-Amankwaa et al., in press; Elleman et al., 2020). Therefore, the allegedly low reliability of items should not be a reason for not reporting item-level findings. Where reliability is a concern, however, it can be compensated for with large samples, meta-analytic integration of findings, and by aggregating or triangulating self- and informant-reports.

A Personality Research Hub. We recommend developing a central repository of descriptive findings. These findings could involve anything from associations among personality traits or their associations with demographic characteristics, life events and outcomes to their heritability, stability, and cross-method agreement estimates. We think that findings are best deposited disaggregated (e.g., at the item level), allowing for a flexible aggregation into different scales as well as for analysis at the item level. Centrally and publicly available findings can be tested for robustness across studies, as well as for moderators that help to understand why they vary from study to study or from scale to scale. They can also be meta-analytically combined and used for setting up and testing novel hypotheses (e.g., a routine practice in quantitative genetic research; Lee et al., 2018). Some such datasets have already been published (Mõttus,
Sinick, et al., 2019; Condon, Roney, & Revelle, 2017; Goldberg & Saucier, 2016), but there is no central repository of personality research findings yet.

For integrating findings across studies it is not necessary that all or even most studies use similar instruments. In fact, having all researchers assessing the same personality traits may not even be preferable for many research questions, because this would constrain the range of traits for which findings can become available over time. Instead, it is sufficient when studies rely on at least partly overlapping measures so that their associations can be compared for robustness and integrated into larger association networks. This directly parallels the idea of Synthetic Aperture Personality Assessment (Revelle et al., 2016), which allows calculating “synthetic” correlation matrices from only partly overlapping sets of participants. That is, not only can correlation matrices be based on different participant combinations of the same study, they can also be based on combined (synthetic) correlations from different studies. A similar procedure is routinely used in modern genetic research (e.g., Bulik-Sullivan et al., 2015). For working with such data, it is sufficient if (nearly) identical items and traits share annotation (common labels) – something that also helps against jingle-jangle fallacies.

Readily available descriptive findings, especially if they are not a priori aggregated into the Big Few, would facilitate a currently underused research strategy: setting up and testing hypotheses that rely on systematic variability between personality traits in their attributes such as demographic differences, stability, heritability, or links with outcomes (e.g., Funder & Dobroth, 1987; Block, Block, & Gjerde, 1986; Funder & Sneed, 1993; Mõttus et al., 2017; Vainik, Misic et al., 2019). That is, much like we study differences between people, we can also study quantitative differences between traits such as facets and nuances. This is not possible with only, say, five traits, but becomes increasingly viable as the number of traits increases.

For example, we could numerically test the hypothesis that personality development reflects social maturation (Caspi & Roberts, 2001). If associations between hundreds of items with age are meta-analyzed into reliable estimates, one could select, say, 200 diverse items, quantify their degrees of reflecting social maturity (e.g., using expert ratings or correlations with objective maturity-criteria) and expect these degrees to track with empirical age differences in the items. This would be a powerful and quantitative alternative to eyeball-judging that mean-level change patterns in traits such as the Big Few look like people are generally becoming socially more mature. For other examples, Henry and Mõttus (2020) examined whether items that corresponded to the definition of traits as opposed to characteristic adaptations demonstrated empirical properties often associated with traits such as stability, cross-rater agreement, and heritability; Hang, Soto, Lee and Mõttus (under review) studied whether items representing traits with stronger social expectations had larger age differences in means and variances throughout childhood and adolescence; Kööts-Ausmees and colleagues (in preparation) found that more socially desirable traits showed stronger age-differences in self-reports than in informant-reports, suggesting that age-differences may be inflated in self-reports; and Wood and Wortman (2012) showed that traits which varied least in their desirability across participants were least stable over time.

For a parallel, recent developments in quantitative genetics have been substantially facilitated by a widespread practice of sharing genotype-phenotype associations at the most fine-grained level (millions of single nucleotide polymorphisms) in repositories such as the LD Hub (Zheng et al., 2017). Geneticists routinely (meta-analytically) integrate and re-analyze such data for various research questions, developing novel methodologies in the process. Much of this work is based on examining variabilities between genetic markers in their phenotype-associations or other attributes (e.g., allele frequencies or linkage disequilibrium), exactly as we recommend examining systematic variabilities between personality traits in their quantifiable attributes. The high-dimensional findings are filtered and aggregated in various ways such as by chromosome or gene expression patterns, to test hypotheses and summarize patterns. This is a fundamentally more flexible approach to data than the a priori aggregation of data-points that has prevailed in personality research.

New data analytic tools. In conjunction with depositing (disaggregated) findings, we recommend that researchers develop tools for collecting, annotating, archiving, meta-analysing, and processing many-dimensional personality data. For example, we can imagine a software package (e.g., in R, possibly in combination with other platforms) that facilitates:
● administering subsets of item pools, selected according to pre-defined criteria;
● scoring them into various scales (e.g., the Big Five, HEXACO, Dark Triad, or well-being);
● uploading and downloading data from a central repository of findings according to specified criteria;
● automatically meta-analyzing new and/or existing findings for user-selected variables;
● cross-validating findings across different subsets of existing data and identifying candidate moderators;
● leveraging existing information (covariances among items) to impute unmeasured variables and to cross-walk from measured scales to (partly) unmeasured scales;
● summarising findings (e.g., personality-outcome correlations) at different levels of aggregation (personality hierarchy);
● identifying the variables (pre-defined scales, individual items, or computer-identified item collections; Elleman et al., 2020) that uniquely (over and above other variables) drive particular associations (e.g., Vainik et al., 2015);
● testing the extent to which items’ or broader traits’ associations with particular variables track with their previously established properties such as reliability, social desirability, degrees of reflecting affect, motivation, and other psychological domains, developmental trajectories, and so forth, so as to better understand the associations and detect possible confounders;
● visualizing association patterns according to user-selected filters (e.g., compare item-outcome correlations in whether they pertain more to affective or behavioural items).

Some of these functions have already been implemented (e.g., Arslan, 2019; Arslan, Walter, & Tata, 2020; Revelle, 2020; Rosenbusch, Wanders, & Pit, 2020), but there is no comprehensive toolbox yet. Possibly, the main reason why this does not already exist is the lack of suitable databases; to date, personality researchers simply have not pooled their (disaggregated) findings, as some other fields have done to good effect. We hope this will change. For a relevant example in cooperation research, see Spadaro and colleagues (2020).

If personality science is moving towards higher-dimensional representations of phenomena, as we hope, this will also have implications for which skills need to be taught to, and expected from, graduate students pursuing personality research.

Collaborations. Any one researcher or research group can collect only so much data. Individually, even the largest panel studies with often brief measures of personality traits may provide increasingly diminishing returns when the phenomena they explore are many-dimensional. But there is no rule that all research teams have to rely on the same omnibus model of personality and be constrained by the same practical limitations that prevent them from comprehensive measurement. Instead, we may need collaborations where different researchers explicitly set out to examine different aspects of personality (e.g., different traits) and only subsequently integrate their findings.

Within-individual variance

We have focused on variance between individuals in enduring patterns of thinking, feeling, behaving, and motivation, partly because this is what much of personality science is about. But recent years have seen the emergence of a powerful new stream of research that maps variance within individuals over very short time-periods and across situational experiences in what is often called personality states (Wendt et al., 2020; Sosnowska et al., 2020; Danvers et al., 2020; Horstmann & Ziegler, 2020), as well as stable individual differences in the distributions of these. This will likely provide more detailed descriptions of how particular individuals and people more generally interact with their environments and differ in this. Here, we do not describe this new and blooming stream of research in detail only because this special issue, as well as another recent special issue (Rauthmann, 2020), already contain papers that do exactly this. Here, we only note two things.

First, much of the research on short-term variance in personality states repurposes the descriptive models developed for summarizing individual differences such as the Big Few. But the extent to which this is appropriate needs to be studied, not presumed (e.g., Molenaar & Campbell, 2009; Fisher et al., 2018). There is no reason to assume personality hierarchy operates the same way for individual difference traits and within-individual variance states, although sometimes it may. Many trait models are designed and measured with the specific purpose of glossing over temporal and situational variations, because personality is often conceived of as broad and decontextualised patterns of individual differences (Funder, 1991; McCrae & Sutin, 2018). It is useful to recall that the adjective pools that were used to derive the Big Few systematically excluded terms concerning moods or states (Saucier, 1997). For this and other reasons, employing the Big Few-like broad traits in studies on how personality states fluctuate just because this model is often used in individual differences research may not be a good idea, just as assuming that narrower traits such as facets or nuances are somehow more contextual-situational and thereby more appropriate candidates for personality states may be ill-conceived (Horstmann & Ziegler, 2020). Being artistic may be a useful narrow trait, but uninformative as a personality state. We suspect that some phenomena – for example, being talkative or sad – may constitute reliable variance units both as traits and states (Zimmermann et al., 2019), whereas others may only be appropriate as one or the other.

Second, many of the recommendations that we propose for descriptive research on individual differences should also apply to descriptive research on within-individual variance in personality states. Among them are the need to develop a flexible descriptive framework that allows measuring phenomena with the most appropriate level of granularity for the purpose at hand, validating findings across methods, measures and contexts, combining self- and informant-reports, developing tools for flexibly working with and efficiently summarizing many-dimensional data, and developing efficient tools for data sharing and collaboration (e.g., Kirtley, 2020).

Predictive personality science

Personality researchers often take pride in how personality traits “predict” life outcomes such as academic performance, relationship satisfaction, or health. Strictly speaking, however, many of these findings – correlations or regression coefficients calculated using the same observations being predicted – are actually descriptive. Truly predictive research aims to create models where characteristics such as personality traits are used to generate the best possible predictions of outcomes in data that have not yet been accessed or even collected (out-of-sample prediction). First, this means that the observations that are used to create, or “train”, predictive models must not be the same observations
that will eventually be predicted (Yarkoni & Westfall, 2017; Stachl, Pargent et al., 2020). Second, such research should explore the limits of predictive accuracy, whereas descriptive models often have other priorities, as we argue below.

Given that the scientific value of personality traits is often said to hinge on their predictive power for important life outcomes (Ozer & Benet-Martínez, 2006; Roberts et al., 2007; Soto, 2019), it may come as a surprise that this power and ways of maximizing it have rarely been directly assessed in empirical studies. We suspect that this is in part due to a common failure to distinguish predictive research from other kinds of research and a tacit – but often likely mistaken – assumption that priorities and methodologies most suitable for descriptive or explanatory objectives must also be optimal for predictive purposes.

Why do predictive personality research?

Maximizing the out-of-sample predictive utility of personality traits can be an end in itself, sometimes even irrespective of its potential descriptive or explanatory utility. Consider, for example, using personality assessments for candidate selection (Lievens, 2017): what matters most is the accuracy of the estimated probability that the candidates will succeed in the job. Although for transparency it is useful to know which individual traits contribute to these predicted probabilities, the implications of those contributions for our understanding of personality more broadly are less important. Where the most accurate estimates of future job performance are based on the Big Few scores, it makes sense to use them. But where the best predictions are achieved by measuring, say, 100 unrelated personality items and feeding them directly into a predictive model, it may be counterproductive to combine them into broader trait constructs and use these for predictions, however descriptively elegant or comfortingly familiar this may seem. The same applies to using personality traits to decide which products are best advertised to which people (Matz et al., 2016) or for predicting important outcomes in medical and academic contexts, among other possible applications.

Maximising predictive accuracy has theoretical importance, too. Quite simply, to the extent that predictive accuracy is one of the main reasons for pursuing personality research, the case for this pursuit will be even stronger if we manage to increase the predictive accuracy. Likewise, one of the main theoretical implications of the pervasive personality trait-life outcome associations is that the traits may partly shape everyday experiences linked to these outcomes (e.g., differential education, career and relationship success confer different life trajectories and subsequent experiences) and thereby also shape psychological development more broadly (e.g., Scarr, 1983; Roberts & Nickel, 2017). That is, many psychologically consequential experiences are unlikely to be random but related to pre-existing psychological characteristics: traits’ predictive accuracy is the formal measure of how pervasive this tendency is.

Why is predictive research different from descriptive and explanatory research?

It may not be obvious why descriptive models are not necessarily optimal for prediction. For example, doesn’t the R2 of a regression model provide a good estimate of its predictive accuracy, even if that model was intended as a descriptive research tool to show how the variables in the model are linked with an outcome?7 It can, especially when the model comprehensively covers relevant variables at the appropriate level of the personality hierarchy, as we recommended for descriptive research, and was developed on a sufficiently large sample to obtain stable parameter estimates. However, the best descriptive models do not have to be the most predictive ones, because efforts to optimize models for descriptive as well as explanatory appeal often decrease their predictive power, for two reasons.

First, a failure to cross-validate performance estimates (e.g., reporting an adjusted R2 estimate derived from the same data the model was trained on) may result in overfitting (Yarkoni & Westfall, 2017; Stachl et al., 2020) and give overly optimistic impressions of predictive accuracy, while estimating how individual variables in models contribute to their cross-validated prediction reduces the models’ descriptive simplicity (for examples, see Stachl, Pargent et al., 2020). To be fair, the issue of overfitting is probably less prevalent in more recent personality research and compared to many other fields of psychology, because often sufficiently large samples are used. But even so, an adjusted R2 estimates a model’s predictive performance in a hypothetical and infinitely large sample that was compositionally exactly identical to the one in which the model was fitted, whereas cross-validation allows one to estimate the robustness of the model across different kinds of samples. Researchers often assume that their findings are robust to variations in sample composition, but R2 is insensitive to this.8

Second, human researchers’ and their readers’ cognitive constraints introduce a tension between descriptive/explanatory and predictive research objectives, because increased predictive accuracy is often achieved by increasing model complexity, which reduces interpretability and theoretical parsimony. For example, for descriptive and explanatory purposes researchers tend to look for and group correlated variables, whereas sets of variables that capture maximally unique portions of variance likely confer better prediction (Saucier, Iurino, & Thalmayer, 2020). The increased complexity of predictive models may not only mean including many predictor variables (we do recommend high-dimensional descriptive research!), but also

7 In fact, many studies linking personality traits with outcomes only report correlations and not R2 estimates.
8 One may expect that increasingly common meta-analyses provide average association estimates across different samples that are more generalizable than estimates from individual studies, and therefore less overfit. However, although likely more accurate due to aggregation, meta-analytic estimates may also be inflated due to overfitting in individual samples.
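
The difference between in-sample fit and cross-validated accuracy discussed on this page can be illustrated with a small simulation. This is only a sketch under stated assumptions, not an analysis from any of the cited studies: the data are simulated, the sample size, number of predictors and effect sizes are arbitrary, and the helper names (fit_ols, predict_ols, r_squared, cross_validated_r2) are hypothetical. It fits an ordinary least-squares model with many weak or irrelevant predictors and compares the in-sample R2, the adjusted R2 and a pooled 5-fold cross-validated R2.

```python
import numpy as np


def fit_ols(X, y):
    """Return least-squares coefficients, with the intercept first."""
    design = np.column_stack([np.ones(len(X)), X])
    coefs, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coefs


def predict_ols(coefs, X):
    """Predict outcomes from coefficients produced by fit_ols."""
    return coefs[0] + X @ coefs[1:]


def r_squared(y, y_hat):
    """Proportion of outcome variance accounted for by the predictions."""
    return 1.0 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)


def cross_validated_r2(X, y, k=5, seed=0):
    """Pooled out-of-fold R2: every case is predicted by a model
    that was trained without that case."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    y_hat = np.empty_like(y)
    for test_idx in folds:
        train_mask = np.ones(len(y), dtype=bool)
        train_mask[test_idx] = False
        coefs = fit_ols(X[train_mask], y[train_mask])
        y_hat[test_idx] = predict_ols(coefs, X[test_idx])
    return r_squared(y, y_hat)


# Simulated data (assumption): 150 people, 40 predictors,
# only the first two predictors carry any signal.
rng = np.random.default_rng(2020)
n, p = 150, 40
X = rng.standard_normal((n, p))
y = 0.3 * X[:, 0] + 0.3 * X[:, 1] + rng.standard_normal(n)

r2_in = r_squared(y, predict_ols(fit_ols(X, y), X))
r2_adj = 1.0 - (1.0 - r2_in) * (n - 1) / (n - p - 1)
r2_cv = cross_validated_r2(X, y)

print(f"in-sample R2:       {r2_in:.2f}")
print(f"adjusted R2:        {r2_adj:.2f}")
print(f"cross-validated R2: {r2_cv:.2f}")
```

In a setup like this, the in-sample estimate is expected to look considerably more favourable than the cross-validated one, because the model is evaluated on the same observations that were used to estimate its coefficients. Note that this sketch only varies random sampling; it does not vary sample composition, which would require validation data from a different kind of sample.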
capitalizing on often uninterpretably small differences between already small weights of individual predictors and sometimes also incorporating non-linear associations and/or interactions between the predictors.

For example, Mõttus and Rozgonjuk (2019) reported that age could be out-of-sample predicted (in statistical, not substantive sense) more strongly from 300 individual test items (r = .65) than from 120 items (r = .54), 30 personality facets (r = .44) or the Big Five domains (r = .28). This shows that hundreds of items contain reliable and age-sensitive information about individual differences that is not fully exhausted by a set of 119, or possibly even 299, other items and that including this information in predictive models makes a material difference in their performance. But from a descriptive/explanatory standpoint, a model with 300 small regression coefficients that are carefully selected to maximize prediction may be suboptimal, because human researchers struggle to reason in so many dimensions and fathom the small differences between the coefficients. The findings have to be filtered or organized somehow to make them useful for descriptive and explanatory purposes. This predictive research just revealed that the Big Five (or any Big Few) may be a particularly suboptimal way of organizing items in their age differences.

For a parallel, the same applies to quantitative genetics, where polygenic models based on contributions from more numerous genetic variations (e.g., 100,000) generally allow for stronger out-of-sample predictions of phenotypic variables than models based on fewer genetic variants (e.g., 50,000), even though the contributions of individual variants are mostly far too small to be meaningfully interpretable (Plomin & von Stumm, 2018). Likewise, in fields like computer vision and natural language processing, opaque and complex statistical learning methods such as deep neural networks (DNNs) vastly outperform simpler, more interpretable statistical models (for review, see LeCun, Bengio, & Hinton, 2015). Many of these models capitalize on so many parameters and small variations in them that they may never be fathomable by humans: not because the models are overly complex per se, but because human minds have constraints that models do not have to obey (Hasson, Nastase, & Goldstein, 2020). We don’t know yet whether the same will prove true for the prediction of individual differences in behavior (e.g., DNNs often require volumes and quality of training data rarely available in personality research), but this is not an unreasonable hypothesis. As it stands, there have simply been too few attempts to systematically explore the limits of personality traits-based predictions.

But initial evidence does suggest that techniques providing less human-interpretable model parameters such as regularized regressions or random forests may at least sometimes substantially out-perform more intuitive modeling approaches (e.g., Elleman et al., 2020). For example, regularized regression models often shrink many coefficients to a range that descriptively looks close to zero; even ordinary regression models with many predictors tend to do this. And sometimes comparatively more accurate predictions result from even more counter-intuitive modeling. For example, Mõttus & Rozgonjuk (2019) unsurprisingly found that regularized regression models predicted age from items much better than models based on the zero-order correlations of these items with age (i.e., if the predictions were formed by multiplying the standardized score of each item by its correlation with age in another sample and subsequently summing the products). But using zero-order correlations calculated with items’ standardized residuals (i.e., after removing the variance of Big Five domains and facets from them) to create the prediction models improved their performance to levels comparable to regularized regression models. That is, removing the variance of the Big Five domains and facets from items prior to using them in the models increased their ability to out-of-sample predict, despite these items having been selected to measure the domains and facets in the first place. This surely leaves classical test theorists scratching their heads: how can what is supposed to be error (i.e., left-over variance in items beyond the traits that they were designed to measure) out-predict traits? A plausible explanation is that predictive modeling benefits from uncorrelated predictors and minimizing their redundancy (Saucier, Iurino, & Thalmayer, 2020). If so, capturing personality variation using sparsely placed markers (items) throughout the persome is more useful for prediction than relying on intuitive variables such as the Big Few or even their facets that capitalize on, and aggregate, correlated traits (i.e., oversample certain areas of the persome). This means a very different measurement philosophy than classical test theory.

It is important to avoid pejoratively calling predictive models with predictors and parameters that are not intuitive or familiar to human researchers “black box” models. They are not black boxes because, having designed them, humans can understand their working principles (Hasson et al., 2020). Besides, researchers know the data on which the models are trained because they designed the measures and collected the data. It is just that the specific parameter values that the models develop to do what modellers designed them to do are often not interpretable to these modellers, possibly due to their own cognitive constraints, but possibly also due to insufficient research and familiarization so far. Personality researchers should be open to the possibility that some, perhaps even many, of their familiar tools may become suboptimal when we start to systematically explore the limits of real-world predictions.

Thus, there may often be an inherent tension between parsimony and predictive power that forces researchers to choose between descriptively/theoretically elegant models that have lower predictive power and better-performing predictive models that benefit from the contributions of numerous variables with sometimes very small coefficients that individually make limited sense. Of course, other things being equal, it is always better to understand how a system operates than not. But sometimes, and maybe even very often, the true data-generating processes underlying
behaviour are too complex for a model to be simultaneously both comprehensible to humans and predictively maximally useful.

Can predictive models help descriptive and explanatory ones, and vice versa?

Predictive modeling can also facilitate progress in other kinds of research, where maximizing out-of-sample prediction is not an end in itself (for review, see Yarkoni and Westfall, 2017).

First, routine cross-validation can provide researchers with more realistic estimates of not only the predictive, but also the descriptive and explanatory capacity of their models. Impressive in-sample performance estimates derived from small-to-medium samples may decrease substantially when evaluated in independent samples, whereas the predictive power can hold up well with larger samples. But regardless of this, where predictive models with tens of well-chosen and well-measured predictors are able to account for only a fraction of the variability in the phenomenon of interest, researchers may want to remain humble about being able to map out the causes of this phenomenon, at least using the kinds of explanatory variables approximated by their predictors. That is, because one could argue that something can only be mechanistically explained to the extent it behaves predictably, predictive accuracy may often signal the limits of the explanatory powers of causal models.

Second, predictive models can help researchers understand the trade-offs inherent in emphasizing certain goals over others and identify important lacunae in descriptive or explanatory models. For example, even if one’s goal is to develop a readily interpretable prediction equation using only the Big Few domains, quantifying the performance improvement one might obtain by using a more expansive set of predictors can help calibrate expectations about what “good” performance constitutes. It is not uncommon to learn that the Big Few are “powerful” predictors of life outcomes: comparing the predictive power of the Big Few to other trait models would help to either support or at least qualify such claims. The predictive models may also help to identify additional sources of variance for further descriptive/explanatory model development such as facets or nuances that could be included into the Big Few or besides them.

Third, predictively comparing different kinds of models can also shed light on the general architecture of personality variation in relation to predicted outcomes. For example, models based on hundreds of predictors out-performing those based on the Big Few or their facets would suggest that the associations of personality with the outcome could be driven by numerous specific processes, rather than a few broad mechanisms – to the extent that causality is involved, of course. Among other possibilities, this can be tested by dropping the strongest predictors from the model and estimating changes in the collective predictive power of the remaining predictors: it may be that this changes the results only minimally (Mõttus & Rozgonjuk, 2019).9 Likewise, a finding that predictive models allowing for non-linear and/or interactive associations (e.g., recursive partitioning, random forests) do (or do not) out-perform those that only allow for linear additive associations can be equally informative about possible causal mechanisms, at least when the underperformance of complex models is not due to measurement error (Jacobucci & Grimm, 2020). Such findings can also inform intended personality-based interventions, not least about their likely limits in real-life settings.

Fourth, cross-validation as it is routinely done in predictive modeling provides an elegant way of estimating systematic (lack of) generalizability of results across measurable factors. For example, one can train a model on only some samples (e.g., only for men, North Americans, people younger than 50 years) and evaluate its performance on others (e.g., women, Asians, those aged over 50); if the models perform equally well, the factors that differentiate between the samples do not moderate the associations captured by the model.

On the other hand, attending to descriptive and explanatory concerns can also help improve the performance of predictive models. Most importantly, researchers can draw on their domain expertise to facilitate better “feature engineering”; that is, choosing which variables are used in the predictive models and how they are pre-processed (Stachl, Pargent et al., 2020). No amount of machine learning expertise is likely to produce optimal predictions if the available predictors contain mostly noise (Jacobucci & Grimm, 2020) or lack coverage of the critical features of the target phenomenon. An understanding of the sources and structure of human personality and psychometric expertise can be particularly helpful for maximising predictive potential and for anticipating issues with generalizing the models beyond original training settings. For example, it is likely that personality trait inventories that contain items with high reliability but relatively little redundancy are particularly useful for prediction, despite the trait scales having lower internal consistencies and thereby potentially putting off users with less or outdated psychometric expertise (Yarkoni, 2010).

In particular, because accuracy of out-of-sample predictions entirely depends on comprehensive, well-measured and generalizable sets of predictors, theoretical accounts

9 When predicting age from personality test items, Mõttus and Rozgonjuk (2019) tried removing items of several facets that had the strongest correlations with age. Surprisingly, they found that the overall predictive capacity of the models decreased minimally, suggesting that the bulk of the predictive information was not uniquely concentrated to a small selection of items or the traits that they were supposed to index. Not reported in the original paper, but specifically for the current article, we ran additional out-of-sample predictions of age in these data, by dropping 5%, 10%, 25% and 50% of the most predictive items from the total of 300 items: the correlation between predicted and actual ages dropped from .65 to .61, .59, .51 and .41, respectively. These predictions were still far more accurate than those provided by the Big Five domains (.28) and mostly also more accurate than those of the facets (.44), even with these including all their items. This suggests that small amounts of unique age-sensitive information were allocated across many individual items.
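
The item-dropping check described in the footnote above can be sketched in code. The example below is only an illustration under stated assumptions and does not reproduce the reported analysis or its numbers: the data are simulated with 100 independent items that each carry a small amount of unique signal, the ridge penalty (alpha) is an arbitrary choice rather than a tuned value, and the helper names (ridge_fit, out_of_sample_r) are hypothetical. Items are ranked by their correlation with the outcome in the training data only, the strongest items are removed, and the remaining items are used to predict the outcome in held-out data.

```python
import numpy as np


def ridge_fit(X, y, alpha=10.0):
    """Closed-form ridge regression on centred data.
    Returns (intercept, weights). alpha is an assumed, untuned penalty."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    penalty = alpha * np.eye(X.shape[1])
    weights = np.linalg.solve(Xc.T @ Xc + penalty, Xc.T @ yc)
    return y_mean - x_mean @ weights, weights


def out_of_sample_r(X_train, y_train, X_test, y_test, keep):
    """Train on the kept items only; return the correlation between
    predicted and observed outcomes in the held-out data."""
    intercept, weights = ridge_fit(X_train[:, keep], y_train)
    y_hat = intercept + X_test[:, keep] @ weights
    return np.corrcoef(y_hat, y_test)[0, 1]


# Simulated data (assumption): many items, each with a small unique effect.
rng = np.random.default_rng(2019)
n_train, n_test, n_items = 1000, 1000, 100
true_weights = rng.uniform(0.05, 0.15, n_items)
X = rng.standard_normal((n_train + n_test, n_items))
y = X @ true_weights + rng.standard_normal(n_train + n_test)
X_train, X_test = X[:n_train], X[n_train:]
y_train, y_test = y[:n_train], y[n_train:]

# Rank items by absolute correlation with the outcome, using training data only.
item_r = np.array(
    [np.corrcoef(X_train[:, j], y_train)[0, 1] for j in range(n_items)]
)
order = np.argsort(-np.abs(item_r))

results = {}
for share in (0.0, 0.10, 0.25, 0.50):
    n_drop = int(round(share * n_items))
    keep = np.sort(order[n_drop:])  # drop the n_drop strongest items
    results[share] = out_of_sample_r(X_train, y_train, X_test, y_test, keep)
    print(f"dropped {share:4.0%} of items: out-of-sample r = {results[share]:.2f}")
```

When predictive information is spread thinly across many items, as assumed in this simulation, accuracy should decline gradually rather than collapse as the strongest items are removed. With real questionnaire data the items would be correlated with each other, so the pattern may differ.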
elucidating the processes by which personality relates to the outcome and descriptive accounts showing how the outcome is correlated with personality traits can both be useful for expanding the range of predictors included in predictive models. This may go against the intuition of some researchers to use prior knowledge to constrain models. For training predictive models, however, it does not matter how many predictors are initially involved or what putative personality hierarchy levels they come from, so long as they help maximize suitably generalizable out-of-sample prediction accuracy. As long as the models are not validated using the observations on which they were trained, any excesses in predictor selection will become apparent in the model validation phase and can be corrected.

Some recommendations for predictive research

Cross-validation. For an accurate evaluation of the predictive value of personality traits, it is most important to use cross-validation procedures that distinguish between the training sample and the validation sample (Yarkoni & Westfall, 2017; Stachl, Pargent et al., 2020). These can be independent partitions of one larger sample (as in k-folds or leave-one-out cross-validation), but it is even better if they are independently collected datasets, potentially with somewhat varying demographic characteristics. Cross-validation helps to mitigate model overfitting due to random sampling variance as well as due to systematic biases in sampling (e.g., demographic imbalances), and it can guard against the effects of idiosyncrasies in data collection, processing, and statistical modeling. It is especially valuable if the training and validation data were collected by different researchers.

Sufficiently large datasets. Predictive performance tends to improve with increasing model complexity, so long as the training data is sufficiently large to mitigate over-fitting. As a general rule, the more predictors in a model and/or the more complex the functional form relating predictors to the criterion (e.g., allowing for non-linear associations), the larger the training sample that is required. The incremental gains associated with larger sample sizes also depend on the effect sizes in question, as large effects require smaller samples, and the amount of missing data (Elleman et al., 2020). For example, Mõttus and Rozgonjuk (2019) found that prediction models stabilized with a few hundred observations when based on up to 30 variables, but required about 3,000 observations when based on 300 predictors with the smallest individual effect sizes and presumably most measurement error. We therefore do not suggest universally “acceptable” sample sizes; instead, this can be estimated with simulations for individual study designs. For many predictive modeling applications in personality psychology, it is possible that increased sample sizes will have diminishing returns beyond a few thousand observations.

But more variables is often preferable to more participants. Researchers rarely have the luxury of acquiring massive samples with many well-measured variables, and often face a choice between collecting less data from more people or more data from fewer people. In such cases, larger participant numbers are not always desirable. Instead, prioritizing the coverage of the persome by increasing the number of variables at the expense of the number of participants may confer substantial predictive advantages (the same likely applies to descriptive research), provided that the variables used during training are also available in the validation data and any future observations for which predictions are intended. A large number of responses to a short personality questionnaire can be a poor substitute for a rich dataset, even if the latter contains fewer observations. For example, a sample of 3,000 participants measured with 200 items may often enable more predictive (as well as descriptive) models than a sample of 12,000 participants measured with 50 items, and a sample of 60,000 measured with 10 items is likely to fare worse still. Ultimately, the predictive information is in the variables and most outcomes are highly multiply determined, with observations only needed to reliably estimate relevant information in the variables. Besides, many statistical estimation methods such as regularized regressions are designed to help stabilize predictions even in cases where the number of variables exceeds the number of independent observations. A particularly useful solution to balance participant and item numbers is to collect data with massively planned missingness, where each participant provides responses to a different random subset of variables (e.g., Revelle et al., 2017; Elleman et al., 2020).

Flexibility in selecting and transforming predictors. When constructing predictive models from personality data, researchers have flexibility over how, or whether at all, to transform single data points such as item scores into predictive variables; this may involve aggregating, raising to powers or grouping values, for example. In machine learning, this process is termed feature engineering. Aggregation tends to filter out potentially useful information, so measuring many traits with one item each can result in more predictive models than measuring few traits with many items. But aggregation may be useful when this demonstrably improves the generalizability of the prediction models across contexts and instruments. For example, it may be that an item-based prediction model vastly out-predicts a model based on fewer aggregate traits in a given sample, but when the trained model is applied in a different demographic group, the gap may close or even reverse. As a general rule, different ways of aggregation could be empirically compared to each other as well as to completely disaggregated models in their ability to predict outcomes in independent data (e.g., Mõttus, Bates et al., 2017).

Comparing statistical models in their performance. Sometimes, well-tuned regularized regression models may provide far more robust and accurate predictions than “standard” (i.e., ordinary least-squares) regression models; sometimes the latter may work just as well. Also, models that allow for non-linear and interactive associations may sometimes provide the most accurate predictions, even if they require larger training samples. In some circumstances
such as high levels of missing data, less sophisticated and less data-hungry models may provide comparably accurate predictions: for example, Elleman and colleagues (2020) introduced the Best Items Scales that are Cross validated, Unit weighted, Informative, and Transparent (BISCUIT) model that allows researchers to create bespoke personality scales for particular outcomes, consisting of as few items as possible and each contributing exactly the same amount towards the prediction for greater interpretability. Our general point is: to date, there has been too little research that has systematically explored the ways of maximising the predictive accuracy of personality variables and therefore we cannot know yet which modeling practices are generally preferable.

Alternative sources of personality information

Predictive personality research may not only use personality traits as predictors, but also as outcomes. A wealth of recent research has explored the possibility to extract personality-relevant information not only from traditional sources like self-reports, but from records that people leave behind such as social media or credit card records, mobile sensor data or diaries (Kosinski et al., 2013; Stachl, Au et al., 2020; Weston et al., 2019; Wiernik et al., 2020). Typically, such data are given psychological meaning by first collating them into scores that approximate self-reported personality traits (e.g., using machine learning techniques; Wiernik et al., 2020) and then using these digital records-based self-report approximations for descriptive or predictive purposes. The standard approach so far has been to predict the Big Few first and then use these predictions for whatever is their intended purpose, but recent evidence suggests that predicting narrower traits such as nuances first and using these predictions in subsequent analyses may be preferable (Hall et al., 2020). Again, more research is needed before we could recommend generally preferable research practices, and therefore it may be useful to systematically compare different approaches in their performance.

Competitions among research teams

Prediction is different from description and explanation in that there is an objective ground truth for assessing performance: the agreement of predictions with actual observations. This creates an opportunity for researchers to directly compete against one another in developing the best possible prediction models, which could go a long way towards eventually establishing the best practices for the field. For example, teams of researchers could be given similar training data with the only instruction to develop the most accurate prediction models for given outcomes, and the submitted models could be compared in their performance in hold-out data that were not available to model developers (e.g., Salganik et al., 2020).

Explanatory personality science

Many psychologists are not satisfied with describing and predicting personality-relevant phenomena (e.g., traits or their correlates; events, actions, affects, goals, life outcomes) and also aspire to explain them (e.g., Baumert et al., 2017). Few would disagree, however, that explaining something is harder than describing and predicting it, not only because of methodological challenges but also because of more fundamental questions about the very nature of useful explanations. In fact, even the authors of this article could not entirely agree on some fundamental questions around causes, explanations and their roles in personality science. Fortunately, there have been other recent contributions regarding how to explain phenomena that personality scientists consider as falling into their jurisdiction (e.g., Baumert et al., 2017; Briley et al., 2018; Grosz et al., 2020), including articles in this issue (e.g., Quirin et al., 2020; Costantini et al., 2020; Lukaszewski et al., 2020). Here we offer general ideas about how one could think of causes and useful explanations – and why these are not necessarily the same things.

Crucially, there are different approaches to personality that vary in what their advocates may consider useful and realistic goals of explanation. Some conceive of personality as broad regularities in relatively stable individual differences, whereas others think of it as a dynamic and potentially idiosyncratic within-person system, and see the role of personality science as providing an integrative account of how the mind and behaviour come together. The former approach focuses on decontextualised patterns in naturally occurring, normal and continuous (dimensional) variance among individuals (e.g., Funder, 1991; McCrae & Sutin, 2018); in this view, personality is a population-level variance phenomenon such as the trait hierarchy. The latter approach is primarily about specific processes pertaining to individuals and resultant variability within them, as well as about how individuals may differ in these processes and/or their distant causes (e.g., Quirin et al., 2020). In many cases, it is not evident that variability/processes taking place within individuals and variability among people arise for similar reasons (e.g., DNA structure, anthropometry, parental socioeconomic status or other possible sources of individual differences do not even vary much within individuals), although sometimes they may (see Lukaszewski et al., 2020; Quirin et al., 2020). But even more importantly, while advocates of the latter approach may hope to identify specific causes of specific phenomena (why a particular person reacts to a situation in a particular way) and eventually perhaps even individual differences in these, advocates of the former approach may prefer explanations that propose general principles rather than target specific causes, for reasons that we’ll describe shortly.

Causes

Causes can be defined as broad and specific factors (e.g., neurological structures or repeated experiences) or processes (e.g., situation selection or associations among
psychological constructs) that play roles in producing particular responses to environments, or vice-versa, either psychologically or behaviourally. Even if inferred from comparing individuals, causes and effects pertain to processes and variability within particular individuals in their particular circumstances. Causal relations have boundary conditions, which can range from the exceptionally narrow (e.g., where affecting X should only affect Y in rare circumstances and needs to be studied idiographically) to very broad (e.g., on Earth, releasing an object almost always causes it to fall toward the Earth). Explanations that target causes thus mean specifying (1) the nature of the cause-effect relation or process (such as X→Y or X→M→Y) and (2) the circumstances under which the relation or process is expected to occur.
The gold standard for identifying causes is the potential to control the outcome by experimentally manipulating these conditions and/or processes. For instance, if we have learned that Helen exercises because of X1, or Tom parties because of X2, we should be able to at least in principle influence the levels of X1 and X2 to change Helen’s rate of exercising and Tom’s rate of partying. This involves counterfactual arguments: if X and Y occurred and we assert that X was a cause of Y, then we have to be able to show that, without X, Y would not have happened in the way or to the degree that it did, all else being equal. Formalized models of hypothesized processes that enable controlling them at least conceptually (e.g., Directed Acyclic Graphs, DAGs, with do-operators; Pearl, 2018) can be particularly useful for probing such specific causal relations.
To be clear, causes do not have to be deterministic; for example, smoking causes lung cancer, but not every smoker gets it. But the probabilistic link between the cause and effect has to be consistent and strong enough such that changing the former makes a non-trivial difference for the latter. Indeed, the risk of smokers developing lung cancer is about 20 times higher than that of non-smokers (Surgeon General's Reports, 2004), so starting to smoke makes a material difference for the probability of developing lung cancer. In contrast, very small individual causal effects have commensurately small explanatory power.
Defined as such, identifying causes may be a useful target for approaches that see the primary role of personality science as identifying potentially controllable processes that underlie within-individual variance and perhaps subsequently also individual differences in these processes (e.g., Quirin et al., 2020). For example, a therapist may be able to identify causes of a patient’s problematic behaviours and perhaps even help the patient to control them to facilitate desired personality change (Hopwood, 2018; Magidson, Roberts, Collado-Rodriguez, & Lejuez, 2014). Likewise, functionalist and process approaches may attempt to explain how particular beliefs and skills interact to produce certain behaviours or self-perceptions, which can similarly provide ‘levers’ for influencing behaviours or trait change (Wood, Spain, Monroe, & Harms, in press; Metcalfe & Mischel, 1999).
However, it is not self-evident that researchers who think of personality as population-level patterns in naturally-occurring individual differences and seek to make sense of these should target their individual causes. This is because these patterns may not have many tractable causes to begin with, at least according to our definition of cause, or they may be too numerous and too complex to provide explanations that are interpretable for the human mind and therefore useful. Instead, useful explanations for these patterns could postulate general principles that may or may not apply to potentially controllable processes in particular individuals. We now elaborate on this position, because we feel that it is implicitly adopted by many personality researchers but may cause unrealistic expectations when left unarticulated (Grosz, Rohrer, & Thoemmes, 2020). We will later return to the alternative view according to which personality researchers should hope to reveal the individual causes of personality-relevant phenomena in the strict sense of the term.
Why many causes may be inherently elusive
In part, causes may often remain elusive because the phenomena that personality scientists seek to explain and/or their plausible explanatory variables are, by definition and intentionally, abstract hypothetical constructs that cut across different circumstances within and across individuals (Funder, 1991), with quantitative levels that are inherently relative.
Think of individual differences in neuroticism, self-esteem, agency, trustfulness, or procrastination as quintessential examples of the kinds of personality constructs many researchers work with. To be personality constructs rather than just specific instances of behaviour, thoughts, feelings, and desires, they represent individual differences in reactions that integrate across many kinds of situations and over time, and are therefore taken out of their specific circumstances. Unless one commits to the view that they represent singular traits (like height) that exist independently of how and where they are expressed and measured (arguably, most personality researchers do not; e.g., Baumert et al., 2017), this inevitably makes them decontextualized aggregates that correspond to different things in different people and circumstances. Also, individuals’ “raw” scores on them can only be interpreted in comparison to those of others, because there are few if any concrete “anchors” (e.g., specific behaviours) that invariably correspond to specific trait levels and ground these in individuals.¹⁰ According to our definition, however, causes need to represent concrete “things” (e.g., thoughts, feelings, behaviours, desires, skills, experiences, brain structures,

¹⁰ There have been attempts to create personality rating scales that provide raters with concrete behavioural anchors rather than the typical disagreement-agreement dimension such as a Likert scale (e.g., Muck, Hell, & Höft, 2008). These may be useful for assessing the manifestation of personality traits in specific circumstances in a non-relativistic way, but the measures tend to be too context-specific to be of general use and to allow for comparing individuals from different circumstances.
even genes) that do correspond to specific circumstances and apply to particular individuals, irrespective of other individuals.
Of course, although many personality constructs are, by nature, decontextualized and relativistic aggregates, their constituents such as behaviours measured with questionnaire items could be concrete enough to also represent situation-specific reactions of particular individuals. If so, we could work backwards from construct levels to what they correspond to in individuals. This may sometimes be the case, especially for narrower constructs that aggregate few constituents; this alone is a good reason to consider lower levels of the trait hierarchy. However, not many personality constructs can boast a well-defined set of concrete constituents: The Act Frequency Approach (e.g., Buss & Craik, 1983) was one prominent attempt to delineate them, but has been largely abandoned for decades. Even for narrow constructs such as the tendency for aggressive behaviour, researchers often ask those who provide information on it to make abstract inferences (“I tend to get into fights”, “S/he often hits others”) rather than count the frequencies of the specific behaviours involved – because these are too context-bound to be meaningfully comparable across people.
But even if researchers did have, or will manage to reach, a consensus on what are the concrete constituents of specific traits and how to measure them in a non-relative way, they will face another challenge: there are often so many different configurations of these constituents through which any given aggregate value can arise that it is virtually impossible to connect a specific construct score to the values of its constituents in individuals. Any non-extreme level of a construct with even just a handful of facets or nuances can correspond to hundreds of unique facet/nuance configurations, with even the most common of them remaining rare. Intuitively, we may expect that if a person has a medium score on a construct they must also have a medium level on most of its constituents; in fact, generally this is not the case.¹¹ This is a mathematical and empirical fact that may be greatly underappreciated among researchers.
Given this, why do personality scientists not work with variables (e.g., individual genes, brain variables, life experiences, personality nuances, behaviours, or feelings) that are sufficiently concrete and measurable in a non-relative way to serve as causes per our definition? Some already do, and many more may think that they should in order to make progress. For example, we made the case for a greater use of personality nuances in other sections of this article; these are at least somewhat more concrete than broad trait domains. Likewise, we echo those arguing for the importance of moving beyond subjective trait-ratings to objectively measured behaviour (see also Back, 2020). But, again, for many researchers the core of personality science is just something else by definition – broader and decontextualized patterns of individual differences (e.g., McCrae & Sutin, 2018) – so asking them to study only highly specific and contextualized variables instead amounts to asking them to redefine their field of study. The decontextualized nature of personality traits, for example, is often seen as their particular strength (Funder, 1991; McAdams, 1994) and something that makes personality science unique among other fields such as social, developmental, cognitive, or clinical psychology. This is hard to argue with.
But equally importantly, the specificity required of variables that could have causal impacts on personality phenomena such as patterns in naturally-occurring individual differences may often mean that they are too numerous to be individually useful as explanations (Yarkoni, 2020). Besides, many causes can have multiple effects, which further complicates disentangling them. For an extreme example, even if individual DNA base pair variations directly cause individual differences in personality constructs, it will take many thousands of them to account for even a small fraction of the variance, because their individual effects are minuscule (e.g., Lo et al., 2017; Nagel et al., 2018). Most of the individual effects are not even statistically significant in any given sample. This is now so well established that it is called the Fourth Law of Behaviour Genetics (Chabris et al., 2015). Likewise, the very same genetic variants pervasively matter for variations in a whole range of behavioural, social and somatic traits, known as pleiotropy (e.g., Turkheimer, Pettersson, & Horn, 2014; Mõttus, Realo et al., 2017; Nagel et al., 2018).
In many cases, the number of potentially relevant causes may be smaller than the very high number of somehow-personality-related genetic variants. But the typical effect sizes in psychology and the pervasive tendency for all things to correlate (a manifestation of the psychological “pleiotropy” that is sometimes called the crud factor; Orben & Lakens, 2020) make it unlikely for many personality phenomena to have distinct causes that are sufficiently strong to explain both behaviour and psychological processes in particular individuals and a non-trivial amount of normal variability between people. Among other things, this is consistent with the lack of robust evidence for the effects of specific life experiences on personality constructs, even in the most powerful studies to date (e.g., Asselmann & Specht, 2020¹²; Chopik et al., 2020; Denissen et al.,

¹² One may want to adjust the associations reported for personality change in this study for multiple testing. Depending on method of adjustment, this may result in only one significant association between life events and trait
¹¹ For illustration, we simulated an unrealistically simple construct (N = 10,000,000) that was defined by only five independent constituents, each having only three levels (-1, 0, 1 with 25%, 50%, and 25% probabilities), and a small amount of uniformly distributed “error” (ranging from -1 to 1 and accounting for about 12% of variance in construct scores). We then extracted about 20,000 scores of this construct with nearly identical values (0 +/- .005) and found that these corresponded to hundreds of configurations of their five constituents. By far the most obvious configuration of the five constituents (all 0s) corresponded to only 7% of the scores and each of the second most prevalent combinations (three 0s, one -1, and one 1) corresponded to less than 2% each. In the real world, of course, few personality-related constructs are almost completely defined by only a handful of well-defined constituents, so our ability to deduce from a construct score to what this may represent in particular individuals is much smaller still. If the constituents are not completely independent (e.g., as semantically non-redundant items of a scale), some configurations become relatively more likely, but this does not change the conclusion. See also Østergaard, Jensen, and Bech (2011).
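The simulation described in footnote 11 can be checked with a short calculation. The sketch below is not the authors' code: it replaces their Monte Carlo draw (N = 10,000,000) with an exact enumeration of all 3^5 constituent configurations, and it assumes that the construct score is the plain sum of the five constituents plus the uniform error, which is the reading that reproduces the reported error share of about 12%. All names (LEVELS, selection_weight, configuration_shares) are illustrative.

```python
from fractions import Fraction
from itertools import product

# Assumed setup, following footnote 11: five independent constituents,
# each taking -1, 0, 1 with probabilities 25%, 50%, 25%.
LEVELS = {-1: Fraction(1, 4), 0: Fraction(1, 2), 1: Fraction(1, 4)}
N_CONSTITUENTS = 5
HALF_WIDTH = Fraction(5, 1000)               # keep scores within 0 +/- .005
ERROR_RANGE = (Fraction(-1), Fraction(1))    # error ~ Uniform(-1, 1)


def selection_weight(total):
    """P(|total + error| < HALF_WIDTH) for error ~ Uniform(-1, 1)."""
    low = max(ERROR_RANGE[0], -HALF_WIDTH - total)
    high = min(ERROR_RANGE[1], HALF_WIDTH - total)
    return max(Fraction(0), high - low) / (ERROR_RANGE[1] - ERROR_RANGE[0])


def configuration_shares():
    """Share of each configuration among scores that fall near zero."""
    joint = {}
    for config in product(LEVELS, repeat=N_CONSTITUENTS):
        prob = Fraction(1)
        for level in config:
            prob *= LEVELS[level]
        # Assumption: construct score = sum of constituents + error.
        weight = prob * selection_weight(sum(config))
        if weight > 0:
            joint[config] = weight
    selected = sum(joint.values())
    return {config: w / selected for config, w in joint.items()}, selected


shares, p_selected = configuration_shares()

constituent_var = N_CONSTITUENTS * Fraction(1, 2)  # each constituent: variance 1/2
error_var = Fraction(1, 3)                         # variance of Uniform(-1, 1)
error_share = error_var / (constituent_var + error_var)

print(f"error share of score variance: {float(error_share):.3f}")
print(f"P(score within 0 +/- .005):    {float(p_selected):.5f}")
print(f"configurations represented:    {len(shares)}")
print(f"share of all-zero config:      {float(shares[(0, 0, 0, 0, 0)]):.3f}")
print(f"share of (0, 0, 0, -1, 1):     {float(shares[(0, 0, 0, -1, 1)]):.3f}")
```

Under these assumptions the all-zero configuration makes up 16/231 (about 7%) of the near-zero scores and each configuration with three 0s, one -1 and one 1 makes up 4/231 (under 2%), in line with the footnote; 141 of the 243 possible configurations are represented at all.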
2019). Bleidorn and colleagues (2020) recently called for far more detailed examinations of the effects of life experiences on changes in personality constructs than are available to date (“Longitudinal Experience-Wide Association Studies” or LEWAS, p. 285). If Genome-Wide Association Studies are anything to go by, then linking numerous life experience “variants” with changes in personality constructs in large samples will indeed account for a fraction of variance in them, although the findings should always be cross-validated in independent samples to avoid overfitting. This would be an impressive and important empirical feat, but whether this could help us towards potentially controllable and theoretically meaningful causes of why particular individuals do what they do, or why they differ in this, is another question.
An equally fundamental reason that identifying specific cause-effect associations is often impractical is that it requires unrealistic assumptions, in particular that causality runs in only one direction (Pearl, 2018). Naturally occurring personality variability represents how free-ranging individuals spontaneously differ when left to their own devices in largely self-created environments. In fact, the very essence of personality is the means by which people choose, adapt to, and modify their real-world situations and experiences to suit them (Buss, 1987). As a result, what may be considered causes of personality characteristics often do not happen to people randomly, but are influenced by something coming from within them – their personalities, potentially including the variables to be explained and other variables linked with these. For example, people’s experiences, not just observable traits, are correlated with genetic variance among them (Scarr, 1983). Where this applies, there are no clear cause and effect associations and formal models of causality (e.g., DAGs) and counterfactuals fail: flipping an explanans to its counterfactual state automatically means flipping its explanandum as one of its causes, suggesting that we cannot eliminate “back-doors” to explanandum (Pearl, 2018). This is also a reason that experimental manipulations and other interventions, even if feasible practically and ethically, could sometimes misrepresent causality in personality science and beyond. In real life, people often choose the “manipulations” and “interventions” that suit them and do all they can to avoid others, in part based on their personalities.
Finally, in some and maybe even many cases, links between phenomena and their plausible causes exist in such narrow circumstances as to be unique to individuals or only small subsets of them (e.g., Beck & Jackson, 2020), which further complicates connecting them with population-level variance in individual differences constructs (cf. Beltz et al., 2016; Dotterer et al., in press; Lazarus et al., 2020; Woods et al., 2020; Wright et al., 2019). The more idiosyncratic the associations are, the less practical and even plausible it is to identify the specific causes of individual differences, at least as long as these are defined as dimensions along which individuals vary. At present, far more research is needed to better estimate the extent to which this applies and whether this generalizes across types of associations (e.g., links between psychological or behavioural phenomena as well as their links with physiological, anatomical, and genetic variables) or levels of the trait hierarchy (Wright & Zimmermann, 2019).
Given all this, it may seem sensible to keep explanations that could apply to what particular individuals do in their particular circumstances and that could potentially be manipulated separate from explanations of population-level variability in situation-general patterns such as traits in the personality hierarchy. These may end up being very different kinds of explanations.
Explanations short of specific and potentially modifiable causes
Where identifying specific causes is not feasible or reasonable, internally coherent and consistent-with-available-observations narratives of how normal variation in clearly defined phenomena comes about may serve as the most useful explanations. A useful explanation may state its scope (what kinds of variance patterns are being explained) and premises (what is assumed and not further explained), and specify its observed and unobserved components and general principles of relations among them (how they are organized or tend to inter-relate over time, and in which circumstances they are likely to occur or not occur).¹³ For example, abstract narratives about developmental principles of individual differences (e.g., Caspi & Moffitt, 1993; Roberts & Nickel, 2017) may be good candidates to become useful explanations, despite – and maybe exactly because of – not attempting to outline the specific causes of the patterns that they try to explain. Articulating only a few causes would explain just about nothing, whereas attempting to list a sufficient number of them, even if feasible at some point, could make explanations unintelligible.
It is particularly useful if such explanations can be formalized as computational models (Quirin et al., 2020). Although these cannot provide empirical proof and are unlikely to reveal causes in the strict sense of the term, they allow playing through complex hypotheses that involve large numbers of hypothetical variables with potentially many-to-many and bidirectional relationships that can unfold over many iterations. Setting up a computational model that runs and produces results that are even broadly consistent with observations of relevant real-world phenomena often takes a lot of rigorous thinking and is all too likely to identify gaps in verbal-only explanations (Mõttus, Allerhand, & Johnson, 2020). Examples of the use of computational models in personality science include Revelle and Condon's (2015) dynamics of action model,

¹² (continued) change (β = .08 for decrease in emotional stability after divorce).
¹³ Besides the ‘how’ part, there may also be a ‘why’ part of an explanation, referring to the function (outcome) of the phenomenon in relation to a broader phenomenon (e.g., the function of anger may be to restore equity in social transactions; Lukaszewski et al., 2020); assuming that every explanation involves a function may be problematic, however (e.g., some phenomena are no longer functional or may even appear dysfunctional, but still require an explanation).
Read and colleagues' (2010) neural network model, Smaldino and colleagues’ model of niche diversity (Smaldino et al., 2019), or Mõttus and colleagues' (2020) model of person-environment transactions and the corresponsive principle.
But are explanations, defined this way, really more than descriptions? We argue that they are if they help to interpret, organize, and integrate descriptive observations. That is, if they fill in knowledge gaps, help researchers to envisage yet-to-be-made observations, and suggest possible directions for more detailed explanations. However, we realize that the line between the explanations, defined this way, and descriptive findings is probably far less clear than many would prefer. Indeed, what may seem like identifying causal explanations may often, at a closer look, amount to more detailed and better organized descriptions (Yarkoni, 2020). If so, well-documented and detailed basic descriptive findings are and likely will be central parts of many personality scientific explanations. Descriptive findings are then not just uninspired examples of personality research to be replaced with “proper” causal explanations; they are the ingredients that useful explanations organize into coherent narratives.
For example, theories that seek to explain personality variations through social interactions may benefit from a large-scale project (say N = 10,000) that documents, in both lab and naturalistic settings, hundreds of objectively measured behaviours, social interaction processes and their subjective perceptions, besides including detailed trait ratings of the participants (see also Back, in press). Using such data, researchers could look for patterns in behaviour, perception and relationship dynamics and link these to measurements of individual differences, possibly being able to account for a non-trivial fraction of variation in personality nuances, facets and domains. Almost certainly, however, a large number of such patterns would uniquely contribute to accounting for trait variance. These findings, such as those from LEWAS (Bleidorn et al., 2020), would be descriptive and unlikely to reveal causes of naturally occurring individual differences in the strict sense of the term. But they could help to identify recurring regularities in behaviour and psychological processes and thereby develop and refine useful explanatory models of personality variation.
Grosz, Rohrer, and Thoemmes (2020) have recently argued that there is a widespread taboo against causal inference in non-experimental personality science in that what researchers are allowed to explicitly claim to have achieved is often not what their findings and interpretations actually imply – between the lines. We suspect that this is in part because of a failure to distinguish useful explanations from causes in the sense that we defined them above and many other researchers do as well, at least implicitly. In many cases, researchers can hope to achieve explanations, but not necessarily identify specific causes, because these are either intractable, unintelligible, or both.
It may help to tackle this taboo to realize and accept that many phenomena that personality scientists are focused on may, by their very nature, be distinctly unique in not having tractable unidirectional causes (Yarkoni, 2020). The best explanations for these phenomena may often hinge on the most coherent available narratives that combine many pieces of, and patterns in, descriptive findings rather than rely on specific and definitive experiments or statistical models. For example, whether a particular regression coefficient does or does not represent a causal effect in a strict sense may often be a moot question and (suppressing) arguments over this may simply reflect naivety. Regardless of this, regression coefficients alongside other findings of descriptive research can be a useful basis for narrative explanations.
Alternative view: Identifying tractable causes may be a tractable problem after all
On the other hand, many researchers – including several authors of this article – disagree with the view that attempts to explain personality may often be best off not targeting its specific causes. Instead, they believe that researchers will eventually identify the specific and potentially even controllable causes of key personality phenomena, including naturally occurring individual differences in them and broader patterns in these. This will require better methods, measures, and models. But even more importantly, this likely entails (a) defocusing from the broad and situation-general patterns of variation as the starting points of explanations in favour of specific and contextualized within-individual processes and (b) tolerating the complex and potentially phenomenon- and person-specific (idiographic) explanations that result from this shift. In what follows, we discuss what may be particularly important to facilitate moving towards causes-based explanations in personality science.
Some recommendations for explanatory research that seeks to identify causes
Identifying the right level of analysis for explanation. Units at certain levels of analysis may be too far apart to construct meaningful causal accounts, at least without intermediate steps. For instance, reductionists may argue that all psychology can be understood by biology, all biology by chemistry, and all chemistry by physics. But it is unlikely that we will ever identify a tractable explanation of how a leader’s personality affects her organization’s longevity through particle physics. Instead, explanations using units at more proximal levels to the phenomena we wish to explain may be more useful and appropriate (Borsboom, Cramer, & Kalis, 2019; Dennett, 2013; Hofstadter, 2007; Sperry, 1966). Social cognitive, learning, or functionalist accounts which explain personality trait levels as arising through the interactions of units such as goals, expectancies, affordances, and perceptual processes, may be more appropriate and necessary components of causal accounts of the phenomena than explanations through specific genes or even specific neurological structures (Back, in press; Baumert et al., 2017).¹⁴

¹⁴ Another level of analyses is personal narratives, discussed by Pasupathi and colleagues (2020).
Once armed with proximal causal explanations, however, researchers can move on to identify the causes of these causes, which ultimately can serve as a strategy for making sense of associations across different levels of analysis. For instance, given the extremely distal relations between genes and psychological traits (e.g., Johnson & Edwards, 2002), identifying genetic variants responsible for between-individual variation in dominance might be aided by first identifying the major proximal causes of the variation, and then working backwards. Trait dominance tends to be elevated among individuals high in physical formidability, which in turn tends to be correlated with the individual’s physical height (Lukaszewski, Simmons, Anderson, & Roney, 2016). If so, understanding the genes affecting height can help to understand the genetic variants affecting formidability, which can help to understand some part of the genes affecting dominance. The large number of specific genes affecting height in turn can be organized into smaller sets of specific genes affecting narrower biological processes such as those affecting bone lengths, cartilage production, hormone production, skeleton morphology, and other processes (e.g., A. Wood et al., 2014). Thus, as we improve our accounts of the important proximal causes of a phenomenon of interest, we can in turn identify the most important proximal causes of these variables, at each step identifying more specific targets we can place as intermediators to bridge the gulf across more distal levels of analysis. Those versed in the structural modeling literature may think of this strategy as building a series of multiple indicators, multiple causes (MIMIC) models.
Even if we can eventually identify a tractable number of major proximal causes of our phenomenon of interest, this strategy of iteratively identifying the proximal causes of each proximal cause as outlined in this example will likely result in hundreds or thousands of distal causes with minuscule effects. However, at each end of the long and complex causal chains linking one of these distal causes to the outcome of interest, we could be able to identify stronger causal associations. For instance, on one end of the chain linking

protein shakes and spending hours lifting weights at the gym.¹⁵
It is important, however, not to confuse variation within individuals with individual differences. The former may, and in many cases likely does, contribute to the latter. But the individual differences in within-individual processes that could contribute to other individual differences have to come from somewhere in the first place (Lunansky, Borkulo, & Borsboom, 2020; Quirin et al., 2020) and, as we know from well-documented behaviour genetics findings (e.g., Briley & Tucker-Drob, 2014), many sources of individual differences are a) hardly random and b) often not something in which individuals even vary greatly over time (e.g., DNA structure). It may thus be that to a large extent the processes reflected in within-individual variance either amplify (e.g., corresponsive processes between traits and experiences; Nickel & Roberts, 2007) or dampen/reverse (e.g., somebody with maladaptive characteristics seeking to change these) pre-existing individual differences, or translate some other traits (e.g., non-psychological characteristics such as height, metabolic, endocrine or other traits) into psychological traits, rather than create individual differences from scratch.
Working with cleaner units. As noted above, there is a tendency for personality psychologists to combine diverse, causally efficacious sets of variables into single aggregates. However, excessive emphasis on broad all-purpose domains such as the Big Few impedes representing the personality processes or dynamics underlying the phenomena (e.g., Block, 1995; Mischel and Shoda, 1995; Cramer et al., 2012; Wood, Gardner, & Harms, 2015; van der Maas et al., 2006). This is a point that we consistently make throughout this paper: we should be flexible about how, and whether at all, we aggregate variables. For instance, we might imagine that tendencies toward [1] liking and caring about people increase a person’s likelihood of [2] doing favours for other people, which in turn can increase a person’s likelihood of [3] being liked by other people. Averaging such tendencies into a single scale score complicates understanding the
specific  genes  to  height  or  dominance,  the  NOX4  gene’s nature  of  the  causal  relationships  that  the  conceptually
association with height is likely mediated through stronger distinguishable  attributes  have  with  one  another  (van  der
effects on the number of osteoclasts cells produced, which Maas  et  al.,  2006;  Wood,  Gardner,  &  Harms,  2015;
aid in bone repair and maintenance (Marouli et al., 2017). On Epskamp, Waldorp, Mõttus, & Borsboom, 2018).16 This can
the other, physical formidability and other proximal causes also  contribute  to  the  view that  even  moderate  (possibly)
may each have moderate to large main effects (e.g., r > .30) causal relations among personality variables are hard to find
on  dominant  behaviour  (Lukaszewski  et  al.,  2016).  This when in fact they are often hiding in plain sight – within our
strategy may thus help to organize the legions of variables scales (Afzali et al., 2020). A key recommendation, then, is
showing small distal effects by showing how they contribute that researchers a) aim for constructs and their measures that
to more proximally related variables and processes, such as prioritize  conceptual distinctions  between  variables  (e.g.,
the  psychological  mechanisms  or  systems  that  calibrate
dominant and aggressive behavior (e.g., Balliet, Tybur, Van 15 If the likelihood of increasing formidability turns out to be systematically
Lange,  2017;  Lukaszewski  et  al.,  2020).  The  successful linked  to  its  plausible  downstream  causes  such  as  dominance  (lessdominant  people  may  bother  less  with  having  physical  means  of
identification  of  the  most  proximally  related  processes  in appearing threatening), the situation becomes more complicated, though,
turn offers  the greatest  potential  for  intentionally  affecting because the cause and effect become entangled, as we  discussed above.
outcomes  of  interest.  For  instance,  a  man  might  try  to Scenarios such as this may in fact be uniquely prevalent for personality-related phenomena.
facilitate his displays of dominant behaviour by increasing 16 It will also often result in putting indicator items of the outcomes we want
his  formidability,  perhaps  by  ‘bulking  up’  by  downing to  predict  with  personality  scales  directly  into  the  personality  scale,
making  it  difficult  to  rule  out  that  the  correlations  may  reflect
uninteresting tautologies (Mõttus, 2016; Nicholls, Licht, & Pearl, 1982).
Description, prediction and explanation 21
items that concern self-perceptions of behaviour vs affect or methodological  and  practical  challenges.  Descriptive
motivation;  Wilt  &  Revelle,  2015;  Wood,  Gardner,  & research aims to delineate associations among personality-
Harms, 2015) over purely  empirical ones (e.g.,  average all relevant phenomena and their  link with other variables as
items with factor loadings over .40) or b) deliberately create comprehensively as possible, while also doing this in ways
measures  for  distinct  classes  of  personality-relevant that  allow  flexibly  summarizing  and  organizing  this
phenomena (e.g., Jackson et al., 2010; Costantini, Saraulli, & information;  predictive  research  aims  to  maximize
Perugini, 2020). generalizable out-of-sample predictive power without much
Extending our range of methods and models. Establishing regard  to  the  descriptive  or  explanatory  elegance  of  the
causal  relations  between  variables  often  requires  stronger statistical  models;  and  approaches  aiming  to  explain
evidence  than  cross-sectional  correlations.  It  is  ultimately personality phenomena need to be clear about their levels of
important to provide evidence that manipulating X within a analysis  (patterns  in  naturally  occurring  individual
potential  X→Y relationship would alter the level of  Y.  But differences  vs psychological  processes  and  behaviour  of
this is often difficult as many of the X’s that we examine as particular  people)  and set  targets  that  are  appropriate  and
potential causes of personality phenomena, such as specific realistic for the type of variability or processes that are being
genes, or the size or connectivity of neurological areas, do explained. 
not  lend  themselves  to  manipulation  and  many  Y’s  also It does not seem to us that these research kinds should strive
influence their X’s, entangling the causes with effects. towards homogenization between and even within them, at
Meanwhile, what is almost certain to help is greater use of least not any time soon. An approach that aims to achieve all
repeated measures designs, over both long (e.g., multi-wave goals may eventually not achieve any of them particularly
longitudinal studies such as Denissen et al., 2019) and short well.  Descriptively  most  useful  models  may  not  be  most
measurement  windows  (e.g.,  experience  sampling  studies predictive  or  provide  satisfactory  explanations;  most
such  as  Sosnowska  et  al.,  2020;  Danvers  et  al.,  2020). predictive models may be too complicated to be useful for
Within such studies, finding that the levels of X at one time description or explanation; and limiting descriptive research
point  t are associated with  the  levels of  Y concurrently (at or  predictive  modeling  to  variables  and  associations  that
time  t) or even prospectively (e.g., can predict how  Y will make conceptual sense may be counterproductive. 
change from t to  t+1; e.g., Epskamp et al., 2018) is useful That said, it would be equally wrong to suggest that they are
for bolstering evidence of causal associations, also allowing in  isolation  from  one  another.  For  example,  descriptive
to separate within- and between-individual variances. Time- findings  can  be  the  basis  for  building  predictive  and
series  data  may  be  combined  with  experimental  designs, explanatory models, predictive models can help to expand
such as  by experimentally  manipulating  the  X state  –  for the  range  of  descriptive  research,  hint  at  the  limits  of
example, instructing people to pursue certain goals or to act explanatory  models  (e.g.,  how  much  variability  among
extraverted  –  and  see  if  the Y state  tends  to  increase  in people in  a phenomenon can models hope to account for),
response (e.g.,  Margolis  & Lyubomirsky, 2019; Steiger et and  explanations  can  suggest  which  further  descriptive
al.,  2020).  There  remain  important  questions  about  the research is needed or what could be included in prediction
extent to which experimentally manipulating psychological models. For  these reasons, it  is important that descriptive,
states serves as an ecologically valid means of understanding predictive and explanatory approaches rely on at least partly
how the states naturally covary, however, due to issues such overlapping sets of constructs wherever possible. However,
as self-selection effects (i.e., reverse causality) and issues of we  argue  that  the  commonly-used  Big  Few  alone  is
finding  the  ideal  time  intervals  to  identify  causal  effects suboptimal for this and we need to develop flexible models
(e.g., Jacques-Hamilton et al., 2019). of  personality  variance  that  fully  embrace  its  hierarchical
We also encourage within-individual variance designs that organization  and  do  not  confuse  patterns  of  individual
focus on estimating idiographic association patterns besides differences with variance and processes within individuals.
nomothetic  ones  (Beck  &  Jackson,  2020;  Lazarus  et  al., We also need tools to assess the variance and processes that
2020;  Wright,  Gates  et  al.,  2019).  It  is  crucial  that  we rely on different sources and types of information, not just
understand  how  far  our  typical  nomothetic  models  of self-reports.
variance  can go in  principle – that  is,  how broad are  the
boundary conditions of possible causal effects. The broader References
the  boundary  conditions and less  idiosyncratic  personality Achaa-Amankwaa,  P.,  Olaru,  G.,  &  Schroeders,  U.
processes are, the more useful nomothetic models can be in (2020).  Coffee  or  Tea?  Examining  Cross-Cultural
identifying the causes of  personality  phenomena, however Differences  in  Personality  Nuances  Across  Former
numerous and multi-leveled these end up being,  and vice Colonies  of  the  British  Empire.
versa. https://doi.org/10.31234/osf.io/dpqrx
Concluding remarks Afzali, M. H., Stewart, S. H., Séguin, J. R., & Conrod, P.(2020). The Network Constellation of Personality and
In this article, we discussed three main kinds of personality Substance  Use:  Evolution  from  Early  to  Late
research  –  descriptive,  predictive,  and  explanatory  –  and Adolescence.  European  Journal  of  Personality.
argued that they involve different priorities and face different https://doi.org/10.1002/per.2245
Allik, J., Church, A. T., Ortiz, F. A., Rossier, J., Hřebíčková, M., de Fruyt, F., Realo, A., & McCrae, R. R. (2017). Mean Profiles of the NEO Personality Inventory. Journal of Cross-Cultural Psychology, 48, 402–420. https://doi.org/10.1177/0022022117692100
Arslan, R. C. (2019). How to Automatically Document Data With the codebook Package to Facilitate Data Reuse. Advances in Methods and Practices in Psychological Science. https://doi.org/10.1177/2515245919838783
Arslan, R. C., Walther, M. P., & Tata, C. S. (2020). formr: A study framework allowing for automated feedback generation and complex longitudinal experience-sampling studies using R. Behavior Research Methods, 52, 376–387. https://doi.org/10.3758/s13428-019-01236-y
Ashton, M. C., & Lee, K. (2020). Objections to the HEXACO Model of Personality Structure—And Why Those Objections Fail. European Journal of Personality, 34, 492–510. https://doi.org/10.1002/per.2242
Asselmann, E., & Specht, J. (2020). Taking the ups and downs at the rollercoaster of love: Associations between major life events in the domain of romantic relationships and the Big Five personality traits. Developmental Psychology. https://doi.org/10.1037/dev0001047
Back, M. D. (2020). Editorial: A Brief Wish List for Personality Research. European Journal of Personality, 34, 3–7. https://doi.org/10.1002/per.2236
Back, M. D. (in press). Social interaction processes and personality. In J. Rauthmann (Ed.), The handbook of personality dynamics and processes. Elsevier.
Bäckström, M., Björklund, F., & Larsson, M. R. (2009). Five-factor inventories have a major general factor related to social desirability which can be reduced by framing items neutrally. Journal of Research in Personality, 43, 335–344. https://doi.org/10.1016/j.jrp.2008.12.013
Balliet, D., Tybur, J. M., & Van Lange, P. A. (2017). Functional interdependence theory: An evolutionary account of social situations. Personality and Social Psychology Review, 21, 361–388.
Baumeister, R. F., Vohs, K. D., & Funder, D. C. (2007). Psychology as the Science of Self-Reports and Finger Movements: Whatever Happened to Actual Behavior? Perspectives on Psychological Science, 2, 396–403. https://doi.org/10.1111/j.1745-6916.2007.00051.x
Baumert, A., Schmitt, M., Perugini, M., Johnson, W., Blum, G., Borkenau, P., Costantini, G., Denissen, J. J. A., Fleeson, W., Grafton, B., Jayawickreme, E., Kurzius, E., MacLeod, C., Miller, L. C., Read, S. J., Roberts, B., Robinson, M. D., Wood, D., & Wrzus, C. (2017). Integrating Personality Structure, Personality Process, and Personality Development. European Journal of Personality, 31, 503–528. https://doi.org/10.1002/per.2115
Beck, E. D., & Jackson, J. J. (2020). Idiographic Traits: A Return to Allportian Approaches to Personality. Current Directions in Psychological Science, 29, 301–308. https://doi.org/10.1177/0963721420915860
Beltz, A. M., Wright, A. G. C., Sprague, B., & Molenaar, P. C. M. (2016). Bridging the nomothetic and idiographic approaches to the analysis of clinical data. Assessment, 23, 447–458.
Bem, D. J., & Funder, D. C. (1978). Predicting more of the people more of the time: Assessing the personality of situations. Psychological Review, 85, 485–501. https://doi.org/10.1037/0033-295X.85.6.485
Biesanz, J. C., & West, S. G. (2004). Towards understanding assessments of the big five: Multitrait-multimethod analyses of convergent and discriminant validity across measurement occasion and type of observer. Journal of Personality, 72, 845–876.
Bleidorn, W., Hopwood, C. J., Ackerman, R. A., Witt, E. A., Kandler, C., Riemann, R., Samuel, D. B., & Donnellan, M. B. (2020). The healthy personality from a basic trait perspective. Journal of Personality and Social Psychology, 118, 1207. https://doi.org/10.1037/pspp0000231
Bleidorn, W., Hopwood, C. J., Back, M. D., Denissen, J. J. A., Hennecke, M., Jokela, M., Kandler, C., Lucas, R. E., Luhmann, M., Orth, U., Roberts, B. W., Wagner, J., Wrzus, C., & Zimmermann, J. (2020). Longitudinal Experience-Wide Association Studies—A Framework for Studying Personality Change. European Journal of Personality, 34, 285–300. https://doi.org/10.1002/per.2247
Bleidorn, W., Klimstra, T. A., Denissen, J. J. A., Rentfrow, P. J., Potter, J., & Gosling, S. D. (2013). Personality Maturation Around the World: A Cross-Cultural Examination of Social-Investment Theory. Psychological Science, 24, 2530–2540. https://doi.org/10.1177/0956797613498396
Block, J. (1995). A contrarian view of the Five-Factor Approach to personality description. Psychological Bulletin, 117, 187–215.
Block, J. H., Block, J., & Gjerde, P. F. (1986). The Personality of Children Prior to Divorce: A Prospective Study. Child Development, 57, 827–840. https://doi.org/10.2307/1130360
Block, J. H., Gjerde, P. F., & Block, J. H. (1991). Personality antecedents of depressive tendencies in 18-year-olds: A prospective study. Journal of Personality and Social Psychology, 60, 726–738. https://doi.org/10.1037/0022-3514.60.5.726
Borsboom, D., Cramer, A., & Kalis, A. (2019). Brain disorders? Not really...: Why network structures block reductionism in psychopathology research. Behavioral and Brain Sciences, 42, 1–11.
Bouchard, T. J. (2016). Experience producing drive theory: Personality “writ large.” Personality and Individual Differences, 90, 302–314. https://doi.org/10.1016/j.paid.2015.11.007
Breil, S. M., Geukes, K., Wilson, R. E., Nestler, S., Vazire, S., & Back, M. D. (2019). Zooming into Real-Life Extraversion – how Personality and Situation Shape Sociability in Social Interactions. Collabra: Psychology, 5, 7.
Briley, D. A., & Tucker-Drob, E. M. (2014). Genetic and environmental continuity in personality development: A meta-analysis. Psychological Bulletin, 140, 1303–1331. https://doi.org/10.1037/a0037091
Briley, D. A., Livengood, J., & Derringer, J. (2018). Behaviour Genetic Frameworks of Causal Reasoning for Personality Psychology. European Journal of Personality, 32, 202–220. https://doi.org/10.1002/per.2153
Bulik-Sullivan, B., Finucane, H. K., Anttila, V., Gusev, A., Day, F. R., Loh, P.-R., ReproGen Consortium, Psychiatric Genomics Consortium, Genetic Consortium for Anorexia Nervosa of the Wellcome Trust Case Control Consortium 3, Duncan, L., Perry, J. R. B., Patterson, N., Robinson, E. B., Daly, M. J., Price, A. L., & Neale, B. M. (2015). An atlas of genetic correlations across human diseases and traits. Nature Genetics, 47, 1236–1241. https://doi.org/10.1038/ng.3406
Buss, D. M. (1987). Selection, evocation, and manipulation. Journal of Personality and Social Psychology, 53, 1214–1221.
Buss, D. M., & Craik, K. H. (1983). The act frequency approach to personality. Psychological Review, 90, 105–126. https://doi.org/10.1037/0033-295X.90.2.105
Caspi, A., & Moffitt, T. E. (1993). When Do Individual Differences Matter? A Paradoxical Theory of Personality Coherence. Psychological Inquiry, 4, 247–271. https://doi.org/10.1207/s15327965pli0404_1
Caspi, A., & Roberts, B. W. (2001). Personality Development across the Life Course: The Argument for Change and Continuity. Psychological Inquiry, 12, 49–66. https://doi.org/10.2307/1449487
Chabris, C. F., Lee, J. J., Cesarini, D., Benjamin, D. J., & Laibson, D. I. (2015). The Fourth Law of Behavior Genetics. Current Directions in Psychological Science, 24, 304–312. https://doi.org/10.1177/0963721415580430
Chopik, W. J., Oh, J., Kim, E. S., Schwaba, T., Krämer, M. D., Richter, D., & Smith, J. (2020). Changes in optimism and pessimism in response to life events: Evidence from three large panel studies. Journal of Research in Personality, 88, 103985. https://doi.org/10.1016/j.jrp.2020.103985
Christensen, A. P., Golino, H., & Silvia, P. J. (2020). A Psychometric Network Perspective on the Validity and Validation of Personality Trait Questionnaires. European Journal of Personality. https://doi.org/10.1002/per.2265
Condon, D. M. (2018). The SAPA Personality Inventory: An empirically-derived, hierarchically-organized self-report personality assessment model. https://doi.org/10.31234/osf.io/sc4p9
Condon, D. M., Roney, E., & Revelle, W. (2017). A SAPA Project Update: On the Structure of phrased Self-Report Personality Items. Journal of Open Psychology Data, 5, 3. http://doi.org/10.5334/jopd.32
Connelly, B. S., & Ones, D. S. (2010). An other perspective on personality: Meta-analytic integration of observers’ accuracy and predictive validity. Psychological Bulletin, 136, 1092–1122. https://doi.org/10.1037/a0021212
Cooper, A. B., Blake, A. B., Pauletti, R. E., Cooper, P. J., Sherman, R. A., & Lee, D. I. (2020). Personality Assessment Through the Situational and Behavioral Features of Instagram Photos. European Journal of Psychological Assessment. https://doi.org/10.1027/1015-5759/a000596
Costa, P. T., & McCrae, R. R. (1992). Revised NEO Personality Inventory (NEO PI-R) and NEO Five-Factor Inventory (NEO-FFI) professional manual. Psychological Assessment Resources.
Costa, P. T., McCrae, R. R., & Löckenhoff, C. E. (2019). Personality Across the Life Span. Annual Review of Psychology, 70, 423–448. https://doi.org/10.1146/annurev-psych-010418-103244
Costantini, G., Epskamp, S., Borsboom, D., Perugini, M., Mõttus, R., Waldorp, L. J., & Cramer, A. O. (2015). State of the aRt personality research: A tutorial on network analysis of personality data in R. Journal of Research in Personality, 54, 13–29. https://doi.org/10.1016/j.jrp.2014.07.003
Costantini, G., Saraulli, D., & Perugini, M. (2020). Uncovering the Motivational Core of Traits: The Case of Conscientiousness. European Journal of Personality. https://doi.org/10.1002/per.2237
Cramer, A. O. J., van der Sluis, S., Noordhof, A., Wichers, M., Geschwind, N., Aggen, S. H., Kendler, K. S., & Borsboom, D. (2012). Dimensions of Normal Personality as Networks in Search of Equilibrium: You Can’t Like Parties if You Don’t Like People. European Journal of Personality, 26, 414–431. https://doi.org/10.1002/per.1866
Cronbach, L. J., & Shavelson, R. J. (2004). My current thoughts on coefficient alpha and successor procedures. Educational and Psychological Measurement, 64, 391–418. https://doi.org/10.1177/0013164404266386
Danvers, A. F., Wundrack, R., & Mehl, M. (2020). Equilibria in Personality States: A Conceptual Primer for Dynamics in Personality States. European Journal of Personality. https://doi.org/10.1002/per.2239
Denissen, J. J. A., Luhmann, M., Chung, J. M., & Bleidorn, W. (2019). Transactions between life events and personality traits across the adult lifespan. Journal of Personality and Social Psychology, 116, 612–633. https://doi.org/10.1037/pspp0000196
Dennett, D. C. (2013). Intuition pumps and other tools for thinking. Oxford, England: Norton.
DeYoung, C. G. (2006). Higher-order factors of the Big Five in a multi-informant sample. Journal of Personality and Social Psychology, 91, 1138–1151. https://doi.org/10.1037/0022-3514.91.6.1138
DeYoung, C. G. (2015). Cybernetic Big Five Theory. Journal of Research in Personality, 56, 33–58. https://doi.org/10.1016/j.jrp.2014.07.004
DeYoung, C. G., Quilty, L. C., & Peterson, J. B. (2007). Between facets and domains: 10 aspects of the Big Five. Journal of Personality and Social Psychology, 93, 880–896. https://doi.org/10.1037/0022-3514.93.5.880
Dotterer, H. L., Beltz, A. M., Foster, K. T., Simms, L. J., & Wright, A. G. C. (in press). Personalized models of personality disorders: Using a temporal network method to understand symptomatology and daily functioning in a clinical sample. Psychological Medicine. https://psyarxiv.com/bnxkq/
Dreves, P. A., Blackhart, G. C., & McBee, M. T. (2020). Do behavioral measures of self-control assess construct-level variance? Journal of Research in Personality, 88, 104000. https://doi.org/10.1016/j.jrp.2020.104000
Eid, M., Nussbeck, F. W., Geiser, C., Cole, D. A., Gollwitzer, M., & Lischetzke, T. (2008). Structural equation modeling of multitrait-multimethod data: Different models for different types of methods. Psychological Methods, 13, 230–253.
Egloff, B., Schwerdtfeger, A., & Schmukle, S. C. (2005). Temporal Stability of the Implicit Association Test-Anxiety. Journal of Personality Assessment, 84, 82–88.
Elleman, L. G., McDougald, S. K., Condon, D. M., & Revelle, W. (2020). That takes the BISCUIT: A comparative study of predictive accuracy and parsimony of four statistical learning techniques in personality data, with data missingness conditions. European Journal of Psychological Assessment.
Epskamp, S., Waldorp, L. J., Mõttus, R., & Borsboom, D. (2018). The Gaussian Graphical Model in Cross-Sectional and Time-Series Data. Multivariate Behavioral Research, 53, 453–480. https://doi.org/10.1080/00273171.2018.1454823
Fisher, A. J., Medaglia, J. D., & Jeronimus, B. F. (2018). Lack of group-to-individual generalizability is a threat to human subjects research. Proceedings of the National Academy of Sciences, 115, E6106. https://doi.org/10.1073/pnas.1711978115
Funder, D. C. (1991). Global Traits: A Neo-Allportian Approach to Personality. Psychological Science, 2, 31–39. https://doi.org/10.1111/j.1467-9280.1991.tb00093.x
Funder, D. C., & Dobroth, K. M. (1987). Differences between traits: Properties associated with interjudge agreement. Journal of Personality and Social Psychology, 52, 409–418. https://doi.org/10.1037/0022-3514.52.2.409
Funder, D. C., & Sneed, C. D. (1993). Behavioral manifestations of personality: An ecological approach to judgmental accuracy. Journal of Personality and Social Psychology, 64, 479–490. https://doi.org/10.1037/0022-3514.64.3.479
Furr, R. M. (2009). Personality psychology as a truly behavioural science. European Journal of Personality, 23, 369–401. https://doi.org/10.1002/per.724
Geukes, K., Breil, S. M., Hutteman, R., Nestler, S., Küfner, A. C. P., & Back, M. D. (2019). Explaining the longitudinal interplay of personality and social relationships in the laboratory and in the field: The PILS and the CONNECT study. PLoS ONE, 14, e0210424.
Gniewosz, G., Ortner, T. M., & Scherndl, T. (2020). Personality in Action: Assessing Personality to Identify an ‘Ideal’ Conscientious Response Type with Two Different Behavioural Tasks. European Journal of Personality. https://doi.org/10.1002/per.2296
Goldberg, L. R. (1990). An alternative “description of personality”: The Big-Five factor structure. Journal of Personality and Social Psychology, 59, 1216–1229. https://doi.org/10.1037/0022-3514.59.6.1216
Goldberg, L. R. (1999). A broad-bandwidth, public domain, personality inventory measuring the lower-level facets of several five-factor models. In I. Mervielde, I. J. Deary, F. De Fruyt, & F. Ostendorf (Eds.), Personality Psychology in Europe (Vol. 7, pp. 7–28). Tilburg University Press.
Goldberg, L. R., & Saucier, G. (2016). ORI Technical Report. (Vol. 56 No. 1). Eugene, OR.
Gonzalez, O., MacKinnon, D. P., & Muniz, F. B. (2020). Extrinsic Convergent Validity Evidence to Prevent Jingle and Jangle Fallacies. Multivariate Behavioral Research. https://doi.org/10.1080/00273171.2019.1707061
Gosling, S. D., & Mason, W. (2015). Internet Research in Psychology. Annual Review of Psychology, 66, 877–902. https://doi.org/10.1146/annurev-psych-010814-015321
Greenwald, A. G., & Farnham, S. D. (2000). Using the Implicit Association Test to measure self-esteem and self-concept. Journal of Personality and Social Psychology, 79. https://doi.org/10.1037/0022-3514.79.6.1022
Grosz, M. P., Rohrer, J. M., & Thoemmes, F. (2020). The Taboo Against Explicit Causal Inference in Nonexperimental Psychology. Perspectives on Psychological Science. https://doi.org/10.1177/1745691620921521
Hall, A. N., & Matz, S. C. (2020). Targeting Item-level Nuances Leads to Small but Robust Improvements in Personality Prediction from Digital Footprints. European Journal of Personality. https://doi.org/10.1002/per.2253
Hang, Soto, Lee and Mõttus (under review). Social expectations and abilities to meet them as possible mechanisms of youth personality development.
Hasson, U., Nastase, S. A., & Goldstein, A. (2020). Direct Fit to Nature: An Evolutionary Perspective on Biological and Artificial Neural Networks. Neuron, 105, 416–434. https://doi.org/10.1016/j.neuron.2019.12.002
Henrich, J., Heine, S. J., & Norenzayan, A. (2010). The weirdest people in the world? Behavioral and Brain Sciences, 33, 61–83.
Henry, S., & Mõttus, R. (2020). Traits and Adaptations: A Theoretical Examination and New Empirical Evidence. European Journal of Personality, 34, 265–284. https://doi.org/10.1002/per.2248
Hilbig, B. E., Moshagen, M., & Zettler, I. (2016). Prediction consistency: A test of the equivalence assumption across different indicators of the same construct. European Journal of Personality, 30, 637–647. https://doi.org/10.1002/per.2085
Hofstadter, D. R. (2007). I am a strange loop. Basic Books.
Hopwood, C. J. (2018). Interpersonal Dynamics in Personality and Personality Disorders. European Journal of Personality, 32, 499–524. https://doi.org/10.1002/per.2155
Horstmann, K. T., & Ziegler, M. (2020). Assessing Personality States: What to Consider when Constructing Personality State Measures. European Journal of Personality. https://doi.org/10.1002/per.2266
Jacques-Hamilton, R., Sun, J., & Smillie, L. (2019). Costs and benefits of acting extraverted: A randomized controlled trial. Journal of Experimental Psychology: General, 148, 1538–1556.
Jackson, J. J., Walton, K. E., Harms, P. D., Bogg, T., Wood, D., Lodi-Smith, J., Edmonds, G. W., & Roberts, B. W. (2009). Not all Conscientiousness Scales Change Alike: A Multimethod, Multisample Study of Age Differences in the Facets of Conscientiousness. Journal of Personality and Social Psychology, 96, 446–459. https://doi.org/10.1037/a0014156
Jackson, J. J., Wood, D., Bogg, T., Walton, K. E., Harms, P. D., & Roberts, B. W. (2010). What do conscientious people do? Development and validation of the Behavioral Indicators of Conscientiousness (BIC). Journal of Research in Personality, 44, 501–511. https://doi.org/10.1016/j.jrp.2010.06.005
Jacobucci, R., & Grimm, K. J. (2020). Machine Learning and Psychological Research: The Unexplored Effect of Measurement. Perspectives on Psychological Science. https://doi.org/10.1177/1745691620902467
Jang, K. L., McCrae, R. R., Angleitner, A., Riemann, R., & Livesley, W. J. (1998). Heritability of facet-level traits in a cross-cultural twin sample: Support for a hierarchical model of personality. Journal of Personality and Social Psychology, 74, 1556–1565. https://doi.org/10.1037/0022-3514.74.6.1556
Johnston, T. D., & Edwards, L. (2002). Genes, interactions, and the development of behavior. Psychological Review, 109, 26–34.
Jonas, K. G., & Markon, K. E. (2016). A descriptivist approach to trait conceptualization and inference. Psychological Review, 123, 90–96.
Kandler, C., Zimmermann, J., & McAdams, D. P. (2014). Core and Surface Characteristics for the Description and Theory of Personality Differences and Development. European Journal of Personality, 28, 231–243. https://doi.org/10.1002/per.1952
Kirtley, O. J., Hiekkaranta, A. P., Kunkels, Y. K., Eisele, G., Verhoeven, D., Van Nierop, M., & Myin-Germeys, I. (2020). The Experience Sampling Method (ESM) Item Repository. https://doi.org/10.17605/OSF.IO/KG376
Koch, T., Schultze, M., Holtmann, J., Geiser, C., & Eid, M. (2017). A Multimethod Latent State-Trait Model for Structurally Different And Interchangeable Methods. Psychometrika, 82, 17–47.
Kööts-Ausmees, L., Kandler, K., McCrae, R. R., Realo, A., Allik, J., Borkenau, P., Hřebíčková, M., & Mõttus, R. (in preparation). Social Desirability and Age Differences in Personality Traits: A Multi-Rater, Multi-Sample Study.
Kosinski, M., Stillwell, D., & Graepel, T. (2013). Private traits and attributes are predictable from digital records of human behavior. Proceedings of the National Academy of Sciences, 110, 827–840. https://doi.org/10.1073/pnas.1218772110
Larsen, K. R., & Bong, C. H. (2016). A tool for addressing construct identity in literature reviews and meta-analyses. MIS Quarterly, 40, 123.
Lazarus, G., Sened, H., & Rafaeli, E. (2020). Subjectifying the Personality State: Theoretical Underpinnings and an Empirical Example. European
Journal of Personality. https://doi.org/10.1002/per.2278
LeCun, Y., Bengio, Y., & Hinton, G. (2015). Deep learning. Nature, 521, 436–444. https://doi.org/10.1038/nature14539
Lee, K., & Ashton, M. C. (2020). Sex differences in HEXACO personality characteristics across countries and ethnicities. Journal of Personality. https://doi.org/10.1111/jopy.12551
Lee, J. J., Wedow, R., Okbay, A., Kong, E., Maghzian, O., Zacher, M., Nguyen-Viet, T. A., Bowers, P., Sidorenko, J., Linnér, R. K., Fontana, M. A., Kundu, T., Lee, C., Li, H., Li, R., Royer, R., Timshel, P. N., Walters, R. K., Willoughby, E. A., … Cesarini, D. (2018). Gene discovery and polygenic prediction from a genome-wide association study of educational attainment in 1.1 million individuals. Nature Genetics, 50, 1112–1121. https://doi.org/10.1038/s41588-018-0147-3
Leising, D., Vogel, D., Waller, V., & Zimmermann, J. (2020). Correlations between person-descriptive items are predictable from the product of their mid-point-centered social desirability values. European Journal of Personality.
Lievens, F. (2017). Assessing Personality–Situation Interplay in Personnel Selection: Toward More Integration into Personality Research. European Journal of Personality, 31, 424–440. https://doi.org/10.1002/per.2111
Lo, M.-T., Hinds, D. A., Tung, J. Y., Franz, C., Fan, C.-C., Wang, Y., Smeland, O. B., Schork, A., Holland, D., Kauppi, K., Sanyal, N., Escott-Price, V., Smith, D. J., O’Donovan, M., Stefansson, H., Bjornsdottir, G., Thorgeirsson, T. E., Stefansson, K., McEvoy, L. K., … Chen, C.-H. (2017). Genome-wide analyses for personality traits identify six genomic loci and show correlations with psychiatric disorders. Nature Genetics, 49, 152–156. https://doi.org/10.1038/ng.3736
Lowman, G. H., Wood, D., Armstrong, B. F., Harms, P. D., & Watson, D. (2018). Estimating the reliability of emotion measures over very short intervals: The utility of within-session retest correlations. Emotion, 18, 896–901. https://doi.org/10.1037/emo0000370
Lucas, R. E., & Donnellan, M. B. (2009). Age differences in personality: Evidence from a nationally representative Australian sample. Developmental Psychology, 45, 1353–1363. https://doi.org/10.1037/a0013914
Lukaszewski, A. W., Lewis, D. M. G., Durkee, P. K., Sell, A. N., Sznycer, D., & Buss, D. M. (2020). An Adaptationist Framework for Personality Science. European Journal of Personality. https://doi.org/10.1002/per.2292
Lukaszewski, A. W., Simmons, Z. L., Anderson, C., & Roney, J. R. (2016). The Role of Physical Formidability in Human Social Status Allocation. Journal of Personality and Social Psychology, 110, 385–406. https://doi.org/10.1037/pspi0000042
Lunansky, G., Borkulo, C. van, & Borsboom, D. (2020). Personality, Resilience, and Psychopathology: A Model for the Interaction between Slow and Fast Network Processes in the Context of Mental Health. European Journal of Personality. https://doi.org/10.1002/per.2263
Mac Giolla, E., & Kajonius, P. J. (2019). Sex differences in personality are larger in gender equal countries: Replicating and extending a surprising finding. International Journal of Psychology, 54, 705–711. https://doi.org/10.1002/ijop.12529
McAdams, D. P. (1994). A Psychology of the Stranger. Psychological Inquiry, 5, 145–148. https://doi.org/10.1207/s15327965pli0502_12
MacCann, C., Duckworth, A. L., & Roberts, R. D. (2009). Empirical identification of the major facets of Conscientiousness. Learning and Individual Differences, 19, 451–458. https://doi.org/10.1016/j.lindif.2009.03.007
Magidson, J. F., Roberts, B. W., Collado-Rodriguez, A., & Lejuez, C. (2014). Theory-driven intervention for changing personality: Expectancy value theory, behavioral activation, and conscientiousness. Developmental Psychology, 50, 1442–1450. https://doi.org/10.1037/a0030583
Margolis, S., & Lyubomirsky, S. (2019). Experimental manipulation of extraverted and introverted behavior and its effects on well-being. Journal of Experimental Psychology: General. Advance online publication. https://doi.org/10.1037/xge0000668
Marouli, E., Graff, M., Medina-Gomez, C., Lo, K. S., Wood, A. R., Kjaer, T. R., Fine, R. S., Lu, Y., Schurmann, C., Highland, H. M., Rüeger, S., Thorleifsson, G., Justice, A. E., Lamparter, D., Stirrups, K. E., Turcot, V., Young, K. L., Winkler, T. W., Esko, T., … Lettre, G. (2017). Rare and low-frequency coding variants alter human adult height. Nature, 542, 186–190. https://doi.org/10.1038/nature21039
Markon, K. E., Krueger, R. F., & Watson, D. (2005). Delineating the Structure of Normal and Abnormal Personality: An Integrative Hierarchical Approach. Journal of Personality and Social Psychology, 88, 139–157. http://dx.doi.org/10.1037/0022-3514.88.1.139
Matz, S. C., Kosinski, M., Nave, G., & Stillwell, D. J. (2017). Psychological targeting as an effective approach to digital mass persuasion. Proceedings of the National Academy of Sciences, 114, 12714. https://doi.org/10.1073/pnas.1710966114
Mazza, G. L., Smyth, H. L., Bissett, P. G., Canning, J. R., Eisenberg, I. W., Enkavi, A. Z., Gonzalez, O., Kim, S. J., Metcalf, S. A., Muniz, F., III, W. E. P.,
Scherer, E. A., Valente, M. J., Xie, H., Poldrack, R. A., Marsch, L. A., & MacKinnon, D. P. (2020). Correlation Database of 60 Cross-Disciplinary Surveys and Cognitive Tasks Assessing Self-Regulation. Journal of Personality Assessment. https://doi.org/10.1080/00223891.2020.1732994
McAbee, S. T., & Connelly, B. S. (2016). A multi-rater framework for studying personality: The trait-reputation-identity model. Psychological Review, 123, 569–591.
McCrae, R. R. (2015). A More Nuanced View of Reliability: Specificity in the Trait Hierarchy. Personality and Social Psychology Review, 19, 97–112. https://doi.org/10.1177/1088868314541857
McCrae, R. R., De Bolle, M., Löckenhoff, C. E., & Terracciano, A. (in press). Lifespan trait development: Towards an adequate theory of personality. In J. F. Rauthmann (Ed.), Handbook of personality dynamics and processes. Amsterdam: Elsevier.
McCrae, R. R., & Costa Jr., P. T. (1996). Towards a new generation of personality theories: Theoretical contexts for the five-factor model. In J. S. Wiggins (Ed.), The five-factor model of personality: Theoretical perspectives (Vol. 51, pp. 51–87). Guilford Press.
McCrae, R. R., & John, O. P. (1992). An introduction to the Five-Factor Model and its applications. Journal of Personality, 60, 175–215. https://doi.org/10.1111/j.1467-6494.1992.tb00970.x
McCrae, R. R., & Mõttus, R. (2019). What Personality Scales Measure: A New Psychometrics and Its Implications for Theory and Assessment. Current Directions in Psychological Science, 28, 415–420. https://doi.org/10.1177/0963721419849559
McCrae, R. R., & Sutin, A. R. (2018). A Five-Factor Theory Perspective on Causal Analysis. European Journal of Personality, 32, 151–166. https://doi.org/10.1002/per.2134
McCrae, R. R., Mõttus, R., Hřebíčková, M., Realo, A., & Allik, J. (2019). Source method biases as implicit personality theory at the domain and facet levels. Journal of Personality, 87(4), 813–826. https://doi.org/10.1111/jopy.12435
McCrae, R. R., Terracciano, A., & 78 Members of the Personality Profiles of Cultures Project. (2005). Universal features of personality traits from the observer’s perspective: Data from 50 cultures. Journal of Personality and Social Psychology, 88, 547–561. https://doi.org/10.1037/0022-3514.88.3.547
Metcalfe, J., & Mischel, W. (1999). A hot/cool-system analysis of delay of gratification: Dynamics of willpower. Psychological Review, 106, 3–19.
Mischel, W., & Shoda, Y. (1995). A cognitive-affective system theory of personality: Reconceptualizing situations, dispositions, dynamics, and invariance in personality structure. Psychological Review, 102, 246–268. http://dx.doi.org/10.1037/0033-295X.102.2.246
Molenaar, P. C. M., & Campbell, C. G. (2009). The new person-specific paradigm in psychology. Current Directions in Psychological Science, 18, 112–117. https://doi.org/10.1111/j.1467-8721.2009.01619.x
Mõttus, R. (2016). Towards more rigorous personality trait-outcome research. European Journal of Personality, 30, 292–303. https://doi.org/10.1002/per.2041
Mõttus, R., Allerhand, M., & Johnson, W. (2020). Computational Modeling of Person-Situation Transactions: How Accumulation of Situational Experiences Can Shape the Distributions of Trait Scores. In D. C. Funder, R. A. Sherman, & J. F. Rauthmann (Eds.), Handbook of Psychological Situations (pp. xx–xx).
Mõttus, R., & Rozgonjuk, D. (2019). Development is in the details: Age differences in the Big Five domains, facets and nuances. Journal of Personality and Social Psychology. http://dx.doi.org/10.1037/pspp0000276
Mõttus, R., Allik, J., & Realo, A. (2020). Do Self-Reports and Informant-Ratings Measure the Same Personality Constructs? European Journal of Psychological Assessment, 36, 289–295. https://doi.org/10.1027/1015-5759/a000516
Mõttus, R., Bates, T. C., Condon, D. M., Mroczek, D., & Revelle, W. (2017). Leveraging a more nuanced view of personality: Narrow characteristics predict and explain variance in life outcomes. https://doi.org/10.31234/osf.io/4q9gv
Mõttus, R., Kandler, C., Bleidorn, W., Riemann, R., & McCrae, R. R. (2017). Personality traits below facets: The consensual validity, longitudinal stability, heritability, and utility of personality nuances. Journal of Personality and Social Psychology, 112, 474. https://doi.org/10.1037/pspp0000100
Mõttus, R., Realo, A., Allik, J., Esko, T., Metspalu, A., & Johnson, W. (2015). Within-trait heterogeneity in age group differences in personality domains and facets: Implications for the development and coherence of personality traits. PLoS ONE, 10, e0119667. https://doi.org/10.1371/journal.pone.0119667
Mõttus, R., Realo, A., Vainik, U., Allik, J., & Esko, T. (2017). Educational attainment and personality are genetically intertwined. Psychological Science, 28, 1631–1639. https://doi.org/10.1177/0956797617719083
Mõttus, R., Sinick, J., Terracciano, A., Hrebickova, M., Kandler, C., Ando, J., Mortensen, E. L., Colodro-Conde, L., & Jang, K. (2019). Personality characteristics below facets: A replication and meta-analysis of cross-rater agreement, rank-order stability, heritability and utility of personality nuances. Journal
of Personality and Social Psychology, 117, e35–e50. https://doi.org/10.1037/pspp0000202
Muck, P. M., Hell, B., & Höft, S. (2008). Application of the principles of Behaviorally Anchored Rating Scales to assess the Big Five personality constructs at work. In J. Deller (Ed.), Research contributions to personality at work (pp. 77–97). München, Germany: Rainer Hampp.
Nagel, M., Watanabe, K., Stringer, S., Posthuma, D., & Sluis, S. (2018). Item-level analyses reveal genetic heterogeneity in neuroticism. Nature Communications, 9, 905. https://doi.org/10.1038/s41467-018-03242-8
Nicholls, J. G., Licht, B. G., & Pearl, R. A. (1982). Some dangers of using personality questionnaires to study personality. Psychological Bulletin, 92, 572–580. https://doi.org/10.1037/0033-2909.92.3.572
Orben, A., & Lakens, D. (2020). Crud (Re)Defined. Advances in Methods and Practices in Psychological Science, 3, 238–247. https://doi.org/10.1177/2515245920917961
Østergaard, S. D., Jensen, S. O. W., & Bech, P. (2011). The heterogeneity of the depressive syndrome: When numbers get serious. Acta Psychiatrica Scandinavica, 124, 495–496. https://doi.org/10.1111/j.1600-0447.2011.01744.x
Ozer, D. J., & Benet-Martínez, V. (2006). Personality and the prediction of consequential outcomes. Annual Review of Psychology, 57, 401–421. https://doi.org/10.1146/annurev.psych.57.102904.190127
Pasupathi, M., Fivush, R., Greenhoot, A. F., & McLean, K. C. (2020). Intraindividual Variability in Narrative Identity: Complexities, Garden Paths, and Untapped Research Potential. European Journal of Personality. https://doi.org/10.1002/per.2279
Paunonen, S. V., & Ashton, M. C. (2001). Big Five factors and facets and the prediction of behavior. Journal of Personality and Social Psychology, 81, 524–539. https://doi.org/10.1037/0022-3514.81.3.524
Paunonen, S. V., & Jackson, D. N. (2000). What is beyond the big five? Plenty! Journal of Personality, 68, 821–835. https://doi.org/10.1111/1467-6494.00117
Pearl, J. (2018). The Book of Why: The New Science of Cause and Effect. New York: Basic Books.
Plomin, R., & von Stumm, S. (2018). The new genetics of intelligence. Nature Reviews Genetics, 19, 148–159. https://doi.org/10.1038/nrg.2017.104
Quirin, M., Robinson, M. D., Rauthmann, J. F., Kuhl, J., Read, S. J., Tops, M., & DeYoung, C. G. (2020). The Dynamics of Personality Approach (DPA): 20 Tenets for Uncovering the Causal Mechanisms of Personality. European Journal of Personality. https://doi.org/10.1002/per.2295
Rauthmann, J. (in press). A (More) Behavioral Science of Personality in the Age of Multi-Modal Sensing, Big Data, Machine Learning, and Artificial Intelligence. European Journal of Personality.
Read, S. J., Monroe, B. M., Brownstein, A. L., Yang, Y., Chopra, G., & Miller, L. C. (2010). A neural network model of the structure and dynamics of human personality. Psychological Review, 117, 61–92.
Revelle, W., & Condon, D. M. (2015). A model for personality at three levels. Journal of Research in Personality, 56, 70–81.
Revelle, W. (2020). psych: Procedures for Personality and Psychological Research (Version 2.0.9). Northwestern University. http://CRAN.R-project.org/package=psych
Revelle, W., Condon, D. M., Wilt, J., French, J. A., Brown, A., & Elleman, L. G. (2016). Web and phone based data collection using planned missing designs. Sage handbook of online research methods (2nd ed., pp. 578–595). Sage Publications, Inc.
Revelle, W., Dworak, E. M., & Condon, D. M. (2020). Exploring the persome: The power of the item in understanding personality structure. Personality and Individual Differences, 109905.
Roberts, B. W., & Nickel, L. B. (2017). A critical evaluation of the Neo-Socioanalytic Model of personality. In J. Specht (Ed.), Personality Development Across the Lifespan (pp. 157–177). Academic Press. https://doi.org/10.1016/B978-0-12-804674-6.00011-9
Roberts, B. W., Chernyshenko, O. S., Stark, S., & Goldberg, L. R. (2005). The Structure of Conscientiousness: An Empirical Investigation Based on Seven Major Personality Questionnaires. Personnel Psychology, 58, 103–139. https://doi.org/10.1111/j.1744-6570.2005.00301.x
Roberts, B. W., Kuncel, N. R., Shiner, R., Caspi, A., & Goldberg, L. R. (2007). The power of personality: The comparative validity of personality traits, socioeconomic status, and cognitive ability for predicting important life outcomes. Perspectives on Psychological Science, 2, 313–345. https://doi.org/10.1111/j.1745-6916.2007.00047.x
Rohrer, J. M. (2018). Thinking clearly about correlations and causation: Graphical causal models for observational data. Advances in Methods and Practices in Psychological Science, 1, 27–42. https://doi.org/10.1177/2515245917745629
Rosenbusch, H., Wanders, F., & Pit, I. L. (2020). The Semantic Scale Network: An online tool to detect semantic overlap of psychological scales and prevent scale redundancies. Psychological Methods, 25, 380–392. http://dx.doi.org/10.1037/met0000244
Salganik, M. J., Lundberg, I., Kindel, A. T., Ahearn, C. E., Al-Ghoneim, K., Almaatouq, A., Altschul, D. M., Brand, J. E., Carnegie, N. B., Compton, R. J., Datta,
D., Davidson, T., Filippova, A., Gilroy, C., Goode, B. J., Jahani, E., Kashyap, R., Kirchner, A., McKay, S., … McLanahan, S. (2020). Measuring the predictability of life outcomes with a scientific mass collaboration. Proceedings of the National Academy of Sciences, 117, 8398. https://doi.org/10.1073/pnas.1915006117
Saucier, G. (1997). Effects of variable selection on the factor structure of person descriptors. Journal of Personality and Social Psychology, 73, 1296–1312. https://doi.org/10.1037/0022-3514.73.6.1296
Saucier, G., & Iurino, K. (2019). High-dimensionality personality structure in the natural language: Further analyses of classic sets of English-language trait-adjectives. Journal of Personality and Social Psychology. https://doi.org/10.1037/pspp0000273
Saucier, G., Iurino, K., & Thalmayer, A. G. (2020). Comparing predictive validity in a community sample: High-dimensionality and traditional domain-and-facet structures of personality variation. European Journal of Personality. https://doi.org/10.1002/per.2235
Scarr, S., & McCartney, K. (1983). How people make their own environments: A theory of genotype → environment effects. Child Development, 424–435. https://doi.org/10.2307/1129703
Schimmack, U. (2020). The Implicit Association Test: A Method in Search of a Construct. Perspectives on Psychological Science. https://doi.org/10.1177/1745691619863798
Schmeichel, B. J., & Vohs, K. (2009). Self-affirmation and self-control: Affirming core values counteracts ego depletion. Journal of Personality and Social Psychology, 96, 770–782. https://doi.org/10.1037/a0014635
Schmid, M. M., Gatica-Perez, D., Frauendorfer, D., Nguyen, L., & Choudhury, T. (2015). Social sensing for psychology: Automated interpersonal behavior assessment. Current Directions in Psychological Science, 24, 154–160.
Schmitt, D. P., Allik, J., McCrae, R. R., & Benet-Martinez, V. (2007). The geographic distribution of big five personality traits: Patterns and profiles of human self-description across 56 nations. Journal of Cross-Cultural Psychology, 38, 173–212. https://doi.org/10.1177/0022022106297299
Schmitt, D. P., Realo, A., Voracek, M., & Allik, J. (2008). Why can’t a man be more like a woman? Sex differences in big five personality traits across 55 cultures. Journal of Personality and Social Psychology, 94, 168–182. http://dx.doi.org/10.1037/0022-3514.94.1.168
Seeboth, A., & Mõttus, R. (2018). Successful Explanations Start with Accurate Descriptions: Questionnaire Items as Personality Markers for More Accurate Predictions. European Journal of Personality, 32, 186–201. https://doi.org/10.1002/per.2147
Smaldino, P. E., Lukaszewski, A., von Rueden, C., & Gurven, M. (2019). Niche diversity can explain cross-cultural differences in personality structure. Nature Human Behaviour, 3, 1276–1283. https://doi.org/10.1038/s41562-019-0730-3
Sosnowska, J., Kuppens, P., Fruyt, F. D., & Hofmans, J. (2020). New Directions in the Conceptualization and Assessment of Personality: A Dynamic Systems Approach. European Journal of Personality. https://doi.org/10.1002/per.2233
Soto, C. J. (2019). How Replicable Are Links Between Personality Traits and Consequential Life Outcomes? The Life Outcomes of Personality Replication Project. Psychological Science, 30, 711–727. https://doi.org/10.1177/0956797619831612
Soubelet, A., & Salthouse, T. A. (2011). Influence of Social Desirability on Age Differences in Self-Reports of Mood and Personality. Journal of Personality, 79, 741–762. https://doi.org/10.1111/j.1467-6494.2011.00700.x
Spadaro, G., Tiddi, I., Columbus, S., Jin, S., Teije, A. t., & Balliet, D. (2020). The Cooperation Databank. https://doi.org/10.31234/osf.io/rveh3
Spearman, C. (1927). The abilities of man. Macmillan.
Sperry, R. W. (1966). Mind, brain, and humanist values. Bulletin of the Atomic Scientists, 22, 26. https://doi.org/10.1080/00963402.1966.11454956
Stachl, C., Au, Q., Schoedel, R., Gosling, S. D., Harari, G. M., Buschek, D., Völkel, S. T., Schuwerk, T., Oldemeier, M., Ullmann, T., Hussmann, H., Bischl, B., & Bühner, M. (2020). Predicting personality from patterns of behavior collected with smartphones. Proceedings of the National Academy of Sciences, 117, 17680–17687.
Stachl, C., Pargent, F., Hilbert, S., Harari, G. M., Schoedel, R., Vaid, S., Gosling, S. D., & Bühner, M. (2020). Personality Research and Assessment in the Era of Machine Learning. European Journal of Personality. https://doi.org/10.1002/per.2257
Surgeon General's Report (2004). The Health Consequences of Smoking. Retrieved from https://www.cdc.gov/tobacco/data_statistics/sgr/2004 on 14th October 2020.
Riemann, R., & Kandler, C. (2010). Construct validation using multitrait-multimethod-twin data: The case of a general factor of personality. European Journal of Personality, 24, 258–277.
Stieger, M., Wepfer, S., Regger, D., Kowatsch, T., Roberts, B. W., & Allemand, M. (2020). Becoming More Conscientious or More Open to Experience? Effects of a Two-Week Smartphone-Based Intervention for Personality Change. European
Journal of Personality. https://doi.org/10.1002/per.2267
Tay, L., Woo, S. E., Hickman, L., & Saef, R. M. (2020). Psychometric and Validity Issues in Machine Learning Approaches to Personality Assessment: A Focus on Social Media Text Mining. European Journal of Personality. https://doi.org/10.1002/per.2290
Terracciano, A., Costa, P. T., & McCrae, R. R. (2006). Personality Plasticity After Age 30. Personality & Social Psychology Bulletin, 32, 999–1009. https://doi.org/10.1177/0146167206288599
Terracciano, A., McCrae, R. R., Brant, L. J., & Costa, P. T., Jr. (2005). Hierarchical linear modeling analyses of the NEO-PI-R scales in the Baltimore Longitudinal Study of Aging. Psychology and Aging, 20, 493–506. https://doi.org/10.1037/0882-7974.20.3.493
Thielmann, I., & Hilbig, B. E. (2019). Nomological consistency: A comprehensive test of the equivalence of different trait indicators for the same constructs. Journal of Personality, 87, 715–730. https://doi.org/10.1111/jopy.12428
Turkheimer, E., Pettersson, E., & Horn, E. E. (2014). A phenotypic null hypothesis for the genetics of personality. Annual Review of Psychology, 65, 515–540. https://doi.org/10.1146/annurev-psych-113011-143752
Vachon, D. D., Lynam, D. R., Widiger, T. A., Miller, J. D., McCrae, R. R., & Costa, P. T. (2013). Basic Traits Predict the Prevalence of Personality Disorder Across the Life Span: The Example of Psychopathy. Psychological Science, 24, 698–705. https://doi.org/10.1177/0956797612460249
Vainik, U., Misic, B., Zeighami, Y., Michaud, A., Mõttus, R., & Dagher, A. (2019). Obesity has limited behavioural overlap with addiction and psychiatric phenotypes. Nature Human Behaviour. https://doi.org/10.1038/s41562-019-0752-x
Vainik, U., Mõttus, R., Allik, J., Esko, T., & Realo, A. (2015). Are trait-outcome associations caused by scales or particular items? Example analysis of personality facets and BMI. European Journal of Personality, 29, 622–634. https://doi.org/10.1002/per.2009
Vainik, U., Dagher, A., Realo, A., Colodro-Conde, L., Mortensen, E. L., Jang, K., Juko, A., Kandler, C., Sørensen, T. I. A., & Mõttus, R. (2019). Personality-obesity associations are driven by narrow traits: A meta-analysis. Obesity Reviews, 20, 1121–1131. https://doi.org/10.1111/obr.12856
van der Maas, H. L. J., Dolan, C. V., Grasman, R. P. P. P., Wicherts, J. M., Huizenga, H. M., & Raijmakers, M. E. J. (2006). A dynamical model of general intelligence: The positive manifold of intelligence by mutualism. Psychological Review, 113, 842–861. https://doi.org/10.1037/0033-295X.113.4.842
Vazire, S. (2006). Informant reports: A cheap, fast, and easy method for personality assessment. Journal of Research in Personality, 40, 472–481. https://doi.org/10.1016/j.jrp.2005.03.003
Vazire, S. (2010). Who knows what about a person? The self-other knowledge asymmetry (SOKA) model. Journal of Personality and Social Psychology, 98, 281–300. https://doi.org/10.1037/a0017908
Wendt, L. P., Wright, A. G. C., Pilkonis, P. A., Woods, W. C., Denissen, J. J. A., Kühnel, A., & Zimmermann, J. (2020). Indicators of Affect Dynamics: Structure, Reliability, and Personality Correlates. European Journal of Personality. https://doi.org/10.1002/per.2277
Wessels, N. M., Zimmermann, J., Biesanz, J. C., & Leising, D. (2020). Differential associations of knowing and liking with accuracy and positivity bias in person perception. Journal of Personality and Social Psychology, 118, 149–171. https://doi.org/10.1037/pspp0000218
Wessels, N. M., Zimmermann, J., & Leising, D. (2020). Who Knows Best What the Next Year Will Hold for You? The Validity of Direct and Personality-based Predictions of Future Life Experiences Across Different Perceivers. European Journal of Personality. https://doi.org/10.1002/per.2293
Weston, S. J., Gladstone, J. J., Graham, E. K., Mroczek, D. K., & Condon, D. M. (2019). Who are the scrooges? Personality predictors of holiday spending. Social Psychological and Personality Science, 10, 775–782.
Wiernik, B. M., Ones, D. S., Marlin, B. M., Giordano, C., Dilchert, S., Mercado, B. K., Stanek, K. C., Birkland, A., Wang, Y., Ellis, B., Yazar, Y., Kostal, J. W., Kumar, S., Hnat, T., Ertin, E., Sano, A., Ganesan, D. K., Choudhoury, T., & al’Absi, M. (2020). Using Mobile Sensors to Study Personality Dynamics. European Journal of Psychological Assessment. https://doi.org/10.1027/1015-5759/a000576
Wilt, J., & Revelle, W. (2015). Affect, Behaviour, Cognition and Desire in the Big Five: An Analysis of Item Content and Structure. European Journal of Personality, 29, 478–497. https://doi.org/10.1002/per.2002
Wood, A. R., Esko, T., Yang, J., Vedantam, S., Pers, T. H., Gustafsson, S., Chu, A. Y., Estrada, K., Luan, J., Kutalik, Z., Amin, N., Buchkovich, M. L., Croteau-Chonka, D. C., Day, F. R., Duan, Y., Fall, T., Fehrmann, R., Ferreira, T., Jackson, A. U., … Frayling, T. M. (2014). Defining the role of common variation in the genomic and biological architecture of adult human height. Nature Genetics, 46, 1173–1186. https://doi.org/10.1038/ng.3097
Wood, D., & Brumbaugh, C. C. (2009). Using revealed mate preferences to evaluate market force and differential preference explanations for mate selection. Journal of Personality and Social Psychology, 96, 1226–1244.
Wood, D., Gardner, M. H., & Harms, P. D. (2015). How functionalist and process approaches to behavior can explain trait covariation. Psychological Review, 122, 84–111.
Wood, D., Nye, C. D., & Saucier, G. (2010). Identification and measurement of a more comprehensive set of person-descriptive trait markers from the English lexicon. Journal of Research in Personality, 44, 258–272. https://doi.org/10.1016/j.jrp.2010.02.003
Wood, D., Spain, S. M., Monroe, B. M., & Harms, P. D. (in press). Using functional fields to represent accounts of the psychological processes that produce actions. In J. F. Rauthmann (Ed.), Handbook of Personality Dynamics and Processes. San Diego, CA: Academic Press.
Wood, D., & Wortman, J. (2012). Trait Means and Desirabilities as Artifactual and Real Sources of Differential Stability of Personality Traits. Journal of Personality, 80, 665–701. https://doi.org/10.1111/j.1467-6494.2011.00740.x
Woods, W. C., Arizmendi, C., Gates, K. M., Stepp, S. D., Pilkonis, P. A., & Wright, A. G. C. (2020). Personalized models of psychopathology as contextualized dynamic processes: An example from individuals with borderline personality disorder. Journal of Consulting and Clinical Psychology, 88, 240–254. https://psyarxiv.com/amdu8/
Wright, A. G. C., Gates, K. M., Arizmendi, C., Lane, S. T., Woods, W. C., & Edershile, E. A. (2019). Focusing personality assessment on the person: Modeling general, shared, and person specific processes in personality and psychopathology. Psychological Assessment, 32, 502–515. https://osf.io/nf5me/
Wright, A. G., Creswell, K. G., Flory, J. D., Muldoon, M. F., & Manuck, S. B. (2019). Neurobiological functioning and the personality-trait hierarchy: Central serotonergic responsivity and the stability metatrait. Psychological Science, 30, 1413–1423.
Wright, A. G. C., & Zimmermann, J. (2019). Applied ambulatory assessment: Integrating idiographic and nomothetic principles of measurement. Psychological Assessment, 31, 1467–1480. https://psyarxiv.com/6qc5x/
Wrzus, C., & Mehl, M. (2015). Lab and/or field? Measuring personality processes and their social consequences. European Journal of Personality, 29, 250–271.
Yarkoni, T. (2010). The abbreviation of personality, or how to measure 200 personality scales with 200 items. Journal of Research in Personality, 44, 180–198. https://doi.org/10.1016/j.jrp.2010.01.002
Yarkoni, T. (2020). Implicit realism impedes progress in psychology: Comment on Fried (2020). https://doi.org/10.31234/osf.io/xj5uq
Yarkoni, T., & Westfall, J. (2017). Choosing Prediction Over Explanation in Psychology: Lessons From Machine Learning. Perspectives on Psychological Science, 12, 1100–1122. https://doi.org/10.1177/1745691617693393
Zheng, et al. (2017). LD Hub: a centralized database and web interface to perform LD score regression that maximizes the potential of summary level GWAS data for SNP heritability and genetic correlation analysis. Bioinformatics, 33, 272–279.
Ziegler, M., Horstmann, K. T., & Ziegler, J. (2019). Personality in situations: Going beyond the OCEAN and introducing the Situation Five. Psychological Assessment, 31, 567–580. https://doi.org/10.1037/pas0000654
Zimmermann, J., Woods, W. C., Ritter, S., Happel, M., Masuhr, O., Jaeger, U., Spitzer, C., & Wright, A. G. C. (2019). Integrating structure and dynamics in personality assessment: First steps toward the development and validation of a personality dynamics diary. Psychological Assessment, 31, 516–531. https://doi.org/10.1037/pas0000625