The “Narratives” fMRI dataset for evaluating models of naturalistic language comprehension

Samuel A. Nastase, Yun Fei Liu, Hanna Hillman, Asieh Zadbood, Liat Hasenfratz, Neggin Keshavarzian, Janice Chen, Christopher J. Honey, Yaara Yeshurun, Mor Regev, Mai Nguyen, Claire H.C. Chang, Christopher Baldassano, Olga Lositsky, Erez Simony, Michael A. Chow, Yuan Chang Leong, Paula P. Brooks, Emily Micciche, Gina ChoeAriel Goldstein, Tamara Vanderwal, Yaroslav O. Halchenko, Kenneth A. Norman, Uri Hasson

Research output: Contribution to journalArticlepeer-review

35 Citations (Scopus)


The “Narratives” collection aggregates a variety of functional MRI datasets collected while human subjects listened to naturalistic spoken stories. The current release includes 345 subjects, 891 functional scans, and 27 diverse stories of varying duration totaling ~4.6 hours of unique stimuli (~43,000 words). This data collection is well-suited for naturalistic neuroimaging analysis, and is intended to serve as a benchmark for models of language and narrative comprehension. We provide standardized MRI data accompanied by rich metadata, preprocessed versions of the data ready for immediate use, and the spoken story stimuli with time-stamped phoneme- and word-level transcripts. All code and data are publicly available with full provenance in keeping with current best practices in transparent and reproducible neuroimaging.

Original languageEnglish
Article number250
JournalScientific data
Issue number1
Publication statusPublished - Dec 2021

ASJC Scopus subject areas

  • Statistics and Probability
  • Information Systems
  • Education
  • Computer Science Applications
  • Statistics, Probability and Uncertainty
  • Library and Information Sciences


Dive into the research topics of 'The “Narratives” fMRI dataset for evaluating models of naturalistic language comprehension'. Together they form a unique fingerprint.

Cite this