Data

Wordbank Book

Many analyses of Wordbank data are described in detail in the Wordbank Book, available at wordbank-book.stanford.edu and from MIT Press.

Accessing data from Wordbank

  • Interactive apps

    The interactive apps support analyses of normative growth in vocabulary size and of individual item trajectories. They also let you download several forms of tabular data. These apps cover some of the most common use cases for Wordbank data, but not all of them.

  • Programmatic access

    In addition to these apps, you can access Wordbank data using the wordbankr R package. There is a data access tutorial for wordbankr available here. The R package is the most flexible way to work with data from the database.

  • Wordbank Book

    If you are interested in item- and group-related analyses of Wordbank data that are not available through the Shiny apps, it may help to find the relevant analysis in the Wordbank Book and look at the code for that analysis.

  • Previous versions

    Periodic snapshots of the Wordbank database are available for download. For instructions on how to access them, see here.

About Wordbank infrastructure

Unilemmas

Wordbank includes mappings between words in different languages using hand-checked translation equivalents called "unilemmas" (universal lemmas). These allow comparison of similar conceptual material across different languages. Translation equivalence is a very tricky concept! Consider how the Spanish word "reloj" should be translated into English: is it "clock", "watch", or both? We have an extensive policy that has guided our choices, available here. You can also see and contribute to our unilemma codebase.
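
To make the idea of cross-language comparison concrete, here is a minimal Python sketch of how items might be grouped by unilemma. Everything in it is an assumption made for illustration: the rows, the function names group_by_unilemma and shared_unilemmas, and in particular the assignment of "reloj" to "clock" are hypothetical and do not reflect the actual Wordbank data or the decisions made under the unilemma policy.

```python
from collections import defaultdict

# Hypothetical (language, item, unilemma) rows. These are illustrative only
# and are NOT taken from the Wordbank database.
ROWS = [
    ("Spanish", "reloj", "clock"),  # assumed assignment, for illustration
    ("English", "clock", "clock"),
    ("English", "watch", "watch"),
    ("Spanish", "perro", "dog"),
    ("English", "dog", "dog"),
]


def group_by_unilemma(rows):
    """Return {unilemma: {language: [items]}} built from the input rows."""
    groups = defaultdict(lambda: defaultdict(list))
    for language, item, unilemma in rows:
        groups[unilemma][language].append(item)
    # Convert nested defaultdicts to plain dicts for easier inspection.
    return {u: dict(langs) for u, langs in groups.items()}


def shared_unilemmas(groups, languages):
    """Return the sorted unilemmas that have an item in every given language."""
    wanted = set(languages)
    return sorted(u for u, langs in groups.items() if wanted <= set(langs))


groups = group_by_unilemma(ROWS)
comparable = shared_unilemmas(groups, ["English", "Spanish"])
print(comparable)
```

In this toy data, "watch" has no Spanish counterpart, so it drops out of the cross-language comparison. That is the kind of consequence a translation-equivalence choice can have.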

Teaching

Wordbank can be a useful teaching tool. For some examples, see this in-class exercise and accompanying assignment created by Caitlin Fausey, as well as the materials for a workshop taught by Michael Frank and Mika Braginsky.