Corpus Collections

Welcome to The WAC Corpus Collection

The WAC Corpus Collection is a curated repository of academic and professional writing corpora gathered from a range of institutions, disciplines, and instructional contexts. Launched as an initiative of the WAC Clearinghouse Associate Publishers New Scholar Fellowship (2025-26) and housed within the WAC Clearinghouse, this collection is designed to increase access to large text datasets for researchers, teachers, and students interested in writing studies and writing analytics.

The corpora in this collection come from many institutions and represent a variety of writing contexts. Some corpora (such as the University of South Carolina First-Year English corpus) are stored directly on this site, while others (such as the Michigan Corpus of Upper-Level Student Papers (MICUSP) and the British Academic Written English (BAWE) Corpus) are hosted externally and linked here for ease of access.

Each corpus page provides:

  • A description of how the corpus was collected and its scope
  • Information about how to access and download the corpus 
  • Tutorials on exploring the corpus with computational tools
  • References to past research conducted with the corpus 
  • Suggestions for future research that can be done with the corpus

Purpose and Design

The WAC Corpus Collection makes writing corpora more visible, accessible, and pedagogically valuable. By centralizing these resources, the project:

  • Enables new inquiries into how writing varies across disciplines, genres, and contexts
  • Reduces barriers to accessing and analyzing corpora at scale
  • Supports instructors in integrating authentic writing data into their teaching

Developed through the WAC Clearinghouse Associate Publishers New Scholar Fellowship, this collection advances the initiative’s mission to expand shared research infrastructure for writing studies.

Interested in Contributing a Corpus?

If you are interested in contributing a corpus to this collection, please contact Megan Kane (Assistant Editor, The Journal of Writing Analytics) at megan.kane@shu.edu, Duncan Buell (Co-Editor in Chief, The Journal of Writing Analytics) at duncan.buell@gmail.com, and Laura Aull (Research Leave 2025-2026) at laull@umich.edu. We welcome corpora representing a variety of genres and disciplines.

Join this Project

If you would like to serve as guest editor for a special issue, or if you would like to suggest a topic for a special issue, please contact Michael J. Cripps, Editor, at mcripps@une.edu or 207-602-2908.