Writing Data Stories: Integrating Computational Data Investigations into the Middle School Science Classroom


Preparing today’s students to work with data fluently is critical to ensuring a scientifically literate and empowered citizenry. But most efforts to incorporate data analysis into the K-12 curriculum are limited to short, isolated activities or basic skills development, or are introduced as new courses devoted specifically to data and computing, which limits both the potential benefits and the audience for such efforts. The Data Stories project (2019-2022) brings together a team of researchers from University of California Berkeley, NC State University, and The Concord Consortium to integrate computational data analysis into the middle school science curriculum in a longitudinal, interdisciplinary way – drawing from the computer and data sciences, literacy studies, statistics, and science education to saturate the classroom with relevant tools, resources, and support.


Middle school classrooms in the San Francisco, CA, area will analyze and draw conclusions about publicly available scientific datasets using a free, innovative, computational data analysis platform called the Common Online Data Analysis Platform (CODAP). Units will be designed specifically with Dual Language Learners (DLL) in mind, inviting students to share their investigations by writing multimodal texts that blend both familiar and academic modes of expression to explain and contextualize their data analysis processes. Units that build on one another in difficulty and complexity will be introduced throughout the academic year, and participating teachers will receive significant training opportunities. Overall, the project is anticipated to directly impact approximately 2,500 students and 20 teachers in the greater San Francisco Bay Area, from predominantly high-needs schools.

The project will provide a research context to address the following questions:

  • How do students learn, over time, to use computational tools to structure, calculate, filter, and transform data for scientific inquiry?
  • What patterns of engagement in scientific practices are supported by the integration of computational data analysis and visualizations into the science curriculum?
  • What new literacy practices might support DLL and learners with limited access to technology or who are still developing academic literacy in constructing oral and written arguments and explanations using data and visualizations as evidence?

Specifically, the project brings together and seeks to extend three complementary research constructs. Data moves are the computational actions analysts take to transform and analyze datasets. Syncretic texts are specialized texts that blend academic discourse, such as the formulae and statistical language needed to explain data analysis, with familiar modes of expression (including home languages and alternative forms of expression such as video, art, and animation); this blending has been found to invite a wide range of marginalized students to experiment with and develop academic language. Finally, a community of learners approach allows different student groups in a classroom to conduct and share the results of different investigations with the same dataset, making the exploration of large and complex data more feasible in everyday classrooms.

Classroom Ready Materials

The Writing Data Stories project team has developed two main sets of classroom-ready materials that can be used to support data explorations in middle school science (and mathematics or social studies) instruction.

SocioScientific Curriculum Units

Writing Data Stories Curriculum Units are bilingual (English/Spanish) collections of activities, each 3 to 4 weeks long. These guide students to explore and critique scientific datasets focused on nutrition, climate, and the environment. In the process, students question how datasets are constructed, by whom, and what questions can be answered by those datasets. As they critique and reflect on these datasets, they learn to use data transformation skills (e.g., adding, filtering, grouping data) to adapt and use the datasets to explore their own relevant guiding questions. For example, students might explore issues of nutrition and access for their own foods, or they may explore environmental trends and impacts related to climate in a region they care about. Students then create data stories that weave together their own personal and social histories with these scientific datasets. The data tool students use in this curriculum is CODAP, a free online tool for analyzing and visualizing data. Along the way, students learn to visualize, transform, interpret, and humanize data toward the goal of informing and inspiring others to action.
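For readers who want a concrete picture of the data transformation skills named above (adding, filtering, grouping), the following is a minimal Python sketch of what each one does to a small table. It is an illustration only: students in the units perform these transformations in CODAP, not in Python, and the records, field names, and helper functions here (`add_attribute`, `filter_rows`, `group_mean`) are invented for this example and do not come from any project dataset.

```python
from collections import defaultdict

# Invented toy records for illustration only; not from any project dataset.
foods = [
    {"food": "apple", "group": "fruit", "calories": 95, "grams": 182},
    {"food": "banana", "group": "fruit", "calories": 105, "grams": 118},
    {"food": "rice", "group": "grain", "calories": 206, "grams": 158},
    {"food": "bread", "group": "grain", "calories": 79, "grams": 30},
]


def add_attribute(rows):
    """Adding: compute a new attribute (calories per 100 g) for every row."""
    return [
        {**r, "cal_per_100g": round(100 * r["calories"] / r["grams"], 1)}
        for r in rows
    ]


def filter_rows(rows, group):
    """Filtering: keep only the rows that belong to one food group."""
    return [r for r in rows if r["group"] == group]


def group_mean(rows, key, value):
    """Grouping: collect rows by `key` and average `value` within each group."""
    buckets = defaultdict(list)
    for r in rows:
        buckets[r[key]].append(r[value])
    return {k: sum(v) / len(v) for k, v in buckets.items()}


enriched = add_attribute(foods)                            # adding
fruit_only = filter_rows(enriched, "fruit")                # filtering
mean_calories = group_mean(enriched, "group", "calories")  # grouping

print(fruit_only)
print(mean_calories)
```

Each helper returns a new table rather than changing the original, which mirrors the idea that a transformation produces a new view of the data that students can compare against where they started.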

Check out our Teachers’ Getting Started document or this How-To Video to learn more about the WDS Curriculum units. All units and associated materials are available for free in this Google folder. Learn more about CODAP here.

Data Story Bytes

How can students learn to critically read and make sense of the data and graphs used in the media to communicate news stories? Data Story Bytes are:

  • Short explorations, 30 minutes or less, based on a data visualization.
  • Focused on critical analysis and interpretation of graphs and data, including personal connections and what/who is “counted” and why.
  • Integrated with translations and English/Spanish language support for multilingual students.
  • Connected to many of your upcoming standards (e.g., MS-ESS3 Earth & Human Activity; MS-LS2 Ecosystems). For example, we have several DataBytes focused on carbon emissions, climate change, and ocean habitats.
  • Extendable in CODAP: some DataBytes have an option of viewing and interacting with the original data, if you or your students would like to explore further.

A complete Teacher Guide to get started with DataBytes and student-facing slide decks meant for use with students are available in this Google folder.

Related Publications

Article on Writing Data Stories

This project is supported by the National Science Foundation under Grant No. DRL 1900606 awarded to University of California Berkeley. Any opinions, findings, and conclusions or recommendations expressed herein are those of the principal investigators and do not necessarily reflect the views of the National Science Foundation.