Writing Data Stories: Integrating Computational Data Investigations into the Middle School Science Classroom


Preparing today’s students to work with data fluently is critical to ensuring a scientifically literate and empowered citizenry. But most efforts to incorporate data analysis into the K-12 curriculum are limited to short, isolated activities or basic skills development, or are introduced as new courses devoted specifically to data and computing. These constraints limit both the potential benefits of such efforts and their audience. The Data Stories project (2019-2022) brings together a team of researchers from the University of California, Berkeley, NC State University, and The Concord Consortium to integrate computational data analysis into the middle school science curriculum in a longitudinal, interdisciplinary way. The project draws on the computer and data sciences, literacy studies, statistics, and science education to saturate the classroom with relevant tools, resources, and support.


Students in middle school classrooms in the San Francisco Bay Area will analyze and draw conclusions from publicly available scientific datasets using the Common Online Data Analysis Platform (CODAP), a free, innovative computational data analysis platform. Units will be designed specifically with Dual Language Learners (DLL) in mind, inviting students to share their investigations by writing multimodal texts that blend familiar and academic modes of expression to explain and contextualize their data analysis processes. Units that build on one another in difficulty and complexity will be introduced throughout the academic year, and participating teachers will receive significant training opportunities. Overall, the project is anticipated to directly impact approximately 2,500 students and 20 teachers in the greater San Francisco Bay Area, from predominantly high-needs schools.

The project will provide a research context to address the following questions:

  • How do students learn, over time, to use computational tools to structure, calculate, filter, and transform data for scientific inquiry?
  • What patterns of engagement in scientific practices are supported by the integration of computational data analysis and visualizations into the science curriculum?
  • What new literacy practices might support DLL students, learners with limited access to technology, and learners who are still developing academic literacy in constructing oral and written arguments and explanations that use data and visualizations as evidence?

Specifically, the project brings together and seeks to extend three complementary research constructs. Data moves are the computational actions analysts take to transform and analyze datasets. Syncretic texts are specialized texts that blend academic discourse, such as the formulae and statistical language needed to explain data analysis, with familiar modes of expression (including home languages and alternative forms such as video, art, and animation); this blending has been found to invite a wide range of marginalized students to experiment with and develop academic language. Finally, a community of learners approach allows different student groups in a classroom to conduct different investigations with the same dataset and share their results, making the exploration of large and complex data more feasible in everyday classrooms.

This project is supported by the National Science Foundation under Grant No. DRL 1900606 awarded to the University of California, Berkeley. Any opinions, findings, and conclusions or recommendations expressed herein are those of the principal investigators and do not necessarily reflect the views of the National Science Foundation.