Building Computational Capacity in the Research Community
We’re Supporting The Carpentries to Advance Access to Computational Skills Globally
Imagine trying to analyze data in a spreadsheet with 300 columns and 40,000 rows. It’s now easier than ever for scientists to generate large datasets, making this hypothetical scenario a common task for wet-lab biologists today. Yet these datasets are only useful if we can extract meaning from them, which often requires the ability to analyze, annotate, and model them through computational tools.
Access to computational skills remains a bottleneck for many researchers. They have a deep understanding of their biological domain of expertise and sophisticated laboratory methods but may not have received much training in programming or bioinformatics. To accelerate science, we need to both build tools and build capacity for engaging with data.
We consistently hear the need to build capacity in computational skills from scientists in the communities we support at the Chan Zuckerberg Initiative. We’ve spent the last several months learning about how scientists build these skills, where this process is working well, and where it gets stuck. We talked to more than 35 faculty, students, informal educators, learning scientists, university administrators, curriculum developers, and research staff members, and we also surveyed postdocs and graduate students in our grantee communities to better understand their specific needs.
“The computational training available to me was part of my undergrad. Currently there is no training available, and any new subjects, such as single cell analysis, are learned from my colleagues. There are a few classes available, but they are scarce and usually expensive.”
The two most important things we learned were that building scientists’ ability to engage with their data is a pressing need, and that the scientific community has already developed fantastic approaches to meeting it. We want to find ways to amplify these efforts to accelerate science.
Community-driven efforts to provide training in computational skills largely take the form of informal education, or education outside of university courses. These trainings occur within an ecosystem of learners at different stages, instructors in a variety of roles (many of whom are volunteers), and curriculum developers across many domains; each of these roles has unique needs.
From learners, we heard the need for training opportunities that are more discoverable and tailored to the task at hand. Just as importantly, they voiced a need for a community of practice (a supportive group of peers with similar goals and skills), especially as they start to apply their newly learned skills to their own data.
“I would benefit from case studies or examples of other researchers’ workflows and how they make use of different tools and processes in their work.”
This learning depends on instructors and curriculum developers; most of this work is done by other researchers who volunteer their time to support the community. From instructors, we heard the need for lightweight training in effective teaching strategies; ways to connect with other informal educators; recognition for their work; and easily discoverable, high-quality curriculum. From curriculum developers, we heard about the need to crowd-source the laborious task of curriculum maintenance and updating, especially in rapidly evolving fields.
As a first step towards addressing these needs, CZI is supporting The Carpentries in collaboration with the Gordon and Betty Moore Foundation. The Carpentries, a fiscally sponsored project of the nonprofit organization Community Initiatives, helps researchers develop computational skills and has a deep understanding of its community’s needs. The Carpentries provides training through inclusive workshops based on collaboratively developed, openly available curricula.
These workshops are taught by community-based instructors who are trained in best practices through The Carpentries’ Instructor Training Program. To date, The Carpentries has trained over 2,300 instructors and taught more than 56,000 researchers in over 2,200 workshops across 65 countries. These instructors and learners are supported through The Carpentries’ extensive community infrastructure and engagement.
CZI and the Moore Foundation are supporting The Carpentries’ continued growth and development by focusing on the cornerstones of instructor training, curriculum development, and community engagement.
With support from CZI and the Moore Foundation, The Carpentries will scale up their Instructor Training Program by formalizing and expanding the process by which trainers — who teach instructors — are onboarded. The need for this training is demonstrated by the current 400-person waitlist; we hope this award will help address this need in a sustainable way.
This grant will also enable The Carpentries to pilot CarpentriesLabs, a platform and editorial process for community-contributed curricula. This platform will make lessons more discoverable and centralized, with crowd-sourced maintenance and curation. In response to the needs expressed by curriculum developers, we anticipate this will help reduce duplicated effort, make maintenance less of a burden (especially in rapidly evolving fields of science), and reduce barriers to accessing high-quality curricula.
The Carpentries will also be able to strengthen the infrastructure that supports their thriving communities of learners, instructors, and trainers, while expanding to more underserved communities. These communities of practice are crucial sources of support for learners and instructors alike.
At CZI, we are committed to investing in and supporting scientists. We believe that enabling researchers to fully engage with their data is crucial to accelerating scientific progress, and will help create a level playing field for more people to participate in the research process. We are delighted to partner with the Moore Foundation to support The Carpentries, and look forward to continuing to learn from their work to empower researchers with critical computational skills.
To learn more about our work at CZI, visit our website or follow us on Twitter.
Sidney Bell, Computational Biologist
Sidney is a computational biologist at CZI. She develops programs and software tools that empower scientists to explore and understand their data. Prior to joining CZI, she studied how viruses evolve and move around the world during her PhD at the Fred Hutch Cancer Research Center. Her current projects focus on data visualization, capacity building, and open science.
Dario Taraborelli, Science Program Officer, Open Science
Dario is a social computing researcher and an open knowledge advocate. As the Science Program Officer for Open Science at CZI, his goal is to build programs and technology to support open, reproducible, and accessible research. Prior to joining CZI, he served as the Director, Head of Research at the Wikimedia Foundation, the non-profit that operates Wikipedia and its sister projects. As a co-author of the Altmetrics Manifesto, a co-founder of the Initiative for Open Citations, and a long-standing open access advocate, he has been designing systems and programs to accelerate the discoverability and reuse of scientific knowledge by scholars, policy makers, and the general public alike.