The Chan Zuckerberg Initiative hosted a hackathon on Mapping the Impact of Research Software in Science (October 24-27, 2023). This repository serves as a project index page to facilitate the discovery and preservation of the output of that event.
Mapping the usage and impact of research software in science remains a challenge due to the lack of canonical data and infrastructure and to inconsistent software citation practices in the scientific literature. The lack of a “software citation graph” means that it’s very hard to answer questions such as:
- Which software tools are most frequently used by scientists in any given field?
- How does the use of open source compare to proprietary software in any given field?
- Are emerging new tools replacing legacy ones?
- What is the prevalent programming language in any given field?
- Which software tools should be part of a student’s computational curriculum?
- Which software projects should funders prioritize as critical infrastructure for science?
In recent years, several attempts have been made to answer these questions by mining the scientific literature or by analyzing electronic notebooks and code repositories. With this event, our goal is to convene practitioners in different areas of computer science, data science, and ML, as well as organizations active in this space, to develop comprehensive datasets, methods, approaches, and resources to map the adoption and impact of research software in science (specifically scientific open source software).
Hackathon Participants: Please submit a pull request to this repo that edits this README to include the appropriate link to your project repo, or that adds your repo to the appropriate list if it is not already present.
- Tracing the dependencies of open source software mentioned in the biomedical literature: https://github.com/borisveytsman/SoftwareImpactHackathon2023_Tracing_dependencies
- Determining the citation intent for software repositories (project repo)
- Playing with LLMs and research software (project repo)
- Disciplinary differences in software usage and mention (project repo)
- Gold dataset (project repo)
- Bidirectional paper-repository traceability (project repo)
- Improving tool mention clustering (project repo)
- Linking research software to research organizations (project repo)
- Visualizing adoption and impact of open source software in academia (project repo)
This project adheres to the Contributor Covenant code of conduct. By participating, you are expected to uphold this code. Please report unacceptable behavior to [email protected].
If you believe you have found a security issue, please disclose it responsibly by contacting us at [email protected].