
GSoC 2024 Project ideas

gnikit edited this page Mar 23, 2024 · 23 revisions

Welcome to the Fortran-Lang ideas page for contributors applying for Google Summer of Code (GSoC). If you are interested in applying for GSoC, see the Contributor Instructions for more information on how to apply.

The list here is based on priorities identified by Fortran-Lang contributors and should inform you about the state and direction of each project. If you are interested in an idea on this page, please contact us on our Discourse to ask any questions and get the latest information about the project idea. Please read the existing discussion(s) in any linked issues.

The project ideas on this page are grouped by the repository. Please familiarize yourself with each repository before exploring the ideas here.

We are not limited to the project ideas listed on this page. If you have your own project idea that is not listed here, let us know.

Contacts for prospective mentors: Mentors list



Version Constraint Resolution (fpm)

The current decentralized package system in fpm allows dependencies to be fetched via a git repository URL. As part of this, a git tag or commit can be given to require a specific version of a dependency. There is, however, no way of specifying version compatibility requirements (e.g. >=1.0.0, <2.0.0) and no way to resolve such requirements across a dependency tree.

This project will involve:

  • Defining a manifest syntax for version compatibility matching
  • Implementing support in fpm for solving a set of version compatibility constraints
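To make the idea of version compatibility matching concrete, here is a minimal Python sketch. It is not fpm's syntax or implementation: the comma-separated requirement format and the helper names `parse_version`, `satisfies` and `pick_latest` are assumptions made for illustration only, and a real resolver would also have to handle conflicts across a whole dependency tree.

```python
import re

def parse_version(text):
    """Turn '1.2' into (1, 2, 0) so that versions compare numerically."""
    parts = tuple(int(part) for part in text.strip().split("."))
    # Pad to three components so that '1.2' and '1.2.0' compare equal.
    return parts + (0,) * (3 - len(parts))

# Comparison operators understood by this sketch (hypothetical syntax).
_OPS = {
    ">=": lambda a, b: a >= b,
    "<=": lambda a, b: a <= b,
    "==": lambda a, b: a == b,
    ">": lambda a, b: a > b,
    "<": lambda a, b: a < b,
}
_CLAUSE = re.compile(r"^(>=|<=|==|>|<)\s*(\d+(?:\.\d+)*)$")

def satisfies(version, requirement):
    """Check one version against a requirement such as '>=1.0.0, <2.0.0'."""
    candidate = parse_version(version)
    for clause in requirement.split(","):
        match = _CLAUSE.match(clause.strip())
        if match is None:
            raise ValueError(f"unsupported clause: {clause!r}")
        op, bound = match.groups()
        if not _OPS[op](candidate, parse_version(bound)):
            return False
    return True

def pick_latest(available, requirements):
    """Return the newest version meeting every requirement, or None."""
    candidates = [
        version for version in available
        if all(satisfies(version, req) for req in requirements)
    ]
    return max(candidates, key=parse_version) if candidates else None
```

Here `pick_latest` only intersects the requirements placed on a single package. Choosing versions for many packages at once is the satisfiability problem that a dedicated solver would address.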

A possible approach would be to interface with an existing satisfiability solver such as:

  • libsolv: interface via iso_c_binding as a separate fpm package

See also: existing options for version matching syntax:

Expected outcomes: Implemented a working version constraint mechanism in fpm

Skills preferred: Fortran programming, experience with one or more build systems

Difficulty: Intermediate, 350 hours

Mentors: Brad Richardson (@everythingfunctional), Sebastian Ehlert (@awvwgk), Umashankar Sivakumar (@usivakum)

Build Process Enhancements (fpm)

Fortran Package Manager (fpm) is pivotal for long-term Fortran success. This GSoC project aims to improve fpm’s build process by improving dependency detection, optimizing linking, implementing shared libraries, ensuring safe concurrent builds, and introducing external Makefile generation.

The project will address the following tasks:

  1. Dependency Detection:
    • Enhance fpm’s dependency detection to minimize rebuilds by parsing or hashing module/submodule files or parsing procedure interfaces in module files. fpm should not rebuild the dependents of a module whose public interface has not changed.
  2. Linking Optimization:
    • Replace linking of all object files in a single command with linking against static libraries, to avoid exceeding the command-line length limit in Windows builds.
  3. Shared Library Implementation:
    • Introduce support for shared library targets for project flexibility.
  4. Safe Concurrent Builds:
    • Implement file locking for safe concurrent invocations, especially during OpenMP builds, to prevent data corruption.
  5. External Makefile Generation:
    • Enable generation of external Makefiles akin to cmake -G for advanced project configuration.
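The dependency detection task can be illustrated with a small Python sketch. It assumes that some earlier step has already extracted a textual description of each module's public interface. That extraction is the hard part and is not shown. The names `interface_digest` and `dependents_to_rebuild` are hypothetical and not part of fpm.

```python
import hashlib

def interface_digest(interface_text):
    """Hash a module's public interface, ignoring layout-only changes.

    Collapsing whitespace is a simplification: it would also hide
    changes inside character literals, which a real tool must handle.
    """
    normalized = " ".join(interface_text.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

def dependents_to_rebuild(old_digests, interfaces, used_by):
    """Return the direct dependents of modules whose interface changed.

    old_digests: module name -> digest stored by the previous build
    interfaces:  module name -> current public interface text
    used_by:     module name -> names of targets that use the module
    """
    changed = {
        name for name, text in interfaces.items()
        if old_digests.get(name) != interface_digest(text)
    }
    return {dep for name in changed for dep in used_by.get(name, ())}
```

Only direct dependents are returned. Whether a rebuild has to propagate further depends on whether the rebuilt module's own interface changes, which is only known after it has been recompiled.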

Expected Outcomes:

  • Enhanced dependency tracking and reduced rebuild times.
  • Improved reliability in linking, particularly in Windows.
  • Increased project versatility with shared library support.
  • Safer concurrent builds through file locking.
  • Greater project configuration flexibility with external Makefile generation.

Difficulty: Intermediate, 175 hours.

Skills preferred: Fortran programming, experience with one or more build systems

Mentors: Federico Perini (@perazz), José Alves (@jalvesz), Henil Panchal (@henilp105)

Extended Testing Support (fpm)

The aim of this project is to create a manifest specification to provide defaults to executable targets in fpm projects. Information can be passed as environment variables, command-line arguments or as a runner. Desired features include:

  • Programs should have a way to find resources of which the relative position within the project source directory is known.
  • Programs should have access to the current binary directory in order to reach other targets within a project.
  • Default runners like mpirun/cafrun or scripts from test frameworks should be usable to launch programs.
  • A general syntax to define environment variables and command-line arguments should be defined.

Some features should be implemented directly in fpm, while more elaborate functionality could be implemented in a separate fpm package as an official Fortran-lang fpm package.
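As a rough illustration of how such defaults could be combined at launch time, the Python sketch below assembles a command line and an environment from a per-target table. The keys `runner`, `args` and `env` and the function `launch_plan` are hypothetical. Defining the actual manifest syntax is part of the project.

```python
def launch_plan(executable, defaults, base_env):
    """Combine per-target defaults into a command line and an environment.

    defaults may contain (all keys are assumptions of this sketch):
      runner: list of words placed before the executable, e.g. mpirun -n 4
      args:   list of default command-line arguments
      env:    mapping of environment variables to set or override
    """
    command = [*defaults.get("runner", []), executable, *defaults.get("args", [])]
    env = dict(base_env)  # copy so the caller's environment is not modified
    env.update(defaults.get("env", {}))
    return command, env
```

For example, a test target with `runner = ["mpirun", "-n", "4"]` would be launched as `mpirun -n 4 <executable> <args>`, with any declared environment variables layered on top of the inherited environment.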

Related issues:

Related discussions:

  • fpm#328: Example which requires external data

Expected outcomes: fpm has broader and deeper testing functionality

Skills preferred: Fortran programming and writing unit tests

Difficulty: Easy, 175 hours

Mentors: Sebastian Ehlert (@awvwgk), Brad Richardson (@everythingfunctional)

Export build order and compile_commands.json (fpm)

fpm has the ability to automatically determine the build order of a project's source files. This information is valuable to third party tools such as language servers and code analysis tools. The goal of this project is to export the build order of a project's source files in the compile_commands.json.

The second leg of this project is to implement the full syntax of compile_commands.json as described in the Clang documentation. This would bring fpm a step closer to being compatible with other build tools.
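The following Python sketch shows one way a build order could be turned into compilation database entries. The field names `directory`, `file`, `arguments` and `output` come from the Clang JSON Compilation Database format. Everything else is an assumption for illustration: the `uses` mapping, the `gfortran -c` command and the idea of encoding the build order as the order of the entries are not fpm's actual design.

```python
import json
from graphlib import TopologicalSorter  # standard library, Python 3.9+

def build_order(uses):
    """Order sources so that every file comes after the files it depends on.

    uses maps each source file to the files that provide the modules it uses.
    """
    return list(TopologicalSorter(uses).static_order())

def compile_commands(uses, directory, compiler="gfortran"):
    """Render a compile_commands.json document with entries in build order."""
    entries = []
    for source in build_order(uses):
        obj = source.rsplit(".", 1)[0] + ".o"
        entries.append({
            "directory": directory,
            "file": source,
            "arguments": [compiler, "-c", source, "-o", obj],
            "output": obj,
        })
    return json.dumps(entries, indent=2)
```

The compilation database format itself does not assign any meaning to the order of entries, so a tool consuming this file would need to know about that convention.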

Expected outcomes: fpm will export a complete compile_commands.json file.

Skills preferred: Fortran programming, experience with one or more build systems

Difficulty: Hard, 350 hours

Mentors: Giannis Nikiteas (@gnikit)

Support of external third-party preprocessors

Adding support for external third-party preprocessors is important for fpm due to the additional flexibility they provide when building complex packages. In particular, the Fortran-lang stdlib project exploits the powerful fypp preprocessor for code generation and the support of fypp by fpm is required for stdlib to eventually be compatible as an fpm package.

This project will require the contributor to:

  • Modify fpm to optionally invoke a third-party preprocessor before compiling sources;
  • Extend the current manifest syntax of fpm for defining preprocessor variables in a preprocessor-independent manner, if necessary;
  • Extend the current manifest syntax of fpm for specifying a third-party preprocessor and the corresponding file suffixes, if necessary;
  • Pass defined preprocessor variables to built-in preprocessors, if necessary.

Third-party preprocessors should be specified on a per-project basis, i.e. multiple preprocessors might be required, and fpm should be able to report useful errors for missing third-party preprocessors.
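A minimal Python sketch of the selection and error-reporting logic is given below. The suffix-to-tool mapping, the `.f90` output naming, the `-DNAME=VALUE` flag layout and the names `plan_preprocessing` and `MissingPreprocessor` are all assumptions made for illustration. The real manifest syntax and the way fpm invokes a preprocessor are what the project would define.

```python
import shutil

class MissingPreprocessor(RuntimeError):
    """Raised when a required third-party preprocessor is not installed."""

def plan_preprocessing(source, suffix_to_tool, macros, which=shutil.which):
    """Return the command that preprocesses `source`, or None if not needed.

    suffix_to_tool: file suffix -> preprocessor executable (hypothetical config)
    macros:         preprocessor variables, passed here as -DNAME=VALUE
    which:          lookup function, injectable so the logic can be tested
    """
    stem, dot, extension = source.rpartition(".")
    tool = suffix_to_tool.get(dot + extension)
    if tool is None:
        return None  # ordinary source file, compile it directly
    if which(tool) is None:
        raise MissingPreprocessor(
            f"{source} requires the preprocessor '{tool}', "
            "which was not found on PATH"
        )
    defines = [f"-D{name}={value}" for name, value in sorted(macros.items())]
    return [tool, *defines, source, stem + ".f90"]
```

Keeping the macro definitions in a plain mapping reflects the preprocessor-independent manifest idea: each preprocessor backend would translate the same variables into its own flags.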

Related issues:

  • fpm#78: support for third-party preprocessors (e.g. fypp)
  • fpm#308: Fortran-based smart code generation in fpm
  • fpm#469: Source pre-processing prior to determining dependencies

Expected outcomes: fpm has a working preprocessing capability

Skills preferred: Fortran, C, or Python programming, experience using one or more preprocessors

Difficulty: Easy, 175 hours

Mentors: Laurence Kedward (@lkedward), Milan Curcic (@milancurcic), Federico Perini (@perazz), Jeremie Vandenplas (@jvdp1)

File system library (stdlib)

Currently, file system operations such as listing contents of directories, traversing directories, and similar, are restricted to 3rd party libraries and compiler extensions that are platform-specific and not portable. This project will entail designing and implementing a cross-platform solution for file system operations.

Related issues:

  • stdlib#201: File system operations
  • stdlib#220: API for file system operations, directory manipulation

WIP implementation:

Expected outcomes: Implemented an stdlib module that provides cross-platform file-system utilities

Skills preferred: Fortran and C programming, experience using Linux, macOS, and Windows

Difficulty: Intermediate, 350 hours

Mentors: Arjen Markus (@arjenmarkus), Milan Curcic (@milancurcic)

Library to work with OS processes (stdlib)

This project will provide a cross-platform solution that abstracts the POSIX and Windows APIs for creating subprocesses.

Related issues:

Discourse thread:

Skills preferred: Fortran and C programming, experience using Linux, macOS, and Windows

Difficulty: Intermediate, 350 hours

Mentors: Sebastian Ehlert (@awvwgk)

Linear algebra and sparse matrices (stdlib)

This project will implement a standardized API for procedures to handle sparse matrices and linear algebra operations. The API should support the well-known formats COO and CSR, and optionally also CSC, ELLPACK and DIA. A hierarchical architecture should be chosen such that a non-OO low-level API and a high-level OO API can be easily implemented, tested and extended.

The API development should closely follow the developments on dense linear algebra in order to keep a coherent interface for sparse and dense matrices.
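For readers unfamiliar with the formats, the Python sketch below converts a matrix from COO (coordinate triplets) to CSR (compressed sparse row) and multiplies it by a vector. It is a language-neutral illustration of the data layout, not a proposal for the stdlib API. The function names are made up, indices are zero-based (Fortran code would normally be one-based), and duplicate entries are kept rather than summed.

```python
def coo_to_csr(n_rows, rows, cols, vals):
    """Convert COO triplets (rows, cols, vals) to CSR arrays."""
    # Count the entries in each row, then prefix-sum into row pointers.
    row_ptr = [0] * (n_rows + 1)
    for r in rows:
        row_ptr[r + 1] += 1
    for i in range(n_rows):
        row_ptr[i + 1] += row_ptr[i]

    # Scatter each entry into the next free slot of its row.
    col_idx = [0] * len(vals)
    data = [0.0] * len(vals)
    next_slot = row_ptr[:-1]  # slicing copies, so row_ptr stays intact
    for r, c, v in zip(rows, cols, vals):
        k = next_slot[r]
        col_idx[k] = c
        data[k] = v
        next_slot[r] += 1
    return row_ptr, col_idx, data

def csr_matvec(row_ptr, col_idx, data, x):
    """Compute y = A @ x for a matrix A stored in CSR form."""
    return [
        sum(data[k] * x[col_idx[k]] for k in range(row_ptr[i], row_ptr[i + 1]))
        for i in range(len(row_ptr) - 1)
    ]
```

COO is convenient for assembling a matrix entry by entry, while CSR gives contiguous access to each row, which is why the conversion and the matrix-vector product are typical building blocks of a low-level API.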

Related issues: #38, #749

WIP implementations: #189 #760 FSPARSE

Expected outcomes: Implemented sparse matrix functionality in the stdlib_linalg module

Skills preferred: Fortran programming, understanding of linear algebra

Difficulty: Hard, 350 hours

Mentors: Ondřej Čertík (@certik), Ivan Pribec (@ivan-pi), Jeremie Vandenplas (@jvdp1), Jose Alves (@jalvesz), Federico Perini (@perazz)

String to number conversion (stdlib)

This project will enhance stdlib's string handling capabilities for fast number parsing in Fortran.

Recently, a new module was added to stdlib called stdlib_str2num which implements fast routines for converting strings to numerical types. The participant would get familiar with these implementations and subsequently:

  • Create a full benchmark suite for the string to number conversion, across compiler vendors, operating systems, and CPU architectures.
  • Explore ways to improve robustness and efficiency, e.g. error handling.
  • Propose a shallow interface for the string_type facility in stdlib.
  • Propose an enhancement to the loadtxt facility function to speed-up file reading.
  • Depending on progress, the participant is also encouraged to include a roadmap for inclusion of the inverse conversion, following the initiative in the thread "ryu-based to_string function".

Relevant thread on Fortran Discourse: Faster string to double

Expected outcomes: Enhancement of stdlib fast string to number conversion

Skills preferred: Fortran and C programming, understanding of floating-point arithmetic

Difficulty: Hard, 350 hours

Mentors: Jose Alves (@jalvesz), Carl Burkert (@carltoffel), Brad Richardson (@everythingfunctional), Ivan Pribec (@ivan-pi)

Compile benchmarking code written in Fortran with LFortran and improving LFortran's performance on these benchmarks (LFortran)

https://benchmarksgame-team.pages.debian.net/benchmarksgame/fastest/fortran.html contains all the benchmark codes written for various problems such as n-body, spectral norm, and mandelbrot. The workflow would involve first fixing bugs so that LFortran compiles the code (modifying the input code would be okay) and produces correct output, then improving LFortran so that it performs as well as or better than other Fortran compilers such as GFortran.

n-body already compiles with workarounds on LFortran main. See https://github.com/lfortran/lfortran/pull/1213. More work needs to be done for the other benchmark codes.

Expected outcomes: LFortran can compile as many benchmark codes as possible. Performing better than other compilers would be an additional plus.

Skills preferred: Fortran and C++ programming

Difficulty: Intermediate/Hard, 350 hours

Mentors: Gagandeep Singh (@czgdp1807)

Compiling SciPy with LFortran (LFortran)

Currently LFortran compiles about 60% of all SciPy Fortran packages and can parse all the Fortran source code in SciPy. The goal of this project is to compile the rest of them. This project involves implementing the rest of the semantics that is needed to compile the Fortran files with LFortran.

Being able to compile SciPy with LFortran would make a huge impact on both LFortran and SciPy.

Expected outcomes: LFortran can compile all Fortran code in SciPy.

Skills preferred: Fortran and C++ programming

Difficulty: Intermediate, 350 hours

Mentors: Ondřej Čertík (@certik)

Allow running Fortran in the browser (LFortran)

LFortran already runs in the browser using WASM at https://dev.lfortran.org/. The goal of this project is to improve the user interface. The issues that the project can work on fixing are listed at https://github.com/lfortran/lcompilers_frontend/issues

This project would entail working with LFortran, LLVM, Emscripten, and WebAssembly to allow running Fortran in the browser.

Skills preferred: Fortran and C++ programming

Difficulty: Intermediate, 350 hours

Mentors: Ondřej Čertík (@certik)

Language Server (LFortran)

This project would first serialize the ASR (Abstract Semantic Representation) and then use it within a language server.

Expected outcomes: LFortran can be used as a Fortran language server that can be used in other software such as source code editors and IDEs.

Skills preferred: Fortran and C++ programming

Difficulty: Intermediate, 350 hours

Mentors: Ondřej Čertík (@certik)

Parser + ASR (for F2PY) (LFortran)

Enhance the frontend parser to allow for more generic extensions and keywords, in particular to parse the pyf (signature file) format consumed by F2PY. This would mean that SciPy can be directly transformed to the ASR representation without a fixed-form parser.

This also involves adding nodes to the ASR itself.

Difficulty: Intermediate, 350 hours

Expected outcomes: LFortran's parser and ASR are improved such that LFortran can parse the pyf format consumed by F2PY.

Skills preferred: Fortran and C++ programming

Mentors: Rohit Goswami (@HaoZeke)

Other LFortran ideas (LFortran)

More LFortran project ideas for GSoC can be found at: https://github.com/lfortran/lfortran/wiki/GSoC-2024-Ideas

MPI support (fortls)

fortls has support for Fortran intrinsics, standard modules and OpenMP. It does not, however, support MPI. The goal of this project is to add full support for completions, hover and signature help for MPI variables, subroutines and functions.

Due to the size of the MPI standard, the process of extracting the necessary information from the standard such as names, interfaces and documentation will be automated. The student will be responsible for creating a scraper/parser to fetch the necessary information from the MPI standard and then create the serialised data (JSON) to be used by fortls.
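To give a feel for the serialised data, the Python sketch below turns a routine signature and a description into a JSON record. The record layout (`name`, `args`, `doc`) and the function names are assumptions for illustration. They are not the schema fortls uses for its intrinsic data, and a real scraper would also need argument types and intents.

```python
import json
import re

# Matches 'NAME(arg1, arg2, ...)' and captures the name and the argument list.
_SIGNATURE = re.compile(r"^\s*(\w+)\s*\(([^)]*)\)\s*$")

def signature_record(signature, doc):
    """Build one record from a signature string and its documentation."""
    match = _SIGNATURE.match(signature)
    if match is None:
        raise ValueError(f"cannot parse signature: {signature!r}")
    name, arg_list = match.groups()
    args = [arg.strip() for arg in arg_list.split(",") if arg.strip()]
    return {"name": name, "args": args, "doc": doc}

def serialise(records):
    """Serialise records to JSON, keyed by lower-cased routine name.

    Lower-casing the key allows case-insensitive lookup, which matches
    how Fortran treats identifiers.
    """
    table = {record["name"].lower(): record for record in records}
    return json.dumps(table, indent=2, sort_keys=True)
```

A language server could load such a file once at start-up and answer completion, hover and signature help requests from the in-memory table.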

Discourse thread: MPI documentation and interfaces

Expected outcomes: fortls will have completion and hover support for MPI.

Skills preferred: Python programming and understanding of Fortran

Difficulty: Intermediate, 175 hours

Mentors: Giannis Nikiteas (@gnikit)

Semantic highlighting and collapsible scopes (fortls)

As part of this project, the student will add support to fortls for the Semantic Tokens request, which is used to provide improved syntax highlighting, and the Folding Range request, which is used to provide collapsible scopes.
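The Python sketch below illustrates how folding ranges could be computed for a few Fortran scopes. It is a deliberately naive, regex-based illustration and not how fortls parses code: it ignores procedure prefixes such as `pure`, would misread `module procedure` statements, and does not handle continuation lines. The `startLine` and `endLine` fields are the zero-based line numbers used by the LSP `FoldingRange` structure.

```python
import re

_OPEN = re.compile(r"^\s*(program|module|subroutine|function)\s+\w+", re.IGNORECASE)
_CLOSE = re.compile(r"^\s*end\s*(program|module|subroutine|function)\b", re.IGNORECASE)

def folding_ranges(source):
    """Return folding ranges for matching scope openers and END statements."""
    stack = []   # (keyword, start line) for scopes that are still open
    ranges = []
    for line_no, line in enumerate(source.splitlines()):
        closing = _CLOSE.match(line)
        if closing:
            # Only close a scope when the keyword matches the innermost one.
            if stack and stack[-1][0] == closing.group(1).lower():
                _, start = stack.pop()
                ranges.append({"startLine": start, "endLine": line_no})
            continue
        opening = _OPEN.match(line)
        if opening:
            stack.append((opening.group(1).lower(), line_no))
    return sorted(ranges, key=lambda item: item["startLine"])
```

In fortls itself the scope information would come from the existing parser rather than from regular expressions, so the project is mostly about exposing that information through the two LSP requests.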

Related Issues:

Expected outcomes: fortls will serve semantic highlighting and collapsible scope requests.

Skills preferred: Python programming and understanding of Fortran

Difficulty: Intermediate, 175 hours

Mentors: Giannis Nikiteas (@gnikit)

Replace explicit LSP interface with pygls (fortls)

fortls uses explicit interfaces to the Language Server Protocol (LSP). To decrease code duplication and increase maintainability, the hand-maintained explicit interfaces should be replaced with the pygls module.

Related Issues:

Expected outcomes: fortls uses pygls to define LSP interfaces, types and requests.

Skills preferred: Python programming and understanding of the Language Server Protocol

Difficulty: Hard, 350 hours

Mentors: Giannis Nikiteas (@gnikit)

Python environment manager (vscode-fortran-support)

In the Modern Fortran for VS Code extension, the use of Python as a means to install third party tools is essential. The goal of this project is to create a robust Python environment manager for installing and running third party tools such as fortls, fpm, findent, etc., taking into account the user's setup (venv, conda, system Python, etc.).
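As an illustration of what taking the user's setup into account involves, the Python sketch below classifies the running interpreter and picks an install command. Comparing `sys.prefix` with `sys.base_prefix` is the documented way to detect a virtual environment, and `CONDA_PREFIX` is set when a conda environment is activated. The function names and the policy of using `--user` for a system Python are assumptions of this sketch, not the extension's behaviour.

```python
import os
import sys

def detect_environment(environ=None, prefix=None, base_prefix=None):
    """Classify the interpreter as 'venv', 'conda' or 'system'.

    The arguments default to the live process state and can be
    overridden, which keeps the logic easy to test.
    """
    environ = os.environ if environ is None else environ
    prefix = sys.prefix if prefix is None else prefix
    base_prefix = sys.base_prefix if base_prefix is None else base_prefix
    if prefix != base_prefix:
        return "venv"
    if environ.get("CONDA_PREFIX"):
        return "conda"
    return "system"

def install_command(tool, kind, python="python"):
    """Build a pip command line suited to the detected environment."""
    command = [python, "-m", "pip", "install"]
    if kind == "system":
        # Avoid writing into a system-wide site-packages directory.
        command.append("--user")
    return command + [tool]
```

The extension itself is written in TypeScript, so in practice this logic would either be ported or run through a small helper script executed by the selected interpreter.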

Expected outcomes: Modern Fortran for VS Code will have a robust Python environment manager for installing and running third party tools.

Skills preferred: TypeScript, Python programming

Difficulty: Hard, 175 hours

Mentors: Giannis Nikiteas (@gnikit)

VS Code integration with fpm (vscode-fortran-support)

The goal of this project is to allow fpm integration with the Modern Fortran extension for Visual Studio Code, similar to how CMake and Meson are integrated in VS Code.

Using an Activity bar icon, the user will be able to build and run projects, tests and examples. The student will be responsible for creating the GUI integration and the necessary backend to communicate with fpm.

Expected outcomes: Modern Fortran for VS Code will have a GUI integration with fpm to build and run projects, tests and examples.

Skills preferred: TypeScript, Fortran programming

Difficulty: Hard, 350 hours

Mentors: Giannis Nikiteas (@gnikit)

Standard Conformance Suite

Fortran compilers' support for ISO Fortran standards generally lags the publication of each standard by several years or longer. Fortran consultants Ian Chivers and Jane Sleightholme periodically publish a paper containing a table detailing the standard features supported by 10 compilers. Gathering the tabulated data requires a considerable amount of effort on the part of the authors and the compiler developers. The chosen venue for publishing the table also puts it behind a paywall: access requires a subscription to ACM SIGPLAN Fortran Forum. The project will automate the generation of the table, make it more detailed, and empower the community to contribute by submitting small tests to an open-source conformance test suite.
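The table generation step could look roughly like the Python sketch below, which pivots individual test outcomes into one row per feature and one column per compiler. The result tuples, the `Y`/`N`/`?` markers and the function name are assumptions for illustration, and the compiler names in any real report would come from actually running the suite.

```python
def conformance_table(results):
    """Pivot (feature, compiler, passed) tuples into table rows.

    Returns a list of rows. The first row is the header, and each
    following row holds a feature name and one status per compiler:
    'Y' (test passed), 'N' (test failed) or '?' (no test was run).
    """
    compilers = sorted({compiler for _, compiler, _ in results})
    features = sorted({feature for feature, _, _ in results})
    status = {(feature, compiler): passed for feature, compiler, passed in results}

    def cell(feature, compiler):
        if (feature, compiler) not in status:
            return "?"
        return "Y" if status[(feature, compiler)] else "N"

    rows = [["feature", *compilers]]
    for feature in features:
        rows.append([feature, *(cell(feature, compiler) for compiler in compilers)])
    return rows
```

Keeping an explicit "not tested" state matters for a community-driven suite, because coverage will be incomplete and a missing test should not be reported as a missing feature.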

Prior work:

Expected outcomes: A comprehensive test suite that generates a report of standard conformance for any Fortran compiler. The suite is not expected to be 100% complete by the end of the project, but should be significant in terms of standard coverage.

Skills preferred: Fortran programming, experience reading and interpreting the Fortran Standard, and writing tests

Difficulty: Hard, 350 hours

Mentors: Damian Rouson (@rouson), Arjen Markus (@arjenmarkus), Ondřej Čertík (@certik)

Coarray Fortran Framework of Efficient Interfaces to Network Environments (Caffeine)

This project would add support for grouping images (parallel processes) into teams that allow submodes to execute independently. Caffeine 0.1.0 uses the GASNet-EX networking middleware as a back end for supporting most of the non-coarray parallel features of Fortran 2018 except for the intrinsic derived type team_type and related features. Work is underway to support the coarray features that most applications will need for expressing custom parallel algorithms. The teams feature set is the one significant non-coarray parallel group of features not yet implemented in Caffeine.

Expected outcomes: Caffeine can be used to create groups of images (teams) when executing parallel programs

Skills preferred: Fortran and C programming

Difficulty: Intermediate, 175 hours

Mentors: Damian Rouson (@rouson)

Get fortran-lang/minpack to be used in SciPy

fortran-lang/minpack #14

The participant would work with Fortran-lang and SciPy teams toward implementing fortran-lang/minpack in SciPy.

Expected outcomes: fortran-lang/minpack is incorporated into SciPy.

Skills preferred: Fortran-C interop, Python programming

Difficulty: Easy, 175 hours

Mentors: Sebastian Ehlert (@awvwgk)

Improving fastGPT: Making it Faster, Easier to Use, and More General

The fastGPT project is a Fortran implementation of GPT-2 that is comparable in speed to PyTorch. Although it is already very fast on CPUs, there is still room for improvement in terms of usability and performance on CPU and other architectures, such as GPUs.

This project aims to explore various aspects of fastGPT to improve its usability and performance. Some potential areas of exploration include:

  • Parallelism: Investigate the use of parallelism in fastGPT, including MPI and coarrays, to potentially make it even faster. Given that GPT inference is dominated by large matrix-matrix multiplications over a few layers, we will carefully investigate which parallel approach is the best (whether MPI, coarrays, OpenMP or just parallel BLAS that we already have).

  • Reduced precision models: Experiment with using reduced precision models (e.g., 16-bit or 8-bit floats) instead of the default 32-bit to potentially speed up inference.

  • GPU acceleration: Explore how to optimize fastGPT for GPU architectures to potentially make it even faster.

  • UI improvements: Add a chat mode (similar to ChatGPT). Explore how to make it easier to use as a grammar checker, for creating summaries, or in other areas where GPT-2 is strong. Make it a nice Fortran library, installable using fpm, usable in other projects. Investigate how to use it with the neural-fortran project.

Expected outcomes: Create an improved fastGPT implementation that is faster, easier to use, and more general.

Skills preferred: Fortran, linear algebra

Difficulty: Intermediate, 175 hours

Mentors: Ondřej Čertík (@certik), Milan Curcic (@milancurcic)

Fortran Graphics Library

Fortran does not have native graphics handling capabilities. While several bindings interfacing Fortran to graphics and plotting libraries are available (e.g., f03gl, sdl, pyplot, dislin, plplot), no up-to-date open-source graphics package with a pure, modern Fortran API is available.

The aim of this project is to lay out the basics of a "canvas" representation in object-oriented Fortran. The contributor would implement, document, and test basic graphics classes (2d points, lines, brushes, etc.) and an abstract graphics canvas API with backends to both file and graphics devices (i.e., bitmap, PNG, OpenGL, SVG, etc.). The outcome of this project would be a contribution to the development of a platform-agnostic graphics library for Fortran.
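The separation between an abstract canvas and its backends can be sketched as follows. The example is written in Python purely to show the structure. In the project itself the same roles would be played by an abstract derived type with deferred type-bound procedures and its extensions. The class and method names (`Canvas`, `SvgCanvas`, `line`, `render`, `draw_cross`) are hypothetical.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass

@dataclass(frozen=True)
class Point:
    """A 2d point, the simplest graphics primitive."""
    x: float
    y: float

class Canvas(ABC):
    """Abstract drawing surface; each backend implements these methods."""

    @abstractmethod
    def line(self, start, end, color="black"):
        """Draw a straight line between two points."""

    @abstractmethod
    def render(self):
        """Return the backend-specific output."""

class SvgCanvas(Canvas):
    """File backend that produces an SVG document as a string."""

    def __init__(self, width, height):
        self.width = width
        self.height = height
        self._elements = []

    def line(self, start, end, color="black"):
        self._elements.append(
            f'<line x1="{start.x}" y1="{start.y}" '
            f'x2="{end.x}" y2="{end.y}" stroke="{color}"/>'
        )

    def render(self):
        body = "".join(self._elements)
        return (
            f'<svg xmlns="http://www.w3.org/2000/svg" '
            f'width="{self.width}" height="{self.height}">{body}</svg>'
        )

def draw_cross(canvas, size):
    """Drawing code that depends only on the abstract Canvas API."""
    canvas.line(Point(0, 0), Point(size, size))
    canvas.line(Point(0, size), Point(size, 0))
```

Because `draw_cross` only uses the abstract interface, the same drawing code would work unchanged with a bitmap or OpenGL backend.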

Expected outcomes: Design and implement classes for 2d graphics primitives, a unified graphics canvas API, and several backend implementations.

Skills preferred: Fortran, C, 2D graphics basics

Difficulty: Intermediate, 350 hours

Mentors: Federico Perini (@perazz)

Improved generation of Fortran interfaces for PETSc

PETSc, the Portable, Extensible Toolkit for Scientific Computation, pronounced PET-see, is for the scalable (parallel) solution of scientific applications modeled by partial differential equations (PDEs). It has bindings for C, Fortran, and Python (via petsc4py). PETSc also contains the TAO (Toolkit for Advanced Optimization) software library. It supports MPI, and GPUs through CUDA, HIP, Kokkos, or OpenCL, as well as hybrid MPI-GPU parallelism; it also supports the NEC-SX Tsubasa Vector Engine.

Currently, only a part of the Fortran interfaces can be generated automatically using bfort. Since the manual generation of the remaining interfaces is tedious and error prone, this project is about an improved generation of Fortran interfaces from PETSc's C code.

The main tasks of this project are:

  • Definition of a robust and future-proof structure for the Fortran interfaces
  • Selection and/or development of a tool that creates the interfaces automatically

More specifically, the first task is about finding a suitable structure of the C-to-Fortran interface that reduces the need for 'stubs' on the C and Fortran sides, making use of modern Fortran features where appropriate. This task will involve evaluating different approaches found in other projects, taking into account the object-oriented approach of PETSc. Prototypes will be implemented manually and evaluated with the help of the PETSc community. The second task is then the automated generation of the Fortran interfaces using the approach selected in the first task. To this end, it will be evaluated whether an extension of bfort, the use of another existing tool, or the development of a completely new tool (probably in Python) is the most suitable approach.

Links:

Expected outcomes: Stable and robust autogeneration of Fortran interfaces for PETSc that works for almost all routines

Skills preferred: Programming experience in multiple languages, ideally C and/or Fortran

Difficulty: Intermediate, 350 hours

Mentors: Martin Diehl (@MarDiehl)
