Skip to content

pyGecko is an open-access python libary for the parsing, processing and analysis of GC/MS and GC/FID raw data.

License

Notifications You must be signed in to change notification settings

SPP2363/pyGecko

 
 

Repository files navigation

pyGecko

pyGecko_Logo

pyGecko an open-source Python library for the parsing, processing and analysis of GC-MS and GC-FID raw data.

With increasing amounts of analytical and metadata generated in HTE, data processing and analysis quickly become a workflow's limiting step if conducted manually. The automated processing of analytical data opens up time for chemists to focus on relevant outcomes, enables the standardized storage of reaction data, and facilitates the integration of analytical methods into closed-loop systems. Herein we present pyGecko, an open-source Python library for the parsing, processing and analysis of GC-MS and GC-FID raw data. pyGecko offers a variety of analysis tools for the automated or semi-automated handling of GC measurements and sequences. This includes the interpretation of measurements in the context of the experiment, the automatic identification of internal standards and compound identifications based on retention times, the mass of a molecular ion or fragment and spectral comparison. Quantification relative to an internal standard can be performed for GC-FID measurements. Results of an analysis as well as chromatograms and spectra can be visualized and reported in standardized formats like the Open Reaction Database (ORD) schema. pyGecko is designed to be easily integrated into automated workflows and can be used as a stand-alone tool or as a python library.

Preprint: https://chemrxiv.org/engage/chemrxiv/article-details/66adfc465101a2ffa8001761

Installation

Important

To read vendor files you need to install the msConvert tool from ProteoWizard. You can download it from here. You need to specify the path to the msConvert.exe before the first run of pyGecko.

pyGecko can be installed via pip:

git clone https://github.com/FelixKatz77/pyGecko.git
cd pyGecko
pip install ./

Afterward the path to the msConvert.exe needs to be specified. This can be done by running the following command:

cd pygecko
python __init__.py

This will prompt you to specify the path to the msConvert.exe file:

Please provide the path to the msConvert executable or specify it in the config.ini:

After that pyGecko is ready to use.

Documentation

The documentation for pyGecko can be found here.

Usage

For non-automated workflows pyGecko is best used with jupyter notebooks. The notebooks folder of the repository contains examples for the usage of pyGecko for the quantitative analysis of reaction outcomes and spectral matching. The Python scripts used to perform the data processing for the publication can be found in the examples folder. GC-MS and GC-FID raw data for all experiments is available on Zenodo.

Supported File Formats

pyGecko supports the following file formats:

GC-MS GC-FID
.mzML .xy
.mzXML .CSV
.D (Agilent)
.RAW (Thermo)

Note

To achieve the best performance, we recommend using the .mzML file format for GC-MS data.

About

pyGecko is an open-access python libary for the parsing, processing and analysis of GC/MS and GC/FID raw data.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • Jupyter Notebook 51.5%
  • Python 46.5%
  • MAXScript 2.0%