forked from jwhulette/pipes

A PHP ETL solution for use with Laravel or Laravel-Zero

xswirelab/pipes

 
 


Pipes

Pipes is a PHP Extract, Transform, Load (ETL) package for Laravel or Laravel Zero.

Installation

You can install the package via composer:

composer require jwhulette/pipes

Usage

  1. Create a new EtlPipe object.

  2. Add an extractor to the object to read the input file.

  3. Add one or more transformers to transform the data.

    • You can add as many transformers as you want.

    • Data is passed to the transformers in the order they are defined.

  4. Add a loader to save the data.

Notes

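  • Data is passed through the pipeline line by line using generators.

$etl = new EtlPipe();
$etl->extract(new CsvExtractor('my-file.csv'));
// Note: the Performance example calls this method transformers();
// use whichever name your installed version defines.
$etl->transforms([
    // Parentheses around `new` are required to chain a call before PHP 8.4.
    (new CaseTransformer())
        ->transformColumn('first_name', 'lower'),
    new TrimTransformer(),
]);
$etl->load(new CsvLoader('saved-file.csv'));
$etl->run();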

Or, using method chaining:

(new EtlPipe())
    ->extract(new CsvExtractor('my-file.csv'))
    ->transforms([
        // Parentheses around `new` are required to chain a call before PHP 8.4.
        (new CaseTransformer())
            ->transformColumn('first_name', 'lower'),
        new TrimTransformer(),
    ])
    ->load(new CsvLoader('saved-file.csv'))
    ->run();
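
The note about generators can be illustrated without the package at all. The following is a minimal sketch in plain PHP, assuming nothing about how Pipes is implemented internally: `extractRows`, `transformRows`, and `loadRows` are hypothetical stand-ins, not Pipes classes. It shows rows flowing one at a time through transformers applied in the order they are listed.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-ins for illustration only; these are NOT the Pipes classes.

/** Extract: yield one row at a time from CSV lines as an associative array. */
function extractRows(iterable $lines): Generator
{
    $header = null;
    foreach ($lines as $line) {
        $fields = str_getcsv($line, ',', '"', '\\');
        if ($header === null) {
            $header = $fields; // the first line holds the column names
            continue;
        }
        yield array_combine($header, $fields);
    }
}

/** Transform: pass each row through the transformers in the order given. */
function transformRows(iterable $rows, array $transformers): Generator
{
    foreach ($rows as $row) {
        foreach ($transformers as $transformer) {
            $row = $transformer($row);
        }
        yield $row;
    }
}

/** Load: consume the generator. Rows are collected here only so they can be inspected. */
function loadRows(iterable $rows): array
{
    $out = [];
    foreach ($rows as $row) {
        $out[] = $row;
    }
    return $out;
}

$lines = ["first_name,city", " ALICE ,Paris", "BOB, Rome "];

$result = loadRows(transformRows(extractRows($lines), [
    // 1. Trim every column.
    fn (array $row): array => array_map('trim', $row),
    // 2. Lowercase the first_name column, keeping the other columns as they are.
    fn (array $row): array => ['first_name' => strtolower($row['first_name'])] + $row,
]));

var_export($result);
```

Because each stage pulls one row from the stage before it, only a single row is in flight at a time until the loader decides what to do with it. A loader that writes each row to a file instead of collecting it would keep memory use roughly flat regardless of input size.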

Performance

I used the datasets from the link below to test the library's performance:

http://eforexcel.com/wp/downloads-18-sample-csv-files-data-sets-for-testing-sales/

Sample runs on my notebook:

  • MacBook Pro (Retina, 15-inch, Late 2013)
  • 2.3 GHz Quad-Core Intel Core i7
  • 16 GB 1600 MHz DDR3

Using the following pipeline:

  1. Transform the Sales Channel column value to lowercase
  2. Trim the values in all columns
  3. Format the dates in the Order Date and Ship Date columns

(new EtlPipe())
    ->extract(new CsvExtractor($filename))
    ->transformers([
        (new CaseTransformer())->transformColumn('Sales Channel', 'lower'),
        (new TrimTransformer())->transformAllColumns(),
        (new DateTimeTransformer())->transformColumn('Order Date')
            ->transformColumn('Ship Date'),
    ])
    ->load(new CsvLoader($filepath.'/output/output.csv'))
    ->run();

Test results

Reading, transforming, and writing to another CSV file.

Running CSV performance tests

---- Processing file: 100000 Sales Records.csv ----
Peak usage: 10.599MB of memory used.
Total execution time in seconds: 4.331

---- Processing file: 1000000 Sales Records.csv ----
Peak usage: 10.599MB of memory used.
Total execution time in seconds: 44.176

Reading an XLSX file, transforming it, and inserting the rows into an SQLite database.

Running SQL performance tests

---- Processing file: 100000 Sales Records.xlsx ----
Peak usage: 14.996MB of memory used.
Total execution time in seconds: 33.372
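
Figures in the same shape as those above can be gathered with PHP's built-in `microtime()` and `memory_get_peak_usage()`. This is a rough sketch, not the script that produced the results in this README: `measure` and `fakeRows` are hypothetical helpers, the workload is a stand-in for a real pipeline, and the megabyte conversion (1024 × 1024 bytes) is an assumption.

```php
<?php
declare(strict_types=1);

/**
 * Run a callable and report peak memory and wall-clock time.
 * Hypothetical helper for illustration; not part of Pipes.
 */
function measure(string $label, callable $job): array
{
    $start = microtime(true);
    $job();
    $seconds = microtime(true) - $start;

    // Peak for the whole PHP process so far, converted to MB (assumed 1024 * 1024 bytes).
    $peakMb = memory_get_peak_usage() / 1024 / 1024;

    printf("---- Processing file: %s ----\n", $label);
    printf("Peak usage: %.3fMB of memory used.\n", $peakMb);
    printf("Total execution time in seconds: %.3f\n", $seconds);

    return ['peak_mb' => $peakMb, 'seconds' => $seconds];
}

/** Stand-in workload: stream rows from a generator without accumulating them. */
function fakeRows(int $count): Generator
{
    for ($i = 0; $i < $count; $i++) {
        yield ['id' => $i, 'channel' => ' ONLINE '];
    }
}

$stats = measure('fake rows', function (): void {
    foreach (fakeRows(100000) as $row) {
        strtolower(trim($row['channel'])); // result discarded; only the cost matters
    }
});
```

To time a real run, the closure body would be replaced with a pipeline like the one shown earlier in this section. Since `memory_get_peak_usage()` reports the peak for the whole process, each file is best measured in a separate PHP invocation.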

Testing

composer test

Changelog

Please see CHANGELOG for more information on what has changed recently.

Contributing

Please see CONTRIBUTING for details.

Security Vulnerabilities

Please review our security policy on how to report security vulnerabilities.

Credits

License

The MIT License (MIT). Please see License File for more information.
