Skale-engine Roadmap

Last updated December 29 2016

This document describes the high level features the Skale-engine maintainers have decided to prioritize in the near to medium term.

Schema description for datasets

This would be useful for several purposes:

support of columnar, such as parquet
integration with a query language compatible with B.I. tools
optimization of datasets serialization and transfers

## Add support for Parquet

Status: in progress

Parquet is a columnar storage format from the Apache Software Foundation available to any project in the Hadoop ecosystem.

A separate nodeJS module supporting Apache parquet format, both for reading an writing is first required.

Such an experimental module has been started by skale-engine maintainers here

Add support for Avro

Avro is a data serialization system from the Apache Software Foundation which provides rich data structures and a compact fast binary data format.

It is well suited to structured data where a schema is required to encode and decode values.

A pure Javascript implementation exists here.

## Add realtime streaming capabilities

The current processing model is action driven, suitable for batch, or micro-batch processing. See if it is possible to apply the same API, or at least a subset, to source driven processing better suited for realtime data processing, while retaining skale-engine scalability and efficiency.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Roadmap.md

Roadmap.md

Skale-engine Roadmap

Schema description for datasets

Add support for Avro

Files

Roadmap.md

Latest commit

History

Roadmap.md

File metadata and controls

Skale-engine Roadmap

Schema description for datasets

Add support for Avro