v2.5.0 & Python 1.0.0b1
xuchen-plus released this on 10 Jan 04:56
LakeSoul 2.5.0 Release Note
What's New
- Python Reader supports PyTorch, PyArrow, Pandas, and Ray, and supports distributed execution;
- Supports the Spark Gluten Vectorized Engine;
- Spark SQL supports Compaction, Rollback, and other Call Procedures;
- Flink CDC whole-database synchronization supports MySQL, PostgreSQL, PolarDB, and Oracle;
- Supports streaming and batch export to MySQL, PostgreSQL, PolarDB, and Apache Doris;
- Optimized NativeIO performance.
What's Changed
- [Spark]rename MetaVersion at lakesoul-spark as SparkMetaVersion by @Ceng23333 in #353
- [Metadata]Replace table_info.table_schema with arrow kind schema (Backward Compatibility) by @Ceng23333 in #354
- [Python][Dataset] Add Ray reading support by @codingfun2022 in #355
- [Spark]optimize incremental read and fix column disorder bug caused by compact operation by @F-PHantam in #352
- [Rust] Create Rust CI by @Ceng23333 in #356
- [Rust][Metadata]Create Rust MetadataClient & add CI test cases by @Ceng23333 in #357
- [Rust][NativeIO]Use stable rustc for lakesoul-io feature default by @Ceng23333 in #358
- [Python][Rust][Metadata] Update python metadata interface && Full arrow types test by @Ceng23333 in #359
- [Spark] Spark Sql Support 'drop partition' Operation by @F-PHantam in #360
- [Python]python deserialized schema from java by @Ceng23333 in #361
- [Python] Fix wheel building; update version to 1.0.0b1 by @codingfun2022 in #362
- [Rust][Metadata]Asynchronized rust metadata method by @Ceng23333 in #365
- Add some rust test cases by @zhaishuangszszs in #364
- [Datafusion]Implement LakeSoul Catalog by @Ceng23333 in #366
- [Rust] add upsert test cases by @zhaishuangszszs in #367
- [Flink] update fury version to 0.4 by @xuchen-plus in #368
- refine upsert test by @zhaishuangszszs in #369
- [Spark] support call sql syntax by @moresun in #370
- [Rust]DataFusion version upgraded to 33.0.0 by @Ceng23333 in #372
- [Spark] Support Gluten Vectorized Engine by @xuchen-plus in #374
- [Flink] Support oracle cdc source by @ChenYunHey in #375
- [NativeIO] Use rust block api in file read by @xuchen-plus in #377
- [Flink] Add export to external dbs for LakeSoul's tables by @ChenYunHey in #376
- [Rust] Add LakeSoulHashTable Sink for DataFusion by @Ceng23333 in #382
- [NativeIO] Enable parquet rowgroup prefetch. Support s3 host style access by @xuchen-plus in #384
- [Rust]fix hash value to spark_murmur3 by @Ceng23333 in #385
- [BugFix]Fails when creating table with nullable hash column by @Ceng23333 in #387
- [Flink] Add Jdbc cdc sources and sinks by @ChenYunHey in #381
- [Python] fix python meta config parse logic by @xuchen-plus in #388
- [Project/Doc] Bump version to 2.5.0 and update docs by @xuchen-plus in #389
- Bump postcss from 8.4.23 to 8.4.33 in /website by @dependabot in #396
- Bump @babel/traverse from 7.21.5 to 7.23.7 in /website by @dependabot in #393
- Bump follow-redirects from 1.15.2 to 1.15.4 in /website by @dependabot in #399
- Bump org.apache.avro:avro from 1.11.0 to 1.11.3 in /lakesoul-spark by @dependabot in #394
- Bump com.google.guava:guava from 30.1.1-jre to 32.0.0-jre in /lakesoul-presto by @dependabot in #395
- [Rust] Update arrow rs dependencies by @xuchen-plus in #400
New Contributors
- @zhaishuangszszs made their first contribution in #364
Full Changelog: v2.4.1...v2.5.0