Due to inactivity and the lack of support for Spark.NET, this project has been archived. I recommend building Spark applications in a supported language rather than in .NET.
🏃 Getting Started 📚 Documentation
Typesafe bindings for ⭐ Spark.NET
IMPORTANT: Please note that this library is under active construction :construction_worker: and should not be used in production. Help is always appreciated: create an issue, check out the code, and have a play!
- Check Spark programs at compile time
- Zero dependencies (except Spark.NET!)
- Easy to use: it's LINQ for Spark
- Replace stringly typed code with strong models
- No more untyped Spark APIs
// create a model using typed columns
public sealed class Person : TypedSchema<Person>
{
    public StringColumn Name { get; private set; }
    public IntegerColumn Age { get; private set; }

    // the alias is optional; the parameterless constructor passes none
    public Person(string? alias) : base(alias) {}
    public Person() : this(default) {}
}
// now it can be used in typed query operations using LINQ
DataFrame df;
TypedDataFrame<Person> personDf = df.AsTyped<Person>();
personDf
    .Where(x => x.Age > 18)
    .OrderBy(x => new { Age = x.Age.Desc() })
    .Select(x => new { PersonName = x.Name });
// more to come!!
Coming Soon.
Strong types facilitate better code. Spark is typed, so by leveraging the C# type system we can expose those types and enforce the correctness of Spark applications before they are even run.
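To make the idea concrete, here is a minimal, self-contained sketch of how typed columns can move errors from run time to compile time. It does not use Spark or this library. The `Condition` and `TypedQuery` types are hypothetical, and the `IntegerColumn`, `StringColumn` and `Person` classes here are simplified stand-ins that only borrow the names from the example above. The real implementation may work quite differently.

```csharp
using System;

// Hypothetical stand-in: the result of comparing a column with a value.
public sealed class Condition
{
    public string Sql { get; }
    public Condition(string sql) => Sql = sql;
}

// Simplified stand-in: an integer column only exposes integer comparisons.
public sealed class IntegerColumn
{
    public string Name { get; }
    public IntegerColumn(string name) => Name = name;

    // C# requires '>' and '<' to be declared as a pair.
    public static Condition operator >(IntegerColumn c, int value) =>
        new Condition($"{c.Name} > {value}");
    public static Condition operator <(IntegerColumn c, int value) =>
        new Condition($"{c.Name} < {value}");
}

// Simplified stand-in: a string column deliberately has no '>' operator.
public sealed class StringColumn
{
    public string Name { get; }
    public StringColumn(string name) => Name = name;
}

// Simplified model, mirroring the shape of the Person schema above.
public sealed class Person
{
    public StringColumn Name { get; } = new StringColumn("Name");
    public IntegerColumn Age { get; } = new IntegerColumn("Age");
}

public static class TypedQuery
{
    // Evaluates the lambda against the model to build a filter expression.
    public static string Where<T>(Func<T, Condition> predicate) where T : new() =>
        predicate(new T()).Sql;
}

public static class Program
{
    public static void Main()
    {
        // prints "Age > 18"
        Console.WriteLine(TypedQuery.Where<Person>(x => x.Age > 18));

        // Neither line below compiles, so the mistake is caught before any job runs:
        // TypedQuery.Where<Person>(x => x.Agee > 18); // no such column on Person
        // TypedQuery.Where<Person>(x => x.Name > 18); // '>' is not defined for StringColumn
    }
}
```

With a string-based API, a misspelled column name or a comparison against the wrong type would typically surface only once the query is analyzed or executed. In this sketch both are rejected by the C# compiler.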
More details to come!