-
Notifications
You must be signed in to change notification settings - Fork 413
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: move unity catalog integration to it's own crate #3044
base: main
Are you sure you want to change the base?
Conversation
Signed-off-by: Stephen Carman <[email protected]>
…e code base Signed-off-by: Stephen Carman <[email protected]>
Signed-off-by: Stephen Carman <[email protected]>
…024 changes Signed-off-by: Stephen Carman <[email protected]>
…024 changes Signed-off-by: Stephen Carman <[email protected]>
Signed-off-by: Stephen Carman <[email protected]>
Signed-off-by: Stephen Carman <[email protected]>
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #3044 +/- ##
==========================================
- Coverage 72.62% 72.40% -0.23%
==========================================
Files 129 128 -1
Lines 41626 41368 -258
Branches 41626 41368 -258
==========================================
- Hits 30232 29951 -281
- Misses 9419 9445 +26
+ Partials 1975 1972 -3 ☔ View full report in Codecov by Sentry. |
// impl From<RetryError> for reqwest::Error { | ||
// fn from(value: RetryError) -> Self { | ||
// Into::into(value) | ||
// } | ||
// } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❓
r#" | ||
let client = reqwest_middleware::ClientBuilder::new(Client::new()).build(); | ||
|
||
let _retry_config = RetryConfig::default(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's this for?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wasn't sure if I was going to remove it at that point from the testing, so it's a side effect left over from that.
Response::new(Body::from( | ||
r#" | ||
let client = reqwest_middleware::ClientBuilder::new(Client::new()).build(); | ||
let _retry_config = RetryConfig::default(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another unused variable that should be removed? 🤔
crates/core/Cargo.toml
Outdated
@@ -137,4 +138,4 @@ datafusion = [ | |||
datafusion-ext = ["datafusion"] | |||
json = ["parquet/json"] | |||
python = ["arrow/pyarrow"] | |||
unity-experimental = ["reqwest", "hyper"] | |||
unity-experimental = ["reqwest", "reqwest-middleware"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this feature come off the core crate now that this code is moved into its own crate?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So this is a discussion point I wanted to talk about. The main trait for the catalog and the errors remained in the core crate. So core can build I assume generic catalog support and catalog crates just depend on core and implement. At first glance I kinda didn't have a cohesive error handling thing here, but I think thinking about it. There should be a generic catalog error in core, but have no conditionals features on it. And individual crates implement a conversion to that error type. Sound good?
Co-authored-by: R. Tyler Croy <[email protected]>
Co-authored-by: R. Tyler Croy <[email protected]>
Signed-off-by: Stephen Carman <[email protected]>
let compare_expr = match generalize_filter( | ||
let compare_expr = generalize_filter( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this changed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I fixed a few warnings along the way, this statement could be just written with an ?
operator instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall looks good, just some minor questions.
More gene4rally speaking though, I think we need to make some more deliberate choices around how we want to integrate with catalogs.
AS I have been out for a while, I am not sure where we are in terms of eagerness to push the next release. I would hope that we can do some more cleaning and most importantly testing before we release get "more serious" with unity ...
w.r.t. testing I hope that we can even do some integration tests using OSS UC. Happy the help with that.
@@ -216,7 +226,10 @@ impl FromStr for UnityCatalogConfigKey { | |||
"workspace_url" | "unity_workspace_url" | "databricks_workspace_url" => { | |||
Ok(UnityCatalogConfigKey::WorkspaceUrl) | |||
} | |||
_ => Err(UnityCatalogError::UnknownConfigKey(s.into()).into()), | |||
_ => Err(DataCatalogError::UnknownConfigKey { | |||
catalog: "", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess in this case the "catalog" would be "Unity".
Btw... a lot fi the client impl was just stolen from Object store, where this field identifies the specific implementation (azure, gcp, ...). If we have this error live in a sepate create, this dioes not make much sense anymore.
|
||
Ok(resp.json().await?) | ||
Ok(resp.json().await.map_err(UnityCatalogError::from)?) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if we have an impl From<...>
for the error, why do we need the map? does the function return a different kind of err.
"GOOGLE_SERVICE_ACCOUNT", | ||
account_path.as_path().to_str().unwrap(), | ||
); | ||
unsafe { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Am I missing something, or did we remove unsafe
around set_var
further above, specifically in crates/aws/src/storage.rs
, and here we are adding it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have to add it back to the other one, thanks for pointing it out
Description
This moves the catalog integration of unity catalog to it's own crate. It also replaces the mock server with the httpmock crate and the custom retry logic with request_middleware's retry middleware. I think this will be easier to manage as a client and I do not believe many people were using this integration anyways.
Related Issue(s)
Documentation