GH-8487: implement HGLM gaussian #16403

Open: wendycwong wants to merge 2 commits into base branch rel-3.46.0 from wendy_gh_8487_HGLM_gaussian

wendycwong
Contributor

This PR fixes issue #8487.

I have separated HGLM from GLM into its own toolbox. Gaussian is the only family supported for now. I still need to do the following:

  1. add client tests (Java, Python/R) to make sure model metrics are passed;
  2. add client tests to make sure the model summary, scoring history and coefficient tables are passed;
  3. check that we use the correct formula to estimate the residual noise variance (refer to the doc);
  4. choose one of the likelihood methods. I implemented two (refer to the doc).

Doc: HGLM_H2O_Implementation.pdf
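
For reviewers weighing items 3 and 4, here is a minimal sketch of a marginal log-likelihood for the simplest Gaussian case: a random intercept only, no other predictors. It is the textbook closed form, not necessarily either of the two likelihood methods implemented in this PR, and it is not taken from the attached doc. The function name `marginal_loglik` and the parameter names `beta0`, `tau2` (random-intercept variance) and `sigma2` (residual noise variance) are hypothetical and chosen only for illustration.

```python
import math


def marginal_loglik(groups, beta0, tau2, sigma2):
    """Marginal log-likelihood of y_ij = beta0 + u_j + e_ij.

    Assumes u_j ~ N(0, tau2) and e_ij ~ N(0, sigma2), all independent.
    groups: one list of responses per level-2 unit (cluster).
    """
    total = 0.0
    for y in groups:
        n = len(y)
        resid = [v - beta0 for v in y]
        ss = sum(r * r for r in resid)  # sum of squared residuals
        s = sum(resid)                  # sum of residuals
        # Cluster covariance is sigma2*I + tau2*1*1^T, whose determinant is
        # sigma2^(n-1) * (sigma2 + n*tau2).
        log_det = (n - 1) * math.log(sigma2) + math.log(sigma2 + n * tau2)
        # Quadratic form r^T Cov^{-1} r via the Sherman-Morrison identity.
        quad = (ss - tau2 * s * s / (sigma2 + n * tau2)) / sigma2
        total += -0.5 * (n * math.log(2 * math.pi) + log_det + quad)
    return total
```

Evaluating a function like this on the same fitted values is one way to compare candidate likelihood formulas on a small synthetic dataset before settling on one.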

wendycwong force-pushed the wendy_gh_8487_HGLM_gaussian branch 2 times, most recently from 9ecc510 to 925042a (October 7, 2024 23:13)
wendycwong force-pushed the wendy_gh_8487_HGLM_gaussian branch 3 times, most recently from 60ecdae to d7eeb43 (October 14, 2024 16:55)
wendycwong force-pushed the wendy_gh_8487_HGLM_gaussian branch 4 times, most recently from f80efd6 to 283b66b (October 20, 2024 23:49)
GH-8487: crafting HGLM parameters.
GH-8487: implement EM algo.
GH-8487: forming the fixed matrices and vectors.
GH-8487: add test to make sure correct initialization of fixed, random coefficients, sigma values and T matrix.
GH-8487: Finished implementing EM to estimate fixed coefficients, random coefficients, tmat and tauEVar
GH-8487: finished implementing prediction but still need to figure out the model metrics calculation.
GH-8487: Adding support for models without random intercept.
GH-8487: adding normalization and denormalization of coefficients for fixed and random.
GH-8487: Completed prediction implementation and added tests to make sure prediction is correct when standardize=true/false, random_intercept = true/false.
GH-8487: fixing model metric classes.
GH-8487: add python and R tests.
GH-8487: adding hooks to generate synthetic data.
GH-8487: added scoring history, model summary, coefficient tables.
GH-8487: added modelmetrics for validation frame.
GH-8487: Experimented to find the best tauEVar calculation process. The one in equation 10 is best.
GH-8487: add capability in Python client to extract scoring history, model summary, model metrics, model coefficients (fixed and random), icc, T matrix, residual variance.
GH-8487: done checking scoring history, model summary and model metrics.
GH-8487: added R client test for utility functions.
GH-8487: use lambda_ instead of Lambda in pyunit_benign_glm.py
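
The commits above mention an EM algorithm that estimates fixed coefficients, random coefficients, the T matrix and the residual variance, plus an ICC output. As a rough illustration of the idea only, here is a minimal sketch of textbook EM for the simplest case: a random intercept with no other predictors. It is not the code in this PR. The names `em_random_intercept` and `icc` are hypothetical, the scalar `tau2` stands in for a 1x1 T matrix, `sigma2` stands in for the residual variance, and the ICC formula `tau2 / (tau2 + sigma2)` is the usual random-intercept definition, assumed here rather than taken from the implementation.

```python
def em_random_intercept(groups, n_iter=200):
    """EM for y_ij = beta0 + u_j + e_ij, u_j ~ N(0, tau2), e_ij ~ N(0, sigma2).

    groups: one list of responses per level-2 unit (cluster).
    Returns (beta0, tau2, sigma2) after n_iter iterations.
    """
    n_total = sum(len(y) for y in groups)
    beta0 = sum(sum(y) for y in groups) / n_total  # start at the grand mean
    tau2, sigma2 = 1.0, 1.0                        # arbitrary positive starts
    for _ in range(n_iter):
        # E-step: posterior mean m and variance v of each random intercept.
        post = []
        for y in groups:
            v = 1.0 / (len(y) / sigma2 + 1.0 / tau2)
            m = v * sum(val - beta0 for val in y) / sigma2
            post.append((m, v))
        # M-step: maximize the expected complete-data log-likelihood.
        beta0 = sum(val - m
                    for y, (m, _) in zip(groups, post)
                    for val in y) / n_total
        tau2 = sum(m * m + v for m, v in post) / len(groups)
        sigma2 = sum((val - beta0 - m) ** 2 + v
                     for y, (m, v) in zip(groups, post)
                     for val in y) / n_total
    return beta0, tau2, sigma2


def icc(tau2, sigma2):
    """Intraclass correlation for a random-intercept model (assumed formula)."""
    return tau2 / (tau2 + sigma2)


# Usage on a tiny, strongly clustered toy dataset.
toy = [[1.0, 1.2, 0.8], [5.0, 5.1, 4.9], [9.0, 9.3, 8.7]]
b0, t2, s2 = em_random_intercept(toy)
print(b0, t2, s2, icc(t2, s2))
```

A fixed iteration count keeps the sketch short; a real implementation would more likely stop on a convergence criterion such as the change in log-likelihood or in the coefficients.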