GH-8487: implement HGLM gaussian #16403

wendycwong · 2024-09-30T22:39:06Z

This PR fixes this issue: #8487

I have separated HGLM from GLM into its own toolbox. Gaussian is the only family supported for now. I still need to do the following:

- client tests (Java, Python/R) to make sure model metrics are passed;
- client tests to make sure the model summary, scoring history and coefficient tables are passed;
- check that we use the correct formula to estimate the residual noise variance (refer to the doc);
- choose one of the likelihood methods; I implemented two (refer to the doc).

HGLM_H2O_Implementation.pdf

GH-8487: crafting HGLM parameters.
GH-8487: implement EM algo.
GH-8487: forming the fixed matrices and vectors.
GH-8487: add test to make sure correct initialization of fixed, random coefficients, sigma values and T matrix.
GH-8487: finished implementing EM to estimate fixed coefficients, random coefficients, tmat and tauEVar.
GH-8487: finished implementing prediction but still need to figure out the model metrics calculation.
GH-8487: adding support for models without random intercept.
GH-8487: adding normalization and denormalization of coefficients for fixed and random.
GH-8487: completed prediction implementation and added tests to make sure prediction is correct when standardize=true/false, random_intercept=true/false.
GH-8487: fixing model metric classes.
GH-8487: add Python and R tests.
GH-8487: adding hooks to generate synthetic data.
GH-8487: added scoring history, model summary, coefficient tables.
GH-8487: added model metrics for validation frame.
GH-8487: ran experiments to find the best tauEVar calculation process; the one in equation 10 is best.
GH-8487: add capability in Python client to extract scoring history, model summary, model metrics, model coefficients (fixed and random), icc, T matrix, residual variance.
GH-8487: done checking scoring history, model summary and model metrics.
GH-8487: added R client test for utility functions.
GH-8487: use lambda_ instead of Lambda in pyunit_benign_glm.py
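
For reviewers checking the standardize=true/false prediction tests, here is a minimal sketch of the usual algebra for mapping coefficients fitted on standardized predictors back to the original scale. It is not the code in this PR. The function names `denormalize_fixed` and `predict` are hypothetical, it covers only the fixed coefficients with an intercept, and it assumes each predictor was standardized as (x - mean) / sd.

```python
def denormalize_fixed(coefs_std, intercept_std, means, sds):
    """Map coefficients fitted on standardized predictors to the original scale.

    Assumes x_std = (x - mean) / sd for every predictor (illustrative only,
    not the PR's implementation).
    """
    # Each slope is divided by the predictor's standard deviation.
    coefs = [b / s for b, s in zip(coefs_std, sds)]
    # The intercept absorbs the shift introduced by centering.
    shift = sum(b * m / s for b, m, s in zip(coefs_std, means, sds))
    return coefs, intercept_std - shift


def predict(row, coefs, intercept):
    """Linear predictor for one row: intercept + sum(coef * value)."""
    return intercept + sum(b * x for b, x in zip(coefs, row))


# Hypothetical numbers, chosen only to exercise the algebra.
means, sds = [10.0, 3.0], [2.0, 0.5]
coefs_std, intercept_std = [2.0, -1.0], 0.5

raw_row = [12.0, 2.5]
std_row = [(x - m) / s for x, m, s in zip(raw_row, means, sds)]

coefs, intercept = denormalize_fixed(coefs_std, intercept_std, means, sds)
pred_from_std = predict(std_row, coefs_std, intercept_std)
pred_from_raw = predict(raw_row, coefs, intercept)
```

Both predictions should agree up to floating-point error, which is the same property the standardize=true/false tests rely on.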

wendycwong requested review from krasinski, valenad1, maurever and tomasfryda September 30, 2024 22:39

wendycwong force-pushed the wendy_gh_8487_HGLM_gaussian branch 2 times, most recently from 9ecc510 to 925042a Compare October 7, 2024 23:13

wendycwong force-pushed the wendy_gh_8487_HGLM_gaussian branch 3 times, most recently from 60ecdae to d7eeb43 Compare October 14, 2024 16:55

wendycwong force-pushed the wendy_gh_8487_HGLM_gaussian branch 4 times, most recently from f80efd6 to 283b66b Compare October 20, 2024 23:49

wendycwong force-pushed the wendy_gh_8487_HGLM_gaussian branch from dc61200 to ecf9a91 Compare October 21, 2024 00:56

Remove HGLM references in the code of other algos.

e3c6ffe
