Skip to content

Init VariableImportance class#129

Merged
egenn merged 18 commits into
mainfrom
supervised
Apr 5, 2026
Merged

Init VariableImportance class#129
egenn merged 18 commits into
mainfrom
supervised

Conversation

@egenn

@egenn egenn commented Apr 5, 2026

Copy link
Copy Markdown
Member

No description provided.

Copilot AI review requested due to automatic review settings April 5, 2026 05:12

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR appears to be a small maintenance/release update across the supervised-training codepaths and documentation, plus a package version bump.

Changes:

  • Remove redundant check_is_S7(hyperparameters, <Algo>Hyperparameters) checks from several train_ methods.
  • Minor documentation/comment tweaks (tests note, LightGBM predict docs, draw_varimp param docs) and removal of unused get_ranger_config().
  • Bump package Version and Date in DESCRIPTION.

Reviewed changes

Copilot reviewed 15 out of 15 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
tests/testthat/test_Supervised.R Adds context about small datasets and expected GLM warnings in tests.
R/train_TabNet.R Removes redundant S7 hyperparameters check and some commented code.
R/train_SVM.R Clarifies preprocessing section comments.
R/train_Ranger.R Removes redundant S7 hyperparameters check.
R/train_LightRuleFit.R Removes redundant S7 hyperparameters check.
R/train_LightRF.R Removes redundant S7 hyperparameters check.
R/train_LightGBM.R Removes redundant S7 hyperparameters check; expands predict docstring re preprocessing.
R/train_LightCART.R Removes redundant S7 hyperparameters check.
R/train_Isotonic.R Removes redundant S7 hyperparameters check.
R/train_GLMNET.R Removes redundant S7 hyperparameters check.
R/train_GAM.R Removes redundant S7 hyperparameters check.
R/draw_varimp.R Documents that x can be coercible to numeric.
R/02_Hyperparameters.R Removes get_ranger_config() helper.
man/draw_varimp.Rd Keeps Rd docs in sync with draw_varimp() param doc.
DESCRIPTION Version/date bump.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread R/train_LightGBM.R
Comment thread DESCRIPTION
Comment on lines +2 to +4
Version: 1.0.1
Title: Machine Learning and Visualization
Date: 2026-03-14
Date: 2026-04-03

Copilot AI Apr 5, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR title suggests initializing a VariableImportance class, but the changes here are primarily cleanup (removing redundant check_is_S7() calls), small documentation/comment tweaks, and a version/date bump. Please either update the PR title/description to match the actual changes or include the missing VariableImportance implementation changes in this PR.

Copilot uses AI. Check for mistakes.

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request updates the rtemis package to version 1.0.1, primarily focusing on code cleanup and documentation refinements. Key changes include the removal of redundant check_is_S7 calls across various training methods, the deletion of the unused get_ranger_config function, and updates to the documentation for draw_varimp and LightGBM prediction. Additionally, commented-out code was removed from the TabNet training script, and a note regarding test datasets was added to the test suite. Feedback highlights a discrepancy between the PR title and the actual code changes, and suggests improving consistency in documentation and section headers across similar methods.

Comment thread R/draw_varimp.R
#' which can be used to visualize variable importance, model coefficients, etc.
#'
#' @param x Numeric vector: Input.
#' @param x Numeric vector (or coercible to numeric): Input.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The pull request title mentions initializing a VariableImportance class, but no such class definition or related logic is present in the diff. This discrepancy should be addressed by either including the missing implementation or updating the PR title to reflect the actual changes.

Comment thread R/train_LightGBM.R
Comment on lines +118 to +119
#' @param newdata tabular data: Data to predict on. Will have been preprocessed by
#' `predict.Supervised` before calling this method if algorithm-specific preprocessing was performed during training.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

This documentation update regarding newdata preprocessing is helpful, but it has only been added to the predict_super method for lgb.Booster. For consistency and to avoid confusion, similar documentation should be added to other predict_super methods in the package (e.g., for TabNet, GAM, etc.) that follow the same preprocessing logic.

Comment thread R/train_SVM.R
@egenn

egenn commented Apr 5, 2026

Copy link
Copy Markdown
Member Author

/gemini review

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request standardizes variable importance handling by introducing a new VariableImportance S7 class and updating the Supervised and SupervisedRes classes accordingly. Various algorithm-specific varimp_super methods and plotting functions were updated to support this new structure, and redundant S7 type checks were removed from several training functions. Review feedback highlights potential runtime errors when subsetting dgCMatrix objects in GLMNET and LightRuleFit, a risk of indexing an empty list in SupervisedRes plots, and the need for dynamic column naming in GAM and GLM importance data to reflect the specific measure used.

Comment thread R/train_GLMNET.R Outdated
}

# Exclude intercept
coefs <- coefs[-1]

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

coef(model) for a cv.glmnet object returns a dgCMatrix. Subsetting it with [-1] to remove the intercept converts it to a numeric vector but loses the names (variable names). This will cause names(coefs) to be NULL in the subsequent data.table call, leading to an error. Using [, 1] first ensures a named vector is returned.

  coefs <- coefs[, 1][-1]

Comment thread R/train_LightRuleFit.R Outdated
#' @noRd
method(varimp_super, LightRuleFit) <- function(model) {
coef(model@model_glmnet@model)
.coef <- coef(model@model_glmnet@model)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

model@model_glmnet@model is typically a cv.glmnet object, so coef() returns a dgCMatrix. Calling names() on a matrix returns NULL, which will cause the variable column in the data.table to be empty or cause a length mismatch error. You should extract the first column to get a named vector and remove the intercept.

  .coef <- coef(model@model_glmnet@model)[, 1][-1]

Comment thread R/07_Supervised.R
# ylab
if (is.null(ylab)) {
measure_name <- if (is.null(measure)) {
names(x@varimp[[1L]]@data)[2L]

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

If x@varimp is an empty list (length 0), accessing x@varimp[[1L]] will result in an error. While the check on line 2055 handles NULL, it does not account for an empty list which is a valid state for a class_list property.

      if (length(x@varimp) > 0) names(x@varimp[[1L]]@data)[2L] else "Importance"

Comment thread R/train_GAM.R
Comment on lines +161 to +166
VariableImportance(
data.table(
variable = names(.coef),
Coefficient = unname(.coef)
)
)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The column name for the importance measure is hardcoded to "Coefficient". However, this method supports multiple types of importance measures including "p-value" and "edf". Using the type argument as the column name would provide more accurate labels in plots and summaries.

  .vi_dt <- data.table(variable = names(.coef), importance = unname(.coef))
  setnames(.vi_dt, "importance", type)
  VariableImportance(.vi_dt)

Comment thread R/train_GLM.R
Comment on lines +112 to +117
VariableImportance(
data.table(
variable = names(.coef),
Coefficient = unname(.coef)
)
)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The column name for the importance measure is hardcoded to "Coefficient". Since this method also supports "p-value" as an importance type, it is better to use the type variable to name the column dynamically.

  .vi_dt <- data.table(variable = names(.coef), importance = unname(.coef))
  setnames(.vi_dt, "importance", type)
  VariableImportance(.vi_dt)

@egenn egenn merged commit bd0ce26 into main Apr 5, 2026
2 checks passed
@egenn egenn deleted the supervised branch June 16, 2026 01:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants