Model names not showing from a list using performance::compare_performance() in R

Summary

The issue arises when performance::compare_performance() is used in R to compare several models stored in a list. When the models are passed as separate arguments, each row of the output is labelled with the object name (lm1, lm2, lm3). When the same models are passed as a single unnamed list, the rows get generic labels (e.g., “Model 1”, “Model 2”, etc.) instead. The list carries no names, so compare_performance() has nothing to recover the original labels from.

Root Cause

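The root cause is that list(lm1, lm2, lm3) creates an unnamed list: it stores the model objects but not the names of the variables they came from. When models are passed as separate arguments, a function can read the labels from the expressions in the call. When a single list is passed, the only expression available is the list itself, and names() on it returns NULL, so generic labels are used. A warning about "implicit list embedding of S4 objects" is sometimes blamed for this, but lm objects are S3, and that warning does not explain the missing names.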
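The difference can be seen with base R alone. The sketch below is illustrative: arg_labels() is a hypothetical helper written for this example, not a function from performance, and it only imitates the general idea of reading labels from the call.

```r
# Fit the three models from the reproduction example (iris ships with R).
lm1 <- lm(Sepal.Length ~ Species, data = iris)
lm2 <- lm(Sepal.Length ~ Species + Petal.Length, data = iris)
lm3 <- lm(Sepal.Length ~ Species * Petal.Length, data = iris)

# list() stores the values only; the variable names are not recorded.
unnamed <- list(lm1, lm2, lm3)
names(unnamed)              # NULL

# Naming the elements explicitly keeps the labels with the models.
named <- list(lm1 = lm1, lm2 = lm2, lm3 = lm3)
names(named)                # "lm1" "lm2" "lm3"

# Hypothetical helper: returns the expressions the caller typed.
# This only yields model names when models arrive as separate arguments.
arg_labels <- function(...) {
  exprs <- as.list(substitute(list(...)))[-1]
  vapply(exprs, function(e) paste(deparse(e), collapse = " "),
         character(1), USE.NAMES = FALSE)
}

arg_labels(lm1, lm2, lm3)   # "lm1" "lm2" "lm3"
arg_labels(unnamed)         # "unnamed"
```

With a single list argument, the call contains one expression (the name of the list), which says nothing about the models inside it.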

Why This Happens in Real Systems

This issue can occur in real-world systems when:

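  • Fitting many models programmatically (for example with lapply() or a loop), so they only ever exist as list elements
  • Using lists to store and manage models
  • Relying on automated workflows and pipelines, where nobody types the model names into the final call
  • Not realizing that list() drops variable names unless the elements are named explicitly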
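When models are fitted programmatically, the names can be attached at the point of creation. The following is a minimal sketch using base R only; the labels species, additive, and interaction are made up for this example.

```r
# A named list of formulas; the names are illustrative labels.
formulas <- list(
  species     = Sepal.Length ~ Species,
  additive    = Sepal.Length ~ Species + Petal.Length,
  interaction = Sepal.Length ~ Species * Petal.Length
)

# lapply() keeps the names of its input, so the fitted models
# inherit the labels without any extra bookkeeping.
models <- lapply(formulas, function(f) lm(f, data = iris))

names(models)   # "species" "additive" "interaction"
```

Starting from a named input means the labels never depend on variable names in the calling environment.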

Real-World Impact

The impact of this issue includes:

  • Loss of model identifiability: making it difficult to track and compare the performance of specific models
  • Inconsistent output: leading to confusion and errors in interpretation and decision-making
  • Fragile downstream code: anything that matches results back to models has to rely on list position instead of a name

Example or Code

To reproduce the issue, we can use the following code:

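data(iris)
library(performance)

# Three linear models of increasing complexity
lm1 <- lm(Sepal.Length ~ Species, data = iris)
lm2 <- lm(Sepal.Length ~ Species + Petal.Length, data = iris)
lm3 <- lm(Sepal.Length ~ Species * Petal.Length, data = iris)

# Separate arguments: rows are labelled lm1, lm2, lm3
compare_performance(lm1, lm2, lm3)

# Unnamed list: the variable names are lost, so generic labels appear
modx3lm <- list(lm1, lm2, lm3)
compare_performance(modx3lm)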

How Senior Engineers Fix It

To fix this issue, senior engineers can use the following approaches:

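  • Name the list elements: build the list as list(lm1 = lm1, lm2 = lm2, lm3 = lm3) so the labels travel with the models; recent versions of performance are expected to use these names, but check the output on your installed version
  • Pass the models as separate arguments when there are only a few of them, as in compare_performance(lm1, lm2, lm3)
  • Relabel the result: compare_performance() has no name argument, so if generic labels still appear, overwrite the Name column of the returned data frame with the intended labels
  • Update the performance package: ensure that the performance package is up-to-date, as newer versions may handle lists of models better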
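The first and third approaches can be combined as in the sketch below. It rests on two assumptions that should be checked against the installed version of performance: that a named list is accepted as the only model argument, and that with the default rank = FALSE the rows come back in input order.

```r
library(performance)

lm1 <- lm(Sepal.Length ~ Species, data = iris)
lm2 <- lm(Sepal.Length ~ Species + Petal.Length, data = iris)
lm3 <- lm(Sepal.Length ~ Species * Petal.Length, data = iris)

# Name the elements so the labels are stored in the list itself.
models <- list(lm1 = lm1, lm2 = lm2, lm3 = lm3)

# Assumption: the installed version picks up names(models) as labels.
res <- compare_performance(models)

# Fallback: if the labels are still generic, overwrite the Name column.
# Assumes rows are in input order (rank = FALSE, the default).
if (!identical(as.character(res$Name), names(models))) {
  res$Name <- names(models)
}

res
```

The fallback is a no-op when the names were already picked up, so the same code works whichever way the installed version behaves.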

Why Juniors Miss It

Juniors may miss this issue due to:

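  • Lack of experience: limited exposure to how R functions read argument names from the call, and to how lists behave differently
  • Insufficient knowledge: assuming that a list remembers the variable names of the objects placed in it, when list() only keeps names that are given explicitly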
  • Overreliance on autopilot: not carefully reviewing the output and warnings generated by the code
