Summary
This write-up covers customizing a {gtsummary} table in R so that percent symbols are removed from the table body and the “, n (%)” notation is removed from the stat_label. The goal is a streamlined table whose column headers state which statistics are reported, following the AMA Manual of Style.
Root Cause
The issue stems from {gtsummary} defaults for categorical variables. tbl_summary() formats categorical cells with the "{n} ({p}%)" template, which puts a percent symbol in every cell, and add_stat_label() appends “, n (%)” to the label of each categorical variable. Neither is a bug; both have to be overridden explicitly with {gtsummary} functions.
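To see where the percent symbol comes from, the sketch below compares the default categorical template with an override. It uses the `trial` dataset bundled with {gtsummary} rather than the data in this write-up, and the "{n} ({p})" template is an assumption: one way to keep the percentages while dropping the symbol.

```r
library(gtsummary)

# Default: categorical cells are rendered with the "{n} ({p}%)" template
default_tbl <- tbl_summary(trial, include = grade)

# Override: same statistics, but the template omits the literal "%"
plain_tbl <- tbl_summary(
  trial,
  include = grade,
  statistic = all_categorical() ~ "{n} ({p})"
)

# The rendered cells are stored in the stat_0 column of table_body
default_cells <- default_tbl$table_body$stat_0
plain_cells <- plain_tbl$table_body$stat_0

print(default_cells)
print(plain_cells)
```

Comparing the two printed vectors shows that the percent symbol is part of the template string, not of the computed statistic.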
Why This Happens in Real Systems
This issue occurs in real systems because:
- Default behavior: Unless told otherwise, {gtsummary} includes percent symbols and the “, n (%)” notation.
- Lack of customization: A table built with defaults only will not match a style guide that expects the statistic to be named once in the column header.
- Complexity of {gtsummary}: The relevant options are spread across several arguments and functions (statistic, digits, type, add_stat_label(), modify_header()), so it is not obvious which one controls which part of the output.
Real-World Impact
The real-world impact of this issue includes:
- Inconsistent formatting: Tables in the same manuscript may present statistics differently, which makes them harder to interpret.
- Non-compliance with style guides: Tables that do not follow a style guide such as the AMA Manual of Style can lead to rejection or revision of manuscripts.
- Difficulty in communication: Inconsistent or poorly formatted tables hinder effective communication of research findings.
Example or Code
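library(gt)
library(gtsummary)
library(dplyr)
library(tibble)

# Set seed for reproducibility
set.seed(123)

# Number of observations
n <- 10

# Simulate dataset
# NOTE: the original data-generating call was lost. The column names are taken
# from the labels used below; the distributions and levels are placeholders.
analytic <- tibble(
  age = round(rnorm(n, mean = 50, sd = 10)),
  sex = sample(c("Female", "Male"), n, replace = TRUE),
  race = sample(c("Asian", "Black", "White"), n, replace = TRUE),
  chronic_conditions = rpois(n, lambda = 2)
) |>
  # Introduce one missing value in the third row
  mutate(chronic_conditions = replace(chronic_conditions, 3, NA))

# Create summary table
analytic |>
  tbl_summary(
    label = list(
      age ~ "Age",
      sex ~ "Sex",
      race ~ "Race",
      chronic_conditions ~ "Chronic conditions"
    ),
    statistic = list(
      all_continuous() ~ "{mean} ({sd}) [{min}-{max}]",
      all_categorical() ~ "{n}"
    ),
    digits = list(
      all_continuous() ~ 1,
      all_categorical() ~ c(0, 1)
    ),
    type = list(
      all_dichotomous() ~ "categorical",
      # A count with few distinct values would otherwise be treated as categorical
      chronic_conditions ~ "continuous"
    )
  ) |>
  modify_table_styling(
    columns = all_stat_cols(),
    align = "left"
  ) |>
  modify_header(stat_0 = "**No.**") |>
  add_stat_label(
    label = list(
      all_continuous() ~ "mean (SD) [range]",
      age ~ "mean (SD) [range], y"
    )
  ) |>
  # Convert to a gt object to add the table title
  as_gt() |>
  tab_header(title = md("**Table 1.** Sample Characteristics (N = )")) |>
  opt_align_table_header(align = "left")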
How Senior Engineers Fix It
Senior engineers fix this issue by:
- Customizing the table: Setting the statistic argument of tbl_summary() so the categorical template contains no percent symbol, and passing explicit labels to add_stat_label().
- Understanding the default behavior: Knowing which defaults produce the percent symbols and the “, n (%)” suffix before trying to override them.
- Using the correct functions: Using modify_header() to name the statistic in the column header and modify_table_styling() for alignment, then handing off to {gt} with as_gt() for the title.
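The steps above can be sketched on a small table. This is an illustration under stated assumptions, not the author's exact solution: it uses the bundled `trial` dataset, keeps percentages via an assumed "{n} ({p})" template, and strips the categorical suffix with a regular expression inside modify_table_body(), which is one of several possible ways to do it.

```r
library(gtsummary)
library(dplyr)

tbl <- trial |>
  tbl_summary(
    include = c(age, grade),
    statistic = list(
      all_continuous() ~ "{mean} ({sd})",
      all_categorical() ~ "{n} ({p})" # no literal "%" in the cells
    ),
    missing = "no"
  ) |>
  # Move the statistic description from the footnote into the row labels
  add_stat_label(label = list(all_continuous() ~ "mean (SD)")) |>
  # Drop the categorical suffix from the labels; the pattern matches a
  # trailing ", n (%)" only and leaves every other label untouched
  modify_table_body(
    ~ mutate(.x, label = sub(", n \\(%\\)$", "", label))
  ) |>
  # Name the categorical statistic once, in the column header
  modify_header(stat_0 = "**No. (%)**")

labels <- tbl$table_body$label
cells <- tbl$table_body$stat_0
print(labels)
print(cells)
```

Editing the labels after add_stat_label() keeps the continuous labels intact while removing only the categorical suffix, so the column header can carry that information instead.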
Why Juniors Miss It
Juniors may miss this issue because:
- Lack of experience: Limited experience with {gtsummary} and table customization.
- Insufficient knowledge: Not knowing which defaults apply or which arguments override them.
- Overlooking details: Failing to notice the percent symbols and the “, n (%)” notation in the rendered table.
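Because these details are easy to overlook by eye, a small automated check can help. The helper below is hypothetical (`has_percent_symbol` is not part of {gtsummary}); it assumes the table was built with tbl_summary() and scans the row labels and statistic columns stored in table_body.

```r
library(gtsummary)

# Hypothetical helper: TRUE if any row label or statistic cell still
# contains a percent symbol
has_percent_symbol <- function(tbl) {
  body <- tbl$table_body
  # Statistic columns are named stat_0, stat_1, ...
  stat_cols <- grep("^stat_[0-9]+$", names(body), value = TRUE)
  cells <- unlist(body[c("label", stat_cols)], use.names = FALSE)
  any(grepl("%", cells, fixed = TRUE))
}

default_check <- has_percent_symbol(tbl_summary(trial, include = grade))
plain_check <- has_percent_symbol(
  tbl_summary(
    trial,
    include = grade,
    statistic = all_categorical() ~ "{n} ({p})"
  )
)

print(default_check)
print(plain_check)
```

A check like this could sit in a project's test suite so that a later change to the statistic templates does not silently reintroduce the symbols.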