Summary
Losing the calling context while injecting expressions is a classic data-masking pitfall.
- The mask created by `as_data_mask()` is cut off from the calling environment, so functions like `max` are not found when code is evaluated directly in it.
- The canonical solution is to build a mask that inherits the caller’s environment while still letting data columns take precedence.
Root Cause
- `as_data_mask()` creates a fresh set of environments that hold the data frame’s columns; on its own, that chain does not inherit from the calling environment.
- When `inject()` evaluates the quasiquoted expression with the mask as its environment, lookup sees only the mask and its ancestors.
- Anything outside the mask, including base functions, is hidden, causing “could not find function `max`” errors.
- This is not a bug; it’s the intended isolation of the data mask.
Technical reasons
- The mask holds data objects (plus a few tidy-eval helpers), not the functions an expression calls.
- The parent chain is deliberately left unattached until evaluation time.
- No automatic bridge to the calling environment is provided unless the mask is evaluated through `eval_tidy()`.
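The lookup failure can be reproduced in a few lines. The sketch below assumes the rlang package is attached and uses a made-up one-column data frame (not taken from the original report). It evaluates the same expression once directly in the mask and once through `eval_tidy()`.

```r
library(rlang)

df <- data.frame(A = c(1, 5, 3))  # illustrative data, an assumption
mask <- as_data_mask(df)

# Columns resolve inside the mask.
eval(quote(A), mask)

# A function call evaluated directly in the mask is expected to fail,
# because the lookup never reaches the base environment.
direct <- tryCatch(
  eval(quote(max(A)), mask),
  error = function(e) conditionMessage(e)
)

# eval_tidy() attaches the mask to an environment for the duration of the
# call, so the same expression resolves both `max` and `A`.
via_tidy <- eval_tidy(quote(max(A)), mask)
```

If the mask behaves as described above, `direct` holds the error message as a string, while `via_tidy` holds the maximum of the column.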
Why This Happens in Real Systems
- Many R packages (dplyr and the rest of the tidyverse) rely on data-masking to let users write code that looks like `summarise(n = n())`.
- Those functions evaluate through `eval_tidy()`, which attaches the mask to a controlled parent environment so that lookups behave predictably.
- When you try to reuse that mask for custom quasiquotation, the same isolation becomes a gotcha unless you explicitly restore the parent.
- Real‑world pipelines often embed custom masks inside larger workflows, making the loss of context subtle and hard to debug.
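As a rough illustration of how such a verb is typically wired (a sketch, not dplyr's actual implementation), the hypothetical `my_summarise()` below defuses the user's expression with `enquo()` and evaluates it with `eval_tidy()`. The quosure remembers the caller's environment, which is what keeps both local variables and base functions reachable. The function name, the data, and `scale_by` are assumptions made for this example.

```r
library(rlang)

# Hypothetical helper, named here for illustration only.
my_summarise <- function(df, expr) {
  captured <- enquo(expr)        # expression plus the caller's environment
  eval_tidy(captured, data = df) # columns first, then the caller's context
}

scale_by <- 10  # lives in the calling context, not in the data
my_summarise(data.frame(A = c(1, 5, 3)), max(A) * scale_by)
```

Here `A` comes from the data frame, while `max` and `scale_by` are found through the environment recorded in the quosure.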
Real-World Impact
- Production code fails at runtime when a user’s expression references a base function that is not reachable from the mask.
- Debugging can take hours because the error surfaces only at runtime, not at definition time.
- Teams may end up hard‑coding function listings into masks, leading to maintenance nightmares.
- Collaboration suffers when junior developers inadvertently break downstream analyses.
Example or Code
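library(rlang)
# Minimal reproducible example of a broken inject()
# (function body assumed from the surrounding description)
fn_inj <- function(df, col) {
  mask <- as_data_mask(df)
  inject(max(!!sym(col)), mask)  # evaluated in the bare mask
}
fn_inj(data.frame(A = c(1, 5, 3)), "A")
#> Error: could not find function "max"
# Correct approach: build the mask on top of the caller's environment
make_mask <- function(df, env = caller_env()) {
  new_data_mask(as_environment(df, parent = env))  # columns first, then the caller's context
}
fn_fixed <- function(df, col) {
  mask <- make_mask(df)
  inject(max(!!sym(col)), mask)
}
fn_fixed(data.frame(A = c(1, 5, 3)), "A")  # returns the max of column A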
How Senior Engineers Fix It
- Create a mask that explicitly inherits the caller’s environment, for example by building it on an environment whose parent is the caller’s: `new_data_mask(as_environment(df, parent = caller_env()))`.
- Wrap the injection call so that the expression is evaluated against that mask, with the calling context still reachable behind it.
- Keep data and functions separate: data objects sit in the mask, one level below the environment that supplies functions, so columns take precedence without hiding the calling context.
- Leverage `eval_tidy()` when possible; it attaches the mask to the environment passed as `env` (or to each quosure’s environment) for you.
- Document the mask-building step and add unit tests that verify function resolution.
- Key takeaway: preserving the parent environment is the canonical way to retain calling-context functions while still using a data mask.
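One way to follow the `eval_tidy()` route from the list above is sketched here. The function name `max_col()` and the sample data are assumptions made for illustration, and the `stopifnot()` calls stand in for the unit tests that verify function resolution.

```r
library(rlang)

# Hypothetical wrapper: `col` is a column name given as a string.
max_col <- function(df, col) {
  mask <- as_data_mask(df)
  # expr() performs the injection; eval_tidy() evaluates in the mask and
  # falls back to its `env` argument, which defaults to the environment
  # eval_tidy() is called from (this function's environment).
  eval_tidy(expr(max(!!sym(col))), mask)
}

df <- data.frame(A = c(1, 5, 3), B = c(9, 2, 4))  # illustrative data

# Function-resolution checks: base functions and columns must both be found.
stopifnot(max_col(df, "A") == 5)
stopifnot(max_col(df, "B") == 9)
```

Letting `eval_tidy()` manage the mask's parent avoids touching the mask's internals by hand, at the cost of routing the evaluation through it rather than passing the mask to `inject()` directly.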
Why Juniors Miss It
- They often view `as_data_mask()` as a black box that “just works” and assume it behaves like the environments they create manually.
- They may not be aware of parent-environment tools such as `env_parent()` and `env_poke_parent()`, or of the importance of environment inheritance.
- The error message (“could not find function”) is vague, leading to misdiagnosis and repeated trial-and-error.
- Junior developers tend to copy‑paste examples without understanding the underlying environment mechanics, missing the subtle step of restoring the parent.
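The environment mechanics behind the error can be inspected directly, which tends to be quicker than trial and error. The snippet below is a minimal sketch that assumes rlang is attached and uses a throwaway data frame. It asks whether a lookup starting in a freshly created mask can ever reach base R, and contrasts that with the global environment.

```r
library(rlang)

mask <- as_data_mask(data.frame(A = 1:3))  # throwaway data, an assumption

# Can a lookup that starts in the mask reach the base environment?
mask_sees_base <- env_inherits(mask, base_env())

# For comparison, the global environment does inherit from base.
global_sees_base <- env_inherits(global_env(), base_env())

# The environments a lookup from the mask would walk through.
parents <- env_parents(mask)
```

If `mask_sees_base` is `FALSE`, any function call evaluated directly in the mask will fail to resolve, regardless of which function it is.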