Describe the bug, including details regarding any error messages, version, and platform.
Description
Three methods in the arrow R package access R metadata using x$metadata$r. Because $ on a list uses partial matching, a schema-level metadata key that starts with "r" but is not "r" (e.g. "rachel", "row_count", "result") is erroneously matched when there is no exact "r" key and it is the only key starting with "r". Its value is then passed to apply_arrow_r_metadata() or used as group var metadata, which causes spurious "Invalid metadata$r" warnings or hard errors depending on the matched value.
The fix in all three locations is to replace x$metadata$r with x$metadata[["r"]].
Affected code
collect.ArrowTabular: apply_arrow_r_metadata(df, x$metadata$r)
as.data.frame.ArrowTabular: apply_arrow_r_metadata(df, x$metadata$r)
group_vars.ArrowTabular: x$metadata$r$attributes$.group_vars
Reprex
library(arrow)
library(dplyr)
# Build a table with a schema metadata key that starts with "r" but isn't "r".
# This can happen when integrating with systems that attach their own metadata
# (e.g., a key called "rachel", "row_count", "result", etc.).
tbl <- arrow_table(x = 1:3)
tbl_rachel <- tbl$cast(
  tbl$schema$WithMetadata(list(rachel = "some_value"))
)
# Confirm that $r partial-matches to $rachel, while [["r"]] correctly returns NULL
meta <- tbl_rachel$metadata
meta$r # "some_value" <-- partial match: WRONG
meta[["r"]] # NULL <-- exact match: correct
# as.data.frame() spuriously warns "Invalid metadata$r"
as.data.frame(tbl_rachel)
#> Warning message: Invalid metadata$r
# collect() same spurious warning
collect(tbl_rachel)
#> Warning message: Invalid metadata$r
# group_vars() hard errors because it does x$metadata$r$attributes$.group_vars
# and "$" is invalid on an atomic vector
group_vars(tbl_rachel)
#> Error in x$metadata$r$attributes : $ operator is invalid for atomic vectors
Expected behaviour
as.data.frame() and collect() should return the data without any warning — there is no "r" metadata key, so no R metadata should be applied.
group_vars() should return character(0) — there are no group vars encoded.
Actual behaviour
as.data.frame() and collect() emit a spurious "Invalid metadata$r" warning.
group_vars() throws "$ operator is invalid for atomic vectors".
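The hard error can be reproduced without arrow. The following is a minimal sketch, assuming the metadata behaves like a plain R list (as described under Root cause). The helpers group_vars_partial() and group_vars_exact() are hypothetical stand-ins for the access pattern in group_vars.ArrowTabular, not arrow functions:

```r
# Hypothetical stand-ins for the access pattern in group_vars.ArrowTabular;
# these are NOT arrow functions.
group_vars_partial <- function(metadata) metadata$r$attributes$.group_vars
group_vars_exact <- function(metadata) metadata[["r"]]$attributes$.group_vars

# Metadata carrying a foreign key that merely starts with "r".
foreign <- list(rachel = "some_value")

# `$r` partial-matches "rachel" and returns a string; `$attributes` on a
# string then raises an error, which is captured here instead of aborting.
err <- tryCatch(group_vars_partial(foreign), error = function(e) e)
conditionMessage(err)

# `[["r"]]` returns NULL, and `$` on NULL simply returns NULL.
group_vars_exact(foreign)

# Genuine R metadata (assumed shape) is still found by the exact lookup.
genuine <- list(r = list(attributes = list(.group_vars = "g")))
group_vars_exact(genuine)
```

In this sketch the exact lookup degrades to NULL when there is no "r" key, which a caller can then map to character(0).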
Root cause
schema$metadata returns a plain R list. R's $ operator performs partial matching on lists, so meta$r resolves to meta$rachel when no exact "r" key exists. The fix is to use [[ (which matches names exactly by default) everywhere $metadata$r appears:
# Before (all three methods)
x$metadata$r
# After
x$metadata[["r"]]
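The matching rules behind this can be checked in base R alone. A small sketch, using plain lists with made-up keys in place of real schema metadata:

```r
# Plain list standing in for schema metadata (keys are illustrative only).
meta <- list(rachel = "some_value")

meta$r                      # partial match on "rachel"
meta[["r"]]                 # exact match only, so NULL
meta[["r", exact = FALSE]]  # partial matching is opt-in for `[[`

# Two keys starting with "r": the partial match is ambiguous, so `$` gives NULL.
ambiguous <- list(rachel = "a", row_count = "3")
ambiguous$r

# An exact "r" key takes precedence over longer keys with the same prefix.
with_r <- list(r = "real", rachel = "other")
with_r$r
```

The ambiguous case shows that the bug only surfaces when exactly one key begins with "r", which may explain why it can go unnoticed.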
Session Info
R version 4.6.0 (2026-04-24)
Platform: aarch64-apple-darwin25.4.0
Running under: macOS Tahoe 26.5.1
Matrix products: default
BLAS: /opt/homebrew/Cellar/openblas/0.3.33/lib/libopenblasp-r0.3.33.dylib
LAPACK: /opt/homebrew/Cellar/r/4.6.0/lib/R/lib/libRlapack.dylib; LAPACK version 3.12.1
locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8
time zone: Europe/London
tzcode source: internal
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] dplyr_1.2.1 arrow_24.0.0
loaded via a namespace (and not attached):
[1] assertthat_0.2.1 R6_2.6.1 bit_4.6.0 tidyselect_1.2.1
[5] magrittr_2.0.5 glue_1.8.1 tibble_3.3.1 pkgconfig_2.0.3
[9] bit64_4.8.2 generics_0.1.4 lifecycle_1.0.5 cli_3.6.6
[13] vctrs_0.7.3 compiler_4.6.0 purrr_1.2.2 pillar_1.11.1
[17] rlang_1.2.0
Component(s)
R