Introduce SplitId(Arc<str>) newtype, confine Ulid to split id generation#6564

Draft: fulmicoton-dd wants to merge 1 commit into main from split-id-newtype

@fulmicoton-dd (Collaborator)

Replace the `pub type SplitId = String` alias with an opaque `SplitId(Arc<str>)` newtype so that no code outside split id generation depends on the internal structure of a split id. This is the first step towards adding a random prefix to split ids (to spread S3 read load across prefixes) while keeping that future change localized to the generator.

The newtype is serde-transparent, so on-disk and wire representations are unchanged (backward compatible). It is adopted by `SplitMetadata`, the searcher split cache (where `Ulid` was used as an opaque key), and the indexing split store registry. Wire/proto types, search-internal hit tracking, CLI args, and id-list helpers feeding proto requests remain `String`.
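
For readers who have not opened the diff, here is a minimal std-only sketch of what an opaque `Arc<str>` newtype of this kind could look like. It is not the code from this PR: the `as_str` accessor and the `From`, `Display` and `Borrow` impls are assumptions chosen for illustration, and the serde derive is left out so the snippet builds without external crates (the description's "serde-transparent" presumably corresponds to `#[serde(transparent)]`).

```rust
use std::borrow::Borrow;
use std::fmt;
use std::sync::Arc;

/// Opaque split identifier. The inner string is private, so callers cannot
/// depend on how the id is structured (for example, that it is a ULID).
#[derive(Clone, Debug, PartialEq, Eq, Hash, PartialOrd, Ord)]
pub struct SplitId(Arc<str>);

impl SplitId {
    /// Hypothetical accessor for boundaries that still need a `&str`
    /// (proto requests, CLI output, logging).
    pub fn as_str(&self) -> &str {
        &self.0
    }
}

impl From<String> for SplitId {
    fn from(id: String) -> Self {
        SplitId(Arc::from(id))
    }
}

impl From<&str> for SplitId {
    fn from(id: &str) -> Self {
        SplitId(Arc::from(id))
    }
}

impl fmt::Display for SplitId {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str(&self.0)
    }
}

/// Lets a `HashMap<SplitId, V>` be queried with a plain `&str`.
/// This is sound because the derived `Hash`/`Eq`/`Ord` delegate to `str`.
impl Borrow<str> for SplitId {
    fn borrow(&self) -> &str {
        &self.0
    }
}
```

Cloning such a value only bumps the `Arc` reference count instead of copying the string, which is one plausible reason to prefer `Arc<str>` over `String` for an id that is shared across caches and registries.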

After this change, `ulid::Ulid` is referenced for split ids in only two places:

  • `new_split_id()`, the generation site
  • a temporary shim in the indexing split cache that recovers a split's creation time from the trailing ULID of its id, tolerant of a future random prefix. This store is slated for removal with the compactor service.
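
The second bullet can be illustrated with a std-only sketch of how a creation time might be recovered from a trailing ULID. This is an assumption-laden illustration, not the shim from this PR (which uses `ulid::Ulid`): the function names are hypothetical, and the only facts relied on are from the ULID format itself, namely a 26-character Crockford base32 string whose first 10 characters encode a 48-bit millisecond timestamp.

```rust
/// Length of a canonical ULID string.
const ULID_LEN: usize = 26;
/// Leading ULID characters that encode the 48-bit millisecond timestamp.
const ULID_TIMESTAMP_LEN: usize = 10;

/// Maps one Crockford base32 character to its 5-bit value.
fn crockford_value(c: u8) -> Option<u64> {
    const ALPHABET: &[u8; 32] = b"0123456789ABCDEFGHJKMNPQRSTVWXYZ";
    let upper = c.to_ascii_uppercase();
    ALPHABET.iter().position(|&a| a == upper).map(|i| i as u64)
}

/// Hypothetical helper: reads the creation time (milliseconds since the Unix
/// epoch) from the last 26 characters of a split id. Anything before those
/// characters, such as a future random prefix, is ignored.
fn creation_time_millis(split_id: &str) -> Option<u64> {
    let bytes = split_id.as_bytes();
    let start = bytes.len().checked_sub(ULID_LEN)?;
    let ulid = &bytes[start..];
    // Reject ids whose tail is not made of valid Crockford base32 characters.
    if !ulid.iter().all(|&c| crockford_value(c).is_some()) {
        return None;
    }
    let mut millis = 0u64;
    for &c in &ulid[..ULID_TIMESTAMP_LEN] {
        millis = (millis << 5) | crockford_value(c)?;
    }
    Some(millis)
}
```

With this sketch, both `01ARYZ6S41TSV4RRFFQ69G5FAV` and a prefixed form such as `a1b2-01ARYZ6S41TSV4RRFFQ69G5FAV` yield `Some(1469918176385)`. The `a1b2-` prefix and its separator are made up for the example; the PR does not specify the future prefix format.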

Description

Describe the proposed changes made in this PR.

How was this PR tested?

Describe how you tested this PR.

@fulmicoton-dd force-pushed the split-id-newtype branch 13 times, most recently from 68bd42d to 2a13d2f (June 30, 2026 13:00)