Skip to content

metrics: per-session download goodput histogram#674

Merged
reflog merged 2 commits into
mainfrom
reflog/session-goodput-histogram
Jun 25, 2026
Merged

metrics: per-session download goodput histogram#674
reflog merged 2 commits into
mainfrom
reflog/session-goodput-histogram

Conversation

@reflog

@reflog reflog commented Jun 25, 2026

Copy link
Copy Markdown
Contributor

Why

The measurement signal for Lantern's automatic bandit experimentation system. To decide whether a challenger track (a new protocol/datacenter combination) actually delivers a big win, the evaluator needs a real per-session throughput signal it can compare between the challenger track and the incumbent, per region×country. proxy.io (a byte counter) can't give a rate/distribution; this adds one.

Port of getlantern/lantern-box#277 to http-proxy-lantern.

What

proxy.session.goodput — a Float64Histogram (unit bytes/s) recorded once at connection close = received bytes / connection seconds, for sessions that moved ≥ 1 MB.

  • Tagged with network.io.direction=receive plus geo.country.iso_code (point attr), so it's sliceable by track × dc/region (resource attrs from buildResource()) × country — exactly the strata the experiment evaluator compares.
  • No device_id tag (cardinality).

Since the challenger and control are separate tracks, the arm is the track — no per-connection experiment tagging needed; the evaluator queries this histogram's p50 by track name per stratum.

How it differs from lantern-box#277

lantern-box wraps Conn/PacketConn to accumulate per-connection rx bytes (atomic) and emits at Close(). http-proxy-lantern already measures per-connection bytes and duration via the measured listener: the reporting callback (reporting.go) receives cumulative stats (with RecvTotal and Duration) and a final=true flag at close. So goodput is recorded there — before the zero-delta early return, so a session that was idle during its last reporting interval is still counted.

Caveat (documented in code)

Duration is connection open time (includes idle), so goodput is a floor on true transfer speed. Both experiment arms are measured identically, so it's a fair relative signal, and the ≥1 MB floor filters idle-dominated noise.

Tests

TestSessionGoodput (≥1 MB session → one sample, value ≈ bytes/sec, receive tag), TestSessionGoodputBelowThreshold (sub-threshold → no sample), and TestSessionGoodputZeroDuration (divide-by-zero guard). go test ./instrument/ green.

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features
    • Added session download goodput tracking to capture per-connection throughput when sessions close.
    • Included geo (ISO country) and network direction details with the recorded metric.
  • Bug Fixes
    • Avoids recording goodput for very small transfers and for zero/negative durations.
    • Improves reporting flow by parsing client IP once and skipping unnecessary work for “no delta” non-final intervals.
  • Tests
    • Added metric-based tests validating histogram emission, attributes, and correct count/sum behavior across scenarios.

Adds proxy.session.goodput, a Float64Histogram (unit bytes/s) recorded
once at connection close = received bytes / connection seconds, for
sessions that moved >= 1 MB.

This is the measurement signal for Lantern's automatic bandit
experimentation system: to decide whether a challenger track actually
delivers a big win, the evaluator needs a real per-session throughput
signal it can compare between challenger and incumbent, per
region x country. proxy.io (a byte counter) can't give a
rate/distribution; this adds one.

Tagged with network.io.direction=receive plus geo.country.iso_code
(point attr), so it's sliceable by track x dc/region (resource attrs
from buildResource) x country -- exactly the strata the evaluator
compares. No device_id tag (cardinality).

Unlike lantern-box#277, which wraps Conn/PacketConn to accumulate rx
bytes and emit at Close(), http-proxy-lantern already measures
per-connection bytes and duration via the measured listener: the
reporting callback receives cumulative stats and a final=true flag at
close. So goodput is recorded there, before the zero-delta early return
so a session idle during its last reporting interval is still counted.

Caveat (documented in code): duration is connection open time (includes
idle), so goodput is a floor on true transfer speed. Both experiment
arms are measured identically, so it's a fair relative signal, and the
>=1 MB floor filters idle-dominated noise.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@coderabbitai

coderabbitai Bot commented Jun 25, 2026

Copy link
Copy Markdown

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 08a30248-440a-4d0f-b7e5-c999f6a46a30

📥 Commits

Reviewing files that changed from the base of the PR and between 533f65c and 23f4f7c.

📒 Files selected for processing (2)
  • instrument/goodput_test.go
  • reporting.go
🚧 Files skipped from review as they are similar to previous changes (2)
  • instrument/goodput_test.go
  • reporting.go

📝 Walkthrough

Walkthrough

Adds session goodput instrumentation by registering a new proxy.session.goodput histogram, recording it from session-close reporting with client IP context, and covering threshold and zero-duration behavior in tests.

Changes

Session Goodput Instrumentation

Layer / File(s) Summary
Metric contract and registration
instrument/instrument.go, instrument/otelinstrument/otelinstrument.go
Adds the goodput threshold, the SessionGoodput interface surface, the no-op implementation, and proxy.session.goodput histogram registration.
Goodput recording logic
instrument/instrument.go
Implements defaultInstrument.SessionGoodput to skip short or zero-duration sessions, compute bytes per second, derive country from clientIP, and record receive-direction goodput.
Reporting hook
reporting.go
Moves client_ip extraction before the final reporting branch and calls instrument.SessionGoodput(...) from proxiedBytesReporter.
Goodput metric tests
instrument/goodput_test.go
Adds manual-reader collection helpers and tests for emitted, suppressed, and zero-duration goodput histograms.

Sequence Diagram(s)

sequenceDiagram
  participant proxiedBytesReporter
  participant defaultInstrumentSessionGoodput as defaultInstrument.SessionGoodput
  participant sessionGoodputHistogram as otelinstrument.SessionGoodput
  proxiedBytesReporter->>defaultInstrumentSessionGoodput: SessionGoodput(ctx, recvBytes, duration, clientIP)
  defaultInstrumentSessionGoodput->>sessionGoodputHistogram: record goodput with attrs
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

A bunny hopped through bytes and light,
And metrics glowed at closing time bright.
Tiny nibs stayed in the grass,
Big goodput hops went zooming past,
🐇 Hooray for histograms tonight!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the main change: adding a per-session download goodput histogram metric.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch reflog/session-goodput-histogram

Comment @coderabbitai help to get the list of available commands.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
instrument/goodput_test.go (1)

49-51: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Also assert the geo.country.iso_code attribute to fully cover the metric contract.

The positive-path test currently validates only network.io.direction; adding a country-attribute assertion would better lock the intended emitted shape.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@instrument/goodput_test.go` around lines 49 - 51, The positive-path check in
the goodput test only verifies network.io.direction, so the emitted metric shape
is not fully covered. Update the test that uses extractHistogramAttrs and assert
the geo.country.iso_code attribute as well, alongside the existing
proxy.session.goodput and network.io.direction checks, so the contract is
validated in the same test.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@instrument/goodput_test.go`:
- Around line 20-29: newGoodputInstrument mutates the global OTEL meter provider
without restoring it, which can leak the manual-reader provider into later
tests. In newGoodputInstrument, save the current provider from
sdkotel.GetMeterProvider before calling sdkotel.SetMeterProvider, then use
t.Cleanup to restore that original provider after the test finishes. Keep the
change localized to newGoodputInstrument and the MeterProvider setup around
sdkmetric.NewManualReader/NewMeterProvider.

---

Nitpick comments:
In `@instrument/goodput_test.go`:
- Around line 49-51: The positive-path check in the goodput test only verifies
network.io.direction, so the emitted metric shape is not fully covered. Update
the test that uses extractHistogramAttrs and assert the geo.country.iso_code
attribute as well, alongside the existing proxy.session.goodput and
network.io.direction checks, so the contract is validated in the same test.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: cee7f2ce-83d2-4324-b3ab-2e668290e5ec

📥 Commits

Reviewing files that changed from the base of the PR and between 2e42d57 and 533f65c.

📒 Files selected for processing (4)
  • instrument/goodput_test.go
  • instrument/instrument.go
  • instrument/otelinstrument/otelinstrument.go
  • reporting.go

Comment thread instrument/goodput_test.go

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a per-session download “goodput” telemetry signal (proxy.session.goodput) so the bandit evaluator can compare per-track throughput distributions (e.g., p50) across region/country strata using a single sample emitted at connection close.

Changes:

  • Emit a Float64Histogram sample at connection close with recv_bytes / connection_open_seconds for sessions with ≥1,000,000 received bytes.
  • Wire the emission into the measured listener reporting callback (so final/idle-last-interval sessions are still counted).
  • Add unit tests covering thresholding and zero-duration guard.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.

File Description
reporting.go Records session goodput at connection close from cumulative measured stats.
instrument/otelinstrument/otelinstrument.go Defines the proxy.session.goodput Float64Histogram with unit bytes/s.
instrument/instrument.go Adds SessionGoodput() to the instrumentation interface and implements recording with geo + direction attrs.
instrument/goodput_test.go Adds tests for goodput emission, threshold behavior, and zero-duration guard.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread reporting.go
Comment thread instrument/goodput_test.go
Comment thread instrument/goodput_test.go
- reporting.go: skip client IP parsing on idle, non-final reporting
  intervals (return on zero-delta before parsing), and use a safe type
  assertion for ctx[common.ClientIP]. Avoids the per-interval overhead
  regression introduced by recording goodput at close.
- goodput_test.go: restore the global meter provider via t.Cleanup so
  the manual-reader provider doesn't leak into other tests.
- goodput_test.go: also assert the geo.country.iso_code point attribute
  is present, fully covering the emitted metric contract.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@reflog reflog merged commit b0d6915 into main Jun 25, 2026
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants