update sdk with new adds by luke-e-schaefer · Pull Request #467 · scaleapi/nucleus-python-client

luke-e-schaefer · 2026-06-25T21:41:50Z

Added

Evaluations V2 slice scoping and exclusion rules. create_evaluation_v2() accepts slice_id (restrict the evaluation to a slice's items) and exclusion_rules (drop items/annotations before metrics are computed) via the new MetadataExclusionRule, LabelExclusionRule, and BoxAreaExclusionRule types (or equivalent dicts). The EvaluationV2 resource exposes slice_id, exclusion_rules, and exclusion_stats. EvaluationV2FilterArgs gains gt_area_range (filter by ground-truth box area, e.g. COCO small/medium/large bands) and slice_ids, applied by both charts() and examples().
Evaluation V2 presets. Save and reuse evaluation configurations (name + allowed_label_matches + exclusion_rules) via NucleusClient.list_evaluation_v2_presets(), create_evaluation_v2_preset(), update_evaluation_v2_preset(), and delete_evaluation_v2_preset(), plus the new EvaluationV2Preset resource (with update() / delete()). Apply a preset directly when creating an evaluation: create_evaluation_v2(model_run_id, preset=preset) seeds the matches and rules (explicit arguments override the preset).
create_evaluation_v2() accepts only_items_with_predictions to restrict the evaluation to items that have at least one prediction.
Batch create. create_evaluations_v2_batch() creates one evaluation per (model_run_id, slice_id) pair with a shared configuration, running concurrently and returning a BatchEvaluationResult per job (capturing the created evaluation or the per-job error).
Cancel & retry. EvaluationV2.cancel() stops a running evaluation; EvaluationV2.retry() re-runs a failed one, reusing its slice/matches/exclusion rules.
Dataset.evaluation_label_schema() returns the dataset's ground-truth and prediction label vocabularies (gt_labels / prediction_labels) for building label matches and label exclusion rules.
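
The batch behavior described above (one job per (model_run_id, slice_id) pair, run concurrently, with each job's evaluation or error captured) can be sketched without the SDK. This is only an illustration under stated assumptions, not the SDK's implementation: JobResult, run_batch, and fake_create are hypothetical names standing in for BatchEvaluationResult, create_evaluations_v2_batch(), and the per-evaluation create call, and the IDs are made up.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from itertools import product
from typing import Callable, List, Optional, Sequence


@dataclass
class JobResult:
    """Hypothetical stand-in for a per-job batch result."""
    model_run_id: str
    slice_id: str
    evaluation: Optional[dict] = None  # set when the job succeeded
    error: Optional[str] = None        # set when the job raised


def run_batch(
    model_run_ids: Sequence[str],
    slice_ids: Sequence[str],
    create_one: Callable[[str, str], dict],
    max_workers: int = 4,
) -> List[JobResult]:
    """Run create_one for every (model_run_id, slice_id) pair concurrently.

    A failing job does not abort the batch; its error is recorded instead.
    """
    pairs = list(product(model_run_ids, slice_ids))

    def run(pair):
        model_run_id, slice_id = pair
        try:
            created = create_one(model_run_id, slice_id)
            return JobResult(model_run_id, slice_id, evaluation=created)
        except Exception as exc:  # capture per-job failures
            return JobResult(model_run_id, slice_id, error=str(exc))

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as `pairs`.
        return list(pool.map(run, pairs))


def fake_create(model_run_id: str, slice_id: str) -> dict:
    """Hypothetical create call that fails for one made-up slice ID."""
    if slice_id == "slc_bad":
        raise ValueError("slice not found")
    return {"model_run_id": model_run_id, "slice_id": slice_id}


results = run_batch(["run_a", "run_b"], ["slc_ok", "slc_bad"], fake_create)
for r in results:
    print(r.model_run_id, r.slice_id, "ok" if r.error is None else r.error)
```

Returning a result object per job, rather than raising on the first failure, lets the caller inspect which pairs succeeded and retry only the failed ones.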

Changed

EvaluationV2.examples() now treats match_type as optional — omit it to return examples of all match types.
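
One way to make a filter like match_type optional is to leave the key out of the request body when the caller does not supply it. The sketch below assumes that approach; build_examples_body, the limit field, and the "true_positive" value are hypothetical and not taken from the SDK.

```python
from typing import Any, Dict, Optional


def build_examples_body(match_type: Optional[str] = None, limit: int = 50) -> Dict[str, Any]:
    """Build a request body, omitting match_type when it is not given.

    Field names here are illustrative assumptions, not the SDK's payload.
    """
    body: Dict[str, Any] = {"limit": limit}
    if match_type is not None:
        body["match_type"] = match_type
    return body


all_types = build_examples_body()                # no match_type key at all
one_type = build_examples_body("true_positive")  # hypothetical match type value
```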

Fixed

EvaluationV2.charts() issues a POST (matching the backend route) instead of a GET with a query string, which did not reach the server.
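
The difference between the two request shapes can be shown with the standard library alone. This sketch only builds the requests and never sends them; BASE_URL is a placeholder, the filter payload is a hypothetical shape, and the evaluationsV2/{id}/charts path is the route named in the review logs for this PR.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "https://example.invalid/nucleus"  # placeholder, not a real endpoint


def charts_request_get(evaluation_id: str, filters: dict) -> Request:
    """Old shape: filters encoded into the query string of a GET."""
    query = urlencode(filters, doseq=True)
    url = f"{BASE_URL}/evaluationsV2/{evaluation_id}/charts?{query}"
    return Request(url, method="GET")


def charts_request_post(evaluation_id: str, filters: dict) -> Request:
    """New shape: filters sent as a JSON body of a POST."""
    url = f"{BASE_URL}/evaluationsV2/{evaluation_id}/charts"
    body = json.dumps(filters).encode("utf-8")
    return Request(url, data=body, headers={"Content-Type": "application/json"}, method="POST")


filters = {"slice_ids": ["slc_1", "slc_2"]}  # hypothetical filter payload
old = charts_request_get("eval_123", filters)
new = charts_request_post("eval_123", filters)
print(old.get_method(), old.full_url)
print(new.get_method(), new.full_url, new.data)
```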

Greptile Summary

This PR expands Evaluation V2 support in the Python SDK. The main changes are:

Adds slice scoping, exclusion rules, and prediction-only evaluation creation options.
Adds Evaluation V2 preset CRUD helpers and preset-based evaluation creation.
Adds batch Evaluation V2 creation, cancel, retry, and label schema helpers.
Updates Evaluation V2 charts to use POST and examples to allow all match types.
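
Preset-based creation, per the PR description, seeds the label matches and exclusion rules from the preset while explicit arguments take precedence. A minimal sketch of that precedence rule follows. resolve_config is a hypothetical helper, the preset is modeled as a plain dict, the rule and match values are placeholder strings, and treating None as "not supplied" is an assumption rather than confirmed SDK behavior.

```python
from typing import Any, Dict, List, Optional


def resolve_config(
    preset: Optional[Dict[str, Any]] = None,
    allowed_label_matches: Optional[List[Any]] = None,
    exclusion_rules: Optional[List[Any]] = None,
) -> Dict[str, List[Any]]:
    """Merge a preset with explicit arguments; explicit values win.

    Assumption: None means "not supplied", so the preset value is used.
    """
    preset = preset or {}

    def pick(explicit, key):
        if explicit is not None:
            return list(explicit)
        return list(preset.get(key, []))

    return {
        "allowed_label_matches": pick(allowed_label_matches, "allowed_label_matches"),
        "exclusion_rules": pick(exclusion_rules, "exclusion_rules"),
    }


preset = {
    "name": "nightly",                           # placeholder preset
    "allowed_label_matches": ["match-from-preset"],
    "exclusion_rules": ["rule-from-preset"],
}

seeded = resolve_config(preset)
overridden = resolve_config(preset, exclusion_rules=["explicit-rule"])
```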

Confidence Score: 5/5

The SDK changes are merge-safe based on the reviewed API surface and tests.

The implementation is covered by targeted Evaluation V2 and preset tests, and no blocking correctness issues were identified.

T-Rex Logs

What T-Rex did

Baseline state showed charts using GET query strings and missing newer preset/batch/dataset surfaces.
Head state after the change shows POST-based evaluationsV2/{id}/charts, examples POST bodies, POST cancel/retry routes, evaluationV2Presets GET/POST/PATCH/DELETE routes, dataset/{id}/labelSchema GET-equivalent call, and batch cross-product result capture.
Baseline state lacked the five new imports and nucleus.__all__ omitted them.
Head state now has import_ok: true, all requested names exported in __all__, serialized exclusion rules, succeeded states with with_evaluation: true and with_error: false, and parsed presets for both camelCase and snake_case payloads; the test script used for both runs is saved as an artifact.
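
Parsing presets from both camelCase and snake_case payloads, as the log above reports, can be done with a small key-lookup helper. The sketch below is one possible approach, not the SDK's code: PresetSketch and _pick are hypothetical, and the camelCase key names are assumed to be the direct counterparts of the snake_case field names.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


def _pick(payload: Dict[str, Any], snake: str, camel: str, default: Any = None) -> Any:
    """Return payload[snake] if present, otherwise payload[camel], otherwise default."""
    if snake in payload:
        return payload[snake]
    return payload.get(camel, default)


@dataclass
class PresetSketch:
    """Hypothetical stand-in for a preset resource."""
    id: str
    name: str
    allowed_label_matches: List[Any] = field(default_factory=list)
    exclusion_rules: List[Any] = field(default_factory=list)

    @classmethod
    def from_json(cls, payload: Dict[str, Any]) -> "PresetSketch":
        return cls(
            id=payload["id"],
            name=payload["name"],
            allowed_label_matches=_pick(payload, "allowed_label_matches", "allowedLabelMatches", []),
            exclusion_rules=_pick(payload, "exclusion_rules", "exclusionRules", []),
        )


camel = PresetSketch.from_json(
    {"id": "p1", "name": "nightly", "allowedLabelMatches": ["m"], "exclusionRules": ["r"]}
)
snake = PresetSketch.from_json(
    {"id": "p1", "name": "nightly", "allowed_label_matches": ["m"], "exclusion_rules": ["r"]}
)
```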

Ran code and verified through T-Rex.

Reviews (4): Last reviewed commit: "greptile"

update sdk with new adds

02e6d87

luke-e-schaefer requested a review from edwinpav June 25, 2026 21:41

luke-e-schaefer self-assigned this Jun 25, 2026

luke-e-schaefer requested a review from vinay553 June 25, 2026 21:42

remove verbose comment

3a29f12

greptile-apps Bot reviewed Jun 25, 2026

Comment thread nucleus/evaluation_v2_preset.py Outdated

luke-e-schaefer added 2 commits June 25, 2026 16:55

greptile

81b1381

remove api doc add

3b73b6c

greptile-apps Bot reviewed Jun 25, 2026

Comment thread nucleus/__init__.py

greptile

7cc33e7

luke-e-schaefer wants to merge 5 commits into master from update-nuc-sdk-for-new-eval-stuff-pt1
