Candidate Duplicate Test Cases For Cleanup #8942

Description

@ericspod

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
python3 - <<'EOF'
import ast
from pathlib import Path

def find_duplicates_in_list(elements):
    """Return (first_idx, dup_idx) pairs for elements whose AST dumps match."""
    first_seen = {}  # ast.dump string -> index of first occurrence
    dupes = []
    for i, el in enumerate(elements):
        key = ast.dump(el)
        if key in first_seen:
            dupes.append((first_seen[key], i))
        else:
            first_seen[key] = i
    return dupes

results = []

for path in Path("tests").rglob("*.py"):
    try:
        src = path.read_text(errors="replace")
        tree = ast.parse(src, filename=str(path))
    except SyntaxError:
        continue

    for node in ast.walk(tree):
        if not isinstance(node, ast.Assign):
            continue
        for target in node.targets:
            if not isinstance(target, ast.Name):
                continue
            name = target.id
            if "TEST" not in name.upper() and "CASES" not in name.upper() and "PARAMS" not in name.upper():
                continue
            val = node.value
            if not isinstance(val, ast.List):
                continue
            elems = val.elts
            if len(elems) < 2:
                continue
            dupes = find_duplicates_in_list(elems)
            if dupes:
                results.append((str(path), name, dupes))

for fpath, varname, dupes in sorted(results):
    for (i, j) in dupes:
        print(f"{fpath}  {varname}  indices {i} and {j} are identical")

if not results:
    print("No duplicates found.")
EOF

Length of output: 16409
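
To see what the script above counts as a duplicate, the same comparison can be run on a small in-memory example. This is only a minimal sketch: the sample source and the names `SAMPLE`, `duplicate_index_pairs`, and `scan` are made up for illustration and are not taken from the repository. It shows that elements are matched by their `ast.dump` output, so two elements of one list are reported even when they sit inside a single test case.

```python
import ast

# Hypothetical source; not taken from the repository.
SAMPLE = '''
TEST_CASE_1 = [{"a": 1}, (2, 3), {"a": 1}]
TEST_CASE_2 = [{"a": 1}, (2, 3), {"a": 2}]
'''


def duplicate_index_pairs(elements):
    """Return (first_index, later_index) for elements with identical AST dumps."""
    first_seen = {}
    pairs = []
    for i, el in enumerate(elements):
        key = ast.dump(el)
        if key in first_seen:
            pairs.append((first_seen[key], i))
        else:
            first_seen[key] = i
    return pairs


def scan(source):
    """Map each list-valued variable name to its duplicate index pairs."""
    found = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Assign) and isinstance(node.value, ast.List):
            for target in node.targets:
                if isinstance(target, ast.Name):
                    pairs = duplicate_index_pairs(node.value.elts)
                    if pairs:
                        found[target.id] = pairs
    return found


print(scan(SAMPLE))
```

Only `TEST_CASE_1` is reported here, with indices 0 and 2, because its first and third elements are the same dict literal. Whether such a match is a redundant test case or just a value reused within one case still has to be judged by reading the test.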



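@ericspod Here's the full list of files whose test case lists contain duplicate entries (entries with identical ast.dump output, so identical in structure and values rather than necessarily byte-identical):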

apps/

  • tests/apps/deepgrow/test_deepgrow_dataset.py: TEST_CASE_4, TEST_CASE_5
  • tests/apps/detection/utils/test_detector_utils.py: TEST_CASE_3
  • tests/apps/pathology/transforms/post/test_generate_succinct_contour.py: TEST_CASE_1
  • tests/apps/pathology/transforms/post/test_generate_succinct_contourd.py: TEST_CASE_1

bundle/

  • tests/bundle/test_bundle_ckpt_export.py: TEST_CASE_1
  • tests/bundle/test_bundle_trt_export.py: TEST_CASE_1, TEST_CASE_2

data/

  • tests/data/test_arraydataset.py: TEST_CASE_1/2/3, test_labels
  • tests/data/test_lmdbdataset.py: TEST_CASE_3
  • tests/data/test_nrrd_reader.py: TEST_CASE_1 through TEST_CASE_8 (indices 0 and 2 identical in each)
  • tests/data/test_persistentdataset.py: TEST_CASE_4/5/6/7
  • tests/data/test_pil_reader.py: TEST_CASE_1 through TEST_CASE_6
  • tests/data/utils/test_decollate.py: test_case

handlers/

  • tests/handlers/test_handler_checkpoint_saver.py: TEST_CASE_1 through TEST_CASE_8 (many cross-index duplicates; likely worth a separate look, since the list entries may be intentional combinations of primitives)
  • tests/handlers/test_handler_hausdorff_distance.py: TEST_SAMPLE_2, TEST_SAMPLE_4
  • tests/handlers/test_handler_surface_distance.py: TEST_SAMPLE_2, TEST_SAMPLE_4

inferers/

  • tests/inferers/test_patch_inferer.py: 9 different TEST_CASE_* variables (indices 0 and 3 identical in each)

metrics/

  • tests/metrics/test_calibration_metric.py: TEST_VALUE_1B1C

networks/

  • tests/networks/blocks/test_mlp.py, test_se_blocks.py, test_selfattention.py: test_case
  • tests/networks/layers/test_savitzky_golay_filter.py: TEST_CASE_1D_REP, TEST_CASE_SINGLE_VALUE_REP
  • tests/networks/nets/test_ahnet.py: 5 cases
  • tests/networks/nets/test_autoencoder.py: TEST_CASE_0/1/2
  • tests/networks/nets/test_autoencoderkl.py: CASES_ATTENTION
  • tests/networks/nets/test_cell_sam_wrapper.py, test_fullyconnectednet.py, test_spade_autoencoderkl.py, test_varautoencoder.py, test_vnet.py, test_voxelmorph.py
  • tests/networks/test_save_state.py: TEST_CASE_6, TEST_CASE_10

transforms/

  • tests/transforms/test_border_pad.py, test_border_padd.py
  • tests/transforms/test_center_spatial_crop.py
  • tests/transforms/test_cucim_dict_transform.py, test_cucim_transform.py (and rand variants)
  • tests/transforms/test_fill_holes.py, test_fill_holesd.py: 8 cases each
  • tests/transforms/test_load_image.py: 5 cases
  • tests/transforms/test_rand_torchio.py, test_rand_torchiod.py, test_torchio.py, test_torchiod.py
  • tests/transforms/test_savitzky_golay_smooth.py, test_savitzky_golay_smoothd.py
  • tests/transforms/test_spatial_padd.py

The test_handler_checkpoint_saver.py cases are likely combining shared primitive values intentionally (not pure parametrize duplicates), so those are worth verifying manually. The rest look like straightforward cleanup candidates similar to this PR.
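
For the manual verification mentioned above, one rough way to triage a flagged list is to check whether it looks like a collection of cases or a single case. The sketch below is a heuristic and an assumption, not part of the original script: it treats a list whose elements are all lists or tuples as a collection of cases (so a repeated element is probably a redundant case), and anything else as a single case that reuses a value. The names `classify_duplicate` and `parse_list` are hypothetical.

```python
import ast


def classify_duplicate(list_node):
    """Heuristic label for a flagged ast.List node.

    'duplicate_case' -> every element is itself a list/tuple, so the variable
                        is probably a collection of cases with a repeated case.
    'repeated_value' -> mixed elements, so the variable is probably one case
                        that reuses a value (e.g. same input and expected).
    """
    if all(isinstance(el, (ast.List, ast.Tuple)) for el in list_node.elts):
        return "duplicate_case"
    return "repeated_value"


def parse_list(src):
    """Parse a list literal given as source text and return its AST node."""
    return ast.parse(src, mode="eval").body


print(classify_duplicate(parse_list("[[1, 2], [1, 2]]")))
print(classify_duplicate(parse_list("[{'a': 1}, 5, {'a': 1}]")))
```

A 'repeated_value' label does not mean the entry is fine, only that the duplication is inside one case and needs a human look before anything is removed.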

Originally posted by @coderabbitai[bot] in #8896 (comment)
