LCORE-1613: RAG tool calls loop indefinitely by arin-deloatch · Pull Request #1998 · lightspeed-core/lightspeed-stack

arin-deloatch · 2026-06-25T18:13:48Z

Description

When using small models with limited context windows (e.g., Llama 3.1 8B on vLLM), RAG tool calls can loop indefinitely. The model repeatedly searches for a specific piece of text, each iteration appending tokens to the context until the model's context window is exceeded, resulting in a 400 error:

Error code: 400 - This model's maximum context ength is 113920 tokens. However, your request has 116873 input tokens.

The max_infer_iters and max_tool_calls fields already existed as optional per-request parameters on the Responses API, but both defaulted to None (no limit). There was no server-side mechanism for operators to set safety defaults, so unless a client explicitly set these values, nothing prevented the runaway loop.

Changes

Added max_infer_iters (default 10) and max_tool_calls (default 30) as configurable fields on InferenceConfiguration in src/models/config.py. Operators can override these in their YAML config or set them to None to disable the limit. Per-request values from clients always take precedence.

Applied config defaults in prepare_responses_params() for the /v1/query, /v1/streaming_query, and A2A endpoints.

Applied config defaults in responses_endpoint_handler() for the /v1/responses endpoint, only when the client omits the values.

Added 10 unit tests for the new config fields (default values, positive int acceptance, zero/negative rejection, None acceptance).

Configuration example

  inference:

    default_model: "meta-llama/Llama-3.1-8B-Instruct"

    default_provider: "vllm"

    max_infer_iters: 10

    max_tool_calls: 30

Type of change

Tools used to create PR

Identify any AI code assistants used in this PR (for transparency and review context)

Assisted-by: Claude Code (Claude Opus 4.6)

Related Tickets & Documents

Closes LCORE-1613

Checklist before requesting a review

I have performed a self-review of my code.
PR has passed all pre-merge test jobs.
If it is a core feature, I have added thorough tests.

Testing

Run config tests: uv run pytest tests/unit/models/config/test_inference_configuration.py -v
Run responses endpoint tests: uv run pytest tests/unit/app/endpoints/test_responses.py -v
Run responses utils tests: uv run pytest tests/unit/utils/test_responses.py -v
All 286 tests pass across the 3 affected test files.
uv run make format and uv run make verify pass (only pre-existing mypy errors in test_container_lifecycle.py).

Test Regressions

8 existing tests in tests/unit/utils/test_responses.py set mock_config.inference = None, which caused AttributeError when the new code accessed configuration.inference.max_infer_iters. Updated these mocks to use a real InferenceConfiguration() instance.

Summary by CodeRabbit

New Features
- Added configurable server-side limits for inference iterations and tool calls, including defaults (10 and 30) and the ability to disable each by setting to null.
Bug Fixes
- Requests now automatically apply the server-side default limits when those values aren’t explicitly provided, improving consistent behavior during request processing.
Tests
- Expanded unit tests to validate defaults, positive integer constraints (reject zero/negative), and null disabling for the new configuration fields.

coderabbitai · 2026-06-25T18:14:03Z

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: ad61c2fd-da64-4c23-a4e4-859bcc4765ef

📥 Commits

Reviewing files that changed from the base of the PR and between c16c5d3 and 59edfd1.

📒 Files selected for processing (7)

docs/openapi.json
src/app/endpoints/responses.py
src/models/config.py
src/utils/responses.py
tests/unit/models/config/test_dump_configuration.py
tests/unit/models/config/test_inference_configuration.py
tests/unit/utils/test_responses.py

📜 Recent review details

⏰ Context from checks skipped due to timeout. (7)

GitHub Check: E2E: library mode / ci / group 1
GitHub Check: E2E: server mode / ci / group 1
GitHub Check: E2E: server mode / ci / group 2
GitHub Check: E2E: library mode / ci / group 2
GitHub Check: E2E: library mode / ci / group 3
GitHub Check: E2E: server mode / ci / group 3
GitHub Check: E2E Tests for Lightspeed Evaluation job

🧰 Additional context used

📓 Path-based instructions (4)

src/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

src/**/*.py: Use absolute imports for internal modules: from authentication import get_auth_dependency
Llama Stack imports: Use from llama_stack_client import AsyncLlamaStackClient
Check constants.py for shared constants before defining new ones
All modules must start with descriptive docstrings explaining purpose
Use logger = get_logger(__name__) from log.py for module logging
All functions must have complete type annotations for parameters and return types, use modern syntax (str | int), and include descriptive docstrings
Use snake_case with descriptive, action-oriented names for functions (get_, validate_, check_)
Avoid in-place parameter modification anti-patterns; return new data structures instead of modifying function parameters
Use async def for I/O operations and external API calls
Use standard log levels with clear purposes: debug() for diagnostic info, info() for program execution, warning() for unexpected events, error() for serious problems
All classes must have descriptive docstrings explaining purpose and use PascalCase with standard suffixes: Configuration, Error/Exception, Resolver, Interface
Abstract classes must use ABC with @abstractmethod decorators
Follow Google Python docstring conventions with required sections: Parameters, Returns, Raises, and Attributes for classes

Files:

src/utils/responses.py
src/app/endpoints/responses.py
src/models/config.py

src/app/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

src/app/**/*.py: FastAPI dependencies: Import from fastapi module for APIRouter, HTTPException, Request, status, Depends
Use FastAPI HTTPException with appropriate status codes for API endpoints and handle APIConnectionError from Llama Stack

Files:

src/app/endpoints/responses.py

src/models/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Pydantic models must use @model_validator and @field_validator for validation and complete type annotations for all attributes, avoiding Any type

Files:

src/models/config.py

tests/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

tests/**/*.py: Use pytest for all unit and integration tests; do not use unittest
Use pytest.mark.asyncio marker for async tests

Files:

tests/unit/models/config/test_inference_configuration.py
tests/unit/utils/test_responses.py
tests/unit/models/config/test_dump_configuration.py

🧠 Learnings (5)

📚 Learning: 2026-02-23T14:56:59.186Z

Learnt from: asimurka
Repo: lightspeed-core/lightspeed-stack PR: 1198
File: src/utils/responses.py:184-192
Timestamp: 2026-02-23T14:56:59.186Z
Learning: In the lightspeed-stack codebase (lightspeed-core/lightspeed-stack), do not enforce de-duplication of duplicate client.models.list() calls in model selection flows (e.g., in src/utils/responses.py prepare_responses_params). These calls are considered relatively cheap and removing duplicates could add unnecessary complexity to the flow. Apply this guideline specifically to this file/context unless similar performance characteristics and design decisions are documented elsewhere.

Applied to files:

src/utils/responses.py

📚 Learning: 2026-06-24T13:45:37.249Z

Learnt from: Jdubrick
Repo: lightspeed-core/lightspeed-stack PR: 1971
File: src/utils/markdown_repair.py:31-36
Timestamp: 2026-06-24T13:45:37.249Z
Learning: In the lightspeed-stack repository, docstrings must use the section header name "Parameters:" (not "Args:") for function arguments, even if the project references Google Python docstring conventions. Ensure docstrings follow the project’s established "Parameters:" header format for any documented function parameters.

Applied to files:

src/utils/responses.py
src/app/endpoints/responses.py
src/models/config.py
tests/unit/models/config/test_inference_configuration.py
tests/unit/utils/test_responses.py
tests/unit/models/config/test_dump_configuration.py

📚 Learning: 2026-04-06T20:18:07.852Z

Learnt from: major
Repo: lightspeed-core/lightspeed-stack PR: 1463
File: src/app/endpoints/rlsapi_v1.py:266-271
Timestamp: 2026-04-06T20:18:07.852Z
Learning: In the lightspeed-stack codebase, within `src/app/endpoints/` inference/MCP endpoints, treat `tools: Optional[list[Any]]` in MCP tool definitions as an intentional, consistent typing pattern (used across `query`, `responses`, `streaming_query`, `rlsapi_v1`). Do not raise or suggest this as a typing issue during code review; changing it in isolation could break endpoint typing consistency across the codebase.

Applied to files:

src/app/endpoints/responses.py

📚 Learning: 2026-01-12T10:58:40.230Z

Learnt from: blublinsky
Repo: lightspeed-core/lightspeed-stack PR: 972
File: src/models/config.py:459-513
Timestamp: 2026-01-12T10:58:40.230Z
Learning: In lightspeed-core/lightspeed-stack, for Python files under src/models, when a user claims a fix is done but the issue persists, verify the current code state before accepting the fix. Steps: review the diff, fetch the latest changes, run relevant tests, reproduce the issue, search the codebase for lingering references to the original problem, confirm the fix is applied and not undone by subsequent commits, and validate with local checks to ensure the issue is resolved.

Applied to files:

src/models/config.py

📚 Learning: 2026-02-25T07:46:33.545Z

Learnt from: asimurka
Repo: lightspeed-core/lightspeed-stack PR: 1211
File: src/models/responses.py:8-16
Timestamp: 2026-02-25T07:46:33.545Z
Learning: In the Python codebase, requests.py should use OpenAIResponseInputTool as Tool while responses.py uses OpenAIResponseTool as Tool. This difference is intentional due to differing schemas for input vs output tools in llama-stack-api. Apply this distinction consistently to other models under src/models (e.g., ensure request-related tools use the InputTool variant and response-related tools use the ResponseTool variant). If adding new tools, choose the corresponding InputTool or Tool class based on whether the tool represents input or output, and document the rationale in code comments.

Applied to files:

src/models/config.py

🛑 Comments failed to post (1)

src/models/config.py (1)
2335-2337: 🗄️ Data Integrity & Integration | 🟠 Major | ⚡ Quick win

Reject blank model_id values at config load.

str accepts "", so a malformed guardrail config survives parsing and fails later when the model is used. Add non-empty validation here so invalid config is rejected early.
Proposed fix
-    model_id: str = Field(
-        ..., title="Model id", description="The model_id to use for the guard"
-    )
+    model_id: str = Field(
+        ...,
+        min_length=1,
+        title="Model id",
+        description="The model_id to use for the guard",
+    )
As per coding guidelines, src/models/**/*.py: "Pydantic models must use @model_validator and @field_validator for validation and complete type annotations for all attributes, avoiding Any type`."
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
    model_id: str = Field(
        ...,
        min_length=1,
        title="Model id",
        description="The model_id to use for the guard",
    )
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/models/config.py` around lines 2335 - 2337, Reject blank model_id values
in the config model so malformed guardrail config is caught during load. Update
the Guard/Config Pydantic model that defines model_id to add a non-empty
validation check using a field_validator (or model_validator if that’s the
existing pattern), ensuring empty or whitespace-only strings are rejected. Keep
the validation close to the model_id field so invalid config cannot pass parsing
and only valid IDs reach later model usage.
Source: Coding guidelines

🔇 Additional comments (6)

src/models/config.py (1)

681-682: LGTM!

Also applies to: 1732-1750

docs/openapi.json (1)

13618-13644: LGTM!

tests/unit/models/config/test_inference_configuration.py (1)

95-160: LGTM!

tests/unit/models/config/test_dump_configuration.py (1)

166-167: LGTM!

Also applies to: 380-381, 738-739, 995-996, 1232-1233, 1464-1465, 1841-1842, 2064-2065, 2287-2288, 2517-2518

src/utils/responses.py (1)

432-433: LGTM!

src/app/endpoints/responses.py (1)

467-470: LGTM!

Walkthrough

The PR adds optional inference limit fields to InferenceConfiguration, threads them into Responses request parameters, and applies server-side defaults in the responses endpoint when requests omit them.

Changes

Responses inference limits

Layer / File(s)	Summary
InferenceConfiguration limits `src/models/config.py`, `docs/openapi.json`, `tests/unit/models/config/test_inference_configuration.py`, `tests/unit/models/config/test_dump_configuration.py`	`InferenceConfiguration` gains optional `max_infer_iters` and `max_tool_calls` fields, the OpenAPI schema documents both defaults, and unit tests cover construction and serialized config output.
Responses param defaults `src/utils/responses.py`, `tests/unit/utils/test_responses.py`	`prepare_responses_params` adds `max_infer_iters` and `max_tool_calls` to `ResponsesApiParams`, and tests initialize `mock_config.inference` for the updated request-parameter paths.
Endpoint request defaults `src/app/endpoints/responses.py`	The responses endpoint fills missing `max_infer_iters` and `max_tool_calls` from server configuration before building `api_params`.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Suggested reviewers

max-svistunov
tisnik

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title clearly references the main bug being fixed: infinite RAG tool-call loops.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

✨ Simplify code

Create PR with simplified code

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands.}

tisnik

LGTM

asimurka · 2026-06-26T11:37:09Z

+        api_params.max_infer_iters = configuration.inference.max_infer_iters
+    if "max_tool_calls" not in original_request.model_fields_set:
+        api_params.max_tool_calls = configuration.inference.max_tool_calls
+


It's just a nit, but could you please follow the same pattern as other parameter overrides do. Namely, override attributes of updated_request (if not explicitly set) above the model_validate command.

asimurka · 2026-06-29T08:10:10Z

/ok-to-test

asimurka · 2026-06-29T10:59:29Z

@arin-deloatch please fix unit tests, config asserts should contain default values of new config options you created

…uration

coderabbitai

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

tests/unit/utils/test_responses.py (1)

1983-2029: 📐 Maintainability & Code Quality | 🟠 Major | ⚡ Quick win

Assert the propagated inference defaults in a happy-path test.

These tests now instantiate InferenceConfiguration(), but they still don't verify that prepare_responses_params() actually forwards max_infer_iters and max_tool_calls. A regression dropping either field would still pass here.

Suggested assertion update

         result = await prepare_responses_params(
             mock_client, query_request, None, "token"
         )
         assert result.input == "test"
         assert result.model == "provider1/model1"
         assert result.conversation == "llama_conv1"
+        assert result.max_infer_iters == 10
+        assert result.max_tool_calls == 30

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/unit/utils/test_responses.py` around lines 1983 - 2029, The happy-path
tests for prepare_responses_params currently set configuration.inference but do
not assert that its defaults are propagated, so add assertions in the existing
prepare_responses_params test cases to verify the returned params include the
inference values for max_infer_iters and max_tool_calls. Use the
prepare_responses_params symbol and the InferenceConfiguration setup in these
tests to confirm both fields are forwarded in the result.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/models/config.py`:
- Around line 2335-2337: Reject blank model_id values in the config model so
malformed guardrail config is caught during load. Update the Guard/Config
Pydantic model that defines model_id to add a non-empty validation check using a
field_validator (or model_validator if that’s the existing pattern), ensuring
empty or whitespace-only strings are rejected. Keep the validation close to the
model_id field so invalid config cannot pass parsing and only valid IDs reach
later model usage.

---

Outside diff comments:
In `@tests/unit/utils/test_responses.py`:
- Around line 1983-2029: The happy-path tests for prepare_responses_params
currently set configuration.inference but do not assert that its defaults are
propagated, so add assertions in the existing prepare_responses_params test
cases to verify the returned params include the inference values for
max_infer_iters and max_tool_calls. Use the prepare_responses_params symbol and
the InferenceConfiguration setup in these tests to confirm both fields are
forwarded in the result.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: ad61c2fd-da64-4c23-a4e4-859bcc4765ef

📥 Commits

Reviewing files that changed from the base of the PR and between c16c5d3 and 59edfd1.

📒 Files selected for processing (7)

docs/openapi.json
src/app/endpoints/responses.py
src/models/config.py
src/utils/responses.py
tests/unit/models/config/test_dump_configuration.py
tests/unit/models/config/test_inference_configuration.py
tests/unit/utils/test_responses.py

📜 Review details

⏰ Context from checks skipped due to timeout. (7)

GitHub Check: E2E: library mode / ci / group 1
GitHub Check: E2E: server mode / ci / group 1
GitHub Check: E2E: server mode / ci / group 2
GitHub Check: E2E: library mode / ci / group 2
GitHub Check: E2E: library mode / ci / group 3
GitHub Check: E2E: server mode / ci / group 3
GitHub Check: E2E Tests for Lightspeed Evaluation job

🧰 Additional context used

📓 Path-based instructions (4)

src/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

src/**/*.py: Use absolute imports for internal modules: from authentication import get_auth_dependency
Llama Stack imports: Use from llama_stack_client import AsyncLlamaStackClient
Check constants.py for shared constants before defining new ones
All modules must start with descriptive docstrings explaining purpose
Use logger = get_logger(__name__) from log.py for module logging
All functions must have complete type annotations for parameters and return types, use modern syntax (str | int), and include descriptive docstrings
Use snake_case with descriptive, action-oriented names for functions (get_, validate_, check_)
Avoid in-place parameter modification anti-patterns; return new data structures instead of modifying function parameters
Use async def for I/O operations and external API calls
Use standard log levels with clear purposes: debug() for diagnostic info, info() for program execution, warning() for unexpected events, error() for serious problems
All classes must have descriptive docstrings explaining purpose and use PascalCase with standard suffixes: Configuration, Error/Exception, Resolver, Interface
Abstract classes must use ABC with @abstractmethod decorators
Follow Google Python docstring conventions with required sections: Parameters, Returns, Raises, and Attributes for classes

Files:

src/utils/responses.py
src/app/endpoints/responses.py
src/models/config.py

src/app/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

src/app/**/*.py: FastAPI dependencies: Import from fastapi module for APIRouter, HTTPException, Request, status, Depends
Use FastAPI HTTPException with appropriate status codes for API endpoints and handle APIConnectionError from Llama Stack

Files:

src/app/endpoints/responses.py

src/models/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Pydantic models must use @model_validator and @field_validator for validation and complete type annotations for all attributes, avoiding Any type

Files:

src/models/config.py

tests/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

tests/**/*.py: Use pytest for all unit and integration tests; do not use unittest
Use pytest.mark.asyncio marker for async tests

Files:

tests/unit/models/config/test_inference_configuration.py
tests/unit/utils/test_responses.py
tests/unit/models/config/test_dump_configuration.py

🧠 Learnings (5)

📚 Learning: 2026-02-23T14:56:59.186Z

Learnt from: asimurka
Repo: lightspeed-core/lightspeed-stack PR: 1198
File: src/utils/responses.py:184-192
Timestamp: 2026-02-23T14:56:59.186Z
Learning: In the lightspeed-stack codebase (lightspeed-core/lightspeed-stack), do not enforce de-duplication of duplicate client.models.list() calls in model selection flows (e.g., in src/utils/responses.py prepare_responses_params). These calls are considered relatively cheap and removing duplicates could add unnecessary complexity to the flow. Apply this guideline specifically to this file/context unless similar performance characteristics and design decisions are documented elsewhere.

Applied to files:

src/utils/responses.py

📚 Learning: 2026-06-24T13:45:37.249Z

Learnt from: Jdubrick
Repo: lightspeed-core/lightspeed-stack PR: 1971
File: src/utils/markdown_repair.py:31-36
Timestamp: 2026-06-24T13:45:37.249Z
Learning: In the lightspeed-stack repository, docstrings must use the section header name "Parameters:" (not "Args:") for function arguments, even if the project references Google Python docstring conventions. Ensure docstrings follow the project’s established "Parameters:" header format for any documented function parameters.

Applied to files:

src/utils/responses.py
src/app/endpoints/responses.py
src/models/config.py
tests/unit/models/config/test_inference_configuration.py
tests/unit/utils/test_responses.py
tests/unit/models/config/test_dump_configuration.py

📚 Learning: 2026-04-06T20:18:07.852Z

Learnt from: major
Repo: lightspeed-core/lightspeed-stack PR: 1463
File: src/app/endpoints/rlsapi_v1.py:266-271
Timestamp: 2026-04-06T20:18:07.852Z
Learning: In the lightspeed-stack codebase, within `src/app/endpoints/` inference/MCP endpoints, treat `tools: Optional[list[Any]]` in MCP tool definitions as an intentional, consistent typing pattern (used across `query`, `responses`, `streaming_query`, `rlsapi_v1`). Do not raise or suggest this as a typing issue during code review; changing it in isolation could break endpoint typing consistency across the codebase.

Applied to files:

src/app/endpoints/responses.py

📚 Learning: 2026-01-12T10:58:40.230Z

Learnt from: blublinsky
Repo: lightspeed-core/lightspeed-stack PR: 972
File: src/models/config.py:459-513
Timestamp: 2026-01-12T10:58:40.230Z
Learning: In lightspeed-core/lightspeed-stack, for Python files under src/models, when a user claims a fix is done but the issue persists, verify the current code state before accepting the fix. Steps: review the diff, fetch the latest changes, run relevant tests, reproduce the issue, search the codebase for lingering references to the original problem, confirm the fix is applied and not undone by subsequent commits, and validate with local checks to ensure the issue is resolved.

Applied to files:

src/models/config.py

📚 Learning: 2026-02-25T07:46:33.545Z

Learnt from: asimurka
Repo: lightspeed-core/lightspeed-stack PR: 1211
File: src/models/responses.py:8-16
Timestamp: 2026-02-25T07:46:33.545Z
Learning: In the Python codebase, requests.py should use OpenAIResponseInputTool as Tool while responses.py uses OpenAIResponseTool as Tool. This difference is intentional due to differing schemas for input vs output tools in llama-stack-api. Apply this distinction consistently to other models under src/models (e.g., ensure request-related tools use the InputTool variant and response-related tools use the ResponseTool variant). If adding new tools, choose the corresponding InputTool or Tool class based on whether the tool represents input or output, and document the rationale in code comments.

Applied to files:

src/models/config.py

🔇 Additional comments (6)

src/models/config.py (1)

681-682: LGTM!

Also applies to: 1732-1750

docs/openapi.json (1)

13618-13644: LGTM!

tests/unit/models/config/test_inference_configuration.py (1)

95-160: LGTM!

tests/unit/models/config/test_dump_configuration.py (1)

166-167: LGTM!

Also applies to: 380-381, 738-739, 995-996, 1232-1233, 1464-1465, 1841-1842, 2064-2065, 2287-2288, 2517-2518

src/utils/responses.py (1)

432-433: LGTM!

src/app/endpoints/responses.py (1)

467-470: LGTM!

coderabbitai

Caution

Inline review comments failed to post. This is likely due to GitHub's internal server error or limits when posting large numbers of comments. If you are seeing this consistently it is likely a permissions issue. Please check "Moderation" -> "Code review limits" under your organization settings.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

tests/unit/utils/test_responses.py (1)

1983-2029: 📐 Maintainability & Code Quality | 🟠 Major | ⚡ Quick win

Assert the propagated inference defaults in a happy-path test.

These tests now instantiate InferenceConfiguration(), but they still don't verify that prepare_responses_params() actually forwards max_infer_iters and max_tool_calls. A regression dropping either field would still pass here.

Suggested assertion update

         result = await prepare_responses_params(
             mock_client, query_request, None, "token"
         )
         assert result.input == "test"
         assert result.model == "provider1/model1"
         assert result.conversation == "llama_conv1"
+        assert result.max_infer_iters == 10
+        assert result.max_tool_calls == 30

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/unit/utils/test_responses.py` around lines 1983 - 2029, The happy-path
tests for prepare_responses_params currently set configuration.inference but do
not assert that its defaults are propagated, so add assertions in the existing
prepare_responses_params test cases to verify the returned params include the
inference values for max_infer_iters and max_tool_calls. Use the
prepare_responses_params symbol and the InferenceConfiguration setup in these
tests to confirm both fields are forwarded in the result.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/models/config.py`:
- Around line 2335-2337: Reject blank model_id values in the config model so
malformed guardrail config is caught during load. Update the Guard/Config
Pydantic model that defines model_id to add a non-empty validation check using a
field_validator (or model_validator if that’s the existing pattern), ensuring
empty or whitespace-only strings are rejected. Keep the validation close to the
model_id field so invalid config cannot pass parsing and only valid IDs reach
later model usage.

---

Outside diff comments:
In `@tests/unit/utils/test_responses.py`:
- Around line 1983-2029: The happy-path tests for prepare_responses_params
currently set configuration.inference but do not assert that its defaults are
propagated, so add assertions in the existing prepare_responses_params test
cases to verify the returned params include the inference values for
max_infer_iters and max_tool_calls. Use the prepare_responses_params symbol and
the InferenceConfiguration setup in these tests to confirm both fields are
forwarded in the result.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: ad61c2fd-da64-4c23-a4e4-859bcc4765ef

📥 Commits

Reviewing files that changed from the base of the PR and between c16c5d3 and 59edfd1.

📒 Files selected for processing (7)

docs/openapi.json
src/app/endpoints/responses.py
src/models/config.py
src/utils/responses.py
tests/unit/models/config/test_dump_configuration.py
tests/unit/models/config/test_inference_configuration.py
tests/unit/utils/test_responses.py

📜 Review details

🔇 Additional comments (6)

src/models/config.py (1)

681-682: LGTM!

Also applies to: 1732-1750

docs/openapi.json (1)

13618-13644: LGTM!

tests/unit/models/config/test_inference_configuration.py (1)

95-160: LGTM!

tests/unit/models/config/test_dump_configuration.py (1)

166-167: LGTM!

Also applies to: 380-381, 738-739, 995-996, 1232-1233, 1464-1465, 1841-1842, 2064-2065, 2287-2288, 2517-2518

src/utils/responses.py (1)

432-433: LGTM!

src/app/endpoints/responses.py (1)

467-470: LGTM!

🛑 Comments failed to post (1)

src/models/config.py (1)
2335-2337: 🗄️ Data Integrity & Integration | 🟠 Major | ⚡ Quick win

Reject blank model_id values at config load.

str accepts "", so a malformed guardrail config survives parsing and fails later when the model is used. Add non-empty validation here so invalid config is rejected early.
Proposed fix
-    model_id: str = Field(
-        ..., title="Model id", description="The model_id to use for the guard"
-    )
+    model_id: str = Field(
+        ...,
+        min_length=1,
+        title="Model id",
+        description="The model_id to use for the guard",
+    )
As per coding guidelines, src/models/**/*.py: "Pydantic models must use @model_validator and @field_validator for validation and complete type annotations for all attributes, avoiding Any type`."
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
    model_id: str = Field(
        ...,
        min_length=1,
        title="Model id",
        description="The model_id to use for the guard",
    )
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/models/config.py` around lines 2335 - 2337, Reject blank model_id values
in the config model so malformed guardrail config is caught during load. Update
the Guard/Config Pydantic model that defines model_id to add a non-empty
validation check using a field_validator (or model_validator if that’s the
existing pattern), ensuring empty or whitespace-only strings are rejected. Keep
the validation close to the model_id field so invalid config cannot pass parsing
and only valid IDs reach later model usage.
Source: Coding guidelines

asimurka · 2026-06-29T16:59:51Z

/ok-to-test

tisnik requested a review from asimurka June 26, 2026 11:29

tisnik approved these changes Jun 26, 2026

View reviewed changes

asimurka requested changes Jun 26, 2026

View reviewed changes

arin-deloatch force-pushed the bug/LCORE-1613 branch from c16c5d3 to de9d863 Compare June 26, 2026 17:55

openshift-ci Bot added the ok-to-test label Jun 29, 2026

asimurka approved these changes Jun 29, 2026

View reviewed changes

arin-deloatch added 5 commits June 29, 2026 08:43

LCORE-1613: Add max_infer_iters and max_tool_calls to InferenceConfig…

418b340

…uration

LCORE-1613: Apply inference defaults in query and responses endpoints

cfdcdc3

LCORE-1613: Fix test mocks broken by inference defaults

d08ee9a

LCORE-1613: generate OpenAPI spec; address code pattern nit

675fa72

LCORE-1613: fix failing tests caused by new inference configuration

59edfd1

arin-deloatch force-pushed the bug/LCORE-1613 branch from de9d863 to 59edfd1 Compare June 29, 2026 16:41

coderabbitai Bot reviewed Jun 29, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

LCORE-1613: RAG tool calls loop indefinitely#1998

LCORE-1613: RAG tool calls loop indefinitely#1998
arin-deloatch wants to merge 5 commits into
lightspeed-core:mainfrom
arin-deloatch:bug/LCORE-1613

arin-deloatch commented Jun 25, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented Jun 25, 2026 •

edited

Loading

Uh oh!

tisnik left a comment

Uh oh!

asimurka Jun 26, 2026

Uh oh!

asimurka commented Jun 29, 2026

Uh oh!

asimurka commented Jun 29, 2026

Uh oh!

coderabbitai Bot left a comment

Uh oh!

coderabbitai Bot left a comment

Uh oh!

asimurka commented Jun 29, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Uh oh!

Conversation

arin-deloatch commented Jun 25, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Changes

Configuration example

Type of change

Tools used to create PR

Related Tickets & Documents

Checklist before requesting a review

Testing

Test Regressions

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Jun 25, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Suggested reviewers

Uh oh!

tisnik left a comment

Choose a reason for hiding this comment

Uh oh!

asimurka Jun 26, 2026

Choose a reason for hiding this comment

Uh oh!

asimurka commented Jun 29, 2026

Uh oh!

asimurka commented Jun 29, 2026

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

asimurka commented Jun 29, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

arin-deloatch commented Jun 25, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Jun 25, 2026 •

edited

Loading