LCORE-2080: Added E2E Steps for Agent Skills by jrobertboos · Pull Request #1941 · lightspeed-core/lightspeed-stack

jrobertboos · 2026-06-17T15:07:30Z

Description

Added the missing E2E steps for testing agent skills.

Type of change

Tools used to create PR

Identify any AI code assistants used in this PR (for transparency and review context)

Assisted-by: Cursor (Composer 2.5)
Generated by: Cursor (Composer 2.5)

Related Tickets & Documents

Related Issue LCORE-2080
Closes LCORE-2080

Checklist before requesting a review

I have performed a self-review of my code.
PR has passed all pre-merge test jobs.
If it is a core feature, I have added thorough tests.

Summary by CodeRabbit

New Features
- Added skills asset support to end-to-end stack setups, with new sample skills for echoing text and summarizing content.
- Improved end-to-end visibility for skills by capturing and validating tool call/tool result behavior in both streaming and non-streaming responses.
Tests
- Added new end-to-end configurations for skills in both server and library modes.
- Updated skills test coverage to match the updated tool_calls/tool_results response schema, including skill loading, resource reading, multi-skill discovery, and refreshed expectations.

coderabbitai · 2026-06-17T15:07:47Z

Review details

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 2e5e4291-5088-4470-aaf7-3576a9c91614

📥 Commits

Reviewing files that changed from the base of the PR and between e2990a2 and 9af8d80.

📒 Files selected for processing (14)

docker-compose-library.yaml
docker-compose.yaml
tests/e2e/configuration/library-mode/lightspeed-stack-skills-directory.yaml
tests/e2e/configuration/library-mode/lightspeed-stack-skills.yaml
tests/e2e/configuration/server-mode/lightspeed-stack-skills-directory.yaml
tests/e2e/configuration/server-mode/lightspeed-stack-skills.yaml
tests/e2e/features/skills.feature
tests/e2e/features/steps/common_http.py
tests/e2e/features/steps/llm_query_response.py
tests/e2e/skills/echo/SKILL.md
tests/e2e/skills/echo/references/guide.md
tests/e2e/skills/summarize/SKILL.md
tests/e2e/skills/summarize/references/guide.md
tests/e2e/test_list.txt

Walkthrough

Adds e2e skill fixtures, compose mounts, and Lightspeed stack configs for server and library modes. Updates streaming response helpers to capture tool calls and results, and expands the skills feature scenarios to use the new load/read skill flows.

Changes

Skills e2e wiring

Layer / File(s)	Summary
Compose mounts and skill fixtures `docker-compose-library.yaml`, `docker-compose.yaml`, `tests/e2e/skills/echo/`, `tests/e2e/skills/summarize/`	Compose mounts expose the skills test directory, and new echo and summarize skill documents and guides are added.
Lightspeed stack configs `tests/e2e/configuration/library-mode/lightspeed-stack-skills.yaml`, `tests/e2e/configuration/server-mode/lightspeed-stack-skills.yaml`	New server-mode and library-mode LCS configs set binding, logging, authentication, llama-stack client wiring, data storage, and skills paths.
Streaming response helpers `tests/e2e/features/steps/common_http.py`, `tests/e2e/features/steps/llm_query_response.py`	A response-field assertion step is added, and streamed SSE parsing now accumulates and exposes tool calls and tool results.
Skills scenarios `tests/e2e/features/skills.feature`, `tests/e2e/test_list.txt`	The skills feature updates tool-name and tool-call assertions across registration, load, read-resource, multi-skill, and progressive disclosure scenarios, and adds the feature to the e2e list.
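
The streamed-response change summarized above (accumulating tool calls and tool results while parsing SSE) can be sketched roughly as follows. This is a minimal illustration, not the code from `llm_query_response.py`: the function name, the `data: <json>` line layout, and the `tool_call` / `tool_result` event names are all assumptions made for the example.

```python
import json
from typing import Any


def collect_tool_events(sse_text: str) -> dict[str, list[dict[str, Any]]]:
    """Accumulate tool calls and tool results from a raw SSE body.

    Assumption: every event is a ``data: <json>`` line whose JSON object has
    an ``event`` name and a ``data`` payload. The event names are hypothetical.
    """
    collected: dict[str, list[dict[str, Any]]] = {
        "tool_calls": [],
        "tool_results": [],
    }
    for line in sse_text.splitlines():
        if not line.startswith("data:"):
            continue  # skip SSE comments, blank separators, and other fields
        try:
            event = json.loads(line[len("data:"):].strip())
        except json.JSONDecodeError:
            continue  # tolerate payloads that are not JSON
        if not isinstance(event, dict):
            continue
        if event.get("event") == "tool_call":
            collected["tool_calls"].append(event.get("data", {}))
        elif event.get("event") == "tool_result":
            collected["tool_results"].append(event.get("data", {}))
    return collected
```

A step could then store the returned lists on the Behave context so later assertions can inspect them, in the same way the non-streaming response body is inspected.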

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

lightspeed-core/lightspeed-stack#1736: Introduces the skills-path configuration model that matches the new skills.paths wiring used here.
lightspeed-core/lightspeed-stack#1742: Updates the same tests/e2e/features/skills.feature flow and tool-name expectations.
lightspeed-core/lightspeed-stack#1870: Changes the streaming query path and response handling that this PR’s SSE parsing step extends.

Suggested reviewers

tisnik
radofuchs
asimurka

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title is concise and accurately reflects the PR’s main theme: adding end-to-end support for agent skills.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

jrobertboos · 2026-06-23T16:40:58Z


-  @SkillsConfig
+  @SkillsConfig @skip
  Scenario: Skill tools are registered when skills are configured


TODO: Need to reflect skill tools (list_skills, load_skill, read_skill_resource) in /tools.

jrobertboos · 2026-06-23T17:06:10Z

      """
      And The token metrics have increased

  # --- Error handling: unknown skill ---


The "Error Paths" scenarios will have to be skipped for now: the skill tools do fail and produce a result, but that result is of a different type, which the response-building code silently discards.

Below is the helpful part of a conversation with Claude about the issue.

Pydantic-ai catches ModelRetry and wraps the error in a RetryPromptPart (not a ToolReturnPart). The FunctionToolResultEvent.part is typed as ToolReturnPart | RetryPromptPart — it can be either.

Where LCS drops it:

In the non-streaming path, build_turn_summary_from_agent_run only processes ToolReturnPart:

query.py, lines 266-269:

    elif isinstance(message, ModelRequest):
        for request_part in message.parts:
            if isinstance(request_part, ToolReturnPart):
                process_function_tool_result(state, request_part)

In the streaming path, the same filter exists:

streaming.py, lines 522-524:

    part = event.part
    if not isinstance(part, ToolReturnPart):
        return None

Both paths explicitly ignore RetryPromptPart, so the retry/error message for load_skill is never surfaced as a tool_result in the API response.

The result:

Both tool calls appear (because both ToolCallPart instances from the ModelResponse are processed)

Only the list_skills result appears (because it succeeded and produced a ToolReturnPart)

The load_skill result is missing (because it raised ModelRetry → became a RetryPromptPart → silently dropped)

jrobertboos · 2026-06-23T17:08:49Z

      ]
      """

  # --- Full progressive disclosure flow ---


This will likely be quite flaky: the LLM is given only the names of the skills (through text appended to the system prompt, I think), so it sometimes calls just load_skill and read_skill_resource and skips list_skills entirely.
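
One way to make such a scenario less sensitive to whether `list_skills` is called is to assert only that the required tools appear in order, treating anything else as optional. The helper below is a hypothetical sketch of that idea, not an existing step in the suite; it assumes each tool call is a dict with a `name` key, as in the API responses.

```python
def calls_in_order(tool_calls: list[dict], required: list[str]) -> bool:
    """Return True when the required tool names occur as an ordered subsequence.

    Extra calls (for example an optional list_skills) are ignored.
    """
    names = iter(call["name"] for call in tool_calls)
    # `name in names` advances the shared iterator, so order is enforced.
    return all(name in names for name in required)
```

For the progressive disclosure flow, `required` would be `["load_skill", "read_skill_resource"]`, so runs with and without a leading `list_skills` both pass.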

jrobertboos · 2026-06-25T15:37:22Z

Please Review:

coderabbitai

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tests/e2e/configuration/server-mode/lightspeed-stack-skills-directory.yaml`:
- Around line 24-26: The skills discovery config is using a relative path in the
`skills.paths` entry, which can break startup when the working directory
changes. Update the YAML to use the absolute mounted path expected by the stack,
and keep the change localized to the `skills` block in
`lightspeed-stack-skills-directory.yaml` so startup consistently finds the
skills directory.

In `@tests/e2e/configuration/server-mode/lightspeed-stack-skills.yaml`:
- Around line 24-26: The skills path in the stack config is CWD-sensitive and
should be pinned to the mounted absolute location instead. Update the
`skills.paths` entry in the YAML so it points to `/app-root/skills/echo` rather
than the relative `skills/echo`, keeping the `skills` configuration
deterministic under the compose mount.

In `@tests/e2e/features/steps/common_http.py`:
- Around line 334-335: The expected JSON in the step implementation still parses
context.text directly, so placeholder tokens like {MODEL} are not substituted
before validation. Update the relevant step in common_http.py to apply the same
placeholder resolution used by the existing partial-body handling before calling
json.loads and validate_json_partially. Keep the fix localized to the step that
consumes context.text and ensure the parsed expected_value reflects substituted
placeholders first.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 2beba6a7-1aff-4350-92f7-60524e66a1c4

📥 Commits

Reviewing files that changed from the base of the PR and between 890a6f7 and 1f11ea7.

📜 Review details

⏰ Context from checks skipped due to timeout. (2)

GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-on-pull-request
GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-0-6-on-pull-request

🧰 Additional context used

📓 Path-based instructions (2)

tests/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

tests/**/*.py: Use pytest for all unit and integration tests; do not use unittest
Use pytest.mark.asyncio marker for async tests

Files:

tests/e2e/features/steps/common_http.py
tests/e2e/features/steps/llm_query_response.py

tests/e2e/**/*.{py,feature}

📄 CodeRabbit inference engine (AGENTS.md)

Use behave (BDD) framework for end-to-end testing with Gherkin feature files

Files:

tests/e2e/features/steps/common_http.py
tests/e2e/features/steps/llm_query_response.py
tests/e2e/features/skills.feature

🧠 Learnings (4)

📚 Learning: 2026-05-20T08:09:30.641Z

Learnt from: max-svistunov
Repo: lightspeed-core/lightspeed-stack PR: 1580
File: docs/design/llama-stack-config-merge/poc-results/library-mode/synthesized-run.yaml:107-110
Timestamp: 2026-05-20T08:09:30.641Z
Learning: In Llama-stack config YAMLs, when defining a Llama Guard safety shield entry, set `provider_shield_id` to the *guard model identifier* (e.g., `meta-llama/Llama-Guard-3-8B`). Do not use a chat/generative model id (e.g., `openai/gpt-4o-mini`): a chat-model id (or `native_override`) indicates only an override landed and does **not** mean the safety shield is actually gating queries. Ensure any E2E coverage for the related implementation (JIRA/E2E tests) exercises a real Llama Guard model to verify that the shield is effective.

Applied to files:

docker-compose-library.yaml
tests/e2e/configuration/server-mode/lightspeed-stack-skills-directory.yaml
tests/e2e/configuration/server-mode/lightspeed-stack-skills.yaml
tests/e2e/configuration/library-mode/lightspeed-stack-skills.yaml
docker-compose.yaml
tests/e2e/configuration/library-mode/lightspeed-stack-skills-directory.yaml

📚 Learning: 2026-04-07T09:20:26.590Z

Learnt from: radofuchs
Repo: lightspeed-core/lightspeed-stack PR: 1467
File: tests/e2e/features/steps/common.py:36-49
Timestamp: 2026-04-07T09:20:26.590Z
Learning: For Behave-based Python tests, rely on Behave’s Context layered stack for attribute lifecycle: Behave pushes a new Context layer when entering feature scope (before_feature) and again for scenario scope (before_scenario). Attributes assigned inside given/when/then steps live on the current scenario layer and are automatically removed when the scenario ends. As a result, step-set attributes should not be expected to persist across scenarios or features, and manual cleanup in after_scenario/after_feature is generally unnecessary for attributes set in step functions. Only perform manual cleanup for attributes that you set explicitly in before_feature/before_scenario, since those live on the respective feature/scenario layers.

Applied to files:

tests/e2e/features/steps/common_http.py
tests/e2e/features/steps/llm_query_response.py

📚 Learning: 2026-04-13T13:39:54.963Z

Learnt from: radofuchs
Repo: lightspeed-core/lightspeed-stack PR: 1490
File: tests/e2e/features/environment.py:206-211
Timestamp: 2026-04-13T13:39:54.963Z
Learning: In lightspeed-stack E2E tests under tests/e2e/features, it is intentional to set context.feature_config inside Background/step functions (scenario-scoped Behave layer). The environment.py after_scenario restore logic should only restore configuration when context.scenario_lightspeed_override_active is True; this flag is set by configure_service only when a real config switch occurs (so restore does not run for scenarios without a switch). Additionally, steps/common.py’s module-level _active_lightspeed_stack_config_basename is used to prevent re-applying the same config across subsequent scenarios, ensuring scenario_lightspeed_override_active stays False after the first apply. Therefore, reviewers should not “fix” this flow as if feature_config were incorrectly scoped or if after_scenario restoration is missing—config switching and restoration are meant to happen exactly once per actual switch, not redundantly per scenario.

Applied to files:

tests/e2e/features/steps/common_http.py
tests/e2e/features/steps/llm_query_response.py

📚 Learning: 2026-06-24T13:45:37.249Z

Learnt from: Jdubrick
Repo: lightspeed-core/lightspeed-stack PR: 1971
File: src/utils/markdown_repair.py:31-36
Timestamp: 2026-06-24T13:45:37.249Z
Learning: In the lightspeed-stack repository, docstrings must use the section header name "Parameters:" (not "Args:") for function arguments, even if the project references Google Python docstring conventions. Ensure docstrings follow the project’s established "Parameters:" header format for any documented function parameters.

Applied to files:

tests/e2e/features/steps/common_http.py
tests/e2e/features/steps/llm_query_response.py
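
As an illustration of the docstring convention described in this learning, a function documented with the "Parameters:" header might look like the sketch below. The function itself is hypothetical and exists only to show the layout; the "Returns:" header is an assumption carried over from Google-style docstrings.

```python
def make_heading(text: str, level: int = 1) -> str:
    """Prefix text with a Markdown heading marker.

    Parameters:
        text: The heading text.
        level: Heading depth, where 1 produces a single ``#``.

    Returns:
        The text prefixed with ``level`` hash characters and a space.
    """
    return f"{'#' * level} {text}"
```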

🪛 LanguageTool

tests/e2e/skills/echo/SKILL.md

[style] ~17-~17: Using “back” with the verb “return” may be redundant.
Context: ...r's input text 2. Return the exact text back to the user without modification For f...

(RETURN_BACK)

🔇 Additional comments (8)

docker-compose-library.yaml (1)

23-23: LGTM!

docker-compose.yaml (1)

90-90: LGTM!

tests/e2e/skills/echo/SKILL.md (1)

1-19: LGTM!

tests/e2e/skills/echo/references/guide.md (1)

1-20: LGTM!

tests/e2e/skills/summarize/SKILL.md (1)

1-22: LGTM!

tests/e2e/skills/summarize/references/guide.md (1)

1-21: LGTM!

tests/e2e/configuration/library-mode/lightspeed-stack-skills-directory.yaml (1)

1-26: LGTM!

tests/e2e/configuration/library-mode/lightspeed-stack-skills.yaml (1)

1-26: LGTM!

coderabbitai · 2026-06-25T15:42:11Z

+skills:
+  paths:
+    - skills


🩺 Stability & Availability | 🔵 Trivial | ⚡ Quick win

Use absolute skills path to avoid CWD-dependent startup failures.

skills is relative; if the service working directory changes, skills discovery can fail at startup. Use /app-root/skills to match the compose mount explicitly.

Proposed change

 skills:
   paths:
-    - skills
+    - /app-root/skills

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/e2e/configuration/server-mode/lightspeed-stack-skills-directory.yaml` around lines 24 - 26, The skills discovery config is using a relative path in the `skills.paths` entry, which can break startup when the working directory changes. Update the YAML to use the absolute mounted path expected by the stack, and keep the change localized to the `skills` block in `lightspeed-stack-skills-directory.yaml` so startup consistently finds the skills directory.
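
The CWD sensitivity raised in this comment can be illustrated with a small sketch of generic path resolution. It does not reproduce how lightspeed-stack actually loads `skills.paths`; the `resolve_skill_path` helper is hypothetical and only shows how a relative entry depends on the working directory while an absolute one does not.

```python
from pathlib import PurePosixPath


def resolve_skill_path(configured: str, cwd: str) -> str:
    """Show where a configured skills path lands for a given working directory."""
    path = PurePosixPath(configured)
    # Absolute paths are returned unchanged; relative ones are joined to cwd.
    return str(path if path.is_absolute() else PurePosixPath(cwd) / path)
```

With `cwd="/app-root"` the relative `skills` entry happens to resolve to the compose mount, but with any other working directory it points elsewhere, whereas `/app-root/skills` resolves to the same location regardless.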

coderabbitai · 2026-06-25T15:42:11Z

+skills:
+  paths:
+    - skills/echo


🩺 Stability & Availability | 🔵 Trivial | ⚡ Quick win

Pin the skill path to the mounted absolute location.

skills/echo is CWD-sensitive. Prefer /app-root/skills/echo for deterministic resolution against the compose mount.

Proposed change

 skills:
   paths:
-    - skills/echo
+    - /app-root/skills/echo

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/e2e/configuration/server-mode/lightspeed-stack-skills.yaml` around lines 24 - 26, The skills path in the stack config is CWD-sensitive and should be pinned to the mounted absolute location instead. Update the `skills.paths` entry in the YAML so it points to `/app-root/skills/echo` rather than the relative `skills/echo`, keeping the `skills` configuration deterministic under the compose mount.

coderabbitai · 2026-06-25T15:42:11Z

+    expected_value = json.loads(context.text)
+    validate_json_partially(actual_value, expected_value)


🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win

Apply placeholder substitution before parsing expected JSON.

At Line 334, this step parses context.text directly, so placeholders like {MODEL} won’t be resolved here (unlike the existing partial-body step). That can cause false failures in scenario assertions.

Proposed fix

-    expected_value = json.loads(context.text)
+    json_str = replace_placeholders(context, context.text)
+    expected_value = json.loads(json_str)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/e2e/features/steps/common_http.py` around lines 334 - 335, The expected JSON in the step implementation still parses context.text directly, so placeholder tokens like {MODEL} are not substituted before validation. Update the relevant step in common_http.py to apply the same placeholder resolution used by the existing partial-body handling before calling json.loads and validate_json_partially. Keep the fix localized to the step that consumes context.text and ensure the parsed expected_value reflects substituted placeholders first.
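
To show why the ordering matters, here is a self-contained sketch of substituting placeholders before parsing and partially matching the expected JSON. Both helpers are hypothetical stand-ins: `replace_placeholders_sketch` takes a plain mapping rather than the Behave context that the suite's `replace_placeholders` receives, and `is_partial_match` only approximates what `validate_json_partially` is described as doing.

```python
import json
from typing import Any


def replace_placeholders_sketch(text: str, values: dict[str, str]) -> str:
    """Hypothetical stand-in: replace {KEY} tokens using a plain mapping."""
    for key, value in values.items():
        text = text.replace("{" + key + "}", value)
    return text


def is_partial_match(actual: Any, expected: Any) -> bool:
    """Return True when everything in expected is also present in actual."""
    if isinstance(expected, dict):
        return isinstance(actual, dict) and all(
            key in actual and is_partial_match(actual[key], value)
            for key, value in expected.items()
        )
    if isinstance(expected, list):
        # Each expected item must match at least one actual item.
        return isinstance(actual, list) and all(
            any(is_partial_match(item, wanted) for item in actual)
            for wanted in expected
        )
    return actual == expected
```

Parsing `'{"model": "{MODEL}"}'` without substitution yields the literal string `{MODEL}`, which never matches the real model name; resolving the placeholder first lets the comparison succeed.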

anik120

PS: squashing commits so that a PR has a single commit is the hygienic thing to do (unless multiple commits are by design, in which case the question becomes "why aren't they separate PRs instead?").

Otherwise they show up as

"fix"

"address code rabbit"

when someone is searching through git history trying to figure out what changes were made.

Here's an article I highly recommend reading https://medium.com/@madhav2002/git-hygiene-commits-branching-and-rewriting-history-bc6dee5f953f

radofuchs

LGTM overall, just a few details.

asimurka · 2026-06-26T11:58:51Z

Just a conceptual question: Is the skill invocation really so strict that when you prompt to run a non-existing skill, the LLM really tries to execute it and ends up with failure?

jrobertboos · 2026-06-26T13:29:04Z

@asimurka when you prompt the LLM to use a skill, if you are direct enough, it will try to call the load_skill tool with the named skill. For example, this is what it looks like right now:

INPUT

curl -X 'POST' \
  'http://localhost:8080/v1/query' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
  "model": "gpt-4o-mini",
  "provider": "openai",
  "query": "load the skill '\''non-existent'\''."
}'

OUTPUT

{
  "conversation_id": "eb995e1ee43557e33d6c43feacf47a4afc73565ba7478294",
  "response": "It appears that there are currently no available skills to load. Please let me know if you need assistance with something else!",
  "rag_chunks": [],
  "referenced_documents": [],
  "truncated": false,
  "input_tokens": 2290,
  "output_tokens": 52,
  "available_quotas": {},
  "tool_calls": [
    {
      "id": "call_51npMnMSenv6Qnp7encji746",
      "name": "load_skill",
      "args": {
        "skill_name": "non-existent"
      },
      "type": "function_call"
    },
    {
      "id": "call_retBuV3RnzjkfsR8ltVNLoq3",
      "name": "list_skills",
      "args": {},
      "type": "function_call"
    }
  ],
  "tool_results": [
    {
      "id": "call_retBuV3RnzjkfsR8ltVNLoq3",
      "status": "success",
      "content": "{}",
      "type": "function_call_output",
      "round": 1
    }
  ]
}

Does that answer your question?
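
The gap in the output above (two tool calls, one tool result) can be detected mechanically by joining `tool_calls` and `tool_results` on `id`. The helper below is a hypothetical sketch written for this discussion, not part of the test suite.

```python
def calls_without_results(response: dict) -> list[str]:
    """Return the names of tool calls that have no tool_result with the same id."""
    result_ids = {result["id"] for result in response.get("tool_results", [])}
    return [
        call["name"]
        for call in response.get("tool_calls", [])
        if call["id"] not in result_ids
    ]
```

Applied to the response above, it reports `load_skill` as the call whose result never reached the API response.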

asimurka · 2026-06-26T14:35:06Z

Is it possible that this is just model-specific behavior? I don't think you should be able to influence the model's behavior like this (with a bare prompt).

coderabbitai

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

tests/e2e/features/skills.feature (1)
59-92: 🗄️ Data Integrity & Integration | 🟠 Major | ⚡ Quick win

Align the advertised tool parameters with the asserted call args.

Line 63 exposes load_skill with parameter name, and Line 87 exposes read_skill_resource with parameter path, but the later call assertions in this same feature expect skill_name and resource_name. Both contracts cannot be correct at once, so either /tools is asserting stale metadata or the tool_calls checks will never match the real invocation shape.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/e2e/features/skills.feature` around lines 59 - 92, The tool metadata in
skills.feature is inconsistent with the expected call arguments: load_skill
currently advertises name while the assertions use skill_name, and
read_skill_resource advertises path while the assertions use resource_name.
Update the feature so the parameter names in the tool definitions and the
tool_calls checks match exactly, using the same symbols load_skill and
read_skill_resource throughout.
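
A consistency check along the lines of this finding could compare the parameter names advertised for each tool with the argument names seen in `tool_calls`. The sketch below is hypothetical: it takes the advertised parameters as a plain mapping from tool name to a set of names, which is a simplification and not the actual `/tools` response shape.

```python
def mismatched_args(
    advertised: dict[str, set[str]], tool_calls: list[dict]
) -> dict[str, set[str]]:
    """Return, per tool, the call argument names that were not advertised."""
    problems: dict[str, set[str]] = {}
    for call in tool_calls:
        unknown = set(call.get("args", {})) - advertised.get(call["name"], set())
        if unknown:
            problems.setdefault(call["name"], set()).update(unknown)
    return problems
```

With `load_skill` advertised as taking `name` while calls pass `skill_name`, the function flags `skill_name`; once both sides use the same parameter name it returns an empty mapping.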

♻️ Duplicate comments (2)

tests/e2e/configuration/server-mode/lightspeed-stack-skills.yaml (1)
24-26: 🩺 Stability & Availability | 🔴 Critical | ⚡ Quick win

Pin skill path to absolute mounted location.

skills/echo is CWD-sensitive. Use /app-root/skills/echo for deterministic resolution against the compose mount. This was flagged in a previous review and remains unaddressed.
Proposed fix
 skills:
   paths:
-    - skills/echo
+    - /app-root/skills/echo
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/e2e/configuration/server-mode/lightspeed-stack-skills.yaml` around
lines 24 - 26, The skills path in the configuration is still relative and
depends on the working directory, so it should be pinned to the mounted absolute
location instead. Update the `skills.paths` entry in the
`lightspeed-stack-skills.yaml` config to use the compose mount target
`/app-root/skills/echo` so resolution is deterministic. Make sure the change is
applied in the `skills` section and not elsewhere in the e2e configuration.
tests/e2e/configuration/server-mode/lightspeed-stack-skills-directory.yaml (1)
24-26: 🩺 Stability & Availability | 🔴 Critical | ⚡ Quick win

Use absolute path for skills directory to prevent startup failures.

skills is a relative path. If the service working directory differs from /app-root, skill discovery fails at startup. Change to /app-root/skills to match the compose mount explicitly. This was flagged in a previous review and remains unaddressed.
Proposed fix
 skills:
   paths:
-    - skills
+    - /app-root/skills
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/e2e/configuration/server-mode/lightspeed-stack-skills-directory.yaml`
around lines 24 - 26, The skills directory configuration is using a relative
path, which can break startup when the working directory is not the expected
root. Update the skills path in the lightspeed stack config to use the absolute
mounted location instead of the current relative value, and make sure the change
is applied in the skills discovery config that the startup flow reads so skill
loading works reliably regardless of cwd.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@docker-compose-library.yaml`:
- Line 23: Align the SELinux label used for the `./tests/e2e/skills` bind mount
in `docker-compose-library.yaml` with the one used in `docker-compose.yaml` to
avoid unnecessary divergence. Update the volume entry under the library compose
service to use the same `ro,z`/`ro,Z` convention consistently across both
compose files, or explicitly document why `docker-compose-library.yaml` should
differ. Use the shared skills mount definition as the reference point when
making the change.

In `@tests/e2e/features/skills.feature`:
- Around line 14-16: The skills feature scenarios are applying the new MCP
skills config before resetting toolgroups, which can leave stale server-mode
registrations in place. Update each affected scenario in skills.feature so
reset_mcp_toolgroups_for_new_configuration runs before The service uses the
lightspeed-stack-skills.yaml configuration, then restart the service afterward.
Keep the step order consistent in all listed scenarios so list_skills and
load_skill assertions always use the fresh toolgroup state.

In `@tests/e2e/skills/echo/SKILL.md`:
- Line 17: The SKILL.md guidance in the echo skill uses the redundant phrase
“return back”; update the wording in the instruction text to say “Return the
exact text to the user without modification” so it stays clear and concise. Make
this edit in the echo skill’s step that describes the response behavior, keeping
the rest of the instruction unchanged.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 07dc80e8-7a8f-4ea7-a655-f00e2ffeee6d

📥 Commits

Reviewing files that changed from the base of the PR and between 1f11ea7 and e2990a2.

📜 Review details

⏰ Context from checks skipped due to timeout. (12)

GitHub Check: integration_tests (3.13)
GitHub Check: integration_tests (3.12)
GitHub Check: build-pr
GitHub Check: E2E: server mode / ci / group 3
GitHub Check: E2E: library mode / ci / group 3
GitHub Check: E2E: library mode / ci / group 2
GitHub Check: E2E: library mode / ci / group 1
GitHub Check: E2E: server mode / ci / group 2
GitHub Check: E2E: server mode / ci / group 1
GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-0-6-on-pull-request
GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-on-pull-request
GitHub Check: E2E Tests for Lightspeed Evaluation job

⚠️ CI failures not shown inline (2)

GitHub Actions: OpenAPI (Spectral) / spectral: LCORE-2080: Added E2E Steps for Agent Skills

Conclusion: failure

##[group]Run set -euo pipefail
 set -euo pipefail
 uv run python scripts/generate_openapi_schema.py /tmp/openapi-generated.json
 if ! diff -u docs/openapi.json /tmp/openapi-generated.json; then
   echo "::error::docs/openapi.json is out of date. Regenerate with: uv run scripts/generate_openapi_schema.py docs/openapi.json"

GitHub Actions: OpenAPI (Spectral) / 0_spectral.txt: LCORE-2080: Added E2E Steps for Agent Skills

Conclusion: failure

##[group]Run set -euo pipefail
 set -euo pipefail
 uv run python scripts/generate_openapi_schema.py /tmp/openapi-generated.json
 if ! diff -u docs/openapi.json /tmp/openapi-generated.json; then
   echo "::error::docs/openapi.json is out of date. Regenerate with: uv run scripts/generate_openapi_schema.py docs/openapi.json"
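
The failing step regenerates the schema and diffs it against the committed `docs/openapi.json`. As a rough Python analogue of that comparison (a sketch only, not the project's CI script), the check can be expressed with `difflib`. The function name `schema_drift` and the sample schema strings are assumptions made for illustration.

```python
import difflib


def schema_drift(committed: str, generated: str) -> list[str]:
    """Return unified-diff lines between committed and freshly generated schema text."""
    return list(
        difflib.unified_diff(
            committed.splitlines(),
            generated.splitlines(),
            fromfile="docs/openapi.json",
            tofile="/tmp/openapi-generated.json",
            lineterm="",
        )
    )


# Illustrative schema contents, not taken from the repository.
committed_schema = '{\n  "example": false\n}'
generated_schema = '{\n  "example": true\n}'

in_sync = schema_drift(committed_schema, committed_schema)
out_of_date = schema_drift(committed_schema, generated_schema)

# An empty diff means the committed file is current; any output means it must be regenerated.
print(len(in_sync) == 0)
print(len(out_of_date) > 0)
```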

🧰 Additional context used

📓 Path-based instructions (2)

tests/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

tests/**/*.py: Use pytest for all unit and integration tests; do not use unittest
Use pytest.mark.asyncio marker for async tests

Files:

tests/e2e/features/steps/common_http.py
tests/e2e/features/steps/llm_query_response.py

tests/e2e/**/*.{py,feature}

📄 CodeRabbit inference engine (AGENTS.md)

Use behave (BDD) framework for end-to-end testing with Gherkin feature files

Files:

tests/e2e/features/steps/common_http.py
tests/e2e/features/steps/llm_query_response.py
tests/e2e/features/skills.feature

🧠 Learnings (4)

📚 Learning: 2026-05-20T08:09:30.641Z

Learnt from: max-svistunov
Repo: lightspeed-core/lightspeed-stack PR: 1580
File: docs/design/llama-stack-config-merge/poc-results/library-mode/synthesized-run.yaml:107-110
Timestamp: 2026-05-20T08:09:30.641Z
Learning: In Llama-stack config YAMLs, when defining a Llama Guard safety shield entry, set `provider_shield_id` to the *guard model identifier* (e.g., `meta-llama/Llama-Guard-3-8B`). Do not use a chat/generative model id (e.g., `openai/gpt-4o-mini`): a chat-model id (or `native_override`) indicates only an override landed and does **not** mean the safety shield is actually gating queries. Ensure any E2E coverage for the related implementation (JIRA/E2E tests) exercises a real Llama Guard model to verify that the shield is effective.

Applied to files:

docker-compose-library.yaml
tests/e2e/configuration/server-mode/lightspeed-stack-skills.yaml
docker-compose.yaml
tests/e2e/configuration/library-mode/lightspeed-stack-skills-directory.yaml
tests/e2e/configuration/server-mode/lightspeed-stack-skills-directory.yaml
tests/e2e/configuration/library-mode/lightspeed-stack-skills.yaml

📚 Learning: 2026-04-07T09:20:26.590Z

Learnt from: radofuchs
Repo: lightspeed-core/lightspeed-stack PR: 1467
File: tests/e2e/features/steps/common.py:36-49
Timestamp: 2026-04-07T09:20:26.590Z
Learning: For Behave-based Python tests, rely on Behave’s Context layered stack for attribute lifecycle: Behave pushes a new Context layer when entering feature scope (before_feature) and again for scenario scope (before_scenario). Attributes assigned inside given/when/then steps live on the current scenario layer and are automatically removed when the scenario ends. As a result, step-set attributes should not be expected to persist across scenarios or features, and manual cleanup in after_scenario/after_feature is generally unnecessary for attributes set in step functions. Only perform manual cleanup for attributes that you set explicitly in before_feature/before_scenario, since those live on the respective feature/scenario layers.

Applied to files:

tests/e2e/features/steps/common_http.py
tests/e2e/features/steps/llm_query_response.py
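
The layered-context behaviour described in this learning can be pictured with a toy model. The class below is a simplified stand-in written for illustration; it is not Behave's `Context` class, and the names `LayeredContext`, `push`, and `pop` are assumptions. It shows only the idea that attributes set in the innermost layer disappear when that layer is popped.

```python
class LayeredContext:
    """Toy stand-in for a layered test context; not Behave's actual class."""

    def __init__(self) -> None:
        # Bypass our own __setattr__ so the layer stack lives on the instance.
        object.__setattr__(self, "_layers", [{}])

    def push(self) -> None:
        """Open a new layer, as happens when a feature or scenario starts."""
        self._layers.append({})

    def pop(self) -> None:
        """Discard the innermost layer and everything stored on it."""
        self._layers.pop()

    def __setattr__(self, name: str, value: object) -> None:
        # New attributes always land on the innermost layer.
        self._layers[-1][name] = value

    def __getattr__(self, name: str) -> object:
        # Look from the innermost layer outwards.
        for layer in reversed(self._layers):
            if name in layer:
                return layer[name]
        raise AttributeError(name)


context = LayeredContext()
context.push()                        # feature layer
context.feature_flag = "feature"      # set at feature scope
context.push()                        # scenario layer
context.response = "set in a step"    # set inside a step
seen_in_scenario = hasattr(context, "response")
context.pop()                         # scenario ends
seen_after_scenario = hasattr(context, "response")
feature_value_survives = hasattr(context, "feature_flag")
```

In this model the step-set `response` attribute needs no manual cleanup, while the feature-scoped attribute stays until its own layer is popped.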

📚 Learning: 2026-04-13T13:39:54.963Z

Learnt from: radofuchs
Repo: lightspeed-core/lightspeed-stack PR: 1490
File: tests/e2e/features/environment.py:206-211
Timestamp: 2026-04-13T13:39:54.963Z
Learning: In lightspeed-stack E2E tests under tests/e2e/features, it is intentional to set context.feature_config inside Background/step functions (scenario-scoped Behave layer). The environment.py after_scenario restore logic should only restore configuration when context.scenario_lightspeed_override_active is True; this flag is set by configure_service only when a real config switch occurs (so restore does not run for scenarios without a switch). Additionally, steps/common.py’s module-level _active_lightspeed_stack_config_basename is used to prevent re-applying the same config across subsequent scenarios, ensuring scenario_lightspeed_override_active stays False after the first apply. Therefore, reviewers should not “fix” this flow as if feature_config were incorrectly scoped or if after_scenario restoration is missing—config switching and restoration are meant to happen exactly once per actual switch, not redundantly per scenario.

Applied to files:

tests/e2e/features/steps/common_http.py
tests/e2e/features/steps/llm_query_response.py

📚 Learning: 2026-06-24T13:45:37.249Z

Learnt from: Jdubrick
Repo: lightspeed-core/lightspeed-stack PR: 1971
File: src/utils/markdown_repair.py:31-36
Timestamp: 2026-06-24T13:45:37.249Z
Learning: In the lightspeed-stack repository, docstrings must use the section header name "Parameters:" (not "Args:") for function arguments, even if the project references Google Python docstring conventions. Ensure docstrings follow the project’s established "Parameters:" header format for any documented function parameters.

Applied to files:

tests/e2e/features/steps/common_http.py
tests/e2e/features/steps/llm_query_response.py
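
As a small illustration of the docstring convention in this learning, the function below uses a `Parameters:` header instead of `Args:`. The function itself is hypothetical, and the `Returns:` section follows general Google-style layout as an assumption; the learning only specifies the `Parameters:` header.

```python
def add_prefix(text: str, prefix: str) -> str:
    """Prepend a prefix to the given text.

    Parameters:
        text: The text to be prefixed.
        prefix: The string placed in front of the text.

    Returns:
        The prefix followed by the original text.
    """
    return prefix + text


print(add_prefix("skills", "e2e-"))
```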

🪛 LanguageTool

tests/e2e/skills/echo/SKILL.md

[style] ~17-~17: Using “back” with the verb “return” may be redundant.
Context: ...r's input text 2. Return the exact text back to the user without modification For f...

(RETURN_BACK)

🔇 Additional comments (8)

tests/e2e/features/steps/common_http.py (1)

331-333: Apply placeholder substitution before parsing the expected JSON.

This step still calls json.loads(context.text) directly, so {MODEL}-style placeholders here will fail even though the sibling partial-body step resolves them first.
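
A minimal sketch of the suggested ordering, assuming a simple string-replacement helper: placeholders are resolved first, and only then is the text parsed as JSON. The names `substitute_placeholders` and `parse_expected_json` and the value `example-model` are hypothetical; the repository's own helper for placeholder resolution may look different.

```python
import json


def substitute_placeholders(text: str, values: dict[str, str]) -> str:
    """Replace {NAME}-style placeholders with their configured values."""
    for name, value in values.items():
        text = text.replace("{" + name + "}", value)
    return text


def parse_expected_json(raw_text: str, values: dict[str, str]) -> object:
    """Resolve placeholders first, then parse the result as JSON."""
    return json.loads(substitute_placeholders(raw_text, values))


raw_body = '{"model": "{MODEL}"}'
placeholders = {"MODEL": "example-model"}  # hypothetical value

# Parsing directly keeps the literal placeholder, so a comparison against
# the real response would not match.
unresolved = json.loads(raw_body)
resolved = parse_expected_json(raw_body, placeholders)

print(unresolved)
print(resolved)
```

Plain `str.replace` is used here rather than `str.format`, because the braces that delimit JSON objects would otherwise be read as format fields.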

tests/e2e/features/steps/llm_query_response.py (1)

94-97: LGTM!

Also applies to: 368-404

docker-compose.yaml (1)

90-90: LGTM!

tests/e2e/skills/echo/references/guide.md (1)

1-20: LGTM!

tests/e2e/skills/summarize/SKILL.md (1)

1-21: LGTM!

tests/e2e/skills/summarize/references/guide.md (1)

1-21: LGTM!

tests/e2e/configuration/library-mode/lightspeed-stack-skills-directory.yaml (1)

1-26: LGTM!

tests/e2e/configuration/library-mode/lightspeed-stack-skills.yaml (1)

1-26: LGTM!

coderabbitai · 2026-06-29T13:51:20Z

+## Instructions
+
+1. Read the user's input text
+2. Return the exact text back to the user without modification


📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low value

Remove redundant "back" in "return back".

"Return" already implies giving back; "return back" is pleonastic.

Proposed fix

-2. Return the exact text back to the user without modification
+2. Return the exact text to the user without modification

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/e2e/skills/echo/SKILL.md` at line 17, The SKILL.md guidance in the echo skill uses the redundant phrase “return back”; update the wording in the instruction text to say “Return the exact text to the user without modification” so it stays clear and concise. Make this edit in the echo skill’s step that describes the response behavior, keeping the rest of the instruction unchanged.

refined E2E tests for skills and added necessary step implementations. close: LCORE-2080

jrobertboos force-pushed the lcore-2080 branch 3 times, most recently from c201e27 to fe7754f on June 23, 2026 16:29

jrobertboos commented Jun 23, 2026

jrobertboos force-pushed the lcore-2080 branch 3 times, most recently from bd2b990 to 159e8ae on June 25, 2026 13:43

jrobertboos marked this pull request as ready for review June 25, 2026 15:37

coderabbitai Bot reviewed Jun 25, 2026

anik120 reviewed Jun 25, 2026

jrobertboos force-pushed the lcore-2080 branch from 1f11ea7 to 2954a53 on June 25, 2026 16:07

radofuchs requested changes Jun 26, 2026

Comment thread tests/e2e/features/steps/common_http.py Outdated

Comment thread tests/e2e/features/steps/llm_query_response.py Outdated

coderabbitai Bot reviewed Jun 29, 2026

jrobertboos requested a review from radofuchs June 29, 2026 14:05

(e2e) added E2E steps for agent skills

9af8d80

refined E2E tests for skills and added necessary step implementations. close: LCORE-2080

jrobertboos force-pushed the lcore-2080 branch from e2990a2 to 9af8d80 on June 29, 2026 14:41

		expected_value = json.loads(context.text)
		validate_json_partially(actual_value, expected_value)
