feat: Add ReasoningBank for reusable reasoning strategies by nebrass · Pull Request #702 · google/adk-java

nebrass · 2026-01-05T14:18:26Z

Summary

This PR implements ReasoningBank in ADK Java — a memory framework that lets agents distill
reusable reasoning strategies from their past task executions (both successful and failed)
and retrieve them to guide new, similar tasks.

Ouyang et al. "ReasoningBank: Scaling Agent Self-Evolving with Reasoning Memory" (ICLR 2026).
Paper: https://arxiv.org/abs/2509.25140 · Blog: https://research.google/blog/reasoningbank-enabling-agents-to-learn-from-experience/ · Reference implementation: https://github.com/google-research/reasoning-bank

The design mirrors ADK's existing Memory feature (BaseMemoryService, InMemoryMemoryService,
LoadMemoryTool).

What is ReasoningBank?

Unlike memory mechanisms that store raw trajectories (Synapse) or only successful workflows (Agent
Workflow Memory), ReasoningBank distills compact, transferable memory items from both successes
and failures. Failure-derived items become preventative "guardrails" — e.g. "verify the page
identifier before loading more results to avoid infinite-scroll traps."

Components (`com.google.adk.reasoning`)

Data models

ReasoningMemoryItem — immutable memory item with the paper's canonical title / description / content schema, plus sourceTraceSuccessful so failure-derived preventative lessons are first-class (also id, tags, createdAt).
ReasoningTrace — a raw task trajectory (task, output, intermediate reasoning steps, successful flag) retained for later distillation.
SearchReasoningResponse — search result wrapper.

Service layer

BaseReasoningBankService — storage/retrieval contract: storeMemoryItem, storeTrace, searchMemoryItems.
InMemoryReasoningBankService — prototype implementation using bag-of-words keyword scoring (title > description > tags > content). Not production-grade — the reference implementation uses embedding-based retrieval.
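
The field-weighted keyword scoring described above can be sketched roughly as follows. This is a minimal illustration, not the module's code: the class and method names are hypothetical, and the weights (title x3, description x2, tags x1, flat content bonus) follow the ordering given in a later commit message in this thread.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

/** Hypothetical stand-in for the in-memory scorer; names and weights are assumptions. */
final class KeywordScoringSketch {

  /** Lower-cases the text and splits it on runs of non-alphanumeric characters. */
  static Set<String> tokenize(String text) {
    Set<String> tokens =
        new HashSet<>(Arrays.asList(text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")));
    tokens.remove(""); // split() can yield an empty leading token
    return tokens;
  }

  /** Counts the distinct query tokens that also occur in the given field. */
  static int overlap(Set<String> queryTokens, String field) {
    Set<String> common = tokenize(field);
    common.retainAll(queryTokens);
    return common.size();
  }

  /** Title matches weigh x3, description x2, tags x1; any content match adds a flat +1. */
  static int score(
      String query, String title, String description, List<String> tags, String content) {
    Set<String> q = tokenize(query);
    int score =
        3 * overlap(q, title)
            + 2 * overlap(q, description)
            + overlap(q, String.join(" ", tags));
    if (overlap(q, content) > 0) {
      score += 1; // flat bonus, independent of how many content tokens match
    }
    return score;
  }
}
```

A bag-of-words score like this has no notion of synonyms or word order, which is why the description calls it a prototype and points to embedding-based retrieval for production use.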

Extraction SPI

MemoryExtractor (+ NoOpMemoryExtractor) — extension point for the "judge & extract" step of the loop; extract(query, List<ReasoningTrace>) accommodates parallel/sequential MaTTS distillation later without an API break. LLM-backed extractors are intentionally left to downstream modules to keep this contrib module dependency-free.
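
To make the shape of this extension point concrete, here is a simplified, synchronous sketch. Everything in it is hypothetical: the record and interface names are stand-ins, and the module's real extract(...) signature and return type (which may well be reactive) are not shown in this thread.

```java
import java.util.List;

/** Simplified stand-ins for the SPI; all names here are assumptions for illustration. */
final class ExtractorSpiSketch {

  /** A raw trajectory: what was attempted, what came out, and whether it worked. */
  record Trace(String task, String output, boolean successful) {}

  /** A distilled item using the title / description / content schema. */
  record MemoryItem(
      String title, String description, String content, boolean sourceTraceSuccessful) {}

  /** Accepting a list of traces leaves room for multi-trajectory distillation later. */
  interface Extractor {
    List<MemoryItem> extract(String query, List<Trace> traces);
  }

  /** Mirrors the idea of a no-op extractor: it never mints anything. */
  static final Extractor NO_OP = (query, traces) -> List.of();

  /** Toy rule-based extractor that emits one item per trace, keeping the outcome flag. */
  static final Extractor ONE_PER_TRACE =
      (query, traces) ->
          traces.stream()
              .map(
                  t ->
                      new MemoryItem(
                          (t.successful() ? "Strategy: " : "Guardrail: ") + t.task(),
                          "Distilled from a "
                              + (t.successful() ? "successful" : "failed")
                              + " run of: "
                              + query,
                          t.output(),
                          t.successful()))
              .toList();
}
```

Because the parameter is already a list, a later extractor that contrasts several trajectories for the same query could be added without changing callers.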

Tool integration (`com.google.adk.tools`)

LoadReasoningMemoryTool — a FunctionTool exposing retrieval to agents as loadReasoningMemory(query).
LoadReasoningMemoryResponse — tool response record.

The closed loop

retrieve ──► act (agent / env) ──► judge (LLM) ──► extract (LLM) ──► consolidate
   ▲                                                                      │
   └──────────────────────────────────────────────────────────────────────┘

searchMemoryItems → retrieve · the agent runtime → act · MemoryExtractor → judge & extract · storeMemoryItem → consolidate (append).
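
The mapping above can be read as one pass of a simple driver. The sketch below is a toy, assuming hypothetical Agent, Judge, and Extractor interfaces and plain strings as memory items; it only illustrates the order of the steps and the append-only consolidation, not the module's actual types.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Toy driver for retrieve -> act -> judge -> extract -> consolidate; all names are assumptions. */
final class ClosedLoopSketch {

  interface Agent {
    String act(String task, List<String> hints);
  }

  interface Judge {
    boolean isSuccess(String task, String output);
  }

  interface Extractor {
    String extract(String task, String output, boolean success);
  }

  /** The "bank": an append-only list of distilled items (plain strings here). */
  private final List<String> bank = new ArrayList<>();

  /** Retrieve: return every stored item that contains at least one word of the query. */
  List<String> retrieve(String query) {
    List<String> hits = new ArrayList<>();
    for (String item : bank) {
      String lower = item.toLowerCase(Locale.ROOT);
      for (String word : query.toLowerCase(Locale.ROOT).split("\\s+")) {
        if (!word.isEmpty() && lower.contains(word)) {
          hits.add(item);
          break; // one matching word is enough for this item
        }
      }
    }
    return hits;
  }

  /** Runs one pass of the loop and appends the distilled item to the bank. */
  String runOnce(String task, Agent agent, Judge judge, Extractor extractor) {
    List<String> hints = retrieve(task); // retrieve
    String output = agent.act(task, hints); // act
    boolean success = judge.isSuccess(task, output); // judge
    bank.add(extractor.extract(task, output, success)); // extract + consolidate (append)
    return output;
  }
}
```

On a second run with a similar task, the item minted by the first run is retrieved and handed to the agent as a hint, which is the self-evolving behaviour the loop is meant to provide.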

Integration

The module is self-contained and does not modify InvocationContext or ToolContext.
Agents use it by constructing LoadReasoningMemoryTool(reasoningBankService, appName) and adding it
to their tool list (constructor injection). No core ADK changes are required.

Out of scope (documented in the module README)

Embedding-based retrieval (the in-memory service uses keyword matching).
Memory-aware Test-Time Scaling (MaTTS) driver (parallel self-contrast / sequential refinement).
LLM-as-a-judge and LLM extraction prompts (SUCCESSFUL_SI, FAILED_SI, PARALLEL_SI, …).

Usage

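// Imports assume the packages listed under Components and Tool integration above.
import com.google.adk.reasoning.BaseReasoningBankService;
import com.google.adk.reasoning.InMemoryReasoningBankService;
import com.google.adk.reasoning.ReasoningMemoryItem;
import com.google.adk.tools.LoadReasoningMemoryTool;
import com.google.common.collect.ImmutableList;

BaseReasoningBankService reasoningBank = new InMemoryReasoningBankService();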

// Store a distilled memory item (here, a preventative lesson from a failed run)
reasoningBank.storeMemoryItem(
        "myApp",
        ReasoningMemoryItem.builder()
            .id("pagination-guardrail")
            .title("Verify page identifier before pagination")
            .description("Confirm the active page before loading more results.")
            .content(
                "Cross-reference the current page id with active filters to avoid "
                    + "infinite-scroll traps.")
            .tags(ImmutableList.of("web", "pagination"))
            .sourceTraceSuccessful(false)
            .build())
    .blockingAwait();

// Expose retrieval to an agent
LoadReasoningMemoryTool tool = new LoadReasoningMemoryTool(reasoningBank, "myApp");
// add `tool` to your agent's tool list

Test Plan

ReasoningMemoryItemTest (4)
ReasoningTraceTest (5)
InMemoryReasoningBankServiceTest (12) — includes retrieval of failure-derived items
NoOpMemoryExtractorTest (2)
All 23 module unit tests pass

Related

Paper: ReasoningBank: Scaling Agent Self-Evolving with Reasoning Memory (ICLR 2026)
Blog: https://research.google/blog/reasoningbank-enabling-agents-to-learn-from-experience/
Reference implementation: https://github.com/google-research/reasoning-bank
Mirrors ADK's Memory feature pattern (BaseMemoryService, InMemoryMemoryService, LoadMemoryTool).

google-cla · 2026-01-05T14:18:32Z

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

gemini-code-assist · 2026-01-05T14:18:46Z

Summary of Changes

Hello @nebrass, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces the foundational ReasoningBank feature, designed to enhance agent capabilities by allowing them to learn from and reuse successful problem-solving approaches. By providing mechanisms to store and retrieve distilled reasoning strategies and raw execution traces, agents can apply proven methods to new, similar tasks, thereby improving their efficiency and effectiveness. The implementation includes core data models, a service interface with an in-memory prototype, and seamless integration into the existing tool and invocation contexts.

Highlights

New Feature: ReasoningBank: Introduces the ReasoningBank feature, enabling agents to store and retrieve proven reasoning strategies, based on the 'Reasoning-Bank: Learning from the Traces of Thought' paper.
New Data Models: Added ReasoningStrategy (for distilled reasoning approaches), ReasoningTrace (for raw task execution data), and SearchReasoningResponse (for strategy search results).
Service Layer Implementation: Defined BaseReasoningBankService interface and provided an InMemoryReasoningBankService implementation for prototyping, utilizing keyword matching for strategy retrieval.
Tool Integration: Integrated LoadReasoningStrategyTool as a function tool, allowing agents to search for and load relevant strategies, along with its corresponding LoadReasoningStrategyResponse.
Context Updates: Modified InvocationContext to include the reasoningBankService and ToolContext to expose a searchReasoningStrategies() method for agent access.


gemini-code-assist

Code Review

This pull request introduces a ReasoningBank feature, which is a significant and well-implemented addition. The new components, including data models, services, and tool integrations, are clearly defined and follow the existing architectural patterns of the project. The code is well-documented and accompanied by a comprehensive set of unit tests, ensuring the new functionality is robust. I have a couple of minor suggestions for code refinement in the InMemoryReasoningBankService to improve conciseness and use more idiomatic Java constructs, but overall, this is excellent work.

glaforge · 2026-02-11T09:53:20Z

Do you think you could move the contribution into the contrib folder?
In core, we'd like to keep features that are available across all our language runtimes, and agreed upon. Here, the Reasoning Bank would be available only in Java (for now at least). So it would make sense to move it to the contribution section, as this is specific to ADK Java.

nebrass · 2026-02-11T14:45:56Z

Thanks @glaforge, that makes sense. I've moved the entire ReasoningBank contribution to contrib/reasoning-bank/:

Created a new contrib/reasoning-bank Maven module with its own pom.xml
Moved all reasoning models (ReasoningStrategy, ReasoningTrace, SearchReasoningResponse), the service interface (BaseReasoningBankService), the in-memory implementation, and the tool (LoadReasoningStrategyTool, LoadReasoningStrategyResponse) from core/ to contrib/reasoning-bank/
Reverted all changes to InvocationContext and ToolContext in core — no reasoning-specific code remains in core
Refactored LoadReasoningStrategyTool to be self-contained: it accepts the BaseReasoningBankService and appName via its constructor rather than relying on ToolContext or InvocationContext
All tests (both core and reasoning-bank) pass

Add a contrib/reasoning-bank module implementing the ReasoningBank pattern (arXiv:2509.25140) for storing and retrieving proven reasoning strategies. Includes data models, in-memory service, and a FunctionTool for agent integration.

Resolves a build failure caused by an unresolvable parent POM version of 0.5.1-SNAPSHOT in the contrib/reasoning-bank module.

glaforge · 2026-04-21T18:56:24Z

Looks like the paper has been updated?
https://research.google/blog/reasoningbank-enabling-agents-to-learn-from-experience/
Is it changing something for your implementation?

The ReasoningBank paper (arXiv:2509.25140) and its reference implementation at google-research/reasoning-bank were updated; the memory item schema and loop are now pinned. This commit aligns the contrib module.

Key changes:

* Replace ReasoningStrategy with ReasoningMemoryItem matching the paper's schema: title / description / content (+ tags, id, createdAt). The prior problemPattern + ordered 'steps' shape was closer to Agent Workflow Memory, which the paper explicitly positions ReasoningBank against.
* Add sourceTraceSuccessful flag on memory items. Failure-derived items (preventative lessons / guardrails) are first-class, matching the paper's emphasis on distilling insights from both successful and failed runs.
* Add MemoryExtractor SPI (+ NoOpMemoryExtractor) to represent the 'judge & extract' step of the closed loop. LLM-backed extractors stay out of this module to keep it dependency-free.
* extract() takes List<ReasoningTrace> so memory-aware test-time scaling (MaTTS) parallel/sequential distillation can be layered on later without an API break.
* Rename service methods storeStrategy/searchStrategies to storeMemoryItem/searchMemoryItems and the tool to LoadReasoningMemoryTool.
* Update InMemoryReasoningBankService scoring: title (x3) > description (x2) > tags (x1) > content (flat bonus). Take a snapshot of the synchronized list before iterating.
* Add README covering scope, the retrieve -> act -> judge -> extract -> consolidate loop, and what is intentionally out of scope (embedding retrieval, MaTTS driver, LLM extraction prompts).

All 23 unit tests pass.

Phase 0 of the closed-loop work. Additive, backward-compatible on the unreleased schema.

* ReasoningMemoryItem gains sourceTraceId, judgeVerdict, judgeConfidence (all nullable) and trust (default 1.0). Provenance makes a judge-minted item locatable/evictable and lets failure-derived items be trust-demoted at retrieval -- the audit primitives the closed loop needs to be safe.
* InMemoryReasoningBankService default retrieval cap 5 -> 3, matching the paper's k-ablation (more retrieved memories monotonically hurt).

Phase 1 of the closed loop. Both impls use core's BaseLlm only -- no new module dependencies -- and are fully testable offline via a FakeLlm double.

* TrajectoryJudge SPI + Verdict (three-state SUCCESS/FAILURE/INDETERMINATE). LlmTrajectoryJudge ports the reference judge's asymmetric-strictness rubric (generalized off WebArena): mark failure when uncertain. A judge that ran but was unparseable -> FAILURE; a judge that errored/returned nothing -> INDETERMINATE (abstain, mint nothing) so a non-run never fabricates a preventative guardrail.
* LlmMemoryExtractor implements MemoryExtractor, routing on trajectory count/outcome to the SUCCESSFUL_SI / FAILED_SI / PARALLEL_SI prompts, emitting JSON parsed via outputSchema-style typing, capped in code (3 single / 5 parallel) and never throwing (malformed -> empty list). Minted items carry provenance (sourceTraceId, judgeVerdict, outcome).

13 new tests (judge 6, extractor 7); 39 module tests pass.
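
The three-state verdict rule can be illustrated with a small mapping function. This is a sketch under assumptions: the reply format (a bare "success" or "failure" word) and the class and method names are hypothetical, and the real judge's prompt and parsing are not shown in this thread.

```java
import java.util.Locale;
import java.util.Optional;

/** Hypothetical mapping from a raw judge reply to a three-state verdict. */
final class VerdictSketch {

  enum Verdict {
    SUCCESS,
    FAILURE,
    INDETERMINATE
  }

  /**
   * An empty Optional models a judge call that errored; a blank string models a judge that
   * returned nothing. Both abstain. Any other reply that is not clearly "success" counts as
   * FAILURE, which is the asymmetric "mark failure when uncertain" rule.
   */
  static Verdict fromJudgeReply(Optional<String> rawJudgeReply) {
    if (rawJudgeReply.isEmpty() || rawJudgeReply.get().isBlank()) {
      return Verdict.INDETERMINATE; // the judge never really ran: mint nothing
    }
    String normalized = rawJudgeReply.get().trim().toLowerCase(Locale.ROOT);
    if (normalized.equals("success")) {
      return Verdict.SUCCESS;
    }
    return Verdict.FAILURE; // explicit failure, or a reply that could not be parsed
  }
}
```

Separating "could not parse" from "did not run" is what keeps an outage of the judge from being recorded as a failed task and turned into a spurious guardrail.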

Phase 2. One plugin, no ADK core edits (service captured by constructor).

* Retrieve (read-only, always on): beforeModelCallback searches the bank for the latest user turn and injects matches as a DE-PRIVILEGED, fenced, escaped 'untrusted DATA' user turn -- never a system instruction. Item text that tries to close the fence is neutralized, so stored memory cannot inject instructions into the agent (this matters because a poisoned item would otherwise be re-injected on every later retrieval).
* Judge -> extract -> consolidate (write, OPT-IN, triple-gated on autoConsolidate + judge + extractor): afterRunCallback judges the trajectory, and on a SUCCESS/FAILURE verdict distills and stores items; an INDETERMINATE verdict (judge errored) abstains and mints nothing. Runs off the critical path (Schedulers.io, onErrorComplete) so it never blocks or fails the run.
* Updates README to document the now-complete loop and the safety model.

9 new tests; 48 module tests pass.

…Phase 5) Driven by an adversarial red-team of the memory-injection path.

* Injection containment is now structural, not marker whack-a-mole: sanitize strips format/zero-width/bidi (Cf) controls, collapses every line/paragraph separator (incl. U+2028/U+2029/U+0085) to a space, strips C0/C1 controls, neutralizes the exact fence markers, and length-caps fields; buildMemoryTurn caps item count. Attacker-controlled title/content can no longer forge a bullet, preamble, role marker, or confusable/fullwidth fence -- all collapse to inert inline data in the de-privileged user turn. 9-case corpus (C1-C12).
* Per-run mint rate-limit (maxItemsPerRun, new constructor overload; existing signatures preserved) bounds how much one verdict can write.
* Failure trust-demotion: a failure-derived guardrail surfaces only when no success item matched the query; trust() is now a live within-tier tiebreaker.
* ConsolidationPolicy SPI with append-only identity() default (faithful) and a boundedByCreatedAt(n) example; InMemoryReasoningBankService store path is now read-modify-write under its existing monitor, observationally unchanged by default.

20 new tests; 68 module tests pass.
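
A rough sketch of the sanitization steps listed in this commit message, assuming hypothetical fence markers and a hypothetical class name. The module's real markers, ordering, and caps are not shown in this thread, and this version does not attempt the confusable/fullwidth handling the commit mentions.

```java
/** Hypothetical sanitizer for untrusted memory text; markers and names are assumptions. */
final class SanitizeSketch {

  // Made-up fence markers; the module's real ones do not appear in this thread.
  static final String FENCE_OPEN = "<<<MEMORY";
  static final String FENCE_CLOSE = "MEMORY>>>";

  static String sanitize(String raw, int maxLength) {
    String s =
        raw
            // Collapse line and paragraph separators (CR, LF, NEL, U+2028, U+2029) to a space,
            // so an item cannot start a new line that looks like a bullet or a role marker.
            .replaceAll("[\\r\\n\\u0085\\u2028\\u2029]+", " ")
            // Drop format characters (category Cf): zero-width and bidi controls.
            .replaceAll("\\p{Cf}", "")
            // Drop any remaining C0/C1 control characters (category Cc).
            .replaceAll("\\p{Cc}", "")
            // Neutralize the exact fence markers only after hidden characters are gone,
            // so a marker split by a zero-width character is still caught.
            .replace(FENCE_OPEN, "[fence]")
            .replace(FENCE_CLOSE, "[fence]");
    // Length cap, counted in UTF-16 code units for simplicity.
    return s.length() <= maxLength ? s : s.substring(0, maxLength);
  }
}
```

The ordering matters in this sketch: stripping invisible characters before looking for fence markers means a marker cannot be smuggled past the check by interleaving zero-width characters.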

nebrass · 2026-06-18T17:36:11Z

Thanks @glaforge — good catch, and yes. Looking into it turned into a proper alignment plus building out the rest of the loop.

On the paper: arXiv:2509.25140 now has a camera-ready v2 (16 Mar 2026) and was accepted to ICLR 2026, and the blog accompanies the public release of the official reference implementation (google-research/reasoning-bank) — which didn't exist when I first opened this PR. One correction to my own PR while I was at it: the title was always "ReasoningBank: Scaling Agent Self-Evolving with Reasoning Memory" (v1 and v2 match on that) — the "Learning from the Traces of Thought" title in my original description was simply wrong, and I've fixed the description.

What it changed for the implementation — the official code + the blog's crystallized Title / Description / Content schema let me align the port and then realize the full closed loop:

Schema — replaced ReasoningStrategy (name/problemPattern/ordered steps, which was actually closer to Agent Workflow Memory, the baseline the paper positions against) with ReasoningMemoryItem (title/description/content) + provenance (sourceTraceId, judgeVerdict, …).
Learn from failure — first-class sourceTraceSuccessful; failure-derived items become preventative guardrails.
The loop — TrajectoryJudge (LLM-as-a-judge with the reference's asymmetric "mark failure when uncertain" rubric, plus a third INDETERMINATE state so a crashed judge mints nothing), LlmMemoryExtractor (the SUCCESSFUL_SI/FAILED_SI/PARALLEL_SI distillation prompts, capped + structured output), and a ReasoningBankPlugin that wires retrieve-before / opt-in consolidate-after through the plugin callbacks — no ADK core changes.
Safety — retrieved memory is injected as a de-privileged, fenced, structurally-contained untrusted-data turn (never a system instruction), with a per-run mint cap and failure trust-demotion; consolidation is opt-in and append-only by default behind a ConsolidationPolicy seam.
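
As an illustration of the ConsolidationPolicy seam, the sketch below models a policy as a function from the stored list to the list that should be kept. The type names are hypothetical, and the meaning given to boundedByCreatedAt(n) here (keep the n most recently created items) is an assumption, since the thread names the policy but does not define it.

```java
import java.time.Instant;
import java.util.Comparator;
import java.util.List;
import java.util.function.UnaryOperator;

/** Hypothetical consolidation seam; names and semantics are assumptions. */
final class ConsolidationSketch {

  record Item(String id, Instant createdAt) {}

  /** A policy rewrites the stored list after each append. */
  interface Policy extends UnaryOperator<List<Item>> {

    /** Append-only default: whatever was stored stays stored. */
    static Policy identity() {
      return items -> items;
    }

    /** Assumed meaning: keep only the n newest items, returned oldest-first. */
    static Policy boundedByCreatedAt(int n) {
      return items ->
          items.stream()
              .sorted(Comparator.comparing(Item::createdAt).reversed())
              .limit(n)
              .sorted(Comparator.comparing(Item::createdAt))
              .toList();
    }
  }
}
```

Keeping identity() as the default means that opting into consolidation never drops items unless a caller explicitly chooses a bounding policy.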

Kept dependency-free (the LLM impls use core's BaseLlm; embedding-based retrieval and MaTTS fan-out are noted as follow-ups). All behind tests.

This did grow the PR a fair bit — happy to split it (e.g. the schema alignment first, then the judge/extractor/plugin) if that's easier to review.

Merging main bumped the root POM to 1.4.1-SNAPSHOT, but this module's parent version was still 0.9.1-SNAPSHOT, breaking the reactor build. Align it with the root and the other contrib modules.

nebrass force-pushed the feature/reasoning-bank branch from e874b9e to 37a1f5c Compare January 5, 2026 14:19

gemini-code-assist Bot reviewed Jan 5, 2026


nebrass force-pushed the feature/reasoning-bank branch from 37a1f5c to c29b9c6 Compare January 5, 2026 14:23

glaforge added the contrib label Feb 11, 2026

nebrass force-pushed the feature/reasoning-bank branch 2 times, most recently from 067c700 to e49198b Compare February 11, 2026 14:53

glaforge force-pushed the feature/reasoning-bank branch from 087e713 to 373fe3d Compare March 18, 2026 11:35

nebrass added 5 commits March 18, 2026 18:29

Merge branch 'main' into feature/reasoning-bank

50f8398

Merge branch 'main' into feature/reasoning-bank

f0e041f

fix: update outdated parent POM version in reasoning-bank

d7340de

Merge branch 'main' into feature/reasoning-bank

226840b

Merge branch 'main' into feature/reasoning-bank

55d122f

nebrass force-pushed the feature/reasoning-bank branch from 8357b94 to e80d9c0 Compare June 18, 2026 13:52

nebrass added 4 commits June 18, 2026 21:35

nebrass added 2 commits June 18, 2026 21:36

Merge branch 'main' into feature/reasoning-bank

fe1e943

fix(reasoning-bank): bump parent POM to 1.4.1-SNAPSHOT after main merge

cc73f97

nebrass wants to merge 13 commits into google:main from nebrass:feature/reasoning-bank
