AnyGPT Platform — Modular Multi-Repo Development Plan

## Overview

AnyGPT is a document-based AI assistant platform, built on Jaseci (Jac + byLLM), where users upload files and get accurate, cited answers. The [existing MVP](https://github.com/Hushan-10/Any_GPT) already has a working RAG pipeline, FAISS vector search, CrossEncoder reranking, session management, and a full Jac-Client frontend.

This issue defines how we **modularize** the platform into **4 focused repos**, each owned by one team. The repos are designed to interconnect, and two of them (`anygpt-rag` and `anygpt-mcp`) can also be integrated standalone into other products.

---

## What Already Exists (MVP)

```
Any_GPT/
├── main.jac                         ← Entry point
├── services/
│   ├── server.jac                   ← Jac walkers (chat, upload, session, docs)
│   ├── server.impl.jac
│   ├── rag_engine.jac               ← RAG orchestrator (FAISS + CrossEncoder)
│   ├── ingestion/                   ← loaders, chunker, file_store
│   ├── retrieval/                   ← vector_store, reranker, retriever
│   └── models/                      ← tenant, document, session nodes
├── components/                      ← Jac-Client UI components
├── pages/                           ← ChatPage, LoginPage, RegisterPage
├── hooks/                           ← useAuth, useChat
├── services/anygptService.cl.jac    ← Frontend API service
└── config/anygpt.json               ← Platform config
```

**Stack**: Jac · byLLM · Jac Client · Mantine · FAISS · OpenAI Embeddings · CrossEncoder · HF TRL (planned) · Tauri 2.0 (planned)

---

## Current Architecture

```mermaid
graph TD
    subgraph CLIENT["Frontend  —  Jac Client + Mantine"]
        P1[LoginPage / RegisterPage]
        P2[ChatPage]
        C1[Sidebar]
        C2[ChatInput + ChatMessage]
        C3[FileUpload + DocumentList]
        H1[useAuth / useChat hooks]
        S1[anygptService]
    end

    subgraph BACKEND["Backend  —  Jac Walkers"]
        W1[interact walker\nchat with ReAct + streaming]
        W2[upload_document walker]
        W3[list/delete/reindex walkers]
        W4[get_user_sessions walker]
        N1[Session node]
        N2[ChatNode → byLLM ReAct]
    end

    subgraph RAG["RAG Engine"]
        R1[RagEngine\nper-user, shared CrossEncoder]
        R2[ingestion/\nloaders · chunker · file_store]
        R3[retrieval/\nvector_store · reranker · retriever]
        R4[FAISS Index\nOpenAI Embeddings]
    end

    subgraph STORAGE["Storage  —  Local FS"]
        S2[anygpt-data/tenants/\<tenant>/\nusers/\<user>/uploads/]
        S3[anygpt-data/tenants/\<tenant>/\nusers/\<user>/faiss_index/]
    end

    CLIENT --> BACKEND
    BACKEND --> RAG
    RAG --> STORAGE
```
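
The storage layout in the diagram can be expressed as a small path helper. This is a minimal sketch, not code from the MVP: `user_paths` and `_safe_segment` are hypothetical names, and the check against path traversal is an assumption about what per-user isolation would need.

```python
from pathlib import PurePosixPath

# Root directory name taken from the STORAGE subgraph above.
DATA_ROOT = PurePosixPath("anygpt-data")


def _safe_segment(value: str) -> str:
    """Reject identifiers that could escape the per-user directory (hypothetical check)."""
    if not value or value in {".", ".."} or "/" in value or "\\" in value:
        raise ValueError(f"unsafe path segment: {value!r}")
    return value


def user_paths(tenant: str, user: str) -> dict[str, PurePosixPath]:
    """Return the uploads and FAISS index directories for one user."""
    base = DATA_ROOT / "tenants" / _safe_segment(tenant) / "users" / _safe_segment(user)
    return {"uploads": base / "uploads", "faiss_index": base / "faiss_index"}


paths = user_paths("acme", "alice")
print(paths["uploads"])      # anygpt-data/tenants/acme/users/alice/uploads
print(paths["faiss_index"])  # anygpt-data/tenants/acme/users/alice/faiss_index
```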

---

## Target: 4-Repo Modular Structure

The split is driven by **team ownership**, **standalone reusability**, and **deployment independence** — not by number of files.

```mermaid
graph TD
    CORE["📦 any-gpt\nCore Platform\n(this repo, extended)"]
    RAG["📦 anygpt-rag\nRAG as standalone package\n🔌 pip installable"]
    MCP["📦 anygpt-mcp\nMCP Server + Connectors\n🔌 works with any MCP client"]
    TRAINER["📦 anygpt-slm-trainer\nSLM Auto-Specialization\nGPU pipeline, different lifecycle"]

    CORE -->|uses| RAG
    CORE -->|uses| MCP
    TRAINER -->|exports models to| CORE
    RAG -.->|standalone in any product| EXTERNAL[3rd Party Products]
    MCP -.->|standalone MCP server| EXTERNAL
```

---

## Repo 1: `any-gpt`  (Core Platform — this repo, extended)

> **Team**: Core + Frontend  
> **What it owns**: Platform walkers, UI, auth, multi-tenancy, admin dashboard, Tauri desktop, deployment configs

This is the main product repo. All platform-level features go here. The RAG engine is consumed as a package from `anygpt-rag`.

```mermaid
graph TD
    subgraph ANYG["any-gpt — Phase Roadmap"]
        PH1["✅ Phase 1  MVP\nRAG chat · doc upload · sessions\nFAISS · single tenant"]
        PH2["🔜 Phase 2  Multi-Tenant\nTenant isolation · auth (SSO/SAML)\nRBAC · per-tenant RAG config\nAdmin Dashboard"]
        PH3["🔜 Phase 3  Advanced AI\nQuery Router (SLM vs LLM)\nAgentic RAG Orchestrator\nMCP Context Engine\nSelf-RAG reflection"]
        PH4["🔜 Phase 4  Enterprise\nTauri desktop (offline llama.cpp)\nHelm / K8s deployment\nAir-gapped support\nSSE streaming"]
        PH1 --> PH2 --> PH3 --> PH4
    end
```

**Phase 2 additions**:
- `services/auth/` — SSO/SAML, RBAC, JWT
- `services/tenant/` — Tenant Manager walker (isolation · config · model selection)
- `pages/AdminPage.cl.jac` — Enterprise config dashboard
- Replace the in-repo FAISS engine with the `anygpt-rag` package (Qdrant or pgvector backend)

**Phase 3 additions**:
- `walkers/query_router.jac` — Domain classifier (in-domain → SLM, out-domain → LLM)
- `walkers/agentic_rag.jac` — Sub-task decomposition orchestrator
- `walkers/mcp_context.jac` — MCP Context Engine walker
- Consume `anygpt-mcp` as a sidecar

**Phase 4 additions**:
- `desktop/` — Tauri 2.0 app with local llama.cpp inference
- `deploy/` — Docker Compose + Helm charts

---

## Repo 2: `anygpt-rag`  (Standalone RAG Package 🔌)

> **Team**: AI/ML  
> **What it owns**: Document ingestion pipeline, vector search, reranking, advanced retrieval strategies  
> **Standalone**: `pip install anygpt-rag` — usable in any Python/Jac project

The ingestion + retrieval modules are **already well-isolated** in the current codebase. This repo extracts and evolves them independently.

```mermaid
graph LR
    subgraph RAG_PKG["anygpt-rag package"]
        I1[ingestion/\nloaders · chunker · file_store]
        I2[retrieval/\nvector_store · reranker · retriever]
        I3[RagEngine\norchestrator]

        subgraph BACKENDS["Pluggable Backends"]
            B1[FAISS\ncurrent Phase 1]
            B2[Qdrant\nPhase 2]
            B3[pgvector\nPhase 2]
        end

        subgraph STRATEGIES["Retrieval Strategies"]
            S1[BM25 + Dense\ncurrent]
            S2[RAPTOR\nPhase 2]
            S3[Self-RAG\nPhase 2]
            S4[Contextual Chunking\nPhase 2]
        end
    end

    I1 --> I3
    I2 --> I3
    I3 --> BACKENDS
    I3 --> STRATEGIES
```

**Phase roadmap**:
- Phase 1: Extract from Any_GPT as-is (FAISS + CrossEncoder)
- Phase 2: Pluggable backends (Qdrant, pgvector), RAPTOR, Self-RAG, BM25 hybrid
- Phase 3: Per-tenant isolated indexes, multimodal (image/table extraction)

**Standalone usage**:
```python
from anygpt_rag import RagEngine
engine = RagEngine(backend="qdrant", tenant_id="acme")
engine.ingest_file("docs/manual.pdf")
results = engine.search("How do I reset my password?")
```
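
One way the pluggable backends could be structured is a shared interface plus a name-to-class registry. The sketch below is an assumption, not the package's actual API: `VectorBackend`, `InMemoryBackend`, `BACKENDS`, and `make_backend` are hypothetical names, and the brute-force cosine search only stands in for FAISS, Qdrant, or pgvector.

```python
import math
from typing import Protocol


class VectorBackend(Protocol):
    """Hypothetical interface every backend would implement."""

    def add(self, doc_id: str, vector: list[float]) -> None: ...
    def search(self, query: list[float], k: int) -> list[tuple[str, float]]: ...


def _cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class InMemoryBackend:
    """Brute-force cosine search; a stand-in for a real vector store."""

    def __init__(self) -> None:
        self._vectors: dict[str, list[float]] = {}

    def add(self, doc_id: str, vector: list[float]) -> None:
        self._vectors[doc_id] = vector

    def search(self, query: list[float], k: int) -> list[tuple[str, float]]:
        scored = [(doc_id, _cosine(query, vec)) for doc_id, vec in self._vectors.items()]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:k]


# Registry keyed by backend name; real backends would be registered the same way.
BACKENDS: dict[str, type] = {"memory": InMemoryBackend}


def make_backend(name: str) -> VectorBackend:
    try:
        return BACKENDS[name]()
    except KeyError:
        raise ValueError(f"unknown backend {name!r}; available: {sorted(BACKENDS)}") from None


backend = make_backend("memory")
backend.add("a", [1.0, 0.0])
backend.add("b", [0.0, 1.0])
top = backend.search([0.9, 0.1], k=1)
print(top[0][0])  # a
```

With this shape, an orchestrator only depends on `add` and `search`, so swapping the backend is a configuration change rather than a code change.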

---

## Repo 3: `anygpt-mcp`  (MCP Server + Connectors 🔌)

> **Team**: Integrations  
> **What it owns**: AnyGPT as an MCP server, connectors to external MCP servers, tool actions  
> **Standalone**: Run as an MCP server over Streamable HTTP or STDIO — works with Claude, Cursor, or any MCP client

```mermaid
graph TD
    subgraph MCP_SERVER["AnyGPT MCP Server"]
        R1[Resources\nDocs as MCP resources]
        T1[Tools\nSearch-as-tool · Q&A-as-tool]
    end

    subgraph CONNECTORS["External MCP Connectors"]
        E1[Slack]
        E2[Confluence · Notion]
        E3[GitHub · GitLab]
        E4[Google Drive]
        E5[Jira · Linear]
    end

    subgraph ACTIONS["MCP Tool Actions"]
        A1[DB Writes]
        A2[Webhook Triggers]
        A3[Workflow Automation]
    end

    MCP_CLIENT[Claude · Cursor · Any MCP Client] --> MCP_SERVER
    MCP_SERVER --> CONNECTORS
    MCP_SERVER --> ACTIONS
```

**Standalone usage** — expose your AnyGPT knowledge base as MCP:
```bash
# Run as standalone MCP server (Streamable HTTP)
anygpt-mcp serve --port 3000 --docs ./my-docs/

# Or via STDIO (for Claude Desktop, Cursor)
anygpt-mcp stdio --docs ./my-docs/
```

---

## Repo 4: `anygpt-slm-trainer`  (SLM Auto-Specialization)

> **Team**: AI/ML (GPU workloads)  
> **What it owns**: Auto-specialization pipeline — synthetic data generation, SFT+DPO training, eval, model export  
> **Why separate**: Different infrastructure (GPU), different release cycle, different team skills (ML engineering vs app dev)

```mermaid
graph LR
    subgraph PIPELINE["SLM Training Pipeline"]
        T1["Step 1\nSynthetic Data Gen\nOpus crawls docs → 5K–50K Q&A pairs\n(multi-strategy)"]
        T2["Step 2\nAgent Answer Gen\nOpus + Agentic RAG → answers\nRejection sampling k=8 + DPO pairs"]
        T3["Step 3\nSFT + DPO Training\nQLoRA 4-bit → SFT on Q&A\n→ DPO alignment on preference pairs"]
        T4["Step 4\nEval Loop\nCompare SLM vs Opus\nPass → Deploy  |  Fail → Retrain"]
        T1 --> T2 --> T3 --> T4
    end

    subgraph BASE["Base Model Options"]
        M1[Phi-4-Mini 3.8B]
        M2[Qwen 2.5 7B]
        M3[Llama 3.2 3B]
        M4[Mistral 7B]
    end

    subgraph EXPORT["Model Export"]
        E1[GGUF Q4_K_M → Local/Desktop]
        E2[ONNX → Edge]
        E3[vLLM → Cloud Serving]
    end

    BASE --> PIPELINE
    T4 --> EXPORT
    EXPORT -->|push to| REG[Model Registry\nin any-gpt]
```
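
Step 2 pairs rejection sampling with DPO preference pairs. A minimal sketch of that idea follows, assuming a caller-supplied generator and scorer: `build_preference_pair`, `PreferencePair`, the `min_score` cut-off, and the toy `generate`/`score` functions are all hypothetical, and only `k=8` comes from the diagram.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class PreferencePair:
    prompt: str
    chosen: str
    rejected: str


def build_preference_pair(
    prompt: str,
    generate: Callable[[str, int], str],
    score: Callable[[str, str], float],
    k: int = 8,
    min_score: float = 0.5,  # hypothetical acceptance threshold
) -> Optional[PreferencePair]:
    """Sample k answers, keep the best as 'chosen' and the worst as 'rejected'."""
    if k < 2:
        raise ValueError("need at least two candidates to form a pair")
    scored = [(score(prompt, answer), answer)
              for answer in (generate(prompt, i) for i in range(k))]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    (best_score, best), (worst_score, worst) = scored[0], scored[-1]
    # Reject the whole sample if nothing is good enough or there is no preference signal.
    if best_score < min_score or best_score == worst_score:
        return None
    return PreferencePair(prompt, chosen=best, rejected=worst)


# Toy stand-ins: candidate i scores i / 10.
def generate(prompt: str, i: int) -> str:
    return f"candidate {i}"


def score(prompt: str, answer: str) -> float:
    return int(answer.split()[-1]) / 10


pair = build_preference_pair("How do I reset my password?", generate, score)
print(pair.chosen, "|", pair.rejected)  # candidate 7 | candidate 0
```

In a real pipeline the generator would be the agentic RAG answerer and the scorer a judge or verifier; the accepted `chosen` answers would feed SFT and the pairs would feed DPO.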

**Trigger**: Automatically invoked by `any-gpt`'s Auto-Specialize Trigger walker when:
- New documents are ingested beyond a threshold
- SLM accuracy drops below a threshold vs LLM
- Tenant explicitly requests a re-specialization
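
The three trigger conditions above can be sketched as a single check. The names `TenantStats` and `respecialize_reasons`, and both default thresholds, are hypothetical; the issue does not specify the actual values.

```python
from dataclasses import dataclass


@dataclass
class TenantStats:
    new_docs_since_training: int
    slm_accuracy: float  # accuracy of the specialized SLM on an eval set
    llm_accuracy: float  # accuracy of the reference LLM on the same set
    manual_request: bool = False


def respecialize_reasons(
    stats: TenantStats,
    doc_threshold: int = 100,        # hypothetical value
    max_accuracy_gap: float = 0.05,  # hypothetical value
) -> list[str]:
    """Return every trigger condition that currently holds (empty list = no retrain)."""
    reasons = []
    if stats.new_docs_since_training > doc_threshold:
        reasons.append("new_documents")
    if stats.llm_accuracy - stats.slm_accuracy > max_accuracy_gap:
        reasons.append("accuracy_drop")
    if stats.manual_request:
        reasons.append("tenant_request")
    return reasons


print(respecialize_reasons(TenantStats(150, 0.80, 0.90)))  # ['new_documents', 'accuracy_drop']
print(respecialize_reasons(TenantStats(10, 0.90, 0.90)))   # []
```

Returning the list of reasons rather than a bare boolean keeps the trigger auditable per tenant.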

---

## How the Repos Connect

```mermaid
sequenceDiagram
    participant U as User
    participant CORE as any-gpt
    participant RAG as anygpt-rag
    participant MCP as anygpt-mcp
    participant TRAINER as anygpt-slm-trainer

    U->>CORE: Upload docs + chat
    CORE->>RAG: ingest_file() / search()
    RAG-->>CORE: results + citations
    CORE->>MCP: fetch external context\n(Slack, Confluence, Notion)
    MCP-->>CORE: additional context chunks
    CORE-->>U: streamed answer (SSE)

    Note over CORE,TRAINER: Background — triggered async
    CORE->>TRAINER: auto_specialize(tenant_id, docs)
    TRAINER->>TRAINER: data gen → train → eval
    TRAINER-->>CORE: push GGUF model to registry
    CORE->>CORE: route in-domain queries to SLM
```

---

## Integration with External Products

Modules marked 🔌 are designed to be **dropped into any product** without AnyGPT:

| Module | How to use standalone |
|--------|----------------------|
| `anygpt-rag` | `pip install anygpt-rag` — RAG pipeline in any Python app |
| `anygpt-mcp` | Run as MCP server (Streamable HTTP or STDIO), works with Claude, Cursor, any MCP client |

---

## Multi-Tenant Isolation (Phase 2)

```mermaid
graph LR
    subgraph TA["Tenant A  —  Branded"]
        TA1[Custom UI · Logo · Colors]
        TA2[API /api/tenant-a/*]
        TA3[Isolated Vector Index]
        TA4[SLM v2 · Phi-4-Mini]
        TA5["RAPTOR=ON · Self-RAG=ON · Model=Claude"]
    end
    subgraph TB["Tenant B  —  White-Label"]
        TB1[Custom UI · Full Rebrand]
        TB2[API /api/tenant-b/*]
        TB3[Isolated Vector Index]
        TB4[SLM v1 · Qwen 2.5 7B]
        TB5["RAPTOR=OFF · Self-RAG=ON · Model=GPT-4"]
    end
```
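
The per-tenant settings in the diagram could be modeled as an immutable config record. This is a sketch under stated assumptions: `TenantConfig`, `TENANTS`, and `config_for` are hypothetical names, and only the field values and the `/api/<tenant>` prefix are taken from the diagram.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TenantConfig:
    tenant_id: str
    slm: str        # specialized small model for this tenant
    model: str      # cloud LLM used as fallback
    raptor: bool
    self_rag: bool

    @property
    def api_prefix(self) -> str:
        return f"/api/{self.tenant_id}"


# Values mirror Tenant A and Tenant B in the diagram.
TENANTS = {
    cfg.tenant_id: cfg
    for cfg in (
        TenantConfig("tenant-a", slm="Phi-4-Mini", model="Claude", raptor=True, self_rag=True),
        TenantConfig("tenant-b", slm="Qwen 2.5 7B", model="GPT-4", raptor=False, self_rag=True),
    )
}


def config_for(tenant_id: str) -> TenantConfig:
    """Look up a tenant's config, failing loudly instead of falling back to another tenant."""
    try:
        return TENANTS[tenant_id]
    except KeyError:
        raise LookupError(f"unknown tenant: {tenant_id!r}") from None


cfg = config_for("tenant-b")
print(cfg.api_prefix, cfg.raptor)  # /api/tenant-b False
```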

---

## Query Flow (Phase 3)

```mermaid
sequenceDiagram
    participant U as User
    participant GW as Auth + Tenant Router
    participant DC as Domain Classifier
    participant SLM as Specialized SLM
    participant LLM as Cloud LLM
    participant RAG as RAG Retrieval
    participant SR as Self-RAG Reflection
    participant SSE as SSE Stream

    U->>GW: Query
    GW->>DC: Classify domain
    alt In-Domain (trained docs)
        DC->>SLM: Route to SLM
    else Out-of-Domain
        DC->>LLM: Route to LLM
    end
    SLM->>RAG: Retrieve context
    RAG->>SR: Self-RAG reflection check
    SR->>SSE: Validated response
    SSE->>U: Stream chunks
```
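
The routing decision made by the Domain Classifier can be sketched as a score plus a threshold. Everything here is illustrative: `route_query` and `vocabulary_overlap` are hypothetical names, the 0.3 threshold is arbitrary, and the vocabulary-overlap scorer is a toy stand-in for whatever classifier the platform ends up using.

```python
from typing import Callable


def vocabulary_overlap(vocabulary: set[str]) -> Callable[[str], float]:
    """Toy in-domain scorer: fraction of query tokens found in the tenant's vocabulary."""

    def score(query: str) -> float:
        tokens = [t.strip("?.,!").lower() for t in query.split()]
        tokens = [t for t in tokens if t]
        if not tokens:
            return 0.0
        return sum(t in vocabulary for t in tokens) / len(tokens)

    return score


def route_query(query: str, in_domain_score: Callable[[str], float], threshold: float = 0.3) -> str:
    """Return 'slm' for in-domain queries and 'llm' otherwise."""
    return "slm" if in_domain_score(query) >= threshold else "llm"


scorer = vocabulary_overlap({"reset", "password", "account", "login"})
print(route_query("How do I reset my password?", scorer))     # slm
print(route_query("What is the capital of France?", scorer))  # llm
```

Keeping the scorer injectable means the threshold logic stays the same if the toy scorer is later replaced by an embedding-based or learned classifier.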

---

## Development Phases

```mermaid
gantt
    title AnyGPT — Development Roadmap
    dateFormat  YYYY-MM-DD
    section Phase 1  MVP  ✅
    any-gpt core (done)           :done, p1, 2025-01-01, 2025-04-01
    section Phase 2  Multi-Tenant
    anygpt-rag (extract + Qdrant) :p2a, 2025-04-01, 6w
    any-gpt auth + tenant walkers :p2b, 2025-04-01, 6w
    any-gpt admin dashboard       :p2c, after p2b, 4w
    section Phase 3  Advanced AI
    anygpt-mcp server             :p3a, after p2a, 5w
    any-gpt query router + Self-RAG :p3b, after p2b, 5w
    section Phase 4  SLM + Enterprise
    anygpt-slm-trainer pipeline   :p4a, after p2a, 8w
    any-gpt Tauri desktop         :p4b, after p3b, 6w
    any-gpt Helm + K8s deploy     :p4c, after p3b, 4w
```

---

## Checklist

- [ ] **Phase 2** — Extract `anygpt-rag` from `Any_GPT/services/ingestion` + `Any_GPT/services/retrieval`
- [ ] **Phase 2** — Add auth, RBAC, tenant isolation walkers to `any-gpt`
- [ ] **Phase 2** — Build admin dashboard (`AdminPage.cl.jac`)
- [ ] **Phase 3** — Create `anygpt-mcp` repo: MCP server + Slack/Confluence/Notion connectors
- [ ] **Phase 3** — Add Query Router + Self-RAG walkers to `any-gpt`
- [ ] **Phase 4** — Create `anygpt-slm-trainer` repo: data gen → SFT+DPO → eval → export
- [ ] **Phase 4** — Tauri 2.0 desktop app (offline llama.cpp via GGUF export)
- [ ] **Phase 4** — Helm charts + K8s manifests, air-gapped deployment
