Service topology
Six services in one Docker Compose stack. All internal traffic crosses the dokploy-network bridge.
┌──────────────┐
│ Traefik │ (managed by Dokploy)
│ reverse │
│ proxy │
└──┬────┬───┬──┘
ai-kb.../ │ │ │ lumen-docs...
│ │ │
▼ │ ▼
┌──────────┐ │ ┌──────────┐
│ lumen- │ │ │ lumen- │
│ web │ │ │ docs │
│ Next 15 │ │ │ Next+MDX │
└──┬───────┘ │ └──────────┘
│ │
│ lumen-api...
│ ▼
│ ┌──────────┐
└──▶│ lumen- │
│ api │
│ Hono/Bun │
└──┬───┬───┘
│ │
┌─────────┘ └────────┐
▼ ▼
┌─────────┐ ┌──────────┐
│ Postgres│ │ Redis │
│ +pgvec │ │ (BullMQ) │
└─────────┘ └──┬───────┘
▲ │
│ ▼
│ ┌───────────────┐
│ │ lumen-worker │
│ │ parse/chunk/ │
│ │ embed │
└──────────┤ │
└──────┬────────┘
│
▼
┌──────────────┐
│ lumen- │
│ embedder │
│ FastAPI │
└──────────────┘
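In Compose terms, the bridge in the diagram is typically declared as an external network that Dokploy owns. A minimal sketch (service names from the diagram; everything else is an assumption, not the actual Compose file):

```yaml
networks:
  dokploy-network:
    external: true   # created and managed by Dokploy, not by this stack

services:
  lumen-web:
    networks: [dokploy-network]
  lumen-api:
    networks: [dokploy-network]
  # lumen-docs, lumen-worker, lumen-embedder, Postgres and Redis
  # join the same network so internal traffic stays on the bridge
```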
Request paths
Read (chat)
- Browser → `lumen-web` → React renders page
- `lumen-web` fetches data from `lumen-api` via CORS
- API embeds question → queries `lumen-embedder` for the query vector
- API hybrid-searches Postgres (pgvector cosine + tsvector BM25)
- Reranker call to embedder → top 5-8 chunks
- API assembles prompt + streams from the LLM provider (resolved from the `llm_providers` table)
- SSE stream flows back through web to browser
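One way to combine the pgvector and BM25 result lists is reciprocal rank fusion. A sketch of that merge step, assuming the API fuses ranked chunk-id lists in application code (function and variable names are illustrative; the real API may instead fuse scores directly in SQL):

```typescript
// Reciprocal rank fusion: score(id) = sum over lists of 1 / (k + rank).
// A chunk ranked well by BOTH retrievers rises to the top.
function fuseRanks(vectorIds: string[], bm25Ids: string[], k = 60): string[] {
  const scores = new Map<string, number>();
  for (const ids of [vectorIds, bm25Ids]) {
    ids.forEach((id, rank) => {
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank + 1));
    });
  }
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1]) // highest fused score first
    .map(([id]) => id);
}

// c2 appears in both lists, so it outranks everything else.
const fused = fuseRanks(["c1", "c2", "c3"], ["c2", "c4"]);
// → ["c2", "c1", "c4", "c3"]
```

The constant `k` damps the influence of any single list's top ranks; 60 is the value commonly used in the RRF literature.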
Write (document upload)
- Browser → `lumen-api` POST `/documents/projects/:id`
- API stores the file under `/data/uploads/<project>/<doc>/`
- API writes a DB row with `status = "processing"`
- API enqueues a `document-processing` job on Redis
- `lumen-worker` picks up the job, parses the file, chunks, embeds via `lumen-embedder`
- Worker inserts chunks into Postgres, updates the doc to `status = "indexed"`
- Frontend polls doc status or uses the status widget
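The worker's chunking step can be sketched as fixed-size windows with overlap, so context isn't lost at chunk boundaries. Sizes and the splitting strategy here are assumptions (the real worker likely splits on sentence or token boundaries, not raw characters):

```typescript
// Split text into overlapping windows: each chunk starts
// (size - overlap) characters after the previous one.
function chunkText(text: string, size = 20, overlap = 5): string[] {
  const chunks: string[] = [];
  for (let start = 0; start < text.length; start += size - overlap) {
    chunks.push(text.slice(start, start + size));
    if (start + size >= text.length) break; // last window reached the end
  }
  return chunks;
}

// Adjacent chunks share `overlap` characters of context.
const parts = chunkText("abcdefghij", 4, 1);
// → ["abcd", "defg", "ghij"]
```

Each chunk would then be sent to `lumen-embedder` and inserted into Postgres alongside its vector.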
Public share
- Guest → `lumen-web` at `/share/<token>`
- `lumen-web` fetches `/share/<token>` on the API (no auth)
- API validates the token, returns a conversation snapshot
- Guest chats via SSE, limited to the frozen snapshot's project context
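The token-validation step might look like the sketch below. The snapshot shape, the in-memory `Map` standing in for the shares table, and the expiry field are all hypothetical; the doc doesn't show the real schema:

```typescript
// Hypothetical shape of a frozen share snapshot.
interface ShareSnapshot {
  projectId: string;   // guest chat is scoped to this project only
  messages: string[];  // conversation state frozen at share time
  expiresAt: number;   // epoch ms
}

// Stand-in for the shares table in Postgres.
const shares = new Map<string, ShareSnapshot>();

// Unknown or expired tokens resolve to null; the API would answer 404.
function resolveShare(token: string, now = Date.now()): ShareSnapshot | null {
  const snap = shares.get(token);
  if (!snap || snap.expiresAt <= now) return null;
  return snap;
}

shares.set("tok1", { projectId: "p1", messages: ["hi"], expiresAt: Date.now() + 60_000 });
const snap = resolveShare("tok1"); // valid → returns the snapshot
```

Because the snapshot is frozen, a guest can never widen the context beyond the project state captured when the link was created.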
Data stores
| Volume | Contents |
|---|---|
| lumen-postgres volume | All relational data + pgvector embeddings |
| lumen-redis volume | BullMQ queue state |
| upload_data volume | Raw document files (PDFs, docx, etc.) |
| model_cache volume | Embedder model weights — persists across deploys |
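The table above maps onto Compose volume declarations roughly like this. Mount paths on the embedder and API are assumptions, except `/data/uploads`, which the write path above references:

```yaml
volumes:
  lumen-postgres:
  lumen-redis:
  upload_data:
  model_cache:

services:
  lumen-api:
    volumes:
      - upload_data:/data/uploads   # matches /data/uploads/<project>/<doc>/ above
  lumen-embedder:
    volumes:
      - model_cache:/root/.cache    # assumed mount point; weights survive redeploys
```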
What's NOT in the stack
- No external object storage (S3) — uploads on Docker volume. Add later if scale demands.
- No external auth provider — custom JWT. Swap for Clerk/Auth0 later if multi-tenant.
- No CDN — Traefik serves Next static directly. Static assets are minimal (the UI isn't image-heavy).
- No separate search cluster — pgvector on the same DB handles vector + BM25.
This is a deliberately small stack that can grow. Current hardware: a single VPS (jaeger). When scale demands it: split out Postgres, add read replicas, move uploads to S3, run the worker on a separate host. But not today.