Tech stack
Not "what's hot" — what's actually running.
Languages & runtimes
| Component | Language | Runtime | Why |
|---|---|---|---|
| apps/web | TypeScript | Node 20 (build) | Next 15 App Router, React 19, Tailwind 4 |
| apps/api | TypeScript | Bun 1.1+ | Fast startup, native fetch, Hono is fast on Bun |
| apps/embedder | Python 3.11 | uvicorn | sentence-transformers ecosystem is Python-first |
| apps/worker | Python 3.11 | plain | PyMuPDF / python-docx for parsing, BullMQ client |
| apps/docs | TypeScript | Node 20 (build) | Next 15 + @next/mdx — shared tokens with apps/web |
Framework choices
Frontend — Next.js 15 App Router
Server components by default; the `use client` directive opts into interactivity. No custom build tooling — `next dev` with Turbopack is the one dev command.
Why not Remix / SvelteKit? Gani already ships Next 15 in other projects; velocity beats novelty.
Backend — Hono on Bun
Hono is a small, fast, middleware-driven framework. It works the same on Bun, Node, Cloudflare Workers — no lock-in. We run on Bun for speed; prod can switch to Node if Bun shows issues.
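The no-lock-in claim rests on the Web-standard fetch handler interface (`Request` in, `Response` out) that Bun, Cloudflare Workers, and Node 18+ all speak — a Hono app exports this shape via `app.fetch`. A hand-rolled sketch of that interface (a stand-in, not our actual routes):

```typescript
// A Web-standard fetch handler: (Request) => Response.
// Hono apps compile down to exactly this shape, which is why the
// same code runs on Bun, Node (with an adapter), and Workers.
const handler = async (req: Request): Promise<Response> => {
  const url = new URL(req.url);
  if (url.pathname === "/health") {
    return Response.json({ ok: true });
  }
  return new Response("not found", { status: 404 });
};

// Bun serves it directly: Bun.serve({ fetch: handler })
```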
Why not tRPC / Express / NestJS?
- tRPC — great for TS-only apps, but the docs site and future mobile client benefit from a plain REST API. HTTP + zod schemas give us the same type safety with wider reach.
- Express — ancient middleware model, no first-class TS, slower on Bun.
- NestJS — too much ceremony for a solo-tenant dev-phase app.
DB — Postgres 16 + pgvector
One database for everything: relational (users, projects, grants), full-text (tsvector ranking for the keyword side of hybrid search), vectors (pgvector for embeddings). No separate vector DB.
Why not Pinecone / Weaviate / Qdrant?
- Backup is a single `pg_dump` instead of two.
- Joins between chunks and documents work natively.
- pgvector performance is good up to ~1M vectors — we're far from that.
ORM — Prisma
Type-safe client, migration system, schema-as-source-of-truth. One caveat: Prisma cannot cast params to the `vector` type, so vector search uses an inline embedding literal in raw SQL. See RAG pipeline for the query shape.
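Because the embedding can't be bound as a query parameter, it has to be serialized into the SQL text and cast with `::vector`. A minimal sketch of that workaround (the helper name and table/column names are illustrative, not the actual schema):

```typescript
// Hypothetical helper: serialize a number[] into pgvector's text
// format ("[0.1,0.2,...]") so it can be inlined into raw SQL and
// cast with ::vector. Rejecting non-finite values keeps the inlined
// literal safe, since only digits, dots, commas, and minus signs
// can reach the query string.
function toVectorLiteral(embedding: number[]): string {
  if (embedding.some((x) => !Number.isFinite(x))) {
    throw new Error("embedding must contain only finite numbers");
  }
  return `[${embedding.join(",")}]`;
}

// Shape of the raw query (Prisma call shown for context only;
// <=> is pgvector's cosine-distance operator):
// await prisma.$queryRawUnsafe(`
//   SELECT id, content
//   FROM chunks
//   ORDER BY embedding <=> '${toVectorLiteral(vec)}'::vector
//   LIMIT 50`);
```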
Queue — BullMQ + Redis
Document processing is async (parse → chunk → embed). BullMQ handles retries, concurrency limits, job history. Redis is also used as a simple cache when needed.
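A sketch of what an ingest job and its retry policy might look like (the queue name, job name, and stage list are illustrative, not the actual config; `attempts`/`backoff` are standard BullMQ job options):

```typescript
// Illustrative ingest job payload for the parse → chunk → embed
// pipeline. The retry policy lets BullMQ absorb transient parser
// or embedder failures with exponential backoff.
interface IngestJob {
  name: string;
  data: { documentId: string; stages: string[] };
  opts: { attempts: number; backoff: { type: "exponential"; delay: number } };
}

function buildIngestJob(documentId: string): IngestJob {
  return {
    name: "ingest-document",
    data: { documentId, stages: ["parse", "chunk", "embed"] },
    opts: { attempts: 3, backoff: { type: "exponential", delay: 5_000 } },
  };
}

// Enqueueing (needs bullmq + Redis, shown for context only):
// const job = buildIngestJob("doc_123");
// await new Queue("documents").add(job.name, job.data, job.opts);
```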
Embedder — multilingual-e5-small
384-dim vectors. Supports 100+ languages including Indonesian, Chinese, English. Requires E5 convention:
- Passages (indexed chunks): `"passage: <content>"`
- Queries: `"query: <user question>"`
Mixing the prefixes degrades retrieval quality — both backend and worker enforce this.
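One way to enforce the convention is to centralize the prefixing so indexing and querying can't drift apart (helper names here are illustrative, not the actual enforcement code):

```typescript
// Apply the E5 prefix convention in one place: chunks get
// "passage: ", user questions get "query: ".
const asPassage = (content: string): string => `passage: ${content}`;
const asQuery = (question: string): string => `query: ${question}`;

// Guard to run before any text is sent to the embedder.
const hasE5Prefix = (text: string): boolean =>
  text.startsWith("passage: ") || text.startsWith("query: ");
```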
Reranker — BGE-reranker-v2-m3
Cross-encoder second-stage ranker. Takes top-50 hybrid candidates, scores them against the query, returns top 5-8 for LLM context. Language-agnostic.
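The surrounding logic is simple enough to sketch; here the cross-encoder call itself is stubbed as a `score` callback (in production that would be a BGE-reranker-v2-m3 inference call), and only the score-sort-truncate step is real:

```typescript
// Second-stage reranking: score each hybrid candidate against the
// query with a cross-encoder, then keep the top-k for LLM context.
interface Candidate { id: string; text: string }
interface Scored extends Candidate { score: number }

function rerank(
  candidates: Candidate[],
  score: (query: string, text: string) => number, // model call, stubbed
  query: string,
  k = 8,
): Scored[] {
  return candidates
    .map((c) => ({ ...c, score: score(query, c.text) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k);
}
```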
LLM — configurable per deploy
OpenAI-compatible API. Defaults to DeepSeek via the enowxai proxy at 100.115.42.124:1430 (Gani's homelab). Provider config lives in the llm_providers table, editable from /engineer/providers.
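Since every provider speaks the OpenAI chat-completions shape, switching providers is just a row change. A sketch of building the request from a provider record (the `baseUrl`/`model`/`apiKey` fields are assumptions about the `llm_providers` columns, not the actual schema):

```typescript
// Build an OpenAI-compatible chat request from whatever provider
// row is currently configured. Field names are illustrative.
interface LlmProvider { baseUrl: string; model: string; apiKey: string }

function buildChatRequest(p: LlmProvider, userMessage: string) {
  return {
    url: `${p.baseUrl}/v1/chat/completions`,
    init: {
      method: "POST",
      headers: {
        "content-type": "application/json",
        authorization: `Bearer ${p.apiKey}`,
      },
      body: JSON.stringify({
        model: p.model,
        messages: [{ role: "user", content: userMessage }],
      }),
    },
  };
}

// Usage: const { url, init } = buildChatRequest(provider, "hello");
// const res = await fetch(url, init);
```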
What we don't use (and why)
- React Server Actions — RSC is fine, but server actions couple business logic to routes in ways that make the API hard to test and hard to expose to mobile later. We keep it REST.
- SWR mutation hooks for CUD — `useUsers()` returns read-only SWR `data`; creates/updates/deletes use `api.fetch` directly. Simpler, and SWR mutations encourage optimistic UI that's wrong for single-tenant admin flows.
- GraphQL — over-engineering for a focused surface area. If a future feature needs it, we'll add a `/graphql` endpoint with `graphql-yoga` on Hono.
- tRPC everywhere — see above.
- Separate auth service — custom JWT with 15m access + 7d refresh is adequate for the dev phase. A future multi-tenant rewrite can drop in Clerk or similar.
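The custom-JWT scheme above boils down to two TTLs. A minimal sketch of just the expiry math (signing and verification, e.g. with a JWT library, are omitted):

```typescript
// Token lifetimes for the custom JWT scheme: short-lived access
// token, week-long refresh token. Values in seconds.
const ACCESS_TTL_S = 15 * 60;           // 15 minutes
const REFRESH_TTL_S = 7 * 24 * 60 * 60; // 7 days

// True once `ttlS` seconds have elapsed since `issuedAtS`.
function isExpired(issuedAtS: number, ttlS: number, nowS: number): boolean {
  return nowS >= issuedAtS + ttlS;
}
```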
Versions (pinned in `package.json`)
- Next.js `^15.3.2`
- React `^19.1.0`
- Tailwind `^4.1.5` (new `@theme inline` syntax)
- Prisma `^6.8.0`
- Hono `^4.x` (check `apps/api/package.json`)
- Bun `1.1+`