Architecture

Xiajiao follows one rule: ship full functionality with the least code and the fewest dependencies.

(Screenshot: Xiajiao IM main UI)

This is a practical architecture, not a showcase. For product behavior see Tool calling, Agent memory, RAG, Multi-agent chat, and Collaboration flow.

Design philosophy

Three rules

  1. Prefer the standard library — node:http instead of Express, node:test instead of Jest, node:crypto instead of uuid
  2. Prefer one process — no distributed stack for this workload
  3. Prefer the filesystem — SQLite instead of PostgreSQL, files instead of Redis

Why?

Benefit            Explanation
Simple deploy      npm start; no multi-service compose required
Smaller risk       Six dependencies, tiny supply-chain surface
Lower maintenance  Few packages to track for security
Understandable     Clear modules, readable structure
Portable           Copy the folder; minimal external state

System overview

┌──────────────────────────────────────────────┐
│  Browser client (Vanilla JS + CSS)           │
│  ├── Message list + Markdown rendering       │
│  ├── Contacts (agents / groups)              │
│  ├── Settings                                │
│  └── Collaboration flow panel                │
└──────────┬──────────────┬────────────────────┘
           │ HTTP/REST    │ WebSocket
┌──────────▼──────────────▼────────────────────┐
│  Node.js server (single process)             │
│                                              │
│  ┌────────────┐  ┌────────────┐              │
│  │ HTTP routes│  │ WebSocket  │              │
│  │ (node:http)│  │ (ws)       │              │
│  └──────┬─────┘  └──────┬─────┘              │
│         │               │                    │
│  ┌──────▼───────────────▼─────┐              │
│  │       Business logic       │              │
│  │                            │              │
│  │  ┌─────────┐ ┌───────────┐ │              │
│  │  │ LLM     │ │ Tools     │ │              │
│  │  │ (multi) │ │(7+custom) │ │              │
│  │  └─────────┘ └───────────┘ │              │
│  │                            │              │
│  │  ┌─────────┐ ┌───────────┐ │              │
│  │  │ Memory  │ │ RAG       │ │              │
│  │  │(3 types)│ │ (hybrid)  │ │              │
│  │  └─────────┘ └───────────┘ │              │
│  │                            │              │
│  │  ┌─────────┐ ┌───────────┐ │              │
│  │  │ Chains  │ │ Schedules │ │              │
│  │  └─────────┘ └───────────┘ │              │
│  └─────────────┬──────────────┘              │
│                │                             │
│  ┌─────────────▼────────────────┐            │
│  │        Data layer            │            │
│  │  SQLite (WAL + FTS5)         │            │
│  │  + filesystem (SOUL.md / RAG)│            │
│  └──────────────────────────────┘            │
│                                              │
└──────────────────────────────────────────────┘


  ┌──────────────┐
  │ LLM provider │  OpenAI / Claude / Qwen / Ollama / …
  └──────────────┘

Repository layout

xiajiao/
├── server/
│   ├── index.js               # Entry — HTTP + WebSocket bootstrap
│   ├── storage.js             # Data — SQLite + agent files
│   ├── ws.js                  # WebSocket — live pushes
│   │
│   ├── router.js              # Route dispatch
│   ├── routes/                # REST route modules
│   │   └── settings.js        # Settings + HTTP tools API
│   │
│   ├── services/
│   │   ├── llm.js             # LLM — providers, stream, tool loop
│   │   ├── tool-registry.js   # Centralized tool registration + ACL
│   │   ├── http-tool-engine.js # HTTP custom tools (zero-code API bridge)
│   │   ├── mcp-manager.js     # MCP server connections
│   │   ├── channel-engine.js  # External IM channel management
│   │   ├── tools/             # Built-in tool modules (auto-scanned)
│   │   ├── memory.js
│   │   ├── rag.js
│   │   ├── collab-flow.js     # Collaboration chain state machine
│   │   ├── schedule.js
│   │   └── search-engines.js
│   │
│   └── test/
│       ├── storage.test.js
│       ├── llm.test.js
│       ├── memory.test.js
│       ├── rag.test.js
│       └── ...

├── public/
│   ├── index.html
│   ├── app.js
│   ├── styles.css
│   ├── uploads/
│   └── lib/
│       ├── marked.min.js
│       └── highlight.min.js

├── data/
│   ├── xiajiao.db
│   ├── agents.json
│   ├── http-tools.json        # HTTP custom tool definitions
│   ├── custom-tools/          # User JS tool modules (auto-scanned)
│   ├── channel-presets/       # Channel connector presets
│   ├── workspace-xxx/
│   │   ├── SOUL.md
│   │   ├── memory.db
│   │   └── rag/
│   └── _soul-templates/

├── docs-site/
├── Dockerfile
├── package.json
└── README.md

Core modules

HTTP routing (server/index.js)

Plain node:http—no framework:

```javascript
const server = http.createServer(async (req, res) => {
  const url = new URL(req.url, `http://${req.headers.host}`);

  if (url.pathname.startsWith('/api/messages')) {
    return handleMessages(req, res, url);
  }
  if (url.pathname.startsWith('/api/channels')) {
    return handleChannels(req, res, url);
  }
  // …more routes
  return serveStatic(req, res, url);
});
```

Why not Express? ~15 API endpoints; if/else is enough.

WebSocket (server/ws.js)

Uses ws, because Node's standard library has no server-side WebSocket implementation:

Client ⇄ WebSocket ⇄ Server
           Server pushes: new messages, agent replies,
                          tool status, chain progress

Used for live messages, streamed LLM tokens, tool updates, and collaboration status.
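
The broadcast side of this can be sketched as follows; `clients`, `broadcast`, and the frame shape are illustrative names, not Xiajiao's actual API:

```javascript
// Illustrative fan-out pattern: track sockets in a Set, serialize once,
// send to every connection that is still open.
const OPEN = 1; // ws.WebSocket.OPEN

const clients = new Set();

function broadcast(type, payload) {
  const frame = JSON.stringify({ type, payload });
  for (const socket of clients) {
    if (socket.readyState === OPEN) socket.send(frame);
  }
  return frame;
}

// Stand-in for a real ws connection, to show the call path:
const received = [];
clients.add({ readyState: OPEN, send: (msg) => received.push(msg) });
broadcast('stream_chunk', { text: 'Hello' });
```

Serializing once per broadcast (rather than per client) keeps the hot path cheap when many clients are connected.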

LLM (server/services/llm.js)

Centered on a tool-calling loop:

while (true) {
  const response = await callLLM(messages);

  if (response.hasToolCalls) {
    for (const toolCall of response.toolCalls) {
      const result = await executeTool(toolCall);
      messages.push({ role: 'tool', content: result });
    }
    continue; // give tool results back to the LLM
  }

  break; // plain answer → done
}

Protocols: openai-completions, anthropic-messages. Streaming via SSE or WebSocket.
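
A runnable sketch of that loop with a mocked provider and one tool (tool names, message shapes, and `callLLM` are illustrative, not Xiajiao's API):

```javascript
// Mock tool registry: one async tool.
const tools = {
  get_time: async () => ({ now: '2024-01-01T00:00:00Z' }),
};

let turn = 0;
async function callLLM(messages) {
  // Mock: the first call requests a tool, the second produces a final answer.
  if (turn++ === 0) return { toolCalls: [{ name: 'get_time', args: {} }] };
  return { content: `Answered with ${messages.length} messages in context.` };
}

async function runToolLoop(messages) {
  while (true) {
    const response = await callLLM(messages);
    if (response.toolCalls?.length) {
      for (const call of response.toolCalls) {
        const result = await tools[call.name](call.args);
        messages.push({ role: 'tool', name: call.name, content: JSON.stringify(result) });
      }
      continue; // feed tool results back to the model
    }
    return response.content; // no tool calls → final answer
  }
}

runToolLoop([{ role: 'user', content: 'What time is it?' }]).then(console.log);
```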

Memory (server/services/memory.js)

See Agent memory.

Write:
  text → embedding → compare to existing (cosine)
  ├── > 0.85 → skip (duplicate)
  ├── > 0.7  → update existing
  └── else   → insert (typed)

Retrieve:
  query → embedding → cosine top-K → inject into system prompt

Per-agent memory.db stores embeddings and text.

RAG (server/services/rag.js)

See RAG.

Index:  doc → parse (PDF/TXT/MD) → chunk → embed → SQLite
Search: query → BM25 + vectors → RRF → rerank → top-K
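
The chunk step can be as simple as fixed-size windows with overlap; sizes here are illustrative (the real chunker may split on sentence or heading boundaries instead):

```javascript
// Naive fixed-size chunking with overlap between adjacent chunks.
function chunkText(text, size = 500, overlap = 50) {
  const chunks = [];
  for (let start = 0; start < text.length; start += size - overlap) {
    chunks.push(text.slice(start, start + size));
  }
  return chunks;
}

chunkText('a'.repeat(1000)).length; // → 3 chunks (0–500, 450–950, 900–1000)
```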

Storage (server/storage.js)

SQLite with WAL:

Table     Purpose
messages  Messages + FTS5 full-text index
channels  Channels / groups
settings  App + LLM config

Agent files live under data/workspace-xxx/ (SOUL.md, memory, RAG) for easy editing and migration.

Data flow

One user message

1. Browser sends message
2. HTTP POST /api/messages
3. Persist to SQLite
4. WebSocket broadcast
5. Parse @mention → target agent
6. Load SOUL.md → system prompt
7. Inject memory if autoInjectMemory
8. Build context (history + memory + SOUL)
9. Call LLM (stream)
10. Tool calls? → execute → back to 9
11. Stream tokens over WebSocket
12. Save agent reply
13. Collaboration chain? → next agent (back to 5)
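
Step 5 (mention parsing) can be sketched like this; the regex and function name are illustrative:

```javascript
// Extract the first @mention that matches a known agent name.
function parseMention(text, agentNames) {
  const m = text.match(/@([\w-]+)/);
  return m && agentNames.includes(m[1]) ? m[1] : null;
}

parseMention('@coder please review this diff', ['coder', 'writer']); // → 'coder'
parseMention('no mention here', ['coder']); // → null
```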

Collaboration chain

See Collaboration flow and Multi-agent chat.

User → Agent A → output → context → Agent B → output → Agent C → done
        ↑ WS status    ↑ WS status           ↑ WS status

The six dependencies

Package                    Role               Why keep it
ws                         WebSocket server   No stdlib WS server
formidable                 Multipart uploads  Streaming parse
node-cron                  Cron scheduling    No stdlib cron parser
pdf-parse                  PDF text for RAG   No stdlib PDF parser
@larksuiteoapi/node-sdk    Feishu connector   Private long-lived protocol
@modelcontextprotocol/sdk  MCP                JSON-RPC + capability negotiation

Everything else uses Node built-ins:

Need   Built-in     Typical third-party
HTTP   node:http    Express / Koa / Fastify
DB     node:sqlite  pg / mysql2
Tests  node:test    Jest / Mocha / Vitest
UUID   node:crypto  uuid / nanoid
Paths  node:path    —
Files  node:fs      fs-extra

Security model

Authentication

  • Simple password protection (OWNER_KEY environment variable)
  • Session cookie (random token via node:crypto)
  • Suited to individuals and trusted small teams

Data isolation

  • Each agent has its own workspace and memory store
  • Memories are not shared across agents
  • Uploads are confined to designated directories

LLM API key security

  • Keys are stored in local SQLite
  • They are only sent to the configured LLM provider
  • They are never sent to any third party

Performance

Single-process Node + SQLite: fast boot, low overhead. Bottleneck is LLM latency, not Xiajiao. WAL handles chat write patterns comfortably.

Walkthroughs

HTTP routing (simplified)

```javascript
const server = http.createServer(async (req, res) => {
  const url = new URL(req.url, `http://${req.headers.host}`);
  const path = url.pathname;
  const method = req.method;

  if (method === 'GET' && !path.startsWith('/api/')) {
    return serveStatic(req, res, path);
  }

  const routes = {
    'POST /api/login': handleLogin,
    'GET /api/messages': handleGetMessages,
    'POST /api/messages': handleSendMessage,
    'GET /api/agents': handleGetAgents,
    'PUT /api/agents/:id': handleUpdateAgent,
  };

  const match = matchRoute(routes, method, path);
  if (match) await match.handler(req, res, match.params);
  else res.writeHead(404).end();
});
```
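
matchRoute itself is not shown; a minimal version supporting `:id`-style params could look like this (illustrative, returning the matched handler plus extracted params):

```javascript
// Match 'METHOD /path/:param' route keys against a concrete method + path.
function matchRoute(routes, method, path) {
  for (const [key, handler] of Object.entries(routes)) {
    const [m, pattern] = key.split(' ');
    if (m !== method) continue;
    const names = [];
    // Turn ':id' segments into capture groups, remembering their names.
    const re = new RegExp(
      '^' + pattern.replace(/:(\w+)/g, (_, name) => {
        names.push(name);
        return '([^/]+)';
      }) + '$'
    );
    const hit = re.exec(path);
    if (hit) {
      const params = Object.fromEntries(names.map((n, i) => [n, hit[i + 1]]));
      return { handler, params };
    }
  }
  return null;
}

const match = matchRoute(
  { 'PUT /api/agents/:id': () => 'update' },
  'PUT',
  '/api/agents/42'
);
// match.params.id === '42'
```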

WebSocket streaming

User message → @mention → SOUL + memory + recent messages
→ LLM stream
→ chunks → WS stream_chunk / tool_call / tool_result → loop until finish
→ stream_end → persist → maybe next chain step

Memory pipeline

Write (memory_write):
  embedding(content)
    → compare to existing (cosine)
    → similarity > 0.85 → skip (dedupe)
    → similarity > 0.7  → update existing
    → else → insert
    → persist memory.db

Retrieve:
  new message → embed → top-K by cosine
    → group as semantic / episodic / procedural
    → inject into system prompt, e.g.:

    [Relevant memories]
    Semantic: user prefers Python; company uses Alibaba Cloud
    Episodic: last time we discussed payment API design
    Procedural: keep answers short; code in TypeScript when asked
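
The write-side decision above reduces to a cosine comparison against existing vectors; a sketch with the thresholds from the pipeline (function names are illustrative):

```javascript
// Cosine similarity between two equal-length vectors.
function cosine(a, b) {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb) || 1);
}

// Decide what to do with a new memory vector given the existing ones.
function writeAction(newVec, existingVecs) {
  let best = -1;
  for (const v of existingVecs) best = Math.max(best, cosine(newVec, v));
  if (best > 0.85) return 'skip';   // near-duplicate → dedupe
  if (best > 0.7) return 'update';  // close enough → refresh existing entry
  return 'insert';                  // genuinely new memory
}

writeAction([1, 0], [[1, 0.01]]); // → 'skip'
writeAction([1, 0], [[0, 1]]);    // → 'insert'
```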

RAG pipeline

User question
  → BM25 branch: FTS5 full-text → top 20
  → vector branch: embedding similarity → top 20
  → RRF merge: score = Σ 1/(k + rank_i), k = 60
  → top 10 candidates
  → LLM reranking: score each chunk vs question (e.g. 1–10)
  → top 5 chunks → injected into agent prompt
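
The RRF merge above (k = 60) is only a few lines; the ids here are illustrative, while real code would fuse chunk ids from FTS5 and the vector index:

```javascript
// Reciprocal Rank Fusion: score = Σ 1/(k + rank_i) over both ranked lists.
function rrfMerge(bm25Ids, vectorIds, k = 60) {
  const scores = new Map();
  for (const list of [bm25Ids, vectorIds]) {
    list.forEach((id, rank) => {
      // rank_i is 1-based in the formula; forEach's index is 0-based.
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank + 1));
    });
  }
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}

rrfMerge(['a', 'b', 'c'], ['b', 'c', 'd']); // → ['b', 'c', 'a', 'd']
```

Items that appear high in both lists ('b', 'c') outrank items that top only one list, which is exactly the behavior hybrid retrieval wants.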

Extensibility

New tool — three methods

Method 1: HTTP custom tools (zero-code)

Configure any REST API as a tool in Settings → HTTP Tools. Supports interpolation, custom headers, body templates, and response extraction. No code, no restart.
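
As an illustration only — the field names below are hypothetical, not the actual http-tools.json schema — such a definition might look like:

```json
{
  "name": "weather_lookup",
  "description": "Get current weather for a city",
  "method": "GET",
  "url": "https://api.example.com/weather?city={{city}}",
  "headers": { "Authorization": "Bearer {{apiKey}}" },
  "responsePath": "data.current.summary"
}
```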

Method 2: JS auto-register

Drop a .js file into server/services/tools/ (built-in) or data/custom-tools/ (user-defined):

```javascript
// data/custom-tools/my_custom_tool.js
export default {
  description: "My custom tool",
  parameters: {
    type: "object",
    properties: {
      query: { type: "string", description: "Query text" }
    }
  },
  handler: async (params) => {
    return { result: "done" };
  }
};
```

File name becomes tool name. The tool registry auto-scans both directories on startup.

Method 3: MCP bridged

Connect external MCP servers (stdio or HTTP) in Settings → MCP. Tools auto-register as mcp:{serverId}:{toolName}.

New API

Add server/routes/*.js and register in server/router.js.

New search engine

Extend server/services/search-engines.js.

New LLM provider

OpenAI-compatible /v1/chat/completions → configure in settings; no code change.

New channel

Implement a connector module under server/services/connectors/ (see existing feishu-ws.js, webhook.js for patterns).

Compared with other stacks

Aspect       Xiajiao       Dify (rough)      Typical Node app
Entry        One index.js  Many services     One app.js
Routing      Manual        Framework router  Express
Data access  Raw SQL       ORM               ORM
Tests        node:test     pytest            Jest
Build        None          Docker/pip        Bundler

Xiajiao optimizes for minimal practice, not maximal ceremony.
