
What is Xiajiao

Xiajiao IM (虾饺) is an open-source AI agent team collaboration platform.

In one sentence: manage your AI agents the way you manage a group chat.

You can create groups, add multiple agents (novelist, editor, translator, coding assistant, …), and talk to them with @mentions. Agents can also collaborate and hand off work to each other like a real team.

Xiajiao IM interface

Xiajiao in 30 seconds

Xiajiao collaboration flow demo

Real conversation screenshots

Agent conversation demo

Real conversation from the Xiajiao steward agent: automatically calling tools to query system status, show channel connection info, and present it in a structured table.

How is it different from other platforms?

Most AI platforms are AI application development platforms—they help you build AI apps for end users.

Xiajiao is an AI agent team collaboration platform—agents are coworkers, not disposable tools.

Design philosophy

| | Xiajiao | Dify / FastGPT | Coze |
|---|---|---|---|
| Core idea | Agents are “coworkers” | Agents are “apps” | Agents are “bots” |
| Interaction | IM group chat | Workflow canvas | Bot configuration UI |
| Agent relationships | Peer collaboration, mutual @mentions | Preset DAG pipelines | Independent runs |
| Who it is for | Yourself (personal use) | End users | End users |

Technical architecture

| | Xiajiao | Dify | FastGPT | Coze |
|---|---|---|---|---|
| Language | JavaScript | Python | TypeScript | Closed source |
| npm dependencies | 6 | N/A | 100+ | N/A |
| External services | 0 | PostgreSQL + Redis + sandbox | MongoDB + PG + OneAPI | Cloud |
| Start command | npm start | docker compose up | docker compose up | SaaS |
| Install | npm install (6 packages) | Multi-service Docker | Multi-service Docker | Nothing to install |
| Data stays local | Yes, fully local | Yes, self-hosted | Yes, self-hosted | No, cloud |

Complementary, not competing

Dify / FastGPT fit customer-facing AI apps. Xiajiao fits a personal or team AI collaboration space for daily use. Different scenarios, different tools.

Core capabilities

| Capability | Summary | Details |
|---|---|---|
| 🤖 Multi-agent group chat | Groups + @mention routing + agent-to-agent chat | Details |
| 🔧 Tool calling | Seven built-in tools (search, memory, RAG, cross-agent calls, …) | Details |
| 🧠 Persistent memory | Three memory types (semantic / episodic / procedural), embedding dedup | Details |
| 📚 RAG knowledge base | BM25 + vector hybrid retrieval + RRF + LLM reranking | Details |
| 🔗 Collaboration flow | Chains + visual panel + human-in-the-loop | Details |
| 🔌 Multiple models | OpenAI / Claude / Qwen / DeepSeek / Ollama, … | Details |
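
To make the "embedding dedup" idea for persistent memory concrete, here is a minimal sketch of one way it could work: compare a new memory's embedding against the stored ones by cosine similarity and skip the write when they are nearly identical. The function names, the 0.9 threshold, and the toy three-dimensional vectors are assumptions for illustration, not Xiajiao's actual implementation.

```javascript
// Hypothetical sketch: skip a new memory when it is nearly identical to a stored one.
// The 0.9 threshold and the data shapes are assumptions for illustration.

function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // avoid dividing by zero
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Returns true when the entry was stored, false when it was treated as a duplicate.
function addMemory(store, entry, threshold = 0.9) {
  const duplicate = store.some(
    (m) => cosineSimilarity(m.embedding, entry.embedding) >= threshold
  );
  if (!duplicate) store.push(entry);
  return !duplicate;
}

const memories = [];
addMemory(memories, { text: "User prefers Python", embedding: [1, 0, 0] });
addMemory(memories, { text: "The user likes Python", embedding: [0.99, 0.05, 0] });
addMemory(memories, { text: "Company uses Alibaba Cloud", embedding: [0, 1, 0] });
console.log(memories.map((m) => m.text));
```

In this toy run the second entry is dropped because its vector points in almost the same direction as the first, so only two memories are kept.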

Text-to-image (AI illustrations)

Agents such as the novelist can generate images from copy in group chat; the collaboration panel and chat show illustrations together so “text-to-image” is easy to see.

Collaboration chain and summer night sky AI art—stars, fireflies, moonlight, and figures on a bamboo mat

Use cases

Case 1: AI writing team

Create a group with novelist, editor, and translator. After you set up a collaboration chain, say “write a poem” once and three agents run in sequence:

Novelist drafts → editor polishes → translator renders English

The visual panel shows progress live. You can pause, edit, or re-run mid-flight.
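
The sequential hand-off described above can be sketched as a loop in which each agent's output becomes the next agent's input. This is only an illustration under stated assumptions: `runChain`, the `onStep` callback, and the stand-in agents that wrap text instead of calling an LLM are all hypothetical.

```javascript
// Hypothetical sketch of a sequential collaboration chain.
// Each "agent" is a stand-in async function; a real agent would call an LLM.

async function runChain(agents, request, onStep = () => {}) {
  let current = request;
  const steps = [];
  for (const agent of agents) {
    const output = await agent.run(current);
    steps.push({ agent: agent.name, output });
    onStep(agent.name, output); // e.g. update a progress panel
    current = output; // hand the result to the next agent
  }
  return { result: current, steps };
}

const chain = [
  { name: "novelist", run: async (text) => `draft(${text})` },
  { name: "editor", run: async (text) => `polished(${text})` },
  { name: "translator", run: async (text) => `english(${text})` },
];

const chainDone = runChain(chain, "write a poem", (name) =>
  console.log(`${name} finished`)
);
chainDone.then(({ result }) => console.log(result));
```

Keeping the per-step outputs in `steps` is one way a UI could later show, edit, or re-run an individual stage.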

Case 2: Private knowledge assistant

Upload docs and notes to the RAG knowledge base. Agents index them; later, answers come from your material—not generic hallucination.

Good for internal tech knowledge, personal study notes, and product Q&A.
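
The hybrid retrieval listed under core capabilities merges a keyword (BM25) ranking with a vector ranking using reciprocal rank fusion (RRF). A minimal sketch of RRF follows; the document IDs are made up, and `k = 60` is a commonly used constant rather than a value confirmed for Xiajiao.

```javascript
// Reciprocal rank fusion: score(d) = sum over lists of 1 / (k + rank of d in that list).
// k = 60 is a commonly used constant; the document IDs here are made up.

function reciprocalRankFusion(rankings, k = 60) {
  const scores = new Map();
  for (const ranking of rankings) {
    ranking.forEach((docId, index) => {
      const rank = index + 1; // ranks are 1-based
      scores.set(docId, (scores.get(docId) || 0) + 1 / (k + rank));
    });
  }
  // Highest fused score first
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([docId]) => docId);
}

const bm25Ranking = ["doc-a", "doc-b", "doc-c"];
const vectorRanking = ["doc-b", "doc-d", "doc-a"];
const fused = reciprocalRankFusion([bm25Ranking, vectorRanking]);
console.log(fused); // [ 'doc-b', 'doc-a', 'doc-d', 'doc-c' ]
```

Documents that appear near the top of both lists (here `doc-b` and `doc-a`) outrank documents that only one retriever found.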

Case 3: Compare models

Assign different models per agent: coding assistant on Claude (strong at code), translator on GPT-4o (strong at multilingual), daily helper on Qwen (cheap and sufficient). @mention several in one group and compare answers.
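
As a rough illustration of per-agent model assignment, the sketch below maps agent names to models and turns one message with several @mentions into one planned request per agent. The agent names, model identifiers, and the mention pattern are placeholders, not Xiajiao's real configuration format.

```javascript
// Hypothetical per-agent model map; agent names and model IDs are placeholders.
const agentModels = {
  CodingAssistant: "claude",
  Translator: "gpt-4o",
  DailyHelper: "qwen",
};

// Collect every known agent that is @mentioned, in order, without duplicates.
function parseMentions(message, knownAgents) {
  const found = [];
  for (const match of message.matchAll(/@([A-Za-z0-9_]+)/g)) {
    const name = match[1];
    if (knownAgents.includes(name) && !found.includes(name)) found.push(name);
  }
  return found;
}

// One planned request per mentioned agent, each with that agent's model.
function planRequests(message) {
  return parseMentions(message, Object.keys(agentModels)).map((agent) => ({
    agent,
    model: agentModels[agent],
  }));
}

const plan = planRequests("@CodingAssistant @DailyHelper explain closures");
console.log(plan);
```

Sending the same question to each planned agent and showing the replies side by side is what makes the comparison possible.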

Case 4: Ops automation

Use the Xiajiao steward with cron: daily 9:00 news digest, Monday weekly report template, monthly health checks.
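
The schedules above map naturally onto five-field cron expressions. The sketch below writes them out and checks them with a deliberately tiny matcher that only understands `*` and plain numbers. Only "daily 9:00" comes from the text; the 09:00 time for the weekly and monthly jobs, the job names, and the matcher itself are assumptions for illustration.

```javascript
// Assumed cron expressions (minute hour day-of-month month day-of-week).
// Only "daily 9:00" comes from the text; the other times are placeholders.
const schedules = {
  newsDigest: "0 9 * * *", // every day at 09:00
  weeklyReport: "0 9 * * 1", // Mondays at 09:00
  healthCheck: "0 9 1 * *", // first day of each month at 09:00
};

// Tiny matcher that understands only "*" and plain numbers.
function matches(expression, date) {
  const fields = expression.trim().split(/\s+/);
  const actual = [
    date.getMinutes(),
    date.getHours(),
    date.getDate(),
    date.getMonth() + 1, // JavaScript months are 0-based
    date.getDay(), // 0 = Sunday, 1 = Monday
  ];
  return fields.every((field, i) => field === "*" || Number(field) === actual[i]);
}

const mondayMorning = new Date(2024, 0, 1, 9, 0); // Monday 1 January 2024, 09:00 local time
const due = Object.keys(schedules).filter((name) =>
  matches(schedules[name], mondayMorning)
);
console.log(due); // [ 'newsDigest', 'weeklyReport', 'healthCheck' ]
```

In practice a scheduler library such as node-cron (one of the six dependencies listed later) accepts expressions like these directly, so a hand-written matcher is only useful for understanding the format.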

Case 5: Coding assistant

Coding assistant + RAG. Upload project docs and API specs so code follows your standards, not random snippets from the web.

One-to-one chat

Open an agent from the contact list for a private thread—no group required—for Q&A and code generation.

One-to-one chat with coding assistant

Real coding assistant thread—agent explains the approach then outputs runnable code

SOUL.md: define agent personas in Markdown

Each agent has a SOUL.md file—a Markdown “job description”:

markdown
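# Translator

You are a translation expert fluent in both Chinese and English.

## Working principles
- Faithfulness, expressiveness, elegance (信、达、雅): stay true to the original meaning, read smoothly, and use polished language
- Output the translation directly, with no sentence-by-sentence comparison
- For technical terms, keep the original term and add a note in Chinese

## Prohibited
- Do not translate content inside code blocks
- Do not @mention other agents unprompted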
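
Because a persona file is plain Markdown, it is easy to process with a few lines of code. The sketch below splits a shortened, English persona into a title, an intro, and its `##` sections. Whether Xiajiao parses SOUL.md this way or simply passes the whole file to the model is not stated here, so treat `parseSoul` and its output shape as hypothetical.

```javascript
// Hypothetical sketch: split a SOUL.md document into a title, intro, and "##" sections.
function parseSoul(markdown) {
  const soul = { title: "", intro: [], sections: {} };
  let current = null;
  for (const rawLine of markdown.split("\n")) {
    const line = rawLine.trim();
    if (line.startsWith("## ")) {
      current = line.slice(3);
      soul.sections[current] = [];
    } else if (line.startsWith("# ")) {
      soul.title = line.slice(2);
    } else if (line !== "") {
      // Strip the bullet marker so each rule is stored as plain text
      const text = line.startsWith("- ") ? line.slice(2) : line;
      (current ? soul.sections[current] : soul.intro).push(text);
    }
  }
  return soul;
}

const soulText = [
  "# Translator",
  "",
  "You are a translation expert.",
  "",
  "## Working principles",
  "- Output the translation directly",
  "",
  "## Prohibited",
  "- Do not translate content inside code blocks",
].join("\n");

const soul = parseSoul(soulText);
console.log(soul.title, Object.keys(soul.sections));
```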

Why Markdown?

| Benefit | Why it matters |
|---|---|
| Simple | Edit in any text editor; no complex UI |
| Version control | Git diff shows exactly what changed |
| Shareable | Share one .md file to clone a persona |
| Portable | Plain text, no vendor lock-in |
| Expressive | Headings, lists, tables, and code blocks are enough for rich role specs |

Who is it for?

| Audience | How they use it |
|---|---|
| Indie developers | Want an AI team without heavy DevOps |
| AI enthusiasts | Explore multi-agent collaboration and SOUL.md personas |
| Small teams | Self-hosted workspace without vendor lock-in |
| Researchers | Prototype agent messaging, memory, and RAG |
| Creators | AI writing teams and automated content pipelines |
| Students | Learn agent concepts with readable code |

Technical overview

| Layer | Technology | Notes |
|---|---|---|
| Runtime | Node.js 22+ | Native node:sqlite, no external DB |
| HTTP | node:http | No framework, stdlib only |
| WebSocket | ws | Real-time push |
| Database | SQLite | WAL + FTS5, concurrent reads and full-text search |
| Frontend | Vanilla JS + CSS | No build step; changes take effect immediately |
| npm deps | 6 | Each one justified |
| Tests | 53 unit tests | node:test, the standard library test framework |

Design rule: every dependency is a liability, not an asset. Prefer the standard library over third-party packages.
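
To show what "no framework, stdlib only" can look like in practice, here is a minimal sketch of an HTTP endpoint built on `node:http` alone. The `/health` route, the JSON payloads, and the `route` helper are placeholders for illustration and are not taken from Xiajiao's code.

```javascript
const http = require("node:http");

// Hypothetical routes; the paths and payloads are placeholders.
function route(method, url) {
  // req.url is a path with an optional query string, so resolve it against a dummy base
  const { pathname } = new URL(url, "http://localhost");
  if (method === "GET" && pathname === "/health") {
    return { status: 200, body: { ok: true } };
  }
  return { status: 404, body: { error: "not found" } };
}

const server = http.createServer((req, res) => {
  const { status, body } = route(req.method, req.url);
  res.writeHead(status, { "Content-Type": "application/json" });
  res.end(JSON.stringify(body));
});

// server.listen(3000); // uncomment to serve on a port of your choice
console.log(route("GET", "/health?verbose=1"));
```

Keeping the routing decision in a pure function makes it easy to cover with `node:test` without opening a socket.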

What happens when you send one message?

When you send “@CodingAssistant write a login API” in Xiajiao, roughly 14 steps run:

1. Message stored in SQLite
2. WebSocket broadcast to online clients
3. Parse @mention → target: CodingAssistant
4. Load CodingAssistant SOUL.md
5. Retrieve persistent memory ("User prefers Python; company uses Alibaba Cloud")
6. Inject memory into system prompt
7. Send full context to LLM API (streaming)
8. LLM chooses to call web_search
9. Run search → merge results into context
10. LLM continues generating code
11. Stream tokens to the browser over WebSocket
12. Store full reply in SQLite
13. CodingAssistant calls memory_write ("User needs login API")
14. If a collaboration chain exists → trigger next agent

The whole process is visible to the user: tool-calling steps appear live in the chat UI.
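
Steps 4 to 7 amount to assembling one request from the persona, the retrieved memories, and the user's message. A minimal sketch under stated assumptions: `buildRequest`, the "Known facts about the user" heading, the model name, and the request shape (loosely modeled on common chat-completion APIs) are all hypothetical.

```javascript
// Hypothetical sketch of steps 4 to 7: persona + memories + message -> LLM request.
function buildRequest({ soul, memories, userMessage, model }) {
  // Inject retrieved memories into the system prompt as a bullet list
  const memoryBlock = memories.length
    ? "\n\nKnown facts about the user:\n" + memories.map((m) => `- ${m}`).join("\n")
    : "";
  return {
    model,
    stream: true, // tokens are streamed back rather than returned in one piece
    messages: [
      { role: "system", content: soul + memoryBlock },
      { role: "user", content: userMessage },
    ],
  };
}

const request = buildRequest({
  soul: "You are a coding assistant.",
  memories: ["User prefers Python", "Company uses Alibaba Cloud"],
  userMessage: "write a login API",
  model: "placeholder-model",
});
console.log(request.messages[0].content);
```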

When not to use Xiajiao

Xiajiao is not universal. Consider alternatives for:

| Need | Suggestion | Why |
|---|---|---|
| Customer-facing AI apps | Dify | Workflows + API + multi-tenant |
| No self-hosting | Coze / ChatGPT Team | Managed SaaS |
| 100+ plugins | Coze | Large plugin ecosystem |
| Massive concurrency | Custom microservices | SQLite single-process limits |

See platform comparison for detail.

Six dependencies—why enough?

People often ask whether six npm packages can really be enough. Here is why each one stays:

| Package | Role | Why it stays | Alternative |
|---|---|---|---|
| ws | WebSocket server | Node has no built-in WS server | None practical |
| formidable | Multipart uploads | Boundary parsing and streaming not in stdlib | Hand-roll parser |
| node-cron | Cron scheduling | No cron expression support in stdlib | setInterval (weak for complex schedules) |
| pdf-parse | PDF text | RAG needs PDF text | Drop PDF upload |
| @larksuiteoapi/node-sdk | Feishu connector | Feishu WS protocol is proprietary | None |
| @modelcontextprotocol/sdk | MCP | JSON-RPC + capability negotiation; DIY risks incompatibility | Hand-written (risky) |

What does a “normal” project need?

| Project | npm dependency count | Notes |
|---|---|---|
| Xiajiao | 6 | Stdlib first |
| Express hello world | 30+ | Framework pulls many |
| Empty Next.js | 200+ | React + toolchain |
| Dify frontend | 300+ | Full enterprise UI |

Having more dependencies is not inherently bad, but for a self-use tool a stdlib-first approach means a smaller attack surface and fewer upgrades.

What the name means

Xiajiao (虾饺) is named after the Cantonese dim sum: small, refined, and richly filled, with a thin wrapper around fresh shrimp.

Fewest dependencies, broadest capability—that is the idea behind Xiajiao.

Roadmap

| Status | Item |
|---|---|
| ✅ Done | Multi-agent chat, tool calling, persistent memory, RAG, collaboration flow, RBAC |
| 🚧 In progress | Workflow engine, agent negotiation |
| 📋 Planned | MCP tool marketplace, voice input, mobile layout |
| 🤔 Exploring | Self-improving agents, multi-tenant |

Next steps

| You want to… | Read this |
|---|---|
| Try it now | Quick start: three steps to run |
| Configure models | Model configuration: eight providers |
| Design agents | SOUL.md guide: strong personas |
| Copy templates | SOUL.md templates: 20 templates |
| Copy setups | Recipes: 12 team configs |
| Understand architecture | Architecture: structure and data flow |
| Compare platforms | Comparison: vs Dify / Coze / FastGPT |
| Security | Security & privacy: data sovereignty |