nexus-deep-research-agent
Health Warning
- No license — Repository has no license file
- Description — Repository has a description
- Active repo — Last push 0 days ago
- Community trust — 13 GitHub stars
Code Passed
- Code scan — Scanned 12 files during light audit, no dangerous patterns found
Permissions Passed
- Permissions — No dangerous permissions requested
This tool is a sophisticated AI research agent that performs multi-stage web searches and document analysis. It uses an autonomous loop of planning, executing, and evaluating to synthesize structured research reports.
Security Assessment
Overall risk: Medium. The project itself does not contain malicious code, hardcoded secrets, or dangerous shell execution commands. However, its fundamental architecture requires handling sensitive data. It processes user queries, makes external network requests to multiple LLM providers (Anthropic, OpenAI, Gemini), and relies on a Supabase database to store research history and vector embeddings. Additionally, it requires users to supply their own API keys to function, meaning developers must implement strict environment variable security to prevent accidental key exposure.
Quality Assessment
The project is in active development, with its most recent push made today. The codebase passed a light automated audit of 12 files with no dangerous patterns detected, which suggests (but does not guarantee) clean code. It has garnered 13 GitHub stars, a small but real level of community interest. The major drawback is the missing license file: although the README states MIT, without a license file the repository defaults to all-rights-reserved copyright, so users have no formal permission to modify or distribute the code.
Verdict
Use with caution — the code itself is safe and actively maintained, but you must securely manage your own LLM API keys and be aware of the unlicensed status.
Multi-stage AI research agent with RAG pipeline and structured reasoning workflows.
NexusAI — Deep Research Agent v3
A production-grade, truly agentic research system built on a Planner → Executor → Evaluator → Loop architecture. Multi-provider (Anthropic, OpenAI, Gemini, NVIDIA NIM), streaming SSE, pgvector RAG, and an elite UI.
What Makes This a True Agent
Unlike a staged pipeline that runs N fixed passes, NexusAI v3 has genuine decision logic:
User Query
↓
[PLANNER] — Produces a JSON plan: intent, tool sequence, max steps, target confidence
↓
[EXECUTOR] — Dispatches tools (search, retrieve, reason, critique, synthesize)
Parallel execution for independent tools (search + retrieve run together)
↓
[EVALUATOR] — Scores evidence quality (0.0–1.0), decides next action:
CONTINUE | DONE | PIVOT | EXPAND | FALLBACK
↓
↺ Loop until confidence ≥ target OR max steps reached
↓
[SYNTHESIZER] — Produces final structured report
The Evaluator's action field is what makes it agentic:
- DONE — evidence meets the quality bar; stop early (saves tokens)
- PIVOT — the current approach isn't working; revise the plan
- EXPAND — good progress but specific gaps remain; add tools
- FALLBACK — multiple errors; rotate to the next LLM provider
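The five actions above can be sketched as one small decision function. This is a minimal TypeScript sketch with hypothetical names and thresholds; the real logic lives in src/lib/agent/evaluator.ts:

```typescript
// Hypothetical sketch of the Evaluator's next-action decision.
// Names and thresholds are illustrative, not the actual implementation.
type EvalAction = 'CONTINUE' | 'DONE' | 'PIVOT' | 'EXPAND' | 'FALLBACK';

interface EvalState {
  confidence: number;        // evidence quality score, 0.0-1.0
  targetConfidence: number;  // set by the Planner
  consecutiveErrors: number; // tool/provider failures in a row
  gaps: string[];            // specific missing evidence, if any
}

function decideNextAction(s: EvalState): EvalAction {
  if (s.consecutiveErrors >= 2) return 'FALLBACK';       // rotate provider
  if (s.confidence >= s.targetConfidence) return 'DONE'; // stop early, save tokens
  if (s.confidence < 0.3) return 'PIVOT';                // approach isn't working
  if (s.gaps.length > 0) return 'EXPAND';                // fill specific gaps
  return 'CONTINUE';                                     // keep executing the plan
}
```

Centralizing the choice in one pure function like this keeps the loop in orchestrator.ts easy to test in isolation.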
Architecture
src/
├── app/
│ ├── api/
│ │ ├── agent/route.ts ← Streaming SSE endpoint (main entry point)
│ │ ├── memory/route.ts ← GET/DELETE research history
│ │ └── rag/route.ts ← Document upload → chunk → embed → store
│ ├── layout.tsx
│ ├── page.tsx ← Full UI (sidebar + timeline + result panel)
│ └── globals.css
│
├── lib/
│ ├── agent/
│ │ ├── types.ts ← All interfaces + Zod schemas (single source of truth)
│ │ ├── orchestrator.ts ← Main loop: Planner→Executor→Evaluator
│ │ ├── planner.ts ← Converts query to structured JSON plan
│ │ ├── executor.ts ← Dispatches tools, parallel execution
│ │ └── evaluator.ts ← Quality scoring + CONTINUE/DONE/PIVOT/EXPAND/FALLBACK
│ │
│ ├── providers/
│ │ └── normalizer.ts ← Unified adapter for Anthropic/OpenAI/Gemini/NVIDIA
│ │ Retry logic, timeout handling, cost tracking
│ ├── rag/
│ │ ├── chunker.ts ← Semantic chunking with overlap
│ │ ├── embedder.ts ← text-embedding-3-small via OpenAI
│ │ └── retriever.ts ← Vector search → LLM reranking → compression
│ │
│ ├── db/
│ │ ├── supabase.ts ← Supabase client singleton
│ │ └── memory.ts ← Save/load/search research memory
│ │
│ └── observability/
│ ├── logger.ts ← Pino structured JSON logging
│ └── tracer.ts ← Lightweight span-based tracing
│
├── hooks/
│ └── useAgent.ts ← React hook: SSE subscription + typed UI state
│
└── components/
├── AgentTimeline.tsx ← Animated live step visualization
├── ConfidenceGauge.tsx ← SVG arc gauge with sparkline
└── ResultPanel.tsx ← Tabbed: Answer | Trace | Metrics
Setup
1. Clone and install
```bash
git clone <your-repo>
cd nexus-deep-research-v3
npm install
```
2. Configure environment
```bash
cp .env.local.example .env.local
```
Edit .env.local:
```bash
# Required: at least one LLM provider
ANTHROPIC_API_KEY=sk-ant-...
OPENAI_API_KEY=sk-proj-...   # Also used for embeddings (required for RAG)
GEMINI_API_KEY=AIza-...
NVIDIA_API_KEY=nvapi-...

# Required: Supabase (free tier works)
SUPABASE_URL=https://your-project.supabase.co
SUPABASE_SERVICE_ROLE_KEY=eyJ...
SUPABASE_ANON_KEY=eyJ...
```
3. Set up Supabase
- Create a free project at supabase.com
- Go to SQL Editor → New Query
- Paste and run the contents of supabase/schema.sql
This creates:
- document_chunks table with pgvector (1536-dim) for RAG
- memory_contexts table with pgvector for semantic memory search
- match_documents and match_memories RPC functions
4. Run locally
```bash
npm run dev
```
Deployment (Vercel)
```bash
npm install -g vercel
vercel --prod
```
Add all environment variables in the Vercel dashboard under Settings → Environment Variables.
Important Vercel settings:
- Functions → Max Duration: set to 300 (5 minutes) for exhaustive depth
- Edge Runtime is NOT used — the agent runs in the Node.js runtime for full SDK support
How to Use
Basic research
Type a query, select depth and provider, click Run Research or press ⌘↵.
Research depths
| Depth | Steps | Time | Use case |
|---|---|---|---|
| Quick | 2–3 | ~30s | Fast fact lookup |
| Standard | 4–6 | ~90s | Most queries |
| Deep | 7–9 | ~3m | Complex analysis |
| Exhaustive | 10–12 | ~6m | Critical research |
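The table above maps naturally onto a small planner config. A hypothetical sketch: maxSteps follows the table, while the targetConfidence values are illustrative assumptions, not taken from the project:

```typescript
// Hypothetical depth-to-limits mapping mirroring the table above.
// The targetConfidence values are assumptions for illustration.
const DEPTH_CONFIG = {
  quick:      { maxSteps: 3,  targetConfidence: 0.6 },
  standard:   { maxSteps: 6,  targetConfidence: 0.7 },
  deep:       { maxSteps: 9,  targetConfidence: 0.8 },
  exhaustive: { maxSteps: 12, targetConfidence: 0.9 },
} as const;

type Depth = keyof typeof DEPTH_CONFIG; // 'quick' | 'standard' | 'deep' | 'exhaustive'
```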
RAG (Document Upload)
Upload PDF or TXT via the /api/rag endpoint:
```bash
curl -X POST http://localhost:3000/api/rag \
  -H "x-user-id: your-user-id" \
  -F "file=@document.pdf"
```
Returns { docId, chunkCount }. Pass docId in the agent request to activate retrieval.
API Usage
```typescript
// POST /api/agent
const res = await fetch('/api/agent', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    query: 'What are the implications of quantum error correction for cryptography?',
    depth: 'deep',
    provider: 'anthropic',
    model: 'claude-sonnet-4-5',
    userId: 'user-123',
    documentIds: ['doc-uuid-here'], // optional
    saveToMemory: true,
  }),
});

// Subscribe to SSE events
const reader = res.body.getReader();
// Events: start | plan | loop_start | step_start | step_done | eval | done | error
```
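The stream can be consumed with a buffered read loop. A minimal sketch, assuming the standard event:/data: SSE wire format; parseSSEChunk and consume are hypothetical helpers, not part of the project's API:

```typescript
// Hypothetical helper: split a run of complete SSE blocks into typed events.
function parseSSEChunk(chunk: string): { event: string; data: unknown }[] {
  return chunk
    .split('\n\n')
    .filter((block) => block.trim().length > 0)
    .map((block) => {
      const lines = block.split('\n');
      const eventLine = lines.find((l) => l.startsWith('event:'));
      const dataLine = lines.find((l) => l.startsWith('data:'));
      return {
        event: eventLine ? eventLine.slice('event:'.length).trim() : 'message',
        data: dataLine ? JSON.parse(dataLine.slice('data:'.length)) : null,
      };
    });
}

// Read the response body, buffering partial events between reads.
async function consume(res: Response): Promise<void> {
  const reader = res.body!.getReader();
  const decoder = new TextDecoder();
  let buffer = '';
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    buffer += decoder.decode(value, { stream: true });
    const lastBreak = buffer.lastIndexOf('\n\n');
    if (lastBreak === -1) continue; // no complete event yet
    for (const evt of parseSSEChunk(buffer.slice(0, lastBreak + 2))) {
      console.log(evt.event, evt.data); // e.g. dispatch to UI state here
    }
    buffer = buffer.slice(lastBreak + 2);
  }
}
```

In the project itself, the useAgent.ts hook plays this role and maps each event onto typed UI state.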
Key Design Decisions
Why Zod on every LLM response?
LLMs produce inconsistent JSON. Zod validation at the boundary of every agent step catches malformed output, triggers retries, and falls back to safe defaults. This eliminates ~70% of runtime errors.
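The pattern is roughly: parse, validate, retry on failure, fall back to a safe default. A minimal sketch with a plain type guard standing in for the actual Zod schema; all names here are hypothetical:

```typescript
// Hypothetical shape of one planner step.
interface PlanStep { tool: string; rationale: string }

// Stand-in for a Zod schema: a plain runtime type guard.
function isPlanStep(x: unknown): x is PlanStep {
  return typeof x === 'object' && x !== null &&
    typeof (x as PlanStep).tool === 'string' &&
    typeof (x as PlanStep).rationale === 'string';
}

// Parse LLM output; retry on malformed JSON or wrong shape,
// then fall back to a safe default.
async function parseWithRetry(
  callLLM: () => Promise<string>,
  fallback: PlanStep,
  maxRetries = 2,
): Promise<PlanStep> {
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      const parsed: unknown = JSON.parse(await callLLM());
      if (isPlanStep(parsed)) return parsed; // valid: use it
    } catch {
      // malformed JSON: fall through and retry
    }
  }
  return fallback; // safe default after exhausting retries
}
```

With Zod the guard collapses to a schema's safeParse call, and the same retry-then-fallback skeleton applies at every agent step boundary.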
Why SSE instead of polling?
Each agent step can take 5–45 seconds. SSE lets the UI show progress in real time — users see the plan, watch each step complete, and observe confidence improving. This is the difference between a loading spinner and a research terminal.
Why LLM-as-reranker instead of a cross-encoder?
No extra model deployment is needed. gpt-4o-mini reranks 20 candidates for ~$0.001 and produces better results than BM25. A full cross-encoder (such as cross-encoder/ms-marco-MiniLM-L-6-v2) can be swapped in if you add a Python sidecar.
Why parallel tool execution?
search and retrieve are independent I/O operations. Running them concurrently cuts that phase from ~16s to ~8s at no quality cost.
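A minimal sketch of that dispatch, assuming hypothetical search/retrieve signatures:

```typescript
// Hypothetical sketch: run two independent I/O tools concurrently.
// Promise.all starts both immediately, so latency is max(a, b), not a + b.
async function runIndependentTools(
  search: () => Promise<string[]>,
  retrieve: () => Promise<string[]>,
): Promise<{ webResults: string[]; docChunks: string[] }> {
  const [webResults, docChunks] = await Promise.all([search(), retrieve()]);
  return { webResults, docChunks };
}
```

Note that Promise.all rejects on the first failure; a real executor would presumably use Promise.allSettled or per-tool error handling so a single failed tool can feed the Evaluator's FALLBACK logic instead of aborting the step.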
Extending
Add a new tool
- Add the tool name to the ToolName type in types.ts
- Add a system prompt and handler in executor.ts
- Add a timeout config in TOOL_TIMEOUTS
Add a new provider
- Add an adapter function in normalizer.ts
- Add it to the PROVIDER_ADAPTERS map
- Add a default model to getDefaultModel()
- Add a fallback chain in planner.ts
Add evaluation metrics
The EvalResult has confidence, evidenceQuality, and gaps. You can extend EvalResultSchema to add domain-specific metrics (e.g., citationCount, controversyScore).
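For illustration, the widened result shape might look like this. Plain interfaces are shown here; in the project the source of truth is the Zod schema in types.ts, which can be widened with its .extend() method:

```typescript
// Base shape, per the fields named above (confidence, evidenceQuality, gaps).
interface EvalResult {
  confidence: number;       // 0.0-1.0 overall confidence
  evidenceQuality: number;  // 0.0-1.0 quality of gathered evidence
  gaps: string[];           // specific missing evidence
}

// Hypothetical extension with domain-specific metrics.
interface CitationAwareEvalResult extends EvalResult {
  citationCount: number;    // distinct sources backing the answer
  controversyScore: number; // 0.0-1.0: how contested the evidence is
}
```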
Stack
| Layer | Technology |
|---|---|
| Framework | Next.js 15 (App Router) |
| LLMs | Anthropic, OpenAI, Gemini, NVIDIA NIM |
| Validation | Zod (all LLM outputs) |
| Database | Supabase (PostgreSQL + pgvector) |
| Embeddings | text-embedding-3-small (OpenAI) |
| Animations | Framer Motion |
| Logging | Pino (structured JSON) |
| Deployment | Vercel |
License
MIT