private-knowledge-agent
mcp
Warning
Health Warning
- No license — Repository has no license file
- Description — Repository has a description
- Active repo — Last push 0 days ago
- Low visibility — Only 6 GitHub stars
Code Passed
- Code scan — Scanned 12 files during light audit, no dangerous patterns found
Permissions Passed
- Permissions — No dangerous permissions requested
Purpose
This is a local knowledge base question-answering assistant that uses a multi-agent architecture. It processes user documents (PDFs, Word, text) and allows users to query them via natural language.
Security Assessment
Overall Risk: Medium. The codebase scan of 12 files found no dangerous patterns, hardcoded secrets, or requests for dangerous system permissions. However, the system inherently processes sensitive local data. By design, it requires an external API key (DeepSeek or OpenAI) and makes network requests to external LLM APIs to function. If you input confidential documents, your data will be sent to those third-party servers unless you strictly configure a local LLM endpoint.
Quality Assessment
The project is actively maintained (last updated today), but it has very low community visibility with only 6 stars. This means it has not been widely tested or vetted by a larger audience. A significant concern is the lack of a license file. Without a defined license, the code is technically under exclusive copyright by default, making the legal terms of use, modification, and distribution unclear.
Verdict
Use with caution — the code itself appears safe, but the lack of a license and the reliance on external API calls for processing your local documents mean you must ensure your data privacy and legal requirements are met before deploying.
A local knowledge base Q&A assistant built on LangGraph + MCP + FastAPI (SSE) + Streamlit
README.md
🕵️ Private Knowledge Agent
An intelligent Q&A system for local knowledge bases
LangGraph + MCP + RAG + FlashRank
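📖 Project Overview
A local knowledge base Q&A system built on a LangGraph multi-agent architecture.
Put your documents in the data/ directory and ask questions in natural language; the system automatically plans a retrieval strategy, reads files concurrently, and generates a structured report.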
🧠 System Architecture
flowchart TD
U[User question] --> M[Manager intent recognition]
M -->|chat| C[Chat small talk]
M -->|research| P[Planner task planning]
P --> D[Concurrent dispatch]
D --> R1[Reader 1]
D --> R2[Reader 2]
D --> RN[Reader N]
R1 --> W[Writer report writing]
R2 --> W
RN --> W
W --> END((END))
C --> END
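The routing in the diagram can be illustrated with a small standard-library sketch. It is not the project's LangGraph code: every function below is a hypothetical stand-in, the keyword-based intent check replaces an LLM call, and the planner's fixed two-way split is only a placeholder.

```python
import asyncio

# Hypothetical stand-ins for the agents in the diagram. The real project
# wires them together with LangGraph and calls an LLM at each step.

def manager(question: str) -> str:
    """Classify intent with a toy keyword check: 'research' or 'chat'."""
    keywords = ("report", "summarize", "find", "compare", "document")
    return "research" if any(k in question.lower() for k in keywords) else "chat"

def planner(question: str) -> list[str]:
    """Split the question into sub-tasks (placeholder: a fixed two-way split)."""
    return [f"background: {question}", f"details: {question}"]

async def reader(task: str) -> str:
    """Pretend to query the knowledge base for one sub-task."""
    await asyncio.sleep(0)  # stands in for an MCP tool call
    return f"notes for [{task}]"

def writer(notes: list[str]) -> str:
    """Merge the readers' notes into a single report."""
    return "REPORT\n" + "\n".join(f"- {n}" for n in notes)

async def run(question: str) -> str:
    if manager(question) == "chat":
        return f"(chat) {question}"
    tasks = planner(question)
    # Fan out: one reader per sub-task, all awaited concurrently.
    notes = await asyncio.gather(*(reader(t) for t in tasks))
    return writer(list(notes))
```

Calling `asyncio.run(run("Summarize the onboarding document"))` takes the research branch, while a greeting such as "hello there" goes straight to the chat branch.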
🖼️ Usage Examples
Chat mode

Research mode


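✨ Core Features
Multi-agent collaboration
- Manager: recognizes intent and routes to chat or research
- Planner: breaks a complex question into 2-4 sub-tasks
- Reader: calls MCP tools concurrently to search the knowledge base
- Writer: aggregates the findings and generates a structured report
- Chat: handles everyday conversation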
Smart retrieval
- Vector search: ChromaDB + semantic embeddings
- Reranking: FlashRank reranks results to improve accuracy
- Confidence: every result carries a similarity score
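The recall-then-rerank flow can be sketched without any external dependencies. The code below is an illustration only: a bag-of-words vector stands in for the bge-m3 embedding, a term-overlap ratio stands in for FlashRank, and the function names and the `recall_k` / `top_n` parameters are assumptions rather than the project's API.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy 'embedding': a bag-of-words count vector (stand-in for bge-m3)."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def rerank_score(query: str, doc: str) -> float:
    """Toy reranker: fraction of query terms found in the document (stand-in for FlashRank)."""
    q_terms = set(query.lower().split())
    d_terms = set(doc.lower().split())
    return len(q_terms & d_terms) / len(q_terms) if q_terms else 0.0

def search(query: str, docs: list[str], recall_k: int = 3, top_n: int = 2) -> list[dict]:
    q_vec = embed(query)
    # Stage 1: recall candidates by vector similarity.
    candidates = sorted(docs, key=lambda d: cosine(q_vec, embed(d)), reverse=True)[:recall_k]
    # Stage 2: rerank the candidates and attach a score to every result.
    scored = [{"text": d, "score": round(rerank_score(query, d), 3)} for d in candidates]
    return sorted(scored, key=lambda r: r["score"], reverse=True)[:top_n]
```

The first stage is cheap and deliberately over-fetches; the second stage is more expensive per document, so it only sees the recalled candidates.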
Multi-format support
- PDF (pdfplumber)
- Word (python-docx)
- Markdown / TXT (automatic encoding detection)
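The README does not describe how the encoding detection works. One common approach, shown here purely as an assumption, is to try a short list of encodings in order and keep the first one that decodes cleanly. The helper name and the candidate list are hypothetical.

```python
from pathlib import Path

# Hypothetical helper: the candidate list and its order are assumptions.
CANDIDATE_ENCODINGS = ("utf-8-sig", "gb18030")

def read_text_with_fallback(path: str) -> tuple[str, str]:
    """Decode a text file by trying each candidate encoding in order.

    Returns (text, encoding_used). If no candidate fits, falls back to
    UTF-8 with undecodable bytes replaced.
    """
    raw = Path(path).read_bytes()
    for enc in CANDIDATE_ENCODINGS:
        try:
            return raw.decode(enc), enc
        except UnicodeDecodeError:
            continue
    return raw.decode("utf-8", errors="replace"), "utf-8-replace"
```

Trying UTF-8 first matters because it is strict: bytes written in a legacy encoding such as GBK usually fail to decode as UTF-8, which lets the loop move on to the next candidate.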
Incremental indexing
- File fingerprinting (mtime): only changed files are re-indexed
- Starts in seconds, with no repeated vectorization
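A minimal sketch of mtime-based change detection follows, assuming the fingerprints are kept in a JSON state file. The function name, the state file format, and its location are illustrative assumptions and not taken from the project.

```python
import json
from pathlib import Path

def changed_files(data_dir: str, state_path: str) -> list[str]:
    """Return names of files that are new or whose mtime changed, then save the new state.

    The state file should live outside data_dir so it is not indexed itself.
    """
    state_file = Path(state_path)
    old = json.loads(state_file.read_text(encoding="utf-8")) if state_file.exists() else {}
    # Current fingerprint: file name -> modification time in nanoseconds.
    new = {
        p.name: p.stat().st_mtime_ns
        for p in sorted(Path(data_dir).iterdir())
        if p.is_file()
    }
    changed = [name for name, mtime in new.items() if old.get(name) != mtime]
    # Overwriting the state also drops entries for files that were deleted.
    state_file.write_text(json.dumps(new), encoding="utf-8")
    return changed
```

Only the files returned by `changed_files` would need to be re-embedded, which is what makes a restart with an unchanged data/ directory fast. An mtime check is cheap but can miss edits that preserve the timestamp; a content hash would be a stricter alternative.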
Streaming output
- SSE pushes execution stages in real time
- Tool calls are visible in the frontend
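To make the SSE mechanism concrete, here is a sketch of how stage updates could be framed on the server and decoded on the client. The wire format (an `event:` line, a `data:` line, and a blank line) follows the Server-Sent Events standard; the event names `stage` and `tool_call` and the payload fields are assumptions, not the project's actual schema.

```python
import json
from typing import Iterable, Iterator

def sse_event(event: str, data: dict) -> str:
    """Serialize one SSE frame: an 'event:' line, a 'data:' line, and a blank line."""
    return f"event: {event}\ndata: {json.dumps(data, ensure_ascii=False)}\n\n"

def stream_stages(stages: Iterable[tuple[str, dict]]) -> Iterator[str]:
    """Yield one frame per stage; a web framework would stream these to the client."""
    for name, payload in stages:
        yield sse_event(name, payload)

def parse_sse(raw: str) -> list[tuple[str, dict]]:
    """Client side: split frames on blank lines and decode each one."""
    events = []
    for frame in raw.strip().split("\n\n"):
        if not frame:
            continue
        fields = dict(line.split(": ", 1) for line in frame.splitlines())
        events.append((fields["event"], json.loads(fields["data"])))
    return events
```

In a FastAPI backend, a generator like `stream_stages` would typically be returned through a streaming response with the `text/event-stream` media type, and the frontend would render each decoded event as it arrives.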
🧩 Project Structure
private-knowledge-agent/
├── agents/ # Multi-agent modules
│ ├── manager.py # Intent recognition
│ ├── chat.py # Chat handling
│ ├── planner.py # Task planning
│ ├── reader.py # Knowledge retrieval
│ └── writer.py # Report writing
├── tools/ # MCP tools
│ ├── mcp_server_local.py # MCP server
│ ├── rag_store.py # RAG retrieval
│ └── registry.py # Tool registry
├── api/ # Backend API
│ ├── routes.py # Routes
│ └── stream.py # SSE streaming
├── frontend/ # Streamlit frontend
│ ├── app.py
│ ├── chat_flow.py
│ ├── backend_client.py
│ └── ui.py
├── bootstrap/ # Lifecycle management
├── data/ # Knowledge base files
├── chroma_db/ # Vector database
├── db/ # Index state / run persistence
├── models/ # Local models and cache
├── graph.py # LangGraph orchestration
├── state.py # State definitions
├── server.py # FastAPI entry point
├── config.py # Configuration
├── Dockerfile
└── docker-compose.yml
⚡ Quick Start
1. Clone the repository
git clone https://github.com/<your-username>/private-knowledge-agent.git
cd private-knowledge-agent
2. Configure environment variables
cp .env.example .env
Edit .env:
# LLM configuration
OPENAI_MODEL=deepseek-chat
OPENAI_API_KEY=sk-xxx
OPENAI_BASE_URL=https://api.deepseek.com/v1
# Optional: LangSmith tracing
LANGCHAIN_API_KEY=xxx
3. Prepare knowledge files and models
# Knowledge base files
cp your-file.pdf data/
cp your-document.docx data/
# Local embedding model directory (default path)
# models/embedding/bge-m3
4. Deploy with Docker
docker compose up -d --build
Once the services are up:
- MCP Server: http://localhost:8003
- Backend API: http://localhost:8011
5. Start the frontend (locally)
streamlit run frontend/app.py
Or deploy to Streamlit Cloud and set the following in Secrets:
BACKEND_URL = "http://<your-server-ip>"
🔧 MCP Tools
| Tool | Function | Parameters |
|---|---|---|
| `list_local_files` | List the files in the knowledge base | None |
| `read_local_file` | Read the full text of a file | filename |
| `search_local_knowledge` | Semantic search | query |
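The three tools can be pictured as plain functions over the data/ directory. The sketch below is not the project's implementation: it omits the fastmcp registration, handles plain-text files only, replaces semantic search with a naive substring match, and adds a `data_dir` argument (not part of the documented tool parameters) so the functions can be exercised against any folder.

```python
from pathlib import Path

DATA_DIR = Path("data")  # assumed location of the knowledge-base files

def list_local_files(data_dir: Path = DATA_DIR) -> list[str]:
    """List knowledge-base file names in sorted order."""
    return sorted(p.name for p in data_dir.iterdir() if p.is_file())

def read_local_file(filename: str, data_dir: Path = DATA_DIR) -> str:
    """Return the full text of one file, refusing paths that escape data_dir."""
    root = data_dir.resolve()
    target = (root / filename).resolve()
    if root not in target.parents:
        raise ValueError(f"{filename!r} is outside the knowledge base")
    return target.read_text(encoding="utf-8")

def search_local_knowledge(query: str, data_dir: Path = DATA_DIR) -> list[dict]:
    """Naive keyword match standing in for the semantic search tool."""
    hits = []
    for name in list_local_files(data_dir):
        text = read_local_file(name, data_dir)
        if query.lower() in text.lower():
            hits.append({"file": name, "snippet": text[:80]})
    return hits
```

The path check in `read_local_file` is worth keeping in any real version: the filename comes from an LLM-driven agent, so a value such as `../secret.txt` should be rejected rather than resolved.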
🛠️ Tech Stack
| Category | Technology |
|---|---|
| Orchestration framework | LangGraph |
| Protocol | MCP (fastmcp) |
| Backend | FastAPI + SSE |
| Frontend | Streamlit |
| Vector database | ChromaDB |
| Embedding | HuggingFace (BAAI/bge-m3, loaded locally) |
| Reranking | FlashRank |
| File parsing | pdfplumber, python-docx |
| Persistence | SQLite (checkpointer) |
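🧪 Evaluation
The project ships with regression evaluation files that check the /chat SSE endpoint for stability, answer constraints, source citations, and latency.
Common commands:
# Full evaluation
python scripts/run_eval.py --dataset evals/eval_set.jsonl --timeout 200
# Quick smoke test
python scripts/run_eval.py --dataset evals/eval_test.jsonl --timeout 200
Results are written to evals/results/. For more details, see:
- evals/README.md
- evals/EVAL_CASE_GUIDE.md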
🗺️ Roadmap
- Excel / CSV support
- Image OCR support
- Local LLM option (Ollama)
- Web-based file management UI
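🤝 Contributing
- 📬 Issues / PRs: suggestions for improvement and code contributions are welcome!
- 📩 Technical discussion: WeChat a19731567148 (mention "Agent" in your request)
🌟 If this project helped you, please give it a Star ⭐. It is the biggest motivation for continued updates!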