data-agent-tutorial
agent
Warn
Health Warn
- No license — Repository has no license file
- Description — Repository has a description
- Active repo — Last push 0 days ago
- Community trust — 14 GitHub stars
Code Warn
- network request — Outbound network request in data-agent-frontend/package.json
Permissions Pass
- Permissions — No dangerous permissions requested
Purpose
This project is a hands-on, educational tutorial designed to teach developers how to build an enterprise-grade Text2SQL agent from scratch. It covers advanced concepts like state graph orchestration, dual-RAG retrieval, SQL auto-correction, Docker sandbox execution, and human-in-the-loop (HITL) interactions.
Security Assessment
Overall Risk: Medium. The tool acts as an interface for generating and executing SQL queries based on user input, which inherently interacts with database structures. It relies on a Python Docker sandbox to execute code, which provides a decent layer of isolation, though any sandbox execution carries inherent risks. The automated scanner found outbound network requests originating from the frontend package, which is normal for a web application but still requires network access. The code does not request dangerous system permissions, and no hardcoded secrets were identified in the scan.
Quality Assessment
The project appears to be of good quality and is highly active, with its most recent code push occurring today. It has garnered a modest but positive level of community trust with 14 GitHub stars, indicating early interest and peer review. However, a major drawback is the complete absence of an open-source license. Without a defined license, the code is technically proprietary by default, meaning developers do not have legal permission to copy, modify, or distribute it for personal or commercial use without explicit consent from the author.
Verdict
Use with caution: It is a safe and actively maintained learning resource, but the lack of a formal license makes it legally risky to reuse or adapt for actual production environments.
This project is a hands-on, educational tutorial designed to teach developers how to build an enterprise-grade Text2SQL agent from scratch. It covers advanced concepts like state graph orchestration, dual-RAG retrieval, SQL auto-correction, Docker sandbox execution, and human-in-the-loop (HITL) interactions.
Security Assessment
Overall Risk: Medium. The tool acts as an interface for generating and executing SQL queries based on user input, which inherently interacts with database structures. It relies on a Python Docker sandbox to execute code, which provides a decent layer of isolation, though any sandbox execution carries inherent risks. The automated scanner found outbound network requests originating from the frontend package, which is normal for a web application but still requires network access. The code does not request dangerous system permissions, and no hardcoded secrets were identified in the scan.
Quality Assessment
The project appears to be of good quality and is highly active, with its most recent code push occurring today. It has garnered a modest but positive level of community trust with 14 GitHub stars, indicating early interest and peer review. However, a major drawback is the complete absence of an open-source license. Without a defined license, the code is technically proprietary by default, meaning developers do not have legal permission to copy, modify, or distribute it for personal or commercial use without explicit consent from the author.
Verdict
Use with caution: It is a safe and actively maintained learning resource, but the lack of a formal license makes it legally risky to reuse or adapt for actual production environments.
Data-Agent:从 0 到 1 构建 Text2SQL 智能体实战教程,覆盖 StateGraph 编排、双重 RAG、关系图谱、HITL 人工确认、SQL 自动纠错、Python Docker 沙盒执行与 A2A + SSE 流式交互。
README.md
🚀 Data-Agent: 从 0 到 1 构建企业级 Text2SQL 智能体
本项目是一个基于开源项目的学习型教程:在参考
spring-ai-alibaba/DataAgent的基础上,结合个人理解进行拆解、复现与讲解。
目标是帮助开发者从 0 到 1 系统掌握 StateGraph 图编排、双重 RAG 检索、自我纠错、HITL 人机协同 等 Text2SQL Agent 关键能力。
📌 快速导航
🖼️ 项目介绍图
✨ 核心亮点:你将学到什么?
这个教程不是“只看概念”的介绍,而是围绕一个可运行项目,带你把 Text2SQL Agent 的关键能力拆开学透:
- 🧩 从流程到代码的完整映射:用 StateGraph 把问题理解、知识召回、规划、执行、纠错、报告串成清晰链路,知道每一步该放什么能力。
- 📚 结构化 + 非结构化的双通道检索:同时利用关系图谱信息与业务知识库,理解 Text2SQL 在真实业务里如何减少歧义和幻觉。
- 🛠️ 可落地的执行与纠错机制:掌握 SQL 生成与执行、错误回溯修复、Python Docker 沙盒分析的协作方式,而不是停留在“生成 SQL”这一步。
- 🤝 面向生产的交互设计:通过 HITL 人工确认、A2A 协议、SSE 流式反馈,学习高风险场景下可控、可观测的人机协同模式。
- 📖 可复现的学习路径:基于开源项目进行拆解与复现,提供从骨架搭建到核心编排的章节化路线,适合边读边跑、逐步进阶。
🛠️ 现代化技术栈
- 后端:
Kotlin+Spring Boot 3.x+Jimmer - AI 与编排:
Spring AI Alibaba Graph+Spring AI - 向量与存储:
PostgreSQL+pgvector - 前端:
Vue 3+TypeScript+Vite+A2UI
🧭 宏观系统架构图 (System Architecture)
🗺️ 端到端执行链路速览
[用户自然语言提问]
└── A2A 协议流式请求
└── 路由意图识别
├── 知识召回(向量化业务词汇 + QA)
├── 关系图谱召回
├── 可行性评估与任务拆解(Planner)
├── 人工确认拦截(HITL)
├── SQL 生成与执行 + 自动纠错循环
├── Python Docker 沙盒执行与分析
└── 报告整理(Report Generation)
└── 前端流式打字机效果呈现(A2UI)
🖼️ 效果预览
⚡ 快速启动 (5 分钟极速体验)
1. 环境准备
- 基础环境:
Java 21+、Node.js 20+、pnpm - 数据库:
PostgreSQL(默认localhost:5432/data_agent_tutorial) - 必装扩展:
pgvector
CREATE EXTENSION IF NOT EXISTS vector;
2. 启动后端
首次初始化数据库(在项目根目录执行):
psql -h localhost -p 5432 -U postgres -d data_agent_tutorial -f data-agent-backend/src/main/resources/database.sql
然后启动后端:
cd data-agent-backend
./gradlew bootRun
出现 Tomcat started on port(s): 9933 即启动成功。
3. 启动前端
cd data-agent-frontend
pnpm install
pnpm dev
默认地址:http://localhost:3500(自动代理 /api 到后端)。
4. 验证 A2A 链路
打开浏览器输入自然语言问题,若看到前端卡片出现流式节点打字机效果,即最小 Agent 闭环已跑通。
📖 教程导航(自顶向下进阶)
本教程按章节组织,强烈建议切换到对应章节的 Git 分支对照阅读源码,效果翻倍!
- 🏗️ 00 项目骨架搭建
后端 Kotlin + Jimmer 初始化,前端 Vue3 接入 API 自动生成。 - 🔌 01 A2A 协议实战
跑通 Agent 服务发现与 JSON-RPC 流式事件。 - 🕸️ 02 Graph 编程基础
从单节点走向多分支路由,实现暂停 -> 人工确认 -> 续跑的 HITL 工作流。 - 🧠 03 Bird SQL 知识库基建
完成结构化表关联入库与 PGVector 向量化。 - 🔥 04 SQL Agent 核心编排(系列高潮)
逐个击破:知识召回、关系图谱、任务拆解、SQL 自纠错、Python 高阶计算、商业报告生成。
联系方式
付费远程运行/安装/定制开发联系微信:ljc666max
其他关于程序运行安装报错请加QQ群:
- 416765656(满)
- 632067985
Reviews (0)
Sign in to leave a review.
Leave a reviewNo results found