Your-First-LLM-Studio

agent
Security Audit
Passed
Health: Passed
  • License — License: MIT
  • Description — Repository has a description
  • Active repo — Last push 0 days ago
  • Community trust — 20 GitHub stars
Code: Passed
  • Code scan — Scanned 12 files during light audit, no dangerous patterns found
Permissions: Passed
  • Permissions — No dangerous permissions requested


SUMMARY

First LLM Studio: local-first LLM studio for Apple Silicon with MLX runtimes, Compare Lab, benchmark ops, replay, and runtime telemetry.

README.md

First LLM Studio

English | 简体中文



English

First LLM Studio is a local-first LLM workbench for Apple Silicon. It brings local MLX runtimes, remote API targets, Agent sessions, Compare, Fine-tune, Benchmark, model discovery, runtime recovery, release evidence, and admin monitoring into one operating surface.

It is not another chat shell. It is built for people who need to compare behavior, debug runtimes, run evals, prepare adapters, and keep local and remote model work inside one product loop.

Product Surfaces

Route         Core workflow
/agent        Tool-enabled Agent sessions, target selection, runtime state, replay, trace review, and embedded Compare entry.
/compare      Route-owned Compare Studio for prompt composition, lane preview, recipe persistence, review drawer, and benchmark handoff.
/fine-tune    Foreground Fine-tune Studio for datasets, recipes, training, evaluation, chat adapter proof loops, export, reports, and artifacts.
/models       Model discovery and install verification for local/community models plus hardware-fit and risk signals.
/benchmarks   Benchmark run controls, progress, reports, release evidence, baselines, and regression review.
/retrieval    Foreground knowledge management, path import, chunk inspection, and grounded retrieval validation.
/experiments  Unified run/session timeline with artifact lineage, cross-feature navigation, filters, and retention controls.
/admin        Monitoring/configuration mirror for runtime, queues, benchmark history, provider health, guardrails, and audit timelines.

Major Version Story

Version                            Core capabilities
v0.1 Foundation                    Established the local-first web studio, Apple Silicon/MLX gateway workflow, local + remote target catalog, runtime telemetry, and the first Agent/Admin operating split.
v0.2 Agent + Benchmark Ops         Added richer Agent workbench flows, Compare-style target review, replay/trace inspection, runtime recovery controls, formal benchmark operations, baselines, and regression evidence.
v0.3 Fine-tune + Release Evidence  Added fine-tune operation loops for evaluation, adapter chat, adapter export, and distillation starters; expanded operation history, partitioned typechecks, screenshot smoke, route smoke, and public launch assets.
v0.4 Product IA release            Moves /fine-tune, /compare, /models, /benchmarks, /retrieval, and /experiments into foreground product routes with feature-owned state/actions, artifact lineage and retention, dark-glass studio/workbench styling, canonical APIs, and admin narrowed toward monitoring/configuration.

Current release: see VERSION.

Who It Helps

Local AI builders on Apple Silicon

  • Compare MLX local models against hosted APIs under aligned context budgets.
  • Inspect runtime cost, prewarm, release, recovery, and hardware pressure without leaving the app.
  • Decide which local model is actually usable for daily coding and analysis workflows.

Agent and tooling teams

  • Validate tool-calling, repo-grounded behavior, replay, and patch flows in one workbench.
  • Turn Compare runs into benchmark handoff without switching products.
  • Separate model-quality failures from provider quirks and local runtime instability.

Evaluation and platform engineers

  • Run formal and focused benchmark suites with repeatable profiles.
  • Review baselines, deltas, run notes, failure classifications, and release evidence.
  • Keep local and remote targets inside one comparable target catalog.

Core Value

  • Unified local + remote target catalog.
  • Compare Lab for model-vs-model output review.
  • Fine-tune workflows for datasets, recipes, training, evaluation, adapter proof loops, and export.
  • Benchmark operations with history, progress, baselines, reports, and release evidence.
  • Foreground Retrieval for document import, chunk inspection, and grounded evidence probes.
  • Experiments timeline for session/run lineage, artifact navigation, and retention policy.
  • Replay, trace review, patch inspection, and exportable review notes.
  • Runtime operations for prewarm, release, restart, log inspection, telemetry, and recovery.
  • Dynamic local/community model discovery plus remote provider health scanning.

Current Targets

Local

  • Local Qwen3 0.6B
  • Local Qwen3 4B 4-bit
  • Local Qwen3.5 4B 4-bit
  • Local Gemma 3 4B It Qat 4-bit

Remote

  • OpenAI Codex
  • OpenAI GPT-5.5
  • Claude API
  • DeepSeek API
  • Kimi API
  • GLM API
  • Qwen API

Screenshots

[Screenshot: Landing page]
[Screenshot: Agent workbench]
[Screenshot: Admin dashboard]

Benchmark percentile board:

[Screenshot: Benchmark percentiles]

Formal milestone regression summary:

[Screenshot: Formal regression summary]

Local runtime telemetry:

[Screenshot: Local runtime telemetry]

Quick Start

Requirements

  • macOS on Apple Silicon
  • Node 22.x
  • Python 3.12
  • MLX-compatible local environment
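
The requirements above can be checked before installing anything. The following is a minimal sketch, not a script from the repository: the function names `parse_node_major` and `check_requirements` are illustrative, and it assumes `node` is on PATH when Node is installed.

```python
import platform
import re
import shutil
import subprocess
import sys
from typing import List, Optional


def parse_node_major(version_text: str) -> Optional[int]:
    """Extract the major version from `node --version` output such as 'v22.3.0'."""
    match = re.match(r"v?(\d+)", version_text.strip())
    return int(match.group(1)) if match else None


def check_requirements() -> List[str]:
    """Return human-readable problems; an empty list means every check passed."""
    problems = []
    # Apple Silicon Macs report Darwin/arm64 when Python runs natively.
    if platform.system() != "Darwin" or platform.machine() != "arm64":
        problems.append("expected macOS on Apple Silicon (Darwin/arm64)")
    if sys.version_info[:2] != (3, 12):
        problems.append(f"expected Python 3.12, found {platform.python_version()}")
    node = shutil.which("node")
    if node is None:
        problems.append("node not found on PATH")
    else:
        out = subprocess.run([node, "--version"], capture_output=True, text=True).stdout
        if parse_node_major(out) != 22:
            problems.append(f"expected Node 22.x, found {out.strip() or 'unknown'}")
    return problems


# Usage: print each problem, or nothing if the machine matches the list above.
# for problem in check_requirements():
#     print("WARN:", problem)
```

A Python interpreter running under Rosetta may report a different machine type, so treat the architecture warning as a hint rather than a verdict.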

Install

nvm install 22
nvm use 22
npm install
cp .env.example .env.local

Start the web app

npm run dev

Default routes:

Start the local model gateway

python3.12 -m venv .venv
source .venv/bin/activate
pip install -U pip
pip install mlx mlx-lm
python scripts/local_model_gateway_supervisor.py

Gateway health:
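
Once the supervisor is running, a small poller can confirm that the gateway answers before you open the app. This is a sketch under assumptions: it presumes the gateway exposes an HTTP health endpoint, the URL is supplied by you rather than hard-coded, and `http_probe` and `wait_until_healthy` are illustrative names, not repository APIs.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_probe(url: str, timeout: float = 2.0) -> bool:
    """Return True if a GET request to `url` answers with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        # Connection refused, timeouts, and HTTP error statuses all count as unhealthy.
        return False


def wait_until_healthy(probe: Callable[[], bool], attempts: int = 10, delay: float = 1.0) -> bool:
    """Call `probe` up to `attempts` times, sleeping `delay` seconds between tries."""
    for attempt in range(attempts):
        if probe():
            return True
        if attempt < attempts - 1:
            time.sleep(delay)
    return False


# Usage (health_url is whatever your gateway actually serves):
# ready = wait_until_healthy(lambda: http_probe(health_url))
```

Passing the probe as a callable keeps the retry loop independent of HTTP, so the same loop can wrap a different readiness check if the gateway reports health another way.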

Verification

npm run typecheck:changed
npm run smoke:routes
npm run smoke:screenshots
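
The three checks above can be chained so that the first failure stops the sequence. A minimal sketch, assuming it runs from the repository root with `npm` on PATH; `run_command` and `first_failure` are illustrative helpers, not part of the repository.

```python
import subprocess
from typing import Callable, List, Optional, Sequence

# The verification commands listed above, in the same order.
CHECKS = [
    ["npm", "run", "typecheck:changed"],
    ["npm", "run", "smoke:routes"],
    ["npm", "run", "smoke:screenshots"],
]


def run_command(argv: Sequence[str]) -> int:
    """Run one command, streaming its output, and return its exit code."""
    return subprocess.run(list(argv)).returncode


def first_failure(
    checks: Sequence[Sequence[str]],
    runner: Callable[[Sequence[str]], int] = run_command,
) -> Optional[List[str]]:
    """Run checks in order; return the first command that exits non-zero, else None."""
    for argv in checks:
        if runner(argv) != 0:
            return list(argv)
    return None


# Usage:
# failed = first_failure(CHECKS)
# print("all checks passed" if failed is None else "failed: " + " ".join(failed))
```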

Configuration

Copy .env.example to .env.local and fill in only the providers you want to use.

Important notes:

  • .env.local is ignored by git.
  • Remote providers are optional.
  • Several targets use OpenAI-compatible or Claude-compatible endpoints.
  • Public defaults in this repository are sanitized placeholders.
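
Because the public defaults are placeholders, it can help to list which keys in .env.local are still empty or unchanged from .env.example. The sketch below is a heuristic with illustrative function names; it handles only simple KEY=VALUE lines, and a key that legitimately keeps its example value will also be reported.

```python
from typing import Dict, List


def parse_env_text(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding quotes so "abc" and abc compare equal.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def unfilled_keys(example: Dict[str, str], local: Dict[str, str]) -> List[str]:
    """Keys from the example that are missing, empty, or unchanged in the local file."""
    return sorted(
        key
        for key, placeholder in example.items()
        if local.get(key, "") in ("", placeholder)
    )


# Usage, from the repository root:
# from pathlib import Path
# example = parse_env_text(Path(".env.example").read_text())
# local = parse_env_text(Path(".env.local").read_text())
# print(unfilled_keys(example, local))
```

Since remote providers are optional, an unfilled key is informational: it matters only for the providers you intend to enable.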

Repository Structure

app/                      Next.js app routes and thin API transports
components/               Shared UI and compatibility shells
features/                 Feature-owned routes, contracts, state, actions, and application ports
lib/agent/                Agent runtime, providers, benchmark, gateway helpers
lib/finetune/             Fine-tune store facade and split operation services
scripts/                  Local gateway, runtime, verification, and release scripts
docs/                     Architecture, release notes, launch notes, roadmap, and assets
modelscope/               ModelScope profile/readme metadata
public/                   Public assets and social cover art

Distribution

The ModelScope package script exports the committed Git tree so GitHub and ModelScope can stay file-identical for each synced version.
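
One common way to export a committed tree is `git archive`, which reads from the commit rather than the working directory, so uncommitted and untracked files are left out. The sketch below illustrates that idea only; it is not the repository's package script, and the function names and output path are assumptions.

```python
import subprocess
from typing import List


def build_archive_command(ref: str, output_path: str) -> List[str]:
    """Build the argv for exporting the tree at `ref` as a gzip-compressed tarball."""
    return ["git", "archive", "--format=tar.gz", f"--output={output_path}", ref]


def export_committed_tree(ref: str = "HEAD", output_path: str = "snapshot.tar.gz") -> None:
    """Write the committed tree at `ref` to `output_path`; raises if git fails."""
    subprocess.run(build_archive_command(ref, output_path), check=True)


# Usage, from inside a Git checkout:
# export_committed_tree("HEAD", "snapshot.tar.gz")
```

Exporting from a commit instead of copying the working directory is what makes two destinations comparable file by file for a given version.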

Security and Privacy

  • Sensitive local actions require confirmation.
  • Secrets belong in .env.local.
  • Public repository defaults are sanitized.
  • New public commits should use a GitHub noreply address where possible.
  • See SECURITY.md.

Contributing

Issues and PRs are welcome.

Release Notes


简体中文

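First LLM Studio is a local-first LLM workbench for Apple Silicon. It unifies local MLX runtimes, remote API targets, Agent sessions, Compare, Fine-tune, Benchmark evaluation, model discovery, runtime recovery, release evidence, and admin monitoring in a single product interface.

It is not another chat shell. It is for developers who actually need to compare model behavior, debug runtimes, run evals, prepare adapters, and keep local and remote model workflows inside one product loop.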

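Product Surfaces

Route         Core workflow
/agent        Agent sessions with a tool loop, target selection, runtime state, replay, trace review, and an embedded Compare entry.
/compare      Foreground Compare Studio for prompt composition, lane preview, recipe persistence, review drawer, and benchmark handoff.
/fine-tune    Foreground Fine-tune Studio covering datasets, recipes, training, evaluation, adapter proof loops, export, reports, and artifacts.
/models       Local/community model discovery, install verification, hardware fit, and risk signals.
/benchmarks   Benchmark run controls, progress, reports, release evidence, baselines, and regression review.
/retrieval    Foreground knowledge management, path import, chunk inspection, and grounded retrieval validation.
/experiments  Unified session/run timeline, artifact lineage, cross-feature navigation, filters, and retention policy.
/admin        Monitoring/configuration mirror for runtime, queues, benchmark history, provider health, guardrails, and audit timelines.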

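Core Capabilities by Major Version

Version                            Core capabilities
v0.1 Foundation                    Established the local-first web studio, the Apple Silicon/MLX gateway workflow, the local + remote target catalog, runtime telemetry, and the first Agent/Admin operating split.
v0.2 Agent + Benchmark Ops         Enhanced the Agent workbench and added Compare-style target review, replay/trace inspection, runtime recovery controls, formal benchmark operations, baselines, and regression evidence.
v0.3 Fine-tune + Release Evidence  Added fine-tune operation loops such as evaluation, adapter chat, adapter export, and distillation starters; expanded operation history, partitioned typechecks, screenshot smoke, route smoke, and public launch assets.
v0.4 Product IA release            Promotes /fine-tune, /compare, /models, /benchmarks, /retrieval, and /experiments to foreground product routes; migrates feature-owned state/actions; wires up artifact lineage, navigation, and retention; unifies the dark-glass studio/workbench visuals; uses canonical APIs; narrows Admin to monitoring/configuration.

See VERSION for the current release.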

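Who It Helps

Local AI developers on Apple Silicon

  • Compare MLX local models against hosted APIs under a shared context budget.
  • Inspect runtime cost, prewarm, release, recovery actions, and hardware pressure without leaving the app.
  • Decide which local model is genuinely suitable for daily coding and analysis workflows.

Agent and tooling teams

  • Validate tool calling, repo-grounded behavior, replay, and patch flows in one workbench.
  • Send Compare results straight into Benchmark without switching products.
  • Distinguish the source of a failure: model quality, provider behavior, or local runtime instability.

Evaluation and platform engineering teams

  • Run formal and focused benchmark suites with reproducible profiles.
  • Review baselines, deltas, run notes, failure classifications, and release evidence.
  • Keep local and remote targets inside one comparable target catalog.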

Core Value

  • Unified local + remote target catalog.
  • Compare Lab for model-vs-model review.
  • Fine-tune workflows covering datasets, recipes, training, evaluation, adapter proof loops, and export.
  • Benchmark operations covering history, progress, baselines, reports, and release evidence.
  • Foreground Retrieval covering document import, chunk inspection, and grounded evidence probes.
  • Experiments timeline covering session/run lineage, artifact navigation, and retention policy.
  • Replay, trace review, patch inspection, and exportable review notes.
  • Runtime operations covering prewarm, release, restart, log inspection, telemetry, and recovery.
  • Local/community model discovery and remote provider health scanning.

Currently Supported Targets

Local

  • Local Qwen3 0.6B
  • Local Qwen3 4B 4-bit
  • Local Qwen3.5 4B 4-bit
  • Local Gemma 3 4B It Qat 4-bit

Remote

  • OpenAI Codex
  • OpenAI GPT-5.5
  • Claude API
  • DeepSeek API
  • Kimi API
  • GLM API
  • Qwen API

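Screenshots

[Screenshot: Landing page]
[Screenshot: Agent workbench]
[Screenshot: Admin dashboard]

Benchmark percentile board:

[Screenshot: Benchmark percentile board]

Formal milestone regression summary:

[Screenshot: Formal milestone regression summary]

Local runtime live monitoring:

[Screenshot: Local runtime live monitoring]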

Quick Start

Requirements

  • macOS on Apple Silicon
  • Node 22.x
  • Python 3.12
  • A local environment that can run MLX

Install

nvm install 22
nvm use 22
npm install
cp .env.example .env.local

Start the web app

npm run dev

Default routes:

Start the local model gateway

python3.12 -m venv .venv
source .venv/bin/activate
pip install -U pip
pip install mlx mlx-lm
python scripts/local_model_gateway_supervisor.py

Gateway health check:

Verification

npm run typecheck:changed
npm run smoke:routes
npm run smoke:screenshots

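Configuration

Copy .env.example to .env.local and fill in only the providers you want to enable.

Please note:

  • .env.local is ignored by git.
  • Remote providers are optional.
  • Some targets use OpenAI-compatible or Claude-compatible endpoints.
  • The public version of this repository has been sanitized; replace the placeholder values with your own endpoints.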

Repository Structure

app/                      Next.js app routes and thin API transports
components/               Shared UI and compatibility shells
features/                 Feature-owned routes, contracts, state, actions, and application ports
lib/agent/                Agent runtime, providers, benchmark, gateway helpers
lib/finetune/             Fine-tune store facade and split operation services
scripts/                  Local gateway, runtime, verification, and release scripts
docs/                     Architecture, release notes, launch notes, roadmap, and assets
modelscope/               ModelScope homepage/readme metadata
public/                   Public assets and social cover art

Release and Sync

The ModelScope packaging script exports the committed Git tree, so every sync keeps GitHub and ModelScope on the same file snapshot.

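Security and Privacy

  • Sensitive local actions require confirmation by default.
  • Secrets should be stored in .env.local.
  • The default configuration in the public repository has been sanitized.
  • New public commits should use a GitHub noreply address where possible.
  • See SECURITY.md.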

Contributing

Issues and PRs are welcome.

Release Notes
