tscg

mcp
Security Audit
Warning
Health: Warning
  • License — License: MIT
  • Description — Repository has a description
  • Active repo — Last push 0 days ago
  • Low visibility — Only 7 GitHub stars
Code: Warning
  • process.env — Environment variable access in benchmark/harness/cli.ts
Permissions: Passed
  • Permissions — No dangerous permissions requested
Purpose
This is a deterministic tool-schema compiler designed for LLM agents. It compresses tool definitions to save tokens (50-72% savings) while maintaining or improving model accuracy, operating in under a millisecond with zero external dependencies.

Security Assessment
Overall Risk: Low. The tool does not request any dangerous permissions, execute shell commands, or contain hardcoded secrets. There is no evidence of unwanted network requests. The only flagged issue is a `process.env` access located strictly within a benchmark harness file (`benchmark/harness/cli.ts`), which is a standard practice for test configurations and poses no threat to production environments.

Quality Assessment
The codebase is lightweight (1,200 lines) and recently active, with the last push occurring today. It is properly licensed under the standard MIT license. The developer claims strong quality metrics, including a large automated test suite and zero runtime dependencies, which significantly reduces supply-chain attack risks. However, community visibility and adoption are currently very low, with only 7 GitHub stars. While the internal benchmark results provided in the README are highly detailed, they are entirely self-reported and lack independent external validation at this time.

Verdict
Safe to use, though adopters should be aware that it is a very new, low-visibility project with self-reported performance claims.
SUMMARY

TSCG — Deterministic tool-schema compiler for LLM agents. 50-72% token savings, 50 tools in 2.4ms. Phi-4 recovers from 0% to 90% accuracy. 459 tests, zero dependencies, MIT.

README.md

TSCG -- Tool-Schema Compression Grammar


Deterministic tool-schema compiler that reduces LLM tool-definition overhead by 50--72% while improving accuracy.

1,200 LOC TypeScript. Zero dependencies. Sub-millisecond. 23KB ESM bundle.

Latest Findings (April 2026)

720-Call E2E Benchmark on Claude Models

Claude Opus 4.7 -- matches-or-beats baseline with 57-63% token savings:

| Tool Count | Baseline | TSCG Balanced | Δ Accuracy | Savings |
|---|---|---|---|---|
| 16 | 70.0% | 77.5% | +7.5pp | 56.9% |
| 43 | 77.5% | 80.0% | +2.5pp | 63.0% |
| 50 | 72.5% | 80.0% | +7.5pp | 62.8% |

Claude Sonnet 4 -- consistent 57-63% compression with robust accuracy:

| Tool Count | Baseline | TSCG Balanced | Δ Accuracy | Savings |
|---|---|---|---|---|
| 16 | 77.5% | 80.0% | +2.5pp | 56.9% |
| 43 | 85.0% | 80.0% | -5.0pp | 63.0% |
| 50 | 77.5% | 77.5% | ±0.0pp | 62.8% |

480-Call MCP Proxy Benchmark (v1.4.1)

480-call extended proxy benchmark (n=40 per cell, 2 seeds, 2 models x 3 tool counts):

| Model | Tools | Baseline | TSCG Proxy | Δ Accuracy | Token Savings |
|---|---|---|---|---|---|
| Opus 4.7 | 16 | 70.0% | 75.0% | +5.0pp | 53.1% |
| Opus 4.7 | 43 | 75.0% | 75.0% | ±0.0pp | 55.8% |
| Opus 4.7 | 50 | 77.5% | 77.5% | ±0.0pp | 55.5% |
| Sonnet 4 | 16 | 80.0% | 77.5% | -2.5pp | 53.1% |
| Sonnet 4 | 43 | 85.0% | 82.5% | -2.5pp | 55.8% |
| Sonnet 4 | 50 | 77.5% | 77.5% | ±0.0pp | 55.5% |

Opus 4.7 matches or beats the baseline in all conditions; Sonnet 4 stays within the expected CI (max -2.5pp). Both achieve 53-56% token savings.
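
To read the deltas in the proxy table, it helps to see how coarse a 40-call cell is. The sketch below is plain arithmetic and not part of TSCG. It assumes the 480 calls split into 2 models x 3 tool counts x 2 conditions (baseline and proxy), which is an inference from the table rather than something stated here, and it uses a normal-approximation interval for a binomial proportion.

```typescript
// Plain arithmetic for interpreting the table; none of this is TSCG API.
const callsPerCell = 40; // "n=40 per cell" from the benchmark description
const modelCount = 2;
const toolCountLevels = 3;
const conditionCount = 2; // assumption: baseline and TSCG proxy are separate cells

const totalCalls = modelCount * toolCountLevels * conditionCount * callsPerCell;

// One call flipping between right and wrong moves a cell by this many percentage points.
const ppPerCall = 100 / callsPerCell;

// Half-width (in pp) of a 95% normal-approximation interval for accuracy p measured on n calls.
function ci95HalfWidthPp(p: number, n: number): number {
  return 1.96 * Math.sqrt((p * (1 - p)) / n) * 100;
}

console.log(totalCalls, ppPerCall, ci95HalfWidthPp(0.8, callsPerCell).toFixed(1));
// prints: 480 2.5 12.4
```

With 40 calls per cell, a single call is worth 2.5pp, so each -2.5pp Sonnet 4 delta corresponds to one call, well inside an interval of roughly ±12pp at 80% accuracy.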

Tool-Optimizer E2E validation (@tscg/tool-optimizer withTSCG() wrapper, 30 calls, Sonnet 4 @ 16 tools): withTSCG 86.7% vs baseline 80.0% (+6.7pp), 36.6% character savings.

Three Frontier-Model Operator Archetypes

The response to TSCG compression is model-specific. Three distinct archetypes were observed:

  • Opus 4.7 -- Operator-HUNGRY -- every operator contributes; balanced (all-8) is optimal
  • Sonnet 4 -- Operator-ROBUST -- config-agnostic; 6 of 7 configs near-identical accuracy
  • GPT-5.2 -- Operator-SENSITIVE -- CFL helps, CFO hurts; custom config optimal

External Validation -- 4 Independent Benchmarks

TSCG's internal benchmark (TAB -- Tool-Agentic Bench, ~19,000 calls) is independently corroborated by four external benchmarks, including industry-standard evaluation suites:

| Benchmark | Type | Result | Significance |
|---|---|---|---|
| BFCL (Berkeley Function Calling Leaderboard) | Industry standard | 108--181% ARR across 3 frontier models | Sonnet 4: 85.7%→93.2% (+7.5pp), GPT-4o: 31.7%→57.4% (+25.7pp), GPT-5.2: 61.9%→89.4% (+27.5pp) |
| ToolBench (Qin et al.) | Academic benchmark | +5.0pp (75.0%→80.0%) | Real-world tool catalog, 20 tools |
| API-Bank (Li et al.) | Academic benchmark | -5.0pp (80.0%→75.0%) | Honest negative result -- not all benchmarks improve |
| Real MCP Server (@modelcontextprotocol/server-filesystem) | Production endpoint | 100% syntactic validity | 30 tasks on live MCP server, server-acceptance 90--97% |

TAB → Real MCP Transfer (0.1pp): The internal TAB benchmark is not merely a self-constructed evaluation -- it demonstrably predicts real-world MCP behavior within 0.1 accuracy points. Sonnet 4 on 43-tool MCP: synthetic TAB delta = -1.6pp vs real MCP delta = -1.7pp. This tight transfer validates TAB as a reliable proxy for production MCP deployments.

Mean across the 3 external catalog benchmarks: +2.5pp (80.2%→82.7%).
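
The mean quoted above can be reproduced from the per-benchmark figures. The sketch below assumes BFCL enters the average through its Sonnet 4 row (85.7%→93.2%); that choice is an inference that happens to match the stated 80.2%→82.7%, not something spelled out in the text.

```typescript
// Baseline and TSCG accuracies (%) as reported for the external benchmarks.
// Assumption: BFCL is represented by its Sonnet 4 result.
const externalBenchmarks = [
  { label: 'BFCL (Sonnet 4)', baseline: 85.7, tscg: 93.2 },
  { label: 'ToolBench', baseline: 75.0, tscg: 80.0 },
  { label: 'API-Bank', baseline: 80.0, tscg: 75.0 },
];

const mean = (xs: number[]): number => xs.reduce((a, b) => a + b, 0) / xs.length;

const meanBaseline = mean(externalBenchmarks.map((b) => b.baseline));
const meanTscg = mean(externalBenchmarks.map((b) => b.tscg));

console.log(
  meanBaseline.toFixed(1),
  meanTscg.toFixed(1),
  (meanTscg - meanBaseline).toFixed(1),
);
// prints: 80.2 82.7 2.5
```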

See paper for full methodology and per-benchmark analysis.

The Problem

Every LLM agent framework sends full JSON Schema definitions for every registered tool on every API call. Claude Code injects ~50,000 tokens of tool definitions per subprocess. At production scale (100K calls/day), the schema overhead alone costs >$30,000/month.
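
As a rough way to reason about schema overhead cost, the following back-of-the-envelope sketch multiplies tokens per call, call volume, and input price. The function name and every input value are hypothetical; the actual figure depends on the provider's pricing and on how many schema tokens a given deployment sends.

```typescript
// Back-of-the-envelope estimate; every number passed in below is a hypothetical input.
function monthlySchemaCostUsd(
  schemaTokensPerCall: number,
  callsPerDay: number,
  usdPerMillionInputTokens: number,
  daysPerMonth = 30,
): number {
  const tokensPerMonth = schemaTokensPerCall * callsPerDay * daysPerMonth;
  return (tokensPerMonth / 1_000_000) * usdPerMillionInputTokens;
}

// Hypothetical deployment: 5,000 schema tokens per call, 100K calls/day,
// and an assumed price of $3 per million input tokens.
const baselineUsd = monthlySchemaCostUsd(5_000, 100_000, 3);
console.log(baselineUsd); // 45000
```

Multiplying the result by an expected savings fraction gives an estimate of what compression would recover for the same assumptions.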

Worse: small models (4B--14B) cannot parse JSON-format tool schemas reliably at scale -- achieving 0--49% accuracy with >15 tools. This locks agentic capabilities behind expensive frontier APIs.

Key Results

Pareto Dominance: Better Accuracy AND Fewer Tokens

BFCL (Berkeley Function Calling Leaderboard) validation -- the industry standard for tool-calling evaluation:

| Model | Without TSCG | With TSCG | Improvement | Token Savings |
|---|---|---|---|---|
| Claude Sonnet 4 | 85.7% | 93.2% | +7.5pp | 46.8% |
| GPT-4o | 31.7% | 57.4% | +25.7pp (181% ARR) | 2.6% |
| GPT-5.2 | 61.9% | 89.4% | +27.5pp (144% ARR) | 8.3% |

Every model improves. TSCG achieves 108--181% Accuracy Retention Rate -- it doesn't just retain accuracy, it increases it.
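
The ARR figures are consistent with a simple ratio of the two accuracy columns. The sketch below assumes ARR = (accuracy with TSCG / accuracy without TSCG) x 100; this definition is inferred from the reported numbers, and the paper remains the authoritative source.

```typescript
// Assumption: ARR = (accuracy with TSCG / accuracy without TSCG) * 100.
function accuracyRetentionRate(withoutTscg: number, withTscg: number): number {
  return (withTscg / withoutTscg) * 100;
}

// BFCL accuracies (%) as reported for the three frontier models.
const bfclRows = [
  { model: 'Claude Sonnet 4', before: 85.7, after: 93.2 },
  { model: 'GPT-4o', before: 31.7, after: 57.4 },
  { model: 'GPT-5.2', before: 61.9, after: 89.4 },
];

for (const row of bfclRows) {
  const arr = accuracyRetentionRate(row.before, row.after);
  console.log(`${row.model}: ${arr.toFixed(1)}% ARR`);
}
// prints:
// Claude Sonnet 4: 108.8% ARR
// GPT-4o: 181.1% ARR
// GPT-5.2: 144.4% ARR
```

Truncated to whole percents, these give 108%, 181%, and 144%, which matches the quoted 108--181% range.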

Small Model Enablement

| Model | JSON Baseline (20 tools) | With TSCG | Recovery |
|---|---|---|---|
| Phi-4 14B | 0% | 84.4% | +84.4pp |
| Mistral 7B | 35% | 80.1% | +45.1pp |
| Gemma 3 4B | 49.9% | 67.0% | +17.1pp |

Seven small models (4B--14B) that achieve 0--49% accuracy on JSON tools recover to 65--90% with TSCG. The root cause: JSON format, not model capacity (R^2 = 0.88 against JSON baselines, collapses to 0.03 against text -- 97% of variance is format sensitivity).

Full Benchmark Summary

From ~19,000 API calls across 12 models (4B--32B + 3 frontier APIs), 5 scenarios:

| Finding | Detail |
|---|---|
| Token savings | 50--72% on tool schemas |
| BFCL validation | 108--181% Accuracy Retention Rate |
| Formal guarantee | >=51% savings on any well-formed schema (Theorem 3.1) |
| Predictive model | R^2 = 0.88 predicts TSCG benefit from single baseline measurement |
| Speed | 50 tools in 2.4ms (Node.js v24, commodity hardware) |
| Cost at scale | >$30,000/month savings at 100K calls/day |

Verified Performance (Fresh Install)

Independent reproduction on @tscg/core from npm:

| Metric | Measured |
|---|---|
| 5 realistic tools (Claude target) | 59.5% token savings |
| 50 tools | 66.6% savings in 2.4ms |
| Compression time (5 tools) | 0.9ms |
| Unit tests | 108 passing (core 47 + proxy 61) |
| Bundle | 34.7KB (11.7KB gzipped) |
| Dependencies | 0 |

What TSCG Does

TSCG applies 8 formally defined transforms grounded in how causal transformers process tokens:

| Principle | Full Name | What It Does |
|---|---|---|
| TAS | Tokenizer-Aligned Syntax | Optimizes for BPE boundaries |
| CFL | Constraint-First Layout | Exploits the attention sink at position 0 |
| CFO | Causal-Flow Ordering | Orders operations into causal chains |
| SDM | Semantic Density Maximization | Removes 104+ filler patterns |
| DRO | Delimiter-Role Optimization | Converts verbose phrases to compact delimiters |
| CCP | Closure-Context Preservation | Appends closure block for recency bias |
| CAS | Causal Access Scoring | Scores and reorders by parameter fragility |
| SAD-F | Selective Anchor Duplication | Budget-constrained anchor duplication |

Quick Start

All three @tscg/* packages use umbrella versioning -- same version number, released together.

npm install @tscg/core                # Core compression engine
npm install @tscg/mcp-proxy           # Transparent MCP middleware
npm install @tscg/tool-optimizer      # LangChain / Vercel AI SDK integrations

import { compress } from '@tscg/core';

const tools = [
  {
    type: 'function',
    function: {
      name: 'get_weather',
      description: 'Get the current weather for a location',
      parameters: {
        type: 'object',
        properties: {
          location: { type: 'string', description: 'City name or coordinates' },
          units: { type: 'string', enum: ['celsius', 'fahrenheit'] },
        },
        required: ['location'],
      },
    },
  },
];

const result = compress(tools, { model: 'claude-sonnet' });
console.log(result.compressed);
// => "get_weather(location:str units?:str[celsius|fahrenheit])|Get current weather"
console.log(`Saved ${result.metrics.tokens.savingsPercent}% tokens`);
// => "Saved 62.3% tokens"
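
To make the compact notation in the example output easier to read, here is a toy converter that produces a similar signature from an OpenAI-style tool definition. It is an illustration only and not the @tscg/core implementation: the names `toySignature` and `SHORT_TYPES` are hypothetical, none of the eight principles is applied, and the description is passed through unchanged rather than shortened.

```typescript
// Toy illustration only: NOT the @tscg/core algorithm.
type JsonProp = { type: string; description?: string; enum?: string[] };
type ToolDef = {
  type: 'function';
  function: {
    name: string;
    description: string;
    parameters: {
      type: 'object';
      properties: Record<string, JsonProp>;
      required?: string[];
    };
  };
};

// Hypothetical abbreviations for JSON Schema primitive types.
const SHORT_TYPES: Record<string, string> = {
  string: 'str',
  number: 'num',
  integer: 'int',
  boolean: 'bool',
};

function toySignature(tool: ToolDef): string {
  const { name, description, parameters } = tool.function;
  const required = new Set(parameters.required ?? []);
  const params = Object.entries(parameters.properties).map(([key, prop]) => {
    const optionalMark = required.has(key) ? '' : '?'; // "?" marks optional parameters
    const shortType = SHORT_TYPES[prop.type] ?? prop.type;
    const choices = prop.enum ? `[${prop.enum.join('|')}]` : '';
    return `${key}${optionalMark}:${shortType}${choices}`;
  });
  return `${name}(${params.join(' ')})|${description}`;
}

const weatherTool: ToolDef = {
  type: 'function',
  function: {
    name: 'get_weather',
    description: 'Get the current weather for a location',
    parameters: {
      type: 'object',
      properties: {
        location: { type: 'string', description: 'City name or coordinates' },
        units: { type: 'string', enum: ['celsius', 'fahrenheit'] },
      },
      required: ['location'],
    },
  },
};

console.log(toySignature(weatherTool));
// prints: get_weather(location:str units?:str[celsius|fahrenheit])|Get the current weather for a location
```

The real output is shorter still ("Get current weather"), since the library also rewrites descriptions.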

Result Object

const result = compress(tools, { model: 'claude-sonnet', profile: 'balanced' });

result.compressed                        // string — compressed tool definitions
result.metrics.tokens.original           // number — original token count
result.metrics.tokens.compressed         // number — compressed token count
result.metrics.tokens.savingsPercent     // number — e.g. 62.3
result.metrics.compressionTimeMs         // number — e.g. 0.9
result.appliedPrinciples                 // string[] — e.g. ['SDM', 'CAS', 'DRO', 'TAS']
result.metrics.perTool                   // { name, originalTokens, compressedTokens, savingsPercent }[]
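
For consumers who want static typing around these fields, the shape can be written down as an interface. The sketch below is inferred from the field list above; the interface names and the `weakestTools` helper are hypothetical, and the types actually exported by @tscg/core may differ.

```typescript
// Shape inferred from the documented fields; names here are hypothetical.
interface PerToolMetricsSketch {
  name: string;
  originalTokens: number;
  compressedTokens: number;
  savingsPercent: number;
}

interface CompressResultSketch {
  compressed: string;
  appliedPrinciples: string[];
  metrics: {
    tokens: { original: number; compressed: number; savingsPercent: number };
    compressionTimeMs: number;
    perTool: PerToolMetricsSketch[];
  };
}

// Example consumer: list the tools whose savings fall below a threshold.
function weakestTools(result: CompressResultSketch, minSavingsPercent: number): string[] {
  return result.metrics.perTool
    .filter((tool) => tool.savingsPercent < minSavingsPercent)
    .map((tool) => tool.name);
}
```

A helper like this can flag tools whose schemas compress poorly and may deserve a manual look.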

Options

compress(tools, {
  model: 'claude-sonnet',   // Target model: 'claude-sonnet' | 'gpt-4o' | 'gpt-4' | ...
  profile: 'balanced',      // Profile: 'conservative' | 'balanced' | 'aggressive' | 'auto'
});

Description-Only Mode (v1.4.0)

Compress only .description fields while preserving the full JSON Schema structure -- compatible with native tool-calling APIs (OpenAI, Anthropic, Google):

import { compressDescriptions } from '@tscg/core';

const result = compressDescriptions(tools, { model: 'claude-sonnet' });
console.log(result.tools);              // Tools with compressed descriptions
console.log(result.metrics.descriptions.savingsPercent); // ~25-40% description savings
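
The idea behind description-only mode, shortening text while leaving the schema structure intact, can be sketched with a toy transform. This is not how `compressDescriptions` is implemented: the three filler patterns and the function names below are hypothetical stand-ins, and only the top-level tool description is touched.

```typescript
// Toy illustration only: NOT the @tscg/core implementation.
type Tool = {
  type: 'function';
  function: { name: string; description: string; parameters: unknown };
};

// Hypothetical filler patterns; the library's real pattern list is not reproduced here.
const FILLERS: RegExp[] = [/\bthe\b/gi, /\bplease\b/gi, /\bthis tool\b/gi];

function shortenDescription(text: string): string {
  let out = text;
  for (const pattern of FILLERS) out = out.replace(pattern, '');
  return out.replace(/\s+/g, ' ').trim(); // collapse the gaps left by removed words
}

// Returns new tool objects; name and parameters are passed through unchanged.
function toyCompressDescriptions(tools: Tool[]): Tool[] {
  return tools.map((tool) => ({
    ...tool,
    function: {
      ...tool.function,
      description: shortenDescription(tool.function.description),
    },
  }));
}

const shortened = toyCompressDescriptions([
  {
    type: 'function',
    function: {
      name: 'get_weather',
      description: 'Get the current weather for a location',
      parameters: { type: 'object', properties: {} },
    },
  },
]);
console.log(shortened[0].function.description);
// prints: Get current weather for a location
```

Because the JSON Schema under `parameters` is untouched, the result can still be handed to a native tool-calling API.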

Auto Profile (v1.4.0)

The auto profile selects compression principles based on catalog size. At >=30 tools, CFL/CFO are automatically disabled (they become harmful at scale per our 100-tool benchmark findings):

compress(tools, { model: 'claude-sonnet', profile: 'auto' });
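
The documented threshold rule can be pictured as a small selection function. This is a sketch of the rule as described, not the library's code: `sketchAutoPrinciples` is a hypothetical name, and the assumption that all eight principles stay enabled below 30 tools is not confirmed by the text.

```typescript
// The eight principle names listed under "What TSCG Does".
const ALL_PRINCIPLES = ['TAS', 'CFL', 'CFO', 'SDM', 'DRO', 'CCP', 'CAS', 'SAD-F'] as const;
type Principle = (typeof ALL_PRINCIPLES)[number];

// Sketch of the documented rule: at >= 30 tools, CFL and CFO are disabled.
// Assumption: below the threshold, all eight principles remain enabled.
function sketchAutoPrinciples(toolCount: number): Principle[] {
  const disabledAtScale: Principle[] = ['CFL', 'CFO'];
  return toolCount >= 30
    ? ALL_PRINCIPLES.filter((p) => !disabledAtScale.includes(p))
    : [...ALL_PRINCIPLES];
}

console.log(sketchAutoPrinciples(16).length, sketchAutoPrinciples(50).join(','));
// prints: 8 TAS,SDM,DRO,CCP,CAS,SAD-F
```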

Packages

| Package | Description | Install |
|---|---|---|
| @tscg/core | Core compression engine (8 operators) | npm i @tscg/core |
| @tscg/mcp-proxy | MCP stdio proxy -- transparent TSCG compression for any MCP server | npm i @tscg/mcp-proxy |
| @tscg/tool-optimizer | LangChain, MCP, Vercel AI SDK integrations | npm i @tscg/tool-optimizer |

CLI

# Compress tool schemas
npx tsx cli/tscg.ts compress --input tools.json --model claude-sonnet --profile balanced

# Run benchmarks
npx tsx cli/tscg.ts benchmark --model claude-sonnet

# Show compression info
npx tsx cli/tscg.ts info

MCP Proxy

@tscg/mcp-proxy sits between Claude Code (or any MCP client) and your MCP tool servers, transparently compressing tool schemas:

# Opus 4.7 -- 57-63% savings, +2.5 to +7.5pp accuracy
npx @tscg/mcp-proxy --target=claude-opus-4-7 --server=<your-mcp-command>

# Sonnet 4 -- 57-63% savings, robust accuracy
npx @tscg/mcp-proxy --target=claude-sonnet-4 --server=<your-mcp-command>

Setting --target automatically enables the full compression pipeline validated by our 720-call benchmark. No other flags required.

Legacy mode (backward compatible with v1.0.x):

npx @tscg/mcp-proxy --server=<your-mcp-command>

Integrations

LangChain:

import { withTSCG } from '@tscg/tool-optimizer/langchain';
const optimizedAgent = withTSCG(agent);

Vercel AI SDK:

import { tscgMiddleware } from '@tscg/tool-optimizer/vercel';

TSCG vs Other Approaches

| Property | TSCG | LLMLingua-2 | DSPy / SAMMO |
|---|---|---|---|
| Accuracy effect | Improves (108--181% ARR) | Degrades (-5 to -20%) | Degrades |
| Speed | 2.4ms / 50 tools | ~42s (GPU) | Minutes |
| Dependencies | None | GPU + ML framework | API calls |
| Deterministic | Yes | No | No |
| Formal guarantees | >=51% savings | None | None |
| Bundle size | 34.7KB | Requires PyTorch | Full stack |
| Works offline | Yes | GPU required | API required |

Who Benefits

  • Claude Code / Cursor / Windsurf users: ~35K fewer tokens per subprocess
  • Local LLM users (Ollama): 7B models become functional tool-use agents with 50+ tools
  • Production API deployments: >$30,000/month savings at 100K calls/day
  • Multi-agent orchestration: Savings multiply per sub-agent in the chain
  • Edge / Mobile / Privacy: EU AI Act compliant local deployment becomes viable

Project Structure

packages/
  core/             # @tscg/core — compression engine (8 operators, 47 tests)
  mcp-proxy/        # @tscg/mcp-proxy — stdio proxy for MCP servers (61 tests)
  tool-optimizer/   # @tscg/tool-optimizer — LangChain, Vercel AI SDK integrations
paper/              # LaTeX source (arXiv version)
cli/                # Unified CLI (compress, benchmark, analyze, info)
benchmark/          # TAB benchmark harness, analysis code, raw data
integrations/       # Framework integration examples
docs/               # Technical documentation

Development

git clone https://github.com/SKZL-AI/tscg.git
cd tscg
npm install
npm run build
npm test          # 459 tests
npm run typecheck # Type checking

Paper

TSCG: Deterministic Tool-Schema Compilation for Agentic LLM Deployments

Furkan Sakizli. 2026.

TSCG-paper.pdf -- arXiv preprint (full version, 12 models, ~19,000 API calls, 4-class taxonomy)

LaTeX source is available in paper/.

Citation

@article{sakizli2026tscg,
  title={TSCG: Deterministic Tool-Schema Compilation for Agentic LLM Deployments},
  author={Sakizli, Furkan},
  year={2026},
  note={arXiv preprint}
}

Contributing

See CONTRIBUTING.md for development setup, code style, and PR guidelines.

License

MIT
