kord
Health Warn
- No license — Repository has no license file
- Description — Repository has a description
- Active repo — Last push 0 days ago
- Low visibility — Only 5 GitHub stars
Code Pass
- Code scan — Scanned 2 files during light audit, no dangerous patterns found
Permissions Pass
- Permissions — No dangerous permissions requested
No AI report is available for this listing yet.
Kord is an ultra-fast, zero-dependency CLI utility written in Go, designed to package entire codebases into a single streamed XML output. It is engineered explicitly to process massive repositories with minimal memory usage and high throughput, making it the perfect tool to prepare codebase context for LLM & Vector Database
Kord By Siranta
Kord is an ultra-fast, zero-dependency CLI utility written in Go, designed to package entire codebases into a single streamed XML output. It is engineered explicitly to process massive repositories with minimal memory usage and high throughput, making it the perfect tool to prepare codebase context for Large Language Models (LLMs) or vector databases.
Architecture Overview
Kord operates as a streaming ingestion pipeline. Unlike traditional tools that load entire directories or large file arrays into memory before outputting, Kord traverses the filesystem and writes XML tokens sequentially to the output stream. This design ensures that memory consumption remains virtually constant, regardless of the repository size.
[Filesystem Walk] ➔ [Ignore Engine] ➔ [Filters (Binary/SVG/Size)] ➔ [Stream XML Encoder] ➔ [Output / stdout]
Key Features
- Streaming XML Output: Processes and encodes files on-the-fly.
- Zero-Regex Ignore Engine: Groups rules into high-performance buckets evaluated using rapid $O(1)$ map lookups and simple string comparisons instead of heavy regular expression patterns.
- Binary File Detection: Reads the first 512 bytes of files to check for a null byte (
0x00), skipping binary assets dynamically without bloating the stream. - Flexible Size Constraints: Skips content for files exceeding configured limits, but prints their file path tag with an explanation, keeping the LLM informed of their existence.
- Interactive Setup Wizard: Provides a step-by-step interactive CLI mode.
- Static Compilation & No Dependencies: Relies solely on the Go standard library, compiling down to a single standalone binary.
Output XML Schema
The output is wrapped in a <repository> root tag. Each file is encapsulated in a <file> tag, with its relative file path stored in the path attribute, and its raw content wrapped safely in a CDATA block.
<?xml version="1.0" encoding="UTF-8"?>
<repository>
<!-- Normal file with content -->
<file path="main.go"><![CDATA[package main...]]></file>
<!-- Exceeds file size limit (content skipped to save context space) -->
<file path="large_data.json" omitted="size_limit_exceeded"></file>
<!-- SVG file (vector coordinates skipped to prevent LLM bloat/hallucination) -->
<file path="assets/logo.svg" omitted="svg_bloat_omitted"></file>
</repository>
Usage
Kord can be executed directly as a command-line tool, running against the current directory by default, or configured via flags.
1. Basic CLI Commands
# Run against the current directory and output to stdout
kord
# Run against a specific directory
kord -dir /path/to/project
# Specify a custom size limit (e.g., 20KB base size limit instead of 50KB)
kord -max-size 20000
# Specify a custom ignore file
kord -ignore custom.gitignore
# Pipe the output directly to a file
kord -dir . > codebase.xml
2. Interactive Wizard Mode
To run an interactive guide that prompts you for the target directory and output location, run:
kord start
Command Flags
| Flag | Type | Default | Description |
|---|---|---|---|
-dir |
string |
. |
The target directory to traverse. |
-ignore |
string |
.gitignore |
The path to the ignore rules file to parse. |
-max-size |
int64 |
50000 |
The maximum size (in bytes) allowed for standard file contents before they are omitted. |
The Ignore Engine
To ensure maximum walk speed, Kord uses a specialized IgnoreEngine that categorizes rules into four quick-check buckets:
- Exact Directories: Directories skipped immediately during the walk (prevents traversing subdirectories).
- Exact Files: Exact file names to skip.
- Suffixes: Matches patterns starting with
*(e.g.,*.log,*.png). - Prefixes: Matches patterns ending with
*(e.g.,temp*).
Hardcoded Default Ignores
The engine automatically ignores heavy, metadata, and compilation folders/files even if they are not listed in your .gitignore file:
- Directories:
.git,node_modules,vendor,.next,dist,build,.gradle,venv,.venv,target,obj,__pycache__,.dart_tool,Pods. - Suffixes:
.png,.jpg,.jpeg,.gif,.ico,.webp,.lock,go.sum,.min.js,.min.css,.map,.exe,.dll,.so,.dylib,.bin,.zip,.tar.gz,.rar,.7z,.pdf,.pyc,.class.
Ignore Engine Limitations
[!NOTE]
Since the ignore engine does not use complex regular expressions or absolute-path resolution to stay fast:
- Matches are performed against the base name (filename or directory name) of each path.
- Complex nested glob rules (like
app/src/*.test.jsordir/**/*.log) are not fully evaluated. Only the base name components or prefixes/suffixes (e.g.*.test.js) are evaluated.
Traversal & Filtering Logic
When traversing a directory, Kord applies the following filters to each path:
1. Loop Prevention
If you output the streaming XML to a file located inside the directory being traversed (e.g. kord > output.xml), Kord compares the file descriptor (FileInfo) of the output writer with each walked file. It automatically skips reading its own output file to prevent an infinite recursive write loop.
2. Suffix Size Multipliers
For human-readable documentation files, Kord automatically extends the file size limit by 10x of the configured -max-size flag:
- Markdown/Text Files (
.md,.txt,.mdx): Limit =-max-size$\times$ 10. - All Other Files: Limit =
-max-size.
Files exceeding this threshold are written as empty tags with omitted="size_limit_exceeded".
3. SVG Optimization
SVG images contain extensive XML coordinates which are unreadable by LLMs and inflate context length. Kord intercepts .svg files, skips reading their contents, and writes them with omitted="svg_bloat_omitted" to ensure the LLM knows the file is present without receiving the bloat.
4. Binary File Check
Before reading any file, Kord pulls the first 512 bytes. If a null byte (0x00) is found, the file is classified as binary (e.g., compressed data, compiled binaries, non-UTF-8 assets) and is completely excluded from the XML output.
Developer Guide
Prerequisites
- Go 1.25 or higher
Running Tests
We maintain a comprehensive suite of tests covering ignore rules, XML streaming format, size limit multipliers, binary filters, and self-loop detection.
# Run all tests in verbose mode
go test -v ./...
Compilation
You can cross-compile Kord binaries statically across multiple platforms. Run the provided script:
# Run build script (Unix/WSL/Git Bash)
./build.sh
This compiles statically linked binaries without debug symbols (-ldflags="-s -w") to minimize size, saving output files to the bin/ directory:
bin/kord-linux-amd64bin/kord-linux-arm64bin/kord-darwin-amd64bin/kord-darwin-arm64bin/kord-windows-amd64.exe
Reviews (0)
Sign in to leave a review.
Leave a reviewNo results found