nullsec-s1

mcp
Security Audit
Warn
Health Warn
  • No license — Repository has no license file
  • Description — Repository has a description
  • Active repo — Last push 0 days ago
  • Community trust — 26 GitHub stars
Code Pass
  • Code scan — Scanned 12 files during light audit, no dangerous patterns found
Permissions Pass
  • Permissions — No dangerous permissions requested

No AI report is available for this listing yet.

SUMMARY

Security-native LLM system for AI-generated application security.

README.md

Nullsec S1

Nullsec-S1

Open-source security model purpose-built to audit AI-generated apps, agents, MCP tools, Web3 flows, and vibecoded software.

tests
release
huggingface
safety layer
benchmark
python

Nullsec-S1 returns final structured JSON security audits: findings, severity, exploit scenario, recommended fix, secure patch, and a deterministic Safety Layer decision.

Quick links: GitHub Release · Hugging Face adapter · Eval docs · Quickstart

Nullsec-S1 RC2/v1.1 ships as a PEFT / QLoRA adapter. The source repo contains training code, corpus, benchmark harness, inference code, and validation gates. The trained adapter is intentionally not committed to git.

State Location Meaning
Source checkout main training pipeline, corpus, benchmark code, docs
GitHub Release v1.0.0-rc25 source of record for adapter, benchmark reports, metrics, pipeline log
Hugging Face Trynullsec/nullsec-s1 public PEFT / QLoRA adapter mirror and model discovery page
Base model Qwen/Qwen2.5-Coder-7B-Instruct required separately to load the adapter
Local claim validation downloaded/unpacked release artifacts source-only checkout may show artifact-gated claims as unavailable

Benchmark performance

Nullsec-S1 was evaluated on the Nullsec RC2/v1.1 111-case security benchmark for AI-generated applications, agents, MCP tools, Web3 flows, and common application-security failure modes.

On this benchmark, Nullsec-S1 ranked #1 by F1 score against the compared baselines, while keeping false-safe rate at 0.0% and maintaining substantially lower hallucination/noise than hosted frontier API baselines.

Rank System / Tool Evaluated / Analyzable Precision Recall F1 Score False-Safe Rate Hallucination Rate
#1 Nullsec-S1 110 / 111 94.2% 90.7% 0.9245 0.0% 6.7%
#2 OpenAI/Codex gpt-5.3-codex 105 / 111 61.7% 88.0% 0.7252 0.0% 60.0%
#3 Claude Opus 4.8 68 / 111 88.9% 51.9% 0.6550 0.0% 14.3%
#4 Semgrep local rules baseline 111 / 111 86.3% 40.7% 0.5535 56.3% 33.3%
#5 Qwen2.5-Coder-7B base model 4 / 111 33.3% 0.9% 0.0180 0.0% 50.0%

Why this matters:
Nullsec-S1 is not just the base model prompted differently. The adapter was trained to produce structured, security-specific JSON verdicts with stronger format adherence, higher recall, higher precision, and lower hallucination on this benchmark.

Important scope:
These results are measured on the Nullsec RC2/v1.1 111-case benchmark. They do not guarantee universal vulnerability detection or replace independent security review. 111/111 raw outputs were produced by the release benchmark; 110/111 above refers to analyzable/scored structured outputs in the comparison report.

Why Nullsec-S1 exists

AI-generated software is moving faster than traditional security review. General models can explain code, but they often struggle to emit consistent, schema-valid security verdicts that can be enforced in CI or agent workflows.

Nullsec-S1 is adapter-aligned for security-specific JSON audit outputs. It focuses on:

  • BROKEN_AUTH
  • UNSAFE_ADMIN_ROUTE
  • EXPOSED_SECRET
  • ENVIRONMENT_EXPOSURE
  • MCP_TOOL_ABUSE
  • COMMAND_INJECTION
  • SSRF
  • XSS
  • MISSING_RATE_LIMIT
  • SMART_CONTRACT_RISK
  • WALLET_TRANSACTION_RISK
  • UNSAFE_FILE_UPLOAD
  • SQL_INJECTION
  • PROMPT_INJECTION
  • DANGEROUS_SHELL_COMMAND
  • DEPENDENCY_RISK

What makes it different from general models

  • The base Qwen model mostly failed to produce scorable Nullsec-style JSON security verdicts in this benchmark.
  • Hosted frontier API baselines were stronger than base Qwen, but had lower recall or higher hallucination/noise on this benchmark.
  • Nullsec-S1 is trained to return structured security verdicts, not free-form commentary.
  • The release is local and reproducible: base model + PEFT adapter + deterministic Safety Layer.

2–5 minute quickstart

Use either the GitHub Release artifact for the full release bundle or the Hugging Face adapter for the PEFT / QLoRA adapter. Users still need the base model Qwen/Qwen2.5-Coder-7B-Instruct.

python -m pip install -e ".[dev]"
python -m pip install -r requirements-train-cu121.txt

NULLSEC_ADAPTER_PATH=outputs/nullsec-s1-qlora \
python inference.py --file examples/unsafe-next-admin-route.ts

The command prints the final Safety-Layer-enforced JSON verdict. It does not print source code by default. If the model emits malformed output, inference.py returns a JSON error object and exits non-zero.

Concrete example

Input:

export async function POST(req: Request) {
  const { userId, role } = await req.json();
  await db.user.update({ where: { id: userId }, data: { role } });
  return Response.json({ ok: true });
}

Representative output shape:

{
  "risk_score": 70,
  "production_ready": false,
  "severity": "HIGH",
  "confidence": "HIGH",
  "reasoning_summary": "Privileged admin mutation is reachable without an authenticated role check.",
  "findings": [
    {
      "category": "UNSAFE_ADMIN_ROUTE",
      "severity": "HIGH",
      "file": "examples/unsafe-next-admin-route.ts",
      "description": "Admin role update route has no session/role check.",
      "recommended_fix": "Require an authenticated admin session before mutating roles."
    }
  ],
  "_safety_layer": {
    "production_ready": false,
    "blocking_reasons": ["R2: dimension 'permissions' failed its check"],
    "adjustments": []
  }
}

This is illustrative, not a benchmark output.

Install / run options

Workflow Command / docs
Local adapter inference python inference.py --file examples/unsafe-next-admin-route.ts
Hugging Face adapter loading Trynullsec/nullsec-s1 + Qwen/Qwen2.5-Coder-7B-Instruct
Benchmark reproduction python benchmarks/run_all.py --mode model --adapter outputs/nullsec-s1-qlora
Semgrep baseline python benchmarks/baselines/semgrep_baseline.py
Hosted API baselines benchmarks/baselines/claude_api.py, benchmarks/baselines/openai_api.py
Release validation python scripts/validate_claims.py --adapter ... --report ... --check

Evaluation methodology

  • 111 security benchmark cases
  • 16 security categories
  • metrics: precision, recall, F1, false-safe rate, hallucination rate
  • comparisons against base Qwen, Semgrep local rules, Claude, and OpenAI/Codex

Details: docs/EVALS.md.

Quick Verification

After downloading and unpacking the release artifacts locally:

python scripts/validate_claims.py \
  --adapter outputs/nullsec-s1-qlora \
  --report releases/nullsec-1.0/benchmark/SUITE.json \
  --check

This verifies that local public claims match the downloaded adapter, benchmark report, safety probes, and release bundle on disk. A source-only checkout may show artifact-gated claims as unavailable until those release assets are unpacked locally.

Model Architecture

Component RC2/v1.1
Base model Qwen/Qwen2.5-Coder-7B-Instruct
Adapter PEFT / QLoRA adapter, mirrored at Trynullsec/nullsec-s1
Adapter path outputs/nullsec-s1-qlora
Weight format adapter_model.safetensors confirmed in the v1.0.0-rc25 release artifact
Tokenizer tokenizer files in the adapter directory when present, otherwise base tokenizer
Chat template release artifact includes chat_template.jinja; inference uses the tokenizer chat template
Reasoning format no custom hidden reasoning token loop; no <thought> parser
Output final structured JSON security audit, then deterministic Safety Layer enforcement

S1 means Security-1. It is not a reasoning-trace model; it returns the final structured audit result.

Architecture

Nullsec S1 is a pipeline, not a single model call. A security-tuned model proposes a verdict; two deterministic layers align and enforce it before anything is trusted.

flowchart TD
    A["AI-generated app / repo / PR / MCP tool / wallet flow"] --> B["Nullsec S1 reasoning pipeline<br/>(security-tuned model: detect · classify · explain · patch)"]
    B -->|raw output| C["Structured JSON verdict<br/>(verdict schema)"]
    C --> D["Security Alignment Layer<br/>parse · schema-validate · normalize severities"]
    D --> E["Nullsec Safety Layer<br/>deterministic enforcement R1–R6"]
    E --> F["Enforced verdict<br/>(production_ready computed, never trusted from the model)"]
    F --> G["Patch · Report · CI gate · API response"]

Plain-text view of the same flow:

AI-generated app / repo / PR / MCP tool / wallet flow
        │
        ▼
Nullsec S1 reasoning pipeline        (security-tuned model: detect · classify · explain · patch)
        │  raw output
        ▼
structured JSON verdict              (data/schemas/verdict.schema.json)
        │
        ▼
Security Alignment Layer             (parse · schema-validate · type-check · normalize severities)
        │  structurally-valid verdict
        ▼
Nullsec Safety Layer                 (deterministic enforcement: rules R1–R6, severity/risk flooring)
        │
        ▼
enforced verdict                     (production_ready recomputed, never trusted from the model)
        │
        ▼
patch · report · CI gate · API response

The model's own production_ready claim is advisory only. The Safety Layer recomputes it and allows true only when all eight check dimensions pass with no HIGH/CRITICAL finding:

auth · secrets · input_validation · rate_limits · permissions · dangerous_exec · dependency_risk · environment_exposure

Prompt and schema details: docs/PROMPT_FORMAT.md. Full design: docs/SYSTEM_OVERVIEW.md.


Core system components

Path What it is
corpus/ Curated training corpus — the single source of truth (authored/ + opt-in ingested/ + synthetic/).
taxonomy/ The 16-category security taxonomy mapped to 8 check dimensions (taxonomy.json).
nullsec/safety/ The Security Alignment Layer (alignment.py) + Nullsec Safety Layer (enforcement.py).
nullsec/core/ Reasoning pipeline (engine.py), verdict models, canonical prompts, version/fingerprint.
nullsec/ingest/ CVE/NVD, Semgrep, SARIF/CodeQL ingestion into the verdict contract.
training/ Dataset prep, QLoRA training, corpus validation, release threshold, preflight.
benchmarks/ Evaluation runners + adversarial Safety Layer probes.
scripts/validate_claims.py Public claim validator — the honesty gate.
scripts/release_candidate.py Release gate — builds a bundle only from real artifacts.
serving/ FastAPI serving layer (/v1/model, /v1/analyze, /v1/patch, streaming).
cli/ nullsec1 command-line analyzer + CI gate.
reports/ Corpus curation sprint reports (auditable provenance).
docs/ Technical documentation (system overview, safety layer, corpus, roadmap, non-claims).

What is live now vs coming next

Live now:

  • source repo
  • GitHub Release artifact
  • Hugging Face PEFT adapter
  • inference.py
  • benchmark suite
  • baseline comparison scripts
  • docs/EVALS.md

Coming next:

  • hosted scanner at s1.trynullsec.com
  • API backend
  • GitHub Action / PR guard
  • CLI hardening
  • larger benchmark suite
  • more framework coverage

Current verified state

The corpus exceeds the v1.0 and RC2/v1.1 data thresholds, the deterministic Safety Layer is enforced, and the trained RC2/v1.1 release artifacts are published as GitHub Release assets rather than committed to source.

This snapshot reflects the artifacts on disk right now. Every number below is produced by a command in this repo — none are hand-entered. Run the commands in Quickstart to reproduce them.

Fact Value Source command
Curated corpus 1,741 examples (1,304 hand-authored + 437 curated-ingested) training/dataset_stats.py --include-ingested
Train / eval split 1,393 train / 348 eval (eval_frac 0.2, seed 42) training/prepare_dataset.py --include-ingested
Taxonomy categories 16 categories → 8 check dimensions taxonomy/taxonomy.json
Per-category coverage every category ≥ 60 curated training/release_threshold.py --include-ingested --profile rc2
Safety Layer consistency 100% (1,741 / 1,741) training/dataset_stats.py --include-ingested
Benchmark suite 111 labeled cases across all 16 categories benchmarks/datasets/detection.json
Adversarial safety probes 8 / 8 blocked, 0 bypassed python -m benchmarks.safety_probes
Test suite passing pytest -q
Release threshold (v1.0) PASS training/release_threshold.py --include-ingested
Release threshold (v1.1 / RC2) PASS training/release_threshold.py --include-ingested --profile rc2

The honesty gate (scripts/validate_claims.py --check) ties public wording to local artifacts. To reproduce the release-asset claim state, unpack the GitHub Release bundle locally and run the Quick Verification command.


The Security Alignment Layer

The deterministic layer is the reason Nullsec S1 is a security system rather than a code model that emits opinions. It runs in two stages.

Stage 1 — Security Alignment Layer (nullsec/safety/alignment.py): extract the JSON object (tolerant of code fences, preamble, and trailing prose), validate it against the verdict schema, type-check it into the Verdict model, and normalize finding severities up to each category's taxonomy floor. Anything that cannot be aligned raises VerdictParseError instead of being guessed at.

Stage 2 — Nullsec Safety Layer (nullsec/safety/enforcement.py): take the structurally-valid verdict and deny production_ready if any of these hold:

Rule production_ready is denied when…
R1 any required dimension is not_checked
R2 any required dimension is fail
R3 any finding is HIGH or CRITICAL
R4 risk_score exceeds the production threshold (default 20)
R5 a finding contradicts a dimension reported as pass
R6 overall severity is HIGH or CRITICAL

It also raises (never lowers) severity and risk_score to match the worst finding, so the model cannot under-report. Because enforcement is deterministic and independent of the model, an attacker who manipulates the model — e.g. via prompt injection embedded in the code under review — still cannot obtain a false production_ready: true. This is verified by adversarial probes in benchmarks/safety_probes.py, including a prompt-injection-in-prose probe.

Deep dive: docs/SECURITY_ALIGNMENT_LAYER.md.


Quickstart

Local CPU machines can verify the corpus, the deterministic layers, and the safety probes — no GPU required.

python3.11 -m venv .venv
source .venv/bin/activate
python -m pip install --upgrade pip setuptools wheel
python -m pip install -e ".[dev]"

python training/prepare_dataset.py --include-ingested --out data/processed
pytest -q
python training/validate_corpus.py --include-ingested
python training/release_threshold.py --include-ingested
python scripts/validate_claims.py --check

Inspect model identity and the reproducible fingerprint at any time:

python -m nullsec.core.version

Corpus status

corpus/ is the single source of truth for training data. The current curated corpus is 1,741 examples (1,304 hand-authored + 437 curated-ingested), every taxonomy category has ≥ 60 curated examples, and Safety Layer consistency is 100% — so both the v1.0 and RC2/v1.1 data thresholds pass.

Provenance is tracked explicitly and never blurred:

  • hand_authored — original examples written for this repo (counts as curated).
  • curated_ingested — CVE / scanner / real-failure records that passed human review and source-provenance enforcement (counts as curated).
  • synthetic_variant — labeled, structure-preserving augmentations; never counts toward curated thresholds.

Raw and rejected candidates are tracked separately and are never training-eligible. The curation workflow, schema, and provenance rules are documented in docs/CORPUS.md, with auditable sprint reports in reports/.


Training workflow

The training targets are built from the corpus through the same alignment + safety layers used at serving time, so no malformed or gate-inconsistent verdict ever enters training.

# 1. build chat-formatted train/eval JSONL from the curated corpus
python training/prepare_dataset.py --include-ingested --out data/processed

# 2. confirm the corpus is genuinely v1.0-ready (exits non-zero until it is)
python training/release_threshold.py --include-ingested

# 3. (on a GPU box) preflight, then train the QLoRA adapter
python training/preflight_train.py
python training/train_qlora.py --config training/config.yaml

The release adapter was trained with QLoRA on Qwen/Qwen2.5-Coder-7B-Instruct (Apache 2.0). Training instructions remain in RELEASE_TRAINING.md, RUNPOD.md, and GPU_QUICKSTART.md.


Training on GPU

Local CPU machines can verify the corpus and the safety layer, but cannot realistically train the model. QLoRA training requires a CUDA-capable NVIDIA GPU.

The end-to-end pipeline (prepare → preflight → train → merge → benchmark → release → validate) is wrapped in one script:

bash scripts/run_training_pipeline.sh

A complete, beginner-friendly walkthrough — choosing a GPU box, disk requirements, environment setup, expected artifacts, and how to collect outputs — is in GPU_QUICKSTART.md.

training/preflight_train.py checks the GPU stack before you spend money: it exits 2 when no CUDA GPU is available (the expected result on a laptop), so you never start a doomed run.


Benchmark workflow for reproduction / development

The benchmark suite measures detection accuracy, false-safe rate, hallucination rate, OWASP coverage, patch correctness (structural), and a secure-generation score. It reports numbers only from real runs. The RC2/v1.1 real-model report ships as a GitHub Release asset under v1.0.0-rc25; generated benchmark reports are not committed to source.

# against the live model (GPU):
python benchmarks/run_all.py --mode model --adapter outputs/nullsec-s1-qlora

# against captured real outputs (no GPU); reports are marked replay-only:
python benchmarks/run_all.py --mode replay --replay path/to/captured.jsonl

A case with no model output is recorded as a real miss, never a synthetic pass. In a source-only checkout, artifact-gated claims remain unavailable until the trained adapter and release report are downloaded/unpacked locally.


Release pipeline for reproduction / development

The release pipeline is how maintainers reproduce a release bundle from real local artifacts:

python scripts/release_candidate.py --adapter outputs/nullsec-s1-qlora --dataset detection.json
python scripts/validate_claims.py --adapter outputs/nullsec-s1-qlora \
    --report releases/nullsec-1.0/benchmark/SUITE.json --check

release_candidate.py aborts (writing nothing) if the adapter is missing, the model fails to load, no outputs are produced, any report section is empty, or any Safety Layer probe is bypassed. The published RC2/v1.1 artifact already passed this path; running it again is a reproducibility workflow. The full path is documented in RELEASE_TRAINING.md.


Repo structure

README.md                 you are here
GPU_QUICKSTART.md          beginner-friendly GPU training walkthrough
RELEASE_TRAINING.md        training-to-release runbook
CONTRIBUTING.md            how to contribute (corpus, taxonomy, probes, docs)
SECURITY.md                vulnerability reporting & responsible disclosure
model_card/                Nullsec-1 model card (identity, intended use, limits)
taxonomy/                  16-category security taxonomy — single source of truth
corpus/                    curated training corpus (authored/ + ingested/ + synthetic/)
data/                      verdict schema (data/schemas) + processed datasets
training/                  prepare_dataset · train_qlora · merge_adapter · validate_corpus
                           · release_threshold · preflight_train · config.yaml
nullsec/
  core/                    reasoning pipeline, verdict models, prompts, version/fingerprint
  safety/                  Security Alignment Layer + Nullsec Safety Layer
  ingest/                  CVE/NVD, Semgrep, SARIF/CodeQL ingestion
serving/                   FastAPI serving layer
benchmarks/                benchmark suite + adversarial Safety Layer probes
scripts/                   release_candidate.py · validate_claims.py · run_training_pipeline.sh
examples/                  worked vulnerable cases + expected verdicts
releases/                  generated release bundles (real artifacts only; ships empty)
cli/                       nullsec1 CLI analyzer + CI gate
tests/                     deterministic-layer test suite (no GPU)
docs/                      architecture · system overview · safety layer · corpus · roadmap
.github/                   CI security gate · issue templates · PR template

Contributing

Contributions to the corpus, taxonomy, safety probes, benchmark runners, docs, and CLI/API are welcome. Corpus examples must include vulnerable code, an exploit scenario, a taxonomy category and severity, a real secure patch, complete checks_performed, the expected Safety Layer behavior, and an auditable provenance reference. See CONTRIBUTING.md for the full requirements and the curated-ingestion workflow.

Useful contribution areas:

  • benchmark cases
  • framework examples
  • scanner integrations
  • docs improvements

Honest scope

Results are benchmark-scoped to the Nullsec RC2/v1.1 111-case benchmark. Nullsec-S1 is not a replacement for human security review. A clean verdict reduces risk; it does not prove the absence of vulnerabilities.


Security

Please report vulnerabilities responsibly and never submit real secrets — use placeholders for any credential in examples or reports. See SECURITY.md.


License

Apache 2.0 — matching the Qwen2.5-Coder base model. See the license note in model_card/NULLSEC1.md.

Reviews (0)

No results found