ai-media-generator
Health: Passed
- License — License: MIT
- Description — Repository has a description
- Active repo — Last push 0 days ago
- Community trust — 11 GitHub stars
Code: Warning
- Code scan incomplete — No supported source files were scanned during light audit
Permissions: Passed
- Permissions — No dangerous permissions requested
Zero-skill cinema. Senior-director prompts on autopilot. A Claude Code Skill for high-quality AI image / video / music prompt crafting and browser-based execution across 14+ generative platforms.
ai-media-generator
Make professional AI video, images, and music with zero skill barrier, because Claude applies "senior-director-grade" prompts for you.
Zero-skill cinema. Senior-director prompts on autopilot.
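Just say "make a cinematic short of an ancient general charging on horseback." Claude will:
- Pick the right platform: Seedance for realistic martial-arts action? Veo 3.1 for native audio? Sora 2 for a bodycam look? The skill knows.
- Write the right prompt: not useless words like cinematic, 8k, beautiful, but tokens the platform actually responds to, such as Deakins, Lubezki, Kodak Vision3 500T, teal-orange grade, and a Constraints tail (no shake, no deformation…).
- Operate the site directly: send the prompt to OiiOii / Flow / Kling / Suno through the browser MCP, click through every button, and fetch the finished result.
Covers 14+ platforms: Midjourney v7, Flux 1.1 Pro, Sora 2, Veo 3.1, Kling 3.0, Seedance 2.0 pro, Suno v5, Runway Gen-4, Ideogram 3, Seedream, Nano Banana Pro, Vidu Q3, Stable Diffusion, OiiOii…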
A Claude Code Skill for zero-skill, senior-quality AI media generation across 14+ platforms.
You say "make a cinematic shot of an ancient general charging on horseback." Claude handles the rest — picking the right platform, writing the platform-specific signature prompt (no more beautiful, masterpiece, detailed filler — actual director / DP / film-stock / lens / Constraints-tail vocabulary, calibrated to what the target model actually parses), and driving the browser to submit it.
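Why do you need this skill?
Because knowing how to write prompts is a high barrier to entry, and every platform wants them written differently.
A typical user's prompt:
cinematic shot of an ancient general on a horse, 8k, beautiful, masterpiece, detailed
→ The platform gets no real direction, so the result always looks like AI stock imagery.
The prompt this skill writes (Seedance Chinese-language battle example):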
古代中國將軍 Aria 主角,身披金色鱗甲、紅披風在風中翻飛,手持青銅寶劍前舉,
騎黑色戰馬在黎明戰場上緩慢衝鋒。戰場遠處千軍列陣、紅色戰旗獵獵、塵霧漂浮。
金色逆光晨霧、熒光輪廓。低角度手持 tracking 鏡頭沿馬側跟拍,輕微晃動、穩定地平線。
墨色留白東方美學、史詩寫實電影感。慢動作、流暢、連貫、不僵硬、720p 高清。
不抖動、不變形、不多肢、穩定地平線、穩定時間一致性。
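→ Every detail the platform understands is in place: the 8-dimension formula (Subject + Action + Scene + Light + Camera + Style + Quality + Constraints), written in Chinese to match Seedance's Chinese training, with single-verb actions and a Constraints tail.
The difference is not how smart you are; it is whether you know which tokens each platform responds to. This skill takes that insider knowledge of how to write for each platform, distilled from cross-platform research (X / Threads / Reddit / 小紅書 / Bilibili plus official cookbooks) and two rounds of head-to-head benchmarking, and hands it to Claude.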
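The 8-dimension formula can be sketched as a small data structure. This is a minimal illustration, not the skill's actual implementation: the `PromptSlots` class, the `assemble_prompt` function, and the English slot values are all hypothetical, and joining the slots with a space is an assumption.

```python
from dataclasses import dataclass, fields


@dataclass
class PromptSlots:
    """The eight dimensions of the formula, declared in formula order."""
    subject: str
    action: str
    scene: str
    light: str
    camera: str
    style: str
    quality: str
    constraints: str


def assemble_prompt(slots: PromptSlots, sep: str = " ") -> str:
    """Join the eight slots in declaration order, rejecting empty ones."""
    parts = []
    for f in fields(slots):
        value = getattr(slots, f.name).strip()
        if not value:
            raise ValueError(f"missing slot: {f.name}")
        parts.append(value)
    return sep.join(parts)


# Hypothetical English paraphrase of the battle example, for illustration only.
example = PromptSlots(
    subject="An ancient general in golden scale armor and a red cape",
    action="charges slowly on a black warhorse.",
    scene="Dawn battlefield with distant ranks and red banners.",
    light="Golden backlight through morning mist.",
    camera="Low-angle handheld tracking shot along the horse's flank.",
    style="Epic realist film look with ink-wash negative space.",
    quality="Slow motion, fluid, 720p.",
    constraints="No shake, no deformation, no extra limbs, stable horizon.",
)
prompt = assemble_prompt(example)
print(prompt)
```

Keeping the slots as named fields makes a missing dimension fail loudly instead of silently producing a thinner prompt.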
Why this exists (English)
Generative-AI prompts are not portable. The same idea sent to Flux vs Midjourney vs Seedance produces wildly different quality, because each model was trained on different captioning conventions:
- Midjourney v7 loves comma-chunked detail + --style raw --stylize 750
- Flux strips out director names but rewards 80-200 word natural paragraphs with technical photography vocabulary
- Seedance 2.0 uses bracketed labels ([Style] [Scene] [Character] [Shot 1: 00:00-00:05]), eats lens focal lengths + format anchors (Sony A7S3, IMAX Fantasy Camera, Unreal Engine 5), supports native 4-modal audio (Sound design:) and bilingual lip-synced dialogue. Chinese preferred for martial arts (武打) / period costume (古裝) / romance / MV.
- Sora 2 wants "format anchors" (bodycam footage, surveillance) and quoted dialogue
- Veo 3.1 is the only model where SFX / Ambient tags actually generate audio
This skill captures those signatures — both from official cookbooks (OpenAI, Google Cloud, fal.ai, BFL) and from real-world community posts on X / Threads / Reddit / 小紅書 / Bilibili — into a single reference Claude can load on demand.
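As one concrete case, the Seedance bracketed-label signature described above could be generated along these lines. This is a sketch under assumptions: the function names are hypothetical, and placing the text directly after each label on its own line is a guess at the layout. Only the label names and the `[Shot 1: 00:00-00:05]` timecode format come from the description above.

```python
def timecode(seconds: int) -> str:
    """Format a second count as MM:SS."""
    minutes, secs = divmod(seconds, 60)
    return f"{minutes:02d}:{secs:02d}"


def seedance_prompt(style, scene, character, shots):
    """Build a bracket-labelled prompt.

    shots is a list of (duration_in_seconds, description) pairs; each shot
    starts where the previous one ended.
    """
    lines = [
        f"[Style] {style}",
        f"[Scene] {scene}",
        f"[Character] {character}",
    ]
    start = 0
    for index, (duration, description) in enumerate(shots, start=1):
        end = start + duration
        lines.append(
            f"[Shot {index}: {timecode(start)}-{timecode(end)}] {description}"
        )
        start = end
    return "\n".join(lines)


storyboard = seedance_prompt(
    style="Epic realism",
    scene="Dawn battlefield",
    character="General in golden armor",
    shots=[(5, "Low-angle tracking shot"), (5, "Close-up on the sword")],
)
print(storyboard)
```

Deriving each shot's start time from the previous shot's end keeps the timecodes contiguous without hand-editing them.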
What's inside
ai-media-generator/
├── SKILL.md # Top-level skill (auto-pilot + hard rules)
├── references/ # Platform-specific prompt guides (14 platforms)
│ ├── community-prompt-patterns.md # ⭐ cross-platform meta + per-model signatures
│ ├── cinematic-direction.md # advanced director / DP / film-stock vocabulary
│ ├── commercial-direction.md
│ ├── vfx-effects.md
│ ├── sound-design.md
│ ├── editing-transitions.md
│ ├── camera-language.md
│ ├── selector.md # which platform for which use case
│ ├── kling.md / seedance.md / sora.md / veo.md / vidu.md / runway.md
│ ├── midjourney.md / flux.md / ideogram.md / seedream.md / stable-diffusion.md
│ ├── nano-banana.md / suno.md / oiioii.md
├── templates/
│ ├── auto-pilot.md # one-line-to-output pipeline
│ ├── preset-packs.md # 30+ ready-made preset prompts
│ ├── storyboard.md
│ ├── music-video.md
│ ├── negative-bank.md
│ ├── user-flags.md # natural-language flag translator
│ └── token-efficient-mode.md # lazy-load / grep / subagent strategy
└── automation/ # browser automation protocols
    ├── browser-guide.md
    ├── click-protocol.md # reliable-click SOP + token optimization
    └── site-profiles/ # deep UI maps for verified platforms
        ├── oiioii.md # ✅ Phase 1-3E + Seedance §12.9 deep playbook
        ├── flow.md # ✅ Veo 3.1 complete
        ├── kling.md # ✅ Kling 3.0 complete
        ├── suno.md # ✅ Suno v5 complete
        └── (stubs for midjourney/seedream/runway/sora/vidu/ideogram)
Key concepts
🔴 The Meta Rule (priority order)
Writing the prompt correctly once ≫ submitting it fast 10 times.
A wrong prompt costs ~10 minutes of regeneration + token waste. A slow submit costs ~5 minutes. So the speed-optimization priority is:
- Look up the platform signature → references/community-prompt-patterns.md
- Optimize single submission → automation/site-profiles/<platform>.md
- Wait without polling → Bash run_in_background:true + sleep N
🔴 The Hard Rule
Every prompt must embed 5-8 high-signal tokens from the appropriate vocabulary layer (director / DP / lens / film-stock / lighting / grading / composition / VFX). Generic words (cinematic, 8k, beautiful, masterpiece) are banned — they dilute signal.
Platform-aware exception: Director names work on Midjourney / Sora 2 / Veo but get stripped by Flux / Nano Banana Pro. For Seedance / Wan, individual DP names are weak signal, but art-movement / brand-style names (Pixar / Ghibli / 90s anime / Fast and Furious / Bloom & Wild) do work. The skill bakes this nuance into the selection logic.
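The Hard Rule and its platform-aware exception could be checked mechanically along these lines. This is a rough sketch, not the skill's own logic: `check_prompt` and all of the sets below are hypothetical, the vocabulary is a tiny illustrative sample rather than the skill's real lists, and treating every named person as "stripped" on Flux / Nano Banana Pro is an assumption.

```python
import re

# Generic words the rule bans.
BANNED = {"cinematic", "8k", "beautiful", "masterpiece"}

# Illustrative sample only; the real vocabulary layers live in the references.
HIGH_SIGNAL = {
    "deakins", "lubezki", "kodak vision3 500t", "teal-orange grade",
    "golden backlight", "low-angle tracking shot",
}
PERSON_NAMES = {"deakins", "lubezki"}
# Assumption: names carry no signal on platforms that strip them.
STRIPS_PERSON_NAMES = {"flux", "nano banana pro"}


def check_prompt(prompt, platform):
    """Return (ok, banned_words_found, high_signal_tokens_counted)."""
    text = prompt.lower()
    words = set(re.findall(r"[a-z0-9]+", text))
    banned_found = sorted(BANNED & words)
    signal_found = sorted(token for token in HIGH_SIGNAL if token in text)
    if platform.lower() in STRIPS_PERSON_NAMES:
        signal_found = [t for t in signal_found if t not in PERSON_NAMES]
    ok = not banned_found and 5 <= len(signal_found) <= 8
    return ok, banned_found, signal_found


good = (
    "Ancient general charging on horseback, Deakins lighting, "
    "Lubezki long take, Kodak Vision3 500T, teal-orange grade, "
    "low-angle tracking shot"
)
print(check_prompt(good, "midjourney"))
print(check_prompt(good, "flux"))
```

The same prompt passes for Midjourney but falls below five counted tokens for Flux, which is the kind of per-platform difference the exception describes.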
🤖 Auto-Pilot mode
When a user says "make me a 15-second cinematic ad of X" the skill:
- Parses the request into 9 slots (medium / duration / aspect / subject / style / character / scene / audio / language)
- Fills defaults
- Selects the right platform via selector.md
- Auto-generates the storyboard + prompt
- Drives the browser via site-profiles/<platform>.md
- Reports back
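The first two steps (parse into 9 slots, then fill defaults) might look roughly like this. It is a minimal sketch under assumptions: the `Request` class and `fill_defaults` function are hypothetical, and every value in `DEFAULTS` is invented for illustration, since the skill's real defaults are not stated here.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class Request:
    """The nine slots a user request is parsed into; None means unspecified."""
    medium: Optional[str] = None
    duration: Optional[str] = None
    aspect: Optional[str] = None
    subject: Optional[str] = None
    style: Optional[str] = None
    character: Optional[str] = None
    scene: Optional[str] = None
    audio: Optional[str] = None
    language: Optional[str] = None


# Hypothetical defaults, for illustration only.
DEFAULTS = {
    "medium": "video",
    "duration": "10s",
    "aspect": "16:9",
    "audio": "none",
    "language": "en",
}


def fill_defaults(request: Request) -> Request:
    """Return a copy in which unspecified slots take their default, if any."""
    filled = {}
    for f in fields(request):
        value = getattr(request, f.name)
        filled[f.name] = value if value is not None else DEFAULTS.get(f.name)
    return Request(**filled)


# "make me a 15-second cinematic ad of X", parsed by hand for this example.
parsed = Request(medium="video", duration="15s", subject="X", style="cinematic ad")
complete = fill_defaults(parsed)
print(complete)
```

Slots with neither a user value nor a default stay `None`, so later steps can tell "unspecified" apart from an explicit choice.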
Installation
Drop the folder into your Claude Code skills directory:
# Per-user
git clone https://github.com/<your-org>/ai-media-generator.git ~/.claude/skills/ai-media-generator
# Or project-level
git clone https://github.com/<your-org>/ai-media-generator.git ./.claude/skills/ai-media-generator
Claude Code auto-discovers SKILL.md files inside skills directories.
Requirements
- Claude Code (CLI) or any client that supports the Skills format.
- For browser automation: the claude-in-chrome MCP server (extension) connected.
- For specific platforms you want to drive: a logged-in account on that platform in the same browser session.
Verified platforms (browser automation)
| Platform | Site profile status | Notes |
|---|---|---|
| OiiOii.ai | ✅ Verified | Phase 1-3E + Seedance 2.0 pro deep playbook |
| Google Flow (Veo 3.1) | ✅ Verified | Fast / Quality / Lite modes mapped |
| Kling 3.0 | ✅ Verified | + Fast-Track + Native Audio |
| Suno v5 | ✅ Verified | 5-song chain SOP |
| Midjourney v7 | 📝 Stub | Reference only |
| Seedream / Runway / Sora 2 / Vidu / Ideogram | 📝 Stub | Prompt guides complete; site profiles pending |
Prompt-craft reference files exist for all 14+ platforms regardless of automation status.
Provenance
This skill was built iteratively against real-world tasks (storyboarding, music-video production, cinematic shorts) with two rounds of head-to-head benchmarking. Iteration-1 measured "with-skill vs no-skill" baseline (95% vs 47%). Iteration-2 specifically measured "senior-director thinking" (vocabulary depth) — with-skill won 3/3 head-to-head subjective evaluations.
The community-prompt-patterns.md file consolidates research across:
- Official cookbooks: OpenAI Sora 2, Google Cloud Veo 3.1, Runway Gen-4 / Aleph, fal.ai (Kling, Wan, Seedream), BFL (Flux Kontext), Ideogram, Midjourney v7 docs, Google Nano Banana
- Community posts: X / Threads / Reddit / 小紅書 / Bilibili
- Practitioner write-ups: Atlas Cloud Seedance Library, awesome-seedance-2-prompts (GitHub), 知乎 (Zhihu) / 腾讯新聞 (Tencent News) / CSDN handbooks for 即夢 (Jimeng) Seedance
License
MIT — see LICENSE.
Contributing
This is an opinionated, evidence-driven skill. Contributions welcome — especially:
- New site-profiles/<platform>.md for platforms that don't have one yet
- Updates when platforms ship new versions (model signatures change quickly)
- Better presets in templates/preset-packs.md
See CHANGELOG.md for version history.
Acknowledgments
Inspired by users who got tired of "cinematic, 8k, masterpiece" prompts and wanted Claude to write like a senior director instead.