Research a topic and produce a podcast episode with AI-generated voices. Use when user wants to create a podcast, audio episode, narrated discussion, or audio content from a topic or document. Triggers include "create a podcast", "make a podcast episode", "podcast about", "audio episode", "narrated discussion", "turn this into a podcast".
Published by rebyteai
Produce podcast episodes from scratch or from source material. This skill orchestrates content preparation, shows the user a preview for approval, then delegates audio production to the podcast-producer skill.
- rebyteai/internet-search — Quick web search for facts, quotes, and current data
- rebyteai/deep-research — Comprehensive multi-source research for in-depth topics
- rebyteai/podcast-producer — Audio production engine. Handles all TTS, audio processing, music, and mastering. Follow its guidelines for ALL audio production decisions.
- rebyteai/show-me-how — Interactive widgets for the episode preview

Parse what the user wants:
Skip if the user provides source material (uploaded document, pasted text, etc.).
- Use internet-search for 3-5 targeted searches.
- Use deep-research for comprehensive multi-source coverage.

Organize findings into an outline: group by segment, note quotes and stats, and identify the narrative arc.
Write a complete, natural-sounding script. Script quality determines podcast quality.
Script rules:
- Use [SPEAKER NAME] markers for each speaker, on their own line.

Format by episode type:
Solo narration:
[HOST]
Welcome to the show. Today we're diving into...
[HOST]
That's it for today. If you found this useful...
Two-host discussion:
[HOST A]
So I've been reading about this new trend in...
[HOST B]
Yeah, I saw that too. What surprised me was...
Interview:
[INTERVIEWER]
Tell us about your experience with...
[GUEST]
Well, it started when...
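All three formats share the same marker convention, so a script can be split into speaker turns mechanically. A minimal sketch (the function name and regex are illustrative, not part of the podcast-producer API; cue markers like [INTRO MUSIC] would need separate filtering):

```python
import re

def parse_script(script: str) -> list[tuple[str, str]]:
    """Split a script into (speaker, dialogue) turns.

    Expects each turn to open with a [SPEAKER NAME] marker
    on its own line, as in the formats above.
    """
    segments: list[tuple[str, str]] = []
    speaker, lines = None, []
    for line in script.splitlines():
        # A marker line is just "[NAME]" with uppercase letters/spaces.
        m = re.fullmatch(r"\[([A-Z][A-Z ]*)\]", line.strip())
        if m:
            if speaker and lines:
                segments.append((speaker, "\n".join(lines).strip()))
            speaker, lines = m.group(1), []
        elif speaker is not None:
            lines.append(line)
    if speaker and lines:
        segments.append((speaker, "\n".join(lines).strip()))
    return segments
```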
Structure every episode with:
Before generating any audio, show the user a preview widget for approval. Audio generation is expensive (TTS API calls, ffmpeg processing). The preview lets the user catch issues early.
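The duration shown in the preview can be estimated from the script's word count before any TTS calls are made. A minimal sketch, assuming a conversational pace of about 150 words per minute (the pace constant and function name are illustrative assumptions, not part of any skill API):

```python
WORDS_PER_MINUTE = 150  # assumed conversational TTS pace

def estimate_duration_minutes(script: str, music_seconds: float = 0.0) -> float:
    """Rough episode length: spoken words at ~150 wpm plus music/stings."""
    words = sum(
        len(line.split())
        for line in script.splitlines()
        if not line.strip().startswith("[")  # skip [SPEAKER]/[CUE] markers
    )
    return words / WORDS_PER_MINUTE + music_seconds / 60.0
```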
Generate a show-me-how widget that displays the full episode plan. The widget should include:
- Music cues ([INTRO MUSIC], [TRANSITION], [OUTRO MUSIC]) shown as visual dividers

Widget template:
```widget
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<style>
* { margin: 0; padding: 0; box-sizing: border-box; }
body { font-family: var(--widget-font-sans); background: var(--widget-bg-primary); color: var(--widget-text-primary); padding: 24px; }
h1 { font-size: 1.5rem; font-weight: 700; margin-bottom: 4px; }
.subtitle { color: var(--widget-text-secondary); font-size: 0.875rem; margin-bottom: 20px; }
.card { background: var(--widget-bg-secondary); border: 1px solid var(--widget-border); border-radius: var(--widget-border-radius); padding: 20px; box-shadow: var(--widget-shadow-sm); margin-bottom: 16px; }
.card h2 { font-size: 1.1rem; font-weight: 600; margin-bottom: 12px; }
/* Episode metadata */
.meta-grid { display: grid; grid-template-columns: repeat(auto-fit, minmax(140px, 1fr)); gap: 12px; margin-bottom: 16px; }
.meta-item { text-align: center; padding: 12px; background: var(--widget-bg-tertiary); border-radius: 8px; }
.meta-value { font-family: var(--widget-font-mono); font-size: 1.25rem; font-weight: 700; color: var(--widget-accent); }
.meta-label { font-size: 0.75rem; color: var(--widget-text-muted); margin-top: 4px; }
/* Cast */
.cast-row { display: flex; align-items: center; gap: 12px; padding: 8px 0; border-bottom: 1px solid var(--widget-border); }
.cast-row:last-child { border-bottom: none; }
.voice-badge { display: inline-block; padding: 2px 10px; border-radius: 12px; font-size: 0.8rem; font-weight: 600; color: var(--widget-accent-text); }
/* Sound design */
.sound-row { display: flex; justify-content: space-between; padding: 6px 0; border-bottom: 1px solid var(--widget-border); font-size: 0.9rem; }
.sound-row:last-child { border-bottom: none; }
.sound-label { color: var(--widget-text-muted); }
/* Transcript */
.segment { margin-bottom: 16px; }
.speaker-label { display: inline-block; padding: 2px 10px; border-radius: 12px; font-size: 0.8rem; font-weight: 600; color: var(--widget-accent-text); margin-bottom: 6px; }
.timestamp { float: right; font-family: var(--widget-font-mono); font-size: 0.75rem; color: var(--widget-text-muted); }
.dialogue { font-size: 0.95rem; line-height: 1.6; color: var(--widget-text-primary); white-space: pre-wrap; }
.divider { text-align: center; padding: 12px 0; color: var(--widget-text-muted); font-size: 0.8rem; font-style: italic; border-top: 1px dashed var(--widget-border); border-bottom: 1px dashed var(--widget-border); margin: 12px 0; }
</style>
</head>
<body>
<h1>🎙️ Episode Preview: TITLE HERE</h1>
<p class="subtitle">Review the episode plan before generating audio</p>
<!-- Metadata -->
<div class="meta-grid">
<div class="meta-item"><div class="meta-value">~10 min</div><div class="meta-label">Duration</div></div>
<div class="meta-item"><div class="meta-value">2</div><div class="meta-label">Speakers</div></div>
<div class="meta-item"><div class="meta-value">Discussion</div><div class="meta-label">Format</div></div>
<div class="meta-item"><div class="meta-value">3</div><div class="meta-label">Segments</div></div>
</div>
<!-- Cast -->
<div class="card">
<h2>Cast</h2>
<div class="cast-row">
<span class="voice-badge" style="background: var(--widget-chart-1);">HOST A</span>
<span><strong>marin</strong> — Female, warm, confident</span>
</div>
<div class="cast-row">
<span class="voice-badge" style="background: var(--widget-chart-2);">HOST B</span>
<span><strong>cedar</strong> — Male, calm, authoritative</span>
</div>
</div>
<!-- Sound Design -->
<div class="card">
<h2>Sound Design</h2>
<div class="sound-row"><span>Intro Music</span><span class="sound-label">Lo-fi podcast intro (Pixabay, 6s)</span></div>
<div class="sound-row"><span>Background</span><span class="sound-label">Soft coffee shop ambience (0.2x volume)</span></div>
<div class="sound-row"><span>Transitions</span><span class="sound-label">Generated tonal sting (3s)</span></div>
<div class="sound-row"><span>Outro Music</span><span class="sound-label">Same as intro (8s, fade out)</span></div>
</div>
<!-- Transcript -->
<div class="card">
<h2>Transcript</h2>
<div class="divider">🎵 Intro Music (6s)</div>
<div class="segment">
<span class="speaker-label" style="background: var(--widget-chart-1);">HOST A</span>
<span class="timestamp">0:06</span>
<div class="dialogue">Welcome back to the show. Today we're looking at...</div>
</div>
<div class="segment">
<span class="speaker-label" style="background: var(--widget-chart-2);">HOST B</span>
<span class="timestamp">0:32</span>
<div class="dialogue">Yeah, this is a fascinating topic because...</div>
</div>
<div class="divider">🔀 Transition (3s)</div>
<!-- ... more segments ... -->
<div class="divider">🎵 Outro Music (8s)</div>
</div>
</body>
</html>
```
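The per-segment timestamps in the transcript card (e.g. 0:06, 0:32) can be computed cumulatively from estimated segment durations. A small illustrative helper (the function name is an assumption, not part of any skill API):

```python
def cumulative_timestamps(durations_s: list[float]) -> list[str]:
    """Return 'M:SS' end timestamps for segments given durations in seconds."""
    stamps, total = [], 0.0
    for d in durations_s:
        total += d
        minutes, seconds = divmod(round(total), 60)
        stamps.append(f"{minutes}:{seconds:02d}")
    return stamps
```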
After showing the preview, ask the user:
Here's the full episode plan. You can:
- Continue — I'll generate the audio now
- Change voices — e.g., "Make Host B use ash instead of cedar"
- Edit the script — tell me what to change
- Change music/ambience — e.g., "Use rain instead of coffee shop" or "No background ambience"
- Adjust length — e.g., "Make segment 2 shorter"
Only proceed to Step 5 after the user approves.
Delegate entirely to the podcast-producer skill. It handles:
- Speech synthesis (gpt-4o-mini-tts with voices like marin, cedar, ash)

Follow ALL audio production guidance from podcast-producer. Do not manually call TTS or process audio outside its pipeline.
Optionally, build a shareable player page with rebyte-app-builder and deploy it to rebyte.pro. Only do this if the user asks.