Transform videos into engaging YouTube Shorts (9:16 aspect ratio) with AI-generated narration, word-level animated subtitles, and background music mixing. Features GPU acceleration for lightning-fast video processing.
This project was created by MaddoxRVS. GitHub: https://github.com/Maddox-RVS
- Automatic aspect ratio conversion: Crops videos to 9:16 (YouTube Shorts format)
- AI-powered narration: Text-to-speech with multiple emotional tones (excited, sarcastic, dramatic, etc.)
- Animated subtitles: Word-by-word bouncing text with customizable colors
- Audio mixing: Blend background music with AI narration at adjustable volume levels
- GPU acceleration: NVIDIA CUDA support for 10-50x faster video encoding
- YouTube downloader: Built-in utilities to download videos from YouTube
- Library and CLI: Use as a Python module or command-line tool
- Customizable: Control tone, colors, volume, and more
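The 9:16 conversion listed above can be illustrated with a little arithmetic. The following is a minimal sketch of a centered crop, not the package's actual implementation; the function name `center_crop_9_16` is hypothetical, and the rounding to even dimensions is an assumption based on what H.264 encoding with the common yuv420p pixel format typically requires.

```python
def center_crop_9_16(width: int, height: int) -> tuple[int, int, int, int]:
    """Return (x, y, crop_width, crop_height) for a centered 9:16 crop."""
    target_w, target_h = 9, 16
    if width * target_h > height * target_w:
        # Source is wider than 9:16: keep the full height, trim the sides.
        crop_h = height
        crop_w = height * target_w // target_h
    else:
        # Source is taller than (or exactly) 9:16: keep the full width.
        crop_w = width
        crop_h = width * target_h // target_w
    # Round down to even dimensions (assumption: encoder needs even sizes).
    crop_w -= crop_w % 2
    crop_h -= crop_h % 2
    # Center the crop window inside the source frame.
    x = (width - crop_w) // 2
    y = (height - crop_h) // 2
    return x, y, crop_w, crop_h


print(center_crop_9_16(1920, 1080))  # (657, 0, 606, 1080)
```

A 1920x1080 landscape frame keeps its full height and loses the sides, while a frame that is already 1080x1920 is returned unchanged.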
- Python: 3.13 or higher
- FFmpeg: Required for video/audio processing
  - Windows: Download from ffmpeg.org or use winget install ffmpeg
  - macOS: brew install ffmpeg
  - Linux: apt install ffmpeg or dnf install ffmpeg
- NVIDIA GPU (optional): For GPU acceleration, install NVIDIA CUDA Toolkit 11.8+
pip install youtube-short-generator

Clone the repository and install in editable mode:
git clone https://github.com/Maddox-RVS/YouTube-Short-Generator.git
cd YouTube-Short-Generator
pip install -e .

youtube-short-generator create \
--video input.mp4 \
--audio background_music.mp3 \
--text narration.txt \
--output short.mp4

Required arguments:
- -v, --video: Input video file (supports .mp4, .mov, .avi, .mkv, .webm)
- -a, --audio: Background music/audio file (supports .mp3, .wav, .aac, .flac, .m4a)
- -t, --text: Text file containing the narration script
- -o, --output: Output file path for the generated short
Optional arguments:
- --volume: Background audio volume (0.0-2.0, default: 1.0)
- --keep-video-audio: Keep original video audio in the mix
- --tone: TTS narration tone/emotion (default: Regular Guy)
  - Examples: excited, sarcastic, dramatic, whisper, cheerful
- -c, --subtitle-color: Subtitle text color as hex code (default: #FF0000, red)
- -f, --font: Path to custom font file for subtitles
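The documented constraints on these arguments can be checked before a long render starts. The sketch below is an illustration only and is not part of the package: `validate_options` and the constants are hypothetical names, and the accepted extensions, volume range, and six-digit hex format are taken from the argument descriptions above.

```python
import re
from pathlib import Path

# Values copied from the argument descriptions; names are hypothetical.
VIDEO_EXTS = {'.mp4', '.mov', '.avi', '.mkv', '.webm'}
AUDIO_EXTS = {'.mp3', '.wav', '.aac', '.flac', '.m4a'}
HEX_COLOR = re.compile(r'#[0-9A-Fa-f]{6}')


def validate_options(video: str, audio: str, volume: float, color: str) -> list[str]:
    """Return a list of problems; an empty list means the options look valid."""
    problems = []
    if Path(video).suffix.lower() not in VIDEO_EXTS:
        problems.append(f'unsupported video format: {video}')
    if Path(audio).suffix.lower() not in AUDIO_EXTS:
        problems.append(f'unsupported audio format: {audio}')
    if not 0.0 <= volume <= 2.0:
        problems.append(f'volume out of range 0.0-2.0: {volume}')
    if not HEX_COLOR.fullmatch(color):
        problems.append(f'subtitle color is not a #RRGGBB hex code: {color}')
    return problems


print(validate_options('clip.mp4', 'bgm.mp3', 0.7, '#00FF00'))  # []
```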
Full example with all options:
youtube-short-generator create \
-v clip.mp4 \
-a bgm.mp3 \
-t script.txt \
-o output_short.mp4 \
--volume 0.7 \
--tone excited \
-c "#00FF00" \
--keep-video-audio

Download videos or audio from YouTube for processing:
# Download full video
youtube-short-generator download \
--link "https://www.youtube.com/watch?v=dQw4w9WgXcQ" \
--output ./downloads
# Download audio only (MP3)
youtube-short-generator download \
--link "https://www.youtube.com/watch?v=dQw4w9WgXcQ" \
--output ./downloads \
--audio

Arguments:
- -l, --link: YouTube URL to download
- -o, --output: Directory to save the downloaded content
- -a, --audio: (Optional flag) Download audio only as MP3
Use the package as a Python library in your own projects:
from youtube_short_generator import ShortGenerator, download_youtube_video
from pathlib import Path
# Create a short from existing files
generator = ShortGenerator(
    video_file=Path('input_video.mp4'),
    audio_file=Path('background_music.mp3'),
    taxt_overlay_file=Path('narration.txt'),
    output_file=Path('my_short.mp4')
)

# Generate with custom settings
generator.generate_short(
    tone='excited',            # AI narration tone
    subtitle_color='#FF00FF',  # Magenta subtitles
    audio_volume=0.6,          # 60% background music volume
    keep_video_audio=False     # Don't include original audio
)

# Download from YouTube
download_youtube_video(
    link='https://youtube.com/watch?v=...',
    output_dir=Path('./downloads'),
    audio_only=False  # Set to True for audio-only download
)

ShortGenerator reports status updates through a callback that receives:
- renderable: A Rich renderable (often a Spinner or styled Text)
- permanent: True if the status message should be kept, False if it's a live/progress update
You can use this to drive your own UI or logging. Here's a minimal template:
from youtube_short_generator import ShortGenerator
from rich.console import Console
from rich.live import Live
from pathlib import Path
console = Console()
live = Live(console=console, refresh_per_second=10) # Shared live region for dynamic status
def on_status_change(renderable, permanent: bool):
    if permanent:
        console.print(renderable)  # Final messages stay in the log
    else:
        live.update(renderable, refresh=True)  # Live updates replace the spinner line

# Create a short from existing files
generator = ShortGenerator(
    video_file=Path('input_video.mp4'),
    audio_file=Path('background_music.mp3'),
    text_overlay_file=Path('narration.txt'),
    output_file=Path('my_short.mp4')
)
generator.on_status_change = on_status_change  # Hook status updates

with live:
    generator.generate_short(tone='excited')  # Run with live status updates

TextToSpeechGenerator also exposes on_status_change; ShortGenerator forwards it to keep a single callback.
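For headless runs where a Rich live display is not wanted, the same (renderable, permanent) callback shape can feed the standard logging module. This is a minimal sketch under stated assumptions: `make_logging_callback` is a hypothetical helper, and calling `str()` on the renderable is assumed to give something readable, which holds for plain strings and Rich `Text` but may not for a `Spinner`.

```python
import logging


def make_logging_callback(log: logging.Logger):
    """Build a callback matching the (renderable, permanent) signature."""
    def on_status_change(renderable, permanent: bool) -> None:
        # Assumption: str() gives a usable message. This is true for str
        # and rich Text; other renderables may only show a default repr.
        message = str(renderable)
        # Permanent messages are logged at INFO, transient ones at DEBUG.
        level = logging.INFO if permanent else logging.DEBUG
        log.log(level, message)
    return on_status_change


logging.basicConfig(level=logging.INFO)
callback = make_logging_callback(logging.getLogger('short_generator_status'))
callback('Narration generated', True)    # shown at INFO level
callback('Encoding frame 10/300', False)  # hidden unless DEBUG is enabled
```

The returned function could then be assigned to `generator.on_status_change` in place of the Rich-based handler.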
The TTS engine supports various emotional tones. Try different tones to find the right style for your content:
- Regular Guy (default): Neutral, calm delivery
- excited: Enthusiastic, high energy
- sarcastic: Witty, sarcastic tone
- dramatic: Theatrical, dramatic delivery
- whisper: Soft, intimate whisper
- cheerful: Happy, positive mood
Provide your own font file for subtitle text:
youtube-short-generator create \
-v video.mp4 -a music.mp3 -t script.txt -o output.mp4 \
-f /path/to/my_custom_font.ttf

The package includes Dosis Bold as the default font, but any TrueType font (.ttf) will work.
If you have an NVIDIA GPU, the application automatically detects it and uses hardware-accelerated video encoding (h264_nvenc) instead of CPU encoding (libx264).
GPU support is not enabled by default because the CUDA version of PyTorch is not available on PyPI. To enable GPU acceleration, run the following command in your environment after installing:
pip:
pip install torch torchaudio torchvision --index-url https://download.pytorch.org/whl/cu118

uv:
Add the following to your pyproject.toml:
[[tool.uv.index]]
url = "https://download.pytorch.org/whl/cu118"
name = "pytorch-cu118"
explicit = true
[tool.uv.sources]
torch = { index = "pytorch-cu118" }
torchaudio = { index = "pytorch-cu118" }
torchvision = { index = "pytorch-cu118" }

Then run:
uv add torch torchaudio torchvision
uv sync

The tool will automatically detect and use the GPU if CUDA is available.
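The encoder choice described in this section can be pictured as a small fallback rule. The sketch below is an assumption about how such detection might look, not the package's actual code: `cuda_available` and `pick_encoder` are hypothetical names, and only `torch.cuda.is_available()` and the two encoder names come from this document.

```python
def cuda_available() -> bool:
    """Report whether PyTorch can see a CUDA device; False if torch is missing."""
    try:
        import torch  # optional dependency; the CPU-only path needs no torch
    except ImportError:
        return False
    return bool(torch.cuda.is_available())


def pick_encoder(use_gpu: bool) -> str:
    """Choose the FFmpeg H.264 encoder name for the given hardware."""
    # h264_nvenc is NVIDIA's hardware encoder; libx264 is the CPU encoder.
    return 'h264_nvenc' if use_gpu else 'libx264'


print(pick_encoder(cuda_available()))
```

Note that CUDA being visible to PyTorch does not by itself guarantee that the installed FFmpeg build includes h264_nvenc.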
The main class for video processing.
ShortGenerator(
    video_file: Path,
    audio_file: Path,
    taxt_overlay_file: Path,
    output_file: Path,
    tts_generator: Optional[TextToSpeechGenerator] = None
)

Methods:
- generate_short(audio_volume=1.0, keep_video_audio=False, tone='Regular Guy', font_path=Path('Dosis-Bold.ttf'), subtitle_color='#FF0000'): Main processing pipeline

Status:
- status: Property that returns a (renderable, permanent) tuple.
- on_status_change: Callback invoked on every status update.
- is_running: Indicates whether generation is in progress.
Handles AI narration and subtitle extraction.
import torch
from youtube_short_generator import TextToSpeechGenerator
device = 'cuda' if torch.cuda.is_available() else 'cpu'
TextToSpeechGenerator(device=device)

Methods:
- generate_text_to_speech_audio(text: str, tone: str, output_file: Path): Generate speech audio
- generate_timestamped_subtitles(input_speach_file: Path) -> list[dict]: Extract word-level timing
from youtube_short_generator import download_youtube_video
download_youtube_video(
    link: str,
    output_dir: Path,
    audio_only: bool = False
)

This tool requires FFmpeg for video and audio processing. It's a system-level dependency that must be installed separately.
Install FFmpeg:
- Windows: Download the installer or run winget install ffmpeg
- macOS: brew install ffmpeg
- Linux (Ubuntu/Debian): sudo apt-get install ffmpeg
- Linux (Fedora/RHEL): sudo dnf install ffmpeg
Verify installation:
ffmpeg -version

Ensure FFmpeg is installed and added to your system PATH. See installation instructions above.
Verify your NVIDIA GPU and CUDA Toolkit installation:
>>> import torch
>>> print('CUDA available:', torch.cuda.is_available())
CUDA available: True
>>> print(torch.cuda.get_device_name(0) if torch.cuda.is_available() else 'CPU')
NVIDIA A100-PCIE-40GB

GPU acceleration requires CUDA. On a CPU-only machine, processing will be slower. A GPU is optional but strongly recommended.