
How to Use MoneyPrinterTurbo

You survived the install. Now let's make your first AI video — from opening the UI to watching the final rendered file.

1. The Streamlit UI — What Every Field Does

With the backend and Streamlit each running in its own terminal, the UI is available at localhost:8501. Here's what every setting does and what you can safely ignore.

Video Topic REQUIRED

The only field you must fill in. This is the video's theme — one sentence is enough. For better results, use a list format (see Prompt Crafting below).

Video Language REQUIRED

Determines what language the AI script and voiceover use. Set to English for English videos, Chinese for Chinese. Mismatched language + voice = broken subtitles.

Speech Synthesis — Voice REQUIRED

Pick a voice from the dropdown. Do not leave this blank — it will silently fail after 3 retries with no clear error. Recommended defaults:

  • English: en-US-AnaNeural-Female or en-US-ChristopherNeural-Male
  • Chinese: zh-CN-XiaoxiaoNeural-Female or zh-CN-YunxiNeural-Male
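Since a mismatched language and voice breaks subtitles, a quick sanity check can help before you generate. This is a minimal sketch that assumes voice names follow the pattern shown above (locale, then voice name, then gender). The helper names voice_locale and voice_matches_language are made up for illustration and are not part of MPT.

```python
def voice_locale(voice_name: str) -> str:
    """Return the locale prefix, e.g. 'en-US' from 'en-US-AnaNeural-Female'."""
    parts = voice_name.split("-")
    if len(parts) < 3:
        raise ValueError(f"unexpected voice name: {voice_name!r}")
    return "-".join(parts[:2])


def voice_matches_language(voice_name: str, language: str) -> bool:
    """True when the voice's language code ('en', 'zh', ...) equals the video language."""
    return voice_locale(voice_name).split("-")[0].lower() == language.lower()


print(voice_matches_language("en-US-AnaNeural-Female", "en"))  # True
print(voice_matches_language("zh-CN-YunxiNeural-Male", "en"))  # False
```

If the check returns False, pick a voice whose prefix matches the Video Language setting.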

Video Length OPTIONAL

Target duration in minutes. Default is 1. Longer videos = more script paragraphs = longer generation time. 1–3 minutes is the sweet spot — beyond 5 minutes, quality drops noticeably.

Video Resolution OPTIONAL

Default is 1080p (1920×1080). Drop to 720p if you want slightly quicker renders while testing, but stick with the default for final videos: resolution barely affects total generation time, because the bottleneck is script writing and footage download, not rendering.

Enable Subtitles OPTIONAL

Checked by default. Adds burned-in subtitles synced to the voiceover. If you're in China, uncheck this — the subtitle feature can trigger a Whisper model download (~3GB) from Hugging Face, which is painfully slow without a VPN.

Background Music OPTIONAL

Adds royalty-free background music. Leave as default or pick a track. Has negligible impact on generation time. Volume and fade settings below this are fine at defaults.

2. Prompt Crafting — Good Topic vs Bad Topic

The topic field is the single biggest factor in video quality. A bad prompt produces a rambling, generic video no matter how well everything else is configured.

Bad Topics (don't use these)

Bad Topic | Problem
"AI" | Too vague. The LLM has no direction, so it produces generic fluff.
"How to be happy" | Open-ended. No structure. Output is philosophical rambling with no visual hooks.
"The history of the Roman Empire" | Way too broad for a 1-minute video. The script will be a shallow summary.
"Why my product is the best" | Promotional tone. AI generates marketing-speak. Pexels has no relevant footage.

Good Topics (copy these patterns)

Good Topic | Why it works
"Top 5 facts about black holes that will blow your mind" | List format = natural structure. "Blow your mind" sets an engaging tone. Visual hooks are obvious (space footage).
"3 things I wish I knew before visiting Tokyo" | List + personal angle. Easy to find matching Pexels footage. ~60 seconds of content fits naturally.
"Why cats sleep so much: the science explained" | Specific question → answer structure. Clear visual direction (cat footage). Curiosity-driven title.
"Beginner's guide to investing: 4 things to do first" | Actionable, numbered, clear audience. Works as an educational explainer.
The formula: [Number] + [interesting angle] + [specific topic]. Examples: "7 surprising ways...", "3 mistakes people make when...", "How X actually works (in 60 seconds)". List-based topics produce the most watchable videos because each bullet point becomes a natural scene transition.
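If you draft topics in bulk, a rough checker can flag the weakest ones before you paste them into the UI. The sketch below is only a heuristic built from the formula above. The function name topic_warnings and the four-word threshold are assumptions chosen for illustration, not anything MPT enforces.

```python
import re


def topic_warnings(topic: str) -> list[str]:
    """Return rough warnings for a topic; an empty list means it fits the list formula."""
    warnings = []
    if len(topic.split()) < 4:  # assumed threshold: very short topics tend to be vague
        warnings.append("too short: add an angle and a specific subject")
    if not re.search(r"\d", topic):  # no digit means no explicit list structure
        warnings.append("no number: consider a list format such as 'Top 5 ...'")
    return warnings


print(topic_warnings("AI"))
print(topic_warnings("Top 5 facts about black holes that will blow your mind"))
```

A topic with no number can still be good (the cat example above is a question-and-answer topic), so treat the second warning as a nudge, not a rule.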

3. Voice, Language & Region Settings

English Videos

Video Language: English (US)
Voice: en-US-AnaNeural-Female (clear, natural)
Subtitle: ON (default)

Stick with en-US- voices for a general audience. UK voices (en-GB-) are fine but sound slightly more formal. Use regional voices such as en-IN- when that is the accent your audience expects; otherwise viewers may find the narration harder to follow.

Chinese Videos

Video Language: Chinese (Mandarin) / 中文
Voice: zh-CN-XiaoxiaoNeural-Female (xiaoxiao, natural)
Subtitle: OFF (skip Whisper download)

Chinese videos work well with MPT, but there are two things to know: (1) Pexels search is English-only, so the footage may not match Chinese topics as well. Use specific English keywords in your topic to help. (2) Edge TTS Chinese voices are good quality, but there are fewer of them than English voices.
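To see where those English keywords end up, it can help to look at the shape of a Pexels video search request. This is a minimal sketch against the public Pexels API, not MPT's own code. The helper name build_pexels_search, the example query, and the YOUR_PEXELS_API_KEY placeholder are assumptions for illustration. The request is only built here, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

PEXELS_VIDEO_SEARCH = "https://api.pexels.com/videos/search"


def build_pexels_search(query: str, api_key: str, per_page: int = 5) -> Request:
    """Build (but do not send) a Pexels video search request for the given keywords."""
    params = urlencode({"query": query, "per_page": per_page})
    # Pexels expects the API key in the Authorization header.
    return Request(f"{PEXELS_VIDEO_SEARCH}?{params}", headers={"Authorization": api_key})


req = build_pexels_search("Shanghai skyline night", api_key="YOUR_PEXELS_API_KEY")
print(req.full_url)
```

Passing the request to urllib.request.urlopen with a real key would return JSON describing matching clips.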

Other Languages

MPT supports Japanese, Korean, German, French, and more via Edge TTS. Check the voice dropdown for available options. The LLM prompt is sent in English regardless of the video language setting, so script quality varies for languages other than English and Chinese.

Region note (China): Edge TTS calls go to speech.platform.bing.com — blocked in China. Keep your VPN ON during video generation or you'll get a 403 error. See the edge-tts fix guide.
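Before starting a long generation run, you can check whether the speech endpoint is reachable at all from your network. The sketch below only tests that a TCP connection opens. It cannot tell you whether the service will later answer with a 403, so treat a positive result as necessary but not sufficient. The helper name can_connect is made up for this example.

```python
import socket


def can_connect(host: str, port: int = 443, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers DNS failures, refused connections, and timeouts
        return False


# Example (needs network access):
# print(can_connect("speech.platform.bing.com"))
```

If this prints False with your VPN off and True with it on, the block described above is the likely cause.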

4. What Happens When You Click "Generate"

Understanding the pipeline helps you debug when things go wrong. Here's the sequence, in order:

  1. Script Writing (~10–30s): The LLM generates a video script from your topic. Watch the backend terminal for progress; Streamlit only shows a spinner. If this hangs forever, your LLM provider is down (see g4f fix).
  2. Voiceover (~5–15s): Edge TTS converts each sentence to speech. You'll see progress in the backend terminal. A 403 error at this stage means your VPN is off or speech.platform.bing.com is blocked.
  3. Footage Search (~20–60s): The Pexels API searches for stock footage matching each sentence. This is the slowest step. Each sentence gets its own search, so expect roughly 10–20 searches for a one-minute video. If it gets stuck here, Pexels may be blocked, so turn on your VPN. Registering a free Pexels API key speeds this up significantly.
  4. Video Assembly (~15–30s): moviepy + FFmpeg stitch everything together: footage clips, voiceover audio, subtitles, background music. A "moviepy.editor not found" error at this stage means you need moviepy v1.0.3 (see moviepy fix).
  5. Done! The video saves to MoneyPrinterTurbo/output/ as an MP4 file. Streamlit shows a download link and a preview player.
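If you'd rather grab the finished file from disk than use the download link, a small helper can find the most recent render. This is a minimal sketch that assumes the MoneyPrinterTurbo/output/ location from the last step. It searches subfolders as well, in case your install nests files. The function name newest_mp4 is hypothetical.

```python
from pathlib import Path
from typing import Optional


def newest_mp4(output_dir: str) -> Optional[Path]:
    """Return the most recently modified .mp4 under output_dir, or None if there is none."""
    videos = list(Path(output_dir).rglob("*.mp4"))
    return max(videos, key=lambda p: p.stat().st_mtime, default=None)


# Example, assuming the default output folder mentioned above:
# print(newest_mp4("MoneyPrinterTurbo/output"))
```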
Total time for a 1-minute video: ~45–90 seconds with good internet. ~2–3 minutes if Pexels is slow or you're on a VPN. The backend terminal shows much more detail than Streamlit — keep it visible to see what's actually happening.
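As a rough cross-check, you can add up the per-stage estimates from the list above. The numbers below are copied from those steps and nothing is measured, so treat the result as a ballpark envelope and not a benchmark.

```python
# Per-stage estimates in seconds (low, high), taken from the steps above.
STAGE_SECONDS = {
    "script writing": (10, 30),
    "voiceover": (5, 15),
    "footage search": (20, 60),
    "video assembly": (15, 30),
}

fastest = sum(low for low, _ in STAGE_SECONDS.values())
slowest = sum(high for _, high in STAGE_SECONDS.values())
print(f"{fastest}-{slowest} seconds")  # 50-135 seconds
```

The sum lands in the same range as the totals quoted above, and footage search accounts for the largest share at both ends, which is why a slow Pexels connection stretches the whole run.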

5. What to Expect from the Output

The result is a slideshow-style video, not AI-generated footage. MPT doesn't create video from scratch; it assembles existing stock clips. Think of it as an automated version of: find stock footage → write narration → record voiceover → edit together in a timeline.

What's good

What's not

Set expectations: This is not Sora or Runway. MPT makes informational videos — the kind you'd see on a faceless YouTube channel explaining "5 facts about X." If you want cinematic AI-generated footage, this isn't the tool. If you want to produce 10 explainer videos in an afternoon without touching a timeline, it's perfect.

6. After Your First Video

Tweak and re-generate

The first video is rarely the best. Try these adjustments:

Batch generation

Once you have a winning formula, batch-produce videos by reusing settings. MPT doesn't have built-in batch mode, but you can:

Hit a problem?