local-transcribe-cli

A Windows-focused Python tool for batch-transcribing audio and video files using faster-whisper.



A Windows-focused tool for batch-transcribing audio and video files using faster-whisper. This tool prioritizes local execution with GPU support via CTranslate2.
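For orientation, here is a minimal sketch of what a single transcription looks like when calling faster-whisper directly. It is not this project's actual code: the function names transcribe_file and segments_to_text are hypothetical, and only the WhisperModel constructor and its transcribe() method come from the faster-whisper library.

```python
from typing import Iterable


def segments_to_text(segments: Iterable) -> str:
    """Join segment texts into one transcript, one segment per line."""
    return "\n".join(seg.text.strip() for seg in segments)


def transcribe_file(path: str, model_size: str = "large-v3",
                    device: str = "auto", language: str = "en") -> str:
    """Transcribe one media file with faster-whisper and return plain text.

    Hypothetical helper for illustration; the defaults mirror the wrapper's
    documented defaults, not necessarily the package's internals.
    """
    # Imported lazily so segments_to_text works without the package installed.
    from faster_whisper import WhisperModel

    # compute_type="default" lets CTranslate2 choose a type for the device.
    model = WhisperModel(model_size, device=device, compute_type="default")
    # transcribe() returns a lazy generator of segments plus metadata;
    # the audio is only decoded as the generator is consumed.
    segments, _info = model.transcribe(path, language=language)
    return segments_to_text(segments)
```

Calling transcribe_file("C:\\Audio\\note.m4a") would download the model on first use and return the transcript as a string.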

Features

Prerequisites

Installation & Updates

We provide a single PowerShell script to handle initial installation, environment setup, and updates.

  1. Clone the repository:
    git clone https://github.com/shruggietech/local-transcribe-cli.git
    cd local-transcribe-cli
    
  2. Run the Installer: This script will create a virtual environment (.venv), install all dependencies, and pull the latest code from GitHub.
    .\scripts\InstallLocalTranscribe.ps1
    

    Tip: Run this script again at any time to update the tool to the latest version.

Usage

The primary way to use this tool is via the LocalTranscribe.ps1 wrapper script.

Basic Audio Transcription

Transcribe all audio files in a folder to text and JSON, the default output formats:

.\scripts\LocalTranscribe.ps1 -AudioDir "C:\Users\You\Documents\VoiceNotes"

Video to Subtitles

Generate .srt subtitle files for a folder of videos:

.\scripts\LocalTranscribe.ps1 `
    -AudioDir "C:\Videos\Recordings" `
    -MediaType "video" `
    -OutputFormats "srt"
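To show what ends up in an .srt file, the following sketch renders timed segments in the standard SubRip layout (index, time range, text, blank line). It assumes segments arrive as (start, end, text) tuples with times in seconds; the function names are hypothetical and the tool's own writer may differ.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp: HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments) -> str:
    """Render an iterable of (start, end, text) tuples as SRT text."""
    blocks = []
    # SRT cue numbers start at 1.
    for index, (start, end, text) in enumerate(segments, start=1):
        time_range = f"{srt_timestamp(start)} --> {srt_timestamp(end)}"
        blocks.append(f"{index}\n{time_range}\n{text.strip()}\n")
    # Cues are separated by a blank line.
    return "\n".join(blocks)
```

For example, to_srt([(0.0, 1.5, "Hi"), (1.5, 3.0, "there")]) produces two numbered cues, the first spanning 00:00:00,000 --> 00:00:01,500.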

Transcribe Everything

Process both audio and video files, generating all output formats:

.\scripts\LocalTranscribe.ps1 `
    -AudioDir "C:\Media\Mixed" `
    -MediaType "all" `
    -OutputFormats "txt", "json", "srt"

PowerShell Parameters

| Parameter | Description | Default |
|---|---|---|
| -AudioDir | Directory containing input files. | . (current directory) |
| -OutDir | Directory to write transcripts to. | .\transcripts |
| -MediaType | Files to target: audio, video, or all. | audio |
| -OutputFormats | Formats to generate: txt, json, srt. | txt, json |
| -Model | Whisper model size (e.g., medium, large-v3). | large-v3 |
| -Device | Compute device: auto, cuda, or cpu. | auto |
| -Language | Spoken language code (e.g., en, fr). | en |
| -Pattern | Custom glob pattern (e.g., *.m4a). Additive to -MediaType. | $null |

Default Supported Audio Extensions

The following audio file extensions are supported by default and are automatically detected when -MediaType is set to audio or all:

aac, ac3, aiff, alac, amr, au, flac, m4a, mid, midi, mp3, ogg, opus, ra, ram, rm, rpm, snd, wav, wma

Default Supported Video Extensions

The following video file extensions are supported by default and are automatically detected when -MediaType is set to video or all:

3g2, 3gp, avi, flv, m2ts, m4v, mkv, mov, mp4, mpeg, mpg, mts, ogv, vob, webm, wmv
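The way -MediaType and -Pattern combine can be sketched as follows: a file is selected if its extension is in the set for the chosen media type, or if it matches the extra glob pattern. This is an illustration only; is_selected and find_media are hypothetical names, and case-insensitive matching and a non-recursive directory scan are assumptions rather than documented behavior.

```python
from fnmatch import fnmatch
from pathlib import Path
from typing import List, Optional

AUDIO_EXTS = {"aac", "ac3", "aiff", "alac", "amr", "au", "flac", "m4a", "mid",
              "midi", "mp3", "ogg", "opus", "ra", "ram", "rm", "rpm", "snd",
              "wav", "wma"}
VIDEO_EXTS = {"3g2", "3gp", "avi", "flv", "m2ts", "m4v", "mkv", "mov", "mp4",
              "mpeg", "mpg", "mts", "ogv", "vob", "webm", "wmv"}
EXTENSIONS = {"audio": AUDIO_EXTS, "video": VIDEO_EXTS,
              "all": AUDIO_EXTS | VIDEO_EXTS}


def is_selected(name: str, media_type: str = "audio",
                pattern: Optional[str] = None) -> bool:
    """Return True if a file name is picked up for the given settings."""
    if media_type not in EXTENSIONS:
        raise ValueError(f"unknown media type: {media_type!r}")
    # Assumption: extensions are compared case-insensitively.
    suffix = Path(name).suffix.lower().lstrip(".")
    if suffix in EXTENSIONS[media_type]:
        return True
    # The pattern is additive: it can only add files, never exclude them.
    return pattern is not None and fnmatch(name.lower(), pattern.lower())


def find_media(root: str, media_type: str = "audio",
               pattern: Optional[str] = None) -> List[Path]:
    """List matching files directly inside root (assumed non-recursive)."""
    return sorted(p for p in Path(root).iterdir()
                  if p.is_file() and is_selected(p.name, media_type, pattern))
```

With these definitions, is_selected("clip.mp4") is False under the default audio setting, while is_selected("clip.mp4", media_type="all") is True.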

Testing

To confirm the tool works on your system, run the included test suite. It verifies that dependencies are installed and runs a small integration test.

.\scripts\TestLocalTranscribe.ps1

Advanced Usage (Python CLI)

Power users can interact directly with the Python package inside the virtual environment.

Activate the environment:

.\.venv\Scripts\Activate.ps1

Run via Module:

python -m local_transcribe_cli.cli --help

Run via Console Script:

local-transcribe --input-dir "C:\Audio" --model medium

Python Arguments

Troubleshooting

“Script is not digitally signed” error: You may need to change your PowerShell execution policy to allow local scripts:

Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser

“ModuleNotFoundError” or Import Errors: Run the install script again to repair the environment:

.\scripts\InstallLocalTranscribe.ps1

Slow Performance: Ensure you are running on a machine with a GPU and that -Device is set to auto or cuda. If you lack a GPU, try using a smaller model (-Model medium or -Model small) and int8 quantization (-ComputeType int8).
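As a rough illustration of the device advice above, the sketch below resolves auto to cuda or cpu and picks a matching compute type. It is an assumption-laden example, not the tool's actual logic: resolve_device and pick_compute_type are hypothetical names, and the float16-on-GPU, int8-on-CPU pairing is a common choice rather than a documented default. The only library call used is ctranslate2.get_cuda_device_count().

```python
def resolve_device(requested: str = "auto") -> str:
    """Turn 'auto' into 'cuda' or 'cpu'; pass explicit choices through."""
    if requested != "auto":
        return requested
    try:
        import ctranslate2
        # Reports how many CUDA devices CTranslate2 can see.
        return "cuda" if ctranslate2.get_cuda_device_count() > 0 else "cpu"
    except Exception:
        # Package missing or CUDA runtime unusable: fall back to the CPU.
        return "cpu"


def pick_compute_type(device: str) -> str:
    """Assumed pairing: half precision on GPU, int8 quantization on CPU."""
    return "float16" if device == "cuda" else "int8"


if __name__ == "__main__":
    device = resolve_device("auto")
    print(device, pick_compute_type(device))
```

Running the snippet inside the virtual environment is a quick way to check whether a GPU is visible before starting a long batch.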

For contributors

The project-level interpreter pin for pyshim lives in .python-version:

py:3.12

Dependencies are listed in requirements.txt, and the main entry point is src/local_transcribe_cli/cli.py.