Search Results for "linux"

Showing 92 open source projects for "linux"

Filters applied: Text to Speech, Python

1

Applio

A simple, high-quality voice conversion tool focused on ease of use

Applio is a high-quality voice conversion toolkit designed to make modern RVC/VITS-based voice cloning accessible to non-experts. It focuses strongly on ease of use: installation scripts for Windows, Linux, and macOS set up dependencies and then launch a browser-based Gradio interface. Within that interface, users can train and run voice conversion models for tasks like singing conversion, speech-to-speech transformation, and voice cloning. The project is structured to be flexible through plugins and configurations so users can extend functionality without touching the core code. ...

Downloads: 94 This Week

Last Update: 2026-02-18
2

kokoro-onnx

TTS with Kokoro and ONNX Runtime

kokoro-onnx is a text-to-speech toolkit that wraps the Kokoro neural TTS model in an easy-to-use ONNX Runtime interface, so you can generate speech from Python with minimal setup. It focuses on running efficiently on commodity hardware, including macOS with Apple Silicon, while still delivering near real-time performance for many use cases. The project ships prebuilt model files and a simple example script, so you can go from installation to producing an audio.wav file in just a few steps....

Downloads: 121 This Week

Last Update: 2025-11-28
3

Voice-Pro

Comprehensive Gradio WebUI for audio processing

Voice-Pro is a Gradio WebUI for transcription, translation, and text-to-speech. It installs with one click and uses Miniconda to create a virtual environment that runs completely separate from the Windows system (fully portable). It supports real-time transcription and translation, as well as batch mode.

1 Review

Downloads: 38 This Week

Last Update: 2025-12-05
4

ebook2audiobook

Generate audiobooks from e-books, voice cloning & 1107+ languages

ebook2audiobook is a tool to convert legally obtained eBooks (non-DRM) into fully narrated audiobooks, complete with chapters and metadata. It automates the pipeline: it reads the eBook file, splits it into appropriate segments (chapters, paragraphs), uses text-to-speech (TTS) models to synthesize audio, optionally applies voice cloning, and outputs a final audiobook — ideal for people who prefer listening over reading, or for accessibility purposes. The tool supports a wide array of...

Downloads: 37 This Week

Last Update: 3 days ago
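
The segmentation step in the description above (splitting a book into chapters and paragraphs before synthesis) can be sketched in plain Python. This is only an illustration, not ebook2audiobook's actual code: the function name `split_into_segments`, the blank-line paragraph rule, and the `max_chars` limit are all assumptions made for the example.

```python
import re


def split_into_segments(text: str, max_chars: int = 200) -> list[str]:
    """Split text into paragraph-based segments of roughly max_chars characters.

    Hypothetical helper: paragraphs are separated by blank lines, and a long
    paragraph is split further at sentence boundaries. A single sentence
    longer than max_chars is kept whole rather than cut mid-sentence.
    """
    segments: list[str] = []
    for paragraph in re.split(r"\n\s*\n", text.strip()):
        paragraph = " ".join(paragraph.split())  # collapse internal whitespace
        if not paragraph:
            continue
        current = ""
        # Break at sentence-ending punctuation followed by whitespace.
        for sentence in re.split(r"(?<=[.!?])\s+", paragraph):
            candidate = f"{current} {sentence}".strip()
            if current and len(candidate) > max_chars:
                segments.append(current)
                current = sentence
            else:
                current = candidate
        if current:
            segments.append(current)
    return segments
```

Each returned segment could then be handed to whichever TTS engine is in use, one call per segment, so that no single synthesis request receives an entire chapter.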
5

VoxCPM

TTS for Context-Aware Speech Generation and True-to-Life Voice Cloning

VoxCPM is a tokenizer-free text-to-speech system that models speech in a continuous space, aiming for extremely realistic, context-aware synthesis and true-to-life zero-shot voice cloning. Instead of converting speech into discrete tokens, it uses an end-to-end diffusion-autoregressive architecture built on the MiniCPM-4 backbone, combining hierarchical language modeling, finite scalar quantization (FSQ), and local Diffusion Transformers. This design helps decouple semantic and acoustic...

Downloads: 41 This Week

Last Update: 2026-04-08
6

edge-tts

Use Microsoft Edge's online text-to-speech service from Python

edge-tts is a Python module and command-line tool that gives you direct access to Microsoft Edge’s online text-to-speech service without needing the Edge browser, Windows, or any API key. It wraps the same cloud voices used by Edge, exposing them through a simple CLI (edge-tts, edge-playback) and a Python API, so you can script high-quality speech generation in your own applications. The tool lets you list available voices, specify locale and voice name, and generate audio files in common...

Downloads: 37 This Week

Last Update: 2025-12-12
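
The command-line workflow described above (choose a voice, write an audio file) could be scripted roughly as follows. The sketch only builds the argument list for the `edge-tts` command; `build_edge_tts_command` is a hypothetical helper, and the voice name and output path are example values rather than anything taken from the listing.

```python
import shlex


def build_edge_tts_command(text: str, voice: str, output_path: str) -> list[str]:
    """Return the argv for an edge-tts call that writes speech to a file."""
    return [
        "edge-tts",
        "--voice", voice,
        "--text", text,
        "--write-media", output_path,
    ]


if __name__ == "__main__":
    cmd = build_edge_tts_command("Hello from Linux", "en-US-AriaNeural", "hello.mp3")
    # Print the shell-quoted command instead of running it.
    print(shlex.join(cmd))
```

Passing the list to `subprocess.run(cmd, check=True)` would actually invoke the tool, which assumes edge-tts is installed and the machine has network access.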
7

pyttsx3

Offline text-to-speech synthesis for Python

...It is designed to work entirely without an internet connection, making it suitable for local automation, kiosks, accessibility tools, and embedded applications. On Windows it uses SAPI5, on Linux it typically uses eSpeak or eSpeak-NG, and on macOS it can use NSSpeechSynthesizer or AVSpeechSynthesizer, giving it broad cross-platform compatibility. The library exposes a simple but flexible API for controlling voice selection, speaking rate, volume, and other synthesis parameters from Python code. It supports both a high-level speak convenience function and a lower-level engine object with event hooks, queuing, and saving output to audio files. ...

Downloads: 21 This Week

Last Update: 2025-11-28
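
The engine-level API mentioned above can be illustrated with a small helper. `setProperty`, `say`, `save_to_file`, and `runAndWait` are pyttsx3 engine methods; the helper name `speak_with_settings` and its default values are assumptions made for this sketch.

```python
def speak_with_settings(engine, text, rate=150, volume=0.8, out_path=None):
    """Apply rate and volume to a pyttsx3-style engine, then speak or save text.

    `engine` is expected to be the object returned by pyttsx3.init().
    """
    engine.setProperty("rate", rate)      # speaking rate in words per minute
    engine.setProperty("volume", volume)  # volume between 0.0 and 1.0
    if out_path is None:
        engine.say(text)                     # queue the utterance for playback
    else:
        engine.save_to_file(text, out_path)  # queue synthesis to an audio file
    engine.runAndWait()                   # process everything that was queued
```

With pyttsx3 installed and a system backend available (eSpeak or eSpeak-NG on Linux), a call such as `speak_with_settings(pyttsx3.init(), "Hello")` would speak the text aloud; passing `out_path` writes a file instead.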
8

SoniTranslate

Synchronized Translation for Videos

SoniTranslate is a video translation and dubbing system that produces synchronized target-language audio tracks for existing video content. It provides a web UI built with Gradio, allowing users to upload a video, choose source and target languages, and then run a pipeline that handles transcription, translation and re-synthesis of speech. Under the hood, it uses advanced speech and diarization models to separate speakers, align audio with timecodes and respect subtitle timing, which lets...

Downloads: 27 This Week

Last Update: 2025-11-28
9

OpenVoice

Instant voice cloning by MIT and MyShell. Audio foundation model

OpenVoice is a versatile instant voice cloning system that can replicate a speaker’s tone color from just a short audio clip and then generate speech in multiple languages. It is designed not only to match the timbre of the reference voice, but also to give granular control over style parameters such as emotion, accent, rhythm, pauses, and intonation. The model supports cross-lingual and even zero-shot cross-lingual voice cloning, so a speaker recorded in one language can be made to speak...

Downloads: 26 This Week

Last Update: 2025-11-28
10

Fish Speech

SOTA Open Source TTS

Fish Speech is a state-of-the-art open-source text-to-speech project that has evolved into the OpenAudio series of advanced TTS models. The repository hosts the code and tooling for training, fine-tuning, and serving high-quality TTS, while the current flagship models (OpenAudio-S1 and S1-mini) are distributed via Fish Audio’s playground and Hugging Face. The models are evaluated with Seed TTS metrics and achieve exceptionally low word and character error rates, indicating strong...

Downloads: 23 This Week

Last Update: 2025-11-28
11

MLX-Audio

A text-to-speech, speech-to-text and speech-to-speech library

MLX-Audio is a speech library built on Apple’s MLX framework and optimized for Apple Silicon machines (M-series Macs). It focuses on text-to-speech and speech-to-speech workflows, with APIs and a command-line interface that make it easy to generate high-quality audio from text. Because it uses MLX and targets Apple Silicon, inference is fast and can take advantage of hardware acceleration and quantization for efficient on-device performance. The project provides a straightforward CLI...

Downloads: 18 This Week

Last Update: 2026-03-30
12

Pocket TTS

A TTS that fits in your CPU (and pocket)

Pocket TTS is a lightweight text-to-speech project designed to run efficiently on CPUs, targeting developers who want local speech generation without depending on GPUs or hosted web APIs. It is built to feel practical in everyday applications, where installation and usage should be as simple as adding a dependency and calling a function. The project focuses on keeping the runtime footprint manageable while still producing natural-sounding speech, which makes it attractive for offline tools,...

Downloads: 16 This Week

Last Update: 2026-02-16
13

Matcha-TTS

A fast TTS architecture with conditional flow matching

Matcha-TTS is a non-autoregressive neural text-to-speech architecture that uses conditional flow matching to generate speech quickly while maintaining natural quality. It models speech as an ODE-based generative process, and conditional flow matching lets it reach high-quality audio in only a few synthesis steps, which greatly reduces latency compared to score-matching diffusion approaches. The model is fully probabilistic, so it can generate diverse realizations of the same text while still...

Downloads: 16 This Week

Last Update: 2025-11-28
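
To make the "few synthesis steps" idea concrete, here is a toy Euler integration of a flow-matching ODE. This is not Matcha-TTS code: a real model learns the vector field with a neural network, whereas this sketch plugs in the known straight-line conditional field `(target - x) / (1 - t)` toward a fixed target. The names `euler_flow` and `n_steps` are illustrative.

```python
def euler_flow(x0, target, n_steps=4):
    """Integrate dx/dt = (target - x) / (1 - t) from t=0 to t=1 with Euler steps.

    x0 and target are equal-length lists of floats. The vector field is the
    straight-line conditional flow toward `target`, standing in for a learned one.
    """
    x = list(x0)
    dt = 1.0 / n_steps
    for step in range(n_steps):
        t = step * dt  # always below 1.0, so the division is safe
        # Evaluate the vector field at the current point and time.
        velocity = [(tgt - xi) / (1.0 - t) for xi, tgt in zip(x, target)]
        # Take one explicit Euler step along the field.
        x = [xi + dt * v for xi, v in zip(x, velocity)]
    return x
```

Because this particular field moves along a straight line, even a handful of Euler steps carries the starting point onto the target, which is the intuition behind needing only a few solver steps at synthesis time.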
14

abogen

Generate audiobooks from EPUBs, PDFs and text with captions

abogen is a tool designed to generate audiobooks (or speech narrations) from textual sources such as EPUBs, PDFs, or plain text, with synchronized captions. In other words, it automates the pipeline of reading a digital book (or document), converting its text into speech via a TTS engine, and packaging the result into an audiobook format — likely along with timestamped captions or subtitles that align with the spoken audio. This can be very useful for accessibility, content consumption on...

Downloads: 14 This Week

Last Update: 2026-02-06
15

Kitten TTS

State-of-the-art TTS model under 25MB

KittenTTS is an open-source, ultra-lightweight, high-quality text-to-speech model with just 15 million parameters and a binary size under 25 MB. It is designed for real-time CPU-based deployment across diverse platforms: it runs without a GPU, offers several premium voice options, and is optimized for fast inference in real-time speech synthesis.

Downloads: 14 This Week

Last Update: 2026-02-24
16

Qwen3-TTS

Qwen3-TTS is an open-source series of TTS models

Qwen3-TTS is an open-source text-to-speech (TTS) project built around the Qwen3 large language model family, focused on generating high-quality, natural-sounding speech from plain text input. It provides researchers and developers with tools to transform text into expressive, intelligible audio, supporting multiple languages and voice characteristics tuned for clarity and fluidity. The project includes pre-trained models and inference scripts that let users synthesize speech locally or...

Downloads: 12 This Week

Last Update: 2026-03-17
17

WhisperLive

A nearly-live implementation of OpenAI's Whisper

WhisperLive is a “nearly live” implementation of OpenAI’s Whisper model focused on real-time transcription. It runs as a server–client system in which the server hosts a Whisper backend and clients stream audio to be transcribed with very low delay. The project supports multiple inference backends, including Faster-Whisper, NVIDIA TensorRT, and OpenVINO, allowing you to target GPUs and different CPU architectures efficiently. It can handle microphone input, pre-recorded audio files, and...

Downloads: 12 This Week

Last Update: 2026-03-17
18

ChatTTS

A generative speech model for daily dialogue

ChatTTS is an open-source conversational text-to-speech model optimized for dialogue, developed by 2Noise. Trained on 100,000+ hours of English and Chinese conversation data, it excels at generating expressive prosody—pauses, interjections, laughter—for more natural-sounding speech synthesis in assistant and chatbot applications.

Downloads: 9 This Week

Last Update: 6 days ago
19

AI Runner

Offline inference engine for art, real-time voice conversations

AI Runner is an offline inference engine designed to run a collection of AI workloads on your own machine, including image generation for art, real-time voice conversations, LLM-powered chatbots and automated workflows. It is implemented as a desktop-oriented Python application and emphasizes privacy and self-hosting, allowing users to work with text-to-speech, speech-to-text, text-to-image and multimodal models without sending data to external services. At the core of its LLM stack is a...

Downloads: 12 This Week

Last Update: 2025-12-11
20

clone-voice

A sound cloning tool with a web interface, using your voice

Clone-voice is a local voice-cloning tool that lets you synthesize speech in any target voice or convert one recording into another voice using the same timbre. It is built around Coqui’s XTTS-v2 model, so it inherits multilingual support and modern neural TTS quality while wrapping it in a user-friendly desktop workflow. The app is designed to be very easy to use: you download a precompiled package, double-click app.exe, and it launches a browser-based web interface where you control...

Downloads: 11 This Week

Last Update: 2025-11-28
21

Luna AI

Virtual AI anchor that combines state-of-the-art technology

Luna AI is a virtual AI streamer framework designed to power an interactive VTuber that can go live on major platforms and chat with viewers in real time. It is built around a core assistant persona called “Luna AI,” which can be driven by a wide range of large language models and platforms, including GPT-style APIs, Claude, LangChain-based backends, ChatGLM, Kimi, Ollama, and many others. The project supports multiple rendering backends for the avatar, such as Live2D, Unreal Engine (UE),...

Downloads: 11 This Week

Last Update: 2025-11-28
22

Style-Bert-VITS2

Style-Bert-VITS2: Bert-VITS2 with more controllable voice styles

Style-Bert-VITS2 is a text-to-speech system based on Bert-VITS2 that focuses on highly controllable voice styles and emotional expression. It takes the original Bert-VITS2 v2.1 and its Japanese-Extra variant and extends them so you can control emotion and speaking style with fine-grained intensity, not just choose a generic tone. The project targets both power users and beginners: Windows users without Git or Python can install and run it using bundled .bat scripts, while advanced users can...

Downloads: 10 This Week

Last Update: 2025-11-28
23

VibeVoice ComfyUI

ComfyUI integration for Microsoft's VibeVoice text-to-speech model

VibeVoice ComfyUI is a comprehensive wrapper that integrates Microsoft’s VibeVoice text-to-speech models directly into ComfyUI workflows. It exposes VibeVoice as a set of custom nodes so you can build single-speaker and multi-speaker voice generation pipelines visually, combining TTS with other audio or generative components. The integration supports high-quality single-speaker synthesis as well as scripted multi-speaker conversations, with optional voice cloning from audio samples for each...

Downloads: 8 This Week

Last Update: 2025-11-28
24

FastKoko

Dockerized FastAPI wrapper for Kokoro-82M text-to-speech model

FastKoko is a self-hosted text-to-speech server built around the Kokoro-82M model and exposed through a FastAPI backend. It is designed to be easy to deploy via Docker, with separate CPU and GPU images so that users can choose between pure CPU inference and NVIDIA GPU acceleration. The project exposes an OpenAI-compatible speech endpoint, which means existing code that talks to the OpenAI audio API can often be pointed at a Kokoro-FastAPI instance with minimal changes. It supports multiple...

Downloads: 7 This Week

Last Update: 2025-12-13
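
Because the endpoint is described as OpenAI-compatible, a request to it can be sketched with the standard library alone. The base URL, model name, and voice below are placeholder assumptions, not values confirmed by the project; the `/v1/audio/speech` path and the `model`, `input`, and `voice` fields follow the shape of the OpenAI speech API.

```python
import json
import urllib.request


def build_speech_request(base_url, text, model="kokoro", voice="af_bella"):
    """Build (but do not send) a POST request for an OpenAI-style speech endpoint.

    The default model and voice names are hypothetical placeholders.
    """
    payload = {"model": model, "input": text, "voice": voice}
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Sending the request with `urllib.request.urlopen(req)` against a running server should return audio bytes that can be written to a file; the exact host and port depend on how the container was started.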
25

ElevenLabs Python

The official Python SDK for the ElevenLabs API

elevenlabs-python is the official Python SDK for the ElevenLabs API, giving developers a convenient way to access ElevenLabs’ high-quality, lifelike voices. The library wraps the HTTP API into a typed Python client, so you can perform text-to-speech, streaming, voice cloning, voice management, and agents-related operations with simple method calls. It exposes ElevenLabs’ main models such as Eleven Multilingual v2, Eleven Flash v2.5, and Eleven Turbo v2.5, each targeting different trade-offs...

Downloads: 7 This Week

Last Update: 3 days ago

© 2026 Slashdot Media. All Rights Reserved.
