Showing 10 open source projects for "python sample"

View related business solutions
  • Outplacement, Executive Coaching and Career Development | Careerminds Icon
    Outplacement, Executive Coaching and Career Development | Careerminds

    Careerminds outplacement includes personalized coaching and a high-tech approach to help transition employees back to work faster.

    By helping to avoid the potential risks of RIFs or layoffs through our global outplacement services, companies can move forward with their goals while preserving their internal culture, employer brand, and bottom lines.
    Learn More
  • anny is an all-in-one platform for managing hybrid workplaces and shared resources. Icon
    anny is an all-in-one platform for managing hybrid workplaces and shared resources.

    For Businesses looking for a flexible solution for internal and external bookings

    Enable your employees to easily book desks, meeting rooms, parking spots, equipment, and more – all in one place. With flexible rules and group permissions, you stay in full control of who can access what.
    Learn More
  • 1
    IndexTTS2

    IndexTTS2

    Industrial-level controllable zero-shot text-to-speech system

    IndexTTS is a modern, zero-shot text-to-speech (TTS) system engineered to deliver high-quality, natural-sounding speech synthesis with few requirements and strong voice-cloning capabilities. It builds on state-of-the-art models such as XTTS and other modern neural TTS backbones, improving them with a conformer-based speech conditional encoder and upgrading the decoder to a high-quality vocoder (BigVGAN2), leading to clearer and more natural audio output. The system supports zero-shot voice...
    Downloads: 7 This Week
    Last Update:
    See Project
  • 2
    MLX-Audio

    MLX-Audio

    A text-to-speech, speech-to-text and speech-to-speech library

    ...Because it uses MLX and targets Apple Silicon, inference is fast and can take advantage of hardware acceleration and quantization for efficient on-device performance. The project provides a straightforward CLI (mlx_audio.tts.generate) as well as a Python API for programmatic generation of audio, including parameters for voice choice, speed, language hints, output format, and sample rate. It includes examples such as audiobook generation to demonstrate long-form synthesis and joined audio segments. On top of that, MLX-Audio offers a modern web interface powered by FastAPI, with real-time waveform and 3D visualizations, file upload, and audio management.
    Downloads: 16 This Week
    Last Update:
    See Project
  • 3
    VideoChat

    VideoChat

    Real-time voice interactive digital human

    ...It supports both pure end-to-end voice solutions based on multimodal large language models (GLM-4-Voice feeding directly into talking-head generation) and a more traditional cascaded pipeline using ASR → LLM → TTS → talking head. It is built as a Gradio Python demo, exposing a web interface where users can talk to an animated avatar that lip-syncs to synthesized speech while responding intelligently. The system is customizable: you can define your own avatar appearance and voice, and it supports voice cloning so you can generate a new voice from a short 3–10 second reference sample. The tech stack integrates FunASR for speech recognition, Qwen for language understanding, multiple TTS engines like GPT-SoVITS, CosyVoice, or edge-tts, and MuseTalk for talking-head generation.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 4
    OuteTTS

    OuteTTS

    Interface for OuteTTS models

    ...It provides a high-level Interface API that wraps model configuration, speaker handling, and audio generation so you can focus on integrating speech into your application rather than wiring up low-level engines. The project supports multiple backends including llama.cpp (Python bindings and server), Hugging Face Transformers, ExLlamaV2, VLLM and a JavaScript interface via Transformers.js, allowing it to run on CPUs, NVIDIA CUDA GPUs, AMD ROCm, Vulkan-capable GPUs, and Apple Metal. It also includes a notion of speaker profiles: you can create a speaker from a short audio sample, save it as JSON, and reuse it for consistent voice identity across generations and sessions. ...
    Downloads: 1 This Week
    Last Update:
    See Project
  • Power through agendas and documents, make more informed decisions and conduct board meetings faster. Icon
    Power through agendas and documents, make more informed decisions and conduct board meetings faster.

    For team managers searching for a solution to manage their meetings

    iBabs not only captures the entire decision-making process – it takes all the paperwork out of meetings. iBabs empowers everyone who has ever organized or attended, a meeting. With a seemingly simple app that offers complete control and a comprehensive overview of all those fiddly details. With about 3000 organizations and over 300,000 users, iBabs gives you peace of mind. So you can quickly organize effective meetings, and good decisions can be made with confidence. iBabs didn’t just happen overnight. We started analyzing and simplifying board meeting processes many years ago. We understand all the work that goes into meetings, and how to streamline everything so it all flows smoothly. On any device, confidentially, securely and automatically. Make good decisions with confidence.
    Learn More
  • 5
    ChatTTS_colab

    ChatTTS_colab

    One-click deployment (including offline integration package)

    ChatTTS_colab is a wrapper project around the ChatTTS model that focuses on “one-click” deployment, especially in Google Colab. It provides an integrated offline bundle and scripts for Windows and macOS so users can run ChatTTS locally without wrestling with complex environment setup. The repository includes Colab notebooks that launch a Gradio-based web UI and expose streaming TTS, making it possible to listen to generated audio as it is produced. A distinctive feature is the “voice gacha”...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 6
    Dia

    Dia

    A TTS model capable of generating ultra-realistic dialogue

    Dia is a neural text-to-speech model designed specifically for generating ultra-realistic dialogue in a single pass. Instead of focusing on isolated sentences or flat narration, it is optimized for conversational audio, complete with natural turn-taking, prosody, and pacing. The model can be conditioned on a reference audio sample, allowing you to control emotion, tone, and other stylistic aspects of the speech. It can also produce nonverbal vocalizations like laughter, coughs, clearing the...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 7
    StyleTTS 2

    StyleTTS 2

    Towards Human-Level Text-to-Speech through Style Diffusion

    StyleTTS2 is a state-of-the-art text-to-speech system that aims for human-level naturalness by combining style diffusion, adversarial training, and large speech language models. It extends the original StyleTTS idea by introducing a style diffusion model that can sample rich, realistic speaking styles conditioned on reference speech, allowing highly expressive and diverse prosody. The architecture uses a two-stage training process and leverages an auxiliary speech language model to guide...
    Downloads: 4 This Week
    Last Update:
    See Project
  • 8
    VALL-E X

    VALL-E X

    Open source implementation of Microsoft's VALL-E X zero-shot TTS model

    ...It also preserves aspects of the acoustic environment, such as background noise or reverb, making the generated audio feel more like it came from the same setting as the prompt. The repository includes Python APIs, sample scripts, ready-to-use voice presets, and demos hosted on Hugging Face Spaces and Google Colab so users can try it.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 9
    Mocking Bird

    Mocking Bird

    Clone a voice in 5 seconds to generate arbitrary speech in real-time

    ...The codebase is implemented in Python (with PyTorch) and includes modules for encoder, synthesizer, vocoder, preprocessing, and inference, as well as demo scripts and a web-server interface for easier experimentation or deployment. MockingBird supports both using pretrained models and training your own synthesizer (with custom datasets), giving flexibility for voice-cloning or custom-voice synthesis depending on your needs.
    Downloads: 1 This Week
    Last Update:
    See Project
  • Ecwid is a hosted cloud commerce platform used by over 1.5 million merchants and offers the easiest way to add an online store to any website, social site or multiple sites simultaneously. Icon
    Ecwid is a hosted cloud commerce platform used by over 1.5 million merchants and offers the easiest way to add an online store to any website, social site or multiple sites simultaneously.

    Your free online store is just a few clicks away.

    Set up your Ecwid store once to easily sync and sell across a website, social media, marketplaces like Amazon, and live in-person. Get started with one, or try them all.
    Start Selling
  • 10
    HiFi-GAN

    HiFi-GAN

    Generative Adversarial Networks for Efficient and High Fidelity Speech

    HiFi-GAN is a GAN-based neural vocoder designed to generate high-fidelity speech waveforms from mel spectrograms with exceptional efficiency. It introduces a generator architecture tailored to model the periodic structure of speech and a set of discriminators that focus on different scales and periods of the waveform to better capture naturalness. The model targets a sweet spot between sample quality and generation speed, outperforming many previous GAN vocoders while being far faster than...
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • Next
MongoDB Logo MongoDB