Showing 192 open source projects for "speech"

  • 1
    Transcoder

    Hardware-accelerated video transcoding using Android MediaCodec APIs

    Transcoder by DeepMedia is an AI-powered video-to-video speech translation engine that enables fully automated multilingual dubbing. Unlike traditional speech translation systems that rely on multi-stage pipelines, Transcoder directly translates one speaker’s video into another language while preserving facial expressions, lip-sync, and vocal identity. Designed for real-time use and production-grade pipelines, Transcoder combines advanced deep learning models with GPU acceleration to deliver high-quality translations across languages. ...
    Downloads: 7 This Week
  • 2
    Deep Chat

    Customizable AI chat component for websites with API support

    ...It is built as a framework-agnostic solution, meaning it can work across various frontend environments, with additional support provided for React through a dedicated wrapper. Deep Chat includes advanced interaction capabilities such as speech input and output, file handling, and multimedia communication, making it suitable for rich conversational experiences. Internally, it uses a structured architecture that manages input, message handling, and service communication, allowing developers to intercept and customize requests and responses.
    Downloads: 8 This Week
  • 3
    Stanford CoreNLP

    Stanford CoreNLP, a Java suite of core NLP tools

    ...Pipelines produce CoreDocuments, data objects that contain all of the annotation information, accessible with a simple API, and serializable to a Google Protocol Buffer. CoreNLP generates a variety of linguistic annotations, including parts of speech, named entities, dependency parses, and coreference.
    Downloads: 5 This Week
  • 4

    speech intonator

    The purpose of the project is to develop audio processing algorithms

    The initial version of the main branch of the project has been completed. The main name of the project is "Java audio mixer Summaha". The second name of the project is "Sound Arithmometer". Its main purpose is the production of musical sound remixes from a set of samples. The name "Summaha" rhymes well with "Yamaha" and provides motivation and inspiration to achieve a sound quality comparable to that of a well-known brand. Detailed documentation is in the 'read' signature files. Anyone who is...
    Downloads: 0 This Week
  • 5
    Operit AI

    Powerful Android AI agent with tools, automation, and Linux shell

    Operit is a full-featured AI assistant and agent platform designed specifically for Android devices, aiming to go far beyond traditional chat-based interfaces. It integrates deep system-level capabilities with a wide range of tools, allowing the AI to perform real tasks such as file management, automation, and system control directly on the device. A standout aspect of the project is its built-in Ubuntu 24 environment, which enables users to run Linux commands, scripts, and development tools...
    Downloads: 16 This Week
  • 6
    AnySoftKeyboard

    Android (f/w 2.1+) on screen keyboard for multiple languages

    The only Android keyboard you'll ever need. Free as in speech and Free as in beer. Android (f/w 4.0.3+, API level 15+) on screen keyboard for multiple languages.
    Downloads: 8 This Week
  • 7
    Apache OpenNLP

    Apache OpenNLP is a machine learning-based NLP library that provides tools for text-processing tasks such as tokenization, sentence segmentation, and named entity recognition.
    Downloads: 0 This Week
  • 8
    Smile

    Statistical machine intelligence and learning engine

    Smile is a fast and comprehensive machine learning engine. With advanced data structures and algorithms, Smile delivers state-of-the-art performance. According to a third-party benchmark, Smile significantly outperforms R, Python, Spark, H2O, and xgboost. Smile is a couple of times faster than the closest competitor. The memory usage is also very efficient. If we can train advanced machine learning models on a PC, why buy a cluster? Write applications quickly in Java, Scala, or any JVM...
    Downloads: 6 This Week
  • 9
    eGuideDog free software for the blind
    eGuideDog project develops free software for the blind. Currently, we focus on WebSpeech, Ekho TTS and WebAnywhere.
    Downloads: 186 This Week
  • 10
    JSpeech

    Java library designed to integrate Speech-to-Text

    jSpeech is a Java library designed to integrate Speech-to-Text (STT) capabilities, command control, and diarization (speaker identification) into applications in a simple, modular, and decoupled way.
    Downloads: 7 This Week
  • 11
    Google2SRT

    Download, save and convert multiple subtitles from YouTube videos

    Google2SRT allows you to download, save and convert multiple subtitles and translations from YouTube and Google Video to SubRip (.srt) format, which is recognized by most video players. You can download XML subtitles or simply type video's URL, Google2SRT will do the rest.
    Downloads: 58 This Week
  • 12
    A series of open source files and programs available to use for developing programs to work with the WowWee Robotics RSMedia Robot. These include a USB serial console, a cross-compiler, a firmware dump program, text-to-speech and source code.
    Downloads: 2 This Week
  • 13
    Conversations

    App in Java for chatting with a generative AI (involving TTS and STT)

    Java application for chatting with the generative AI Llama3. The user can speak into the microphone (speechToText), edit the recognized text, and send it to the AI. The AI responds, the server returns that response in real time with the sentences converted to audio (textToSpeech), and the application plays them through the speaker. The application is prepared so that only one user occupies the server's resources, so if the server is busy, in theory it will not let you...
    Downloads: 0 This Week
  • 14
    elevenlabs-api

    elevenlabs-api is an open source Java wrapper around the ElevenLabs

    ...For any public repository security, you should store your API key in an environment variable, or external from your source code. The most realistic and versatile AI speech software, ever. Eleven brings the most compelling, rich and lifelike voices to creators and publishers seeking the ultimate tools for storytelling. Generate top-quality spoken audio in any voice and style with the most advanced and multipurpose AI speech tool out there. Our deep learning model renders human intonation and inflections with unprecedented fidelity and adjusts delivery based on context.
    Downloads: 1 This Week
  • 15

    Social World

    A software project that tries to visualize people's social behaviour

    Social World is a software project that tries to visualize people's social behaviour. To that end, actions, speech, and physical and mental constitution are part of a simulation process that calculates how the attributes of creatures and characters change and how they act and react.
    Downloads: 0 This Week
  • 16
    Intelligent Java

    Integrate with the latest language models, image generation and speech

    ...Generate audio from text; access DeepMind’s speech models. The only dependency is GSON, which must be added manually when using the IntelliJava jar. If you import this repo through Maven, however, it handles the dependencies for you.
    Downloads: 0 This Week
  • 17

    navmol-ch

    A fork of the navmol (https://sourceforge.net/projects/navmol/)

    NavMol with practical improvements: added menus, Mandarin support, text-to-speech, the ability to interrupt speech, and full internationalization of text, making it easier and more convenient to use.
    Downloads: 0 This Week
  • 18
    VnCoreNLP

    A Vietnamese natural language processing toolkit

    VnCoreNLP is a Java-based natural language processing toolkit tailored for Vietnamese. It offers a fast and accurate pipeline for essential NLP tasks, facilitating research and application development in Vietnamese language processing.
    Downloads: 9 This Week
  • 19
    Jason is a fully-fledged interpreter for an extended version of AgentSpeak, a BDI agent-oriented logic programming language, and is implemented in Java. Using JADE, a multi-agent system can be distributed over a network effortlessly. This project has moved to https://jason-lang.github.io
    Downloads: 33 This Week
  • 20
    MaryTTS

    An open-source, multilingual text-to-speech synthesis system

    MaryTTS is an open-source, multilingual Text-to-Speech Synthesis platform written in Java. It was originally developed as a collaborative project of DFKI’s Language Technology Lab and the Institute of Phonetics at Saarland University. It is now maintained by the Multimodal Speech Processing Group in the Cluster of Excellence MMCI and DFKI. As of version 5.2, MaryTTS supports German, British and American English, French, Italian, Luxembourgish, Russian, Swedish, Telugu, and Turkish; more languages are in preparation. ...
    Downloads: 15 This Week
  • 21
    Live Transcribe Speech Engine

    Live Transcribe is an Android application

    Live Transcribe Speech Engine provides on-device speech recognition components that power real-time transcription for accessibility and everyday voice-first experiences. Its design prioritizes latency and robustness in noisy, far-field environments, enabling continuous transcription with low delay on mobile hardware. The engine manages audio front-end processing—such as noise suppression and voice activity detection—before feeding audio into compact, accurate acoustic and language models. ...
    Downloads: 0 This Week
  • 22
    AzVoice is an Augmentative Communications System designed for use on common off-the-shelf hardware. It is intended to replace more expensive systems such as DynaVox(R) and the like. The AzVoice talker gives speech to the speech impaired.
    Downloads: 0 This Week
  • 23
    OpenOffice.org Export As DAISY
    odt2daisy is an OpenOffice.org Writer extension that enables export to DAISY XML, Full DAISY (XML + audio), and audiobook formats. DAISY is a NISO Z39.86 standard for blind, visually impaired, print-disabled, and learning-disabled people.
    Downloads: 1 This Week
  • 24
    Korean Analyzer Rhino

    Parsing Korean words by morpheme and part-of-speech

    RHINO parses Korean words by morpheme and part-of-speech. Its dictionaries are based on the Korean Modern Tagged Corpus (on the scale of 12 million phrases), which was created by the Korean government, so it covers many cases of stems and endings. Its newly developed Dynamic Dictionary Technology, in effect a programmed database, lets words react to their context. For more information, see the files in the help folder.
    Downloads: 33 This Week
  • 25

    ASR for Medical Reporting

    Automatic speech recognition system for medical reporting in Spanish.

    This is a functional prototype of an automatic speech recognition system for medical reporting in Spanish, built with the CMU Sphinx4 ASR toolkit. It uses a pre-trained acoustic model and a context-dependent language model for nuclear medicine diagnostics.
    Downloads: 0 This Week