Speech
- A.I. Tools
Toucan TTS: An MIT Licensed Text-to-Speech Advanced Toolbox with Speech Synthesis in More Than 7000 Languages
In recent research, the Institute for Natural Language Processing (IMS) at the University of Stuttgart, Germany, has introduced ToucanTTS, significantly…
- A.I. Tools
Conformer-Based Speech Recognition on Extreme Edge-Computing Devices
This paper was accepted at the Industry Track at NAACL 2024. With increasingly more powerful compute capabilities and resources in…
- A.I. Tools
Hypernetworks for Personalizing ASR to Atypical Speech
Parameter-efficient fine-tuning (PEFT) for personalizing automatic speech recognition (ASR) has recently shown promise for adapting general population models…
- A.I. Tools
Evaluating the IWSLT2023 Speech Translation Tasks: Human Annotations, Automatic Metrics, and Segmentation
Human evaluation is a critical component in machine translation system development and has received much attention in text translation research.…
- Blockchain News
Exploring AssemblyAI’s Integrations: Enhancing Speech AI Workflows
AssemblyAI continues to enhance its position in the AI landscape by expanding its integrations with…
- A.I. Tools
Deciphering Auditory Processing: How Deep Learning Models Mirror Human Speech Recognition in the Brain
Research indicates that voice perception involves computations that convert auditory data into linguistic representations. The auditory pathway is activated when…
- A.I. Tools
Google AI Proposes Easy End-to-End Diffusion-based Text to Speech E3-TTS: A Simple and Efficient End-to-End Text-to-Speech Model Based on Diffusion
In machine learning, a diffusion model is a generative model commonly used for image and audio generation tasks. The diffusion…
- A.I. Tools
Modality Dropout for Multimodal Device Directed Speech Detection using Verbal and Non-Verbal Features
Device-directed speech detection (DDSD) is the binary classification task of distinguishing between queries directed at a voice assistant versus side…
- A.I. Tools
Whisper models for automatic speech recognition now available in Amazon SageMaker JumpStart
Today, we’re excited to announce that the OpenAI Whisper foundation model is available for customers using Amazon SageMaker JumpStart. Whisper is…