Speech to Text with Python

5 Under the Radar AI Infrastructure Startups Powering the Consumer AI Boom

From real-time voice AI to generative media, these five startups are building the inference layer powering the next ...

Adversarial Speech-Text Pre-Training for Speech Translation

Abstract: Large-scale pre-training has been shown to benefit speech translation tasks. However, existing multimodal pre-training efforts rely on parallel corpora for semantic alignment, potentially ...

IEEE

Multilingual Braille-to-Text Conversion and Text-to-Speech System using Raspberry Pi

Abstract: In spite of the fact that Braille is an important channel of communication for the visually impaired, conventional systems require specialized training and expensive devices that are hard to ...

Slator

Prompt-Based Control Reaches Enterprise Speech-to-Text

Slator is the leader in market intelligence for language solutions and language AI. Slator's Advisory practice is a trusted partner to clients looking for M&A services and independent analysis. Slator ...

XDA Developers on MSN

Whisper transcribes my voice notes faster than I can type, and it runs entirely offline

I'd rather keep voice notes to myself.

Blavity on MSN

Spelman students develop PlantGPT, an AI system that allows verbal communication with plants

Students at Spelman College have developed an artificial intelligence system that allows anyone to communicate verbally with ...

Explained: Sarvam AI that managed to beat Google Gemini and ChatGPT in key AI benchmarks

Bengaluru-based Sarvam AI has outperformed Google’s Gemini and OpenAI’s ChatGPT in Indian language benchmarks, showcasing locally trained models for documents, speech, and low-bandwidth use across ...

eWeek

Mistral AI’s Voxtral Transcribe 2 Launch Breaks Sound Barrier

Voxtral Transcribe 2 consists of two speech-to-text models with transcription quality, diarization, and ultra-low latency.

Awesome DIY Offline Raspberry Pi AI Chatbot is Now Faster

Keep a Raspberry Pi AI chatbot responsive by preloading the LLM and offloading with Docker, reducing first reply lag for ...

eWeek

Voice AI Startup ElevenLabs Raises $500M at $11B Valuation

ElevenLabs has raised $500 million in a Series D funding round, valuing the AI audio company at $11 billion and marking one ...

OSTechNix

Pocket TTS: High-Quality Local Voice Cloning Without GPU

Pocket TTS delivers high-quality text-to-speech on standard CPUs. No GPU, no cloud APIs. It is the first local TTS with voice ...

ElevenLabs bags $500M at $11B valuation; eyes global expansion including Bengaluru

ElevenLabs generated over $330 million in annual recurring revenue in 2025. India is the second-largest enterprise revenue ...
