Convert Audio to Text

Kagi Translate’s AI answers the question “What would horny Margaret Thatcher say?”

But what if you want to translate into more esoteric “languages” like “LinkedIn Speak,” “Gen Z slang,” or “horny Margaret ...

Modulate Launches Velma Transcribe: High-Performance Transcription For Real World Conversations at 90% Lower Cost

Modulate’s ELM model architecture unlocks transcription for the masses, cutting costs by 10x while achieving industry-leading ...

Bandicam Launches AI Feature to Transcribe Video to Text on Mac

New AI transcription tool turns screen recordings into searchable text, subtitles, and MP4 transcripts in seconds. Our ...

Scientific Research Publishing

Cybersecurity and Forensic Audio Analysis: Deepfake Detection Based on MFCC, Audio-Text Disconsistency, and Prosodic Features

1 Department of Computer and Instructional Technologies Education, Gazi Faculty of Education, Gazi University, Ankara, Türkiye. 2 Department of Forensic Informatics, Institute of Informatics, Gazi ...

IEEE

Audio-Enhanced Text-to-Video Retrieval using Text-Conditioned Feature Alignment

Abstract: Text-to-video retrieval systems have recently made significant progress by utilizing pre-trained models trained on large-scale image-text pairs. However, most of the latest methods primarily ...

GitHub

DePasqualeOrg/mlx-audio-plus

The best audio processing library built on Apple's MLX framework, providing fast and efficient text-to-speech (TTS), speech-to-text (STT), and speech-to-speech (STS) on Apple Silicon. Kokoro Fast, ...

GitHub

aymalkhalid/Openai-Whisper-AI-Subtitle-Generator-Mp4-Mp3-To-SRT

A free, open-source Python tool that converts English audio files (MP3, WAV, etc.) to subtitle files using OpenAI's Whisper AI model.

IEEE

Diagnosis of Depression Based on Four-Stream Model of Bi-LSTM and CNN From Audio and Text Information

Abstract: Recent development trends in artificial intelligence applications have seen increasing interest in the design of automated systems for depression detection and diagnosis among the affective ...

Journal of Medical Internet Research

Parallel Corpus Analysis of Text and Audio Comprehension to Evaluate Readability Formula Effectiveness: Quantitative Analysis

Retention was better for text, with significant differences in exact word matching for Patient Instructions and BMJ Journal. Longer texts increased perceived difficulty in text but reduced free recall ...

PC Magazine

Gemini Can Now Generate 30-Second Songs From Text, Images with Lyria 3

You don't have to provide the lyrics. Just mention the mood and tempo or upload an image for reference, and let Lyria 3 do the ...
