Kokoro
Kokoro is an 82-million-parameter text-to-speech (TTS) model that outranks much larger models. Key highlights:
* Released under the Apache 2.0 license, so it can be freely used and modified
* Achieved #1 ranking on TTS Spaces Arena with fewer parameters and less data than other top models
* Outperformed larger models like XTTS v2 (467M params), Edge TTS, MetaVoice (1.2B params), Parler Mini (880M params), and Fish Speech (~500M params)
Key Features:
* 82 million parameters
* Trained on less than 100 hours of audio
* Released in fp32 precision
* Available in .onnx format
https://hf.co/spaces/hexgrad/Kokoro-TTS
LLM Apps Collection
A curated collection of awesome Large Language Model (LLM) apps built with Retrieval Augmented Generation (RAG) and AI agents. This repository features:
* LLM apps using OpenAI, Anthropic, Google, and open-source models like LLaMA
* Practical and creative ways to apply LLMs across domains, such as code repositories and email inboxes
* Apps combining LLMs with RAG and AI Agents
* Well-documented projects for learning and contributing to the growing open-source ecosystem of LLM-powered applications
https://github.com/Shubhamsaboo/awesome-llm-apps
Bonus
3D scene generation: https://x.com/XRarchitect/status/1877424350812385388
CES 2025: https://x.com/minchoi/status/1877366380011360661
Human-like robots: https://x.com/CollinRugg/status/1877854672469545183
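To run Kokoro on a Mac, install espeak-ng and point the phonemizer at its library (adjust the version in the path to match the one Homebrew installed):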
1. brew install espeak-ng
2. PHONEMIZER_ESPEAK_LIBRARY=/opt/homebrew/Cellar/espeak-ng/1.52.0/lib/libespeak-ng.dylib python kokoro-example.py
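The hard-coded version in the path above stops working when Homebrew upgrades espeak-ng. Here is a minimal sketch, assuming the default Apple Silicon Homebrew layout under /opt/homebrew/Cellar, that looks up the library and sets the variable from Python instead. The helper name find_espeak_library is hypothetical and is not part of Kokoro or phonemizer.

```python
import glob
import os


def find_espeak_library(cellar="/opt/homebrew/Cellar/espeak-ng"):
    """Return the path to an installed libespeak-ng.dylib, or None if none is found.

    Hypothetical helper: the Cellar location is an assumption about the
    Homebrew layout. When several versions are installed, the match that
    sorts last as a string is returned (not a true version comparison).
    """
    pattern = os.path.join(cellar, "*", "lib", "libespeak-ng.dylib")
    matches = sorted(glob.glob(pattern))
    return matches[-1] if matches else None


library = find_espeak_library()
if library is not None:
    # Same variable as in the shell command; set it before the Kokoro code runs.
    os.environ["PHONEMIZER_ESPEAK_LIBRARY"] = library
```

Placed at the top of kokoro-example.py, this would replace the environment variable prefix on the command line, as long as it runs before the phonemizer is loaded.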
To save the audio to a file when not running in a Jupyter Notebook, write it directly with SciPy instead of using IPython's display:
# from IPython.display import display, Audio
# display(Audio(data=audio, rate=24000, autoplay=True))
# print(out_ps)
from scipy.io.wavfile import write
rate = 24000
write("output_audio.wav", rate, audio)
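If SciPy is not installed, a similar file can be written with Python's standard-library wave module. This is a minimal sketch, assuming audio is a sequence of float samples in the range [-1.0, 1.0], which may not match what Kokoro returns in every version. The helper name save_wav and the generated tone standing in for the model output are illustrative only.

```python
import math
import struct
import wave


def save_wav(path, samples, rate=24000):
    """Write mono float samples in [-1.0, 1.0] as 16-bit PCM (hypothetical helper)."""
    # Clip each sample to [-1.0, 1.0], then scale it to the signed 16-bit range.
    pcm = [int(max(-1.0, min(1.0, float(s))) * 32767) for s in samples]
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)   # mono
        wav_file.setsampwidth(2)   # 2 bytes = 16 bits per sample
        wav_file.setframerate(rate)
        wav_file.writeframes(struct.pack(f"<{len(pcm)}h", *pcm))


# Stand-in for the model output: one second of a 440 Hz tone.
rate = 24000
audio = [0.5 * math.sin(2 * math.pi * 440 * n / rate) for n in range(rate)]
save_wav("output_audio.wav", audio, rate)
```

Unlike scipy's write with a float array, which stores floating-point samples, this version stores 16-bit integers, so the file is smaller but loses a little precision.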