Hugging Face has now crossed 1,000,000 public models. Mystic v2 is out – X currently has plenty of examples from that upscaler (4K and 8K images). Molmo was released, this is direct…
Month: September 2024
WIP: LLMs
Ollama is my preferred choice, but here I want to gather the alternatives I’ve found. mlx-lm. Repo: https://github.com/ml-explore/mlx-examples/blob/main/llms/mlx_lm/README.md. It’s part of MLX: MLX is an array framework for machine learning research on Apple…
Llama 3.2 released
The new models come in 1B, 3B, 11B, and 90B versions. The smallest ones are described as: Use our 1B or 3B models for on device applications such as summarizing a discussion from…
Moshi Foundation Model for Speech-Text Processing
If you’re looking for an alternative to the Whisper stack, one option worth considering is Moshi. I originally found it mentioned on X: Summary: Key Features: Mimi Codec: Training and Evaluation:
WIP: Quarkus
I’ve started learning something other than Spring Boot, which I use daily at work. My choice is Quarkus – here I will gather some useful things. For a long time I was…
Offline Whisper Audio Transcription and Ollama-Voice Assistant
The WhisperLive project is a real-time transcription application that utilizes the OpenAI Whisper model to convert speech input into text output. This technology can be employed for transcribing both live audio input…
PAR LLAMA: A Terminal User Interface for Easy Ollama Model Management
PAR LLAMA is a Terminal User Interface (TUI) application designed for easy management and use of Ollama-based Large Language Models (LLMs). This means that users can interact with the Ollama model…
OpenAI’s New Model: A Step Forward in AI Reasoning
OpenAI has released a new model named o1. According to the company, this model has been designed to take its time thinking before responding, thus enabling…
ColPali and Byaldi for reading PDFs with images, Reflection-70B
Multi-modal documents have always been a problem, but from what I’ve read, AI developers have already made huge progress, beating traditional PDF parsers. We now have at least 3 solutions available as…
AI programming tools
Claude, Cursor or Replit agents? The market for AI application-creation tools seems to be growing fast. After a very good experience creating applications in chat with Claude Sonnet, developers moved to Cursor and…