Adobe’s Firefly Video Model
Adobe has announced the beta launch of its Firefly Video Model, a web-based tool that generates video from text prompts or image inputs. Key features include:
* Quickly transforming ideas into stunning video clips
* Creating b-roll, visual effects, and more
* Starting with a photo and generating a video
* Commercially safe to use in creative projects
Gemini in Robotics: Improving Low-Level Robot Control
Google explored leveraging Gemini’s capabilities in robotics, specifically whether it can improve low-level robot control without additional training.
Key Findings:
* Naive prompting methods can lead to suboptimal value estimation performance.
* Value estimation improves with in-context learning using Gemini’s long context window.
* Value functions estimated by Gemini 1.5 Pro can improve low-level control through techniques such as search and planning.
Potential Applications:
* Supervising lower-level policies for self-practice and improvement.
* Using value functions in robotics tasks like success detection and RL.
By utilizing Gemini’s capabilities, they aim to make robotics more efficient and effective.
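The value-guided control idea can be sketched as a toy example. In Google’s setting the value estimates come from prompting Gemini 1.5 Pro; here a hand-written heuristic (negative Manhattan distance to a goal in a grid world) stands in for the model so the sketch is self-contained and runnable. The grid world, goal, and `estimate_value` stub are illustrative assumptions, not part of the original work.

```python
# Toy sketch: using an externally estimated value function to pick actions.
# estimate_value() is a stand-in for LLM-based value estimation.

GOAL = (3, 3)
ACTIONS = {"up": (0, 1), "down": (0, -1), "left": (-1, 0), "right": (1, 0)}

def estimate_value(state):
    """Stub for a learned/prompted value function: negative Manhattan
    distance to the goal (higher is better)."""
    return -(abs(state[0] - GOAL[0]) + abs(state[1] - GOAL[1]))

def greedy_step(state):
    """Pick the action whose successor state the value function scores highest."""
    successors = {a: (state[0] + dx, state[1] + dy)
                  for a, (dx, dy) in ACTIONS.items()}
    best = max(successors, key=lambda a: estimate_value(successors[a]))
    return best, successors[best]

state = (0, 0)
trajectory = [state]
while state != GOAL:
    _, state = greedy_step(state)
    trajectory.append(state)

print(len(trajectory) - 1)  # 6 steps: the Manhattan distance from (0,0) to (3,3)
```

Swapping the heuristic for model-estimated values is what lets the same search/planning loop benefit from in-context learning.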
https://twitter.com/xf1280/status/1854643292476227829
Transformers
Want to truly understand how information flows through a transformer? Check out this open-source project that breaks down the inner workings of LLM Transformer Models.
Key takeaways:
* Learn about self-attention mechanisms and their role in transformer-based AI models like GPT.
* Understand the encoder-decoder architecture of the original Transformer, and the decoder-only variant that powers GPT-style models.
* Discover how positional encoding helps preserve sequential information.
* Explore the concept of multi-head attention and its benefits.
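The positional-encoding idea above can be shown concretely. A common choice (the sinusoidal scheme from the original Transformer) assigns each position a fixed pattern of sines and cosines, so the model can recover token order from otherwise order-agnostic attention. The sequence length and embedding size below are arbitrary illustrative values.

```python
import numpy as np

def positional_encoding(seq_len, d_model):
    # PE[pos, 2i]   = sin(pos / 10000^(2i / d_model))
    # PE[pos, 2i+1] = cos(pos / 10000^(2i / d_model))
    pos = np.arange(seq_len)[:, None]          # (seq_len, 1)
    i = np.arange(0, d_model, 2)[None, :]      # (1, d_model / 2)
    angles = pos / np.power(10000.0, i / d_model)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)               # even dims: sine
    pe[:, 1::2] = np.cos(angles)               # odd dims: cosine
    return pe

pe = positional_encoding(seq_len=16, d_model=32)
print(pe.shape)      # (16, 32)
print(pe[0, :4])     # position 0: sin(0)=0, cos(0)=1 -> [0. 1. 0. 1.]
```

Each row is added to the corresponding token embedding before the first attention layer, preserving sequential information.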
https://poloclub.github.io/transformer-explainer
Also some introduction to LLM language and concepts: https://rahulrajpvr7d.medium.com/what-are-the-query-key-and-value-vectors-5656b8ca5fa0
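The query/key/value interaction described in that post boils down to a few matrix operations. Below is a minimal single-head scaled dot-product attention in NumPy; the shapes and random projection matrices are made up for illustration (real models learn them), and multi-head attention simply runs several such heads in parallel on lower-dimensional projections.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))  # shift for numerical stability
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)
seq_len, d_model = 4, 8                    # 4 tokens, 8-dim embeddings (illustrative)

X = rng.normal(size=(seq_len, d_model))    # token embeddings
W_q = rng.normal(size=(d_model, d_model))  # projections; learned in a real model
W_k = rng.normal(size=(d_model, d_model))
W_v = rng.normal(size=(d_model, d_model))

Q, K, V = X @ W_q, X @ W_k, X @ W_v
scores = Q @ K.T / np.sqrt(d_model)        # similarity of each query to each key
weights = softmax(scores, axis=-1)         # each row is an attention distribution
output = weights @ V                       # weighted mix of value vectors

print(weights.shape, output.shape)         # (4, 4) (4, 8)
```

Each output row is a blend of all tokens' value vectors, weighted by how strongly that token's query matches the others' keys.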
Exolabs / MLX on M4 Max
• Single-device inference on the M4 Max is 27% faster with MLX than on the M3 Max.
• Device-to-device bandwidth improved by 200% (120 Gbps vs. 40 Gbps).
• Because Exolabs shards models across multiple devices, the higher interconnect bandwidth translates into significantly better overall performance.
Comparing AI Inference Speeds
• The M4 Max outperforms the M3 Max with a 27% speedup, reaching 72 tok/sec versus the M3 Max’s 56 tok/sec.
• This improved speed is consistent across various models, including Llama-3.2-1b and Llama-3.2-3b.