AI KATANA
Posts
Google Announces Veo 2, Imagen 3 Update, and Whisk

Google Announces Veo 2, Imagen 3 Update, and Whisk

Also: OpenAI o1 and New Tools for Developers

AI KATANA
December 17, 2024

Morning!

Google, NVIDIA, OpenAI, and Meta are making waves with groundbreaking AI advancements this week. Google unveiled Veo 2 and Imagen 3 updates, enhancing video and image generation capabilities, along with Whisk, a creative remix tool powered by its Gemini model. NVIDIA introduced its most affordable AI supercomputer, the Jetson Orin Nano, offering powerful performance for developers at just $249. OpenAI launched o1, its latest reasoning model, alongside developer-friendly updates like GPT-4o mini, preference fine-tuning, and Go/Java SDKs. Meanwhile, Meta’s Ray-Ban smart glasses now feature ‘Live AI,’ providing real-time assistance akin to Google’s Project Astra. In entertainment, a Japanese studio announced an AI-driven anime project, marking a shift in animation production.

Sliced just for you:

🆕 Google Announces Veo 2, Imagen 3 Update, and Whisk
🦾 NVIDIA Unveils Its Most Affordable Generative AI Supercomputer
🛠️ OpenAI o1 and New Tools for Developers
🕶️ Meta’s Ray-Ban Smart Glasses Adding ‘Live AI’
🇯🇵 Japanese Studio Announces Anime Heavily Made With AI

🆕 Google Announces Veo 2, Imagen 3 Update, and Whisk

Google has unveiled significant updates to its AI-driven creative tools with the launch of Veo 2, an enhanced video generation model, and an improved Imagen 3 for text-to-image generation. Veo 2 generates high-definition videos up to 4K resolution with improved realism, physics understanding, and customizable cinematic effects, catering to professional content creators through Google Labs’ VideoFX platform. Imagen 3 delivers sharper, more diverse, and better-composed images with greater photorealism, now available to a wider audience in over 100 countries. Additionally, Google introduced Whisk, a new tool enabling users to remix and combine images into creative outputs using AI prompts powered by its Gemini model. Whisk, designed for consumer use, allows users to generate unique visuals by blending subjects and styles effortlessly. These innovations highlight Google’s continued push to refine AI tools for professional and everyday creativity, cementing its role as a major player in the generative AI space.

🦾 NVIDIA Unveils Its Most Affordable Generative AI Supercomputer

NVIDIA has unveiled its Jetson Orin Nano Super Developer Kit, the company’s most affordable generative AI supercomputer, priced at $249. Designed to empower developers, hobbyists, and students, the compact device delivers a 1.7x boost in generative AI inference performance and enhanced efficiency compared to its predecessor. The system-on-module includes an NVIDIA Ampere GPU with tensor cores, a 6-core Arm CPU, and support for up to four cameras, making it ideal for robotics, computer vision, and AI applications. NVIDIA’s software ecosystem, including tools like TAO Toolkit and Omniverse Replicator, accelerates model training and synthetic data generation. A software update also unlocks these performance gains for existing Jetson Orin Nano devices, reflecting NVIDIA’s commitment to delivering accessible and scalable AI computing solutions.

🛠️ OpenAI o1 and new tools for developers

OpenAI has introduced o1, its latest reasoning model optimized for handling complex, multi-step tasks with improved accuracy, now available in the API for developers. Alongside this, enhancements to the Realtime API include WebRTC support for low-latency voice applications, significant price reductions for GPT-4o audio tokens, and the addition of GPT-4o mini, a cost-effective small model. Developers also gain access to Preference Fine-Tuning, a new method to refine AI outputs based on user preferences, making it ideal for subjective tasks like creative writing and summarization. To streamline development, OpenAI launched official Go and Java SDKs in beta, extending support beyond Python and Node.js. These updates empower developers to build more efficient, customizable, and real-time AI-driven applications.

🕶️ Meta’s Ray-Ban smart glasses adding ‘Live AI’ that works like Google’s Project Astra

Meta has introduced Live AI features to its Ray-Ban smart glasses, transforming them into a versatile real-time assistant. The new update, version 11, enables the glasses to analyze live video feeds and provide contextual, hands-free AI assistance for tasks like cooking, gardening, or navigating new places. Users can interact with the AI naturally, referencing earlier queries without repeating prompts. Additionally, live translation offers real-time speech translation into English for Spanish, French, and Italian, output through the glasses’ audio and transcribed on a connected phone. Meta plans to gradually roll out these features via a waitlist, currently limited to users in the U.S. and Canada. This development positions Meta as a competitor to Google’s Project Astra, which similarly uses video feeds for contextual AI responses.

🇯🇵 Japanese Studio Announces Anime Heavily Made With AI

A Japanese animation studio has announced a groundbreaking anime project created primarily using AI tools, signaling a significant shift in traditional animation workflows. The AI technology will handle various stages of production, including background art, character designs, and animation rendering, while human creators oversee direction and refinement. This move aims to reduce production time and costs, allowing for faster content delivery without compromising creative quality. However, it raises questions about the balance between technological innovation and the role of human artistry in anime, sparking debates within the creative industry about the future of AI collaboration in visual storytelling.

🛠️ AI tools updates

OpenAI challenges Google Search by making ChatGPT search free for all users

OpenAI has made its ChatGPT search capabilities free for all logged-in users, marking a significant move to challenge Google Search’s dominance. Previously exclusive to paid subscribers, this expansion aligns with OpenAI’s strategic push to attract a broader audience as it grows into a for-profit enterprise. ChatGPT search provides real-time answers, product comparisons, and contextual results similar to traditional search engines, while also supporting browser extensions for Google Chrome. The company’s move targets users heavily reliant on search tools, offering a streamlined alternative to traditional search platforms. Although OpenAI’s offering lacks the vast integration of Google, this step highlights its ambition to redefine web-based search and capture new market share.

💵 Venture Capital updates

AI startup Databricks hits $62 billion valuation in record VC round

Databricks, the prominent AI and data analytics startup, has achieved a staggering $62 billion valuation after raising $10 billion in a highly oversubscribed funding round. Led by Thrive Capital and supported by top-tier investors like Andreessen Horowitz, Insight Partners, and GIC, this milestone underscores the growing demand for companies driving enterprise AI adoption. Databricks plans to use the funding to facilitate employee stock cash-outs, recruit elite AI talent, and explore mergers and acquisitions. CEO Ali Ghodsi emphasized that the company’s recent profitability and $3 billion revenue run rate are fueled by optimized growth strategies, including streamlined hiring and offshore operations. With a focus on managing vast amounts of unstructured data and expanding its enterprise-grade AI products, Databricks is well-positioned as a future public market contender, though an IPO timeline remains flexible.

🫡 Meme of the day

⭐️ Generative AI image of the day

Before you go, check out Oreo maker is using AI to develop new snack flavors.