- AI KATANA
OpenAI’s ChatGPT Will Respond to Video Feeds in Real Time
Also: What is Google’s Project Mariner? This AI Agent Can Navigate the Web for You

Good morning, and welcome to your daily dose of AI. Today we cover OpenAI’s real-time video processing in ChatGPT and Google’s web-navigating AI agent, alongside developments in computer vision, the regulation of AI in healthcare, and safety concerns around autonomous AI agents. You’ll also find Meta’s new watermarking tool to combat deepfakes and a €200M venture fund backing quantum computing and AI startups. Let’s dive into the stories shaping tomorrow’s tech landscape.
Sliced just for you:
👀 OpenAI’s ChatGPT Will Respond to Video Feeds in Real Time
🤖 AI Agents with More Autonomy Than Chatbots Are Coming. Some Safety Experts Are Worried
🖼️ AI Godmother Fei-Fei Li Has a Vision for Computer Vision
⚕️ AI in Health Should Be Regulated, but Don’t Forget About the Algorithms, Researchers Say
🌐 What is Google’s Project Mariner? This AI Agent Can Navigate the Web for You
OpenAI has announced an enhancement to ChatGPT that enables it to process and respond to video feeds in real time. The new capability allows the AI to identify objects and provide live feedback through a smartphone’s camera. Users can ask for help with tasks such as drafting replies to messages in apps or getting step-by-step instructions for everyday activities like making coffee. The feature is rolling out initially to ChatGPT Plus and Pro subscribers, with plans to expand to enterprise and educational customers in January. The launch reflects OpenAI’s broader push into multimodal AI.
AI agents are emerging as autonomous systems designed to perform tasks and achieve goals with minimal human input, a significant evolution from traditional AI tools like chatbots. These agents interact with external applications and adapt to real-time changes, handling tasks such as booking travel or managing business operations. Companies like OpenAI, Google, and Amazon are advancing this technology, integrating it into consumer and enterprise applications. While proponents highlight its potential to automate mundane tasks and boost productivity, experts express concerns about alignment with human values, ethical decision-making, and regulatory challenges. The rapid commercialization of AI agents raises urgent questions about accountability, safety, and the long-term societal impact of increasingly independent AI systems.
Fei-Fei Li, a renowned figure in AI, has significantly influenced the field of computer vision, notably through the creation of ImageNet, which catalyzed the deep learning revolution by enabling advances in object recognition. Her work laid the groundwork for neural network breakthroughs such as AlexNet’s success in 2012. Li’s focus has since shifted to 3D spatial intelligence with her startup, World Labs, which aims to create AI systems capable of reasoning and interacting within three-dimensional environments. She argues that 3D comprehension is essential to advancing machine intelligence because it links perception with action. She also advocates for public access to computational resources, emphasizing the value of combining public and private sector innovation for societal benefit. Li envisions a future where 3D intelligence enhances creative industries, robotics, healthcare, and education, enabling applications like immersive design, precise medical imaging, and advanced augmented reality experiences.
Researchers from MIT and collaborators emphasize the need for robust regulation of both AI and traditional algorithms in healthcare to ensure equity and safety. Recent federal rules under the Affordable Care Act prohibit discrimination in clinical decision-support tools, a category that includes AI-enabled systems. While the FDA has approved nearly 1,000 AI devices to support clinical decisions, non-AI tools such as clinical risk scores lack oversight despite their widespread use. Experts warn that biases inherent in these tools can perpetuate disparities and call for stringent standards to address them. The regulations aim to enhance transparency, non-discrimination, and accountability in healthcare technologies, and the researchers highlight the need for continued dialogue between regulators, industry leaders, and academic researchers.
Google’s Project Mariner is an experimental Chrome extension powered by Gemini 2.0, designed to function as an AI agent that automates web-based tasks. The tool interprets on-screen content, including images and forms, and executes user-directed activities like data collection or navigation within the active browser tab. Mariner provides transparency by presenting a step-by-step plan before executing tasks and requires user confirmation for sensitive actions. It showcased its potential by completing tasks such as finding contact details for companies listed in a spreadsheet. Google emphasizes responsible development, exploring risks and mitigations. Currently available only to a limited group of testers, Project Mariner exemplifies advancements in agentic AI, aimed at enhancing productivity through web interaction automation.
🛠️ AI tools updates
Meta has introduced Meta Video Seal, an open-source tool for watermarking AI-generated videos, as part of its effort to combat the rise of deepfakes and preserve content authenticity. This tool embeds imperceptible watermarks into videos and can also encode hidden messages to trace the origins of the content. Meta claims the tool is resilient to common video edits, such as cropping, blurring, and compression, setting it apart from existing solutions like DeepMind’s SynthID and Microsoft’s watermarking methods. While Meta Video Seal aims to enhance the robustness and scalability of video watermarking, challenges remain in ensuring adoption across the industry. To promote transparency and drive improvements, Meta has launched a public leaderboard, Meta Omni Seal Bench, for benchmarking watermarking technologies and plans to host a workshop at a major AI conference. This initiative underscores the growing importance of tools that enhance accountability in AI-generated media.
💵 Venture Capital updates
Scientifica Venture Capital has announced the launch of its €200 million fund, Scientifica Fund 1 (SF1), aimed at supporting innovation in quantum computing, AI, smart materials, and environmental technologies. Expected to roll out in 2025, SF1 targets startups at various stages, from ideation to scaling, offering a Zero CapEx model that grants access to advanced laboratories and equipment without upfront costs. This initiative facilitates a seamless pipeline between scientific research and venture capital, fostering sustainable innovation. The fund is complemented by plans for additional vertical funds and a global footprint with collaborations across Europe and a Silicon Valley hub to connect European startups with the U.S. tech ecosystem. Scientifica’s portfolio already includes groundbreaking startups in biotech, AI healthcare, space manufacturing, and renewable energy, showcasing its commitment to driving technological advancements and economic growth.
Before you go, check out “What do the gods of generative AI have in store for 2025?”
