- AI KATANA
ChatGPT update enables its AI to “see, hear, and speak,” according to OpenAI
Also: Amazon Takes a Big Stake in the A.I. Start-Up Anthropic

Welcome!
In this issue: OpenAI has upgraded ChatGPT to analyze images and hold spoken conversations using voice synthesis. Amazon is investing heavily in AI start-up Anthropic, aiming for a larger footprint in the AI sector. Corporate and investment banks are exploring generative AI to boost their operations and profits. Spotify is piloting an AI-enabled podcast translation tool that retains the original podcaster's voice. On the tools front, Getty Images introduced an AI-powered art generator with an emphasis on commercial safety, and in venture capital news, Cisco Systems' planned acquisition of Splunk underscores the growing intersection of AI and cybersecurity.
Sliced:
👀 ChatGPT update enables its AI to “see, hear, and speak,” according to OpenAI
📦 Amazon Takes a Big Stake in the A.I. Start-Up Anthropic
🏦 How corporate and investment banks are tackling gen AI
🎤 Spotify Is Testing AI-Powered Podcast Language Translation — Which Mimics the Podcaster’s Own Voice
👀 ChatGPT update enables its AI to “see, hear, and speak,” according to OpenAI
OpenAI recently unveiled an update to ChatGPT that adds image analysis and voice synthesis. The AI can now interpret images within text conversations and hold spoken discussions. The ChatGPT mobile app will support full voice conversations, while image recognition will be available on both web and mobile interfaces. Although the internal workings are undisclosed, it is speculated that OpenAI may use a shared encoding space for text and images, potentially using CLIP to merge visual and textual data. The voice feature will offer multiple synthetic voices, developed with professional voice actors, and will rely on OpenAI's Whisper for speech recognition. While promoting these enhancements, OpenAI has also highlighted the system's limitations and potential risks, and it plans a gradual release to protect users and refine the system over time. Some in the AI community have cautioned against overstating the capabilities, reminding users that despite its advancements, ChatGPT remains a machine and not a human.
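For readers curious what a "shared encoding space" means in practice, here is a toy sketch. It is not OpenAI's implementation, which is undisclosed. The file names, captions, and three-dimensional vectors below are all hypothetical, hand-made values; a real system such as CLIP would produce much larger vectors from trained image and text encoders. The idea shown is only that once images and text live in the same vector space, matching them reduces to a similarity comparison.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical embeddings, invented for illustration only.
# In a CLIP-style model these would come from learned encoders.
image_embeddings = {
    "photo_of_cat.jpg": [0.9, 0.1, 0.0],
    "photo_of_bike.jpg": [0.1, 0.9, 0.2],
}
text_embeddings = {
    "a cat sleeping": [0.8, 0.2, 0.1],
    "a bicycle with a flat tire": [0.0, 1.0, 0.3],
}

def best_caption(image_name):
    """Return the caption whose vector is closest to the image's vector."""
    image_vec = image_embeddings[image_name]
    return max(text_embeddings, key=lambda t: cosine(image_vec, text_embeddings[t]))

if __name__ == "__main__":
    for name in image_embeddings:
        print(name, "->", best_caption(name))
```

Because both kinds of input are compared in one space, the same lookup could run in the other direction (finding the best image for a piece of text) without any change to the similarity function.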
📦 Amazon Takes a Big Stake in the A.I. Start-Up Anthropic
Amazon announced that it will invest up to $4 billion in the AI start-up Anthropic, aiming to expand its influence in the rapidly growing AI sector and compete with rivals like Microsoft and Google. Anthropic, known for its generative AI work and its chatbot Claude, is a key player in the AI space. The partnership benefits both sides: Amazon secures a stake in a promising AI company, while Anthropic can spend its funding on Amazon's cloud services, an arrangement in which AI start-ups and cloud providers both profit. The move could strengthen Amazon's position in AI and spur new technology within its ecosystem.
🏦 How corporate and investment banks are tackling gen AI
Corporate and investment banks (CIB) have long been pioneers in adopting traditional AI and machine learning for trading and analyzing unstructured data. As generative AI (gen AI) emerges, its potential for these banks is vast. The McKinsey Global Institute estimates that gen AI could add between $200 billion and $340 billion in value across the entire banking sector. Gen AI excels at content creation and offers stronger natural-language understanding, a step beyond conventional natural-language processing. For instance, banks can use gen AI for compliance by parsing regulatory reports and summarizing them. Institutions like JPMorgan Chase and Morgan Stanley have already started integrating gen AI tools, while others are still in the early stages or facing challenges such as high computing costs. With the right implementation, gen AI can boost productivity in core CIB operations by 30-90%, potentially adding 9-15% to operating profits. The most significant gains are expected in new product development, customer operations, and marketing and sales. To fully harness gen AI's potential, however, banks need a clear strategy, attention to costs, and the necessary talent.
🎤 Spotify Is Testing AI-Powered Podcast Language Translation — Which Mimics the Podcaster’s Own Voice
Spotify is piloting an AI-powered podcast language translation feature that replicates the original podcaster's voice, aiming to provide a more genuine listening experience than conventional dubbing. Working with well-known podcasters including Dax Shepard, Monica Padman, Lex Fridman, Steven Bartlett, and Bill Simmons, the streaming giant plans to release translated episodes in languages such as Spanish, French, and German. The feature uses OpenAI's voice-generation technology so that translated episodes still sound like the original speaker. The new episodes will be available globally on Spotify, with Spanish versions launching on September 25, followed by French and German in subsequent weeks. According to Ziad Sultan, Spotify's VP of personalization, the feature aims to foster stronger bonds between listeners and creators. With its dominant presence in the podcast market, the streaming service recently refocused its strategy on forging ties with top global podcasters.
🛠️ AI tools updates
Getty Images has unveiled "Generative AI by Getty Images", an AI-driven art generator developed in collaboration with Nvidia. The tool turns text prompts into visuals and is trained on Getty's extensive library. Unlike contemporaries such as DALL-E 3 and Midjourney, Getty's version promises a higher level of commercial safety. Users receive a standard royalty-free license for the images they create, which offers protection against copyright claims and grants global, non-exclusive usage rights. To counter misuse, the tool restricts the generation of images that resemble living artists' styles or depict public figures in potentially misleading contexts. All AI-generated images will carry a watermark. The platform is aimed at brands and marketers, and Getty plans to share the tool's revenues with the contributors whose content trained the AI. The release follows Getty's criticism of other AI image tools and its legal action against Stability AI, the maker of Stable Diffusion, over alleged intellectual property violations.
💵 Venture Capital updates
Cisco Systems has agreed to acquire Splunk for approximately $28 billion, marking the year's most significant enterprise software transaction and underscoring the increasing convergence of cybersecurity and AI. The acquisition primarily focuses on Splunk's recently introduced AI-powered detect-and-respond capabilities. Cisco's CEO, Chuck Robbins, emphasized the deal's potential to propel AI-driven security and observability, ranging from threat detection to prevention. Recent weeks have also seen a surge of investment in the AI cybersecurity sector, with several startups securing significant funding. Despite a dip in funding for most sectors, the overlap of AI and security remains a key area of interest for investors, as illustrated by the nearly $1 billion raised this year by cybersecurity startups focusing on AI. The appeal runs both ways: AI can enhance cybersecurity measures, and cybersecurity in turn helps keep AI processes safe.

Before you go, check out “Spotify will not ban AI-made music, says boss.”
