Microsoft’s AI Copilot Makes Video Summaries, But There’s a Catch

Microsoft has added a feature to its AI Copilot in the Edge browser that generates text summaries of videos. The time-saver comes with a limitation, though: it only works on videos that have been pre-processed or that have subtitles. In other words, Copilot is primarily summarizing a video’s text transcript rather than the video itself.

According to Mikhail Parakhin, Microsoft’s CEO of advertising and web services, a video needs to be pre-processed for Edge Copilot to work. If the video already has subtitles, Copilot can fall back on them. If it has neither subtitles nor pre-processing, the feature will not work.
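
The fallback Parakhin describes can be pictured as a simple order of checks. The sketch below is purely illustrative and is not Microsoft’s code: the `Video` type, its fields, and the `pick_text_source` function are hypothetical names, and the assumption that pre-processed text is preferred over subtitles is inferred only from the phrase “fall back.”

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Video:
    """Hypothetical stand-in for a video; not a real Microsoft type."""
    title: str
    preprocessed_transcript: Optional[str] = None  # text prepared ahead of time
    subtitles: Optional[str] = None                # captions shipped with the video


def pick_text_source(video: Video) -> Optional[str]:
    """Return the text a summarizer could work from, or None if there is none."""
    # Assumed order: pre-processed text first, subtitles as the fallback.
    if video.preprocessed_transcript:
        return video.preprocessed_transcript
    if video.subtitles:
        return video.subtitles
    # Neither source exists, so there is nothing to summarize.
    return None


def can_summarize(video: Video) -> bool:
    """True only when some text source is available for the video."""
    return pick_text_source(video) is not None
```

Under these assumptions, a video with no subtitles and no pre-processing yields `None`, which corresponds to the case where the summary feature does not work.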

Copilot’s summarization is not limited to the Edge browser. Within Microsoft 365, it can summarize Teams video meetings as well as calls for customer service agents, but in both cases Microsoft has to transcribe the audio first. Copilot in Microsoft Stream can summarize any video, provided the user generates a written transcript beforehand. In each case, the summary depends on a transcript or other pre-processing rather than on the video itself.

A user named Pietro Schirano posted a screen recording of Edge Copilot summarizing a YouTube video about the GTA VI trailer, and here Copilot appeared to do its job perfectly. In the recording, the user presses the “Generate video summary” button in the Copilot sidebar, and within seconds Copilot produces a summary with highlights and timestamps. The video in the demonstration, however, likely had existing subtitles or had already been pre-processed by Microsoft, which may not be true of every publicly available video.

Microsoft’s AI Copilot is part of the larger generative AI race between Microsoft and Google. Google recently upgraded the YouTube extension for its Bard chatbot so that it can summarize video content and extract specific information from videos. Both companies face limitations, however. Google’s Gemini update, for example, has drawn criticism for a demo that potentially misrepresented the AI’s capabilities and for occasionally providing inaccurate information. Parakhin, for his part, has been open on social media about the different stages of Copilot’s evolution.

In short, Edge Copilot can generate text summaries of videos, but only for videos that have been pre-processed or that have subtitles. The same dependency applies to Teams meetings, calls, and videos on Microsoft Stream, which Copilot can summarize only once the audio has been transcribed or a written transcript has been generated. The demonstration of Edge Copilot summarizing a YouTube video looks promising, but it should be read with these limitations in mind. As Microsoft continues to compete with Google in generative AI, further improvements can be expected.


