
Meta AI Project Gives Robots Eyes through Artificial Visual Cortex

Meta Platforms says it has developed AI that can replicate the visual cortex and learn sensorimotor skills by studying videos of humans.


Meta has unveiled a new generation of AI robots that can learn to perform challenging sensorimotor skills by watching videos of humans.

The company's AI researchers have developed two innovations: adaptive sensorimotor coordination (ASC) and an artificial visual cortex (VC-1) for robot AI. ASC is a framework that allows robots to learn from videos of humans performing everyday tasks and then adapt their actions to different environments and embodiments.

Meta is another company focusing heavily on AI. However, it is playing catch-up with Microsoft, which is so far the dominant mainstream AI player thanks to its partnership with OpenAI.

Robotic AI Vision and Movement Without Real-World Data Collection


These two developments will allow AI-powered robots to function without needing to collect real-world data first. We often think of AI as a sort of brain, but what if it could also have a body? The future of autonomous robots with AI learning and generative capabilities is still some way off. However, if 2023 has shown anything, it is that AI is here and development is racing ahead.

VC-1 is a perception model that is compatible with a wide range of sensorimotor skills, environments and embodiments.

“VC-1 is trained on videos of people performing everyday tasks from the novel Ego4D dataset created by Meta AI and academic partners. And VC-1 matches or surpasses state-of-the-art results on 17 different sensorimotor tasks in virtual environments,” Meta's press release says.

The researchers said that they were inspired by the human visual cortex, the brain region that (along with the motor cortex) enables an organism to convert vision into motion. According to the team, they wanted to develop an artificial visual cortex that could enable robots to learn from videos of human interactions with the real world and simulated interactions within virtual worlds.
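The core idea of an artificial visual cortex can be sketched as a frozen, pretrained image encoder that many task-specific policy heads share. The toy sketch below uses a fixed random projection as a stand-in for the pretrained encoder; the names, shapes, and architecture here are illustrative assumptions, not Meta's actual VC-1 model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a pretrained "visual cortex": a frozen encoder mapping
# raw pixels to a compact embedding. The real VC-1 is a large pretrained
# vision model; a fixed random projection plays its role in this sketch.
ENCODER_W = rng.normal(size=(64 * 64 * 3, 128)) / np.sqrt(64 * 64 * 3)

def visual_cortex(image):
    """Frozen perception model: image -> 128-d embedding."""
    return np.tanh(image.reshape(-1) @ ENCODER_W)

class PolicyHead:
    """Small task-specific head that sits on top of the frozen encoder."""
    def __init__(self, action_dim):
        self.w = rng.normal(size=(128, action_dim)) * 0.01

    def act(self, image):
        # Only this head would be trained per task; the cortex stays fixed.
        return visual_cortex(image) @ self.w

# The same frozen cortex can serve different tasks and embodiments:
pick_policy = PolicyHead(action_dim=7)  # e.g. a 7-joint arm
nav_policy = PolicyHead(action_dim=2)   # e.g. a wheeled base

frame = rng.random((64, 64, 3))
print(pick_policy.act(frame).shape)  # (7,)
print(nav_policy.act(frame).shape)   # (2,)
```

The point of the design is reuse: one perception model is trained once on egocentric video and then shared, which is what lets it be evaluated across 17 different sensorimotor tasks.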

ASC is a framework that adapts the robot's actions to different environments and embodiments, and it performs near-perfectly in physical settings. According to the blog post, it achieves a 98% success rate in mobile pick-and-place, which involves moving towards an object, picking it up, carrying it to another location, and placing the object.
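The task just described is a sequence of coordinated skills. The following toy simulation, which is not Meta's ASC code, shows how chaining skills with per-skill retries can yield a very high end-to-end success rate; the skill names and success probabilities are invented for illustration.

```python
import random

# Illustrative skill chain for mobile pick-and-place (not Meta's ASC).
SKILLS = ["navigate_to_object", "pick", "navigate_to_goal", "place"]

def run_skill(name, p_success=0.995):
    """Each low-level skill succeeds with some probability (assumed)."""
    return random.random() < p_success

def pick_and_place(max_retries=2):
    """Run the skills in order, retrying a failed skill a few times."""
    for skill in SKILLS:
        for _ in range(max_retries + 1):
            if run_skill(skill):
                break  # skill succeeded, move to the next one
        else:
            return False  # skill kept failing; the episode fails
    return True

random.seed(0)
episodes = [pick_and_place() for _ in range(1000)]
print(sum(episodes) / len(episodes))  # overall episode success rate
```

The design point is that coordinating reliable low-level skills, with recovery when one fails, is how a long-horizon task can reach the kind of success rate the blog post reports.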


Creating New Ways for Robots to Learn by Watching Human Interactions

One of the key challenges of AI is that it needs data to learn from. Meta's researchers developed “new ways for robots to learn, using videos of human interactions with the real world and simulated interactions within simulated worlds”.

The research was presented at the International Conference on Learning Representations (ICLR) 2023 and published in a paper titled “Where are we in the search for an Artificial Visual Cortex for Embodied Intelligence?”.

Meta's research is important because it represents a breakthrough in real-world object detection, visual capabilities, and movement. Computing AI can already detect objects in images and act as a “seeing” model for software; Microsoft's recent Azure Cognitive Service for Vision is a good example.

Azure Cognitive Service for Vision – which is now available in preview – gives developers tools for integrating visual components into their apps. For example, the platform provides image analysis, facial detection, image tagging, text reading, text extraction with optical character recognition (OCR), and facial recognition.
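To make the feature list concrete, the sketch below composes an Image Analysis request for such a service without sending it. The endpoint, the `api-version` string, and the feature names are assumptions for illustration; consult the Azure documentation for the current values before using them.

```python
from urllib.parse import urlencode

def build_analyze_request(endpoint, key, features):
    """Compose URL and headers for an image-analysis call (sketch only).

    The api-version value and feature names below are assumed, not
    verified against the live service.
    """
    query = urlencode({
        "api-version": "2023-02-01-preview",  # assumed version string
        "features": ",".join(features),
    })
    url = f"{endpoint}/computervision/imageanalysis:analyze?{query}"
    headers = {
        # Standard Azure Cognitive Services subscription-key header.
        "Ocp-Apim-Subscription-Key": key,
        "Content-Type": "application/json",
    }
    return url, headers

url, headers = build_analyze_request(
    "https://example.cognitiveservices.azure.com",  # placeholder endpoint
    "<your-key>",
    ["caption", "read", "tags"],
)
print(url)
```

A real client would then POST an image URL or image bytes to that address and parse the JSON response for captions, tags, or OCR text.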

Seeing AI is another Microsoft project with goals similar to Meta's work. Launched back in 2017, the service is something of a precursor to the sort of real-world vision AI needed for modern robotics. The iOS app uses computer vision to give visually impaired users a description of their surroundings and environment.

Once downloaded, users point their camera at a person and let the AI take over. The app will identify the person and describe their apparent emotion. Seeing AI also works on items, such as products.

Also in 2017, during its annual Build conference, Microsoft discussed how it is using AI in camera technology to make workplaces safer. Visual AI models can scan environments for hazards and warn users. This technology is similar to Seeing AI and can also function through a smartphone camera.

Part of the Azure Edge AI service, this in-camera model works in similar ways to Meta's AI. One core difference is that Meta is teaching its AI to be independent and learn by observing real-world interactions and human movements. It does not require a purpose-built, labeled dataset, which is why it could be a huge leap forward for robotics.

Meta Turbocharging Development to Catch Microsoft

Last month, Meta CEO Mark Zuckerberg revealed the company is merging its AI development teams into a single division. He says the company wants to “turbocharge” its AI development. Meta transitioned away from its Facebook-first identity to focus on the development of the metaverse. The company saw augmented reality technology as the next major breakthrough.

However, AI mainstreaming has reached a new level and Meta was one of the companies caught off guard by Microsoft's products such as Bing Chat, Microsoft 365 Copilot, and Azure OpenAI Service. While Microsoft is already legitimately an AI company, its Big Tech rivals are not as mature in their AI development.


Luke Jones
Luke has been writing about all things tech for more than five years. He is following Microsoft closely to bring you the latest news about Windows, Office, Azure, Skype, HoloLens and all the rest of their products.
