Microsoft Research Develops AI that Learns Through Positive Human Feedback

Microsoft Research has detailed a new artificial intelligence framework that learns through positive affectivity such as a smile.


Positive feedback is a tried and tested learning technique among humans. It takes the form of a signal such as a smile, an emotion, or a feeling that rewards people. Studies have shown that we are more satisfied with our progress toward learning goals when positive affectivity is used. With that in mind, Microsoft Research wants to extend the idea to artificial intelligence (AI).

Specifically, the team wants to integrate the concept of positive feedback into machine learning models. Such a training technique would help AI build confidence and work more effectively toward fulfilling a goal.

According to the research team running the project, reinforcement learning is often used to achieve a goal through policy rewards. However, these rewards are often too broad, so intrinsic rewards are more effective. These rewards depend on the task at hand and can more accurately highlight success in achieving a goal.

“We present a novel approach leveraging a task-independent intrinsic reward function trained on spontaneous smile behavior that captures positive affect.”

Looking for such an intrinsic approach in AI, Microsoft Research built a framework that is motivated by human feedback, in particular by signs of positive affect such as happiness and a smile. A computer vision platform makes this possible.

Reward System

Through the reward system, the framework predicts responses such as a human smile and treats them as positive feedback toward a general policy.
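To make the idea of a smile-based reward more concrete, here is a minimal Python sketch. It is not the researchers' implementation. The function names, the additive blend of task reward and smile reward, the `weight` parameter, and the simple running-average update are all assumptions made for illustration, and the "vision model" is a stand-in that only clamps a number to the range 0 to 1.

```python
def smile_probability(frame_score: float) -> float:
    """Hypothetical stand-in for a vision model that scores a camera frame.

    Assumption: a real system would run a trained network on images; here the
    input is already a number and is only clamped to the range [0, 1].
    """
    return max(0.0, min(1.0, frame_score))


def combined_reward(extrinsic: float, frame_score: float, weight: float = 0.5) -> float:
    """Add a weighted intrinsic reward (predicted smile) to the task reward.

    Assumption: the additive blend and the weight are illustrative choices.
    """
    return extrinsic + weight * smile_probability(frame_score)


def update_value(value: float, reward: float, learning_rate: float = 0.1) -> float:
    """Move a running value estimate a small step toward the observed reward."""
    return value + learning_rate * (reward - value)


if __name__ == "__main__":
    value = 0.0
    # Three made-up frame scores; the task itself gives no reward here.
    for frame_score in (0.2, 0.9, 0.9):
        value = update_value(value, combined_reward(0.0, frame_score))
    print(f"value estimate after three steps: {value:.3f}")
```

In this sketch the agent's value estimate rises even when the task reward is zero, which is the general role an intrinsic reward plays: it gives the learner a signal to follow when the task offers none.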

“Here we were not attempting to mimic affective processes, but rather to show that functions trained on affect like signals can lead to improved performance,” wrote the coauthors of the paper.

“In summary, we argue that such an intrinsically motivated learning framework inspired by affective mechanisms can be effective in increasing the coverage during exploration, decreasing the number catastrophic failures, and that the garnered experiences can help us learn general representations for solving tasks including depth estimation, scene segmentation, and sketch-to-image translation.”

Luke Jones
Luke has been writing about all things tech for more than five years. He is following Microsoft closely to bring you the latest news about Windows, Office, Azure, Skype, HoloLens and all the rest of their products.