Building a Conversational Social Robot - A Complete Guide! (2024)

The figure below displays the basic architecture to build a conversational social robot, using QTrobot and ChatGPT. The code is available on LuxAI GitHub.

Building a Conversational Social Robot - A Complete Guide! (1)

Now, let’s delve into each part of the implementation in detail!

Online speech recognition

Large language models work with text inputs, also called prompts. Our AI Chatbot will be using online speech recognition. The advantage of it is that the voice can be analyzed as a stream, the user can speak as long as they need, and the transcription of the speech is quite fast. The disadvantage of using Google Speech is that the language of communication needs to be known in advance. The output of the speech recognizer is a transcription of the speech which we can use as our prompt.

For that, we created Google speech recognition wrapper pre-installed on QTrobot. To this wrapper work, you will need to set up a Google Cloud account, get the Google API credentials, and run the Google speech app. Detailed instructions for setting up the account and getting the API credentials can be found in this link.

All we need to do is define and call the Google Speech ROS service as below:

The parameter language can be set to any other language that Google Speech covers. Any other speech recognition service for any language can be used instead of Google Speech, just try it out!

Language understanding and generation with OpenAI GPT models

OpenAI’s GPT (Generative Pre-trained Transformer) is a language model that uses machine learning to generate natural language text. GPT can complete sentences, paragraphs, and entire articles, based on the large amount of data it has been trained on. GPT has been used in various applications, including chatbots, content generation, and text summarization.

We will use GPT to generate responses from speech recognition transcripts.

To make this task simple, we have created two Python classes for OpenAI models: one utilizing the “text-davinci” model and another for the “gpt3.5-turbo” model. We have implemented history tracking and included a system message for GPT to give QTrobot a personality, as shown below. You can check the full code at this link and read more about OpenAI here.

Davinci3 Class:

The system_message stores one part of the prompt that helps to create the identity of the robot. The recognised transcript of the speech is attached to it and both together form a prompt that is sent to the language understanding and generation model, GPT in this example.

We can now call OpenAI to get a generated response from the GPT model.

To make this simple example work nicely, we have created the TaskSynchronizer, a python class that helps to synchronise speech processing and language generation. You can learn more about it here.

Postprocessing, emotion recognition and embodied response

Emotions and sentiments are analysed at two points:

  1. The user’s input is analyzed so that the chatbot can express empathy based on the user’s emotions and feeling detected;
  2. The response from the language model is analyzed in order to embody the emotional expressions detected in the generated language.

We use NLTK and text2emotions for the analysis of emotions, sentiment. The response from the language model is also analysed in term of sentence structure and keywords. In each sentence, we analyse keywords and add some gestures to the robot output accordingly. For example, if there is a ‘yes’ in one of the sentences, we would like QTrobot to nod with the head, if we detect a ‘no’, the robot would shake the head showing an embodied negative response. While we have implemented this feature for a few selected keywords, you have the flexibility to extend it to any word you like and make the model for embodiment of the generated textual response more complex.

With the recognised emotions and sentiments, we can use QTrobot’s facial expressions, gestures and text-to-speech to generate an multimodal response to the user’s prompt.

This is a very simple approach, and you can easily try out your own with a more complex models of emotions and postprocessing!

Multiturn conversations with QTrobot

After QTrobot shows and says the entire response, we call the Google Speech ROS service again and repeat the entire process. This enables you to have endless conversations with QTrobot, using OpenAI’s GPT model.

Giving the robot an identity

Speaking from different social roles in different social situations requires speakers to behave and to speak differently. We can achieve these differences in the robot’s behavior by writing prompts that make a language model to generate language that is closer to the desired identity. For example, the robot can be a tourist guide or a teacher’s assistant, it can be an astronaut who just returned from space or an artificial companion that helps learners of a foreign language to practice conversation. Writing proper prompts is an art! You can find some advice on how to write good prompts here: https://learnprompting.org/ and here: https://www.promptingguide.ai/.

In our GitHub repository, you can find five pre-configured characters for the ‘gpt-3.5-turbo’ model: qtrobot, fisherman, astronaut, therapist and gollum. If you want to try them out, you just need to change the character parameter on line 12 in the ‘gpt_bot.yaml’ file and have fun talking to them.

To use custom character prompt you can change the parameter ‘use_custom’ to true and write your ‘prompt’ parameter instead of the template that you find in the ‘gpt_bot.yaml’. QTrobot will then take your prompt to trigger responses from the GPT model.

Conclusion and further directions

Now that you have seen how to build a conversational social robot assistant using Google Speech Recognition and OpenAI’s GPT model, you can access the full code in our GitHub repository. Feel free to use it as a foundation and add as many features as you would like to customize your chatbot!

About LuxAI and QTrobot

LuxAIis the founder, developer, and manufacturer of QTrobot and distributes QTrobots to countries around the world. QTrobot platform for research and development combines the best-in-the-market hardware components with a friendly design. QTrobot is a robust platform suitable for intensive working hours and multi-disciplinary research projects on social robotics and human-robot interaction. That makes QTrobot the ideal companion for researchers and developers in the field of social robotics.

QTrobotis a humanoid social robot with extensive capabilities to be used for research and development. QTrobot is a helpful tool in delivering best practices in child education, especially for children with autism and special educational needs. Being a robust platform with extensive built-in features, QTrobot can be used in many ways to support education and conducting research projects.

Building a Conversational Social Robot - A Complete Guide! (2024)

References

Top Articles
ITV3 HD - Today's TV | TV Guide
Flights from San Francisco to San Salvador: SFO to SAL Flights + Flight Schedule
San Fernando Craigslist Pets
Review: Chained Echoes (Switch) - One Of The Very Best RPGs Of The Year
Csl Plasma Birthday Bonus
Best Fantasy Basketball Team
Franklin City School District - Ohio
Big Lots $99 Fireplace
Xfinity Store By Comcast Branded Partner Fort Gratiot Township Photos
The Four Fours Puzzle: To Infinity and Beyond!
Five Guys Calorie Calculator
My Fico Forums
Best Amsterdam Neighborhoods for Expats: Top 9 Picks
Sky Park Stl Coupon
Rugged Gentleman Barber Shop Martinsburg Wv
Clayton Grimm Siblings
Pay Vgli
Female Same Size Vore Thread
Vineland Daily Journal Obits
2024 Chevrolet Traverse First Drive Review: Zaddy Looks, Dad-Bod Strength, Sugar Daddy Amenities
Pella Culver's Flavor Of The Day
Point After Salon
Roomba I3 Sealing Problem With Clean Base
Pokemon TCG: Best Japanese Card Sets
Food Handlers Card Yakima Wa
Slim Thug’s Wealth and Wellness: A Journey Beyond Music
Woude's Bay Bar Photos
Guide for The Big Con
Sdsu Office Of Financial Aid
O2 eSIM guide | Download your eSIM | The Drop
Obituaries Cincinnati Enquirer
Wells Fargo Arena Des Moines Seating Chart Virtual View
Vhl Spanish 2 Answer Key
Craigslist For Port Huron Michigan
Donald Vacanti Obituary
Expend4bles | Rotten Tomatoes
Puppies For Sale in Netherlands (98) | Petzlover
Detroit Lions Den Forum
Dinar Guru Recaps Updates
Press-Citizen Obituaries
New York Rangers Hfboards
Blog:Vyond-styled rants -- List of nicknames (blog edition) (TouhouWonder version)
Inter Miami Vs Fc Dallas Total Sportek
Wbap Iheart
Detroit Area Craigslist
Craigs List Williamsport
Ukg Dimensions Urmc
Why Did Anthony Domol Leave Fox 17
Buhsd Studentvue
Dukes Harley Funeral Home Orangeburg
Exceptions to the 5-year term for naturalisation in the Netherlands
Priority Pass: How to Invite as Many Guests as Possible to Airport Lounges?
Latest Posts
Article information

Author: Kieth Sipes

Last Updated:

Views: 5693

Rating: 4.7 / 5 (67 voted)

Reviews: 90% of readers found this page helpful

Author information

Name: Kieth Sipes

Birthday: 2001-04-14

Address: Suite 492 62479 Champlin Loop, South Catrice, MS 57271

Phone: +9663362133320

Job: District Sales Analyst

Hobby: Digital arts, Dance, Ghost hunting, Worldbuilding, Kayaking, Table tennis, 3D printing

Introduction: My name is Kieth Sipes, I am a zany, rich, courageous, powerful, faithful, jolly, excited person who loves writing and wants to share my knowledge and understanding with you.