top of page
karlavisionfactory

Building smart Julie with PYTHON and OPENAI API

Updated: Feb 7


Leonel from Vision Factory

Author: Leonel Silima

Date of Publication: 12/05/2023




Contextualization


It seems that the title of this article has piqued your curiosity. Yes, this article was created for this purpose. I think the first question for you is who is smart Julie? That's exactly what I wanted you to ask because our main goal is to answer what smart Julie is. However, in a nutshell, smart Julie is a friend who, in addition to being kind, is also very intelligent, capable of answering all your questions and even giving you advice if you ask for it.


Smart Julie

Technically, smart Julie will be a virtual assistant that we will build with Python and Openai API. So, we will be able to answer any question that is put to her, both academically and socially. However, the question has to be well formulated and the pronunciation of the words sufficient in the language that she intends to use. Therefore, in this project, we will use English but any internationally recognized language can be used as well.


What do we need to complete this project?


The first step for this project is to know how to program with python because the knowledge of installing libraries and connecting to an API will be essential. But don't worry because we will explain it step by step. So, you just need to follow all the steps and at the end, we will provide the link from GitHub with the source code.


Necessary tools


The tools needed for this project are Python programming language and the following libraries:

  • SpeechRecognition is a Python library that allows you to recognize human speech and convert it into text. In particular, this can be used in various applications, such as virtual assistants and audio transcription. To use the library, you need to have a microphone connected to your computer and install the library using a package manager like Pip.

  • OpenAI API is an application programming interface (API) made available by OpenAI. This way developers can use the artificial intelligence models developed by the company in their own applications.

  • GTTS stands for "Google Text-to-Speech" and is a free API made available by Google to convert text to speech. The API allows developers to integrate speech synthesis functionality into their own applications, using a variety of voices and languages.

  • Pygame is a set of Python libraries aimed at game and multimedia development. Specifically, libraries include modules for window and event management, 2D graphics, image manipulation, audio, keyboard and mouse input, collision detection, and much more. In our case, we will use it for audio manipulation.

  • Pyttsx3 is based on different speech synthesis engines like SAPI5 (Speech API 5) on Windows and NSSpeechSynthesizer on macOS. In addition, it supports multiple languages ​​and voices, as well as real-time audio playback.

  • OS: the os module is a standard Python library that provides a way to interact with the operating system. Actually, it allows you to work with the file system, get information from the environment, manage processes, change environment variables, and much more.

  • TIME the time module is a Python standard library that allows you to work with functions related to time, such as measuring durations. That is waiting for a certain period before executing a certain action, formatting dates and times, among other features.

  • Random: the random module is a standard Python module that provides functions for generating pseudorandom numbers. In fact, these functions are useful for various applications such as games, simulations, and cryptography.

  • Text editor is where we write our code in this case, we use vscode but you can choose your preferences.

Coding

  1. Installation and import of libraries

Before importing the libraries make sure that you have Python installed on your computer and that your microphone is operational because some libraries require it. Regarding the procedures for installing Python and libraries, find in this article where we present all the details for the purpose, below we present the imports.

  1. Coding

Next, we call our API key in the form of a string


We define the function that transforms the voice captured in the mic to text


Afterward, we define the function that will transform the text (message) into audio, I leave both for the gTTS library and for pyttsx3.


Final considerations


Once you have completed all the steps, you can talk to our assistant Julie and she is already in a position to answer your questions. So, don’t lose time! Find the source code here and test it on your own computer and have fun. Remember that the next article will be much more interesting because we are going to convert Julie into an installable setup on your computer or smartphone don't miss it out!


 

You can also read about:


 

Reference List

36 views0 comments

Comments


bottom of page