Consider creating an application to work with ChatGPT. Its task will be to receive voice messages via Telegram, convert them to text, transfer them to ChatGPT, and send a response back to Telegram. During the creation process, you will need to go through the following steps:

  1. Creating of a bot for Telegram and its basic setup.
  2. Receiving a file with a voice message (Telegram uses the .oga format).
  3. Convert file from oga to mp3.
  4. Transferring the mp3 file to the Whisper service for conversion to text.
  5. Sending a received text to ChatGPT.
  6. Sending a response from ChatGPT to a user in Telegram.

Creating a bot for Telegram and its basic setup

It all starts with the very fact of registering a bot. And for this, Telegram uses its own special bot - BotFather. You need to write to him and go through simple sequential steps.

Telegram BotFather

As a result, you should have a link to your bot and a token for requests to its API.

Now you can move on to setting up the bot in AppMaster. The easiest way to do this is to install the appropriate module. You just need to select it in the list of modules and specify your API token in the settings.

Telegram Module

The necessary models will be automatically created in your project, blocks for working with Telegram will appear, as well as the basic Telegram business process Telegram: Echo. It is worth considering carefully and understanding the principle of work.

Telegram Echo business process

At the input, it receives a message from Telegram (Message model). It contains the text of the sent message (text), as well as the chat model from which you can get the sender's identifier (id). The last block Telegram: Send Message sends the received message back to the sender, but on behalf of the bot.

You can use this business process for the initial testing of the bot and communication with it. To do this, you need to create an endpoint that will receive information and start a business process.

Telegram bot endpoint

When creating it, it is important to disable Middleware Token Auth. The endpoint must be open for use without authorization.

The principle of operation of the Telegram bot is quite simple - all messages that will be sent to it go to a special webhook, which can automatically transfer them further and send them to the endpoint of your choice for further actions.

Accordingly, the last step remains to activate the bot - you need to register this endpoint in Telegram and indicate that this bot should be associated with it. This requires sending a POST-request with the full endpoint URL and indicating your bot token instead of {Bot API Token} to the address

https://api.telegram.org/bot{Bot API Token}/setWebhook

If you receive such a message in response, then everything was done correctly.

 {

    "ok": true,

    "result": true,

    "description": "Webhook was set"

}

Post Telegram Webhook

The bot is ready to work, you can send him a message and get it back.

Receiving a file with a voice message

The Telegram module is designed primarily for working with text messages. And our task is to get a file with a voice message. With AppMaster, you can easily solve this problem. First, we need to analyze what we generally receive from Telegram to parse the structure of the message. The Get Request Body block is designed for this. It eliminates the need to specify the request structure in advance and allows you to receive the entire request, regardless of its content. This block returns the query result as a set of bytes, and you can use the To String block to present the result in a human-readable form, as well as save it in logs (Write to Log block) for further analysis.

Telegram Request log

We are interested in two parameters from the entire request:

Sender ID - specified in the request as "id":300493858
File ID - "file_id":"AwACAgIAAxkBAAMzZBk6QRvO-OYWsWUC-Bu1UXDM2FwAAkktAAKTZclIWTSkfTTw8wYvBA"

You can create your own model that matches the request and use it to get the required fields. But it will be faster to create a regular expression (Regex) and use it. To do this, the String Match Regex block takes the expression itself as input, as well as the string in which the match with the given expression will be checked.

In the first case, the expression is "id":\d+

As a result, we get the string "id":300493858, from which we will need to remove the extra ("id":) using the Replace String block and leave only the identifier itself.

In the second case, the principle is exactly the same, but a slightly more complex expression is used: "file_id":"[^"]+

Regex to get request ID

Now we have the sender id and the file id, and we can use that to get the file itself. To do this, you need to turn to the Telegram API. This has already been done before when registering the endpoint of the bot. Now you need to make a similar request to get the file. {File ID} in the request URL must be replaced with the received file ID.

https://api.telegram.org/bot{Bot API Token}/getFile?file_id={File ID}

To send a request and receive its result, we use the HTTP Request block, specifying URL and Method = GET as parameters for it.

Telegram File Request

From the received response, you can find out the relative path to the file, it is passed in the "file_path" parameter. Accordingly, using the next regular expression ("file_path":"[^"]+) you can extract the desired value and connect with “https://api.telegram.org/file/bot{Bot API Token}/” to get the full link to the file.

Convert file from OGA to MP3

The file is received, but the obstacle is that the Whisper service does not support working with the OGA format. You need to convert to one of the appropriate formats.

As an example, the Zamzar service is used (its free plan supports the ability to make 100 conversions per month) and converting to MP3.

You can refer to its documentation for details or use another similar service. We will not analyze the work with it in detail, and we will consider only the part that relates directly to the implementation of AppMaster.

First of all, the request will need the correct authentication data. They must be provided in the Basic Authentication format. To do this, you need to pass a header with values in the request:

Key = 'Authorization'

Value = 'Basic '+ User ID and password separated by “:” in base64 format

The API key obtained when registering with the service is the user ID. You need to add “:” to it and encode it in Base64 format using the To Base64 block. The result needs to be turned into a header (Make Key-Value (String) block).

Zamzar Auth Header

The next step is to create a model for the query in the database designer. The request must be sent in the Multipart Form format, respectively, it is necessary to prepare a model of the form of this request. In our example, the model consists of three fields of type String:

  • source_file - the full path to the source file (it was learned in the previous step).
  • source_format - source file format, in this example, it is a fixed value "ogg".
  • target_format - target format for conversion. You can choose any format that is supported by Whisper. Let's use "mp3" as an example.

Zamzar Request Model

In the business process editor, you need to use the Make block to fill in the model data and send it as a POST request to https://sandbox.zamzar.com/v1/jobs/ using the HTTP Request block (be sure to specify Serialize request body = Multipart Form).

Zamzar Conversion Request

It should be noted that this request does not return the converted file but only creates a task to convert it. You need to apply for the result separately; for this, you need the ID of the created task. This ID must be obtained from the body of the response to the request, and for this, the already worked out process should be done using regular expressions and extracting the id value.

The result of the conversion must be applied separately. This will require two more requests. The first is to find out if the result is ready. The second is to pick up the finished file. At the same time, we do not know the exact time of readiness, so we can organize a loop that will send repeated requests to check readiness at certain intervals (for example, every second).

Zamzar Conversion Check Loop

An HTTP Request must be sent using the GET method to the URL https://sandbox.zamzar.com/v1/jobs/{id}, where {id} is the task id obtained in the previous step. This uses the same headers as in the previous request.

From the received response, you need to find out the readiness status. If the conversion is completed, the response will contain "status": "successful" and for us, this is a signal that we can complete the loop and move on.

In addition to the status, the response must contain the ID of the finished file ("target_files":[{"id":). It must be extracted to get the final link to the file in the form https://sandbox.zamzar.com/v1/files/{ID}/content

At the same time, receiving a file is available only to authorized users, so you need to execute an HTTP Request using the same headers as in previous requests.

As a result of the request, the contents of the file will be obtained, which must be given a name and saved for further use.

Zamzar save converted file

Sending an MP3 file to Whisper for conversion to text

Now everything is ready for the next step - sending a file with a voice message for recognition. This will require another request in the Multipart Form format. Only unlike the previous example, the request will need to transfer the file itself and not a link to it.

A model for such a request can be created in the External HTTP Request section. In this case, you can not create a request completely but limit yourself only to creating a request body model. The model itself consists of two parameters:

  1. File (Virtual File type) - the same file that needs to be recognized.
  2. model (type String) - here we specify the value whisper-1.

Whisper request model

Also, for the request, it is necessary to obtain a key for working with the OpenAI API and generate an authorization header of the Bearer Token type.

Key = 'Authorization'

Value = 'Bearer '+ OpenAI API Key

Next, you can send the POST request itself to recognize the voice message to the Whisper service at the URL https://api.openai.com/v1/audio/transcriptions

Whisper HTTP Request

As a result of successful recognition of the file, a response will be received in the form {"text": "Hello world.”}

Sending a received text to ChatGPT

You can continue to use HTTP Request blocks to send a request to ChatGPT. To explore the API documentation, as well as independently create models for requests and responses. But you can also use a simpler option in the form of a ready-made module from AppMaster for working with OpenAI, which must be installed in the modules section.

OpenAI Module

In the minimum sufficient version, you only need to specify the parameters of the OAI ChatCompletionMessage model (role = user, content = message to be sent), add it to the array, and send a request to ChatGPT with the OpenAI: Create Chat Completion block (set parameter model = gpt-4).

OpenAI Request

As a result, we get a response from ChatGPT. We read it from the content parameter of the OAI ChatCompletionChoice model.

ChatGPT Response

Sending a response from ChatGPT to a user in Telegram

The last step is only to repeat what has already been done before - send a message to Telegram. But if we started by simply returning the message back to the sender, now this message has launched a series of various actions and the result is returned as a response from ChatGPT.

Telegram send ChatGPT response

In the process of developing such a bot, it is worth considering:

  1. Telegram works in such a way that each request sent by the bot must be successfully processed. Otherwise, he will try to repeat it many times, which means that in case of any problems you will receive it again and again. Make sure that the logic is built in such a way that the request does not result in an error and can be processed successfully.
  2. Not all requests will work as intended on the first try. You will need to search for errors, and for this, arrange to Write to Log blocks. They are useful both for checking that each block actually receives the data that you planned to transmit and for analyzing the result of the block and studying the response in detail.
  3. As part of the guide, we assumed that all requests are predictable and successful and the process is fairly linear. In practice, it will be necessary to provide a variety of options for actions (for example, what to do if the user sends a text message instead of a voice message or even a picture) and error handling (checking the status of HTTP requests, take into account different response options).
Was this article helpful?

AppMaster.io 101 Crash Course

10 modules
2 weeks

Not sure where to start? Get going with our crash course for beginners and explore AppMaster from A to Z.

Start Course
Development it’s so easy with AppMaster!

Need More Help?

Solve any issue with the help of our experts. Save time and focus on building your applications.

headphones

Contact Support

Tell us about your problem, and we’ll find you a solution.

message

Community Chat

Discuss questions with other users in our chat.

Join Community