Build an audio conferencing system with programmable voice

Are you looking to build an audio conferencing solution, or simply to embed conferencing capabilities into your apps and solutions? This blog post shows you how to build one quickly using programmable voice capabilities.

Audio conferencing is the simplest way to connect people. Being able to create as many conference rooms as you need, quickly and at no cost, brings flexibility and scalability to your business. It can also increase the value of your solution by embedding audio conferencing seamlessly in your users’ journey. Whether it is a standalone solution or a support for your business processes, audio conferencing is used everywhere, every day.

Audio conferencing in real life

Imagine yourself working in a critical support center, dealing with life-threatening emergencies. A man calls you. He is having a stroke. He needs medical assistance and an ambulance service.

Would you put this man on hold or transfer him to the medical operators, leaving him to listen to annoying hold music while he fears for his life? What other options are available to you? How could you reach the help he needs without leaving him hanging on for several minutes?

Creating an audio conference on the fly could improve that user’s journey. It would allow you to reach the medical assistance and ambulance service while keeping him on the line, so you can keep reassuring him and updating him on the actions you have taken. Finally, once you are sure the medical experts are able to take over, you can simply leave the audio conference without breaking the link between the patient and the medical operators.

Let’s imagine another example:

You run a marketplace or a platform business that connects buyers and suppliers for a product or service. Sometimes you have to handle complex closings or disputes. The two parties need to talk, and they might need a mediator or another third party to help them handle the situation.

By embedding audio conferencing in your own apps, you could allow these two parties to reach each other and, in one click, ask a third party to step in and help them.


Wazo Enterprise Programmable Communication provides the APIs to build such use cases and many other communication scenarios. Using the free visual builder Node-RED, you can build a flow that answers your business needs in a few clicks.

Create an audio conferencing system with programmable voice

Let’s dig into this example. You want to create an on-demand audio conference system, invite participants using a calendar, and manage your rooms from a simple interface.

For the purpose of this example, we built a web UI using Node-RED, but you could build it with any other solution or connect an existing one.

Use case workflow

To deploy such a use case, you need a management interface for audio conference administration: creating, managing and viewing rooms. From this interface, you also enter the email address linked to your calendar. This way, you receive the initial invitation containing all the credentials and information, and can then invite other participants in a few clicks.

These participants receive the invitation by email. You must set up all the needed information, namely the access and PIN codes, as variables.
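To make the idea of invitation variables concrete, here is a minimal sketch in Python of how access and PIN codes could be generated and injected into an invitation text. The function names, the code lengths and the wording of the invitation are assumptions made for illustration; they are not taken from the flow shown in this post, where the same logic would live in Node-RED nodes.

```python
import secrets
from string import Template


def generate_code(length: int) -> str:
    """Return a random numeric code that can be typed on a phone keypad."""
    return "".join(secrets.choice("0123456789") for _ in range(length))


# Hypothetical invitation wording; the placeholders are the "variables".
INVITATION = Template(
    "You are invited to the audio conference '$room'.\n"
    "Dial $number, then enter access code $access_code and PIN $pin."
)


def build_invitation(room: str, number: str, access_code: str, pin: str) -> str:
    """Fill the invitation template with the room's credentials."""
    return INVITATION.substitute(
        room=room, number=number, access_code=access_code, pin=pin
    )


if __name__ == "__main__":
    # Example values only: a 6-digit access code and a 4-digit PIN.
    print(build_invitation("Weekly sync", "+15550100",
                           generate_code(6), generate_code(4)))
```

Using `secrets` rather than `random` is a deliberate choice here, since these codes act as credentials for the room.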

Regarding the user experience, we offer two options in this example. With programmable voice capabilities, however, you can design it the way you want. There are no limits.

First, you could add the public DID (direct inward dialing) phone number of your freshly built audio conference room to the invitation. The participant then dials it and enters their credentials as DTMF tones directly on their phone.

The other option is to provide a public web UI whose URL is given in the invitation. This UI asks for the user’s phone number and credentials, then triggers a web callback to the phone number provided.

You could also imagine building a full integration with your existing calendar solution. This means that the creation and, when needed, deletion of rooms must be scripted and triggered by specific events. The rest of the process would stay the same.

Programmable use case | Episode #2 – Audio Conferencing (French & English subtitles)

Let’s dig into the tech

In this example, we used Wazo Enterprise Programmable Communication (programmable voice), Node-RED Dashboard (management UI), MongoDB (additional data storage) and Google Calendar.

Node-RED conferencing workflow

Part #1 – Programmable voice APIs

First, let’s focus on the “connection to room” flows, the first three lines in Node-RED.

We built this audio conference feature using Wazo’s events and API requests. “Events” are WebSocket-based nodes, meaning that we can catch something happening on our platform in real time.

Let’s start with call_entered: this event reacts to every incoming call landing on a specified DID number. We then use the Answer API to tell the platform to pick up the call.

This triggers our next event node, call_answered, followed by a Playback API request that tells the platform to play a specific audio message.

We also use call_DTMF_received, which reacts to DTMF inputs (the credentials used to access the room). At the end of these first flow lines, you will find a bridge call API request, which connects the current call to the targeted audio conference room.
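The chain of events and API requests described above can be summarized as a small dispatcher. The following Python sketch only models the decision logic: the event payload shape (`name`, `call_id`, `digits`), the action names and the audio message names are hypothetical, and no real Wazo request is sent. In the actual flow, each branch corresponds to a Node-RED event node wired to an API request node.

```python
from typing import Callable, Optional, Tuple

# An action is a (request name, parameters) pair, e.g. ("answer", {...}).
Action = Tuple[str, dict]


def next_action(
    event: dict,
    credentials_ok: Callable[[str], bool],
    room_id: str,
) -> Optional[Action]:
    """Map an incoming event to the API request the flow should send next."""
    name = event["name"]
    call_id = event["call_id"]

    if name == "call_entered":
        # A call landed on the conference number: pick it up.
        return ("answer", {"call_id": call_id})
    if name == "call_answered":
        # The call is up: prompt the caller for credentials.
        return ("playback", {"call_id": call_id, "message": "enter-credentials"})
    if name == "call_DTMF_received":
        # Credentials were typed: bridge into the room or ask again.
        if credentials_ok(event["digits"]):
            return ("bridge", {"call_id": call_id, "room_id": room_id})
        return ("playback", {"call_id": call_id, "message": "invalid-credentials"})
    # Any other event is ignored by this flow.
    return None
```

Keeping the decision logic in a pure function like this makes it easy to test without a live platform; the code that actually sends the requests can then stay very thin.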

If you want a better understanding of Wazo’s programmable voice API, please refer to our related documentation: API Reference.

Each of these events triggers functions, in which we host a few lines of code to process the data or script how the flow should react.

For example, our DTMF process function runs every time DTMF inputs are detected and verifies that the access code and PIN combination is correct.
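As an illustration of what such a function might do, here is a minimal Python sketch. It assumes that digits arrive one at a time and that the caller types the access code, then `#`, then the PIN, then `#`. That input format, the class and function names, and the sample room table are all assumptions for this example; the original function is a Node-RED function node and reads its codes from MongoDB.

```python
import hmac
from typing import Dict, Optional, Tuple


class DtmfCollector:
    """Accumulate DTMF digits until two '#'-terminated fields are complete."""

    def __init__(self) -> None:
        self.buffer = ""

    def feed(self, digit: str) -> Optional[Tuple[str, str]]:
        """Add one digit; return (access_code, pin) when both are entered."""
        self.buffer += digit
        if self.buffer.count("#") < 2:
            return None  # still waiting for more input
        access_code, pin, _ = self.buffer.split("#", 2)
        self.buffer = ""  # reset so the caller can retry after a failure
        return access_code, pin


def credentials_valid(rooms: Dict[str, str], access_code: str, pin: str) -> bool:
    """Check the access code / PIN pair against the stored rooms."""
    expected_pin = rooms.get(access_code)
    # compare_digest avoids leaking information through comparison timing.
    return expected_pin is not None and hmac.compare_digest(expected_pin, pin)


if __name__ == "__main__":
    rooms = {"482913": "7731"}  # hypothetical: access code -> PIN
    collector = DtmfCollector()
    for digit in "482913#7731#":
        result = collector.feed(digit)
    print(credentials_valid(rooms, *result))
```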

Part #2 – User experience

The second part of this workflow covers how we built the web interface. For details, I recommend reading the Node-RED Dashboard documentation.

Just note that since we need to store information such as access and PIN codes or email addresses, we used a third-party database, MongoDB. You will easily spot the database nodes, which appear in green.

You probably spotted our confd API nodes: they are used to pass creation and deletion parameters to the system.

The third and fourth rows of this flow handle the creation and deletion of rooms. Here we use a lot of confd API requests, either to retrieve information from Wazo or to update it.

First, we retrieve (GET) the context and an available extension from the existing dial plan, and then we book (POST) an extension for a new audio conference room.
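The "find an available extension" step boils down to comparing the extensions already in use with the range allowed by the context. Here is a minimal Python sketch of that selection. The function name, the numeric range and the input format (a list of extension strings, as they might come back from the GET request) are assumptions for illustration; the request itself is not shown.

```python
from typing import Iterable, Optional


def first_free_extension(
    used: Iterable[str], start: int, end: int
) -> Optional[str]:
    """Return the lowest extension in [start, end] that is not in use."""
    # Ignore non-numeric entries such as pattern extensions.
    taken = {int(exten) for exten in used if exten.isdigit()}
    for candidate in range(start, end + 1):
        if candidate not in taken:
            return str(candidate)
    return None  # the whole range is already booked


if __name__ == "__main__":
    # Example values only: 4000-4009 reserved for conference rooms.
    print(first_free_extension(["4000", "4001", "4003"], 4000, 4009))
```

The returned extension is what would then be sent in the booking (POST) request.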

Finally, we update (PUT) the information to link an audio conference room to our booked extension. Additional information is stored in a database and the initial calendar invitation is sent, using the MongoDB and Google Calendar APIs respectively.

Regarding deletion, in the three final lines of the flow, we simply delete (DELETE) the room and the extension, as well as the related information in the database.
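The creation and deletion sequences described above are ordered lists of steps, and it can help to write them down as data before wiring the nodes. The Python sketch below does that and adds a small runner that stops at the first failing step. The step targets are descriptive placeholders, not real confd, MongoDB or Google Calendar endpoints (refer to the API Reference for those), and the `execute` callback stands in for whatever actually sends each request.

```python
from typing import Callable, List, Tuple

# A step is (action, description of the target). Targets are placeholders.
Step = Tuple[str, str]


def creation_steps(exten: str) -> List[Step]:
    """Ordered steps to create a room, mirroring the flow described above."""
    return [
        ("GET", "confd: context and available extensions"),
        ("POST", f"confd: book extension {exten}"),
        ("PUT", f"confd: link conference room to extension {exten}"),
        ("STORE", "MongoDB: access code, PIN and email address"),
        ("INVITE", "Google Calendar: initial invitation"),
    ]


def deletion_steps(exten: str) -> List[Step]:
    """Ordered steps to remove a room and everything attached to it."""
    return [
        ("DELETE", "confd: conference room"),
        ("DELETE", f"confd: extension {exten}"),
        ("REMOVE", "MongoDB: room record"),
    ]


def run_steps(steps: List[Step], execute: Callable[[Step], bool]) -> List[Step]:
    """Run steps in order; stop at the first failure, return completed ones."""
    done: List[Step] = []
    for step in steps:
        if not execute(step):
            break
        done.append(step)
    return done
```

Returning the list of completed steps makes it possible to report, or undo, a half-created room if one request fails midway.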

Discover programmable communications use cases

Using Node-RED, you can easily set up workflows combining Wazo Enterprise Programmable Communication with other software. You will solve complex business challenges without compromising on your users’ needs.

If you are interested in discovering additional programmable communications use cases, we have built some for you:

  • “Easily build a voice alerting system” – Episode #1
  • “Increase the value of your voice channels” – Episode #3
  • “Turn your communication data into business analytics” – Episode #4

And if you have use cases that you can’t solve with your existing system, feel free to reach out. It would be a pleasure to discuss them and see how we can help 🙂
