Set Up Environment for Twilio Media Streams
- Ensure you have Node.js installed. A current LTS release is a safe choice for compatibility with the Twilio SDK.
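- Install the Twilio Node.js SDK, along with the ws, express, and @google-cloud/speech packages used in later steps, by running this command in your project directory:
npm install twilio ws express @google-cloud/speech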
Configure Your Twilio Client
- Import the Twilio module and initialize the client with your Account SID and Auth Token.
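const twilio = require('twilio');
// Read credentials from the environment rather than hard-coding them in source.
const client = twilio(process.env.TWILIO_ACCOUNT_SID, process.env.TWILIO_AUTH_TOKEN);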
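Credentials are safer in environment variables than in source code, and a missing variable is easier to diagnose if the process fails early. A minimal sketch of that check, assuming the variable names TWILIO_ACCOUNT_SID and TWILIO_AUTH_TOKEN and a hypothetical helper loadTwilioCredentials (neither is defined by the Twilio SDK):

```javascript
// Hypothetical helper: reads Twilio credentials from an environment-like object.
// The variable names are an assumption; use whatever your deployment defines.
function loadTwilioCredentials(env = process.env) {
  const accountSid = env.TWILIO_ACCOUNT_SID;
  const authToken = env.TWILIO_AUTH_TOKEN;
  if (!accountSid || !authToken) {
    throw new Error('Set TWILIO_ACCOUNT_SID and TWILIO_AUTH_TOKEN before starting the server');
  }
  return { accountSid, authToken };
}

// Example with placeholder values (not real credentials).
const creds = loadTwilioCredentials({
  TWILIO_ACCOUNT_SID: 'ACxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx',
  TWILIO_AUTH_TOKEN: 'your_auth_token',
});
console.log(creds.accountSid.startsWith('AC'));
```

The returned values can then be passed to the client as twilio(creds.accountSid, creds.authToken).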
Set Up WebSocket Server
- Twilio delivers call audio over a WebSocket connection, so you need a WebSocket server attached to an HTTP server.
- Use the 'ws' library for WebSocket communication.
const WebSocket = require('ws');
const http = require('http');

const server = http.createServer();
// Attach the WebSocket server to the HTTP server so both share one port.
const wss = new WebSocket.Server({ server });

wss.on('connection', (ws) => {
  console.log('New WebSocket connection established');

  ws.on('message', (message) => {
    // Twilio sends JSON text frames; the "event" field names the message type.
    const data = JSON.parse(message.toString());
    console.log(`Received event: ${data.event}`);
    // Transcription logic goes here
  });

  ws.on('close', () => {
    console.log('WebSocket connection closed');
  });
});

server.listen(8080, () => {
  console.log('Listening on port 8080');
});
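Twilio sends each Media Streams message as a JSON text frame whose event field has a value such as connected, start, media, or stop. A minimal sketch of a dispatcher that could be called from the 'message' handler above; the function name describeStreamMessage and the sample stream SID are made up for illustration:

```javascript
// Hypothetical helper: summarizes one Media Streams message for logging.
function describeStreamMessage(raw) {
  const data = JSON.parse(raw.toString()); // ws delivers a Buffer or a string
  switch (data.event) {
    case 'connected':
      return 'connected';
    case 'start':
      // The start event carries stream metadata such as the stream SID.
      return `start ${data.start.streamSid}`;
    case 'media': {
      // media.payload is base64-encoded audio; decode it to count raw bytes.
      const audio = Buffer.from(data.media.payload, 'base64');
      return `media ${audio.length} bytes`;
    }
    case 'stop':
      return 'stop';
    default:
      return `unhandled event: ${data.event}`;
  }
}

// Example with a hand-written media message (payload is 4 bytes of 0xff).
const sample = JSON.stringify({
  event: 'media',
  streamSid: 'MZ0000',
  media: { payload: Buffer.from([0xff, 0xff, 0xff, 0xff]).toString('base64') },
});
console.log(describeStreamMessage(sample)); // media 4 bytes
```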
Integrate Twilio Media Streams with WebSocket
- Return TwiML from your voice webhook that tells Twilio to stream the call audio to your WebSocket server. Set this route as the voice webhook URL for your Twilio phone number.
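const express = require('express');
// Reuses the twilio module imported earlier.
const VoiceResponse = twilio.twiml.VoiceResponse;

const app = express();

// Twilio requests this webhook when a call comes in.
app.post('/twilio-media-stream', (req, res) => {
  const response = new VoiceResponse();
  // <Connect><Stream> sends the call audio to the WebSocket URL below.
  const connect = response.connect();
  connect.stream({
    // Replace with the public wss:// address of your WebSocket server.
    url: 'wss://your-ngrok-url/ws'
  });
  res.type('text/xml');
  res.send(response.toString());
});

app.listen(3000, () => {
  console.log('Express server listening on port 3000');
});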
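For reference, the route above responds with a small TwiML document. The sketch below builds an equivalent string by hand so the structure is visible; buildStreamTwiml and escapeXmlAttr are hypothetical helpers, not part of the Twilio SDK, and the SDK's exact output formatting may differ:

```javascript
// Escape characters that are not allowed inside an XML attribute value.
function escapeXmlAttr(value) {
  return value
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Hypothetical helper: returns TwiML that connects the call to a WebSocket stream.
function buildStreamTwiml(streamUrl) {
  return (
    '<?xml version="1.0" encoding="UTF-8"?>' +
    '<Response><Connect>' +
    `<Stream url="${escapeXmlAttr(streamUrl)}"/>` +
    '</Connect></Response>'
  );
}

console.log(buildStreamTwiml('wss://your-ngrok-url/ws'));
```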
Utilize Ngrok for Local Testing
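- Use Ngrok to expose your local servers to the internet so that Twilio can reach both the webhook and the WebSocket endpoint.
- Start Ngrok by running the following command. It tunnels port 3000, where Express listens. The WebSocket server on port 8080 needs its own tunnel (ngrok http 8080) unless you attach it to the same HTTP server as Express: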
ngrok http 3000
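Ngrok prints an https forwarding URL, while the Stream url in the TwiML must use the wss scheme. A small sketch of the conversion, assuming a hypothetical helper toStreamUrl and a placeholder hostname:

```javascript
// Hypothetical helper: turns an ngrok forwarding URL into a wss:// stream URL.
function toStreamUrl(forwardingUrl, path = '/ws') {
  const { protocol, host } = new URL(forwardingUrl);
  if (protocol !== 'https:') {
    throw new Error('Expected an https forwarding URL');
  }
  return `wss://${host}${path}`;
}

console.log(toStreamUrl('https://example.ngrok.app')); // wss://example.ngrok.app/ws
```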
Handle Incoming Audio and Transcription
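- Twilio sends each WebSocket message as JSON. Audio arrives in media events, whose media.payload field holds base64-encoded 8 kHz mu-law audio. Pass that payload to a transcription service such as Google’s Speech-to-Text API.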
const speech = require('@google-cloud/speech');
// Named speechClient so it does not clash with the Twilio client declared earlier.
const speechClient = new speech.SpeechClient();

// Transcribes one chunk of base64-encoded 8 kHz mu-law audio.
async function transcribeAudio(audioBase64) {
  const request = {
    audio: {
      content: audioBase64
    },
    config: {
      encoding: 'MULAW',
      sampleRateHertz: 8000,
      languageCode: 'en-US'
    }
  };

  const [response] = await speechClient.recognize(request);
  // Join the top alternative of each result into one string.
  const transcription = response.results
    .map(result => result.alternatives[0].transcript)
    .join('\n');
  console.log(`Transcription: ${transcription}`);
}
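wss.on('connection', (ws) => {
  ws.on('message', async (message) => {
    // Twilio sends JSON; only "media" events carry audio.
    const data = JSON.parse(message.toString());
    if (data.event === 'media') {
      // media.payload is base64-encoded mu-law audio.
      await transcribeAudio(data.media.payload);
    }
  });
});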
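Each media message carries only a short slice of audio, so sending every message to recognize separately is unlikely to produce useful text. One option, sketched below, is to accumulate decoded chunks and transcribe about a second at a time; a streaming recognition API is another. The AudioChunkBuffer class and the one-second threshold are assumptions for illustration. At 8000 samples per second and one byte per mu-law sample, one second is 8000 bytes.

```javascript
// Hypothetical helper: collects base64 audio chunks until a byte threshold is reached.
class AudioChunkBuffer {
  constructor(flushBytes = 8000) { // 8000 bytes = 1 s of 8 kHz mu-law audio
    this.flushBytes = flushBytes;
    this.chunks = [];
    this.size = 0;
  }

  // Adds one media.payload value. Returns a Buffer once enough audio has
  // accumulated, otherwise null.
  push(payloadBase64) {
    const chunk = Buffer.from(payloadBase64, 'base64');
    this.chunks.push(chunk);
    this.size += chunk.length;
    if (this.size < this.flushBytes) return null;
    const combined = Buffer.concat(this.chunks, this.size);
    this.chunks = [];
    this.size = 0;
    return combined;
  }
}

// Example: 160-byte chunks fill a one-second buffer after 50 pushes.
const audioBuffer = new AudioChunkBuffer();
const silence = Buffer.alloc(160, 0xff).toString('base64');
let flushed = null;
let pushes = 0;
while (flushed === null) {
  flushed = audioBuffer.push(silence);
  pushes += 1;
}
console.log(pushes, flushed.length); // 50 8000
```

The returned Buffer can be converted back with toString('base64') before it is handed to transcribeAudio.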
Final Testing
- After setting everything up, make a test call to your Twilio number and ensure that audio is streamed through your WebSocket and transcribed appropriately.
- If no transcription appears, check that the server logs the connected, start, and media events in that order, and that the Stream URL uses wss:// and points at the tunnel for the WebSocket port.
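One way to check that messages are arriving as expected is to count events by type and watch for gaps in sequenceNumber, which Twilio sends as a string that starts at 1 and increments with each message. A minimal sketch; createStreamStats is a hypothetical helper, and the sample messages are hand-written:

```javascript
// Hypothetical helper: tracks event counts and sequence gaps for one stream.
function createStreamStats() {
  const counts = {};
  let lastSequence = 0;
  let gaps = 0;
  return {
    record(raw) {
      const data = JSON.parse(raw.toString());
      counts[data.event] = (counts[data.event] || 0) + 1;
      // The connected event has no sequenceNumber, so skip the check for it.
      if (data.sequenceNumber !== undefined) {
        const sequence = Number(data.sequenceNumber);
        if (sequence !== lastSequence + 1) gaps += 1;
        lastSequence = sequence;
      }
    },
    summary() {
      return { counts: { ...counts }, gaps };
    },
  };
}

// Example: message 3 is missing, so one gap is reported.
const stats = createStreamStats();
[
  { event: 'connected' },
  { event: 'start', sequenceNumber: '1' },
  { event: 'media', sequenceNumber: '2' },
  { event: 'media', sequenceNumber: '4' },
].forEach((msg) => stats.record(JSON.stringify(msg)));
console.log(stats.summary());
```

Calling record from the WebSocket 'message' handler and printing summary on 'close' gives a quick per-call report.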