If you're building a product that sends voice notes across multiple platforms, you already know each one has its own requirements. Audio formats, authentication models, and delivery semantics all differ. Comparing the platforms' voice note APIs up front is essential before you commit to a platform-specific implementation that locks you in.
This post breaks down what it takes to send voice notes on LinkedIn, Telegram, and WhatsApp, side by side, so you can plan your integration with eyes open.
Audio Format Requirements
The first surprise for most developers is that the three platforms do not accept one common audio format. You cannot use the same file across all three.
| Platform | Container | Codec | Sample Rate | Max Duration |
|----------|-----------|-------|-------------|--------------|
| LinkedIn | M4A | AAC-LC | 44.1/48 kHz | 60 seconds |
| Telegram | OGG | OPUS | 48 kHz | No hard limit |
| WhatsApp | OGG | OPUS | 48 kHz | No hard limit |
Telegram and WhatsApp share the same format (OGG/OPUS), which is helpful. LinkedIn is the outlier, requiring M4A with AAC encoding. If your TTS provider outputs MP3 or WAV, you'll need a format conversion step for every platform.
Send the wrong format and the behavior varies. Telegram treats it as a generic audio file instead of a voice note. WhatsApp may reject it outright. LinkedIn silently drops it.
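The conversion step can be sketched as a small helper that builds one ffmpeg command per platform. This is a minimal sketch, assuming ffmpeg is installed with libopus support. The function names, the 32 kbit/s Opus bitrate, the mono output, and the choice of 44.1 kHz for LinkedIn are illustrative assumptions rather than platform requirements; the container, codec, and 60-second cap come from the table above.

```python
import subprocess

# Target encodings follow the format table. Bitrate and channel count are
# assumptions chosen for speech, not documented platform requirements.
FFMPEG_ARGS = {
    # ffmpeg's native "aac" encoder produces AAC-LC by default;
    # -t 60 trims the output to LinkedIn's 60-second maximum.
    "linkedin": ["-c:a", "aac", "-ar", "44100", "-ac", "1", "-t", "60"],
    "telegram": ["-c:a", "libopus", "-b:a", "32k", "-ar", "48000", "-ac", "1"],
    "whatsapp": ["-c:a", "libopus", "-b:a", "32k", "-ar", "48000", "-ac", "1"],
}
EXTENSIONS = {"linkedin": "m4a", "telegram": "ogg", "whatsapp": "ogg"}


def build_ffmpeg_command(platform, src_path, out_stem):
    """Return the ffmpeg argument list that converts src_path for a platform."""
    if platform not in FFMPEG_ARGS:
        raise ValueError(f"unsupported platform: {platform}")
    out_path = f"{out_stem}.{EXTENSIONS[platform]}"
    # -y overwrites an existing output file; -vn drops any video or cover-art stream.
    return ["ffmpeg", "-y", "-i", src_path, "-vn", *FFMPEG_ARGS[platform], out_path]


def convert(platform, src_path, out_stem):
    """Run the conversion and return the output path; raises if ffmpeg fails."""
    cmd = build_ffmpeg_command(platform, src_path, out_stem)
    subprocess.run(cmd, check=True)
    return cmd[-1]
```

Because Telegram and WhatsApp share the same arguments, a single OGG/OPUS file can be produced once and reused for both; only LinkedIn needs its own output.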
Authentication Models
Each platform handles authentication differently, and this is where the complexity really lives.
LinkedIn
LinkedIn has no public API for voice messages. There is no OAuth scope, no developer console toggle, and no documented endpoint. Sending a voice note requires browser session credentials (the li_at cookie), which are tied to the user's IP address. This means server-side delivery from a data center IP will fail. You need either a browser extension or a local agent running on the user's machine.
Telegram
Telegram is the most developer-friendly. Create a bot via @BotFather, get a token, and you have full API access. The sendVoice method is documented, stable, and has been available for years. The catch: bots can only message users who have initiated a conversation first, so you cannot cold-outreach arbitrary users.
WhatsApp
WhatsApp requires either the official Business API (which involves a Meta Business verification process and approval timeline) or an unofficial approach. The Business API is well-documented but has strict template requirements for outbound messages. Session-based approaches require QR code authentication and active session management.
Sending a Voice Note: Platform by Platform
LinkedIn
There is no simple code example for LinkedIn voice notes because there is no public API. The flow involves uploading audio to LinkedIn's media infrastructure, then referencing the uploaded asset in a message send call. Both steps require valid session credentials and correct headers. Building this from scratch typically takes teams 2-4 weeks.
Telegram
Telegram is straightforward. Here's a Node.js example using the raw API:
// Requires Node.js 18+ for the global fetch, FormData, and Blob.
const fs = require("fs");

async function sendTelegramVoice(botToken, chatId, filePath) {
  const form = new FormData();
  form.append("chat_id", String(chatId));
  // The file should be OGG/OPUS so Telegram renders it as a voice note.
  form.append("voice", new Blob([fs.readFileSync(filePath)]), "voice.ogg");

  const res = await fetch(
    `https://api.telegram.org/bot${botToken}/sendVoice`,
    { method: "POST", body: form }
  );
  return res.json();
}
And in Python:
import requests


def send_telegram_voice(bot_token, chat_id, file_path):
    url = f"https://api.telegram.org/bot{bot_token}/sendVoice"
    # Open in binary mode; requests sends the file as multipart/form-data.
    with open(file_path, "rb") as f:
        resp = requests.post(
            url,
            data={"chat_id": chat_id},
            files={"voice": ("voice.ogg", f, "audio/ogg")},
            timeout=30,
        )
    return resp.json()
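Since Telegram may accept a file but deliver it as generic audio rather than a voice note, it helps to inspect the decoded response instead of assuming success. The following is a minimal sketch; the function name and the returned tuples are hypothetical, while the fields it reads (ok, description, parameters.retry_after, and the voice, audio, and document keys on the returned message) are part of the Bot API response format.

```python
def classify_telegram_response(payload):
    """Summarize a decoded sendVoice response as a (status, detail) tuple."""
    if not payload.get("ok"):
        params = payload.get("parameters") or {}
        if "retry_after" in params:
            # Flood control: Telegram reports how many seconds to wait.
            return ("retry", params["retry_after"])
        return ("error", payload.get("description", "unknown error"))

    message = payload.get("result", {})
    # A real voice note comes back under "voice"; a file Telegram did not
    # recognize as one comes back under "audio" or "document" instead.
    for kind in ("voice", "audio", "document"):
        if kind in message:
            return (kind, message[kind].get("file_id"))
    return ("unknown", None)


print(classify_telegram_response({"ok": True, "result": {"voice": {"file_id": "abc"}}}))
```

A status of "audio" or "document" is a signal that the format conversion step needs attention, even though the API call itself succeeded.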
WhatsApp
The official Business API uses a media upload flow followed by a message send. Here's the basic structure with curl:
# Step 1: Upload the audio file
curl -X POST "https://graph.facebook.com/v18.0/{phone_number_id}/media" \
  -H "Authorization: Bearer $WHATSAPP_TOKEN" \
  -F "file=@voice.ogg;type=audio/ogg" \
  -F "messaging_product=whatsapp"

# Step 2: Send the voice message using the media ID
curl -X POST "https://graph.facebook.com/v18.0/{phone_number_id}/messages" \
  -H "Authorization: Bearer $WHATSAPP_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "messaging_product": "whatsapp",
    "to": "15551234567",
    "type": "audio",
    "audio": { "id": "MEDIA_ID_FROM_STEP_1" }
  }'
Two API calls, media hosting, and token management. Not terrible, but not a single call either.
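When the two calls are driven from code rather than curl, the glue between them is small: read the id field from the upload response and place it in the message body. A minimal sketch, with hypothetical helper names; the message payload mirrors the curl example above, and the error shape assumed here is the standard Graph API error envelope.

```python
import json


def extract_media_id(upload_response):
    """Pull the media ID out of the decoded step 1 (upload) response."""
    if "error" in upload_response:
        # Graph API errors arrive as {"error": {"message": ..., "code": ...}}.
        raise RuntimeError(upload_response["error"].get("message", "upload failed"))
    return upload_response["id"]


def build_audio_message(to, media_id):
    """Build the JSON body for step 2, matching the curl example."""
    return {
        "messaging_product": "whatsapp",
        "to": to,
        "type": "audio",
        "audio": {"id": media_id},
    }


# Example with a placeholder upload response.
body = build_audio_message("15551234567", extract_media_id({"id": "1234567890"}))
print(json.dumps(body, indent=2))
```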
Rate Limits and Throttling
| Platform | Rate Limit | Notes |
|----------|-----------|-------|
| LinkedIn | ~30-60s between messages | Aggressive detection, account restrictions possible |
| Telegram | 30 messages/second (bot) | Very generous, rarely an issue |
| WhatsApp | 80 messages/second (Business API) | Tiered by quality rating |
LinkedIn is by far the most restrictive. Telegram is the simplest to stay within, since its bot limit rarely comes into play. WhatsApp has the highest raw ceiling, but keeping that throughput depends on maintaining a good quality rating.
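One way to respect these limits in a shared sender is to track the time of the last send per platform and wait out the remainder. This is a minimal sketch under stated assumptions: the class name is hypothetical, the intervals are derived from the table above, and LinkedIn uses the conservative 60-second end of its range.

```python
import time

# Minimum spacing between sends, derived from the rate limit table.
MIN_INTERVAL_SECONDS = {
    "linkedin": 60.0,    # conservative end of the ~30-60 s range
    "telegram": 1 / 30,  # 30 messages per second
    "whatsapp": 1 / 80,  # 80 messages per second
}


class Pacer:
    """Tracks the last send per platform and reports how long to wait."""

    def __init__(self, clock=time.monotonic):
        # The clock is injectable so the pacing logic can be tested
        # without sleeping.
        self._clock = clock
        self._last_sent = {}

    def wait_time(self, platform):
        """Seconds to wait before the next send is allowed (0.0 if none)."""
        last = self._last_sent.get(platform)
        if last is None:
            return 0.0
        elapsed = self._clock() - last
        return max(0.0, MIN_INTERVAL_SECONDS[platform] - elapsed)

    def mark_sent(self, platform):
        """Record that a message was just sent on this platform."""
        self._last_sent[platform] = self._clock()
```

In use, a sender would call time.sleep(pacer.wait_time(platform)) before each delivery and pacer.mark_sent(platform) after it. This only spaces out requests; it does not replace handling the throttling errors each platform returns.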
The Multi-Platform Problem
Here's what a typical multi-platform voice note system requires:
- TTS generation to create audio from text
- Format conversion to produce M4A for LinkedIn, OGG/OPUS for Telegram and WhatsApp
- Three separate auth flows with different credential types
- Three delivery pipelines with different upload and send mechanics
- Three error handling paths with platform-specific failure modes
- Ongoing maintenance as each platform updates its API or detection systems
Most teams that build this in-house spend 4-8 weeks on the initial implementation and then budget ongoing engineering time for maintenance. LinkedIn alone accounts for half that effort because of its undocumented nature.
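To make the shape of that in-house system concrete, the three delivery pipelines and error paths can sit behind one dispatch layer that normalizes failures into a common result. This is a minimal sketch of that design, not a reference implementation: every name here (DeliveryResult, VoiceRouter, the sender signature) is hypothetical, and the platform-specific senders are left as placeholders.

```python
from typing import Callable, Dict, NamedTuple


class DeliveryResult(NamedTuple):
    """Common result shape shared by all platform senders."""
    platform: str
    ok: bool
    detail: str = ""


# A sender takes (recipient, audio_bytes) and returns a DeliveryResult.
Sender = Callable[[str, bytes], DeliveryResult]


class VoiceRouter:
    """Routes a voice note to the sender registered for a platform."""

    def __init__(self) -> None:
        self._senders: Dict[str, Sender] = {}

    def register(self, platform: str, sender: Sender) -> None:
        self._senders[platform] = sender

    def send(self, platform: str, recipient: str, audio: bytes) -> DeliveryResult:
        sender = self._senders.get(platform)
        if sender is None:
            return DeliveryResult(platform, False, "no sender registered")
        try:
            return sender(recipient, audio)
        except Exception as exc:
            # Collapse platform-specific exceptions into one failure shape.
            return DeliveryResult(platform, False, f"{type(exc).__name__}: {exc}")


router = VoiceRouter()
# Placeholder sender; a real one would upload and send via the platform API.
router.register("telegram", lambda recipient, audio: DeliveryResult("telegram", True, "sent"))
print(router.send("telegram", "123456789", b"...ogg bytes..."))
```

The router itself is the easy part. The effort described above goes into the senders registered with it, each of which carries its own credentials, upload mechanics, and failure modes.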
One API for All Three Platforms
Svara collapses all of this into a single endpoint. Here's what sending a voice note looks like across all three platforms:
# LinkedIn
curl -X POST https://api.svarapi.io/v1/send \
  -H "x-api-key: your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "platform": "linkedin",
    "recipient": "john-doe-12345",
    "text": "Hey John, quick thought on your recent post about developer tooling."
  }'

# Telegram
curl -X POST https://api.svarapi.io/v1/send \
  -H "x-api-key: your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "platform": "telegram",
    "recipient": "123456789",
    "text": "Hey, wanted to follow up on our conversation from yesterday."
  }'

# WhatsApp
curl -X POST https://api.svarapi.io/v1/send \
  -H "x-api-key: your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "platform": "whatsapp",
    "recipient": "15551234567",
    "text": "Hi Sarah, just a quick voice update on the proposal."
  }'
Same endpoint, same auth, same request shape. Svara handles the TTS generation, format conversion, and platform-specific delivery under the hood. You write one integration and reach users on every platform.
In Node.js, a helper function covers all three:
async function sendVoiceNote(platform, recipient, text) {
  const res = await fetch("https://api.svarapi.io/v1/send", {
    method: "POST",
    headers: {
      "x-api-key": process.env.SVARA_API_KEY,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ platform, recipient, text }),
  });
  return res.json();
}

// Wrapped in an async function so the calls also run in CommonJS scripts,
// where top-level await is unavailable.
async function main() {
  await sendVoiceNote("linkedin", "john-doe-12345", "Quick update for you.");
  await sendVoiceNote("telegram", "123456789", "Following up on our chat.");
  await sendVoiceNote("whatsapp", "15551234567", "Voice note for Sarah.");
}

main().catch(console.error);
And the same in Python:
import os

import requests


def send_voice_note(platform, recipient, text):
    return requests.post(
        "https://api.svarapi.io/v1/send",
        headers={
            # Read the key from the environment instead of hardcoding it.
            "x-api-key": os.environ["SVARA_API_KEY"],
            "Content-Type": "application/json",
        },
        json={
            "platform": platform,
            "recipient": recipient,
            "text": text,
        },
        timeout=30,
    ).json()


send_voice_note("linkedin", "john-doe-12345", "Quick update for you.")
send_voice_note("telegram", "123456789", "Following up on our chat.")
send_voice_note("whatsapp", "15551234567", "Voice note for Sarah.")
Get Started
Svara's free tier includes 50 voice notes across all platforms. No credit card required. Generate your API key at svarapi.io/dashboard and send your first cross-platform voice note in under a minute.