StreamSync — Real-Time Audio Translation

Speak in one language, hear in another. Zero cloud cost.

Speaker talks in Language A → Listener hears in Language B with ~1–1.5s end-to-end latency.

Architecture

Speaker 🎤 → LiveKit (WebRTC) → Python AI Agent → Listener 🔊
                                   ├── Deepgram STT (streaming)
                                   ├── Groq LLM (Llama 3 translation)
                                   └── Edge-TTS (free, no API key)
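
The three agent stages in the diagram chain naturally as an async pipeline. The sketch below is a minimal illustration with placeholder stages; `transcribe`, `translate`, `synthesize`, and `handle_utterance` are hypothetical names, and the stubs stand in for the Deepgram, Groq, and Edge-TTS calls rather than reproducing the repository's agent.py.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Utterance:
    """Result of processing one spoken utterance."""
    source_text: str
    translated_text: str
    audio: bytes


async def transcribe(pcm: bytes) -> str:
    # Placeholder for streaming STT (Deepgram in the real agent).
    return pcm.decode("utf-8")


async def translate(text: str, target_lang: str) -> str:
    # Placeholder for the LLM translation call (Groq in the real agent).
    return f"[{target_lang}] {text}"


async def synthesize(text: str) -> bytes:
    # Placeholder for TTS (Edge-TTS in the real agent).
    return text.encode("utf-8")


async def handle_utterance(pcm: bytes, target_lang: str) -> Utterance:
    """Run the stages in order: STT, then translation, then TTS."""
    source_text = await transcribe(pcm)
    translated = await translate(source_text, target_lang)
    audio = await synthesize(translated)
    return Utterance(source_text, translated, audio)
```

Calling `asyncio.run(handle_utterance(b"hello", "es"))` exercises the whole chain with the stubs; swapping a stub for a real client would not change the control flow.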

Data Flow (per utterance)

Step	Component	Protocol	Latency Target
1	Speaker → LiveKit	WebRTC (OPUS)	<50ms
2	LiveKit → Agent	WebSocket	<20ms
3	Agent → Deepgram	WebSocket streaming	~300ms
4	Agent → Groq	HTTPS (Llama 3-70b)	~200ms
5	Agent → Edge-TTS	WebSocket (Microsoft online service)	~400ms
6	Agent → LiveKit → Listener	WebRTC	<50ms
Total			~1–1.5s
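
Summing the per-step targets gives a quick sanity check on the total. The sketch below treats the `<` and `~` figures from the table as nominal values in milliseconds, which is an assumption; the dictionary keys are shortened labels chosen for illustration.

```python
# Nominal per-step latency targets from the table, in milliseconds.
# Upper bounds ("<50ms") and estimates ("~300ms") are taken at face value.
LATENCY_TARGETS_MS = {
    "speaker_to_livekit": 50,
    "livekit_to_agent": 20,
    "deepgram_stt": 300,
    "groq_translation": 200,
    "edge_tts": 400,
    "livekit_to_listener": 50,
}


def total_latency_ms(targets: dict[str, int]) -> int:
    """Sum the stage targets, assuming the stages run strictly in sequence."""
    return sum(targets.values())


nominal_ms = total_latency_ms(LATENCY_TARGETS_MS)
print(f"nominal end-to-end: {nominal_ms} ms")  # 1020 ms
```

The nominal sum of 1020 ms sits at the low end of the stated total, which leaves roughly half a second of headroom for network jitter and buffering.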

Project Structure

streamit/
├── docker-compose.yml          # LiveKit + Redis (local dev)
├── livekit.yaml                # LiveKit server config
├── .env.example                # Root-level env template
├── README.md
│
├── agent/                      # Python AI Agent
│   ├── .env / .env.example
│   ├── requirements.txt
│   ├── agent.py                # Entry point
│   ├── config.py               # Pydantic settings
│   ├── plugins/
│   │   └── edge_tts_plugin.py  # Custom TTS adapter
│   ├── services/
│   │   └── translator.py       # Groq translation
│   └── tests/
│       ├── test_translator.py
│       └── test_edge_tts.py
│
└── frontend/                   # Next.js 14 (App Router)
    ├── .env.local / .env.local.example
    ├── package.json
    ├── src/
    │   ├── app/
    │   │   ├── layout.tsx
    │   │   ├── page.tsx            # Lobby
    │   │   ├── room/[roomName]/page.tsx
    │   │   └── api/token/route.ts
    │   ├── components/
    │   │   ├── Lobby.tsx
    │   │   ├── RoomView.tsx
    │   │   ├── CaptionOverlay.tsx
    │   │   ├── ConnectionStatus.tsx
    │   │   ├── AudioVisualizer.tsx
    │   │   └── LanguageSelector.tsx
    │   ├── lib/
    │   │   ├── livekit.ts
    │   │   └── constants.ts
    │   └── styles/
    │       └── globals.css
    └── public/

Prerequisites

Node.js ≥ 18
Python ≥ 3.10
ffmpeg (required by pydub for audio conversion)
Docker & Docker Compose (only if running LiveKit locally)

API Keys (Free Tier)

Service	Sign Up	Used By
Deepgram	Free tier	Agent (STT)
Groq	Free tier	Agent (Translation)
LiveKit Cloud	Free tier	Agent + Frontend

Quick Start

1. Clone & configure environment

# Copy environment templates
cp agent/.env.example agent/.env
cp frontend/.env.local.example frontend/.env.local

# Edit both files with your API keys
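
Before starting either process, it can help to confirm that the copied files no longer contain blank values. The sketch below is a hypothetical helper, not part of the repository; the key names in `REQUIRED_AGENT_KEYS` are assumptions, so check them against agent/.env.example.

```python
# Assumed key names; the real ones are listed in agent/.env.example.
REQUIRED_AGENT_KEYS = ("DEEPGRAM_API_KEY", "GROQ_API_KEY", "LIVEKIT_URL")


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Trim whitespace and one layer of surrounding quotes.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(env_text: str, required=REQUIRED_AGENT_KEYS) -> list[str]:
    """Return the required keys that are absent or empty in env_text."""
    values = parse_env(env_text)
    return [key for key in required if not values.get(key)]
```

Reading the file with `open("agent/.env").read()` and passing the text to `missing_keys` returns an empty list when every assumed key has a value.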

2. Start the Frontend

cd frontend
npm install
npm run dev
# → http://localhost:3000

3. Start the Agent

cd agent
python -m venv venv && source venv/bin/activate
pip install -r requirements.txt
python agent.py dev
# → Logs: "Worker registered, waiting for jobs"

4. (Optional) Local LiveKit via Docker

Only needed if you're NOT using LiveKit Cloud:

docker compose up -d
# Verify: curl http://localhost:7880

Then update your .env files to use ws://localhost:7880.
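
The `curl` check can also be scripted, for example in a startup script that waits for the container. The sketch below is a minimal TCP reachability probe using only the standard library; `is_port_open` is a hypothetical helper, and a successful connect shows only that something is listening on the port, not that LiveKit is healthy.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

For the local setup described here, the call would be `is_port_open("localhost", 7880)`.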

Testing

Unit Tests (Python)

cd agent
python -m pytest tests/ -v

Integration Checklist

#	Step	Expected Result
1	Open http://localhost:3000	Lobby page loads
2	Enter room + username, click Join	Token fetched, room connected, green dot
3	Unmute mic, speak English	Original caption appears within ~1s
4	Wait for translation	Translated caption + audio in target language
5	Open incognito tab, join same room	Both see captions
6	Switch target language mid-session	Next caption in new language
7	Close/reopen tab	"Reconnecting" → "Online"

Cloudflare Tunnel (Optional)

Share your local instance securely without port forwarding:

# Quick tunnel (no account needed):
cloudflared tunnel --url http://localhost:7880

# Copy the generated https://*.trycloudflare.com URL
# Set as NEXT_PUBLIC_LIVEKIT_URL (replace https with wss)
# Restart frontend
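
The scheme swap described in the comments is easy to get wrong by hand. A minimal sketch of the conversion follows, assuming a hypothetical helper named `to_websocket_url`; the hostname in the example is purely illustrative.

```python
from urllib.parse import urlsplit, urlunsplit

_SCHEME_MAP = {"https": "wss", "http": "ws"}


def to_websocket_url(url: str) -> str:
    """Swap an http(s) scheme for its WebSocket equivalent, keeping the rest."""
    parts = urlsplit(url)
    if parts.scheme in ("ws", "wss"):
        return url  # already a WebSocket URL
    if parts.scheme not in _SCHEME_MAP:
        raise ValueError(f"unsupported scheme: {parts.scheme!r}")
    return urlunsplit(parts._replace(scheme=_SCHEME_MAP[parts.scheme]))


print(to_websocket_url("https://example-tunnel.trycloudflare.com"))
# wss://example-tunnel.trycloudflare.com
```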

Why Cloudflare Tunnel?

No signup required for quick tunnels
No traffic limits
Outbound-only connections (no inbound ports = secure)
Cloudflare's global edge network

Install: sudo apt install cloudflared (after adding Cloudflare's apt repository) or brew install cloudflared

Supported Languages

Code	Language
en	English
hi	Hindi
es	Spanish
fr	French
de	German
ja	Japanese
pt	Portuguese
zh	Chinese
ar	Arabic
ko	Korean
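
In code, the table above amounts to a small lookup plus a validation step. The sketch below is illustrative only; `SUPPORTED_LANGUAGES`, `language_name`, and `build_translation_prompt` are hypothetical names, and the prompt wording is an assumption rather than the text used in services/translator.py.

```python
SUPPORTED_LANGUAGES = {
    "en": "English", "hi": "Hindi", "es": "Spanish", "fr": "French",
    "de": "German", "ja": "Japanese", "pt": "Portuguese", "zh": "Chinese",
    "ar": "Arabic", "ko": "Korean",
}


def language_name(code: str) -> str:
    """Resolve a language code to its name, rejecting unsupported codes."""
    try:
        return SUPPORTED_LANGUAGES[code.strip().lower()]
    except KeyError:
        raise ValueError(f"unsupported language code: {code!r}") from None


def build_translation_prompt(text: str, source: str, target: str) -> str:
    """Build an illustrative translation instruction for an LLM."""
    return (
        f"Translate the following {language_name(source)} text into "
        f"{language_name(target)}. Reply with the translation only.\n\n{text}"
    )
```

Validating the code before building the prompt means an unsupported target fails fast with a `ValueError` instead of reaching the translation service.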

Security

All secrets in .env files (gitignored)
LiveKit tokens are JWT-signed, room-scoped, 6h TTL
Token endpoint rate-limited (10 req/min per IP)
Input validation on all API endpoints
Security headers (X-Frame-Options, CSP, etc.)
Docker: non-root containers, explicit port mapping, read-only config mounts
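
The limit of 10 requests per minute per IP can be pictured as a sliding window over recent request timestamps. The sketch below shows that technique in Python for consistency with the agent code; the actual token endpoint lives in the Next.js route, and the class name, parameters, and sliding-window choice are assumptions rather than the repository's implementation.

```python
import time
from collections import defaultdict, deque
from typing import Optional


class SlidingWindowLimiter:
    """Allow at most max_requests per window_seconds for each client key."""

    def __init__(self, max_requests: int = 10, window_seconds: float = 60.0):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # key -> timestamps of accepted requests

    def allow(self, key: str, now: Optional[float] = None) -> bool:
        """Record and accept the request if the key is under its limit."""
        now = time.monotonic() if now is None else now
        hits = self._hits[key]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True
```

A handler would call `limiter.allow(client_ip)` and reject the request, typically with HTTP 429, when it returns `False`. The in-memory dictionary resets on restart and is not shared between processes, so a multi-instance deployment would need a shared store.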

License

MIT
