Product

Turn anything into audio — then refine it.

EchoLive turns documents, feeds, and videos into high-quality audio — with a production studio (segment editing, SSML, exports) and a listening inbox (feed reader, organization, AI search) so you can listen, refine, and publish.

Who it’s for

Use cases mapped to real features

EchoLive supports both everyday listening and professional production workflows.

Listeners

Auditory learners & “listen instead of read”

Turn articles, docs, and feeds into a personal listening library that stays organized.

  • Quick Read: paste text → generate audio
  • Feeds: articles, newsletters, podcasts, and YouTube in one inbox
  • Global AI Search across your saved feeds + projects
Consumers

News, newsletters, and long articles

Stay on top of content with a feed reader built for listening and generation.

  • Feed organization (folders), filters, sorting, read/unread + starred
  • Reader mode + playback position persistence
  • Optional summaries and bulk generation workflows
Students

Students & researchers

Import documents, keep them structured, and search what you’ve saved later.

  • Import PDF/DOCX/MD/HTML/URLs with auto-segmentation
  • Word Sync read-along when timing is available
  • Semantic search across projects and feed items
Creators

Creators & podcasters

Produce long-form audio with studio controls, iteration, and exports for publishing workflows.

  • Studio segment editor: per-segment voices/styles/pacing + SSML tools
  • Segmented waveform navigation and generation controls
  • Exports: MP3/WAV + production bundles (ZIP/AAF-style)
Educators

Educators & course teams

Turn scripts and notes into consistent audio, with fine control over pacing and emphasis.

  • Segment-based structure for lessons and chapters
  • Voice presets to keep delivery consistent across sessions
  • Smart Import / comprehension suggestions for clearer pacing
Audiobooks

Audiobook-style narration

Handle long scripts with professional controls for pronunciation, pauses, and structure.

  • Per-segment SSML and prosody controls
  • Long-form generation with progress tracking
  • Exports that fit downstream editing and publishing

At a glance

What’s included today

EchoLive is built for creators who want both production controls and an always-growing content inbox.

Studio editor

Multi-segment timeline with per-segment voices/styles, visual SSML tools, and a segmented waveform for fast navigation.

Feeds inbox

A full feed reader for articles, podcasts, newsletters, and YouTube with organization, refresh, summaries, and audio generation.

Smart Import

Import txt/md/docx/pdf/HTML/URLs and auto-segment — with optional AI-assisted comprehension to improve structure and pacing.

Global AI Search

Ingest → index → vector search across your library (feeds + projects) so saved content stays discoverable and reusable.

Exports

Export MP3/WAV/ZIP bundles and AAF-style bundles for production workflows and downstream editing.

Credits billing

Pay-as-you-go credits with estimation + refunds on failures/cancels. Subscription billing is planned.

Available today

Functional features

These are the current capabilities available in EchoLive.

  • Secure sign-in and private, account-scoped content.
  • Leverages Azure Speech’s large voice catalog (availability varies by locale).
  • Voice previews, search, filters, favorites, and presets to choose voices quickly.
Core surfaces (the product “modes”)
  • Quick Read: paste text → generate audio quickly (single job).
  • Studio: multi-segment timeline editor for production workflows.
  • Feeds: RSS/Atom + crawl-based “inbox” for articles/podcasts/newsletters (with optional YouTube channel support).
  • Library: browse/organize voices, favorites, collections, and presets (synced).
  • Projects: manage saved work and play/export long-form outputs.
Global playback (cross-app)
  • A single persistent player used across the app for:
    • Quick Read audio,
    • Studio segment playback,
    • merged project playback,
    • feed-item audio playback,
    • YouTube playback mode (separate from TTS audio).
Quick Read (single-pass generation)
  • Voice selection (simple + advanced voice modes) with styles, rate, pitch, and volume controls.
  • Optional SSML input mode.
  • Real-time generation progress with job resumption after refresh.
  • Word-level highlighting view when timing data is available.
  • Global persistent playback (doesn’t stop when navigating to other pages).
Studio editor (multi-segment projects)
  • Segment-based timeline (text segments + pause segments).
  • Per-segment voice settings: voice, style, rate, pitch, volume, SSML vs plain text.
  • Batch operations: reorder, select many, bulk delete, collapse/expand, “apply settings to all”.
  • “Unified Text View”: edit all segment text in one continuous editor with non-editable segment markers.
  • Focus mode for distraction-free editing.
  • Segment-level generation: generate, regenerate, cancel, play-while-generating.
  • Project-level Word Sync toggle (controls whether word-level timing is generated/used).
  • Segment waveform timeline (“segmented waveform”) for fast navigation/seek to a specific segment/time.
  • Per-segment visual SSML editor (build SSML with UI for breaks, emphasis, prosody, say-as, phonemes, substitutions) + manual SSML editing.
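As a rough illustration of what the visual SSML editor produces, the sketch below wraps one segment's text in a minimal SSML document with a prosody setting and an optional trailing break. The helper name `build_ssml` and its parameters are hypothetical, and `en-US-JennyNeural` is only an example voice name; this is not EchoLive's internal code.

```python
from xml.sax.saxutils import escape, quoteattr


def build_ssml(text, voice, rate="medium", pitch="medium", break_ms=0, lang="en-US"):
    """Wrap plain text in a minimal SSML document (hypothetical helper)."""
    # Escape the text so characters like & and < stay valid XML.
    body = (
        f"<prosody rate={quoteattr(rate)} pitch={quoteattr(pitch)}>"
        f"{escape(text)}</prosody>"
    )
    if break_ms > 0:
        # Optional pause after the spoken text.
        body += f'<break time="{int(break_ms)}ms"/>'
    return (
        '<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" '
        f"xml:lang={quoteattr(lang)}>"
        f"<voice name={quoteattr(voice)}>{body}</voice></speak>"
    )


print(build_ssml("Tom & Jerry", "en-US-JennyNeural", rate="+10%", break_ms=500))
```

Escaping the text first matters because raw segment text often contains characters that would otherwise break the XML.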
Import (documents → segments)
  • Import formats:
    • Timeline JSON (.json) project imports/exports.
    • Text/Markdown (.txt, .md) with formatting preservation for emphasis.
    • PDF (.pdf) via client-side extraction (PDF.js).
    • Word (.docx) with formatting/headings (mammoth.js → Markdown).
    • HTML/Google Docs export (.html, .htm, .gdoc).
    • URL import (“reader mode” extraction via backend).
  • Segmentation strategies:
    • Automatic detection (best strategy per document).
    • Markers (e.g. ---SEGMENT---, <!-- SEGMENT -->, ===).
    • Headings (H1/H2 hierarchy splitting).
    • Paragraph splitting and length-aware fallback splitting.
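The segmentation strategies above can be sketched roughly as follows: split on explicit markers when any are present, otherwise on blank-line paragraphs, then break oversized parts at sentence boundaries. The function names, the `max_chars` default, and the exact marker pattern are assumptions made for illustration; heading-based splitting is left out for brevity.

```python
import re

# Marker lines as listed above: ---SEGMENT---, <!-- SEGMENT -->, or a row of "=".
MARKER = re.compile(
    r"^[ \t]*(?:---SEGMENT---|<!--\s*SEGMENT\s*-->|={3,})[ \t]*$",
    re.MULTILINE,
)


def _split_long(part, max_chars):
    """Length-aware fallback: pack sentences into chunks of at most max_chars."""
    if len(part) <= max_chars:
        return [part]
    chunks, current = [], ""
    for sentence in re.split(r"(?<=[.!?])\s+", part):
        if current and len(current) + 1 + len(sentence) > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current = f"{current} {sentence}".strip()
    if current:
        chunks.append(current)
    return chunks


def split_segments(text, max_chars=1000):
    """Prefer explicit markers; otherwise split on blank-line paragraphs."""
    if MARKER.search(text):
        parts = MARKER.split(text)
    else:
        parts = re.split(r"\n\s*\n", text)
    segments = []
    for part in parts:
        part = part.strip()
        if part:
            segments.extend(_split_long(part, max_chars))
    return segments


doc = "Intro.\n---SEGMENT---\nBody one.\n<!-- SEGMENT -->\nBody two."
print(split_segments(doc))
```

In this sketch a single sentence longer than `max_chars` is kept whole rather than cut mid-sentence.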
AI-assisted import & comprehension
  • Adaptive Comprehension Mode / Smart Import: AI analyzes text complexity and suggests:
    • better segment boundaries,
    • pacing adjustments,
    • optional pauses,
    • style suggestions,
    • “emphasis” candidates.
  • Asynchronous job model with progress and result retrieval.
Voice library & personalization
  • Azure voice discovery (/api/voices) and caching.
  • Voice preview library and preview sync/status tooling.
  • Voice Library UI:
    • search + filters (locale, gender, styles),
    • favorites,
    • collections,
    • presets (voice + style + prosody),
    • default voices per context (quick_read, feeds, studio) synced across devices.
Projects (management + playback)
  • Projects list with search and sorting.
  • Rename, delete, multi-select operations.
  • “Prepare audio” for project playback (merge segments into a single playback file when needed).
  • Auto-save / draft tracking behaviors in the Studio lifecycle.
Export (Studio → files)
  • Export types:
    • Audio (MP3/WAV).
    • Timeline JSON (project data).
    • Segment bundles (ZIP of per-segment assets, when applicable).
    • AAF bundle (ZIP containing an AAF-like XML + audio assets for import into editors).
  • Export job history with progress tracking, downloads, and cleanup.
Feeds (content inbox)

EchoLive Feeds is a full feed reader + ingestion system, not just an “RSS importer”.

  • Subscribe/import:
    • RSS/Atom feed discovery from a website URL (also handles direct feed URLs).
    • Crawl-based “feeds” for non‑RSS sites (discover URLs and import selected pages).
    • Sitemap parsing for bulk URL discovery.
    • Curated recommendations and categories.
  • Feed categorization:
    • Feed-level category (article | podcast | newsletter) with user override.
    • Item-level content type (article | podcast_episode | newsletter).
    • Podcast items play their enclosure audio (no TTS); newsletter items always use TTS.
  • Feed settings:
    • folder organization,
    • fetch interval,
    • auto-generate audio (default voice),
    • cleanup patterns (preview/apply to remove repeated boilerplate).
  • Items:
    • pagination, sorting, date filters,
    • search (title/excerpt/author),
    • read/unread, starred, bulk delete,
    • edit publication dates (single + bulk mappings).
  • Reader experience:
    • full-screen Article Viewer (reading view) with font size controls,
    • refresh/re-fetch item content,
    • in-view audio controls + generation controls,
    • word-sync overlay for “read along” playback when timing is available.
  • Audio for items:
    • per-item generate/regenerate/cancel with progress,
    • bulk generation with rate-limited queueing,
    • sticky player with seek + playback speed,
    • playback position persistence to backend.
  • Scheduled refresh.
YouTube support in Feeds (Shipped; API key optional)
  • Subscribe to YouTube channels via:
    • RSS conversion when available, and/or
    • YouTube Data API browsing (optional YOUTUBE_API_KEY) to go beyond typical RSS limits.
  • Browse/search channel videos, import selected videos into a feed.
  • Transcript + summary pipeline (optional):
    • fetch transcripts when available,
    • AI summarization,
    • optional summary-to-audio generation.
  • YouTube playback mode in the global player (separate from TTS audio playback).
Notifications
  • In-app notification center backed by the database.
  • Notification actions for common flows (e.g., retry from “insufficient credits” events).
Billing (credits)
  • Credit-based prepaid billing model:
    • cost estimation by text length,
    • reserve credits on job start,
    • confirm on success,
    • refund on failure/cancel.
  • Transactions + usage history.
  • Coupons (validate/redeem) and admin tools for managing credits/coupons and cleanup/backfills.
  • Credit expiry:
    • credit grants/purchases are issued with an expires_at timestamp (default 365 days),
    • the balance API can report credits expiring soon and the next expiration date.
  • Subscription billing is planned; current model is pay-as-you-go via credits.
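The reserve, confirm, and refund flow described above can be sketched as a small in-memory ledger. Everything here is illustrative: the `CreditAccount` class, the `estimate_cost` helper, and the rate of 1,000 characters per credit are assumptions, not EchoLive's actual pricing or API.

```python
import math
from dataclasses import dataclass, field

CHARS_PER_CREDIT = 1000  # hypothetical rate, for illustration only


def estimate_cost(text):
    """Estimate credits from text length, with a minimum of one credit."""
    return max(1, math.ceil(len(text) / CHARS_PER_CREDIT))


@dataclass
class CreditAccount:
    balance: int
    reserved: dict = field(default_factory=dict)  # job_id -> credits on hold

    def reserve(self, job_id, amount):
        """Hold credits when a job starts."""
        if amount > self.balance:
            raise ValueError("insufficient credits")
        self.balance -= amount
        self.reserved[job_id] = amount

    def confirm(self, job_id):
        """Job succeeded: the hold becomes a permanent charge."""
        return self.reserved.pop(job_id)

    def refund(self, job_id):
        """Job failed or was cancelled: return the held credits."""
        amount = self.reserved.pop(job_id)
        self.balance += amount
        return amount


account = CreditAccount(balance=10)
cost = estimate_cost("x" * 2500)
account.reserve("job-1", cost)
account.refund("job-1")  # simulate a failed job
print(cost, account.balance)
```

Holding credits up front means a job cannot start unless the balance covers its estimated cost, and a failure never leaves the user charged.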
Global AI Search (Shipped)
  • Ingest → index → vector search pipeline over your own content (feeds + projects).
  • UI:
    • Cmd/Ctrl+K global search modal with keyboard navigation,
    • filter chips (All / Feeds / Projects).
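To make the ingest → index → vector search idea concrete, here is a toy sketch. It uses a bag-of-words count vector as a stand-in for a real embedding model, so it matches on shared words rather than meaning; the `LibraryIndex` class and its methods are hypothetical names chosen for this example.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class LibraryIndex:
    def __init__(self):
        self.items = []  # (source, title, vector)

    def ingest(self, source, title, text):
        """Index one item; source is 'feeds' or 'projects' in this sketch."""
        self.items.append((source, title, embed(f"{title} {text}")))

    def search(self, query, source=None, top_k=3):
        """Return titles ranked by similarity, optionally filtered by source."""
        q = embed(query)
        scored = [
            (cosine(q, vec), title)
            for src, title, vec in self.items
            if source is None or src == source
        ]
        scored = [s for s in scored if s[0] > 0]
        scored.sort(key=lambda s: s[0], reverse=True)
        return [title for _, title in scored[:top_k]]


index = LibraryIndex()
index.ingest("feeds", "Solar power news", "Solar panels cost dropped again")
index.ingest("projects", "Bread recipe", "Flour, yeast, and water")
print(index.search("solar panels"))
```

The `source` argument plays the role of the All / Feeds / Projects filter chips.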

Non-functional

Privacy, reliability, and operations

How EchoLive stays dependable for long jobs and sensitive content.

Performance & scalability
  • V2 parallel chunk-based synthesis with real-time progress streaming.
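A minimal sketch of parallel chunk-based synthesis with progress reporting, assuming the text has already been split into chunks. `synthesize_chunk` is a placeholder that returns fake bytes instead of calling a TTS service, and all names here are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def synthesize_chunk(chunk):
    """Placeholder for a TTS call; returns fake audio bytes."""
    return chunk.encode("utf-8")


def synthesize_parallel(chunks, on_progress=None, max_workers=4):
    """Synthesize chunks concurrently and join the results in original order."""
    results = [b""] * len(chunks)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(synthesize_chunk, c): i for i, c in enumerate(chunks)}
        for done, future in enumerate(as_completed(futures), start=1):
            # Store by original index so completion order does not matter.
            results[futures[future]] = future.result()
            if on_progress:
                on_progress(done / len(chunks))
    return b"".join(results)


audio = synthesize_parallel(["one", "two", "three"], on_progress=print)
```

Chunks may finish in any order, so progress is reported by completed count while the output is assembled by position.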
Reliability & operations
  • Job progress polling + SSE streaming (where available).
  • Defensive client UX: monotonic progress tracking, timeouts, resumable sessions.
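Monotonic progress tracking can be illustrated with a few lines: the client keeps the highest progress value seen so far, so a late or out-of-order update never moves the bar backwards. The class name is hypothetical and the 0.0 to 1.0 range is an assumption.

```python
class MonotonicProgress:
    """Track job progress so the displayed value never decreases."""

    def __init__(self):
        self.value = 0.0

    def update(self, reported):
        # Clamp to [0, 1], then keep the maximum seen so far.
        clamped = min(max(reported, 0.0), 1.0)
        self.value = max(self.value, clamped)
        return self.value


progress = MonotonicProgress()
for reported in (0.2, 0.5, 0.4, 1.3):
    print(progress.update(reported))
```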
Security & privacy
  • Encryption for sensitive user content at rest (project text, comprehension inputs/results).
  • “No content logging” posture for user text/audio inputs; client monitoring redacts sensitive fields.
  • GDPR endpoints for data export and account deletion.

Next

What we’re exploring

Areas we’re actively building toward as we learn from users.

Studio & production workflows
  • Pronunciation dictionary (per account/project), phoneme tooling, and reusable style/preset packs.
  • Audio enhancement: background music mixing, fades/crossfades, loudness normalization/mastering.
  • Collaboration: sharing, team workspaces, comments, version history.
  • Integration/export expansion: richer DAW/video editor interchange, webhooks, public API.
Voice capabilities
  • Voice cloning / custom voices (subject to provider constraints and policy).
  • Multi-speaker scripting improvements and automated “role casting”.
Mobile
  • A mobile-first companion app focused on capture + offline listening (not a web wrapper):
    • camera/OCR → TTS,
    • share-sheet URL/article import,
    • background audio + downloads,
    • CarPlay/Android Auto,
    • lightweight editing + sync.
AI features
  • More advanced summarization and “listenability” transformations (e.g., newsletter → concise audio brief).
  • Semantic navigation and highlights over long audio (“jump to section”, “key points”).

Comparisons

How EchoLive compares

Practical differences versus voice-first platforms, consumer readers, and traditional editors.

Quick comparison

This chart reflects typical product focus and may not capture every feature or tier.

Capability EchoLive ElevenLabs Speechify Descript
Built-in content inbox (feeds, newsletters, YouTube) Limited
Long-form production (segment/timeline editing) Limited ✓ (audio-first)
Large voice catalog ✓ (Azure) Limited
Voice cloning as a primary offering Not focus Varies ✓ (Overdub)
Exports for downstream editors ✓ (MP3/WAV + bundles) Basic Basic
Semantic search over your library Limited
Compared to voice-first platforms (e.g. ElevenLabs)
  • EchoLive is workflow-first: timeline/segment editing, bulk regeneration, exports, and long-form project management.
  • EchoLive is ingestion-first: built-in feeds (RSS/crawl/newsletters/podcasts) and optional YouTube channel workflows.
  • EchoLive builds on an established provider: it uses Azure Neural TTS (breadth of voices, regional availability) rather than proprietary voice models.
  • Voice cloning/custom voice creation is not currently a core shipped feature (it’s a possible roadmap item).
Compared to consumer “read it to me” apps (e.g. Speechify)
  • EchoLive is a studio + inbox (creation + production + export), not only a playback app.
  • EchoLive has production controls (SSML, per-segment voices, AAF exports) that are typically out of scope for consumer readers.
  • EchoLive’s Feeds model supports mixed media (articles + podcasts + YouTube) with unified playback and generation pipelines.
  • EchoLive adds semantic search over your own library (ingest → index → vector search), which turns “saved content” into a searchable knowledge base.
Compared to traditional audio editors (e.g. Descript)
  • EchoLive generates high-quality audio from text with TTS-native controls and can then export to editing workflows (e.g., AAF bundle).
  • EchoLive is not primarily a recorded-audio editor; it’s optimized for script → voiced audio pipelines.