TL Provider: Soft Refresh, IndexedDB Cache & the 404 Rewrite
The Problem
The video editor’s TL provider (a live streaming aggregator) had a brutal UX issue: every time you navigated away from TL to another provider (tango, fc2, sc) and back, the entire stream list was wiped and reloaded from scratch. initialize() cleared all state — videos, streamer map, co-streamer positions, scroll position — and showed a “Loading…” flash while hitting the API again.
For a list of 50-100 live streamers with resolved metadata, this was wasteful and disorienting. The list would rebuild in a different order as API responses and co-streamer resolutions arrived asynchronously.
The Architecture Before
- Frontend: Svelte 5 SPA with a `videoListStore` (reactive state) holding `videos[]`, `streamerMap<alias, TlStreamer>`, `processedStreamIds`, and `liveFilenameMap`
- Backend: Express proxy that resolves HLS master playlists from tango.me, rewrites segment URLs, and serves them locally
- Provider switch: a `$effect` watches the route param and calls `initialize(provider)`, which zeroes everything; then `loadTlStreams` fetches from the API
The key data flow for TL: fetchStreams() returns { following, recommended } arrays of TlStreamer objects, each with streamerId, alias, masterListUrl (the HLS master playlist on tango.me), and metadata. Co-streamers are discovered per-streamer via fetchMultiBroadcast(streamId).
Step 1: In-Memory Snapshot for Instant Restore
Module-level variable in a new tl-cache.ts service. saveTlSnapshot(store) captures videos, streamerMap, processedStreamIds, liveFilenameMap, and listIdentifiers. restoreTlSnapshot(store) puts them back.
The provider-switch $effect now saves before leaving TL and restores on return:
leaving TL → saveTlSnapshot(videoListStore)
entering TL (with snapshot) → initializeSoft('tl') → restoreTlSnapshot → softRefreshTlStreams
entering TL (no snapshot) → initialize('tl') → loadTlStreams (full load)
initializeSoft bumps the epoch (for stale-async detection) and sets the provider, but does NOT wipe data. The list appears instantly.
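The snapshot mechanism can be sketched roughly like this — a minimal, self-contained version assuming hypothetical store shapes (the real store is a Svelte 5 reactive class, and `TlSnapshot` here only names the fields the article lists):

```typescript
// Sketch of tl-cache.ts's in-memory snapshot. A module-level variable survives
// provider switches because the module is evaluated once per page load.
interface TlSnapshot {
  videos: { filename: string; alias: string }[];
  streamerMap: Map<string, unknown>;      // alias -> TlStreamer
  processedStreamIds: Set<string>;
  liveFilenameMap: Map<string, string>;   // alias -> playlist filename
}

let snapshot: TlSnapshot | null = null;

function saveTlSnapshot(store: TlSnapshot): void {
  // Copy the containers so later mutations of the live store don't leak in.
  snapshot = {
    videos: [...store.videos],
    streamerMap: new Map(store.streamerMap),
    processedStreamIds: new Set(store.processedStreamIds),
    liveFilenameMap: new Map(store.liveFilenameMap),
  };
}

function restoreTlSnapshot(store: TlSnapshot): boolean {
  if (!snapshot) return false;  // no snapshot -> caller falls back to a full load
  store.videos = [...snapshot.videos];
  store.streamerMap = new Map(snapshot.streamerMap);
  store.processedStreamIds = new Set(snapshot.processedStreamIds);
  store.liveFilenameMap = new Map(snapshot.liveFilenameMap);
  return true;
}
```

The boolean return is what lets the provider-switch effect decide between `initializeSoft` (snapshot exists) and the full `initialize` path.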
Step 2: IndexedDB for liveUrl Persistence
The most important piece of metadata is the liveUrl — the resolved 720p sub-playlist URL derived from the masterListUrl. Resolving it requires a backend round-trip that fetches the master playlist from tango.me with auth cookies, parses the HLS manifest for the 720p variant, and constructs the full URL.
New IndexedDB store (tl-cache db, streamers object store, keyed by streamerId):
{ streamerId, masterListUrl, liveUrl: string | null, cachedAt: number }
On soft refresh, the classification loop checks IDB: if a streamer has the same masterListUrl and a cached liveUrl, use it directly — no backend call needed.
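The cache-hit rule can be shown as a pure function. `shouldUseCached` is a hypothetical name for the check the classification loop performs; the entry shape mirrors the IDB record above:

```typescript
// The IDB record, as described above.
interface CachedStreamer {
  streamerId: string;
  masterListUrl: string;
  liveUrl: string | null;
  cachedAt: number;
}

function shouldUseCached(
  entry: CachedStreamer | undefined,
  freshMasterListUrl: string,
): boolean {
  // Hit only when the master playlist URL is unchanged AND a liveUrl was
  // actually resolved; a null liveUrl means the last resolution failed.
  return !!entry && entry.masterListUrl === freshMasterListUrl && entry.liveUrl !== null;
}
```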
Step 3: The Eager Walk — Co-Streamers + liveUrl in One Pass
Initially I had two separate fire-and-forget loops: fetchCoStreamersEagerly walked the list checking for co-streamers via fetchMultiBroadcast, and resolveAllLiveUrls walked the same list resolving liveUrls. They ran in parallel, independently.
Merged them into processStreamersEagerly — a single sequential pass that, for each streamer:
- Checks co-streamers (if `streamId` exists, gated by `markStreamIdProcessed` to avoid re-checking)
- Resolves the liveUrl for each co-streamer found → stores it in IDB
- Resolves the liveUrl for the main streamer → stores it in IDB
- Waits 200ms, then moves to the next streamer
One walk, everything gets cached politely.
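A sketch of the merged walk, with the network calls injected so the pacing and ordering are visible on their own. Names and shapes are assumptions (the real version also updates the reactive store, and the `markStreamIdProcessed` gate is omitted here for brevity):

```typescript
interface EagerDeps {
  fetchCoStreamers: (streamId: string) => Promise<{ streamerId: string; masterListUrl: string }[]>;
  resolveLiveUrl: (masterListUrl: string) => Promise<string | null>;
  putCached: (streamerId: string, masterListUrl: string, liveUrl: string) => Promise<void>;
  delayMs?: number;
}

async function processStreamersEagerly(
  streamers: { streamerId: string; streamId?: string; masterListUrl: string }[],
  deps: EagerDeps,
): Promise<void> {
  const delay = deps.delayMs ?? 200;
  for (const s of streamers) {
    // 1. Co-streamers first, each resolved and cached.
    if (s.streamId) {
      for (const co of await deps.fetchCoStreamers(s.streamId)) {
        const url = await deps.resolveLiveUrl(co.masterListUrl);
        if (url) await deps.putCached(co.streamerId, co.masterListUrl, url);
      }
    }
    // 2. Then the main streamer itself. Only successful resolutions are cached.
    const url = await deps.resolveLiveUrl(s.masterListUrl);
    if (url) await deps.putCached(s.streamerId, s.masterListUrl, url);
    // 3. Be polite to the upstream API before the next streamer.
    await new Promise((r) => setTimeout(r, delay));
  }
}
```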
Step 4: The 404 Bug — Don’t Check Local HLS
The scroll-triggered refreshTlStreams (fires when the user scrolls near the bottom) had a 404 HEAD check meant to detect stale proxy sessions. For each existing streamer, it did HEAD /hls/${alias}/playlist.m3u8.
The problem: that local HLS playlist only exists after the user clicks to play a streamer. Every unplayed streamer — which is most of them — returned 404. The code treated 404 as “dead stream”, removed it from its position, and re-added it at the bottom. Following streamers at the top of the list would jump to the middle/bottom on every refresh.
Replaced with a three-way classification based on masterListUrl comparison:
| Fresh streamer vs. existing | Action |
|---|---|
| New alias | Append at bottom + eager walk |
| Same alias, different masterListUrl | Different stream — remove old + append new |
| Same alias, same masterListUrl | Skip — already live, liveUrl already cached |
No HTTP checks. The fetchStreams() API response IS the liveness signal.
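The three-way classification reduces to a pure function over the API response and the existing map — a sketch with hypothetical names, keyed by alias as described above:

```typescript
type StreamerAction = "append" | "replace" | "skip";

function classifyStreamer(
  fresh: { alias: string; masterListUrl: string },
  existing: Map<string, { masterListUrl: string }>,
): StreamerAction {
  const old = existing.get(fresh.alias);
  if (!old) return "append";                                        // new alias
  if (old.masterListUrl !== fresh.masterListUrl) return "replace";  // different stream, same alias
  return "skip";                                                    // already live, liveUrl cached
}
```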
Step 5: The liveUrl Cache Principle
This was the key insight that emerged during debugging. The masterListUrl (master playlist) can 404 while the resolved liveUrl (720p sub-playlist) still serves segments. They’re different URLs with different lifecycles.
This means:
- Never use `masterListUrl` resolution as a liveness check — a null result from `resolveLiveUrl` doesn't mean the stream is dead
- Never overwrite a cached liveUrl with null — the old liveUrl might still work
- Only delete IDB entries when the stream disappears from the API entirely — and even then, only after 24h
The 24h guard on removeCached and sweepOrphans protects against a stream momentarily disappearing from the API (flaky endpoint, pagination issues) while its liveUrl is still actively serving segments.
Enforcement
- `putCached` checks: if the new liveUrl is null but an existing entry has a liveUrl, the write is silently skipped
- `processStreamersEagerly` only calls `putCached` on successful resolution
- `removeCached` reads the `cachedAt` timestamp and skips entries younger than 24h
- `sweepOrphans` iterates all entries and only deletes orphans older than 24h
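The two guards distill to pure decisions. This is a sketch (the real `putCached` and `removeCached` wrap IDB transactions); `nextLiveUrl` and `mayDelete` are hypothetical names for the rules they enforce:

```typescript
// putCached's null-protection rule: the value that should end up in the store.
function nextLiveUrl(
  existing: string | null,
  incoming: string | null,
): string | null {
  // A failed resolution (null) is not proof the old URL stopped working,
  // so it never overwrites a real liveUrl.
  if (incoming === null) return existing;
  return incoming;
}

const DAY_MS = 24 * 60 * 60 * 1000;

// removeCached / sweepOrphans: entries younger than 24h are protected.
function mayDelete(cachedAt: number, now: number): boolean {
  return now - cachedAt >= DAY_MS;
}
```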
The Refresh Flow Summary (Before Rewrite)
Initial load (loadTlStreams):
fetchStreams() → build videos + streamerMap → fire-and-forget processStreamersEagerly
Soft refresh (returning from another provider):
restoreTlSnapshot → fetchStreams() → remove dead aliases → classify:
new → append + processStreamersEagerly
different masterListUrl → remove + append + processStreamersEagerly
same masterListUrl + IDB cached liveUrl → use cache
same masterListUrl + no cache → processStreamersEagerly
→ sweepOrphans (24h guard)
Scroll refresh (near bottom of list):
fetchStreams() → classify:
new → append + processStreamersEagerly
different masterListUrl → remove + append + processStreamersEagerly
same masterListUrl → skip entirely
Files Changed
Backend (video-editor-backend):
- `tl-proxy.routes.ts` — new `POST /tl/resolve-live-url` endpoint, reuses the existing `resolveLiveUrl()` without creating proxy sessions
Frontend (video-editor-svelte):
- `tl-cache.ts` (new) — IndexedDB wrapper + in-memory snapshot
- `constants.ts` — `TL_API.RESOLVE_LIVE_URL`, `TL_PAGE` config object
- `tl-api.ts` — `liveUrl` field on `TlStreamer`, `resolveLiveUrl()` function
- `videoList.svelte.ts` — `initializeSoft()`, `removeStreamers()`, `updateStreamerLiveUrl()`
- `+page.svelte` — soft refresh orchestration, `processStreamersEagerly`, snapshot save/restore, scroll refresh rewrite
Part 2: The 404 Black Screen Bug
Hours after shipping the soft refresh, a new problem surfaced: streams that had gone offline sat in the list as black screens. They weren’t being removed.
The root cause was a missing code path. The system was careful about protecting the liveUrl (never overwrite with null, 24h deletion guard), but it never checked the liveUrl. When a stream died on tango.me, the masterListUrl would 404 — but the code treated that as “transient, keep the cached liveUrl.” The cached liveUrl was also dead, but nobody was asking it.
First Fix: Wrong
Added a checkLiveUrl function — a backend endpoint that does a GET against the liveUrl on tango.me. Called it in processStreamersEagerly when resolveLiveUrl returned null: “masterListUrl failed, so check the cached liveUrl. If that also 404s, remove.”
Also added it to softRefreshTlStreams: “before restoring a cached liveUrl, verify it’s still alive.”
This was architecturally wrong for two reasons.
The Regression: Position Loss
The soft refresh liveUrl check was too aggressive. tango.me HLS endpoints may not behave identically on HEAD vs GET requests. The initial implementation used HEAD (later changed to GET), but the real problem was structural: checking every cached liveUrl on every provider switch meant any transient failure would yank followed streams from the top of the list. The removal + re-addition on the next refresh cycle put them in the middle as “new” streams.
Tried to fix it with hideStreamers() — removing from the video list but keeping the streamerMap entry to prevent re-addition. This was a band-aid over the wrong architecture.
The Correct Architecture: liveUrl as Source of Truth
The breakthrough was clarifying what “liveUrl is the source of truth” actually means in practice. It doesn’t mean “check the liveUrl during processing.” It means:
The only time a liveUrl gets checked is when it’s actually used. During video playback, the backend proxy fetches the liveUrl from tango.me to serve the m3u8 playlist. If tango.me returns 404 — a real, organic HTTP 404 — the stream is dead. That’s the signal. Not a proactive health check. Not a processing-time validation. The natural act of playing the video IS the liveness check.
The Rewrite
Ripped out the proactive checking and replaced it with three clear flows:
1. Processing (resolve + cache, never remove)
processStreamersEagerly resolves the liveUrl from the masterListUrl. If resolution succeeds, cache it. If it fails, fall back to the IDB cached liveUrl. No removal. Stream stays in the list — if it has no liveUrl, it’s unplayable until the next cycle resolves it. That’s fine.
2. 30-second refresh interval (duplicate detection)
Replaced scroll-triggered refresh with a 30s setInterval. The duplicate check is based on streamerId + masterListUrl: same pair → skip (stream stays exactly where it is), different → new stream, queue for processing. This is the fix for the position-loss bug — duplicates are never touched.
3. Video playback (organic 404 removal)
When HLS.js fires a 404 error on a TL stream, the frontend calls checkLiveUrl(liveUrl) — a real GET against tango.me. If tango.me confirms 404, the stream is removed from the video list, the streamerMap, and IndexedDB. If it’s alive (transient proxy issue), retry. The 404 has to come from tango.me, not from the local proxy.
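The removal decision runs only after playback has already hit a 404 through the proxy, and can be sketched with the tango.me check injected. `onPlayback404` is a hypothetical name; `checkLiveUrl` stands in for the backend endpoint and returns an HTTP status:

```typescript
async function onPlayback404(
  liveUrl: string,
  checkLiveUrl: (url: string) => Promise<number>, // status from tango.me
): Promise<"remove" | "retry"> {
  const status = await checkLiveUrl(liveUrl);
  // Only a real 404 from tango.me means the stream is dead; anything else
  // (200, 5xx, network error mapped to 0) is treated as a transient proxy issue.
  return status === 404 ? "remove" : "retry";
}
```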
The IDB Rules (Final)
Reduced to two sentences: store on successful resolution. Remove on 404 from tango.me (immediate) or when 24 hours old (sweep).
Part 3: The PWA Background Problem
A day after the rewrite, a new issue: leave the PWA in the background for a while, come back, and the old streams at the top of the list are black. The new ones added to the end work fine. The stale streams should be getting removed — they’re dead on tango.me — but they just sit there.
Three compounding failures
1. No visibility change detection. The browser suspends background tabs. When the tab wakes, the HLS.js instances are in a broken state, but nothing forces them to reload. The 30s setInterval resumes, but it only adds new streamers — it never re-checks existing ones.
2. loadStream early-returns on same filename. The video player caches which filename is loaded on each element via el.dataset.loadedFilename. When preloadAdjacent fires after waking, it calls loadStream for the same filenames — but loadStream sees the match and skips the reload. The broken HLS.js instances from before the sleep persist.
3. Passive player 404s are silently swallowed. The HLS error handler had if (!isActivePlayer) { hls.destroy(); return; } — preloaded videos that got 404s just destroyed their HLS instance without triggering any removal. Only the active player ran the checkLiveUrl → remove flow. The “three videos loaded at once means three videos checked at once” assumption was wrong.
Net result: the video shows black, no 404 fires because nothing reloads, and even if one did, only the active player would act on it.
The fix: replace the timer with a processing queue
The 30s timer was fundamentally the wrong abstraction. It said “every 30 seconds, check for new streams from the API.” What we actually needed was “continuously process all streams, prioritizing new ones from the API.”
New architecture — a single async loop that runs while on TL:
while on TL:
Phase 1: fetch endpoint → process new/changed (resolve liveUrl + co-streamers)
Phase 2: reprocess existing (check liveUrls against tango.me)
wait minimum 30s from cycle start
repeat
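The loop above can be sketched with both phases injected, so the control flow stands on its own. Names are assumptions; in the app the loop lives in `+page.svelte` and `isRunning` flips when the user leaves TL (the 30s floor shown here was later removed, as described below):

```typescript
interface QueuePhases {
  fetchAndProcessNew: () => Promise<void>;  // Phase 1: new/changed from the API
  reprocessExisting: () => Promise<void>;   // Phase 2: verify existing streams
}

async function runTlQueue(
  phases: QueuePhases,
  isRunning: () => boolean,
  minCycleMs = 30_000,
): Promise<number> {
  let cycles = 0;
  while (isRunning()) {
    const start = Date.now();
    await phases.fetchAndProcessNew();
    await phases.reprocessExisting();
    cycles++;
    // Enforce a minimum cycle length so a tiny list doesn't hammer the API.
    const remaining = minCycleMs - (Date.now() - start);
    if (remaining > 0 && isRunning()) {
      await new Promise((r) => setTimeout(r, remaining));
    }
  }
  return cycles;
}
```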
Phase 1 is identical to the old logic: hit the endpoint, diff against the current list, process anything new or changed.
Phase 2 is entirely new. For each existing streamer not just processed in Phase 1:
- Check the cached liveUrl against tango.me (`checkLiveUrl`). If alive → done, the stream is fine.
- If dead → try resolving a new liveUrl from `masterListUrl`. If the new one is alive → update the cache and keep the stream.
- Only if both the cached liveUrl AND the freshly resolved one are confirmed 404 → remove from the list, memory, and IndexedDB.
The “both must be 404” rule is critical. The cached liveUrl is the source of truth. The endpoint liveUrl (from masterListUrl) is the fallback. A resolveLiveUrl returning null (can’t parse the master playlist) is NOT a 404 — we can’t confirm death, so the stream stays for 24h via the IDB sweep.
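Phase 2's removal rule as a pure decision — a sketch with a hypothetical name, making the "null is not a 404" distinction explicit:

```typescript
function phase2Verdict(
  cachedLiveUrlStatus: number,     // HTTP status of the cached liveUrl
  freshlyResolved: string | null,  // resolveLiveUrl(masterListUrl) result
  freshStatus: number | null,      // status of the fresh liveUrl, if any
): "keep" | "remove" {
  if (cachedLiveUrlStatus !== 404) return "keep";  // cached URL alive (or ambiguous)
  if (freshlyResolved === null) return "keep";     // parse failure: can't confirm death,
                                                   // the 24h IDB sweep decides later
  return freshStatus === 404 ? "remove" : "keep";  // both confirmed 404 -> dead
}
```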
The active player is not special
The old architecture had a separate 404 handling path in the VideoPlayer: HLS.js fires 404 → checkLiveUrl → onLiveUrlDead callback → remove stream → navigate to next. Passive players silently swallowed errors. Active players had their own removal logic.
This was wrong. The queue is the single authority on stream removal. The VideoPlayer’s job is to play whatever’s in the list, not to decide what belongs there.
New error handling for TL in VideoPlayer:
HLS 404 → destroy the HLS instance. That's it.
No checkLiveUrl, no onLiveUrlDead callback, no active/passive distinction. When the queue determines a stream is dead and removes it from the list, a reactive $effect in VideoPlayer detects the current video is gone and navigates to the first remaining video.
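The selection that the reactive effect performs can be isolated as a pure function — a sketch with hypothetical names (the real `$effect` runs it against the Svelte 5 store and triggers navigation):

```typescript
function nextAfterRemoval(
  videos: { filename: string }[],
  currentFilename: string | null,
): string | null {
  if (currentFilename && videos.some((v) => v.filename === currentFilename)) {
    return currentFilename;            // current video still in the list: stay put
  }
  return videos[0]?.filename ?? null;  // gone: navigate to the first remaining video
}
```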
Session restore via IDB
One more gap: the OS kills the PWA (memory pressure, phone restart). The in-memory snapshot is gone. When the app restarts, it hits the endpoint and builds a fresh list. But processNewStreamer falls back to IndexedDB when resolveLiveUrl fails — and the IDB is full of stale liveUrls from the dead session.
The fix: before doing anything else, walk every IDB entry and checkLiveUrl against tango.me. Dead entries get purged immediately. By the time the endpoint fetch runs and processNewStreamer falls back to IDB, only live liveUrls remain.
This is Phase 0 of the queue — runs once on start, before initial processing or the loop.
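Phase 0 sketched with its dependencies injected — names are assumptions; the real version iterates the IDB cursor rather than an array:

```typescript
async function purgeDeadCacheEntries(
  entries: { streamerId: string; liveUrl: string | null }[],
  checkLiveUrl: (url: string) => Promise<number>,   // status from tango.me
  removeCached: (streamerId: string) => Promise<void>,
): Promise<number> {
  let purged = 0;
  for (const e of entries) {
    // Entries without a liveUrl carry nothing worth validating; the normal
    // 24h sweep handles them.
    if (e.liveUrl === null) continue;
    if ((await checkLiveUrl(e.liveUrl)) === 404) {
      await removeCached(e.streamerId);
      purged++;
    }
  }
  return purged;
}
```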
Let the queue breathe
The initial implementation had a 30-second floor between cycles (REFRESH_GATE_MS). It was a safety net against hammering the endpoint. But with 200ms delays between each item, the queue already paces itself. Processing 80 streams takes ~16 seconds. Add the endpoint fetch and IDB operations, and a full cycle is naturally 20-30 seconds for a typical list. A small list (5 streams) cycles faster, but the endpoint can handle that — it’s a single GET.
Removed the artificial floor entirely. The queue runs at its natural pace: as fast as the network and the 200ms per-item delay allow.
Takeaway
Four lessons.
First: identify which piece of data is most valuable and protect it. The liveUrl outlives the masterListUrl that produced it. Build the cache around that.
Second: don’t proactively check what you can check organically. The proxy already hits the liveUrl during playback — that’s the real 404 signal. Adding a separate health-check loop introduced false positives, timing bugs, and position-loss regressions.
Third: a timer is the wrong abstraction when what you need is a work queue. The 30s timer only looked forward (new streams from the API), never backward (are existing streams still alive?). A continuous processing queue naturally covers both directions. New items from the API get priority, then the queue circles back to verify existing items. No stale state accumulates because the queue eventually reaches everything.
Fourth: persistent state needs startup validation. IndexedDB survives app restarts but the data it holds may not. Walking the cache on startup — before anything depends on it — turns stale persistence from a liability into a clean slate.