Super inspiring idea. Could be useful in an internal knowledge base where information often gets quite messy and unmaintained over time (I've met a few of those gargantuan Confluence installations).
Thanks! Yeah, that’s a big part of the motivation.
Most internal wikis slowly decay because no one has time to maintain them. An agent running on a cron schedule can continuously research topics and update the docs, while GitHub keeps the full history of what changed.
The thing that bothered me: say there are 5 topics that need pages, but only one of them sees huge growth in articles. Think discography vs. books on Wikipedia.
The interesting technical problem here turned out not to be speech recognition but script alignment.
ASR output arrives in ~600 ms chunks and is messy (filler words, homophones, skipped phrases). A simple substring match breaks immediately.
The current tracker uses:
- inverted token index to find candidate windows
- banded Levenshtein distance for fuzzy matching
- Double Metaphone phonetic normalization
- locality penalties to stay near the current position
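For anyone who wants to poke at the idea, here is a rough Python sketch of how three of those pieces could fit together. It is not the actual tracker: the function names, the band width, and the penalty weight are all made up for illustration, and exact token matching stands in for the Double Metaphone step.

```python
from collections import defaultdict


def build_index(script_tokens):
    """Map each token to the positions where it occurs in the script."""
    index = defaultdict(list)
    for pos, tok in enumerate(script_tokens):
        index[tok].append(pos)
    return index


def banded_levenshtein(a, b, band):
    """Edit distance between sequences a and b, filling only cells with
    |i - j| <= band. Returns band + 1 when the distance exceeds the band."""
    inf = band + 1
    if abs(len(a) - len(b)) > band:
        return inf
    prev = [j if j <= band else inf for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [inf] * (len(b) + 1)
        if i <= band:
            cur[0] = i
        for j in range(max(1, i - band), min(len(b), i + band) + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            cur[j] = min(prev[j - 1] + cost, prev[j] + 1, cur[j - 1] + 1)
        prev = cur
    return min(prev[len(b)], inf)


def locate(script_tokens, index, chunk_tokens, cursor, band=3, penalty=0.05):
    """Return the script position just past the best match for the chunk,
    or the unchanged cursor when no window is close enough."""
    n = len(chunk_tokens)
    # Every indexed token in the chunk proposes a window start.
    starts = set()
    for offset, tok in enumerate(chunk_tokens):
        for pos in index.get(tok, ()):
            starts.add(max(0, pos - offset))
    best_score, best_end = None, cursor
    for start in sorted(starts):
        window = script_tokens[start:start + n]
        dist = banded_levenshtein(chunk_tokens, window, band)
        if dist > band:
            continue
        # Locality penalty (assumed linear): far-away windows cost more.
        score = dist + penalty * abs(start - cursor)
        if best_score is None or score < best_score:
            best_score, best_end = score, start + len(window)
    return best_end
```

With a script that repeats a phrase, the penalty is what keeps the match from jumping to the wrong occurrence: the same chunk resolves to whichever copy is closer to the current cursor.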
Between ASR updates the UI speculatively advances the cursor based on measured WPM so the highlight moves smoothly.
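A minimal sketch of that speculative step, assuming the cursor is measured in words and the lead is capped so a pause in speech cannot let the highlight run away. The function name and the `max_lead` cap are my own placeholders, not something taken from the project.

```python
def speculative_position(confirmed_pos, seconds_since_update, wpm, max_lead=8):
    """Estimate the cursor position (in words) between ASR updates.

    confirmed_pos: last position confirmed by alignment.
    seconds_since_update: time elapsed since that confirmation.
    wpm: measured speaking rate in words per minute.
    max_lead: assumed cap on how far ahead the estimate may run.
    """
    words_ahead = wpm / 60.0 * seconds_since_update
    return confirmed_pos + min(words_ahead, max_lead)
```

At 150 WPM, one 600 ms gap moves the estimate about 1.5 words ahead of the last confirmed position; the next ASR chunk then snaps it back to the aligned value.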
Curious if anyone here has worked on similar real-time alignment problems.
Anticipating a common question: this doesn’t bypass Copilot’s licensing or give you “free” access. You still need an active Copilot subscription, and the bridge just exposes it as a local API. Think of it as a shim: Copilot stays the backend, this just makes it usable from scripts, CLIs, or tools that expect an OpenAI-style interface.
I’ve tried to keep the project in the spirit of extending Copilot’s usefulness without abusing the service. Feedback on where the line should be drawn is welcome.