Everything LexDOGE wants to be
What this is. The full picture of what LexDOGE wants to be — every capability planned, every audience served, every system designed. No dates, no quarters, no commitments. Just a map of the surface area.
What this is not. A schedule. A promise. A priority list with delivery timelines. Build order emerges from contributor capacity, funding, document availability, and what the city is doing in any given month — it is not pre-decided here. For point-in-time status, see the project status in the README.
How to use it. Find the row that fires you up. Open an issue, or pick up the linked module spec. Everything here is fair game — contribute anonymously, pseudonymously, or under your real name, your choice. See /about #community for channels.
How to read the status flags
These flags describe reality, not aspiration. They can move in either direction. A capability is only live when it is running on real LFUCG data in production.
§1 Foundation — the public-record substrate
Every other capability in this roadmap depends on this layer. LexDOGE covers every public entity a Lexington taxpayer funds — not just LFUCG. Foundation is organized by entity (who collects the tax dollar and writes the audit) plus cross-entity infrastructure (storage, search, lineage) that all sources share.
- live | LFUCG annual budget (adopted + proposed PDFs → structured line items) | 1,026 line items FY25+FY26 in prod
- live | LFUCG CAFR (Comprehensive Annual Financial Report): pension, debt, audit findings | pension_data, debt_obligations, debt_service_schedule populated (Round 5)
- live | Council legislation, agendas, minutes, votes (Legistar API) | 16,923 legislation rows + 3,238 contracts in prod, daily cron
- live | Contracts + vendor extraction from council legislation | 2,729 contracts with extracted vendor (Round 7+)
- live | FOIA / Kentucky Open Records requests (MuckRock API v2) | JWT pipeline operational; submission still human-gated
- live | Open-data portal ingest (ArcGIS Hub: parcels, zoning, addresses, schools, STR registry, voting precincts, council districts, parks, historic districts, H-1 overlay; +143 more datasets available) | 6,796 rows across 10 LFUCG datasets; DCAT-US catalog ingested; daily Celery Beat refresh
- vision | Granicus meeting media (audio/video, captions, RSS) for LFUC Council | primary source for Tier-2/3 meeting summaries
- live | FCPS adopted + tentative budget (district financials) | school_district_finances seeded FY26 working budget $827.2M (Round 6)
- specified | FCPS audited financial statements (independent CPA, annual) | primary source replaces secondary-press magnitude errors (see methodology)
- specified | KDE District Financial Profile (state-level audit overlay) | education.ky.gov/districts/FinRept (Kentucky Dept of Education)
- specified | FCPS BoardDocs governance feed (agendas, minutes, board policies) | parallel to LFUCG's Legistar: different vendor, same role
- specified | FCPS Open Records portal (KORA requests directed at the district) | separate channel from LFUCG MuckRock pipeline
- specified | FCPS Council Voting Record + Alignment (school board roll-calls) | Module 09 covers both LFUC Council and FCPS Board
- specified | LexTran (Lexington Transit Authority): board resolutions, operating budget, federal grant disclosures | FY26 ops budget $37.97M; ~70% from 6¢/$100 property tax
- specified | LFCHD (Lexington-Fayette County Health Dept.): audited financials, 2.43¢/$100 health levy | FY24 audited $26.27M revenues; source-of-truth must be the audit, not press
- specified | Lexington Public Library: board minutes, audited financials, dedicated property-tax allocation | ~$24M magnitude per Library Board orientation materials
- specified | BGADD (Bluegrass Area Development District): federal pass-through, regional planning | FY25 total receipts $9.0M per KY Legislature ADD Annual Report
- specified | Fayette County PVA (Property Valuation Administrator): parcel-level valuations + tax-roll exports | feeds Module 06 (Property Tax + Parcel-Level Data)
- specified | Fayette County Sheriff: property-tax collection statements (how the bill is actually billed) | the missing piece between PVA valuation and entity allocation
- vision | Independent agencies + special districts (LFUCG-adjacent, e.g. Airport Board, BlueGrass Tomorrow) | scoping needed; each appears in CAFR component-units footnotes
- specified | KPPA pension disclosures (CERS funded-ratio, employer rates): county-employer view | covers LFUCG CERS + FCPS KTRS in one pipeline; kyret.ky.gov primary source
- specified | EMMA bond filings (MSRB disclosures): LFUCG GO + revenue bonds + FCPS school bonds | emma.msrb.org continuing-disclosure API; closes the bondholder-view gap
- specified | Full-text document search across all ingested records | pg_trgm + tsvector over budgets, CAFRs, minutes, FOIA responses, contracts
- live | Embedding index over ingested chunks for semantic search (pgvector) | 600+ chunks indexed and growing; OpenAI text-embedding-3-small
- partial | Cloudflare R2 raw-document archive (every primary source preserved with content-hash lineage) | archive_document() + archive_bytes() helpers wired into budget/CAFR/MuckRock; activates on R2 token mint (see apps/api/.env.example)
- partial | Source-document provenance + content-hash lineage (every claim traces to PDF + page + hash) | content_hash + r2_key + archived_at columns added in m30; loop closes when R2 archive activates
- vision | Lincoln Institute fiscally-standardized cities dataset (cross-city benchmarks) | context for 'is Lexington's burden typical for a 330k-population peer?'
- specified | KORA/FOIA response archive: every response document ingested + searchable | feeds Module 03 (Open Records Request Tracker)
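The content-hash lineage idea in the archive and provenance rows can be illustrated in a few lines. This is a minimal sketch under stated assumptions, not the project's `archive_document()` or `archive_bytes()` helpers: SHA-256 as the hash, the `raw/<prefix>/<hash>` key layout, and the names `ArchiveRecord`, `make_archive_record`, and `verify` are all hypothetical. Only the field names `content_hash`, `r2_key`, and `archived_at` come from the list above.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class ArchiveRecord:
    content_hash: str      # hex digest of the raw bytes (SHA-256 is an assumption)
    r2_key: str            # object key derived from the hash (hypothetical layout)
    archived_at: datetime  # UTC timestamp of when the record was created


def make_archive_record(raw: bytes, suffix: str = ".pdf") -> ArchiveRecord:
    """Build a lineage record for one primary-source document."""
    digest = hashlib.sha256(raw).hexdigest()
    # Content-addressed key: identical bytes always map to the same key.
    key = f"raw/{digest[:2]}/{digest}{suffix}"
    return ArchiveRecord(digest, key, datetime.now(timezone.utc))


def verify(raw: bytes, record: ArchiveRecord) -> bool:
    """Check that bytes fetched later still match the recorded hash."""
    return hashlib.sha256(raw).hexdigest() == record.content_hash
```

Because the key is derived from the content, re-archiving an unchanged document is idempotent, and any published claim that stores the hash can be checked against the archived bytes.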
§2 Analysis — what the agents do with the substrate
Once data is ingested, agents analyze it. Methods are documented, reproducible, and described in plain English on the public methodology page.
- live | Benford's Law (leading-digit distribution)
- live | Year-over-year spike detection | >25% Δ without matching council action
- live | Threshold-avoidance / contract-splitting | clusters just below $20k / $30k / $50k thresholds
- live | Duplicate-payment detection | fuzzy vendor+amount+date matching
- live | Vendor concentration (HHI > 0.4)
- specified | Sole-source concentration over time
- vision | Distributional anomaly across years | longitudinal pattern detection; open scope
- vision | Network analysis (vendor → council member → committee assignment)
- specified | Whistleblower Channel: cryptographic intake of insider tips with metadata stripping
- specified | Settlements + Litigation Ledger: every settlement and judgment paid by LFUCG
- specified | Open Records Request Tracker: public log of every KORA request, response time, redactions
- specified | Public Payroll Search: name-searchable employee database with base, OT, longevity, total comp
- specified | Campaign Finance + Lobbying Overlay: donations + lobbying joined to votes and awards
- specified | Property Tax + Parcel-Level Data: who owns Lexington, who pays the bill, who got an exemption
- specified | Contract Lifecycle Tracker: solicitation → award → sole-source justification → change orders
- specified | Per-District Dashboard: fifteen council districts × one accountability dashboard each
- specified | Council Voting Record + Alignment: full roll-call history (LFUC Council + FCPS Board)
- specified | Public Safety Metrics: use of force, complaints, settlements, response times, overtime
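Three of the live methods in this list are compact enough to sketch. The Python below is an illustration under assumptions, not the project's code: the function names (`benford_chi_square`, `yoy_spikes`, `hhi`) and the input shapes are hypothetical, and the chi-square test is one common way to compare leading digits against Benford's Law, not necessarily the one LexDOGE uses. Only the 25% spike threshold and the 0-to-1 HHI scale implied by "HHI > 0.4" come from the list above. The spike check does not model the "without matching council action" condition, which would require the legislation data.

```python
from collections import Counter
from math import log10


def benford_chi_square(amounts):
    """Chi-square statistic of observed leading digits vs. Benford's Law."""
    # Scientific notation puts the leading significant digit first.
    digits = [int(f"{abs(a):.15e}"[0]) for a in amounts if a]
    n = len(digits)
    if n == 0:
        raise ValueError("need at least one non-zero amount")
    observed = Counter(digits)
    stat = 0.0
    for d in range(1, 10):
        expected = n * log10(1 + 1 / d)  # Benford probability times sample size
        stat += (observed.get(d, 0) - expected) ** 2 / expected
    return stat


def yoy_spikes(prior, current, threshold=0.25):
    """Map line item -> relative change, for changes larger than threshold."""
    spikes = {}
    for key, now in current.items():
        before = prior.get(key)
        if before:  # skip new line items and zero baselines
            change = (now - before) / abs(before)
            if abs(change) > threshold:
                spikes[key] = change
    return spikes


def hhi(spend_by_vendor):
    """Herfindahl-Hirschman Index on a 0-to-1 scale (sum of squared shares)."""
    total = sum(spend_by_vendor.values())
    if total <= 0:
        raise ValueError("total spend must be positive")
    return sum((v / total) ** 2 for v in spend_by_vendor.values())
```

On this scale an HHI of 1.0 means a single vendor holds all spend, and two equal vendors give 0.5. For the Benford statistic, a value above roughly 15.51 (the 5% critical value for 8 degrees of freedom) would suggest the digits deviate from the expected distribution; where the project actually sets its cutoff is not stated here.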
§3 Publication — what reaches the public
Anomalies, findings, and analyses become public artifacts. Every artifact carries citations to primary sources, a confidence score, and a tier classification (1 = automated; 4 = legal review required).
- live | Live budget dashboard (department breakdown, fund composition, YoY)
- live | Anomaly feed with per-flag citations | 503 anomalies live on /anomalies with plain-English humanizer + evidence disclosure
- partial | Long-form reports (Tier 3/4, agent-drafted, human-reviewed) | report-writer agent exists; first publication pending
- partial | Council meeting summaries with fiscal-impact extraction
- partial | FOIA log on /foia with status, responses, produced documents
- live | Glossary tooltips on every technical term used on the site | glossary.ts + Term component shipped, 31 terms
- specified | Corrections + retractions log with timestamped diffs
- vision | RSS / Atom feeds per category (anomalies, reports, FOIA, meetings)
- specified | Embed widgets (per-department chart, per-vendor table, FOIA tracker)
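The shape of a published artifact described at the top of this section (citations, a confidence score, a tier from 1 to 4) could be modeled roughly as follows. This is a hypothetical sketch: the class and field names are invented for illustration, the 0-to-1 confidence range is an assumption, and the citation fields follow the "PDF + page + hash" provenance described in §1.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Citation:
    document: str      # name or key of the primary-source PDF
    page: int          # page the claim is drawn from
    content_hash: str  # hash of the archived document


@dataclass(frozen=True)
class Artifact:
    title: str
    tier: int                        # 1 = automated ... 4 = legal review required
    confidence: float                # assumed to lie in [0, 1]
    citations: Tuple[Citation, ...]  # must not be empty

    def __post_init__(self) -> None:
        if self.tier not in (1, 2, 3, 4):
            raise ValueError("tier must be 1 through 4")
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be between 0 and 1")
        if not self.citations:
            raise ValueError("an artifact needs at least one citation")
```

Rejecting uncited artifacts at construction time is one way to make "every artifact carries citations" a property of the data model instead of a convention.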
§4 Distribution — getting findings to those who can act
A finding nobody sees is no finding at all. Distribution is a first-class concern.
- specified | Email Alerts + Subscriptions: vendor watchlists, anomaly thresholds, department beats | Module 10: converts visitors into beat followers
- specified | Public API + Journalist Kits: REST + bulk exports + reproducibility kits + journalist program | Module 11: turns LexDOGE into civic-data infrastructure
- specified | News Monitoring + Autonomous Reports Pipeline: sensor-to-publish loop | Module 18: the largest single new build
- vision | Social auto-poster (anomalies + reports → X / Bluesky / Mastodon with citations)
- partial | Press kit / journalist onboarding flow
§5 Community interface — how Lexington engages
LexDOGE serves residents directly. Every interface here is designed to lower the bar to participation — for tipsters, contributors, and casual readers.
- partial | Anonymous / pseudonymous / identifiable contribution surface | email + GitHub today; Module 01 is the upgrade
- live | /about page surfacing mission, independence, AI-experiment framing
- live | In-page glossary for civic terminology (TIF, CAFR, OPEB, KORA, etc.)
- specified | District-localized views (show me my council district)
- live | Mobile responsiveness across every page | viewport meta + 3 breakpoint bands + table overflow scroll
§6 Governance and trust — transparency about the project itself
LexDOGE asks public institutions to disclose how they operate. It owes the public the same standard about itself.
- live | Public ADRs documenting every architectural / policy decision | seven published; more expected
- live | Public methodology page describing data sources, agents, anomaly methods, tier system
- live | Forbidden-words content policy enforced by the codebase, not just the docs
- specified | Public corrections + retractions log
- specified | Self-disclosure: budget, funding sources, infrastructure costs, governing body, vendors
- live | Public source code for everything: parsers, prompts, thresholds, gates | AGPL-3.0
- vision | 501(c)(3) status + IRS Form 990 published as soon as filed
§7 Resilience and autonomy — what keeps the system honest
The project's premise is that AI agents can run a civic watchdog at low cost with bounded human oversight. That premise is only credible if the system can detect its own failures and recover from them.
- specified | Self-Monitoring + Resilience: every cron, parser, queue has health checks and SLOs | Module 19: load-bearing for autonomy
- specified | Autonomy Audit: end-to-end review of where humans sit in the loop | guides which gates can be safely automated
- live | Daily Discord digest of ingest activity (new PDFs, new FOIA responses, parser deltas) | webhook plumbing shipped; webhook URL pending
- partial | Sentry observability across web + API + worker | initialized; alerts pending
- vision | Automated cost monitoring (LLM spend per agent, per module, per finding)
- vision | Per-agent A/B testing infrastructure (compare prompts, models, thresholds) | research-grade tooling
- specified | Reproducibility kit: given a finding, replay the exact pipeline that produced it | element of Module 11
§8 Multi-jurisdiction and the larger project
LexDOGE serves Lexington-Fayette first. But the codebase, methodology, and agent stack are not Lexington-specific. The intent is that any community wanting a civic-transparency dashboard can fork LexDOGE and inherit everything.
- live | AGPL-3.0 license that closes the SaaS-vendor loophole | ADR-001
- live | Fork-over-multi-tenant architecture (each jurisdiction is its own deployment) | ADR-002
- vision | Configuration-driven jurisdiction setup (one config file → new city) | partially possible today; full extraction not yet done
- vision | Reference fork documentation (how to clone LexDOGE for Knoxville, Louisville, …)
- live | Shared upstream improvements flow back to canonical repo
- vision | Cross-jurisdiction benchmark dashboard (using Lincoln Institute's FiSC dataset)
§9 The bigger questions
These are not modules. They are the questions the project is, on its longest view, trying to answer — in public, by working.
- Can autonomous AI agents do investigative civic work at a quality and consistency journalists and auditors will trust? The answer must be demonstrated, not asserted.
- Can a small civic project, run by agents, monitor public finance with the depth that previously required an institutional newsroom?
- Can pseudonymous community contribution be a first-class pattern in civic transparency, without becoming an attack surface for bad actors?
- Can other communities adopt this codebase and produce findings as strong as Lexington's? The fork model exists. The replication does not — yet.
- What does positive-sum AI for public goods look like at scale? LexDOGE is one data point. More are needed.
How to add to this roadmap
Anyone can. Two paths:
- Adding a new capability. Open an issue with the rough idea, the audience it serves, and any references. Once it has at least a paragraph of motivation, it lands here as vision. Once someone writes a real spec for it, it advances to specified.
- Advancing an existing capability. Pick up the module spec or open issue, implement it, and the next maintainer pass will update the flag.
There is no separate gate on what counts as “in scope.” LexDOGE is a civic-data project for Lexington-Fayette. If something serves that mission and meets the legal and editorial bar in ADR-003, it belongs here.
This is a living document. It describes what the project intends to be, not what has been promised. Build order is not encoded here. See the README's Project Status for what is shipping right now.