Data Sources
Every dataset, API, and algorithm powering the Houston DOGE transparency platform
City Payments
Primary payment data: city checkbook records covering all department expenditures, vendor payments, contracts, and procurement transactions from FY2018 to present. Updated nightly via the CKAN API.
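A nightly refresh against a CKAN portal could page through records roughly as follows. This is a minimal sketch, not the platform's actual loader: it assumes the checkbook resource is exposed through CKAN's DataStore, and the resource ID and page size are placeholders. `datastore_search` is CKAN's standard action endpoint; the helper names are hypothetical.

```python
from urllib.parse import urlencode

BASE_URL = "https://data.houstontx.gov"  # portal named in the transparency note
RESOURCE_ID = "00000000-0000-0000-0000-000000000000"  # placeholder, not a real resource


def datastore_search_url(resource_id, offset=0, limit=1000):
    """Build a CKAN datastore_search URL for one page of records."""
    query = urlencode({"resource_id": resource_id, "offset": offset, "limit": limit})
    return f"{BASE_URL}/api/3/action/datastore_search?{query}"


def extract_records(payload):
    """Return the record list from a decoded CKAN action response."""
    if not payload.get("success"):
        raise RuntimeError(f"CKAN request failed: {payload.get('error')}")
    return payload["result"]["records"]
```

A loader would fetch each URL, decode the JSON body, pass it to `extract_records`, and advance `offset` by `limit` until a page comes back empty.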
Fraud Detection
Automated heuristic scoring engine that classifies all flagged alerts. Analyzes 7 factors per alert, including surname frequency, vendor type, officer role, geographic nexus, and entity type. Produces a suspicion score (0-100) and a classification for each alert.
7 fraud detection algorithms (duplicate payments, outliers, vendor concentration, spending spikes, split transactions, threshold clustering, Benford's Law) plus 6 kickback detection algorithms (round-number invoices, contract steering, end-of-period surges, payment velocity, bid rotation, shell companies).
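To illustrate one of the listed checks, a Benford's Law test compares the leading digits of payment amounts against the expected frequency log10(1 + 1/d). The sketch below is a generic chi-square version of that test, not the platform's implementation; the function names and the choice to skip amounts below one cent are assumptions.

```python
import math
from collections import Counter


def benford_expected(digit):
    """Expected share of amounts whose leading digit is `digit` (1-9)."""
    return math.log10(1 + 1 / digit)


def first_digit(amount):
    """Leading non-zero digit of an amount, read from its two-decimal form."""
    text = f"{abs(amount):.2f}".lstrip("0.")
    return int(text[0])


def benford_chi_square(amounts):
    """Chi-square statistic of observed leading digits vs. Benford's Law."""
    # Assumption: amounts under one cent carry no usable leading digit.
    digits = [first_digit(a) for a in amounts if abs(a) >= 0.01]
    n = len(digits)
    if n == 0:
        return 0.0
    counts = Counter(digits)
    return sum(
        (counts.get(d, 0) - n * benford_expected(d)) ** 2 / (n * benford_expected(d))
        for d in range(1, 10)
    )
```

With 8 degrees of freedom, a statistic above roughly 15.51 rejects conformity at the 5% level. A high value is a reason to look closer, not evidence of fraud on its own.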
Vendor Enrichment
Vendor entity enrichment via the Texas Comptroller's Franchise Tax public API. Returns corporate officers, directors, and principals for registered Texas businesses. Used to cross-reference vendor leadership against city officials.
Campaign Finance
Bulk CSV export of campaign contribution data for all Harris County judges. Includes contributor names, employers, amounts, and dates. Cross-referenced against city vendor principals for conflict-of-interest detection.
Manually compiled registry of 17 Houston city officials (mayor and city council members) with known associates (spouses, family). Used for campaign finance cross-referencing and conflict-of-interest scanning.
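The cross-referencing step can be pictured as a name match between vendor officers and the registry of officials and their associates. The matching rules are not described here, so the following is only a sketch under stated assumptions: names are in "First Last" order, candidates are paired by shared surname, and exact matches are marked separately. All function and field names are hypothetical.

```python
def normalize(name):
    """Uppercase, drop punctuation and digits, collapse whitespace."""
    cleaned = "".join(ch for ch in name.upper() if ch.isalpha() or ch.isspace())
    return " ".join(cleaned.split())


def surname(name):
    """Last token of the normalized name (assumes 'First Last' order)."""
    parts = normalize(name).split()
    return parts[-1] if parts else ""


def cross_reference(officers, registry):
    """Pair vendor officers with officials or associates sharing a surname.

    `registry` maps each official's name to a list of known associates.
    """
    by_surname = {}
    for official, associates in registry.items():
        for person in [official, *associates]:
            by_surname.setdefault(surname(person), []).append((official, person))

    matches = []
    for officer in officers:
        for official, person in by_surname.get(surname(officer), []):
            matches.append({
                "officer": officer,
                "matched": person,
                "official": official,
                "exact": normalize(officer) == normalize(person),
            })
    return matches
```

Surname-only pairs are weak signals, especially for common surnames, which is presumably why surname frequency appears among the scoring factors.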
Aggregated law firm and organization contribution data derived from Texas Ethics Commission (TEC) filings. Shows total amounts given, number of judges supported, and individual contribution breakdowns per firm.
Judiciary
Historical criminal court records from the Harris County District Clerk public data portal. Covers all misdemeanor and felony cases with disposition data, attorney names, court assignments, and defendant tracking via SPN (Subject Person Number).
5-factor scoring model (0-100) for each judge: contribution concentration (HHI), city vendor cross-match, volume anomaly vs peers, top donor concentration, and donor-outcome correlation. Risk levels are assigned by the highest threshold met: high (≥60), elevated (≥35), moderate (≥15), low (<15).
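Two pieces of this model are concrete enough to sketch: the Herfindahl-Hirschman Index (HHI) used for contribution concentration, and the mapping from a final score to a risk level. The HHI here is the standard sum of squared shares on a 0-1 scale; how the five factors are weighted into the 0-100 score is not stated, so that step is left out. Function names are hypothetical.

```python
def hhi(amounts):
    """Herfindahl-Hirschman Index: sum of squared contribution shares (0-1)."""
    total = sum(amounts)
    if total <= 0:
        return 0.0
    return sum((amount / total) ** 2 for amount in amounts)


def risk_level(score):
    """Map a 0-100 score to the risk levels given in the text."""
    if score >= 60:
        return "high"
    if score >= 35:
        return "elevated"
    if score >= 15:
        return "moderate"
    return "low"
```

A judge funded by a single donor has an HHI of 1.0, while two equal donors give 0.5, so higher values mean funding is concentrated in fewer hands.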
Aggregated attorney performance statistics per court derived from criminal case data. Shows case volumes, dismissal rates, and conviction rates for each attorney-court combination. Minimum 5 cases required.
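The aggregation described above amounts to a group-by over attorney-court pairs with a minimum-volume filter. A minimal sketch, assuming each case is a dict with hypothetical `attorney`, `court`, and `disposition` fields and that dispositions are already reduced to labels such as "dismissed" and "convicted":

```python
from collections import defaultdict

MIN_CASES = 5  # minimum case count stated in the text


def attorney_court_stats(cases, min_cases=MIN_CASES):
    """Aggregate dismissal and conviction rates per (attorney, court) pair."""
    tallies = defaultdict(lambda: {"cases": 0, "dismissed": 0, "convicted": 0})
    for case in cases:
        tally = tallies[(case["attorney"], case["court"])]
        tally["cases"] += 1
        # Other dispositions count toward volume but toward neither rate.
        if case["disposition"] in ("dismissed", "convicted"):
            tally[case["disposition"]] += 1

    stats = {}
    for key, tally in tallies.items():
        if tally["cases"] < min_cases:
            continue  # too few cases for a meaningful rate
        stats[key] = {
            "cases": tally["cases"],
            "dismissal_rate": tally["dismissed"] / tally["cases"],
            "conviction_rate": tally["convicted"] / tally["cases"],
        }
    return stats
```

The minimum-case filter keeps a single dismissal from showing up as a 100% dismissal rate.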
Bond Body Count
Derived dataset tracking defendants released on bond who subsequently commit violent crimes. Uses SPN (Subject Person Number) matching to trace violent incidents back to pending cases and releasing courts/judges. A 730-day hard cap applies to the bond-to-violence window.
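The SPN matching and the 730-day window can be sketched as a join between bond releases and later violent incidents. This is an illustration under assumptions: the field names are hypothetical, the cap is treated as inclusive, and the check that the original case was still pending at the time of the incident is omitted.

```python
from datetime import date

MAX_WINDOW_DAYS = 730  # hard cap stated in the text


def link_incidents(releases, incidents, max_days=MAX_WINDOW_DAYS):
    """Link violent incidents to earlier bond releases for the same SPN."""
    releases_by_spn = {}
    for release in releases:
        releases_by_spn.setdefault(release["spn"], []).append(release)

    links = []
    for incident in incidents:
        for release in releases_by_spn.get(incident["spn"], []):
            days = (incident["offense_date"] - release["release_date"]).days
            # Keep only incidents on or after release and within the cap.
            if 0 <= days <= max_days:
                links.append({
                    "spn": incident["spn"],
                    "case": release["case"],
                    "court": release["court"],
                    "days_on_bond": days,
                })
    return links
```

For a release on `date(2020, 1, 1)`, an incident on `date(2021, 12, 31)` falls exactly on day 730 and is kept under the inclusive reading, while one a day later is dropped.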
Individual incident records linking each violent crime to the pending case where the defendant was released on bond. Includes violence tier classification (bodies, serious violence, other violence), releasing court/judge, and days on bond before violence.
Data Transparency Note
All payment data is sourced from the City of Houston's official open data portal (data.houstontx.gov) and is publicly available under the Texas Public Information Act. Campaign contribution data is sourced from the Texas Ethics Commission's public bulk data export. Criminal court records are sourced from the Harris County District Clerk's public records. No private, proprietary, or non-public data is used on this platform. AI-powered investigation classifications are heuristic-based and should not be treated as conclusive findings.