Got enough. Here’s the ELI5:
The thesis in plain English
The puzzle. Latvia spends ~15% of GDP through public procurement, but ~22% of tenders only get one bid (vs. ~14% EU average). Each missing bidder costs real money: each additional competitor is associated with roughly 2.5–6% lower prices. The Latvian government is reforming the Public Procurement Law in 2026 — most visibly, raising the regulated threshold from €10k to €143k. But the reform is essentially a guess about why suppliers don’t bid. Jāzeps’s thesis asks the prior question: why do legally eligible suppliers stay away, especially in knowledge work (IT, consulting, R&D, analytics)? (#DWCV8W, #U57A7C)
The theoretical move. He reframes a tender as a signaling environment: the buyer designs a screening architecture, and submitting a bid is a costly signal. The supplier’s bid/no-bid decision is then a rational response to perceived transaction costs (PTC) that hit before any contract is awarded. He decomposes PTC into three components: bid preparation burden, evaluation uncertainty, and integrity risk — and builds a “two-gate” participation model (Gate 1 = is this tender even worth engaging with; Gate 2 = can I produce a competitive bid at acceptable cost). Knowledge services amplify all of this because the bid itself requires expensive intellectual labour you can’t recover if you lose. (#YFYYHE, #3T92HU)
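To make the two-gate logic concrete, here’s a toy sketch of the bid/no-bid decision. All names, thresholds, and functional forms are my illustrative assumptions, not the thesis’s actual model — the point is just that the three PTC components enter at different gates.

```python
# Toy sketch of the two-gate participation logic.
# Hypothetical names and functional forms -- illustration only, not the thesis's model.
from dataclasses import dataclass

@dataclass
class Tender:
    expected_contract_value: float   # what winning is worth to the supplier
    win_probability: float           # supplier's perceived chance of winning, in [0, 1]
    bid_prep_cost: float             # PTC component 1: bid preparation burden
    evaluation_uncertainty: float    # PTC component 2: discount on win probability
    integrity_risk: float            # PTC component 3: perceived-unfairness discount

def passes_gate_1(t: Tender, engagement_cost: float = 500.0) -> bool:
    """Gate 1: is the tender worth engaging with at all?
    A coarse screen: integrity-discounted expected value must beat the cost
    of just reading and assessing the tender documents."""
    perceived_value = t.expected_contract_value * (1 - t.integrity_risk)
    return perceived_value * t.win_probability > engagement_cost

def passes_gate_2(t: Tender) -> bool:
    """Gate 2: can the supplier produce a competitive bid at acceptable cost?
    Bid preparation is sunk if the bid loses, so it is weighed against the
    uncertainty-discounted expected payoff."""
    effective_win_prob = t.win_probability * (1 - t.evaluation_uncertainty)
    return effective_win_prob * t.expected_contract_value > t.bid_prep_cost

def will_bid(t: Tender) -> bool:
    return passes_gate_1(t) and passes_gate_2(t)
```

Note how this captures the KIBS amplification: for knowledge services, `bid_prep_cost` is large (the bid itself is expensive intellectual labour), so Gate 2 binds even when Gate 1 passes.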
The empirics. Sequential explanatory mixed-methods (#U733DL):
- Quantitative phase — IUB open procurement data 2024–2025, KIBS tenders identified by CPV codes (72/73/79) plus a Latvian-language keyword dictionary as robustness check. Regressions (Poisson/NB for bidder counts, logit for single-bid). The trick: he uses residuals diagnostically. Where the model explains low participation well → cost-binding sub-market. Where actual participation is way below predicted → something the observable variables can’t see (trust, evaluation competence concerns) — and those become the targets for interviews. (#44QV8L, #6CWVF5)
- Qualitative phase — semi-structured interviews with suppliers (domestic + international, active + withdrawn), using anonymised tenders from the dataset as vignettes, coded against the gate framework. (#82JSLX)
- Synthesis — a sub-market × binding-constraint matrix (cost-binding / quality-binding / integrity-binding / mixed), used to predict which sub-markets the 2026 reform will actually reach. The reform mostly simplifies formal procedural cost; if a sub-market is informally constrained (trust, integrity), threshold changes won’t help. (#AG5337)
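The residual trick from the quantitative phase is simple enough to sketch. Assuming predicted bidder counts come from a fitted Poisson/NB model (not shown), the triage step is just flagging tenders whose actual participation sits far below prediction — my variable names and z-threshold here are illustrative:

```python
# Sketch of the residual-triage step: flag tenders the observables can't explain.
# Assumes `predicted_bidders` are fitted means from a Poisson/NB regression (not shown).
import numpy as np

def triage_residuals(actual_bidders, predicted_bidders, z_threshold=-2.0):
    """Pearson residuals under the Poisson assumption: (y - mu) / sqrt(mu).
    Large negative residuals = participation far below what the model predicts,
    i.e. candidate cases for the interview phase (trust / integrity concerns)."""
    actual = np.asarray(actual_bidders, dtype=float)
    mu = np.asarray(predicted_bidders, dtype=float)
    pearson = (actual - mu) / np.sqrt(mu)
    return pearson < z_threshold  # boolean mask of "unexplained" low-participation tenders
```

Tenders where the mask is False and participation is still low are the cost-binding cases; tenders where it is True go into the vignette pool for interviews.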
Cross-cutting variable: supplier origin. Cross-border participation in Latvian KIBS is 1.9% vs. Estonia’s 13.6% — same EU rules, very different behaviour, suggesting informal participation costs. (#5M8N7W)
The LLM usage
This is in §3.2 (research design) and is fairly disciplined. The IUB export gives titles and basic fields, but the real gate-model variables (evaluation criteria weights, qualification threshold specificity, documentation complexity, required annexes, award records) live inside PDFs and .doc files, mostly in Latvian, with formatting that varies by contracting authority. Manually coding the full sample (~1,789 tenders) is infeasible, so he uses Claude Haiku via API for structured extraction.
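The shape of that extraction step looks something like the following. This is my dependency-free sketch, not his pipeline: the schema fields are a hypothetical subset of the gate-model variables, and the actual model call (via the `anthropic` SDK’s `messages.create`) is omitted — what matters is that the schema constrains the output and that malformed replies fail loudly into an audit log rather than silently into the dataset.

```python
# Illustrative sketch of schema-constrained extraction; field names are assumptions.
import json

SCHEMA = {
    "evaluation_criteria_weights": "dict[str, float]",
    "qualification_thresholds": "list[str]",
    "required_annexes_count": "int",
}

def build_prompt(tender_text: str) -> str:
    """Instruct the model to return only JSON matching the predefined schema."""
    return (
        "Extract the following fields from this Latvian tender document. "
        f"Respond with JSON only, matching this schema: {json.dumps(SCHEMA)}\n\n"
        f"Document:\n{tender_text}"
    )

def parse_extraction(raw: str) -> dict:
    """Validate that the reply is JSON containing every schema field; raise on
    anything else so bad rows hit the audit log instead of the dataset."""
    data = json.loads(raw)
    missing = set(SCHEMA) - set(data)
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    return data
```

The prompt text would be iterated against the gold standard (stage 2 below) before any full run.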
The protocol has four stages (#8LUL7H, #QTUWVU):
- Manual gold-standard set — he hand-codes a reference set first.
- Prompt + schema development — Claude Haiku with structured output schemas, iterative prompt refinement against the gold standard. He cites Lunn et al. 2025 for the iteration method, and Konet et al. 2024 / Schmidt et al. 2024 as evidence that LLMs can hit ~80–96% accuracy on structured extraction when guided by a predefined schema and validated. (#C7RMKD)
- Full pipeline run with audit logging.
- Validation on a random 20% subsample against source documents, with agreement rates reported per variable (Krippendorff’s principle: reliability isn’t a single number; it varies by what you’re extracting). Variables below the threshold are either re-prompted or dropped with the limitation flagged.
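The per-variable validation step can be sketched in a few lines. Note the hedge: the protocol invokes Krippendorff-style reliability (his alpha corrects for chance agreement); plain percent agreement is used here only to keep the sketch dependency-free, and the 0.80 threshold is an assumed placeholder.

```python
# Sketch of per-variable validation on the 20% subsample.
# Percent agreement stands in for Krippendorff's alpha; threshold is illustrative.
def per_variable_agreement(llm_rows, gold_rows, variables):
    """Agreement rate per extracted variable, LLM output vs. hand-coded gold standard."""
    rates = {}
    for var in variables:
        matches = sum(1 for llm, gold in zip(llm_rows, gold_rows)
                      if llm.get(var) == gold.get(var))
        rates[var] = matches / len(gold_rows)
    return rates

def triage_variables(rates, threshold=0.80):
    """Split variables into keep vs. re-prompt/drop-with-flagged-limitation."""
    keep = [v for v, r in rates.items() if r >= threshold]
    flag = [v for v, r in rates.items() if r < threshold]
    return keep, flag
```

This is what "reliability varies by what you’re extracting" means operationally: each variable gets its own agreement rate and its own pass/fail decision.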
He’s explicit about the measurement-uncertainty caveat in the limitations (#NUMYCT). The framing is conservative: LLM as a force-multiplier on a Krippendorff-style content-analysis protocol, not as an oracle — gold standard first, schemas constrain output, human validation per variable, and the methodological literature he leans on is the recent computational-content-analysis wave rather than anything hand-wavy.
So in one breath: theory = tenders are screening architectures, non-participation = rational response to perceived transaction costs; method = mixed-methods with regression residuals targeting the interview phase; LLM role = scaling structured variable extraction from Latvian tender PDFs under a validated content-analysis protocol; payoff = a sub-market typology that predicts which slices of the 2026 reform will actually move participation.