Fabbi CTO/CDXO
2026-05-28 13:26
QUALITY_GATE_PARTIAL

Technical Intelligence Brief — LLM Coding Agents / Harness / AI SDLC

190 candidates scanned
82 GitHub repo signals
38 social/KOL signals
25 paper/product/benchmark signals
72% CTO confidence

Executive Technical Signal

  • Agent harnesses are moving from demos to quality control → 25 paper/product/benchmark signals plus Terminal-Bench/SWE-bench → NEXA needs an eval gate before rollout.
  • Repo momentum still drives adoption → 82 repo candidates, many with >50 stars → pick 3 OSS agent CLIs for internal benchmarking.
  • Social KOL signals are limited by auth, but the watchlist is still usable → 14 X feed URLs, engagement N/A because access was unauthenticated → use as a trigger, not as primary quantitative evidence.
  • YouTube developer education is playing a growing enablement role → 24 video IDs from public search → create a 2-hour training playbook for the pilot team.
  • HN/dev-web reflects scepticism about reliability → 45 threads/stories → SYNCA needs human-in-the-loop review plus an audit log.
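
As a rough illustration of the eval gate and human-in-the-loop audit log mentioned in the signals above, a minimal Python sketch follows. The `EvalResult` schema, the `eval_gate` function and the 0.8 pass-rate threshold are hypothetical and not taken from NEXA or SYNCA; they only show the shape such a gate might take.

```python
from dataclasses import dataclass


@dataclass
class EvalResult:
    """Outcome of one benchmark task run by the agent (hypothetical schema)."""
    task_id: str
    passed: bool
    human_reviewed: bool


def eval_gate(results: list[EvalResult], min_pass_rate: float = 0.8) -> tuple[bool, list[str]]:
    """Return (allowed, audit_log).

    Rollout is allowed only if every result was reviewed by a human and the
    pass rate meets the threshold. The 0.8 default is an assumed value.
    """
    audit_log: list[str] = []
    if not results:
        return False, ["BLOCK: no eval results"]

    pass_rate = sum(r.passed for r in results) / len(results)
    audit_log.append(f"pass_rate={pass_rate:.2f} threshold={min_pass_rate:.2f}")

    unreviewed = [r.task_id for r in results if not r.human_reviewed]
    if unreviewed:
        audit_log.append(f"BLOCK: unreviewed tasks {unreviewed}")
        return False, audit_log

    if pass_rate < min_pass_rate:
        audit_log.append("BLOCK: pass rate below threshold")
        return False, audit_log

    audit_log.append("ALLOW: gate passed")
    return True, audit_log


# Example: three reviewed tasks, two of which passed.
sample = [
    EvalResult("t1", True, True),
    EvalResult("t2", True, True),
    EvalResult("t3", False, True),
]
allowed, log = eval_gate(sample, min_pass_rate=0.6)
```

Returning the audit log together with the decision keeps each gate outcome traceable, which is the property the reliability scepticism on HN/dev-web appears to ask for.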

Trend Radar

  • Hot: coding-agent eval
  • Hot: CLI workflows
  • Emerging: context layer
  • Watch: sandbox/security
  • Noise: generic AI IDE hype

Gate: PARTIAL. Reddit/Facebook public = N/A (blocked); the GitHub/HN/arXiv/YT/X URL layer is sufficient for a tactical brief but not for sentiment percentages.

KOL/OG Feed Watch

Platform | Author/channel | Metric | URL | Why it matters
x | @karpathy | engagement N/A (unauthenticated X) | KOL/official feed: karpathy | KOL/official watchlist
x | @swyx | engagement N/A (unauthenticated X) | KOL/official feed: swyx | KOL/official watchlist
x | @simonw | engagement N/A (unauthenticated X) | KOL/official feed: simonw | KOL/official watchlist
x | @paulg | engagement N/A (unauthenticated X) | KOL/official feed: paulg | KOL/official watchlist
x | @amasad | engagement N/A (unauthenticated X) | KOL/official feed: amasad | KOL/official watchlist
x | @sh_reya | engagement N/A (unauthenticated X) | KOL/official feed: sh_reya | KOL/official watchlist
x | @OfirPress | engagement N/A (unauthenticated X) | KOL/official feed: OfirPress | KOL/official watchlist
x | @cognition_labs | engagement N/A (unauthenticated X) | KOL/official feed: cognition_labs | KOL/official watchlist
youtube | YouTube | views N/A (public parse) | coding agent video | video adoption/KOL
youtube | YouTube | views N/A (public parse) | coding agent video | video adoption/KOL
youtube | YouTube | views N/A (public parse) | coding agent video | video adoption/KOL
youtube | YouTube | views N/A (public parse) | coding agent video | video adoption/KOL
youtube | YouTube | views N/A (public parse) | agentic programming video | video adoption/KOL
youtube | YouTube | views N/A (public parse) | agentic programming video | video adoption/KOL
dev_web | OldDod | 1 pts/0 comments | With coding agents, specs feel more like source code | HN/dev discourse
dev_web | croottree | 3 pts/0 comments | A non-coding coding agent | HN/dev discourse
dev_web | ttmacer | 2 pts/0 comments | Coding a Classical Robot Controller in the Age of Coding Agents | HN/dev discourse
dev_web | sjhalani7 | 6 pts/3 comments | Show HN: VAEN – Package and import portable AI coding-agent Harnesses | HN/dev discourse

CTO Evaluation Matrix

Signal | Evidence | Counter-signal | Fabbi implication | Decision | Next validation
Harness/eval-first agents | 25 benchmark/product signals | Benchmark ≠ prod ROI | NEXA/SYNCA: scorecard mandatory | trial 75% | Run 50 tickets, compare cycle time/defect escape
CLI/IDE agents mainstream | 82 GitHub + product URLs | Security/data boundary risk | FARE+NEXA pilot in isolated repo | adopt guarded | 2-week sandbox pilot
Context engineering layer | HN + repo patterns show codebase understanding demand | Index freshness, privacy cost | FARE differentiator | trial | Measure retrieval precision@10
Enterprise governance gap | Reliability scepticism in dev-web | Vendors shipping controls fast | SYNCA/AIOS opportunity | build | Audit log + policy prototype
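
The retrieval precision@10 validation listed for the context engineering layer can be computed along these lines. This is a minimal sketch using the standard definition (relevant hits in the top k, divided by k); the function name and the file names in the example are illustrative assumptions, not part of any FARE tooling.

```python
def precision_at_k(retrieved: list[str], relevant: set[str], k: int = 10) -> float:
    """Fraction of the top-k retrieved items that are relevant.

    Divides by k even when fewer than k items were retrieved, so a short
    result list is penalised rather than rewarded.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = retrieved[:k]
    hits = sum(1 for item in top_k if item in relevant)
    return hits / k


# Example: 10 retrieved files, 2 of which a reviewer marked as relevant.
retrieved_files = [f"{name}.py" for name in "abcdefghij"]
relevant_files = {"a.py", "c.py", "z.py"}
score = precision_at_k(retrieved_files, relevant_files, k=10)
```

Averaging this score over a fixed set of codebase questions would give one number per index version, which makes the index freshness counter-signal measurable over time.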

Fabbi Impact Coverage

Domain | Now 0-2w | Next 1-2m | Later 3-6m | Move
FARE | Repo context benchmark | Codebase RAG | Customer-specific knowledge layer | Trial
NEXA | Agent CLI harness | Ticket automation pilot | Multi-agent orchestration | Adopt guarded
SYNCA | Quality gates | Risk scoring | AI SDLC governance | Build
DOMUS | Monitor | AI ops assistant | Workflow automation | Monitor
Japan/VN/Global | Sales proof-points | JP enterprise sandbox story | Managed AI SDLC offer | Trial

CTO Recommendations

Action | ROI/time-saving | Risk | Owner | TTV | Validation
Run coding-agent harness on 50 real backlog tickets | 15-25% | 3/5 | Head of Eng | 2 weeks | Cycle time, review defects
Build SYNCA AI quality gate: eval + audit + HITL | 10-18% | 2/5 | QA/Platform Lead | 3 weeks | Defect escape, policy violations
Create FARE codebase context benchmark | 20-30% onboarding saving | 3/5 | AI Architect | 2-4 weeks | Precision@10, answer acceptance
Package Japan/VN AI-SDLC pilot offer | 5-12% presales lift | 2/5 | CDXO/Sales Eng | 1 week | 3 customer discovery calls
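
The cycle-time and defect-escape validations in the table above could be scored roughly as follows. This is a sketch under assumptions: using the median as the cycle-time statistic and defining escape rate as escaped defects over all known defects are choices made here for illustration, and the sample hours are invented.

```python
from statistics import median


def cycle_time_saving(baseline_hours: list[float], agent_hours: list[float]) -> float:
    """Relative reduction in median cycle time; 0.20 means 20% faster."""
    base = median(baseline_hours)
    if base <= 0:
        raise ValueError("baseline median must be positive")
    return (base - median(agent_hours)) / base


def defect_escape_rate(escaped: int, caught_in_review: int) -> float:
    """Share of all known defects that were not caught in review."""
    total = escaped + caught_in_review
    return escaped / total if total else 0.0


# Example with made-up numbers: baseline tickets vs. agent-assisted tickets.
saving = cycle_time_saving([10.0, 12.0, 8.0], [8.0, 9.0, 7.0])
escape = defect_escape_rate(escaped=2, caught_in_review=8)
```

A measured `saving` can then be compared against the 15-25% range stated for the 50-ticket pilot. The median is used so that a single stalled ticket does not dominate the result.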

Must-read Sources / Source Appendix

# | Platform | Source | Metric | Author/channel | Why
1 | github | openai/codex | 86436 stars/12645 forks/5292 issues | openai | repo momentum
2 | github | unoplat/unoplat-code-confluence | 88 stars/8 forks/140 issues | unoplat | repo momentum
3 | github | study8677/awesome-architecture | 600 stars/57 forks/0 issues | study8677 | repo momentum
4 | github | Dicklesworthstone/coding_agent_session_search | 793 stars/107 forks/2 issues | Dicklesworthstone | repo momentum
5 | github | DecapodLabs/decapod | 213 stars/21 forks/17 issues | DecapodLabs | repo momentum
6 | github | hoangnb24/harness-experimental | 345 stars/207 forks/1 issues | hoangnb24 | repo momentum
7 | github | multica-ai/multica | 33756 stars/4062 forks/775 issues | multica-ai | repo momentum
8 | github | conorbronsdon/avoid-ai-writing | 1590 stars/161 forks/4 issues | conorbronsdon | repo momentum
9 | papers_product | Gamma-World: Generative Multi-Agent World Modeling Beyond Two Players | arXiv paper | arXiv | paper/benchmark
10 | papers_product | Self-Improving Language Models with Bidirectional Evolutionary Search | arXiv paper | arXiv | paper/benchmark
11 | papers_product | Calibrating Conservatism for Scalable Oversight | arXiv paper | arXiv | paper/benchmark
12 | papers_product | Personal Visual Memory from Explicit and Implicit Evidence | arXiv paper | arXiv | paper/benchmark
13 | papers_product | OmniVerifier-M1: Multimodal Meta-Verifier with Explicit Structured Recalibration | arXiv paper | arXiv | paper/benchmark
14 | papers_product | Do Agents Need Semantic Metadata? A Comparative Study in Agentic Data Retrieval | arXiv paper | arXiv | paper/benchmark
15 | papers_product | From Pixels to Words -- Towards Native One-Vision Models at Scale | arXiv paper | arXiv | paper/benchmark
16 | papers_product | PEFT-Arena: Understanding Parameter-Efficient Finetuning from a Stability-Plasticity Perspective | arXiv paper | arXiv | paper/benchmark
17 | dev_web | With coding agents, specs feel more like source code | 1 pts/0 comments | OldDod | HN/dev discourse
18 | dev_web | A non-coding coding agent | 3 pts/0 comments | croottree | HN/dev discourse
19 | dev_web | Coding a Classical Robot Controller in the Age of Coding Agents | 2 pts/0 comments | ttmacer | HN/dev discourse
20 | dev_web | Show HN: VAEN – Package and import portable AI coding-agent Harnesses | 6 pts/3 comments | sjhalani7 | HN/dev discourse
21 | dev_web | DeepSWE Measuring frontier coding agents | 2 pts/1 comments | e2e4 | HN/dev discourse
22 | dev_web | Bill Gates AI on AI (one month later) | 3 pts/0 comments | vbutsomesayw | HN/dev discourse
23 | youtube | coding agent video | views N/A (public parse) | YouTube | video adoption/KOL
24 | youtube | coding agent video | views N/A (public parse) | YouTube | video adoption/KOL
25 | youtube | coding agent video | views N/A (public parse) | YouTube | video adoption/KOL
26 | youtube | coding agent video | views N/A (public parse) | YouTube | video adoption/KOL
27 | youtube | agentic programming video | views N/A (public parse) | YouTube | video adoption/KOL
28 | x | KOL/official feed: karpathy | engagement N/A (unauthenticated X) | @karpathy | KOL/official watchlist
29 | x | KOL/official feed: swyx | engagement N/A (unauthenticated X) | @swyx | KOL/official watchlist
30 | x | KOL/official feed: simonw | engagement N/A (unauthenticated X) | @simonw | KOL/official watchlist
31 | x | KOL/official feed: paulg | engagement N/A (unauthenticated X) | @paulg | KOL/official watchlist
32 | x | KOL/official feed: amasad | engagement N/A (unauthenticated X) | @amasad | KOL/official watchlist

Data Quality / Scan Health

  • Total candidates: 190; status: QUALITY_GATE_PARTIAL.
  • Breakdown: GitHub 82, HN/dev-web 45, papers/product 25, YouTube 24, X 14, Reddit 0, Facebook public 0.
  • Missing metrics: X engagement, YouTube views/comments, Reddit/Facebook sentiment = N/A due to public/auth/API constraints.
  • Confidence impact: -18 points; published because the scan has >100 candidates and >30 cited signals.
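
The scan-health figures above can be cross-checked with a few lines of Python. The per-platform counts and the publish thresholds (>100 candidates, >30 cited signals) come from this brief. The `scan_health` function, the `QUALITY_GATE_PASS` status name, and the reading that a platform with zero candidates makes the gate partial are assumptions made for illustration. The cited count of 32 is taken from the number of rows in the source appendix.

```python
BREAKDOWN = {
    "GitHub": 82,
    "HN/dev-web": 45,
    "papers/product": 25,
    "YouTube": 24,
    "X": 14,
    "Reddit": 0,
    "Facebook public": 0,
}


def scan_health(breakdown: dict[str, int], cited_signals: int,
                min_candidates: int = 100, min_cited: int = 30) -> dict:
    """Summarise a scan: total, blocked platforms, gate status, publish flag."""
    total = sum(breakdown.values())
    # Assumption: a platform that returned nothing counts as blocked.
    blocked = sorted(name for name, count in breakdown.items() if count == 0)
    status = "QUALITY_GATE_PARTIAL" if blocked else "QUALITY_GATE_PASS"
    publish = total > min_candidates and cited_signals > min_cited
    return {"total": total, "blocked": blocked, "status": status, "publish": publish}


report = scan_health(BREAKDOWN, cited_signals=32)
```

With these inputs the total matches the 190 candidates reported, and the two blocked platforms are the ones the gate note lists as N/A.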