Payload
{
"action": "edited",
"issue": {
"url": "https://api.github.com/repos/darkmatter/nixmac/issues/108",
"repository_url": "https://api.github.com/repos/darkmatter/nixmac",
"labels_url": "https://api.github.com/repos/darkmatter/nixmac/issues/108/labels{/name}",
"comments_url": "https://api.github.com/repos/darkmatter/nixmac/issues/108/comments",
"events_url": "https://api.github.com/repos/darkmatter/nixmac/issues/108/events",
"html_url": "https://github.com/darkmatter/nixmac/issues/108",
"id": 4402634906,
"node_id": "I_kwDOSB6EzM8AAAABBmrgmg",
"number": 108,
"title": "Stabilize and benchmark B200 / gpt-oss-120b for nixmac usage",
"user": {
"login": "linear-code[bot]",
"id": 222613912,
"node_id": "BOT_kgDODUTRmA",
"avatar_url": "https://avatars.githubusercontent.com/in/1658531?v=4",
"gravatar_id": "",
"url": "https://api.github.com/users/linear-code%5Bbot%5D",
"html_url": "https://github.com/apps/linear-code",
"followers_url": "https://api.github.com/users/linear-code%5Bbot%5D/followers",
"following_url": "https://api.github.com/users/linear-code%5Bbot%5D/following{/other_user}",
"gists_url": "https://api.github.com/users/linear-code%5Bbot%5D/gists{/gist_id}",
"starred_url": "https://api.github.com/users/linear-code%5Bbot%5D/starred{/owner}{/repo}",
"subscriptions_url": "https://api.github.com/users/linear-code%5Bbot%5D/subscriptions",
"organizations_url": "https://api.github.com/users/linear-code%5Bbot%5D/orgs",
"repos_url": "https://api.github.com/users/linear-code%5Bbot%5D/repos",
"events_url": "https://api.github.com/users/linear-code%5Bbot%5D/events{/privacy}",
"received_events_url": "https://api.github.com/users/linear-code%5Bbot%5D/received_events",
"type": "Bot",
"user_view_type": "public",
"site_admin": false
},
"labels": [
{
"id": 10893040504,
"node_id": "LA_kwDOSB6EzM8AAAACiUabeA",
"url": "https://api.github.com/repos/darkmatter/nixmac/labels/Feature",
"name": "Feature",
"color": "ededed",
"default": false,
"description": null
}
],
"state": "open",
"locked": false,
"assignees": [],
"milestone": null,
"comments": 0,
"created_at": "2026-05-07T23:33:36Z",
"updated_at": "2026-06-01T06:44:25Z",
"closed_at": null,
"assignee": null,
"author_association": "NONE",
"issue_field_values": [],
"type": null,
"active_lock_reason": null,
"sub_issues_summary": {
"total": 0,
"completed": 0,
"percent_completed": 0
},
"parent_issue_url": "https://api.github.com/repos/darkmatter/nixmac/issues/100",
"issue_dependencies_summary": {
"blocked_by": 0,
"total_blocked_by": 0,
"blocking": 0,
"total_blocking": 0
},
"body": "## Context\n\nFarhan asked whether the Dark Matter B200 running gpt-oss-120b is good enough to use for all of nixmac in its current shape, and whether that creates an obvious flat-cost monetization path.\n\n## Scope\n\n* Define a nixmac eval set that represents current app workloads: config generation, Homebrew/diff summarization, relation/group allocation, error repair, longer multi-turn evolutions, and structured-output tasks.\n* Route those evals through the B200 gpt-oss-120b path via the current LiteLLM/proxy stack.\n* Compare against the current cloud-provider baseline on pass rate, latency, token usage, structured-output validity, empty responses, max-token failures, and connection resets.\n* Identify prompt/token-budget mitigations needed for B200 viability.\n* Produce a go/no-go recommendation for one of: default hosted path, lower-cost tier only, internal/dev only, or not viable yet.\n\n## Acceptance criteria\n\n* Evals include both short interactive tasks and long-running evolution/summarization tasks.\n* Results quantify reliability, latency, token usage, and failure modes against the current baseline.\n* Empty response, max-token, structured-output, and connection-reset failure modes are explicitly tracked.\n* Recommendation states whether B200/gpt-oss-120b is good enough for all of nixmac today, a subset of nixmac, or not yet.\n* Follow-up implementation tickets are created only after the benchmark produces a clear routing recommendation.\n\n## Related work\n\n* [ENG-420](https://linear.app/darkmatterlabs/issue/ENG-420/centralized-llm-proxy-for-provider-consistency) centralized LLM proxy/provider consistency.\n* [ENG-214](https://linear.app/darkmatterlabs/issue/ENG-214/token-usage-visibility-and-cost-awareness-for-cloud-ai-providers) BYO/cloud token usage visibility.\n* Existing eval/performance tickets for provider quality and routing.\n\n---\n\n## Acceptance Criteria / Gherkin Specs\n\n```gherkin\nScenario: Eval set covers all key nixmac workloads\n Given the eval set is defined\n Then it includes tasks covering: config generation, Homebrew/diff summarization, relation/group allocation, error repair, multi-turn evolutions, and structured-output tasks\n And both short interactive tasks and longer-running tasks are represented\n\nScenario: B200/gpt-oss-120b results are measured against a cloud baseline\n Given the eval set is run against both B200 and the current cloud baseline\n Then results are recorded for: pass rate, latency, token usage, structured-output validity, empty responses, max-token failures, and connection resets\n And the comparison is quantitative, not just qualitative\n\nScenario: Empty response and connection-reset failure modes are explicitly tracked\n Given the eval runs against B200\n Then empty responses, max-token failures, structured-output failures, and connection resets are counted separately\n And these failure modes are reported in the benchmark results\n\nScenario: Benchmark produces a go/no-go recommendation\n Given the benchmark results are analyzed\n Then a clear recommendation is produced: one of \"default hosted path\", \"lower-cost tier only\", \"internal/dev only\", or \"not viable yet\"\n And the recommendation is backed by the quantitative eval results\n\nScenario: No hosted-usage productization work starts until the benchmark recommends it\n Given the benchmark has not yet produced a go recommendation\n Then no usage-credit accounting, payment tier, or hosted-usage billing is built\n And only after a clear go recommendation are ENG-460 follow-up implementation tickets created\n```",
"reactions": {
"url": "https://api.github.com/repos/darkmatter/nixmac/issues/108/reactions",
"total_count": 0,
"+1": 0,
"-1": 0,
"laugh": 0,
"hooray": 0,
"confused": 0,
"heart": 0,
"rocket": 0,
"eyes": 0
},
"timeline_url": "https://api.github.com/repos/darkmatter/nixmac/issues/108/timeline",
"performed_via_github_app": null,
"state_reason": null,
"pinned_comment": null
},
"changes": {
"body": {
"from": "## Context\n\nFarhan asked whether the Dark Matter B200 running gpt-oss-120b is good enough to use for all of nixmac in its current shape, and whether that creates an obvious flat-cost monetization path.\n\n## Scope\n\n* Define a nixmac eval set that represents current app workloads: config generation, Homebrew/diff summarization, relation/group allocation, error repair, longer multi-turn evolutions, and structured-output tasks.\n* Route those evals through the B200 gpt-oss-120b path via the current LiteLLM/proxy stack.\n* Compare against the current cloud-provider baseline on pass rate, latency, token usage, structured-output validity, empty responses, max-token failures, and connection resets.\n* Identify prompt/token-budget mitigations needed for B200 viability.\n* Produce a go/no-go recommendation for one of: default hosted path, lower-cost tier only, internal/dev only, or not viable yet.\n\n## Acceptance criteria\n\n* Evals include both short interactive tasks and long-running evolution/summarization tasks.\n* Results quantify reliability, latency, token usage, and failure modes against the current baseline.\n* Empty response, max-token, structured-output, and connection-reset failure modes are explicitly tracked.\n* Recommendation states whether B200/gpt-oss-120b is good enough for all of nixmac today, a subset of nixmac, or not yet.\n* Follow-up implementation tickets are created only after the benchmark produces a clear routing recommendation.\n\n## Related work\n\n* [ENG-420](https://linear.app/darkmatterlabs/issue/ENG-420/centralized-llm-proxy-for-provider-consistency) centralized LLM proxy/provider consistency.\n* [ENG-214](https://linear.app/darkmatterlabs/issue/ENG-214/token-usage-visibility-and-cost-awareness-for-cloud-ai-providers) BYO/cloud token usage visibility.\n* Existing eval/performance tickets for provider quality and routing."
}
},
"repository": {
"id": 1209959628,
"node_id": "R_kgDOSB6EzA",
"name": "nixmac",
"full_name": "darkmatter/nixmac",
"private": false,
"owner": {
"login": "darkmatter",
"id": 17834193,
"node_id": "MDEyOk9yZ2FuaXphdGlvbjE3ODM0MTkz",
"avatar_url": "https://avatars.githubusercontent.com/u/17834193?v=4",
"gravatar_id": "",
"url": "https://api.github.com/users/darkmatter",
"html_url": "https://github.com/darkmatter",
"followers_url": "https://api.github.com/users/darkmatter/followers",
"following_url": "https://api.github.com/users/darkmatter/following{/other_user}",
"gists_url": "https://api.github.com/users/darkmatter/gists{/gist_id}",
"starred_url": "https://api.github.com/users/darkmatter/starred{/owner}{/repo}",
"subscriptions_url": "https://api.github.com/users/darkmatter/subscriptions",
"organizations_url": "https://api.github.com/users/darkmatter/orgs",
"repos_url": "https://api.github.com/users/darkmatter/repos",
"events_url": "https://api.github.com/users/darkmatter/events{/privacy}",
"received_events_url": "https://api.github.com/users/darkmatter/received_events",
"type": "Organization",
"user_view_type": "public",
"site_admin": false
},
"html_url": "https://github.com/darkmatter/nixmac",
"description": "Home manager and nix-darwin that understands plain English",
"fork": false,
"url": "https://api.github.com/repos/darkmatter/nixmac",
"forks_url": "https://api.github.com/repos/darkmatter/nixmac/forks",
"keys_url": "https://api.github.com/repos/darkmatter/nixmac/keys{/key_id}",
"collaborators_url": "https://api.github.com/repos/darkmatter/nixmac/collaborators{/collaborator}",
"teams_url": "https://api.github.com/repos/darkmatter/nixmac/teams",
"hooks_url": "https://api.github.com/repos/darkmatter/nixmac/hooks",
"issue_events_url": "https://api.github.com/repos/darkmatter/nixmac/issues/events{/number}",
"events_url": "https://api.github.com/repos/darkmatter/nixmac/events",
"assignees_url": "https://api.github.com/repos/darkmatter/nixmac/assignees{/user}",
"branches_url": "https://api.github.com/repos/darkmatter/nixmac/branches{/branch}",
"tags_url": "https://api.github.com/repos/darkmatter/nixmac/tags",
"blobs_url": "https://api.github.com/repos/darkmatter/nixmac/git/blobs{/sha}",
"git_tags_url": "https://api.github.com/repos/darkmatter/nixmac/git/tags{/sha}",
"git_refs_url": "https://api.github.com/repos/darkmatter/nixmac/git/refs{/sha}",
"trees_url": "https://api.github.com/repos/darkmatter/nixmac/git/trees{/sha}",
"statuses_url": "https://api.github.com/repos/darkmatter/nixmac/statuses/{sha}",
"languages_url": "https://api.github.com/repos/darkmatter/nixmac/languages",
"stargazers_url": "https://api.github.com/repos/darkmatter/nixmac/stargazers",
"contributors_url": "https://api.github.com/repos/darkmatter/nixmac/contributors",
"subscribers_url": "https://api.github.com/repos/darkmatter/nixmac/subscribers",
"subscription_url": "https://api.github.com/repos/darkmatter/nixmac/subscription",
"commits_url": "https://api.github.com/repos/darkmatter/nixmac/commits{/sha}",
"git_commits_url": "https://api.github.com/repos/darkmatter/nixmac/git/commits{/sha}",
"comments_url": "https://api.github.com/repos/darkmatter/nixmac/comments{/number}",
"issue_comment_url": "https://api.github.com/repos/darkmatter/nixmac/issues/comments{/number}",
"contents_url": "https://api.github.com/repos/darkmatter/nixmac/contents/{+path}",
"compare_url": "https://api.github.com/repos/darkmatter/nixmac/compare/{base}...{head}",
"merges_url": "https://api.github.com/repos/darkmatter/nixmac/merges",
"archive_url": "https://api.github.com/repos/darkmatter/nixmac/{archive_format}{/ref}",
"downloads_url": "https://api.github.com/repos/darkmatter/nixmac/downloads",
"issues_url": "https://api.github.com/repos/darkmatter/nixmac/issues{/number}",
"pulls_url": "https://api.github.com/repos/darkmatter/nixmac/pulls{/number}",
"milestones_url": "https://api.github.com/repos/darkmatter/nixmac/milestones{/number}",
"notifications_url": "https://api.github.com/repos/darkmatter/nixmac/notifications{?since,all,participating}",
"labels_url": "https://api.github.com/repos/darkmatter/nixmac/labels{/name}",
"releases_url": "https://api.github.com/repos/darkmatter/nixmac/releases{/id}",
"deployments_url": "https://api.github.com/repos/darkmatter/nixmac/deployments",
"created_at": "2026-04-14T00:37:13Z",
"updated_at": "2026-06-01T06:15:50Z",
"pushed_at": "2026-06-01T06:32:03Z",
"git_url": "git://github.com/darkmatter/nixmac.git",
"ssh_url": "git@github.com:darkmatter/nixmac.git",
"clone_url": "https://github.com/darkmatter/nixmac.git",
"svn_url": "https://github.com/darkmatter/nixmac",
"homepage": "https://nixmac.com",
"size": 678800,
"stargazers_count": 5,
"watchers_count": 5,
"language": "Rust",
"has_issues": true,
"has_projects": true,
"has_downloads": true,
"has_wiki": true,
"has_pages": false,
"has_discussions": false,
"forks_count": 1,
"mirror_url": null,
"archived": false,
"disabled": false,
"open_issues_count": 77,
"license": {
"key": "mit",
"name": "MIT License",
"spdx_id": "MIT",
"url": "https://api.github.com/licenses/mit",
"node_id": "MDc6TGljZW5zZTEz"
},
"allow_forking": true,
"is_template": false,
"web_commit_signoff_required": false,
"has_pull_requests": true,
"pull_request_creation_policy": "all",
"topics": [
"home-manager",
"nix",
"nix-darwin",
"nix-flake",
"opencode"
],
"visibility": "public",
"forks": 1,
"open_issues": 77,
"watchers": 5,
"default_branch": "develop",
"custom_properties": {}
},
"organization": {
"login": "darkmatter",
"id": 17834193,
"node_id": "MDEyOk9yZ2FuaXphdGlvbjE3ODM0MTkz",
"url": "https://api.github.com/orgs/darkmatter",
"repos_url": "https://api.github.com/orgs/darkmatter/repos",
"events_url": "https://api.github.com/orgs/darkmatter/events",
"hooks_url": "https://api.github.com/orgs/darkmatter/hooks",
"issues_url": "https://api.github.com/orgs/darkmatter/issues",
"members_url": "https://api.github.com/orgs/darkmatter/members{/member}",
"public_members_url": "https://api.github.com/orgs/darkmatter/public_members{/member}",
"avatar_url": "https://avatars.githubusercontent.com/u/17834193?v=4",
"description": ""
},
"enterprise": {
"id": 469843,
"slug": "darkmatter",
"name": "darkmatter",
"node_id": "E_kgDOAAcrUw",
"avatar_url": "https://avatars.githubusercontent.com/b/469843?v=4",
"description": "",
"website_url": "darkmatter.io",
"html_url": "https://github.com/enterprises/darkmatter",
"created_at": "2025-09-07T16:01:00Z",
"updated_at": "2026-05-09T15:34:55Z"
},
"sender": {
"login": "czxtm",
"id": 1325802,
"node_id": "MDQ6VXNlcjEzMjU4MDI=",
"avatar_url": "https://avatars.githubusercontent.com/u/1325802?v=4",
"gravatar_id": "",
"url": "https://api.github.com/users/czxtm",
"html_url": "https://github.com/czxtm",
"followers_url": "https://api.github.com/users/czxtm/followers",
"following_url": "https://api.github.com/users/czxtm/following{/other_user}",
"gists_url": "https://api.github.com/users/czxtm/gists{/gist_id}",
"starred_url": "https://api.github.com/users/czxtm/starred{/owner}{/repo}",
"subscriptions_url": "https://api.github.com/users/czxtm/subscriptions",
"organizations_url": "https://api.github.com/users/czxtm/orgs",
"repos_url": "https://api.github.com/users/czxtm/repos",
"events_url": "https://api.github.com/users/czxtm/events{/privacy}",
"received_events_url": "https://api.github.com/users/czxtm/received_events",
"type": "User",
"user_view_type": "public",
"site_admin": false
},
"installation": {
"id": 131074261,
"node_id": "MDIzOkludGVncmF0aW9uSW5zdGFsbGF0aW9uMTMxMDc0MjYx"
}
}