Workspace-Bench Official Leaderboard

Public Lite Rankings

Public Workspace-Bench-Lite rows, one per harness/model pair, generated from the latest detailed rubric pass table.

Workspace-Bench Leaderboards

Framework x Model Matrix

Matrix view of public Workspace-Bench-Lite rubric pass rates. Blank cells mean the latest detailed result table does not contain that framework/model combination.
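As an illustration of how such a matrix can be assembled, the sketch below pivots result rows into a framework-by-model grid and leaves a cell blank when a combination is missing. The row layout, the names `ROWS`, `build_matrix`, and `render`, and all values are hypothetical placeholders, not the leaderboard's actual data or code.

```python
# Hypothetical result rows: (framework, model, rubric pass rate in percent).
# These values are placeholders, not real leaderboard results.
ROWS = [
    ("harness-a", "model-x", 41.5),
    ("harness-a", "model-y", 38.0),
    ("harness-b", "model-x", 44.2),
]


def build_matrix(rows):
    """Pivot rows into {framework: {model: rate or None}}."""
    frameworks = sorted({f for f, _, _ in rows})
    models = sorted({m for _, m, _ in rows})
    lookup = {(f, m): rate for f, m, rate in rows}
    # dict.get returns None for combinations absent from the table.
    matrix = {f: {m: lookup.get((f, m)) for m in models} for f in frameworks}
    return matrix, models


def render(matrix, models):
    """Render the matrix as tab-separated text; missing cells stay blank."""
    lines = ["framework\t" + "\t".join(models)]
    for framework, cells in matrix.items():
        values = ["" if cells[m] is None else f"{cells[m]:.1f}" for m in models]
        lines.append(framework + "\t" + "\t".join(values))
    return "\n".join(lines)


matrix, models = build_matrix(ROWS)
print(render(matrix, models))
```

Keeping a missing combination as `None` rather than 0 preserves the distinction between "not evaluated" and "evaluated with a zero pass rate".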

Threshold Views

Compare the average number of Lite tasks passed at each rubric threshold, using the detailed pass_at columns.
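One plausible reading of a threshold view is sketched below. It assumes that a task counts as passed at threshold t when its rubric pass fraction is at least t, and that counts are averaged across runs. The thresholds, the function names `passed_at` and `average_passed`, and the sample scores are all illustrative assumptions; the actual pass_at column definitions may differ.

```python
# Assumed thresholds; the real pass_at columns may use different cut-offs.
THRESHOLDS = (0.5, 0.8, 1.0)


def passed_at(task_scores, threshold):
    """Count tasks whose rubric pass fraction meets the threshold."""
    return sum(1 for score in task_scores if score >= threshold)


def average_passed(runs, thresholds=THRESHOLDS):
    """Average the per-run passed-task count at each threshold."""
    return {
        t: sum(passed_at(run, t) for run in runs) / len(runs)
        for t in thresholds
    }


# Hypothetical per-task rubric pass fractions for two runs of four tasks.
runs = [
    [1.0, 0.8, 0.4, 0.6],
    [1.0, 1.0, 0.9, 0.2],
]

summary = average_passed(runs)
for threshold, avg in summary.items():
    print(f"pass_at_{threshold}: {avg:.1f} tasks")
```

Stricter thresholds can only keep or lower the count, so the averages are non-increasing from left to right in any such view.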

Public threshold summary

Composition Analysis

Compare the task-ability composition of the full and Lite benchmarks, taken directly from the latest official metadata analysis.

Leaderboard Analysis

Secondary views derived from the released Lite leaderboard rows and benchmark metadata.