OpenAI Codex CLI vs Cursor Agent: Benchmark Comparison
Independent benchmark data · Real published scores only
📊 SWE-bench Verified
Key Metrics
| Metric | OpenAI Codex CLI | Cursor Agent |
|---|---|---|
| Trust Score V2 | 68.4 | 63.9 |
| Functional Accuracy | 69.1 | 51.7 |
| Reliability Score | 63.7 | 65.2 |
| Policy Compliance | 90.1 | 91.5 |
| SWE-bench Pass@1 | 0.7% | 0.5% |
| Benchmark | SWE-bench Verified | SWE-bench Verified |
| Last Evaluated | Mar 13, 2026 | Mar 13, 2026 |
| Model Base | o3 | Claude 3.5 Sonnet |
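
The per-metric gap between the two agents can be read off the table with simple subtraction. The following is a minimal sketch, not part of the published benchmark: the `scores` dictionary and the `score_gaps` helper are illustrative names introduced here, and only the four numeric score rows are included (the Pass@1 row is left out because it is reported on a different scale).

```python
# Scores copied from the Key Metrics table as (OpenAI Codex CLI, Cursor Agent).
# The dictionary layout and the helper name are assumptions for illustration.
scores = {
    "Trust Score V2": (68.4, 63.9),
    "Functional Accuracy": (69.1, 51.7),
    "Reliability Score": (63.7, 65.2),
    "Policy Compliance": (90.1, 91.5),
}


def score_gaps(table):
    """Return {metric: codex_score - cursor_score}, rounded to one decimal."""
    return {metric: round(codex - cursor, 1) for metric, (codex, cursor) in table.items()}


gaps = score_gaps(scores)
for metric, gap in gaps.items():
    # A positive gap means OpenAI Codex CLI scored higher; there are no ties in this table.
    leader = "OpenAI Codex CLI" if gap > 0 else "Cursor Agent"
    print(f"{metric}: {leader} leads by {abs(gap):.1f}")
```

On these numbers, OpenAI Codex CLI is ahead on Trust Score V2 and Functional Accuracy, while Cursor Agent is slightly ahead on Reliability Score and Policy Compliance.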