Third-party benchmark synthesis

Compare AI on published benchmarks — without a faked ranking.

Novamente AI Lab turns public AI leaderboards into dated, source-linked operator notes. We report what third parties measured, and never invent a house ranking.

7 benchmark pages
8 public sources tracked
0 rankings invented

What's inside

Three things the site can honestly defend.

Each surface ends in a reusable artifact: cited benchmark notes, a checklist, a shortlist, a workflow outline, a prompt test plan, or an agent launch score.

Cited benchmark synthesis

Public leaderboards for coding, tool use, hallucination, preference, speed, and cost are turned into dated operator notes with links back to the source.

Free verification tools

Generate a checklist, shortlist, workflow outline, prompt test skeleton, or launch-risk score without waiting for a backend.

Operator playbooks

Workflows and guides focus on review gates, failure modes, evidence capture, and how to keep AI output inside a process.

Method

Benchmark authority starts with saying what we did not do.

The site does not run first-party model evaluations today. It publishes a dated synthesis of public evidence, keeps a visible rubric for how it would score tools itself, and withholds rankings until there is a real run log to back them.

Honest synthesis

Every benchmark page should point back to public sources, name the date, and keep our interpretation separate from the source number.

Source-linked · Dated
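
As a rough illustration of that rule, a benchmark note could be modeled as a record that stores the source figure and our interpretation in separate fields. This is a minimal sketch, not the site's actual schema: the `BenchmarkNote` class, its field names, the `missing_fields` helper, and every value below are hypothetical placeholders.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class BenchmarkNote:
    """One dated, source-linked note (hypothetical schema for illustration)."""
    benchmark: str        # name of the public leaderboard
    source_url: str       # link back to the third-party source
    retrieved_on: date    # the date the figure was read from the source
    source_value: str     # the figure exactly as the source reports it
    interpretation: str   # our reading, kept apart from the source number


def missing_fields(note: BenchmarkNote) -> list[str]:
    """Return the names of required text fields that are empty."""
    required = ("benchmark", "source_url", "source_value", "interpretation")
    return [name for name in required if not getattr(note, name).strip()]


# All values here are placeholders, not real benchmark data.
note = BenchmarkNote(
    benchmark="example-coding-leaderboard",
    source_url="https://example.com/leaderboard",
    retrieved_on=date(2025, 1, 1),
    source_value="placeholder",
    interpretation="",
)
print(missing_fields(note))
```

Keeping `source_value` and `interpretation` as distinct fields means a page can always show the third-party number untouched, with commentary beside it rather than mixed into it.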

Tool utility

The tools exist to produce something that can be copied into a team doc, ticket, or review note.

Reusable · Static and fast

Operator depth

Guides and workflows stay focused on evidence, failure modes, and control points instead of generic AI cheerleading.

Practical · No fake certainty

Free tools

Start with a practical decision.

The current tool set is still lightweight, but each tool is designed to produce a decision artifact you can keep. A deeper comparison matrix is the next major tool surface.

AI Tool Finder

Filter candidate AI tools by task, budget, privacy boundary, and validation needs.

Workflow Builder

Draft agent workflows with scope, tools, human gates, and failure modes.

Prompt Test Generator

Create fixtures, expected behavior notes, scoring rubrics, failure labels, and retest rules.
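
To make the prompt test skeleton more concrete, here is a minimal sketch of what one test case might hold, assuming a simple points-based rubric. The `PromptTestCase` class, its fields, the `score` function, and the example criteria are hypothetical and do not describe the Prompt Test Generator's real output format.

```python
from dataclasses import dataclass, field


@dataclass
class PromptTestCase:
    """One entry in a prompt test plan (hypothetical structure)."""
    fixture: str                # input prompt or document under test
    expected_behavior: str      # note describing what a good answer does
    rubric: dict[str, int]      # criterion -> maximum points
    failure_labels: list[str] = field(default_factory=list)
    retest_rule: str = "retest after any prompt or model change"


def score(case: PromptTestCase, awarded: dict[str, int]) -> float:
    """Fraction of rubric points earned, clamped to each criterion's range."""
    total = sum(case.rubric.values())
    if total == 0:
        return 0.0
    earned = sum(
        min(max(awarded.get(criterion, 0), 0), cap)
        for criterion, cap in case.rubric.items()
    )
    return earned / total


# Example criteria and points are illustrative only.
case = PromptTestCase(
    fixture="Summarize the attached changelog in three bullets.",
    expected_behavior="Three bullets, no claims absent from the changelog.",
    rubric={"cites_source": 2, "stays_in_scope": 2},
    failure_labels=["invented_detail", "wrong_length"],
)
result = score(case, {"cites_source": 2, "stays_in_scope": 1})
print(result)
```

A reviewer fills in the awarded points by hand; the code only does the arithmetic, so the human gate stays in the loop.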

Archive

Respect the name without impersonating the past.

Novamente.net has a history in AGI. The archive explains that history, routes old links to relevant context, and clearly states that this is an independent new project.