Cited benchmark synthesis and free tools

Use published AI evidence without pretending we ran the benchmark.

Novamente AI Lab turns public benchmark data, verification workflows, and lightweight tools into something operators can actually use: compare coding assistants, sanity-check research output, score agent risk, and turn AI work into reviewable evidence.

7 benchmark synthesis pages
8 public leaderboard sources tracked
No first-party rankings invented

What do you need to check?

Build workflow | Score agent risk
7 benchmark pages now written as cited syntheses
5 free tools for shortlists, checklists, prompts, workflows, and risk scoring
0 claimed house rankings without a dated run log

What's inside

Three things the site can honestly defend.

Each surface ends in a reusable artifact: cited benchmark notes, a checklist, a shortlist, a workflow outline, a prompt test plan, or an agent launch score.

Cited benchmark synthesis

Public leaderboards for coding, tool use, hallucination, preference, speed, and cost are turned into dated operator notes with links back to the source.

Free verification tools

Generate a checklist, shortlist, workflow outline, prompt test skeleton, or launch-risk score without waiting for a backend.

Operator playbooks

Workflows and guides focus on review gates, failure modes, evidence capture, and how to keep AI output inside a process.

Method

Benchmark authority starts with saying what we did not do.

The site does not run first-party model evaluations today. It publishes a dated synthesis of public evidence, keeps a visible rubric for how we would score tools ourselves, and withholds rankings until there is a real, dated run log.

Honest synthesis

Every benchmark page should point back to public sources, name the date, and keep our interpretation separate from the source number.

Source-linked | Dated
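As a rough illustration of that rule, a single cited entry could be stored as a record that holds the source link, the date, and the source number in separate fields from our interpretation. This is a minimal sketch, not the site's actual data model: the class name, every field name, and the checks are assumptions made for the example, and the sample values are placeholders rather than real benchmark results.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class BenchmarkNote:
    """One cited entry on a synthesis page (all names here are hypothetical)."""
    source_name: str     # public leaderboard the number comes from
    source_url: str      # link back to the source
    retrieved_on: date   # date the number was read from the source
    reported_value: str  # the source's number, copied verbatim
    interpretation: str  # our reading, kept apart from the number

    def problems(self) -> list[str]:
        """Return the reasons this note is not yet ready to publish."""
        issues = []
        if not self.source_url.startswith(("http://", "https://")):
            issues.append("missing source link")
        if self.retrieved_on > date.today():
            issues.append("retrieval date is in the future")
        if not self.reported_value.strip():
            issues.append("missing source number")
        if not self.interpretation.strip():
            issues.append("missing interpretation")
        return issues


# Placeholder data only; not a real leaderboard or a real score.
note = BenchmarkNote(
    source_name="Example Leaderboard",
    source_url="https://example.com/leaderboard",
    retrieved_on=date(2024, 1, 15),
    reported_value="61.2%",
    interpretation="Placeholder reading of the number above.",
)
print(note.problems())
```

Keeping `reported_value` as a verbatim string is a deliberate choice in this sketch: the source's figure is never reformatted or rounded, so any rewording lives only in `interpretation`.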

Tool utility

The tools exist to produce something that can be copied into a team doc, ticket, or review note.

Reusable | Static and fast

Operator depth

Guides and workflows stay focused on evidence, failure modes, and control points instead of generic AI cheerleading.

Practical | No fake certainty

Free tools

Start with a practical decision.

The current tool set is still lightweight, but each tool is designed to produce a decision artifact you can keep. A deeper comparison matrix is the next major tool surface.

AI Tool Finder

Filter candidate AI tools by task, budget, privacy boundary, and validation needs.

Workflow Builder

Draft agent workflows with scope, tools, human gates, and failure modes.

Prompt Test Generator

Create fixtures, expected behavior notes, scoring rubrics, failure labels, and retest rules.
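To make those five pieces concrete, here is one way a prompt test skeleton might look in code. It is a sketch under stated assumptions, not the generator's real output format: `PromptTestCase`, `score`, `needs_retest`, the rubric criteria, and the 0.8 pass threshold are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class PromptTestCase:
    """A single prompt test (hypothetical structure)."""
    fixture: str                # input text given to the model
    expected_behavior: str      # plain-language note on what a good answer does
    rubric: dict[str, int]      # criterion -> maximum points
    failure_labels: list[str] = field(default_factory=list)  # labels a reviewer may assign


def score(case: PromptTestCase, awarded: dict[str, int]) -> int:
    """Sum reviewer-awarded points, clamping each criterion to [0, rubric maximum]."""
    total = 0
    for criterion, maximum in case.rubric.items():
        total += max(0, min(awarded.get(criterion, 0), maximum))
    return total


def needs_retest(case: PromptTestCase, awarded: dict[str, int],
                 pass_fraction: float = 0.8) -> bool:
    """Assumed retest rule: retest when the score falls below a fraction of the maximum."""
    return score(case, awarded) < pass_fraction * sum(case.rubric.values())


case = PromptTestCase(
    fixture="Summarize the attached changelog and cite the entries you used.",
    expected_behavior="Mentions only entries present in the changelog.",
    rubric={"cites_source": 2, "no_fabrication": 3},
    failure_labels=["fabricated_entry", "missing_citation"],
)

reviewer_points = {"cites_source": 2, "no_fabrication": 1}
print(score(case, reviewer_points), needs_retest(case, reviewer_points))
```

The scoring here is done by a human reviewer filling in `awarded`; nothing in the sketch calls a model or grades output automatically.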

Archive

Respect the name without impersonating the past.

Novamente.net has a history in AGI. The archive explains that history, routes old links to relevant context, and clearly states that this is an independent new project.