
Testing Methodology

How every score is built.

Every number on Toolspect comes from the same documented process. This page explains what we test, how we test it, and how each dimension is weighted — so you can decide how much to trust the result.

Testing environment

Every tool is tested on a paid personal subscription purchased with a personal credit card. No press accounts. No vendor-provided credits. No trials extended by the company. This matters because vendor-provided access can bypass render queues, support tiers, and usage limits that a real paying customer encounters every day.

Hardware
Mac + Windows

M2 MacBook Pro (macOS Sequoia) and Windows 11. Both tested to surface platform-specific issues.

Browsers
Chrome + Firefox

Primary testing on Chrome. Firefox for comparison. Mobile apps tested separately on iOS where available.

Duration
2 weeks minimum

No review is published until at least two weeks of active testing are complete across all four scenarios.

Account type
Paid only

Personal subscription at the plan most relevant to independent B2B video teams. Enterprise tiers noted separately if tested.

Use cases
3 minimum

At least three distinct production use cases per tool, covering the core scenarios below.

Verification
Primary sources

All pricing figures and feature claims verified against the vendor’s live pricing page or help documentation.

Four core test scenarios

Every tool goes through the same four standardised scenarios before we write a word of the review. This keeps scores comparable across tools, rather than shaped by whatever each vendor chooses to show off in its demo.

01
Demo
The client demo script

A 90-second product demo produced from a standardised script using the tool’s primary workflow. Tests output quality, avatar realism where applicable, voice naturalness, and the gap between what the marketing shows and what you actually get at the plan being reviewed.

02
Training
The training module workflow

A five-minute instructional video built using screen recording, narration, and slide integration. Tests the practical workflow for the most common B2B video use case and surfaces problems with timeline editing, audio sync, and export quality.

03
Limits
The credit and limit stress test

We deliberately push against the plan’s stated limits — render minutes, storage, seats, API calls — to document where the tool slows down, locks features, or prompts upsells. This is where the real pricing picture becomes clear, and where the Pricing Transparency dimension of the score is primarily determined.

04
Support
The support contact test

A genuine support query, not a scripted dummy question, submitted during the testing period. We document response time, the quality of the answer, and whether the issue was resolved. Support quality is weighted in the score because it’s a real operational concern for any team relying on these tools for production work.

Score dimensions and weights

50%
Output Quality

Realism and fidelity of the final video output across all tested use cases. Includes avatar quality, voice naturalness, render consistency, and how closely output matches the preview shown before rendering.

25%
Pricing Transparency

Clarity of what you get at each price point. Hidden credit systems, buried render limits, and misleading plan comparisons all reduce this score. Tools with simple, honest pricing score higher.

15%
Ease of Use

Time from account creation to first usable video output, interface clarity, workflow logic, and how much the tool gets out of the way of the actual production work.

10%
Support Quality

Response time, accuracy, and resolution rate on genuine support queries. Tested at the paid tier being reviewed — not enterprise tier. Tested once, unannounced, during the standard testing period.
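The weighted blend above amounts to a normalized weighted average. The sketch below is illustrative only, not Toolspect’s actual implementation; the dimension names, the 0–10 scale, and the example weights and scores are all placeholder assumptions. Normalizing the weights inside the function means published percentages that don’t sum to exactly 100 still produce a sensible result.

```python
def composite_score(scores: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted average of per-dimension scores.

    Weights are normalized internally, so they may be given as
    percentages or fractions and need not sum to exactly 100 (or 1.0).
    """
    if scores.keys() != weights.keys():
        raise ValueError("scores and weights must cover the same dimensions")
    total = sum(weights.values())
    return sum(scores[d] * weights[d] for d in weights) / total


# Placeholder example: four dimensions scored on an assumed 0-10 scale,
# with illustrative weights (not the site's published ones).
weights = {"output_quality": 0.4, "pricing": 0.3, "ease_of_use": 0.2, "support": 0.1}
scores = {"output_quality": 8.0, "pricing": 6.0, "ease_of_use": 9.0, "support": 7.0}
print(round(composite_score(scores, weights), 1))  # → 7.5
```

Because the function normalizes, passing the same weights as percentages (40, 30, 20, 10) yields the same composite score.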

What we don’t do

🚫
No vendor demos

We don’t review tools based on vendor demos, press briefings, or beta access unless we’ve also tested the paid version independently.

🚫
No free-tier reviews

We don’t publish scores for tools we’ve only tested on free tiers. Free-tier behaviour doesn’t predict the paid-tier experience.

🚫
No incomplete tests

We don’t publish scores until all four test scenarios are complete. No exceptions for tight timelines or recently launched tools.

🚫
No commercial score adjustments

Scores are not adjusted in response to vendor feedback, affiliate commission rates, or any commercial consideration.

🔄
Scores are updated

When a tool changes its pricing, ships a major feature update, or degrades in quality, we re-test and update the score with a visible revision date.

🔗
Sources are cited

All pricing figures and feature claims are verified against primary sources. Where we cite third-party data, we link to the original.