AI Tool Evaluation Criteria
A structured scoring framework for evaluating AI tools before procurement. Use this to compare options consistently, identify risks, and justify your selection to stakeholders.
THE REAL COST OF A WRONG CHOICE
Choosing the wrong AI tool costs more than money. It costs time, trust, and momentum.
The average UK organisation wastes 4–6 months and £15,000–£50,000 on an AI tool that doesn't deliver before realising the mistake. This framework is designed to stop that happening.
Not sure where to start? Our fractional AI director service includes independent tool selection — no vendor ties, no hidden incentives.
1. Evaluation Methodology
This framework provides a structured scoring approach for comparing AI tools. Score each tool against the criteria below on a 1–5 scale. Multiply each score by the weighting factor. Sum the weighted scores to produce a total evaluation score.
No AI tool should be procured or approved for use without completing this evaluation. For tools that will process personal data, a Data Protection Impact Assessment (DPIA) must also be completed before approval.
| Score | Definition |
|---|---|
| 1 | Does not meet requirement. Significant gap with no credible roadmap to resolution. |
| 3 | Partially meets requirement. Some gaps, but manageable with compensating controls. |
| 5 | Fully meets or exceeds requirement. No gaps. Market-leading capability in this area. |
2. Evaluation Scoring Framework
| Criterion | What to Assess | Weight | Score (1–5) |
|---|---|---|---|
| CAPABILITY (30% of total) | | | |
| Core Functionality | Does the tool do what you need it to do, reliably and accurately? | 10% | ____ |
| Output Quality | Is the quality of AI outputs sufficient for your use case without extensive human correction? | 10% | ____ |
| Customisation | Can the tool be configured, fine-tuned, or prompted to align with your specific context, terminology, and requirements? | 5% | ____ |
| Reliability & Uptime | What is the vendor’s documented uptime SLA? Is there evidence of production reliability at scale? | 5% | ____ |
| SECURITY (25% of total) | | | |
| Data Encryption | Is data encrypted in transit and at rest? What encryption standards are used? | 8% | ____ |
| Data Residency | Where is data stored and processed? Is UK/EEA data residency guaranteed? Are there standard contractual clauses for non-UK transfers? | 8% | ____ |
| Access Controls | Is role-based access control available? Is SSO/MFA supported? Can audit logs be exported? | 5% | ____ |
| Security Certifications | Does the vendor hold ISO 27001, SOC 2 Type II, Cyber Essentials, or equivalent certifications? | 4% | ____ |
| COMPLIANCE (20% of total) | | | |
| UK GDPR Compliance | Is the vendor compliant with UK GDPR? Is a Data Processing Agreement available? Has a DPIA been completed? | 8% | ____ |
| Training Data Transparency | Can the vendor confirm what data was used to train the model? Is there a risk of the model reproducing proprietary or personal data? | 6% | ____ |
| Output Ownership | Does the vendor claim any ownership or usage rights over content you create using the tool? Is IP ownership clearly assigned to you? | 6% | ____ |
| INTEGRATION (15% of total) | | | |
| API Availability | Is a well-documented API available for integration with your existing systems? | 6% | ____ |
| Existing System Compatibility | Does the tool integrate with your current Microsoft 365, Google Workspace, CRM, or ERP environment? | 6% | ____ |
| Implementation Complexity | What level of technical resource is required to deploy and maintain the integration? | 3% | ____ |
| COMMERCIAL (10% of total) | | | |
| Total Cost of Ownership | What is the full cost including licences, implementation, training, and ongoing support over 3 years? | 5% | ____ |
| Vendor Viability | Is the vendor financially stable? How long have they been trading? What is their client retention rate? | 3% | ____ |
| Exit Provisions | Can you export your data? Are there reasonable contract termination provisions? What is the data deletion commitment on exit? | 2% | ____ |
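The weighted-total calculation described in Section 1 can be sketched in Python. The criterion names and weights below are taken directly from the table above; the function name and example scores are illustrative, not part of the framework.

```python
# Weights copied from the evaluation scoring framework table (they sum to 1.0).
WEIGHTS = {
    "Core Functionality": 0.10, "Output Quality": 0.10,
    "Customisation": 0.05, "Reliability & Uptime": 0.05,
    "Data Encryption": 0.08, "Data Residency": 0.08,
    "Access Controls": 0.05, "Security Certifications": 0.04,
    "UK GDPR Compliance": 0.08, "Training Data Transparency": 0.06,
    "Output Ownership": 0.06,
    "API Availability": 0.06, "Existing System Compatibility": 0.06,
    "Implementation Complexity": 0.03,
    "Total Cost of Ownership": 0.05, "Vendor Viability": 0.03,
    "Exit Provisions": 0.02,
}

def weighted_total(scores: dict[str, int]) -> float:
    """Multiply each 1-5 score by its weight and sum; the result is on a 1-5 scale."""
    assert abs(sum(WEIGHTS.values()) - 1.0) < 1e-9  # sanity-check the weights
    return sum(scores[criterion] * weight for criterion, weight in WEIGHTS.items())
```

Because the weights sum to 100%, a tool scored 3 on every criterion produces a weighted total of exactly 3.0, which makes the thresholds in Section 3 easy to interpret.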
3. Non-Negotiable Requirements
Regardless of overall evaluation score, a tool must meet all of the following mandatory requirements before it can be approved for use. Any single failure is a disqualifying condition:
Automatic Disqualifiers
- No Data Processing Agreement available (if processing personal data)
- Data is processed or stored in countries without a UK adequacy decision or equivalent transfer safeguards
- Vendor claims ownership of outputs created using the tool
- No documented security certifications or completed security questionnaire
- Tool uses your data to train or improve public AI models without explicit opt-out
Minimum Acceptable Standards
- Overall weighted score of 3.0 or above
- Security section score of 3.5 or above
- Compliance section score of 3.5 or above
- Reference customers in your sector available for verification
- Vendor can provide a completed security questionnaire within 10 working days
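A minimal sketch of this pass/fail gate. The thresholds come from the list above; the framework does not specify exactly how section scores are computed, so this sketch assumes they are weight-averaged criterion scores within each section.

```python
# Section weights copied from the scoring framework table.
SECTIONS = {
    "Security": {"Data Encryption": 0.08, "Data Residency": 0.08,
                 "Access Controls": 0.05, "Security Certifications": 0.04},
    "Compliance": {"UK GDPR Compliance": 0.08,
                   "Training Data Transparency": 0.06,
                   "Output Ownership": 0.06},
}

def section_score(scores: dict[str, int], weights: dict[str, float]) -> float:
    """Weight-averaged score for one section, back on the 1-5 scale."""
    total_weight = sum(weights.values())
    return sum(scores[c] * w for c, w in weights.items()) / total_weight

def meets_minimums(overall: float, scores: dict[str, int],
                   disqualified: bool = False) -> bool:
    """Apply the minimum acceptable standards; any automatic disqualifier fails outright."""
    if disqualified:
        return False
    return (overall >= 3.0
            and section_score(scores, SECTIONS["Security"]) >= 3.5
            and section_score(scores, SECTIONS["Compliance"]) >= 3.5)
```

Note the asymmetry this encodes: a tool can clear the overall 3.0 bar while still failing on Security or Compliance alone, which is exactly the behaviour the mandatory requirements intend.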
4. Vendor Due Diligence Checklist
| Document / Action Required | Responsible | Completed |
|---|---|---|
| Complete evaluation scoring framework above | IT / Operations | [ ] |
| Request and review vendor security questionnaire | IT / Information Security | [ ] |
| Review vendor Data Processing Agreement (DPA) | DPO / Legal | [ ] |
| Complete Data Protection Impact Assessment (if processing personal data) | DPO | [ ] |
| Obtain and verify security certifications (ISO 27001, SOC 2, etc.) | IT | [ ] |
| Check data residency and international transfer provisions | DPO | [ ] |
| Review training data usage and opt-out provisions | IT / Legal | [ ] |
| Verify IP ownership terms for AI-generated outputs | Legal | [ ] |
| Obtain two sector-relevant reference contacts | Operations | [ ] |
| Complete total cost of ownership modelling (3-year) | Finance | [ ] |
| Obtain approval from AI Steering Committee | Operations Director | [ ] |
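The 3-year total-cost-of-ownership row above can be modelled as a simple sum of one-off and recurring costs. The cost categories follow the Commercial criteria in Section 2; all figures below are hypothetical placeholders, not vendor quotes.

```python
def three_year_tco(annual_licence: int, implementation: int,
                   training: int, annual_support: int) -> int:
    """One-off costs counted once; recurring costs counted for three years."""
    return implementation + training + 3 * (annual_licence + annual_support)

# Example with illustrative figures: £12,000/yr licences, £8,000 implementation,
# £3,000 training, £2,000/yr support:
# three_year_tco(12_000, 8_000, 3_000, 2_000) -> 53_000 (£53,000 over 3 years)
```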
How Popular AI Tools Score Against This Framework
To illustrate how to use the evaluation criteria, here is how three widely used AI tools compared when assessed against this framework for a typical UK professional services organisation.
| Tool | Best for |
|---|---|
| ChatGPT (OpenAI) | General productivity |
| Microsoft Copilot 365 | MS365 environments |
| Claude (Anthropic) | Complex analysis |
These scores are indicative and context-dependent. Your organisation’s existing infrastructure, compliance requirements, and use case will significantly affect the correct choice. Need independent evaluation? Book a free call.
Related resources: Once you’ve selected a tool, you’ll need governance to deploy it safely. See our AI Governance & Risk framework and AI Director services for implementation support.
FREE DOWNLOAD
AI Tool Scoring Spreadsheet
A pre-built spreadsheet for evaluating AI tools against the weighted criteria in this framework. Weight criteria by importance, score 2–5 tools side by side, and generate a recommendation with justification for board sign-off.
- Pre-loaded with security, GDPR, and contract red-flag scoring
- Worked example with 3 popular AI tools included
- Auto-calculates weighted scores and recommendation
- Includes governance and services internal links for context
Get the scoring spreadsheet:
UK GDPR compliant. No spam. Unsubscribe at any time.
Need Help Evaluating AI Tools?
AI-Si provides independent AI vendor evaluation as part of our fractional AI director services — helping you select the right tools without vendor bias.
BOOK YOUR FREE AI STRATEGY DISCUSSION NOW