How we calculate Content Relevance

By Rick van Haasteren
2 min read - last updated 28 Jan 2026

We turn your keyword cluster and your page into vectors (“embeddings”), compare each page section to the cluster with cosine similarity, and then combine the best-matching sections into one page score. A higher score means the page is more on-topic.

What Is an Embedding?

An embedding is a numeric representation of text. Similar ideas map to similar vectors in space. We use the same embedding model for your keyword cluster and for your page sections, so comparisons are apples-to-apples.
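
To make "similar ideas map to similar vectors" concrete, here is a minimal sketch with hand-made three-dimensional vectors. The numbers and phrases are made up for illustration and are not output from any real embedding model (real embeddings have hundreds of dimensions); the function name `cosine_similarity` is ours.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy "embeddings": invented values, only meant to show the idea.
toy_embeddings = {
    "seo audit": [0.9, 0.1, 0.0],
    "website seo check": [0.8, 0.2, 0.1],
    "banana bread recipe": [0.0, 0.1, 0.9],
}

related = cosine_similarity(
    toy_embeddings["seo audit"], toy_embeddings["website seo check"]
)
unrelated = cosine_similarity(
    toy_embeddings["seo audit"], toy_embeddings["banana bread recipe"]
)

# The two SEO phrases point in nearly the same direction, so `related`
# is close to 1, while `unrelated` is close to 0.
print(f"related: {related:.2f}, unrelated: {unrelated:.2f}")
```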

From Keywords to a Page Score (Step-by-Step)

1. Build the cluster centroid. We embed each keyword and average the embeddings into one vector, the centroid, that represents the overall topic.
2. Chunk the page into sections. We split the body into coherent sections (structure-aware windows) and embed each section.
3. Compare with cosine similarity. After normalizing the vectors, we compute the similarity between the centroid and every section (0 = unrelated, 1 = very strong match), shown in reports as a percentage. In practice, strong sections often land around 40–60%; 60%+ is excellent but uncommon.
4. Aggregate the best sections. Your Page Relevance is a weighted top-K average of the strongest sections (K ≈ 10 by default). The very best sections get slightly more weight because a few excellent matches matter more than many weak ones.
5. Show what helps (and what doesn’t). Reports list sections from most to least relevant and flag a few low-relevance sections you might prune or improve.
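
The centroid, comparison, and aggregation steps above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not SiteGuru's actual code: the function names are ours, the vectors are toy values, and the `decay` weighting (each next-best section counts 10% less) is a hypothetical stand-in for "the very best sections get slightly more weight".

```python
import math

def unit(v):
    """Scale a vector to length 1 (assumes it is not all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def centroid(vectors):
    """Element-wise mean of equal-length vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]

def section_similarities(keyword_vecs, section_vecs):
    """Cosine similarity between the keyword centroid and every section."""
    c = unit(centroid(keyword_vecs))
    # For unit vectors, the dot product equals the cosine similarity.
    return [sum(a * b for a, b in zip(unit(s), c)) for s in section_vecs]

def page_relevance(similarities, k=10, decay=0.9):
    """Weighted average of the k strongest sections (needs at least one).

    Hypothetical weighting: the i-th best section gets weight decay**i.
    """
    top = sorted(similarities, reverse=True)[:k]
    weights = [decay ** i for i in range(len(top))]
    return sum(w * s for w, s in zip(weights, top)) / sum(weights)

# Toy 2-dimensional vectors standing in for real embeddings.
keyword_vecs = [[1.0, 0.0], [0.8, 0.2]]
section_vecs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

sims = section_similarities(keyword_vecs, section_vecs)
score = page_relevance(sims, k=2)
print([round(s, 2) for s in sims], round(score, 2))
```

With `k=2`, the off-topic second section is ignored entirely, and the page score sits between the two strongest sections, pulled slightly toward the best one.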

How to Read the Numbers

Per-section similarity (0–100%)
- 0–15%: off-topic
- 15–30%: weakly related
- 30–45%: on-topic but light
- 45–60%: strong alignment
- 60%+: excellent alignment (rare)

Page Relevance (0–1): the weighted average of your best sections. Higher usually indicates clearer topical focus and better coverage.

Note: Ranges are guidelines; distributions vary by niche and intent.
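
If you export section scores and want to label them yourself, the bands above translate into a small lookup. This is a sketch with one assumption of ours: a value that sits exactly on a boundary (for example 15%) is placed in the higher band, since the ranges as listed share their endpoints. The name `similarity_band` is hypothetical.

```python
# Upper bound (exclusive) of each band, in percent, with its label.
BANDS = [
    (15, "off-topic"),
    (30, "weakly related"),
    (45, "on-topic but light"),
    (60, "strong alignment"),
]

def similarity_band(similarity_pct):
    """Map a per-section similarity (0-100) to its guideline label."""
    if not 0 <= similarity_pct <= 100:
        raise ValueError("similarity_pct must be between 0 and 100")
    for upper, label in BANDS:
        if similarity_pct < upper:
            return label
    return "excellent alignment"

for pct in (10, 15, 52, 60):
    print(pct, similarity_band(pct))
```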

Keyword Coverage and “Depth” Suggestions

Keyword coverage: For each cluster keyword, we find its best-matching section. If similarity is below a calibrated threshold, it’s marked uncovered and shown in Content Gaps.
Depth opportunities: We find sections in a mid-band (relevant but shallow) that miss high-importance subtopics. Those sections get a “go deeper” recommendation with the top missing subtopics.
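
The keyword coverage check can be pictured as follows. This is a rough sketch, not the production logic: it assumes the keyword-to-section similarities have already been computed, the function name `keyword_coverage` is ours, and the `threshold` of 0.30 is a placeholder, since the real calibrated value is not stated here.

```python
def keyword_coverage(keyword_section_sims, threshold=0.30):
    """For each keyword, find its best-matching section and flag coverage.

    keyword_section_sims maps a keyword to a non-empty list of similarities,
    one per page section. `threshold` is a hypothetical placeholder value.
    """
    report = {}
    for keyword, sims in keyword_section_sims.items():
        best_index = max(range(len(sims)), key=lambda i: sims[i])
        best = sims[best_index]
        report[keyword] = {
            "best_section": best_index,
            "similarity": best,
            "covered": best >= threshold,
        }
    return report

# Toy similarities for a page with three sections.
example_sims = {
    "seo audit": [0.55, 0.20, 0.10],
    "seo audit pricing": [0.22, 0.18, 0.05],
}

report = keyword_coverage(example_sims)
content_gaps = [kw for kw, row in report.items() if not row["covered"]]
print(content_gaps)
```

In this toy example, "seo audit" is covered by the first section, while "seo audit pricing" never rises above the placeholder threshold and would show up as a content gap.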

Why Two Pages Can Score Differently

Focus vs. breadth: A tight, focused page can beat a longer but scattered page.
Structure: Clear headings and cohesive sections help the model recognize the topic.
Language & consistency: Using the same language and model for keywords and content keeps relevance scores reliable.

How to Improve Your Score (and Rankings)

Strengthen your best sections with examples, data, or step-by-step guidance.
Fill Content Gaps: cover uncovered, high-importance subtopics first.
Add helpful modules where relevant: comparison tables, FAQs, pricing/ROI, checklists, citations.
Keep sections cohesive: one subtopic per section with clear headings.

FAQ

Is this keyword density?
No. It’s semantic similarity: coverage of ideas, not raw keyword counts. Because embeddings capture meaning rather than exact wording, a section can score well without repeating the keywords verbatim.

Do titles and H1s matter?
Yes. Clear, on-topic headings improve alignment and help readers and models.

Can a single great section rescue a page?
It helps, because our aggregation gives extra weight to your strongest sections, but broad, useful coverage still matters.

Rick van Haasteren

Rick van Haasteren loves SEO and building great tools.

Rick has worked as an SEO specialist for many large, international clients, and also has wide experience in developing websites and applications.

One more thing: Rick is the founder and owner of SiteGuru.
