Methodology

How Seeker Analyzes 200,000+ Job Listings

Every number on this site comes from a real, continuously updated job corpus. Here is exactly how we collect, process, and match jobs to resumes.

Data sources

Seeker ingests job listings from 18 verified sources across three tiers:

Tier 1: Company Career Pages

Trust: 95%

Direct API connections to applicant tracking systems. Highest data quality: full descriptions, structured skills, salary data.

Greenhouse, Workday, Ashby, SmartRecruiters, Apple, Amazon, Kaiser Permanente

Tier 2: Curated Job Boards

Trust: 70-80%

Aggregators and specialty boards with editorial curation. Good data quality with moderate skill extraction rates.

Himalayas, Adzuna, USAJobs, Jooble, Reed, TheMuse, FindWork, Remotive

Tier 3: Supplemental Sources

Trust: 40-60%

Additional coverage for niche domains. Lower trust score, used to fill geographic and domain gaps.

Jobicy, Careerjet, JSearch

Each source has a trust score on a 0.0-1.0 scale (shown above as a percentage) that reflects data quality, description depth, and skill extraction reliability. Trust scores influence match confidence but do not filter results.

Update frequency

The corpus is updated continuously via automated ingestion cycles:

Tier 1 sources: Every ingestion cycle (approximately hourly). Change detection skips unchanged boards to reduce API load.
Tier 2 sources: Every cycle, budget-permitting. API call budgets prevent over-fetching from rate-limited sources.
Deactivation: Jobs not seen in 14+ days are automatically deactivated. The corpus reflects currently active listings, not historical postings.
What “200,000+ live openings” means: jobs the ingestion pipeline has scanned or refreshed within the last 7 days. This is the corpus we re-verified against the source within the last week, not a lifetime total.
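The two time windows above (7 days for "live", 14+ days for deactivation) can be sketched as a small classifier. This is an illustrative sketch only, not our production code: the function name `listing_status`, the status labels, and the handling of listings last seen between 7 and 14 days ago are assumptions made for the example.

```python
from datetime import datetime, timedelta, timezone

# Windows taken from the text above.
LIVE_WINDOW = timedelta(days=7)        # scanned or refreshed within the last 7 days
DEACTIVATE_AFTER = timedelta(days=14)  # not seen in 14+ days

def listing_status(last_seen: datetime, now: datetime) -> str:
    """Classify a listing by how long ago the pipeline last saw it.

    The label names are hypothetical. "active_unverified" is an assumed
    name for listings that are still active but fall outside the 7-day
    window counted as "live openings".
    """
    age = now - last_seen
    if age >= DEACTIVATE_AFTER:
        return "deactivated"
    if age <= LIVE_WINDOW:
        return "live"
    return "active_unverified"

now = datetime(2024, 1, 15, tzinfo=timezone.utc)
for days in (3, 10, 14):
    print(days, listing_status(now - timedelta(days=days), now))
```

Under these assumptions, a listing last seen 10 days ago is still in the corpus but is not counted toward the "live openings" figure.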

Average new jobs per week: 49,000+

Deduplication rate: 2-3% per cycle

Skill extraction and matching

When you upload a resume, Seeker runs a multi-stage analysis pipeline:

Resume parsing: Extracts structured data from PDF/DOCX including work history, education, skills, and project descriptions.
Skill extraction: Identifies 15-30 skills per resume using a domain-aware taxonomy. Skills are classified as primary (frequently mentioned, high evidence) or secondary.
Corpus matching: Compares your skill profile against all active listings simultaneously. Each match is scored across skill overlap, seniority alignment, domain relevance, and title intent.
Tier assignment: Matches are bucketed into High Priority, Strong Fit, Good Bets, and Trajectory based on composite score and evidence strength.
Quality gating: Thin listings (fewer than 3 extracted skills or under 200 characters of description) cannot appear in the top tier regardless of score.
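The quality gate in the last step can be sketched in a few lines. The two thresholds come from the text above. Everything else is an assumption for illustration: the function names, treating "High Priority" as the top tier, and demoting a gated listing to the next tier down (the text only says it cannot appear in the top tier).

```python
# Thresholds from the quality gating rule above.
MIN_SKILLS = 3
MIN_DESCRIPTION_CHARS = 200

# Tier names from the tier assignment step, assumed to be ordered best-first.
TIERS = ["High Priority", "Strong Fit", "Good Bets", "Trajectory"]

def is_thin(extracted_skills: list[str], description: str) -> bool:
    """A listing is thin if it has fewer than 3 extracted skills
    or fewer than 200 characters of description."""
    return (
        len(extracted_skills) < MIN_SKILLS
        or len(description) < MIN_DESCRIPTION_CHARS
    )

def apply_quality_gate(tier: str, extracted_skills: list[str], description: str) -> str:
    """Keep thin listings out of the top tier.

    Demoting to the second tier is a hypothetical choice for this sketch.
    Listings already below the top tier are returned unchanged.
    """
    if tier == TIERS[0] and is_thin(extracted_skills, description):
        return TIERS[1]
    return tier

# A high-scoring listing with only two extracted skills is gated.
print(apply_quality_gate("High Priority", ["python", "sql"], "x" * 500))
```

Note that the gate looks only at the listing, not at the score, which is what "regardless of score" implies.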

Match scoring

Each job match receives a composite score (0-100) built from independent signals:

Skill alignment

40%

Direct and transferable skill overlap between resume and job listing.

Seniority fit

25%

Level alignment between candidate experience and role expectations.

Title intent

15%

Semantic similarity between candidate role history and job title.

Domain relevance

10%

Domain distance between candidate background and job category.

Location + modifiers

10%

Geographic alignment, remote compatibility, and company affinity.

Scores are not curved or normalized. A score of 78 means 78% of the weighted scoring criteria are met by evidence on your resume. Two candidates can both score 78 on the same job through different skill combinations.
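To make the weighting concrete, here is a minimal sketch of a composite score as a weighted sum. The weights are the ones listed above. The rest is assumed for illustration: that each signal is expressed as a value between 0.0 and 1.0, that the signals combine linearly, and the names `WEIGHTS` and `composite_score`.

```python
# Weights from the match scoring breakdown above (they sum to 1.0).
WEIGHTS = {
    "skill_alignment": 0.40,
    "seniority_fit": 0.25,
    "title_intent": 0.15,
    "domain_relevance": 0.10,
    "location_modifiers": 0.10,
}

def composite_score(signals: dict[str, float]) -> float:
    """Combine per-signal values (assumed to lie in 0.0-1.0) into a 0-100 score."""
    missing = WEIGHTS.keys() - signals.keys()
    if missing:
        raise ValueError(f"missing signals: {sorted(missing)}")
    for name in WEIGHTS:
        if not 0.0 <= signals[name] <= 1.0:
            raise ValueError(f"{name} must be between 0.0 and 1.0")
    return 100 * sum(WEIGHTS[name] * signals[name] for name in WEIGHTS)

# Two hypothetical candidates reaching the same score by different routes.
candidate_a = {"skill_alignment": 0.9, "seniority_fit": 0.8, "title_intent": 0.6,
               "domain_relevance": 0.7, "location_modifiers": 0.6}
candidate_b = {"skill_alignment": 0.7, "seniority_fit": 1.0, "title_intent": 0.8,
               "domain_relevance": 0.5, "location_modifiers": 0.8}

print(round(composite_score(candidate_a), 2))  # 78.0
print(round(composite_score(candidate_b), 2))  # 78.0
```

Candidate A gets there mostly through skill alignment (36 of 78 points), while candidate B makes up a weaker skill overlap with a perfect seniority fit. No step rescales scores relative to other candidates, which is what "not curved" means in this sketch.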

What we do not do

We do not fabricate statistics. Every number on this site is derived from the live corpus or clearly attributed to industry sources.
We do not sell resume data. Your resume is processed for matching and is not shared with employers, recruiters, or third parties.
We do not inflate match scores. If your resume does not match a job well, the score reflects that honestly.
We do not use AI to generate job listings. Every listing in the corpus is sourced from a real employer career page or verified job board.

Corpus

Our corpus holds hundreds of thousands of active postings, ingested continuously from verified employer career pages and job boards. Because listings are added and expire daily, the exact count changes constantly. Rather than quote a figure here that would quietly go stale, we state the corpus size and the snapshot date in every insight report.

Questions about our methodology? Read our guides or reach out at support@seekerscore.com.