How PathScorer Scores 1,000+ Careers in Two Minutes

A look inside the algorithm: from resume text to ranked career matches.

Most career tools are recommendation engines wearing a quiz costume. Answer ten questions about whether you prefer working with people or data, and the system hands you three job titles it was going to suggest anyway. The model is thin, the output is thin, and the person walking away has learned nothing they didn’t already know.

PathScorer works differently, and the difference is worth explaining in some detail. The technical approach is interesting in its own right, and understanding how a system reaches its conclusions is the only way to judge whether those conclusions are worth trusting.

Here’s what happens from the moment someone uploads a resume to the moment a ranked list of career matches appears on screen.

Step 1: Parsing the resume into structured signals

A resume is unstructured text. Job titles, company names, dates, bullet points describing responsibilities, the occasional skills section with a list of software tools. Before any matching can happen, that text needs to be converted into something a scoring algorithm can work with.

The parsing pipeline does several things in sequence.

First, it extracts job history: roles, tenures, industries, seniority trajectory. A person who went from analyst to senior analyst to manager over eight years carries a different signal than someone who held the same title at three different companies. Duration matters. Direction matters.

Second, it pulls explicit skills: programming languages, software platforms, certifications, methodologies. These are relatively easy to extract because people tend to list them directly. “Python, SQL, Tableau” in a skills section is unambiguous.

Third, and more interestingly, it extracts implicit skills from the way responsibilities are described. “Led a team of six through a product migration” contains signals about people management, project coordination, and change management that the person may never have listed as discrete skills. “Reduced customer churn by 18% through targeted outreach campaigns” contains signals about analytics, customer relationship management, and performance measurement.

The system also accepts plain-language input for people who don’t have a resume formatted the way a parser expects, or who want to describe side projects and informal experience that never made it onto a CV. A freelance photographer who also teaches weekend workshops and handles their own client billing is carrying sales, curriculum design, and financial management skills that a standard resume would never surface.

The output of this step is a raw skill inventory: a list of capabilities with rough confidence scores based on how explicitly and how repeatedly they appear in the source material.
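One way to picture the confidence scoring is the minimal sketch below. It is not PathScorer's actual parser: the evidence weights and the `build_inventory` helper are assumptions made up for illustration. It only shows the idea that explicit mentions count more than inferred ones, and that repeated mentions raise confidence with diminishing returns.

```python
from collections import defaultdict

# Hypothetical evidence weights (assumed values, not from PathScorer):
# a skill listed outright counts more than one inferred from a bullet point.
EVIDENCE_WEIGHT = {"explicit": 0.6, "implicit": 0.3}


def build_inventory(mentions):
    """Combine (skill, kind) mentions into one confidence score per skill.

    Each new mention closes a fraction of the remaining gap to 1.0, so
    repetition raises confidence but never pushes it past 1.0.
    """
    confidence = defaultdict(float)
    for skill, kind in mentions:
        weight = EVIDENCE_WEIGHT[kind]
        confidence[skill] += (1.0 - confidence[skill]) * weight
    return dict(confidence)


# Mentions a parser might emit for the resume examples in this section.
mentions = [
    ("Python", "explicit"),
    ("people management", "implicit"),
    ("people management", "implicit"),
    ("project coordination", "implicit"),
]

inventory = build_inventory(mentions)
for skill, score in sorted(inventory.items(), key=lambda kv: -kv[1]):
    print(f"{skill}: {score:.2f}")
```

A skill inferred twice from responsibility bullets ends up more trusted than one inferred once, but still below a skill the person listed directly.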

Step 2: Building the skill vector

The raw inventory gets mapped to a structured representation. Specifically, each user’s profile becomes a vector in a high-dimensional skill space, where the dimensions correspond to the skill taxonomy defined by O*NET.

O*NET (the Occupational Information Network, maintained by the U.S. Department of Labor) describes occupations across 35 standardized skill categories, including:

Active Listening

Critical Thinking

Judgment and Decision Making

Systems Evaluation

Negotiation

Coordination

Operations Monitoring

Equipment Maintenance

Each skill gets a score from 0 to 100 representing estimated proficiency or exposure. The vector for a mid-level product manager might look something like this:

Active Listening: 78

Critical Thinking: 82

Judgment / Decisions: 74

Systems Evaluation: 68

Negotiation: 61

Coordination: 85

Complex Problem Solving: 71

Writing: 65

Programming: 34

Equipment Maintenance: 4

The scores aren’t self-reported. They’re inferred from job history, role seniority, stated responsibilities, and the implicit signals extracted during parsing. Someone who has managed cross-functional projects for five years gets a higher coordination score than someone who lists “coordination” in a skills section with no supporting context.

Informal experience that never appears on a resume gets mapped to the same space. “Selling vintage furniture on eBay” maps to sales, inventory management, and pricing judgment. “Fluent in Mandarin and English” maps to a set of roles that specifically require bilingual communication. “Teaching yoga on weekends” maps to instructional skills, curriculum design, and physical education. The vector captures what the person can do, not just what their employer printed on their contract.
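The mapping from experience to vector can be sketched roughly as follows. Everything specific here is an assumption for illustration: the five-dimension subset, the `ACTIVITY_SIGNALS` lookup table, the point values, and the three-year cap are all hypothetical, and a real system would infer these signals rather than look them up.

```python
# A small subset of skill dimensions, in a fixed order. The real space
# described above has 35 dimensions; five keep the example readable.
SKILL_DIMENSIONS = [
    "Active Listening",
    "Critical Thinking",
    "Negotiation",
    "Coordination",
    "Instructing",
]

# Hypothetical lookup: activity -> (dimension, points per year of exposure).
ACTIVITY_SIGNALS = {
    "managed cross-functional projects": [("Coordination", 25), ("Active Listening", 10)],
    "teaching yoga on weekends": [("Instructing", 30), ("Active Listening", 15)],
    "selling vintage furniture on eBay": [("Negotiation", 25), ("Critical Thinking", 10)],
}


def build_vector(activities, max_years=3):
    """Turn (activity, years) pairs into a 0-100 score per dimension."""
    scores = {dim: 0 for dim in SKILL_DIMENSIONS}
    for activity, years in activities:
        for dim, points in ACTIVITY_SIGNALS.get(activity, []):
            # Longer exposure adds more, up to max_years, capped at 100.
            scores[dim] = min(100, scores[dim] + points * min(years, max_years))
    return [scores[dim] for dim in SKILL_DIMENSIONS]


profile = build_vector([
    ("managed cross-functional projects", 5),
    ("teaching yoga on weekends", 2),
])
print(dict(zip(SKILL_DIMENSIONS, profile)))
```

The point of the sketch is that formal and informal experience write into the same dimensions, so the weekend teaching raises Active Listening alongside the day job.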

Step 3: Comparing against 1,000+ occupation vectors

O*NET doesn’t just describe skills in the abstract. It provides rated skill profiles for over 1,000 occupations, using the same 35-dimensional taxonomy. Each occupation has a vector of its own: the skill levels required to perform that job at a competent level.

The vector for a Financial Analyst looks different from the vector for a Nurse Practitioner, which looks different from the vector for a Supply Chain Manager. But some occupations that carry completely different titles are closer together in this space than common intuition would suggest.

The core matching problem is now a geometric one: given a user vector, find the occupation vectors closest to it.

Cosine similarity is the primary distance metric. For two vectors A and B:

similarity(A, B) = (A · B) / (‖A‖ × ‖B‖)

Cosine similarity measures the angle between two vectors rather than their absolute distance. This matters because it handles a specific problem in skill matching: a senior professional and a junior professional might have similar proportional skill distributions even if the senior person’s scores are uniformly higher. Cosine similarity catches that structural resemblance; pure Euclidean distance would miss it.
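A short sketch makes the senior/junior point concrete. The two four-dimensional vectors are invented for illustration; the functions are plain implementations of the formula above and of Euclidean distance.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an all-zero vector has no direction to compare
    return dot / (norm_a * norm_b)


def euclidean_distance(a, b):
    """Straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Illustrative profiles: same proportions, senior scores doubled.
junior = [40, 30, 20, 10]
senior = [80, 60, 40, 20]

cos = cosine_similarity(junior, senior)
dist = euclidean_distance(junior, senior)
print(f"cosine similarity:  {cos:.3f}")
print(f"euclidean distance: {dist:.1f}")
```

The cosine similarity comes out at 1 (to floating-point precision) because the two vectors point in the same direction, while the Euclidean distance is large because one is twice as long as the other.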

The result of running this comparison across all 1,000+ occupations is a ranked similarity score for every occupation in the database. Raw similarity alone isn’t the final output, but it’s the foundation everything else builds on.
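Ranking is then a sort over similarity scores. The sketch below uses three made-up occupation vectors over four dimensions (Critical Thinking, Coordination, Programming, Equipment Maintenance); the numbers are illustrative and are not real O*NET ratings.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Illustrative vectors only: (Critical Thinking, Coordination,
# Programming, Equipment Maintenance). Not real O*NET data.
occupations = {
    "Operations Analyst": [80, 70, 40, 5],
    "Software Developer": [75, 40, 90, 5],
    "Maintenance Technician": [45, 35, 10, 85],
}

# The product manager from Step 2, restricted to the same four dimensions.
user = [82, 85, 34, 4]

ranked = sorted(
    ((cosine_similarity(user, vec), name) for name, vec in occupations.items()),
    reverse=True,
)
for score, name in ranked:
    print(f"{name}: {score:.3f}")
```

With 1,000+ occupations and 35 dimensions the loop is the same; only the size of the table changes.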

Step 4: Applying weighted factors

Similarity to an occupation’s skill profile tells you what you could plausibly do. It doesn’t tell you what you should pursue. The algorithm applies four additional factors to reshape the ranking based on what actually matters for career quality.

Salary

Bureau of Labor Statistics data provides median annual compensation by occupation, broken down by metropolitan area. This isn’t crowdsourced or self-reported pay; it’s federal wage data drawn from a large-scale survey of employers.

The salary factor does two things. First, it adjusts scores upward for occupations that pay significantly more than the user’s current role, surfacing matches that represent genuine economic upside. Second, it runs a geographic comparison: if an occupation pays $40,000 more in a city within the user’s stated relocation range, that information gets surfaced explicitly rather than buried in fine print.
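The two halves of the salary factor might look something like the sketch below. The function names, the 50% cap on uplift, and the metro wage figures are all assumptions for illustration, not PathScorer's implementation or real BLS numbers.

```python
def salary_uplift(current_salary, occupation_median, cap=0.5):
    """Relative pay increase over the current role, clamped to [0, cap]."""
    if current_salary <= 0:
        return 0.0
    uplift = (occupation_median - current_salary) / current_salary
    return max(0.0, min(cap, uplift))


def best_metro(medians_by_metro, home_metro, relocation_options):
    """Return (metro, premium over home) for the best-paying metro in range."""
    home_median = medians_by_metro[home_metro]
    candidates = [m for m in relocation_options if m in medians_by_metro]
    if not candidates:
        return home_metro, 0
    best = max(candidates, key=lambda m: medians_by_metro[m])
    premium = medians_by_metro[best] - home_median
    # Only surface a move if it actually pays more than staying put.
    return (best, premium) if premium > 0 else (home_metro, 0)


# Made-up medians for one occupation across three metros.
medians = {"Metro A": 78_000, "Metro B": 118_000, "Metro C": 90_000}

uplift = salary_uplift(current_salary=65_000, occupation_median=medians["Metro A"])
metro, premium = best_metro(medians, "Metro A", ["Metro B", "Metro C"])
print(f"uplift in home metro: {uplift:.0%}")
print(f"best alternative: {metro} (+${premium:,})")
```

Clamping the uplift keeps a single very high-paying occupation from dominating the ranking, which is one plausible design choice among several.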

Work-life balance

O*NET includes occupational data on working conditions: frequency of overtime, pace of work, physical demands, schedule regularity, and stress exposure. These get rolled into a work-life balance modifier. For a user who indicates they want to exit a high-burnout environment, occupations with similar intensity profiles are scored lower, even if the skill match is strong.

Automation risk

Each occupation carries an exposure score based on research into which task types are most susceptible to automation over a ten-year horizon. Occupations with high routine-task concentration and low social or creative skill requirements score higher on this risk dimension. The algorithm surfaces this as a modifier rather than a filter, on the reasoning that a user should see the information and decide what weight to give it rather than having the system silently exclude options.
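The modifier-versus-filter distinction is easy to show in code. In this sketch the `penalty` strength, the field names, and the sample scores are assumptions; the point is only that every occupation stays in the list, with its risk visible, and is re-scored rather than dropped.

```python
def apply_risk_modifier(matches, penalty=0.2):
    """Re-score matches by automation risk without removing any of them.

    Each match is a dict with 'name', 'score', and 'automation_risk'
    (0 = low exposure, 1 = high exposure). 'penalty' is the largest
    fraction of the score that risk can take away (assumed value).
    """
    adjusted = [
        {**m, "adjusted_score": m["score"] * (1 - penalty * m["automation_risk"])}
        for m in matches
    ]
    return sorted(adjusted, key=lambda m: m["adjusted_score"], reverse=True)


# Illustrative matches, not real occupations or risk scores.
matches = [
    {"name": "Occupation A", "score": 0.90, "automation_risk": 0.8},
    {"name": "Occupation B", "score": 0.85, "automation_risk": 0.1},
]

results = apply_risk_modifier(matches)
for m in results:
    print(f"{m['name']}: {m['adjusted_score']:.3f} (risk {m['automation_risk']})")
```

A filter would have returned one row here. The modifier returns both, reorders them, and leaves the decision to the reader.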

Growth potential

BLS projected employment growth data by occupation, combined with O*NET data on within-occupation advancement pathways, produces a forward-looking score. This rewards occupations that are expanding and where the skill profile has room to develop rather than plateau.

The four factors combine with the base similarity score through weighted scoring. The weights shift based on user inputs during the priority-setting phase: someone who says salary is their primary concern gets a different final ranking than someone who is optimizing for stability and reasonable hours, even if their skill vectors are identical.
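A minimal sketch of that combination, assuming a plain weighted average: the factor names, the weight profiles, and the occupation scores below are invented for illustration, and the actual weighting scheme is not specified in this article. Automation risk is expressed as a "safety" score (1 minus risk) so that higher is better for every component.

```python
def final_score(similarity, factors, weights):
    """Weighted average of base similarity and factor scores, all in [0, 1]."""
    components = {"similarity": similarity, **factors}
    total_weight = sum(weights[name] for name in components)
    return sum(weights[name] * value for name, value in components.items()) / total_weight


# Two illustrative occupations for the same user.
occupation_x = {"salary": 0.4, "balance": 0.8, "automation_safety": 0.7, "growth": 0.5}
occupation_y = {"salary": 0.9, "balance": 0.4, "automation_safety": 0.6, "growth": 0.6}

# Two hypothetical priority profiles from the priority-setting phase.
salary_first = {"similarity": 0.4, "salary": 0.4, "balance": 0.05,
                "automation_safety": 0.05, "growth": 0.1}
balance_first = {"similarity": 0.4, "salary": 0.05, "balance": 0.4,
                 "automation_safety": 0.1, "growth": 0.05}

for label, weights in [("salary-first", salary_first), ("balance-first", balance_first)]:
    x = final_score(0.90, occupation_x, weights)
    y = final_score(0.85, occupation_y, weights)
    print(f"{label}: X={x:.3f}, Y={y:.3f}")
```

The same two occupations swap places depending on the weights, which is the behavior described above: identical skill vectors, different priorities, different ranking.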

The full pipeline: from resume to top matches

Put the steps together and the flow looks like this:

1. Resume / description

2. Parsing layer: NLP extraction of explicit + implicit skills, side hustles, languages

3. Skill vector (35 dimensions): mapped to O*NET taxonomy

4. Cosine similarity: against 1,000+ occupation vectors

5. Weighted scoring: Salary · Work-life balance · Automation risk · Growth

6. Priority weights applied: user-specified preferences

7. Top 20 career matches: with scores, salary data, gap analysis, and growth paths

The output for each match includes the similarity score, current compensation in the user’s metro and in higher-paying alternatives, the specific skills driving the match, the skills that represent the gap between current profile and full qualification, and concrete paths to close that gap (certifications, programs, timeframes).
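The "skills driving the match" and "gap" parts of that output can be sketched as simple comparisons between the two vectors. The 10-point gap threshold, the function names, and the target occupation's numbers are assumptions for illustration; the user scores are taken from the product manager example in Step 2.

```python
def skill_gaps(user, occupation, threshold=10):
    """Skills where the occupation requires at least `threshold` more points."""
    gaps = {
        skill: required - user.get(skill, 0)
        for skill, required in occupation.items()
        if required - user.get(skill, 0) >= threshold
    }
    return sorted(gaps.items(), key=lambda kv: -kv[1])


def match_drivers(user, occupation, top_n=3):
    """Skills contributing most to the dot product behind the similarity."""
    contributions = {
        skill: user.get(skill, 0) * required for skill, required in occupation.items()
    }
    return sorted(contributions, key=contributions.get, reverse=True)[:top_n]


user = {"Critical Thinking": 82, "Coordination": 85, "Programming": 34, "Writing": 65}

# Illustrative requirement profile for a target occupation (not O*NET data).
target = {"Critical Thinking": 80, "Coordination": 60, "Programming": 70, "Writing": 50}

gaps = skill_gaps(user, target)
drivers = match_drivers(user, target)
print("drivers:", drivers)
print("gaps:", gaps)
```

Here the match is driven by critical thinking, coordination, and writing, and the one gap worth closing is programming.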

Why the 35-dimensional space matters

A natural question is why this approach surfaces matches that straightforward job board search misses.

The answer is in the geometry. In a 35-dimensional skill space, occupations that share no title vocabulary can sit very close together. “Logistics coordinator” and “operations analyst” might not share a single keyword, but their skill vectors overlap heavily in systems thinking, data interpretation, coordination, and process management. Cosine similarity catches that. Title matching doesn’t.

It also works in the other direction. Two occupations with nearly identical titles can sit far apart in skill space if the actual work requires very different things. “Sales associate” at a retail chain and “sales associate” at an investment bank look similar in a keyword search and require almost entirely different capabilities.
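Both directions can be seen in a small comparison of title-keyword overlap against vector similarity. The four skill vectors below are invented for illustration over the dimensions (Service Orientation, Coordination, Systems Analysis, Mathematics), and `title_overlap` is a deliberately naive stand-in for keyword search.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def title_overlap(title_a, title_b):
    """Jaccard overlap of the words in two job titles."""
    words_a, words_b = set(title_a.lower().split()), set(title_b.lower().split())
    return len(words_a & words_b) / len(words_a | words_b)


# Illustrative vectors: (Service Orientation, Coordination,
# Systems Analysis, Mathematics). Not real O*NET ratings.
logistics_coordinator = [40, 85, 70, 60]
operations_analyst = [35, 70, 80, 70]
retail_sales_associate = [85, 50, 10, 25]
bank_sales_associate = [40, 45, 75, 85]

different_titles = (
    title_overlap("Logistics coordinator", "Operations analyst"),
    cosine_similarity(logistics_coordinator, operations_analyst),
)
same_title = (
    title_overlap("Sales associate", "Sales associate"),
    cosine_similarity(retail_sales_associate, bank_sales_associate),
)
print(f"different titles: keyword {different_titles[0]:.2f}, skills {different_titles[1]:.2f}")
print(f"same title:       keyword {same_title[0]:.2f}, skills {same_title[1]:.2f}")
```

Keyword overlap says the first pair has nothing in common and the second pair is identical; the skill vectors say close to the opposite.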

The skill vector is also additive across experience types in a way that job titles aren’t. A person who has worked as a teacher, run a small side business, and speaks three languages has a richer profile than any single job title captures. All three threads contribute to the vector, and the vector finds matches that none of the individual threads would have surfaced on their own.

What the algorithm doesn’t do

Worth being direct about the limits.

The matching is probabilistic. A high similarity score means the skill overlap is strong based on available signals; it doesn’t guarantee someone will perform well in or enjoy a particular role. The algorithm identifies plausible paths, not certain ones.

The system also relies on the quality of the underlying O*NET data, which is comprehensive but not perfectly current. Emerging roles (AI prompt engineering, for example) may not have fully developed profiles yet. The BLS salary data is already 12 to 18 months old by the time it’s published.

And the scoring can only reflect what users provide. The scores themselves are inferred rather than self-reported, but someone who significantly underrepresents their experience during intake gets a vector that underrepresents their actual capabilities. The system surfaces what it can infer, but it can’t infer what it’s never been given.

The point of the technical transparency

The reason to explain all this isn’t to impress with complexity. Plenty of systems are complex without being useful.

The reason is that career decisions are high-stakes and often irreversible on short timescales. Taking a master’s program, pursuing a certification, relocating for a new role: these cost real time and real money. Anyone who makes those decisions based on a tool they don’t understand is taking on unnecessary risk.

The PathScorer model is grounded in government labor data, uses a well-understood similarity metric, and applies transparent modifiers based on factors people actually care about. The matches it surfaces are not vibes. They’re geometry.

Whether that geometry points somewhere worth going is still a human judgment. But it’s easier to make that judgment when you can see the map.
