Authentication

The API currently uses GCP Application Default Credentials on the server side. No Authorization header is required from API clients. Set GCP_PROJECT_ID and GOOGLE_APPLICATION_CREDENTIALS in the server environment before starting.
Endpoints that require GCP (BigQuery, Vertex AI) will return 503 Service Unavailable if GCP_PROJECT_ID is not configured.

Error Codes

| Status | Meaning | Common cause |
|--------|---------|--------------|
| 400 | Bad Request | Missing required field or empty resume text |
| 422 | Unprocessable Entity | PDF parsing failed or empty PDF |
| 500 | Internal Server Error | Pipeline orchestration failure |
| 503 | Service Unavailable | GCP not configured — set GCP_PROJECT_ID |

POST /analyze
analyze

Full 5-stage career-rebound orchestration. The primary interview-demo endpoint — one call, complete output. Accepts a resume (PDF upload or raw text) plus a target role and optional JD, then runs all pipeline stages in sequence. Each stage is fail-safe: a failure in one never blocks results from earlier stages.

Send as multipart/form-data. Provide either resume (PDF file) or resume_text (plain string) — if both are given, the PDF takes precedence.

Request Parameters

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| resume | file | optional | Resume PDF file upload |
| resume_text | string | optional | Resume plain text (alternative to PDF) |
| target_role | string | required | Target job title, e.g. Senior Data Engineer |
| candidate_id | string | optional | Unique ID — auto-generated as demo-{8hex} if omitted |
| jd_text | string | optional | Job description text — enables gap analysis (Stage 2) |
| jd_title | string | optional | JD title used in narrative generation |
| industry | string | optional | Target industry — auto-detected from industry match if omitted |
| include_pathway | bool | optional | Include 90-day CrewAI roadmap (default false, adds ~45s) |

Example Request

# PDF upload
curl -X POST http://localhost:8000/analyze \
  -F "resume=@resume.pdf" \
  -F "target_role=Senior Data Engineer" \
  -F "jd_text=We need Python, BigQuery, Airflow, dbt, Spark" \
  -F "candidate_id=demo-001"

# Plain text (no file needed)
curl -X POST http://localhost:8000/analyze \
  -F "resume_text=Experienced data engineer with Python, SQL, BigQuery..." \
  -F "target_role=Senior Data Engineer"

# With 90-day pathway (~45 s extra)
curl -X POST http://localhost:8000/analyze \
  -F "resume=@resume.pdf" \
  -F "target_role=Senior Data Engineer" \
  -F "include_pathway=true"

Response — AnalysisResult

200 OK · application/json
{
  "candidate_id":  "demo-001",
  "target_role":   "Senior Data Engineer",
  "analyzed_at":   "2026-04-17T10:30:00Z",
  "skill_count":   47,
  "top_skills": [
    { "name": "Python",   "category": "technical", "confidence": 0.97 },
    { "name": "BigQuery", "category": "tool",      "confidence": 0.95 }
  ],
  "gap": {
    "gap_score": 72.4,
    "matched_skills":      ["Python", "SQL", "BigQuery"],
    "missing_skills":      ["Kafka", "Terraform"],
    "transferable_skills": [
      { "jd_skill": "Airflow", "candidate_skill": "Luigi", "similarity": 0.81 }
    ],
    "recommendation": "Strong match. Bridge Kafka and Terraform to close gaps."
  },
  "industry_match": {
    "top_industry":       "data_ai",
    "top_industry_label": "Data & AI",
    "scores": [
      { "rank": 1, "industry": "data_ai",     "match_score": 88.3 },
      { "rank": 2, "industry": "cloud_devops", "match_score": 74.1 }
    ]
  },
  "narrative": "A data-first engineer with a strong Python foundation...",
  "pathway": null,
  "stages": {
    "extract":   { "success": true, "duration_ms": 342  },
    "gap":       { "success": true, "duration_ms": 1820 },
    "industry":  { "success": true, "duration_ms": 510  },
    "narrative": { "success": true, "duration_ms": 3102 },
    "pathway":   { "success": true, "duration_ms": 0, "error": "skipped" }
  },
  "total_duration_ms": 5892
}
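
Because each stage is fail-safe, a response can carry partial output: earlier stages still return results when a later one fails. A minimal client-side sketch of checking per-stage success (field values are illustrative):

```python
# Sketch: consuming an AnalysisResult defensively. The "stages" shape
# follows the example response above; the values here are illustrative.
result = {
    "stages": {
        "extract":   {"success": True,  "duration_ms": 342},
        "gap":       {"success": False, "duration_ms": 0, "error": "JD not found"},
        "narrative": {"success": True,  "duration_ms": 3102},
    },
}

# Check per-stage success instead of treating the call as all-or-nothing:
# a failed "gap" stage does not invalidate the "extract" output.
failed = [name for name, s in result["stages"].items() if not s["success"]]
```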

POST /extract
extraction

Extract skills from raw text using the spaCy PhraseMatcher + NER pipeline. Persists results to BigQuery and refreshes the candidate profile.

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| text | string | required | Raw text to extract skills from |
| candidate_id | string | required | Candidate identifier for BigQuery storage |
| model_name | string | optional | spaCy model override (default: en_core_web_lg) |

POST /resume/upload
resume

Upload a PDF resume. Parses the PDF with pdfplumber, splits into labeled sections (Summary, Experience, Skills, Education, Certifications, Projects, Other), extracts skills per section, and deduplicates by highest confidence.

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| file | file | required | PDF resume file |
| candidate_id | string | required | Candidate identifier |
| store | bool | optional | Persist to BigQuery (default: true) |

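
The per-section extraction followed by highest-confidence dedup can be sketched as follows (field names mirror the API examples; the helper is illustrative, not the actual implementation):

```python
# Sketch: keep the highest-confidence occurrence of each skill across
# resume sections. Data and field names follow the API examples above.
def dedupe_skills(skills):
    best = {}
    for s in skills:
        key = s["name"].lower()
        if key not in best or s["confidence"] > best[key]["confidence"]:
            best[key] = s
    return list(best.values())

skills = [
    {"name": "Python", "section": "Skills",     "confidence": 0.97},
    {"name": "Python", "section": "Experience", "confidence": 0.88},
    {"name": "SQL",    "section": "Skills",     "confidence": 0.91},
]
deduped = dedupe_skills(skills)  # "Python" kept once, at confidence 0.97
```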
POST /agent/extract
agent

LangGraph stateful skill extractor. Runs section-by-section at high confidence (0.7), retries on full text at lower threshold (0.4) if skill count < 3. Returns skills plus a full graph trace for debugging.

Response includes

| Field | Type | Description |
|-------|------|-------------|
| skills | Skill[] | Extracted and validated skills |
| retry_count | int | Number of retry passes taken |
| extraction_id | string | BigQuery row ID if stored |
| trace | list | LangGraph node execution trace |
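
The retry behavior described above can be sketched like this. The thresholds (0.7, 0.4) and the minimum skill count (3) are the documented values; `extract` is a stand-in for the real extractor:

```python
# Documented defaults: high-confidence first pass, low-confidence retry,
# minimum skill count that triggers the retry.
HIGH, LOW, MIN_SKILLS = 0.7, 0.4, 3

def extract_with_retry(extract, sections, full_text):
    # First pass: section-by-section at the high-confidence threshold.
    skills = [s for sec in sections for s in extract(sec, HIGH)]
    retry_count = 0
    # Fallback: one retry over the full text at the lower threshold
    # when too few skills survived the first pass.
    if len(skills) < MIN_SKILLS:
        skills = extract(full_text, LOW)
        retry_count = 1
    return skills, retry_count
```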

GET /candidate/{candidate_id}/profile
candidate

Returns the aggregated skill profile for a candidate — one entry per unique skill, ranked by frequency and confidence. Built from all extraction runs via BigQuery MERGE.

ProfiledSkill fields

| Field | Type | Description |
|-------|------|-------------|
| skill_name | string | Normalized skill name |
| category | string | technical / soft / domain / tool / certification |
| frequency | int | Times seen across all extractions |
| confidence_avg | float | Average extraction confidence (0–1) |
| first_seen / last_seen | datetime | Temporal range of skill evidence |
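
The "ranked by frequency and confidence" ordering can be illustrated with a small sketch (sample data is made up; using confidence as the tie-breaker is an assumption):

```python
# Sketch of profile ranking: frequency first, then average confidence
# as an assumed tie-breaker. Rows are illustrative.
profile = [
    {"skill_name": "SQL",     "frequency": 5, "confidence_avg": 0.90},
    {"skill_name": "Python",  "frequency": 5, "confidence_avg": 0.95},
    {"skill_name": "Airflow", "frequency": 2, "confidence_avg": 0.99},
]
ranked = sorted(profile, key=lambda s: (-s["frequency"], -s["confidence_avg"]))
```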

POST /jd
job-description

Ingest a job description. Detects seniority level (junior/mid/senior/lead/staff/manager), classifies into one of 8 industries, splits required vs preferred skills, extracts skills per section, and stores to BigQuery.

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| jd_id | string | required | Unique JD identifier |
| title | string | required | Job title |
| text | string | required | Full JD text |
| company | string | optional | Company name |
| industry | string | optional | Industry override (auto-detected if omitted) |

POST /gap
gap-analysis

Gap analysis between a candidate's skill profile and a stored JD. Uses exact match plus semantic cosine similarity via Vertex AI embeddings. Surfaces transferable skills that keyword matching misses.

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| candidate_id | string | required | Candidate to evaluate |
| jd_id | string | required | Previously ingested JD ID |
| similarity_threshold | float | optional | Minimum cosine similarity for a transferable match (default: 0.75) |

GapAnalysisResult fields

| Field | Type | Description |
|-------|------|-------------|
| gap_score | float 0–100 | Higher = stronger fit. Formula: min(100, (matched + 0.7×transferable) / total_required × 100) |
| matched_skills | string[] | Skills that exactly match JD requirements |
| transferable_skills | TransferableSkill[] | Similar but not identical — includes similarity score |
| missing_skills | string[] | Required JD skills with no candidate match |
| recommendation | string | Human-readable career guidance |

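
The documented gap_score formula, worked through on the /analyze sample data (3 matched, 1 transferable); the total_required of 6 is an assumed value for illustration:

```python
# Documented formula: min(100, (matched + 0.7 * transferable) / total_required * 100)
def gap_score(matched, transferable, total_required):
    return min(100.0, (matched + 0.7 * transferable) / total_required * 100)

# 3 exact matches, 1 transferable skill, assumed 6 required JD skills.
score = gap_score(matched=3, transferable=1, total_required=6)  # ~61.7
```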
GET /industry/match/{candidate_id}
industry-match

Score a candidate against all 8 industry centroid vectors using BQML ML.DISTANCE(COSINE) directly in BigQuery. Returns ranked scores for: Data & AI, Software Engineering, FinTech, HealthTech, eCommerce, Cybersecurity, Cloud/DevOps, Product Management.

IndustryScore fields (per industry)

| Field | Type | Description |
|-------|------|-------------|
| rank | int | 1 = best fit |
| industry | string | Industry enum key |
| industry_label | string | Human-readable label |
| match_score | float 0–100 | Converted from cosine distance (100 = perfect) |

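
One plausible distance-to-score conversion. The exact mapping is an assumption here: score = (1 − distance) × 100 is consistent with ML.DISTANCE returning 0 for identical vectors under COSINE, and reproduces the sample values.

```python
# Assumed conversion from cosine distance (0 = identical) to a 0-100
# match score. The real mapping may differ; this is a sketch.
def match_score(cosine_distance):
    return max(0.0, (1.0 - cosine_distance) * 100)

match_score(0.0)    # identical to centroid
match_score(0.117)  # matches the Data & AI sample score of 88.3
```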
POST /narrative
narrative

Generate a RAG-grounded career narrative via Gemini 2.5 Flash. Retrieves candidate's top 8 skills, industry's top 10 demanded skills, skill overlap, and sample JD titles from BigQuery — then generates a 3-sentence, second-person story. System prompt enforces no hallucinations.

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| candidate_id | string | required | Candidate with an existing profile |
| target_role | string | required | Target job title for the narrative |
| industry | string | optional | Industry context (auto-detected if omitted) |

POST /market/analyze
market-analyst

Real-time skill demand analysis via a CrewAI single-agent crew. Uses DuckDuckGo search (no API key needed) to research current job market demand for up to 10 skills. Returns demand score 0–100, trend (growing / stable / declining), and evidence snippet per skill.

This endpoint makes live web search calls. Latency varies (~10–30s for 10 skills). Limit to the skills most important for the gap analysis to keep response time reasonable.
POST /pathway/plan
pathway-planner

CrewAI two-agent 90-day reskilling roadmap. Researcher agent searches Coursera, Udemy, and YouTube for real courses per missing skill. Planner agent synthesizes into a 3-phase roadmap with weekly hours, milestones, and success metrics. Expect ~45s.

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| candidate_id | string | required | Candidate identifier |
| target_role | string | required | Target job title |
| missing_skills | string[] | required | Skills from gap analysis to close |
| gap_score | float | optional | Gap score 0–100 for roadmap pacing |

POST /embeddings/candidate/{candidate_id}
embeddings

Embed a candidate's profile skills not yet in the global catalog. Reads the candidate profile, filters skills already embedded, embeds new ones via Vertex AI text-embedding-004 (768-dim), and upserts to skill_embeddings table.

POST /embeddings/similar
embeddings

Semantic skill similarity search. Embeds the query skill using RETRIEVAL_QUERY task type, then runs BigQuery VECTOR_SEARCH (COSINE) against the skill catalog. Returns the top-N most similar skills with distance scores.


GET /lakehouse/status
lakehouse

Returns row counts for all 8 tables across Bronze, Silver, and Gold BigQuery datasets. Use this to verify lakehouse health and confirm data is flowing through the medallion pipeline.

LakehouseStatus structure

| Layer | Type | Tables |
|-------|------|--------|
| bronze | LayerTableInfo[] | raw_resume_ingestion, raw_jd_ingestion |
| silver | LayerTableInfo[] | candidate_skills, jd_skill_profiles, ingestion_log |
| gold | LayerTableInfo[] | match_scores, industry_rankings, candidate_readiness |

GET /lakehouse/gold/readiness/{candidate_id}
lakehouse

Fetch the Gold-layer composite readiness index for a candidate. Combines match scores, industry coverage, extraction confidence, and skill breadth into a single 0–100 score with tier (READY / DEVELOPING / EMERGING).

CandidateReadiness formula

| Field | Type | Description |
|-------|------|-------------|
| readiness_index | float 0–100 | 40% match score + 30% industry coverage + 20% avg confidence + 10% skill breadth |
| readiness_tier | string | READY (≥70) · DEVELOPING (40–69) · EMERGING (<40) |
| best_industry | string | Top industry from most recent match run |
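
The weighting and tier cutoffs above, as a runnable sketch. It assumes each component has already been normalized to 0–100 (the profile's confidence_avg is 0–1, so a scaling step would precede this):

```python
# Documented weights (40/30/20/10) and tier cutoffs (70, 40).
# Inputs are assumed pre-normalized to 0-100.
def readiness(match, industry_coverage, avg_confidence, skill_breadth):
    index = (0.40 * match + 0.30 * industry_coverage
             + 0.20 * avg_confidence + 0.10 * skill_breadth)
    tier = "READY" if index >= 70 else "DEVELOPING" if index >= 40 else "EMERGING"
    return round(index, 1), tier

readiness(88.3, 75.0, 92.0, 60.0)  # -> (82.2, 'READY')
```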

POST /registry/evaluate
model-registry

Run F1 evaluation against the golden test set — no GCP write. Returns precision, recall, F1 score, per-example breakdown, and whether the model passes the F1 ≥ 0.85 gate required for registration.
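
The F1 gate can be sketched as follows (the tp/fp/fn counts are illustrative, the 0.85 threshold is the documented gate):

```python
# Precision / recall / F1 over a golden test set, with the documented
# F1 >= 0.85 registration gate. Counts below are illustrative.
def f1_gate(tp, fp, fn, threshold=0.85):
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return f1, f1 >= threshold

f1, passes = f1_gate(tp=90, fp=8, fn=10)
```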

GET /monitoring/drift/recent
drift-monitoring

Returns the last N drift records from BigQuery. Each record includes unknown_skill_rate, avg_confidence, taxonomy_coverage, and whether the alert threshold was triggered (unknown_rate > 20%).

| Param | Type | Description |
|-------|------|-------------|
| limit | int | Max records to return (default: 10) |

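
The alert rule can be sketched as below, assuming unknown_skill_rate is stored as a 0–1 fraction (an assumption; the docs state the threshold as 20%):

```python
# Documented alert rule: trigger when unknown_skill_rate exceeds 20%.
# Assumes the rate is stored as a fraction (0.20 == 20%).
ALERT_THRESHOLD = 0.20

def check_drift(record):
    return record["unknown_skill_rate"] > ALERT_THRESHOLD

check_drift({"unknown_skill_rate": 0.31})  # alert fires
check_drift({"unknown_skill_rate": 0.12})  # no alert
```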
GET /health
ops

Basic health check. Returns {"status": "ok"}. No GCP dependency — use this for Cloud Run health probes and load balancer checks.