API Reference
ReSkillio exposes a RESTful API over HTTP. Interactive docs are available at /docs (Swagger UI) and /redoc when the server is running.
Authentication
An Authorization header is required from API clients. Set GCP_PROJECT_ID and GOOGLE_APPLICATION_CREDENTIALS in the server environment before starting; the server returns 503 Service Unavailable when GCP_PROJECT_ID is not configured.
Error Codes
| Status | Meaning | Common cause |
|---|---|---|
| 400 | Bad Request | Missing required field or empty resume text |
| 422 | Unprocessable | PDF parsing failed or empty PDF |
| 500 | Server Error | Pipeline orchestration failure |
| 503 | Unavailable | GCP not configured — set GCP_PROJECT_ID |
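The 503 case comes from missing configuration. For local development, the required variables can be exported before launching the server; the project ID and key path below are placeholders, not real values:

```shell
# Placeholder values — substitute your own GCP project and service-account key
export GCP_PROJECT_ID="my-demo-project"
export GOOGLE_APPLICATION_CREDENTIALS="$HOME/keys/reskillio-sa.json"
```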
POST /analyze
Full 5-stage career-rebound orchestration: the primary interview-demo endpoint, one call, complete output. Accepts a resume (PDF upload or raw text) plus a target role and an optional job description, then runs all pipeline stages in sequence. Each stage is fail-safe: a failure in one never blocks results from earlier stages.
The request body is multipart/form-data. Provide either resume (a PDF file) or resume_text (a plain string); if both are given, the PDF takes precedence.
Request Parameters
| Field | Type | Required | Description |
|---|---|---|---|
| resume | file | optional | Resume PDF file upload |
| resume_text | string | optional | Resume plain text (alternative to PDF) |
| target_role | string | required | Target job title, e.g. Senior Data Engineer |
| candidate_id | string | optional | Unique ID — auto-generated as demo-{8hex} if omitted |
| jd_text | string | optional | Job description text — enables gap analysis (Stage 2) |
| jd_title | string | optional | JD title used in narrative generation |
| industry | string | optional | Target industry — auto-detected from industry match if omitted |
| include_pathway | bool | optional | Include 90-day CrewAI roadmap (default false, adds ~45s) |
Example Request
```shell
# PDF upload
curl -X POST http://localhost:8000/analyze \
  -F "resume=@resume.pdf" \
  -F "target_role=Senior Data Engineer" \
  -F "jd_text=We need Python, BigQuery, Airflow, dbt, Spark" \
  -F "candidate_id=demo-001"

# Plain text (no file needed)
curl -X POST http://localhost:8000/analyze \
  -F "resume_text=Experienced data engineer with Python, SQL, BigQuery..." \
  -F "target_role=Senior Data Engineer"

# With 90-day pathway (~45 s extra)
curl -X POST http://localhost:8000/analyze \
  -F "resume=@resume.pdf" \
  -F "target_role=Senior Data Engineer" \
  -F "include_pathway=true"
```
Response — AnalysisResult
{
"candidate_id": "demo-001",
"target_role": "Senior Data Engineer",
"analyzed_at": "2026-04-17T10:30:00Z",
"skill_count": 47,
"top_skills": [
{ "name": "Python", "category": "technical", "confidence": 0.97 },
{ "name": "BigQuery", "category": "tool", "confidence": 0.95 }
],
"gap": {
"gap_score": 72.4,
"matched_skills": ["Python", "SQL", "BigQuery"],
"missing_skills": ["Kafka", "Terraform"],
"transferable_skills": [
{ "jd_skill": "Airflow", "candidate_skill": "Luigi", "similarity": 0.81 }
],
"recommendation": "Strong match. Bridge Kafka and Terraform to close gaps."
},
"industry_match": {
"top_industry": "data_ai",
"top_industry_label": "Data & AI",
"scores": [
{ "rank": 1, "industry": "data_ai", "match_score": 88.3 },
{ "rank": 2, "industry": "cloud_devops", "match_score": 74.1 }
]
},
"narrative": "A data-first engineer with a strong Python foundation...",
"pathway": null,
"stages": {
"extract": { "success": true, "duration_ms": 342 },
"gap": { "success": true, "duration_ms": 1820 },
"industry": { "success": true, "duration_ms": 510 },
"narrative": { "success": true, "duration_ms": 3102 },
"pathway": { "success": true, "duration_ms": 0, "error": "skipped" }
},
"total_duration_ms": 5892
}
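Because each stage is fail-safe, clients should check the per-stage status rather than assuming an all-or-nothing result. A minimal sketch of that check, using the stages block from the sample response (abbreviated here):

```python
# Abbreviated sample response: just the per-stage status block
result = {
    "stages": {
        "extract":   {"success": True, "duration_ms": 342},
        "gap":       {"success": True, "duration_ms": 1820},
        "industry":  {"success": True, "duration_ms": 510},
        "narrative": {"success": True, "duration_ms": 3102},
        "pathway":   {"success": True, "duration_ms": 0, "error": "skipped"},
    },
}

def failed_stages(result):
    """Return the names of pipeline stages that did not succeed."""
    return [name for name, s in result["stages"].items() if not s["success"]]

def total_stage_time_ms(result):
    """Sum the reported per-stage durations."""
    return sum(s["duration_ms"] for s in result["stages"].values())
```

Note that total_duration_ms in the response can exceed the sum of stage durations because of orchestration overhead.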
Extract skills from raw text using the spaCy PhraseMatcher + NER pipeline. Persists results to BigQuery and refreshes the candidate profile.
| Field | Type | Required | Description |
|---|---|---|---|
| text | string | required | Raw text to extract skills from |
| candidate_id | string | required | Candidate identifier for BigQuery storage |
| model_name | string | optional | spaCy model override (default: en_core_web_lg) |
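spaCy's PhraseMatcher does the heavy lifting server-side; the core idea, matching a skill vocabulary against tokenized text, can be sketched without spaCy (the vocabulary below is an illustrative subset, and this naive version ignores multi-word skills):

```python
SKILL_VOCAB = {"python", "sql", "bigquery", "airflow"}  # illustrative subset

def match_skills(text, vocab=SKILL_VOCAB):
    """Naive stand-in for a PhraseMatcher pass: case-insensitive token lookup."""
    tokens = {tok.strip(".,()").lower() for tok in text.split()}
    return sorted(vocab & tokens)
```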
Upload a PDF resume. Parses the PDF with pdfplumber, splits into labeled sections (Summary, Experience, Skills, Education, Certifications, Projects, Other), extracts skills per section, and deduplicates by highest confidence.
| Field | Type | Required | Description |
|---|---|---|---|
| file | file | required | PDF resume file |
| candidate_id | string | required | Candidate identifier |
| store | bool | optional | Persist to BigQuery (default: true) |
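Deduplication by highest confidence can be sketched as follows; the per-section record shape here is an assumption for illustration:

```python
def dedupe_skills(extracted):
    """Keep one entry per skill name, preferring the highest confidence.

    `extracted` is assumed to be a list of dicts like
    {"name": ..., "section": ..., "confidence": ...}.
    """
    best = {}
    for skill in extracted:
        key = skill["name"].lower()
        if key not in best or skill["confidence"] > best[key]["confidence"]:
            best[key] = skill
    return list(best.values())

deduped = dedupe_skills([
    {"name": "Python", "section": "Skills", "confidence": 0.90},
    {"name": "Python", "section": "Experience", "confidence": 0.95},
    {"name": "SQL", "section": "Skills", "confidence": 0.88},
])
```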
LangGraph stateful skill extractor. Runs section-by-section at high confidence (0.7), retries on full text at lower threshold (0.4) if skill count < 3. Returns skills plus a full graph trace for debugging.
Response includes
Returns the aggregated skill profile for a candidate — one entry per unique skill, ranked by frequency and confidence. Built from all extraction runs via BigQuery MERGE.
ProfiledSkill fields
Ingest a job description. Detects seniority level (junior/mid/senior/lead/staff/manager), classifies into one of 8 industries, splits required vs preferred skills, extracts skills per section, and stores to BigQuery.
| Field | Type | Required | Description |
|---|---|---|---|
| jd_id | string | required | Unique JD identifier |
| title | string | required | Job title |
| text | string | required | Full JD text |
| company | string | optional | Company name |
| industry | string | optional | Industry override (auto-detected if omitted) |
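Seniority detection is presumably keyword-driven; an illustrative heuristic (not the service's actual detector, and the keyword lists are assumptions):

```python
# Illustrative heuristic only — the service's real detector may differ.
SENIORITY_KEYWORDS = [          # checked in priority order
    ("staff",   ["staff"]),
    ("manager", ["manager", "management"]),
    ("lead",    ["lead", "principal"]),
    ("senior",  ["senior", "sr."]),
    ("junior",  ["junior", "jr.", "entry"]),
]

def detect_seniority(title):
    t = title.lower()
    for level, words in SENIORITY_KEYWORDS:
        if any(w in t for w in words):
            return level
    return "mid"  # default when no marker is present
```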
Gap analysis between a candidate's skill profile and a stored JD. Uses exact match plus semantic cosine similarity via Vertex AI embeddings. Surfaces transferable skills that keyword matching misses.
| Field | Type | Required | Description |
|---|---|---|---|
| candidate_id | string | required | Candidate to evaluate |
| jd_id | string | required | Previously ingested JD ID |
| similarity_threshold | float | optional | Minimum cosine similarity for transferable match (default: 0.75) |
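The transferable-skill test compares embedding vectors; cosine similarity itself is standard, and the default threshold is the documented 0.75:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def is_transferable(similarity, threshold=0.75):
    """Apply the documented default similarity_threshold."""
    return similarity >= threshold
```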
GapAnalysisResult fields
Gap score formula: min(100, (matched + 0.7 × transferable) / total_required × 100)
Score a candidate against all 8 industry centroid vectors using BQML ML.DISTANCE(COSINE) directly in BigQuery. Returns ranked scores for: Data & AI, Software Engineering, FinTech, HealthTech, eCommerce, Cybersecurity, Cloud/DevOps, Product Management.
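The gap score formula can be transcribed directly; matched, transferable, and total_required are skill counts from the analysis:

```python
def gap_score(matched, transferable, total_required):
    """min(100, (matched + 0.7 * transferable) / total_required * 100)"""
    return min(100.0, (matched + 0.7 * transferable) / total_required * 100)
```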
IndustryScore fields (per industry)
Generate a RAG-grounded career narrative via Gemini 2.5 Flash. Retrieves candidate's top 8 skills, industry's top 10 demanded skills, skill overlap, and sample JD titles from BigQuery — then generates a 3-sentence, second-person story. System prompt enforces no hallucinations.
| Field | Type | Required | Description |
|---|---|---|---|
| candidate_id | string | required | Candidate with an existing profile |
| target_role | string | required | Target job title for the narrative |
| industry | string | optional | Industry context (auto-detected if omitted) |
Real-time skill demand analysis via a CrewAI single-agent crew. Uses DuckDuckGo search (no API key needed) to research current job market demand for up to 10 skills. Returns demand score 0–100, trend (growing / stable / declining), and evidence snippet per skill.
CrewAI two-agent 90-day reskilling roadmap. Researcher agent searches Coursera, Udemy, and YouTube for real courses per missing skill. Planner agent synthesizes into a 3-phase roadmap with weekly hours, milestones, and success metrics. Expect ~45s.
| Field | Type | Required | Description |
|---|---|---|---|
| candidate_id | string | required | Candidate identifier |
| target_role | string | required | Target job title |
| missing_skills | string[] | required | Skills from gap analysis to close |
| gap_score | float | optional | Gap score 0–100 for roadmap pacing |
Embed a candidate's profile skills not yet in the global catalog. Reads the candidate profile, filters skills already embedded, embeds new ones via Vertex AI text-embedding-004 (768-dim), and upserts to skill_embeddings table.
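The "only embed what's new" step is a set difference; a minimal sketch (the case-insensitive matching is an assumption about the catalog's key format):

```python
def skills_to_embed(profile_skills, embedded_skills):
    """Return profile skills not yet present in the global catalog.

    Both arguments are iterables of skill-name strings; matching is
    case-insensitive (an assumption about the catalog's key format).
    """
    seen = {s.lower() for s in embedded_skills}
    return [s for s in profile_skills if s.lower() not in seen]
```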
Semantic skill similarity search. Embeds the query skill using RETRIEVAL_QUERY task type, then runs BigQuery VECTOR_SEARCH (COSINE) against the skill catalog. Returns the top-N most similar skills with distance scores.
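VECTOR_SEARCH runs inside BigQuery, but the semantics — rank catalog entries by ascending cosine distance to the query vector — can be sketched locally (the in-memory catalog is illustrative):

```python
import math

def cosine_distance(a, b):
    """COSINE distance as BigQuery defines it: 1 minus cosine similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm

def top_n_similar(query_vec, catalog, n=5):
    """Rank catalog entries (name -> vector) by ascending cosine distance."""
    ranked = sorted(catalog.items(), key=lambda kv: cosine_distance(query_vec, kv[1]))
    return [(name, round(cosine_distance(query_vec, vec), 4)) for name, vec in ranked[:n]]
```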
Returns row counts for all 8 tables across Bronze, Silver, and Gold BigQuery datasets. Use this to verify lakehouse health and confirm data is flowing through the medallion pipeline.
LakehouseStatus structure
Fetch the Gold-layer composite readiness index for a candidate. Combines match scores, industry coverage, extraction confidence, and skill breadth into a single 0–100 score with tier (READY / DEVELOPING / EMERGING).
CandidateReadiness formula
Run F1 evaluation against the golden test set — no GCP write. Returns precision, recall, F1 score, per-example breakdown, and whether the model passes the F1 ≥ 0.85 gate required for registration.
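Precision, recall, and F1 follow the standard definitions; the gate check below is a sketch over skill-name sets, not the evaluator's actual implementation:

```python
def f1_metrics(predicted, golden):
    """Precision/recall/F1 for one example, over sets of skill names."""
    predicted, golden = set(predicted), set(golden)
    tp = len(predicted & golden)
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(golden) if golden else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return precision, recall, f1

def passes_gate(f1, threshold=0.85):
    """The documented registration gate: F1 >= 0.85."""
    return f1 >= threshold
```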
Returns the last N drift records from BigQuery. Each record includes unknown_skill_rate, avg_confidence, taxonomy_coverage, and whether the alert threshold was triggered (unknown_rate > 20%).
| Param | Type | Description |
|---|---|---|
| limit | int | Max records to return (default: 10) |
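The alert condition can also be checked client-side; a sketch over the documented record fields, assuming unknown_skill_rate is a 0–1 fraction:

```python
def drift_alert(record, threshold=0.20):
    """True when unknown_skill_rate exceeds the 20% alert threshold.

    Assumes unknown_skill_rate is expressed as a fraction in [0, 1].
    """
    return record["unknown_skill_rate"] > threshold
```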
Basic health check. Returns {"status": "ok"}. No GCP dependency — use this for Cloud Run health probes and load balancer checks.