🕸️ AI Knowledge Graph CV Builder

Transform your resume into an interactive knowledge graph powered by Gemini AI.

Professional journeys aren't timelines—they're networks of interconnected skills, projects, and expertise. This tool uses AI to extract and visualize those connections.

🔗 Try it live | 📝 Read the story on Dev.to


🎯 Why This Project?

Traditional CVs are chronological and linear. They work well for continuous trajectories but fail to represent:

  • Non-linear career paths
  • Cross-domain expertise
  • Technology interconnections
  • Skill-project relationships

This project reimagines professional identity as a knowledge graph: nodes (skills, projects, concepts) connected by semantic relationships.


✨ Features

🕸️ Network Graph

Interactive force-directed graph with:

  • Click-to-focus: Highlight direct connections
  • Color-coded nodes: Skills (blue), Projects (green), Concepts (gray)
  • Dynamic filtering: Filter by category
  • Adjustable spacing: 6 levels from Compact to Mega Wide
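
Click-to-focus comes down to finding the nodes one edge away from the selected node. The sketch below shows that lookup on the JSON shape described under Graph Structure. The function name `direct_connections` and the sample data are illustrative assumptions, not taken from the app's code.

```python
def direct_connections(graph: dict, node_id: str) -> set[str]:
    """Return the ids of nodes that share an edge with node_id, in either direction."""
    neighbors = set()
    for edge in graph["edges"]:
        if edge["from"] == node_id:
            neighbors.add(edge["to"])
        elif edge["to"] == node_id:
            neighbors.add(edge["from"])
    neighbors.discard(node_id)  # ignore self-loops
    return neighbors


# Hypothetical sample data in the README's node/edge format.
sample_graph = {
    "nodes": [
        {"id": "python", "label": "Python", "type": "Skill", "importance": 10},
        {"id": "wordpress", "label": "WordPress", "type": "Skill", "importance": 7},
        {"id": "migration_tool", "label": "Migration Tool", "type": "Project", "importance": 8},
        {"id": "ai_automation", "label": "AI Automation", "type": "Concept", "importance": 6},
    ],
    "edges": [
        {"from": "migration_tool", "to": "python", "label": "USES"},
        {"from": "migration_tool", "to": "wordpress", "label": "USES"},
        {"from": "python", "to": "ai_automation", "label": "ENABLES"},
    ],
}

focused = direct_connections(sample_graph, "python")
```

Here `focused` holds the two nodes adjacent to Python (the project that uses it and the concept it enables), which are the ones a focus click would highlight.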

🌊 Flow Diagram

Sankey visualization showing:

  • Skills → Projects → Expertise flow
  • Weighted connections: Band thickness = importance
  • Visual narrative: Tell your career story
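
A Plotly Sankey trace expects node labels plus parallel lists of integer source and target indices and link values. The sketch below prepares those inputs from the graph JSON using only the standard library; the two dicts could then be passed as `plotly.graph_objects.Sankey(node=..., link=...)`. Weighting each band by the source node's `importance` is an assumption made for illustration, and `sankey_inputs` is a hypothetical helper name.

```python
def sankey_inputs(graph: dict) -> tuple[dict, dict]:
    """Build the node and link dicts that a Plotly Sankey trace expects."""
    index = {node["id"]: i for i, node in enumerate(graph["nodes"])}
    importance = {node["id"]: node.get("importance", 1) for node in graph["nodes"]}

    link = {"source": [], "target": [], "value": []}
    for edge in graph["edges"]:
        if edge["from"] not in index or edge["to"] not in index:
            continue  # skip edges that point at unknown nodes
        link["source"].append(index[edge["from"]])
        link["target"].append(index[edge["to"]])
        # Assumption: band thickness follows the source node's importance.
        link["value"].append(importance[edge["from"]])

    node = {"label": [n["label"] for n in graph["nodes"]]}
    return node, link


# Hypothetical Skill -> Project -> Concept chain.
flow_graph = {
    "nodes": [
        {"id": "python", "label": "Python", "type": "Skill", "importance": 10},
        {"id": "migration_tool", "label": "Migration Tool", "type": "Project", "importance": 8},
        {"id": "migration_engineering", "label": "Migration Engineering", "type": "Concept", "importance": 6},
    ],
    "edges": [
        {"from": "python", "to": "migration_tool", "label": "ENABLES"},
        {"from": "migration_tool", "to": "migration_engineering", "label": "DEMONSTRATES"},
    ],
}

node, link = sankey_inputs(flow_graph)
```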

📊 Skills Matrix

Heatmap displaying:

  • Project × Skill relationships
  • Quick scanning: See which projects use which technologies
  • Insights: Most-used skill, average skills per project
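
The matrix and its two insights can be derived directly from the graph JSON. This is a minimal sketch, assuming that any edge between a Project node and a Skill node counts as "the project uses the skill" regardless of edge direction; `skills_matrix` and `matrix_insights` are hypothetical names, and the sample data is made up.

```python
def skills_matrix(graph: dict) -> tuple[list[str], list[str], list[list[int]]]:
    """Return (projects, skills, matrix) where matrix[i][j] is 1 if project i uses skill j."""
    types = {node["id"]: node["type"] for node in graph["nodes"]}
    projects = sorted(i for i, t in types.items() if t == "Project")
    skills = sorted(i for i, t in types.items() if t == "Skill")

    used = set()
    for edge in graph["edges"]:
        a, b = edge["from"], edge["to"]
        if types.get(a) == "Project" and types.get(b) == "Skill":
            used.add((a, b))
        elif types.get(a) == "Skill" and types.get(b) == "Project":
            used.add((b, a))

    matrix = [[1 if (p, s) in used else 0 for s in skills] for p in projects]
    return projects, skills, matrix


def matrix_insights(projects, skills, matrix):
    """Return (most-used skill, average number of skills per project)."""
    if not projects or not skills:
        return None, 0.0
    usage = {s: sum(row[j] for row in matrix) for j, s in enumerate(skills)}
    most_used = max(usage, key=usage.get)  # ties resolve to the first skill alphabetically
    average = sum(sum(row) for row in matrix) / len(projects)
    return most_used, average


matrix_graph = {
    "nodes": [
        {"id": "php", "label": "PHP", "type": "Skill", "importance": 7},
        {"id": "python", "label": "Python", "type": "Skill", "importance": 10},
        {"id": "wordpress", "label": "WordPress", "type": "Skill", "importance": 7},
        {"id": "blog_platform", "label": "Blog Platform", "type": "Project", "importance": 5},
        {"id": "migration_tool", "label": "Migration Tool", "type": "Project", "importance": 8},
    ],
    "edges": [
        {"from": "migration_tool", "to": "python", "label": "USES"},
        {"from": "migration_tool", "to": "wordpress", "label": "USES"},
        {"from": "blog_platform", "to": "wordpress", "label": "USES"},
        {"from": "blog_platform", "to": "php", "label": "BUILT_WITH"},
    ],
}

projects, skills, matrix = skills_matrix(matrix_graph)
most_used, average = matrix_insights(projects, skills, matrix)
```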

🤖 AI-Powered Extraction

  • Gemini Flash Preview 3.0: Multimodal PDF analysis
  • Dense graphs: 60-80+ relationships extracted
  • Semantic understanding: Not just keywords—contextual connections
  • ~8 seconds: From PDF upload to interactive graph

🎨 User Experience

  • Demo pre-loaded: My CV ready to explore (zero friction)
  • Multi-view dashboard: 3 perspectives on the same data
  • Responsive controls: Collapsible sidebar, adjustable spacing
  • English interface: Global audience

🚀 Quick Start

Try the Live Demo

No installation needed! The app loads with a demo CV:

  1. 🌐 Open the app
  2. 🔍 Explore the 3 views (sidebar: Network / Flow / Matrix)
  3. 📤 Upload your own PDF CV to try it

🛠️ Installation

Prerequisites

Local Setup

# Clone the repository
git clone https://github.com/pcescato/knowledge-graph-cv.git
cd knowledge-graph-cv

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Set up environment variables
echo "GOOGLE_API_KEY=your_api_key_here" > .env

# Run the app
streamlit run app.py

The app will open at http://localhost:8501


📦 Deployment

Google Cloud Run (Recommended)

# Build Docker image
docker build -t knowledge-graph-cv .

# Tag for Google Container Registry
docker tag knowledge-graph-cv gcr.io/YOUR_PROJECT/knowledge-graph-cv

# Push to GCR
docker push gcr.io/YOUR_PROJECT/knowledge-graph-cv

# Deploy to Cloud Run
gcloud run deploy knowledge-graph-cv \
  --image gcr.io/YOUR_PROJECT/knowledge-graph-cv \
  --platform managed \
  --region europe-west1 \
  --allow-unauthenticated \
  --set-env-vars GOOGLE_API_KEY=your_key

Alternative: Streamlit Cloud

  1. Fork this repository
  2. Connect to Streamlit Cloud
  3. Add GOOGLE_API_KEY to secrets
  4. Deploy!

🧰 Tech Stack

AI & Data Processing

  • Gemini Flash Preview 3.0 (via Google AI API): PDF analysis & graph extraction
  • Google AI Studio: Prompt engineering & testing

Visualization

  • Streamlit: Web framework & UI
  • streamlit-agraph: Network graph (vis.js wrapper)
  • Plotly: Sankey diagrams & heatmaps
  • NetworkX: Graph algorithms & validation

Deployment

  • Google Cloud Run: Serverless container deployment
  • Docker: Containerization

📐 Architecture

PDF Upload → Gemini API → JSON Graph → Multi-View Dashboard
                                        ├─ Network Graph (vis.js)
                                        ├─ Flow Diagram (Plotly Sankey)
                                        └─ Skills Matrix (Plotly Heatmap)

Graph Structure

{
  "nodes": [
    {"id": "python", "label": "Python", "type": "Skill", "importance": 10}
  ],
  "edges": [
    {"from": "python", "to": "ai_automation", "label": "ENABLES"}
  ]
}

Relationship types:

  • MASTERS, USES, CREATES: Direct actions
  • ENABLES, REQUIRES, BUILT_WITH: Technical dependencies
  • DEMONSTRATES, IMPLEMENTED_IN: Conceptual connections
  • EXPERTISE_IN, RELATED_TO: Meta relationships
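
Model output is worth checking before rendering. The sketch below checks that node ids are unique, that every edge endpoint names a known node, and that every label is one of the relationship types listed above. It uses plain Python rather than NetworkX, and whether the app rejects unknown labels is an assumption; `validate_graph` is a hypothetical name.

```python
ALLOWED_LABELS = {
    "MASTERS", "USES", "CREATES",
    "ENABLES", "REQUIRES", "BUILT_WITH",
    "DEMONSTRATES", "IMPLEMENTED_IN",
    "EXPERTISE_IN", "RELATED_TO",
}


def validate_graph(graph: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means the graph looks consistent."""
    problems = []
    ids = [node["id"] for node in graph.get("nodes", [])]
    known = set(ids)
    if len(known) != len(ids):
        problems.append("duplicate node ids")

    for edge in graph.get("edges", []):
        for end in ("from", "to"):
            if edge.get(end) not in known:
                problems.append(f"edge endpoint {edge.get(end)!r} is not a known node")
        if edge.get("label") not in ALLOWED_LABELS:
            problems.append(f"unknown relationship {edge.get('label')!r}")
    return problems


# The abbreviated example from Graph Structure: "ai_automation" has no node entry.
example = {
    "nodes": [{"id": "python", "label": "Python", "type": "Skill", "importance": 10}],
    "edges": [{"from": "python", "to": "ai_automation", "label": "ENABLES"}],
}

problems = validate_graph(example)
```

Run against the abbreviated example above, the check flags `ai_automation` because the snippet lists only the `python` node.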

🎨 Design Decisions

Why streamlit-agraph for Network Visualization?

I evaluated several libraries for the interactive Network Graph:

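| Library              | Interactivity | Responsive      | Dev Time |
|----------------------|---------------|-----------------|----------|
| streamlit-agraph     | Excellent     | Fixed (1400px)  | 2 hours  |
| Plotly Graph Objects | Limited       | 100% responsive | 6 hours  |
| D3.js Custom         | Full control  | 100% responsive | 8+ hours |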
Decision: streamlit-agraph

Why? In a 2-day iteration cycle with real user feedback, interaction quality was more valuable than perfect iframe responsiveness.

Trade-off accepted: Fixed 1400×900px canvas. Works great on desktop (1440px+), less optimal in narrow embeds. The collapsible sidebar provides ~250px of extra space when needed.

Why 3 Visualizations?

Different audiences need different views:

  • Developers: Want to explore connections (Network Graph)
  • Recruiters: Need quick visual narratives (Flow Diagram)
  • Managers: Want fast skill scanning (Skills Matrix)

One visualization can't serve all needs.


📊 Project Metrics

Technical

  • Nodes: 25-35 (average per CV)
  • Relationships: 60-80 (average)
  • Density: 2.0-2.8 edges/node
  • Extraction Time: ~20 seconds (Gemini Flash Preview 3.0)
  • Supported Formats: PDF only
  • Visualizations: 3 modes
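
The density figure is simply edges divided by nodes. A small sketch, using a synthetic graph with 30 nodes and 70 edges (the sizes mentioned for the early versions in the Development Log); the generated ids and the helper name `graph_density` are placeholders.

```python
def graph_density(graph: dict) -> float:
    """Edges per node, the density measure used in the metrics list."""
    node_count = len(graph["nodes"])
    edge_count = len(graph["edges"])
    return edge_count / node_count if node_count else 0.0


# Synthetic graph: 30 nodes, 70 edges. Ids and labels are placeholders.
synthetic = {
    "nodes": [{"id": f"n{i}", "label": f"Node {i}", "type": "Skill", "importance": 1} for i in range(30)],
    "edges": [{"from": f"n{i % 30}", "to": f"n{(i + 1) % 30}", "label": "RELATED_TO"} for i in range(70)],
}

density = round(graph_density(synthetic), 2)  # 70 / 30 ≈ 2.33
```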

Development

  • Versions: V1 → V8.3 (8 major iterations)
  • Prompt iterations: 20+ (Google AI Studio)
  • User feedback cycles: Multiple
  • Lines of code: ~1,200

🎓 How It Works

1. Prompt Engineering (Google AI Studio)

Before writing any application code, I spent time in AI Studio crafting a multi-level extraction prompt:

LEVEL 1: Core entities (Person, Skills, Projects)
LEVEL 2: Relationships (USES, CREATED, MASTERS)
LEVEL 3: Technical relationships (PHP ENABLES WordPress)
LEVEL 4: Concepts & expertise domains
LEVEL 5: Temporal & contextual relationships
LEVEL 6: Bidirectional concept-project links

The prompt evolved through 20+ iterations in AI Studio before integration.
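
To show how a layered prompt like this can be assembled in code, here is a minimal sketch. The six level descriptions are copied from the outline above; the opening and closing instructions are hypothetical wording, not the author's actual prompt, and `build_extraction_prompt` is an illustrative name.

```python
EXTRACTION_LEVELS = [
    "Core entities (Person, Skills, Projects)",
    "Relationships (USES, CREATED, MASTERS)",
    "Technical relationships (PHP ENABLES WordPress)",
    "Concepts & expertise domains",
    "Temporal & contextual relationships",
    "Bidirectional concept-project links",
]


def build_extraction_prompt(levels: list[str]) -> str:
    """Join the level descriptions into one numbered prompt string."""
    lines = ["Extract a knowledge graph from the attached CV.", ""]  # hypothetical wording
    lines += [f"LEVEL {i}: {text}" for i, text in enumerate(levels, start=1)]
    lines += ["", 'Return JSON with "nodes" and "edges" only.']  # hypothetical wording
    return "\n".join(lines)


prompt = build_extraction_prompt(EXTRACTION_LEVELS)
```

Keeping the levels in a list makes it easy to add, drop, or reorder one between iterations and compare the resulting graphs.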

2. Semantic Extraction

The system doesn't just extract keywords—it reasons about context:

Example: "Built a WordPress migration tool in Python"

Extracted graph:

  • Python ENABLES Migration Engineering
  • Migration Engineering IMPLEMENTED_IN Migration Tool
  • Migration Tool DEMONSTRATES Migration Engineering
  • Migration Tool USES Python
  • Migration Tool USES WordPress

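Recording a link in both directions (Migration Engineering IMPLEMENTED_IN Migration Tool, and Migration Tool DEMONSTRATES Migration Engineering) makes the graph more complete: each node can be reached from either side of the relationship.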
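
The five relationships above can be written as edges in the graph's JSON format, which also makes it easy to check which node pairs are linked in both directions. A sketch under stated assumptions: the id scheme (lowercase with underscores) is inferred from the `ai_automation` example, and `slug`, `triples_to_edges`, and `reciprocal_pairs` are hypothetical helpers.

```python
import re

TRIPLES = [
    ("Python", "ENABLES", "Migration Engineering"),
    ("Migration Engineering", "IMPLEMENTED_IN", "Migration Tool"),
    ("Migration Tool", "DEMONSTRATES", "Migration Engineering"),
    ("Migration Tool", "USES", "Python"),
    ("Migration Tool", "USES", "WordPress"),
]


def slug(label: str) -> str:
    """Turn a display label into an id such as 'migration_tool' (assumed id scheme)."""
    return re.sub(r"[^a-z0-9]+", "_", label.lower()).strip("_")


def triples_to_edges(triples) -> list[dict]:
    """Convert (subject, relationship, object) triples into edge dicts."""
    return [{"from": slug(s), "to": slug(o), "label": rel} for s, rel, o in triples]


def reciprocal_pairs(edges: list[dict]) -> set[tuple[str, str]]:
    """Return node pairs that are connected in both directions by any relationship."""
    directed = {(e["from"], e["to"]) for e in edges}
    return {tuple(sorted(pair)) for pair in directed
            if pair[0] != pair[1] and (pair[1], pair[0]) in directed}


edges = triples_to_edges(TRIPLES)
pairs = reciprocal_pairs(edges)
```

In this example only the concept-project pair (Migration Engineering and Migration Tool) is linked both ways. Python and Migration Tool are connected in one direction only.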

3. Multi-View Rendering

The same JSON graph is rendered in 3 ways:

  • Network: Force-directed layout (streamlit-agraph)
  • Flow: Sankey diagram (Plotly)
  • Matrix: Heatmap (Plotly)

Users switch views with a sidebar radio button: one click, no page reload.


🧪 Example Use Cases

For Job Seekers

  • Portfolio enhancement: Show interconnected skills
  • Interview preparation: Visualize your expertise domains
  • Gap analysis: Identify under-connected skills
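
One way to approach the gap analysis is to count how many edges touch each Skill node and list the ones at or below a threshold. This is a sketch only: the threshold `max_links`, the helper name `under_connected_skills`, and the sample data are assumptions, and the app itself may not expose such a function.

```python
from collections import Counter


def under_connected_skills(graph: dict, max_links: int = 1) -> list[str]:
    """Return ids of Skill nodes touched by at most max_links edges."""
    degree = Counter()
    for edge in graph["edges"]:
        degree[edge["from"]] += 1
        degree[edge["to"]] += 1
    return sorted(
        node["id"]
        for node in graph["nodes"]
        if node["type"] == "Skill" and degree[node["id"]] <= max_links
    )


# Hypothetical sample: Docker appears in only one project.
gap_graph = {
    "nodes": [
        {"id": "python", "label": "Python", "type": "Skill", "importance": 10},
        {"id": "wordpress", "label": "WordPress", "type": "Skill", "importance": 7},
        {"id": "docker", "label": "Docker", "type": "Skill", "importance": 4},
        {"id": "migration_tool", "label": "Migration Tool", "type": "Project", "importance": 8},
        {"id": "blog", "label": "Blog", "type": "Project", "importance": 5},
    ],
    "edges": [
        {"from": "migration_tool", "to": "python", "label": "USES"},
        {"from": "migration_tool", "to": "wordpress", "label": "USES"},
        {"from": "blog", "to": "python", "label": "USES"},
        {"from": "blog", "to": "wordpress", "label": "USES"},
        {"from": "blog", "to": "docker", "label": "USES"},
    ],
}

gaps = under_connected_skills(gap_graph)
```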

For Recruiters

  • Quick assessment: 30-second skill scan (Matrix view)
  • Depth evaluation: Explore project connections (Network view)
  • Storytelling: See the candidate's journey (Flow view)

For Career Counselors

  • Career path visualization: Show non-linear trajectories
  • Skill planning: Identify development opportunities
  • Portfolio building: Help clients present themselves better

🤝 Contributing

Contributions welcome! This project was built in 2 days with iterative feedback—there's room for improvement.

Ideas for contribution:

  • Export formats (GraphML, Neo4j Cypher, JSON-LD)
  • Comparison mode (two CVs side-by-side)
  • Temporal dimension (career evolution over time)
  • Skills gap analysis (compare against job descriptions)
  • Alternative graph libraries (Plotly native, D3.js)
  • Multi-language support (currently English only)

To contribute:

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

📝 Development Log

V1-V3: POC → Dense graph (30 nodes, 70+ edges)
V4-V6: Spacing optimization + bidirectional relationships
V7.0-7.6: Multi-view dashboard + readability iterations
V8.0-8.1: English interface + demo CV auto-loading
V8.2: Radio button instant switching fix
V8.3: Final polish (hero message, instructions, footer) ✅

Total: 8 major versions based on real user feedback.


📄 License

This project is licensed under the MIT License - see the LICENSE file for details.


👤 Author

Pascal Cescato


🙏 Acknowledgments

  • Google AI Studio: Invaluable for prompt engineering & iteration
  • Gemini Flash Preview 3.0: Fast, accurate multimodal analysis
  • Streamlit Community: Excellent framework for rapid prototyping
  • vis.js: Powerful force-directed graph visualization
  • Dev.to Challenge: Motivation to build and ship in 2 days

📚 Related Resources


🐛 Known Limitations

Canvas Size

  • Network Graph uses fixed 1400×900px canvas (streamlit-agraph limitation)
  • Works great on desktop (1440px+), less optimal in narrow iframes
  • Mitigation: Collapsible sidebar adds ~250px when needed

PDF Support Only

  • Currently supports PDF input only
  • DOCX support planned for future versions

Single Language

  • Interface in English only
  • Internationalization planned

No Persistence

  • Graphs are session-only (not saved server-side)
  • Export functionality planned

🎯 New Year, New You

This project was created for the Dev.to "New Year, New You" Challenge, powered by Google AI.

Theme alignment: Rather than reinventing yourself, sometimes you just need to represent yourself differently. This tool helps visualize professional identity as it truly is—a network of connections, not a linear timeline.


⭐ Star this project!

If you find this project useful, please consider giving it a ⭐ on GitHub. It helps others discover the project!

Questions? Issues? Open an issue or reach out!


Built with ❤️ and AI in 2 days | Deployed on Google Cloud Run
