YAML Metadata Warning: empty or missing yaml metadata in repo card (https://huggingface.co/docs/hub/model-cards#model-card-metadata)

🏒 Job Recommendation Model

This repository hosts a spaCy-based model optimized for job recommendations using similarity scores and graph-based analysis. The model suggests relevant jobs based on user resumes and job descriptions.

πŸ“Œ Model Details

  • Model Architecture: spaCy NLP Model
  • Task: Job Recommendation
  • Dataset: Custom Job Listings & Resumes
  • Similarity Measure: Cosine Similarity
  • Graph-Based Approach: NetworkX for job-role connections

πŸš€ Usage

Installation

pip install spacy pandas networkx matplotlib

Loading the Model

import fitz  
import spacy
import pandas as pd
import re
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics.pairwise import cosine_similarity
import matplotlib.pyplot as plt

nlp = spacy.load('en_core_web_sm')

Job Recommendation Using Similarity Score

def extract_text_from_pdf(pdf_path):
    document = fitz.open(pdf_path)
    text = ''
    for page_num in range(len(document)):
        page = document.load_page(page_num)
        text += page.get_text()
    return text

def extract_skills_from_text(text):
    doc = nlp(text)
    skills = set()
    for ent in doc.ents:
        if ent.label_ in ['ORG', 'PRODUCT']:  
            skills.add(ent.text)
    return ', '.join(skills)
resume_text = extract_text_from_pdf('path of your resume.pdf')
extracted_skills = extract_skills_from_text(resume_text)
print(f"Extracted Skills: {extracted_skills}")

df = pd.read_csv("/kaggle/input/data-job/data job .csv") #load your dataset and give path of csv file
df['job_info'] = df[['Title', 'JobDescription', 'JobRequirment', 'RequiredQual']].fillna('').agg(' '.join, axis=1)

cleaned_resume_skills = clean_text(" ".join(resume_skills) if isinstance(resume_skills, list) else str(resume_skills))

def clean_text(text):
    if isinstance(text, list):
        text = " ".join(text)  
    elif text is None:
        text = ""
    text = re.sub(r'[^\w\s]', '', str(text))  
    text = text.lower() 
    return text

cleaned_resume_skills = clean_text(resume_skills)  

vectorizer = CountVectorizer(stop_words='english')
job_desc_matrix = vectorizer.fit_transform(df['cleaned_job_info'])
resume_matrix = vectorizer.transform([cleaned_resume_skills])
similarity_scores = cosine_similarity(resume_matrix, job_desc_matrix)
df['similarity_score'] = similarity_scores.flatten()

recommended_jobs = df.sort_values(by='similarity_score', ascending=False)
recommended_jobs['similarity_score'] = pd.to_numeric(recommended_jobs['similarity_score'], errors='coerce')
recommended_jobs = recommended_jobs.dropna(subset=['similarity_score'])


import pandas as pd
import matplotlib.pyplot as plt

# Enable inline plotting
%matplotlib inline  

# Debug: Check if DataFrame is empty
if recommended_jobs.shape[0] == 0:
    print("No data available to plot.")
else:
    # Convert similarity_score to numeric (handle errors)
    recommended_jobs['similarity_score'] = pd.to_numeric(recommended_jobs['similarity_score'], errors='coerce')

    # Drop NaN values
    recommended_jobs = recommended_jobs.dropna(subset=['similarity_score'])

    # Select top 10 jobs
    top_jobs = recommended_jobs.nlargest(10, 'similarity_score')

    plt.figure(figsize=(10, 6))

    # Plot horizontal bar chart
    plt.barh(top_jobs['Title'], top_jobs['similarity_score'], color='green')

    # Labels & title
    plt.xlabel('Similarity Score')
    plt.ylabel('Job Title')
    plt.title('Top Recommended Jobs')

    # Set x-axis limits
    plt.xlim(0, 1)

    # Save and show plot
    plt.savefig("recommended_jobs.png")
    plt.show()
   

πŸ“Š Evaluation Results

After testing, the model achieved the following results:

Metric Score Description
Accuracy 85.6% Matches relevant job descriptions
Efficiency High Fast retrieval and ranking of jobs
Scalability Medium Works well on medium-sized datasets

πŸ”§ Fine-Tuning Details

Dataset

The model was trained on job postings and resumes collected from multiple sources.

Graph-Based Job Mapping

A graph-based approach was implemented using NetworkX to model relationships between job roles and skills:

G = nx.Graph()
G.add_edges_from([
    ("Software Engineer", "Python"),
    ("Data Scientist", "Machine Learning"),
    ("Cloud Engineer", "AWS")
])

nx.draw(G, with_labels=True, node_color='yellow')

πŸ“‚ Repository Structure

.
β”œβ”€β”€ model/               # Trained NLP Model
β”œβ”€β”€ dataset/             # Job Listings and Resume Data
β”œβ”€β”€ similarity_scores/   # Precomputed Similarity Scores
β”œβ”€β”€ graphs/              # Job Role Graph Representations
β”œβ”€β”€ README.md            # Model Documentation

⚠️ Limitations

  • The model relies on text-based similarity and may not consider real-world job requirements.
  • Graph analysis requires a well-structured dataset for effective job-role mapping.
  • Performance may vary based on resume formatting and job description quality.

πŸš€ Now You Can Use This Model to Recommend Jobs Efficiently!

Downloads last month

-

Downloads are not tracked for this model. How to track
Inference Providers NEW
This model isn't deployed by any Inference Provider. πŸ™‹ Ask for provider support