tags:
- nlp
- regression
- tfidf
- ridge
- summaries
- kaggle
🧠 CommonLit Summary Scoring Model
This model was trained using the CommonLit Evaluate Student Summaries dataset on Kaggle.
It predicts two scores for student-written summaries:
- content → Idea coverage quality
- wording → Clarity and phrasing quality
Built with:
- TF-IDF vectorizer
- Ridge Regression (scikit-learn)
- MultiOutputRegressor wrapper
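For readers who want to see how these three pieces fit together, here is a minimal sketch of one way such a model could be assembled. It is not the actual training script: the toy texts, the made-up score values, the default `TfidfVectorizer()` settings, and `alpha=1.0` are all assumptions chosen for illustration, and the target column order (content, wording) is assumed as well.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import Ridge
from sklearn.multioutput import MultiOutputRegressor

# Toy stand-in data (invented for illustration, not from the Kaggle dataset).
texts = [
    "The passage explains how the pyramids were built.",
    "It is about pyramids.",
    "The author describes the workers, the tools, and the time it took.",
    "Stuff happened in Egypt.",
]
# One row per summary; columns assumed to be [content, wording].
y = np.array([[1.2, 0.9], [-0.8, -0.5], [1.5, 1.1], [-1.3, -1.0]])

# Turn raw text into sparse TF-IDF features.
tfidf = TfidfVectorizer()
X = tfidf.fit_transform(texts)

# Fit one Ridge regressor per target column.
model = MultiOutputRegressor(Ridge(alpha=1.0))
model.fit(X, y)

# Score a new summary: the result has one row per input and one column per score.
pred = model.predict(tfidf.transform(["The text describes how workers built pyramids."]))
print(pred.shape)
```

`MultiOutputRegressor` fits an independent `Ridge` for each target, so the content and wording predictions do not share coefficients.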
Example usage:
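from joblib import load

# Load the fitted regressor and the TF-IDF vectorizer it was trained with
model = load("ridge_model.pkl")
tfidf = load("tfidf_vectorizer.pkl")

summary = "This text discusses..."
X = tfidf.transform([summary])  # transform expects a list of strings
pred = model.predict(X)  # shape (1, 2): one row per summary, one column per score
content, wording = pred[0]  # assumed column order: content, wording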
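To score several summaries at once, the same two calls can be wrapped in a small helper. The sketch below is only a suggestion: `score_summaries` is a hypothetical name that is not shipped with this model, and it assumes the two output columns are ordered content, then wording.

```python
def score_summaries(summaries, tfidf, model):
    """Return one {"content", "wording"} dict per input summary.

    Hypothetical helper; assumes the model's output columns are
    ordered [content, wording].
    """
    X = tfidf.transform(list(summaries))  # sparse matrix, one row per summary
    preds = model.predict(X)              # array of shape (n_summaries, 2)
    return [{"content": float(c), "wording": float(w)} for c, w in preds]
```

Calling `score_summaries(["First summary...", "Second summary..."], tfidf, model)` with the loaded objects returns a list of two dictionaries, in the same order as the input.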