SetFit with sentence-transformers/all-mpnet-base-v2

This is a SetFit model for text classification. It uses sentence-transformers/all-mpnet-base-v2 as the Sentence Transformer embedding model and a LogisticRegression instance as the classification head.

The model has been trained using an efficient few-shot learning technique that involves:

Fine-tuning a Sentence Transformer with contrastive learning.
Training a classification head with features from the fine-tuned Sentence Transformer.
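
The contrastive step above works on sentence pairs rather than single sentences. The following is a minimal sketch of how such pairs could be built from a handful of labelled texts: pairs with the same label get a target of 1.0 and pairs with different labels get 0.0. The function name `make_contrastive_pairs` and the toy texts are hypothetical, and this sketch enumerates every unique pair, whereas the SetFit library samples pairs according to its own settings.

```python
from itertools import combinations


def make_contrastive_pairs(texts, labels):
    """Build (text_a, text_b, target) triples for contrastive fine-tuning.

    target is 1.0 when both texts share a label and 0.0 otherwise.
    Hypothetical helper for illustration; not part of the SetFit API.
    """
    pairs = []
    for i, j in combinations(range(len(texts)), 2):
        target = 1.0 if labels[i] == labels[j] else 0.0
        pairs.append((texts[i], texts[j], target))
    return pairs


# Toy data, invented for this example (not taken from the training set).
texts = [
    "cloud storage will replace local drives",
    "AWS is a big revenue stream",
    "the chef relies on unpaid labor",
    "flying cars ignore logistics",
]
labels = ["yes", "yes", "no", "no"]

pairs = make_contrastive_pairs(texts, labels)
for text_a, text_b, target in pairs:
    print(target, "|", text_a, "|", text_b)
```

Four texts give six unique pairs here, two positive and four negative. The fine-tuned embeddings are then used as features for the classification head.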

Model Details

Model Description

Model Type: SetFit
Sentence Transformer body: sentence-transformers/all-mpnet-base-v2
Classification head: a LogisticRegression instance
Maximum Sequence Length: 384 tokens
Number of Classes: 2

Model Sources

Repository: SetFit on GitHub
Paper: Efficient Few-Shot Learning Without Prompts
Blogpost: SetFit: Efficient Few-Shot Learning Without Prompts

Model Labels

Label	Examples
yes	"Amazon you so carefully forget that the AWS cloud system is Amazon's big revenue stream and it is part of the retail side. Not clever at all to call it a 'book store' but then you are not a techie of any kind!\n" "Happened to me with On Cloud, but I didn't see an ad but rather I typed in the shoe name in Google and a site came up that looked right. Thankfully I used PayPal and when the vendor came up as a person's name, I immediately knew it was wrong. PayPal was great about refunding me the funds after an investigation.\n" "ZR It won't be long before people lose ownership entirely of all of their digital content, including their precious family photos, as the digital storage devices and formats will be gone. The plan is to have exclusive Cloud storage and streaming, which everyone will pay for and not own. I regret digitizing the majority of my old photographs, although I did it to preserve those very old ones (family history photos over 100 years old) because they were beginning to fade. As for slides, where can one even get them hand-developed anymore? The machines used for slides are terrible.\n"
no	"Have you considered how data science can increase agricultural yields? How computer science makes surgeries safer? How a bunch of programmers created the platform that you're commenting on right now? someone creates and updates the software architects used to do design. Someone, somewhere uses a computer program to design more stable supply chains. Do I think hands-on work is important? Absolutely! But don't act like all those programmers are out there just working for big tech and banks. They work in manufacturing, in agriculture, in engineering, etc. doing real work that benefits real people in quantifiable ways.\n" 'Huh. A white guy "hailed as his era's most brilliant and influential chef(s)" uses unpaid labor whom he regularly physically and verbally abuses. This reeks of modern-day indentured servitude. I don't hesitate to use this term because too many of the staffers believe they must compete this way to obtain future employment.The deck is stacked in several ways. About being deified as a luminary these days, and I've seen it so many times, especially from a gushing, slavering media: heck yeah! If you don't pay your staff any wage, or minimal wage (restaurants! universities! really awesome internships!), force them to work twice as many hours as they should (Twitter under Musk, perhaps), THEN OF COURSE YOU WILL AMAZE EVERYONE WITH YOUR OUTPUT. You will probably outperform other people in your field, who wouldn't dream of behaving in such a sordid/criminal manner. Your staff will be frightened of you. You will have a much better operating budget than, say, a boss/owner who believes in ACTUALLY PAYING PEOPLE. Stop drooling over these people with monstrous egos and no concern for the workers who make them successful.\n' 'That's the name of the book? "Where is My Flying Car?"I know it's a metaphor, but oh, brother.Some years back before I got too old to seriously contemplate taking flying lessons, there was a promising 'flying car' in development. It was well on its way to market.I imagined flying from my local airport to my brother's local airport making what is always a four hour driving trip about an hour and a half. Yes, that would be lovely.Now let's think about logistics, something these 'visionaries' never think about. Why do they never think about these things? Because it's hard and all they want is their plane-car. Everyone needs to get out of the way of their plane-car. They never want to acknowledge the mayhem resulting from parts and pieces of plane-cars dropping out of the sky, people's peace and quiet destroyed by plane-cars flying overhead all the time, etc., etc.And, they'd be the first to complain that "the government needs to do something about this immediately!"I worked in commercial nuclear power as a youth. We could have made it 'safe enough,' but that would have taken a massive amount of international cooperation. We also needed oil companies to accept the need for change. We also needed people to accept 'safe enough.'I am a scientist and I lean progressive (a la Bernie.) Please do not label me "ergophobic." It is not people like me who brought us here.I don't know who you are trying to convince or even what your point is.\n'

Evaluation

Metrics

Label	Accuracy
all	0.8
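
Accuracy is the fraction of evaluation examples whose predicted label matches the reference label. As a minimal sketch of that calculation, the snippet below uses five invented label pairs chosen only so that the result is 0.8; the card does not state the size or content of the actual evaluation set, and the `accuracy` helper is hypothetical.

```python
def accuracy(predictions, references):
    """Return the share of predictions that equal their reference label."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("at least one example is required")
    correct = sum(p == r for p, r in zip(predictions, references))
    return correct / len(references)


# Invented labels for illustration only; not the model's real evaluation data.
preds = ["yes", "no", "yes", "yes", "no"]
refs = ["yes", "no", "no", "yes", "no"]

score = accuracy(preds, refs)
print(score)
```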

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("davidadamczyk/setfit-model-3")
# Run inference
preds = model("It might have been more fun for everyone if the Thruway Authority had given individual contracts for each rest stop, with the stipulation that each reflect some local regional character. This could interest travelers to maybe get off at the next exit and explore some local places. With every stop the same, the traveler might as well be in Kansas.\n")

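To label several comments at once, one option is a small helper that feeds texts to the model in batches and pairs each text with its predicted label. The sketch below assumes only that the model is a callable that takes a list of strings and returns one label per string. The helper name `classify_comments` and the keyword-based stand-in model are hypothetical; the stand-in exists so the example runs without downloading anything and says nothing about how the real model decides.

```python
def classify_comments(model, comments, batch_size=16):
    """Run `model` over `comments` in batches and return (comment, label) pairs.

    `model` is assumed to be any callable mapping a list of strings to a
    sequence of labels of the same length.
    """
    results = []
    for start in range(0, len(comments), batch_size):
        batch = comments[start:start + batch_size]
        labels = model(batch)
        results.extend(zip(batch, labels))
    return results


def stand_in_model(batch):
    """Toy replacement for a loaded model; NOT the real classifier."""
    return ["yes" if "cloud" in text.lower() else "no" for text in batch]


comments = [
    "Cloud storage is convenient but you never own it.",
    "The rest stops all look the same.",
    "I moved my photos to the cloud last year.",
]

labelled = classify_comments(stand_in_model, comments, batch_size=2)
for comment, label in labelled:
    print(label, "|", comment)
```

With a loaded model, the same helper could be called with that model object in place of `stand_in_model`, provided it accepts a list of strings.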
Training Details

Training Set Metrics

Training set	Min	Median	Max
Word count	43	140.9	262
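
Statistics like these can be reproduced by splitting each training text on whitespace and summarising the resulting counts. The following is a minimal sketch with invented sentences and a hypothetical `word_count_stats` helper. It also reports the mean, because a median over an even number of whole-number counts is always a whole number or ends in .5, so a value such as 140.9 may be a mean rather than a median; that reading is an assumption, not something the card states.

```python
import statistics


def word_count_stats(texts):
    """Summarise whitespace-separated word counts for a list of texts."""
    counts = [len(text.split()) for text in texts]
    return {
        "min": min(counts),
        "median": statistics.median(counts),
        "mean": statistics.mean(counts),
        "max": max(counts),
    }


# Invented sample texts; the real training set is not reproduced here.
sample = [
    "one",
    "one two three",
    "one two three four five",
    "one two three four five six seven eight nine ten eleven",
]

stats = word_count_stats(sample)
print(stats)
```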

Label	Training Sample Count
no	18
yes	22

Training Hyperparameters

batch_size: (16, 16)
num_epochs: (1, 1)
max_steps: -1
sampling_strategy: oversampling
num_iterations: 120
body_learning_rate: (2e-05, 2e-05)
head_learning_rate: 2e-05
loss: CosineSimilarityLoss
distance_metric: cosine_distance
margin: 0.25
end_to_end: False
use_amp: False
warmup_proportion: 0.1
l2_weight: 0.01
seed: 42
eval_max_steps: -1
load_best_model_at_end: False
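
The `loss: CosineSimilarityLoss` setting refers to the Sentence Transformers loss that compares the cosine similarity of two sentence embeddings with a target value and penalises the squared difference. The sketch below re-implements that idea in plain Python on tiny hand-written vectors to show the arithmetic; it is an illustration under those assumptions, not the library's code, and the vectors are invented.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cosine_similarity_loss(batch):
    """Mean squared error between cosine similarity and the pair's target."""
    errors = [(cosine_similarity(u, v) - target) ** 2 for u, v, target in batch]
    return sum(errors) / len(errors)


# Each item is (embedding_a, embedding_b, target); values are made up.
batch = [
    ([1.0, 0.0], [1.0, 0.0], 1.0),  # identical vectors, same label: error 0
    ([1.0, 0.0], [0.0, 1.0], 0.0),  # orthogonal vectors, different labels: error 0
    ([1.0, 0.0], [1.0, 0.0], 0.0),  # identical vectors, different labels: error 1
]

loss = cosine_similarity_loss(batch)
print(loss)
```

Only the third pair contributes to the loss, since its embeddings are identical even though its target says the texts belong to different classes.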

Training Results

Epoch	Step	Training Loss	Validation Loss
0.0017	1	0.4637	-
0.0833	50	0.2019	-
0.1667	100	0.0063	-
0.25	150	0.0003	-
0.3333	200	0.0002	-
0.4167	250	0.0001	-
0.5	300	0.0001	-
0.5833	350	0.0001	-
0.6667	400	0.0001	-
0.75	450	0.0001	-
0.8333	500	0.0001	-
0.9167	550	0.0001	-
1.0	600	0.0001	-
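
The step counts in this table are consistent with the hyperparameters and sample counts listed earlier. As a worked check, the sketch below assumes that one positive and one negative pair are generated per training sample per iteration; that is an assumption about the library's pair sampling, not something this card states. Under that assumption, 40 samples and 120 iterations give 9,600 pairs, which at a batch size of 16 is 600 steps per epoch.

```python
import math

# Values taken from this card.
num_samples = 18 + 22   # "no" + "yes" training samples
num_iterations = 120
batch_size = 16

# Assumption: one positive and one negative pair per sample per iteration.
num_pairs = 2 * num_iterations * num_samples
steps_per_epoch = math.ceil(num_pairs / batch_size)


def epoch_at(step):
    """Fraction of the epoch completed at a given step, rounded as in the table."""
    return round(step / steps_per_epoch, 4)


print(num_pairs)        # total sentence pairs under the stated assumption
print(steps_per_epoch)  # optimizer steps in one epoch
print(epoch_at(1), epoch_at(50), epoch_at(100))
```

The computed epoch fractions for steps 1, 50, and 100 match the first three rows of the table (0.0017, 0.0833, 0.1667).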

Framework Versions

Python: 3.10.13
SetFit: 1.1.0
Sentence Transformers: 3.0.1
Transformers: 4.45.2
PyTorch: 2.4.0+cu124
Datasets: 2.21.0
Tokenizers: 0.20.0

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}
