---
library_name: transformers
tags:
- finance
license: mit
datasets:
- Recompense/amazon-appliances-lite-data
language:
- en
base_model:
- meta-llama/Llama-3.1-8B-Instruct
---
# Model Card for Midas-pricer

Predicts appliance prices from product descriptions.
## Model Details

### Model Description

This model predicts the prices of Amazon appliance listings from a product description.

- Developed by: https://huggingface.co/Recompense
- Model type: Transformer (causal, autoregressive)
- Language(s) (NLP): English
- License: MIT
- Finetuned from model: meta-llama/Llama-3.1-8B-Instruct
### Model Sources

- Repository: https://huggingface.co/Recompense/Midas-pricer
## Uses

Primary use case: generating estimated retail prices for household appliances from textual descriptions.

Example applications:

- Assisting e-commerce teams in setting competitive price points
- Supporting market-analysis dashboards with on-the-fly price estimates

Not intended for: financial advice or investment decisions.
### Out-of-Scope Use

- Predicting prices outside the appliances domain (e.g., electronics, furniture, vehicles) will likely yield unreliable results.
- Using this model for price-sensitive or regulatory decisions without human oversight is discouraged.
## Bias, Risks, and Limitations

- Data biases: the training dataset is drawn exclusively from Amazon appliance listings. Price distributions are skewed toward mid-range consumer appliances; extremely low- and high-end appliances are underrepresented.
- Input sensitivity: minor changes in phrasing or additional noisy tokens can shift predictions noticeably.
- Generalization: the model does not account for supply-chain disruptions, seasonality, or promotions; it only captures patterns seen in historical listing data.
### Recommendations

- Always validate model outputs against a small set of ground-truth prices before production deployment (see the sketch below).
- Use this model as an assistant, not an oracle: incorporate downstream business rules and domain expertise.
- Regularly retrain or fine-tune on updated listing data to capture shifting market trends.
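
A minimal sketch of such a spot check. `estimate_price` is a hypothetical helper wrapping the generation code shown in the next section, and the sample listings and prices below are made up for illustration:

```python
# Hypothetical spot check: compare model estimates against a few known prices.
# `estimate_price` is assumed to wrap the generation snippet in the next
# section and return a float; these listings and prices are illustrative.
samples = [
    ("Samsung 7kg top-load washing machine with digital inverter motor", 349.0),
    ("Compact 0.7 cu ft countertop microwave, 700W", 79.0),
]

for description, true_price in samples:
    predicted = estimate_price(description)  # hypothetical helper
    error = abs(predicted - true_price)
    print(f"pred=${predicted:.0f}  true=${true_price:.0f}  abs_err=${error:.0f}")
```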
## How to Get Started with the Model

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load tokenizer and model
tokenizer = AutoTokenizer.from_pretrained("Recompense/Midas-pricer")
model = AutoModelForCausalLM.from_pretrained(
    "Recompense/Midas-pricer", torch_dtype=torch.bfloat16, device_map="auto"
)

# Prepare prompt
product_desc = "How much does this cost to the nearest dollar?\n\nSamsung 7kg top-load washing machine with digital inverter motor"
prompt = f"{product_desc}\n\nPrice is $"

# Tokenize and generate
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
generated = model.generate(
    inputs["input_ids"], attention_mask=inputs["attention_mask"],
    max_new_tokens=3, num_return_sequences=1,
)

# Decode only the newly generated tokens (the price digits)
price_text = tokenizer.decode(generated[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(f"Estimated price: ${price_text}")
```
## Training Details

### Training Data

- Dataset: [Recompense/amazon-appliances-lite-data](https://huggingface.co/datasets/Recompense/amazon-appliances-lite-data)
- Train/validation/test split: 80/10/10
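
A sketch of loading the dataset and reproducing an 80/10/10 split with the `datasets` library; the split name and seed are assumptions, not taken from the actual training code:

```python
from datasets import load_dataset

# Load the listings from the Hub ("train" split name is an assumption).
ds = load_dataset("Recompense/amazon-appliances-lite-data")["train"]

# 80/10/10 split; the seed here is arbitrary, not the one used in training.
split = ds.train_test_split(test_size=0.2, seed=42)
holdout = split["test"].train_test_split(test_size=0.5, seed=42)
train, validation, test = split["train"], holdout["train"], holdout["test"]
```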
### Training Procedure

#### Training Hyperparameters

- Fine-tuning framework: PyTorch + Hugging Face Accelerate
- Precision: bf16 mixed precision
- Batch size: 1 sequence
- Learning rate: 1e-5 with linear warmup over the first 10% of total steps
- Optimizer: AdamW
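
A minimal sketch of how these hyperparameters might be wired together with Accelerate; `model` and `train_dataloader` are placeholders for your own setup, and this is not the actual training script:

```python
import torch
from accelerate import Accelerator
from transformers import get_linear_schedule_with_warmup

accelerator = Accelerator(mixed_precision="bf16")  # bf16 mixed precision
total_steps = len(train_dataloader)                # batch size 1, one pass assumed

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)
scheduler = get_linear_schedule_with_warmup(
    optimizer,
    num_warmup_steps=int(0.10 * total_steps),      # linear warmup over 10% of steps
    num_training_steps=total_steps,
)
model, optimizer, train_dataloader, scheduler = accelerator.prepare(
    model, optimizer, train_dataloader, scheduler
)

# Batches are expected to contain input_ids, attention_mask, and labels.
for batch in train_dataloader:
    loss = model(**batch).loss
    accelerator.backward(loss)
    optimizer.step()
    scheduler.step()
    optimizer.zero_grad()
```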
## Evaluation

### Testing Data, Factors & Metrics

- Test set: held-out 10% of listings (≈5,000 examples)
- RMSLE: Root Mean Squared Logarithmic Error between predicted and true prices
- Hit@$40: percentage of predictions within ±$40 of the true price

### Results

| Metric  | Value |
|---------|-------|
| RMSLE   | 0.61  |
| Hit@$40 | 85.2% |
#### Summary

The model achieves an RMSLE of 0.61, indicating good alignment between predicted and actual prices on a log scale, and estimates within $40 of the true price in over 85% of test cases. This performance is competitive for rapid prototyping in price-sensitive applications.
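
For reference, a sketch of how these two metrics can be computed with NumPy; the arrays below are illustrative, not the actual test set:

```python
import numpy as np

def rmsle(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Root Mean Squared Logarithmic Error on log1p-transformed prices."""
    return float(np.sqrt(np.mean((np.log1p(y_pred) - np.log1p(y_true)) ** 2)))

def hit_at(y_true: np.ndarray, y_pred: np.ndarray, tolerance: float = 40.0) -> float:
    """Fraction of predictions within ±tolerance dollars of the true price."""
    return float(np.mean(np.abs(y_pred - y_true) <= tolerance))

# Illustrative values only.
y_true = np.array([349.0, 79.0, 1199.0])
y_pred = np.array([320.0, 95.0, 1050.0])
print(rmsle(y_true, y_pred), hit_at(y_true, y_pred))
```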
## Environmental Impact

Approximate compute emissions for fine-tuning (estimated with the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute)):

- Hardware: Tesla T4
- Duration: 2 hours (0.06 epoch)
- Cloud provider: Google Cloud, region US-Central
- Estimated CO₂ emitted: 6 kg CO₂e
## Technical Specifications

### Model Architecture

- Base model: meta-llama/Llama-3.1-8B-Instruct (8 billion parameters)
- Objective: autoregressive language modeling with instruction tuning

### Compute Infrastructure

- Hardware: 4× Tesla T4 GPUs
- Software:
  - PyTorch 2.x
  - transformers 5.x
  - accelerate 1.x
  - bitsandbytes (optional, for 8-bit quantized inference)
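
Since bitsandbytes is listed as optional, a sketch of 8-bit loading for memory-constrained inference (untested against this checkpoint):

```python
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Optional: load the model in 8-bit via bitsandbytes to reduce memory use.
model = AutoModelForCausalLM.from_pretrained(
    "Recompense/Midas-pricer",
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
    device_map="auto",
)
```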
## Glossary

- RMSLE (Root Mean Squared Logarithmic Error): the square root of the average squared difference between log-transformed predictions and targets; less sensitive to large absolute errors than RMSE.
- Hit@$40: fraction of predictions whose absolute error is ≤ $40.
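
In formula form, with $y_i$ the true price, $\hat{y}_i$ the prediction, and $n$ the number of test examples:

$$
\mathrm{RMSLE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(\log(1+\hat{y}_i) - \log(1+y_i)\right)^2}
$$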
## Model Card Authors

Damola Jimoh ([Recompense](https://huggingface.co/Recompense))