# Install core packages:
# !pip install -U transformers datasets accelerate

import io
import logging
import os
import time
from pathlib import Path

import gradio as gr  # required for the progress bar

# Python standard + ML packages
import pandas as pd
import numpy as np
import torch
from torch.utils.data import Dataset
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report, precision_recall_fscore_support

# Hugging Face Hub
from huggingface_hub import hf_hub_download

# Hugging Face transformers
import transformers
from transformers import (
    AutoTokenizer,
    DebertaV2Tokenizer,
    BertTokenizer,
    BertForSequenceClassification,
    AutoModelForSequenceClassification,
    Trainer,
    TrainingArguments,
)

PERSIST_DIR = Path("/home/user/app")
MODEL_DIR = PERSIST_DIR / "saved_model"
LOG_FILE = PERSIST_DIR / "training.log"

# Configure logging: write to a file and to an in-memory buffer
log_buffer = io.StringIO()
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s - %(levelname)s - %(message)s",
    handlers=[
        logging.FileHandler(LOG_FILE),
        logging.StreamHandler(log_buffer),
    ],
)
logger = logging.getLogger(__name__)

# Check versions
logger.info("Transformers version: %s", transformers.__version__)
logger.info("Torch version: %s", torch.__version__)

# Check for GPU availability
logger.info("torch.cuda.is_available(): %s", torch.cuda.is_available())
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Label mapping for evaluation
label_map = {0.0: "no", 0.5: "plausibly", 1.0: "yes"}


# Custom Dataset class
class AbuseDataset(Dataset):
    def __init__(self, texts, labels, tokenizer):
        # Tokenize all texts up front; sequences are truncated to 512 tokens
        self.encodings = tokenizer(texts, truncation=True, padding=True, max_length=512)
        self.labels = labels

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, idx):
        item = {key: torch.tensor(val[idx]) for key, val in self.encodings.items()}
        # Float labels are required for multi-label (BCE) training
        item["labels"] = torch.tensor(self.labels[idx], dtype=torch.float)
        return item


# Convert label values to soft scores: "yes" = 1.0, "plausibly" = 0.5, others = 0.0
def label_row_soft(row):
    labels = []
    for col in label_columns:  # label_columns is defined at module level below
        val = str(row[col]).strip().lower()
        if val == "yes":
            labels.append(1.0)
        elif val == "plausibly":
            labels.append(0.5)
        else:
            labels.append(0.0)
    return labels


# Map probabilities to 3 classes (0.0, 0.5, 1.0) based on thresholds
def map_to_3_classes(prob_array, low, high):
    """Map probabilities to 0.0, 0.5, 1.0 using thresholds."""
    mapped = np.zeros_like(prob_array)
    mapped[(prob_array > low) & (prob_array <= high)] = 0.5
    mapped[prob_array > high] = 1.0
    return mapped


def convert_to_label_strings(array):
    """Convert float label array to list of strings."""
    return [label_map[val] for val in array.flatten()]


def tune_thresholds(probs, true_labels, verbose=True):
    """Search for best (low, high) thresholds by macro F1 score."""
    best_macro_f1 = 0.0
    best_low, best_high = 0.0, 0.0
    # The true labels do not depend on the thresholds, so convert them once
    true_str = convert_to_label_strings(true_labels)

    for low in np.arange(0.2, 0.5, 0.05):
        for high in np.arange(0.55, 0.8, 0.05):
            if high <= low:
                continue

            pred_soft = map_to_3_classes(probs, low, high)
            pred_str = convert_to_label_strings(pred_soft)

            _, _, f1, _ = precision_recall_fscore_support(
                true_str,
                pred_str,
                labels=["no", "plausibly", "yes"],
                average="macro",
                zero_division=0,
            )
            if verbose:
                logger.info(f"low={low:.2f}, high={high:.2f} -> macro F1={f1:.3f}")
            if f1 > best_macro_f1:
                best_macro_f1 = f1
                best_low, best_high = low, high

    return best_low, best_high, best_macro_f1


def evaluate_model_with_thresholds(trainer, test_dataset):
    """Run full evaluation with automatic threshold tuning.

    This is a generator: it yields status strings, so callers must iterate it.
    """
    logger.info("\n🔍 Running model predictions...")
    yield "\n🔍 Running model predictions..."
    predictions = trainer.predict(test_dataset)
    # The model outputs logits; sigmoid turns them into per-label probabilities
    probs = torch.sigmoid(torch.tensor(predictions.predictions)).numpy()
    true_soft = np.array(predictions.label_ids)

    logger.info("\n🔎 Tuning thresholds...")
    yield "\n🔎 Tuning thresholds..."
    best_low, best_high, best_f1 = tune_thresholds(probs, true_soft)

    logger.info(f"\n✅ Best thresholds: low={best_low:.2f}, high={best_high:.2f} (macro F1={best_f1:.3f})")
    yield f"\n✅ Best thresholds: low={best_low:.2f}, high={best_high:.2f} (macro F1={best_f1:.3f})"

    final_pred_soft = map_to_3_classes(probs, best_low, best_high)
    final_pred_str = convert_to_label_strings(final_pred_soft)
    true_str = convert_to_label_strings(true_soft)

    logger.info("\n📊 Final Evaluation Report (multi-class per label):\n")
    yield "\n📊 Final Evaluation Report (multi-class per label):\n "
    # Build the report once and reuse it for both the log and the UI
    report = classification_report(
        true_str,
        final_pred_str,
        labels=["no", "plausibly", "yes"],
        digits=3,
        zero_division=0,
    )
    logger.info(report)
    yield report


def load_saved_model_and_tokenizer():
    tokenizer = DebertaV2Tokenizer.from_pretrained(MODEL_DIR)
    model = AutoModelForSequenceClassification.from_pretrained(MODEL_DIR).to(device)
    return tokenizer, model


def evaluate_saved_model(progress=gr.Progress(track_tqdm=True)):
    # Check the same directory that training saves to
    if MODEL_DIR.exists():
        yield "✅ Trained model found! Skipping training...\n"
    else:
        yield "❌ No trained model found. Please train the model first.\n"
        return

    try:
        logger.info("🔍 Loading saved model for evaluation...")
        yield "🔍 Loading saved model for evaluation...\n"
        tokenizer, model = load_saved_model_and_tokenizer()
        test_dataset = AbuseDataset(test_texts, test_labels, tokenizer)

        trainer = Trainer(
            model=model,
            args=TrainingArguments(
                output_dir="./results_eval",
                per_device_eval_batch_size=4,
                logging_dir="./logs_eval",
                disable_tqdm=True,
            ),
            eval_dataset=test_dataset,
        )

        # Re-yield status lines from the evaluation generator
        for line in evaluate_model_with_thresholds(trainer, test_dataset):
            yield line

        logger.info("✅ Evaluation complete.\n")
        yield "\n✅ Evaluation complete.\n"
    except Exception as e:
        logger.exception(f"❌ Evaluation failed: {e}")
        yield f"❌ Evaluation failed: {e}\n"


# Read the access token from a Hugging Face Space secret
token = os.environ.get("HF_TOKEN")

# Load dataset from Hugging Face Hub
path = hf_hub_download(
    repo_id="rshakked/abusive-relashionship-stories",
    filename="Abusive Relationship Stories - Technion & MSF.xlsx",
    repo_type="dataset",
    token=token,  # `token` replaces the deprecated `use_auth_token` argument
)
df = pd.read_excel(path)

# Define text and label columns
text_column = "post_body"
label_columns = [
    'emotional_violence', 'physical_violence', 'sexual_violence', 'spiritual_violence',
    'economic_violence', 'past_offenses', 'social_isolation', 'refuses_treatment',
    'suicidal_threats', 'mental_condition', 'daily_activity_control', 'violent_behavior',
    'unemployment', 'substance_use', 'obsessiveness', 'jealousy', 'outbursts', 'ptsd',
    'hard_childhood', 'emotional_dependency', 'prevention_of_care',
    'fear_based_relationship', 'humiliation', 'physical_threats',
    'presence_of_others_in_assault', 'signs_of_injury', 'property_damage',
    'access_to_weapons', 'gaslighting'
]

logger.info("Raw dataset shape: %s", df.shape)

# Clean data: keep only the text and label columns, then drop rows without text
df = df[[text_column] + label_columns]
logger.info("Shape after column selection: %s", df.shape)
df = df.dropna(subset=[text_column])
logger.info("Shape after dropping empty posts: %s", df.shape)

df["label_vector"] = df.apply(label_row_soft, axis=1)
label_matrix = df["label_vector"].tolist()

# Proper 3-way split: train / val / test
# (20% test, then 10% of the remainder for validation: roughly 72% / 8% / 20%)
train_val_texts, test_texts, train_val_labels, test_labels = train_test_split(
    df[text_column].tolist(), label_matrix, test_size=0.2, random_state=42
)

train_texts, val_texts, train_labels, val_labels = train_test_split(
    train_val_texts, train_val_labels, test_size=0.1, random_state=42
)

# model_name = "onlplab/alephbert-base"
model_name = "microsoft/deberta-v3-base"


def run_training(progress=gr.Progress(track_tqdm=True)):
    if MODEL_DIR.exists():
        yield "✅ Trained model found! Skipping training...\n"
        # evaluate_saved_model() is a generator, so forward its messages
        yield from evaluate_saved_model()
        return

    yield "🚀 Starting training...\n"
    try:
        logger.info("Starting training run...")

        # Load pretrained model for fine-tuning
        tokenizer = DebertaV2Tokenizer.from_pretrained(model_name)
        model = AutoModelForSequenceClassification.from_pretrained(
            model_name,
            num_labels=len(label_columns),
            problem_type="multi_label_classification",
        ).to(device)  # Move model to GPU if one is available

        # Gradient checkpointing helps cut memory use
        model.gradient_checkpointing_enable()

        # Freeze bottom 6 layers of DeBERTa encoder
        for name, param in model.named_parameters():
            if any(f"encoder.layer.{i}." in name for i in range(0, 6)):
                param.requires_grad = False

        train_dataset = AbuseDataset(train_texts, train_labels, tokenizer)
        val_dataset = AbuseDataset(val_texts, val_labels, tokenizer)
        test_dataset = AbuseDataset(test_texts, test_labels, tokenizer)

        # TrainingArguments for HuggingFace Trainer (logging, saving)
        training_args = TrainingArguments(
            output_dir="./results",
            num_train_epochs=3,
            per_device_train_batch_size=8,
            per_device_eval_batch_size=8,
            # Named `evaluation_strategy` in older transformers releases
            eval_strategy="epoch",
            save_strategy="epoch",
            logging_dir="./logs",
            logging_steps=500,
            disable_tqdm=True,
        )

        # Train using HuggingFace Trainer
        trainer = Trainer(
            model=model,
            args=training_args,
            train_dataset=train_dataset,
            eval_dataset=val_dataset,
        )

        logger.info("Training started with %d samples", len(train_dataset))
        yield "🔄 Training in progress...\n"

        # Simulated progress indicator: this loop runs before trainer.train()
        # and does not reflect actual training progress.
        total_steps = len(train_dataset) * training_args.num_train_epochs // training_args.per_device_train_batch_size
        intervals = max(total_steps // 20, 1)

        for i in range(0, total_steps, intervals):
            time.sleep(0.5)
            percent = int(100 * i / total_steps)
            progress(percent / 100)
            yield f"⏳ Progress: {percent}%\n"

        # # This checks if any tensor is on GPU too early.
        # logger.info("🧪 Sample device check from train_dataset:")
        # sample = train_dataset[0]
        # for k, v in sample.items():
        #     logger.info(f"{k}: {v.device}")

        # Start training!
        trainer.train()

        # Save the model and tokenizer
        MODEL_DIR.mkdir(parents=True, exist_ok=True)
        model.save_pretrained(MODEL_DIR)
        tokenizer.save_pretrained(MODEL_DIR)

        logger.info("Training completed and model saved.")
        yield "🎉 Training complete! Model saved.\n"

    except Exception as e:
        logger.exception(f"❌ Training failed: {e}")
        yield f"❌ Training failed: {e}\n"

    # Evaluation
    try:
        if "trainer" in locals():
            # The evaluation function is a generator; iterate it so it actually runs
            yield from evaluate_model_with_thresholds(trainer, test_dataset)
            logger.info("Evaluation completed")
    except Exception as e:
        logger.exception(f"Evaluation failed: {e}")

    # A generator's return value never reaches the UI, so yield the full log instead
    log_buffer.seek(0)
    yield log_buffer.read()


def push_model_to_hub():
    try:
        logger.info("🔄 Pushing model to Hugging Face Hub...")
        tokenizer, model = load_saved_model_and_tokenizer()
        # `token` replaces the deprecated `use_auth_token` argument
        model.push_to_hub("rshakked/abuse-detector-he-en", token=token)
        tokenizer.push_to_hub("rshakked/abuse-detector-he-en", token=token)
        return "✅ Model pushed to hub successfully!"
    except Exception as e:
        logger.exception("❌ Failed to push model to hub.")
        return f"❌ Failed to push model: {e}"