Title Page
• Title: Term Project
• Authors: Saksham Lakhera and Ahmed Zaher
• Course: CSE 555 – Introduction to Pattern Recognition
• Date: July 20, 2025
Abstract
NLP Engineering Perspective
This project addresses the challenge of improving recipe recommendation systems
through advanced semantic search built on transformer-based language models.
Traditional keyword-based search methods often fail to capture the nuanced relationships
between ingredients, cooking techniques, and user preferences in culinary contexts.
Our approach fine-tunes BERT (Bidirectional Encoder Representations from Transformers)
on a custom recipe dataset to develop a semantic understanding of culinary content.
We preprocessed and structured a subset of 15,000 recipes into standardized sequences organized by
food categories (proteins, vegetables, legumes, etc.) to create training data suited to the BERT architecture.
The model was fine-tuned to learn contextual embeddings that capture semantic relationships between ingredients and
tags. Finally, we generated embeddings for all recipes in our dataset and implemented a cosine
similarity-based retrieval system that returns the top-K most relevant recipes for a user's search query.
Our evaluation demonstrates [PLACEHOLDER: key quantitative results - e.g., Recall@10 = X.XX, MRR = X.XX, improvement
over baseline = +XX%]. This work provides practical experience in transformer fine-tuning
for domain-specific applications and demonstrates the effectiveness of structured data preprocessing
for improving semantic search in the culinary domain.
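The retrieval step described above can be sketched as follows. This is a minimal illustration, not the project's actual code: the recipe IDs and three-dimensional vectors are toy stand-ins for the 768-dimensional BERT embeddings the system would store.

```python
import math

def cosine(u, v):
    # Cosine similarity between two equal-length vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def top_k_recipes(query_emb, recipe_embs, k=10):
    # recipe_embs maps recipe_id -> embedding vector (here, toy 3-d lists;
    # in the real system, vectors produced by the fine-tuned BERT encoder).
    scored = [(rid, cosine(query_emb, emb)) for rid, emb in recipe_embs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Toy index of three recipes; highest-scoring IDs come back first.
embs = {"lentil_soup": [0.9, 0.1, 0.0],
        "beef_stew":   [0.1, 0.9, 0.2],
        "dal_curry":   [0.8, 0.2, 0.1]}
print(top_k_recipes([1.0, 0.0, 0.0], embs, k=2))
```

The sort-then-slice approach is fine at this dataset's scale; a larger index would typically swap in an approximate nearest-neighbor structure.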
Computer-Vision Engineering Perspective
(Reserved – to be completed by CV author)
Introduction
NLP Engineering Perspective
This term project, carried out for CSE 555, serves primarily as an educational exercise aimed at giving graduate students end-to-end exposure to building a modern NLP system. Our goal is to construct a semantic recipe-search engine that demonstrates how domain-specific fine-tuning of BERT can substantially improve retrieval quality over simple keyword matching. We created a preprocessing pipeline that restructures 15,000 recipes into standardized ingredient-sequence representations and then fine-tuned BERT on this corpus. Key contributions include (i) a cleaned, category-labelled recipe subset, (ii) training scripts that yield domain-adapted contextual embeddings, and (iii) a production-ready retrieval service that returns the top-K most relevant recipes for an arbitrary user query via cosine-similarity ranking. A comparative evaluation against classical lexical baselines will be presented in Section 9 [PLACEHOLDER: baseline summary]. The project thus provides a compact blueprint of the full NLP workflow, from data curation through deployment.
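The ingredient-sequence restructuring can be illustrated with a small sketch. The category taxonomy, lookup table, and field names below are hypothetical placeholders; the actual pipeline and taxonomy are described in the Dataset and Pre-processing section.

```python
# Hypothetical category ordering and lookup; the project's real taxonomy may differ.
CATEGORY_ORDER = ["protein", "vegetable", "legume", "grain", "other"]
CATEGORY_OF = {"chicken": "protein", "tofu": "protein",
               "onion": "vegetable", "spinach": "vegetable",
               "lentils": "legume", "rice": "grain"}

def to_sequence(recipe):
    # Bucket ingredients by food category, then emit them in a fixed
    # category order so every recipe becomes a standardized token
    # sequence suitable as fine-tuning input, with tags appended.
    buckets = {cat: [] for cat in CATEGORY_ORDER}
    for ing in recipe["ingredients"]:
        buckets[CATEGORY_OF.get(ing.lower(), "other")].append(ing.lower())
    tokens = [ing for cat in CATEGORY_ORDER for ing in sorted(buckets[cat])]
    return " ".join(tokens + recipe.get("tags", []))

print(to_sequence({"ingredients": ["Onion", "Chicken", "Rice"],
                   "tags": ["dinner", "easy"]}))
# -> "chicken onion rice dinner easy"
```

Fixing the category order means two recipes with the same ingredients always yield the same sequence, which removes spurious ordering variation from the training data.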
Computer-Vision Engineering Perspective
The Computer-Vision track followed a three-phase pipeline designed to simulate the data-engineering challenges of real-world projects. Phase 1 consisted of collecting more than 6,000 food photographs under diverse lighting conditions and backgrounds, deliberately introducing noise to improve model robustness. Phase 2 handled image preprocessing, augmentation, and the subsequent training and evaluation of a convolutional neural network whose weights capture salient visual features of dishes. Phase 3 integrated the trained network into the shared web application so that users can upload an image and receive 5–10 recipe recommendations that match both visually and semantically. Detailed architecture choices and quantitative results will be provided in later sections [PLACEHOLDER: CV performance metrics].
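Phase 3's image-to-recommendation flow can be sketched as below. The `classify` stub, dish labels, and recipe index are illustrative stand-ins: the real system would run a forward pass of the trained CNN and query the shared recipe store.

```python
def classify(image_bytes):
    # Stand-in for the trained CNN: returns (predicted dish label, confidence).
    # The real model would preprocess the image and run inference.
    return "lentil_soup", 0.93

# Toy label-to-recipes index; the deployed app draws on the full recipe dataset.
RECIPES_BY_DISH = {
    "lentil_soup": ["red lentil soup", "spiced dal", "lentil stew",
                    "coconut dal", "hearty lentil soup"],
}

def recommend(image_bytes, k=5):
    # Classify the uploaded photo, then return up to k matching recipes.
    dish, confidence = classify(image_bytes)
    return dish, RECIPES_BY_DISH.get(dish, [])[:k]

dish, recs = recommend(b"<uploaded image>", k=5)
print(dish, recs)
```

In the deployed pipeline the candidate recipes for the predicted dish would additionally be re-ranked by the NLP track's embedding similarity, giving the visual-plus-semantic matching described above.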
Background / Related Work
  • Survey of prior methods and the state of the art
  • Clear positioning of your approach relative to existing literature
Dataset and Pre-processing
  • Data source(s), collection or selection criteria
  • Cleaning, normalization, augmentation, class balancing, etc.
Methodology
  • Theoretical foundations and algorithms used
  • Model architecture, feature extraction, hyper-parameters
  • Assumptions and justifications
Experimental Setup
  • Hardware / software environment
  • Train / validation / test split, cross-validation strategy
  • Evaluation metrics (accuracy, F1-score, ROC-AUC, etc.)
Results
  • Quantitative tables and charts
  • Qualitative examples (e.g., confusion matrix, sample outputs)
  • Statistical significance tests where applicable
Discussion
  • Interpretation of results (why methods worked or failed)
  • Comparison with baselines or published benchmarks
  • Limitations of your study
Conclusion
  • Recap of contributions and findings
  • Practical implications
Future Work
  • Concrete next steps or open problems
Acknowledgments (if appropriate)
  • Funding sources, collaborators, data providers
References
  • Properly formatted bibliography (IEEE, APA, etc.)
Appendices (optional)
  • Supplementary proofs, additional graphs, extensive tables, code snippets