import streamlit as st

# Page title
st.title("🛠️ Feature Engineering & Feature Selection")

# Feature engineering section
st.markdown("""
### ✨ Feature Engineering:
Several transformations were applied to prepare the dataset for modeling:

- **Encoding**: Used **Ordinal Encoding** to convert categorical variables such as Gender, Sleep Duration, and Dietary Habits into numerical values.
- **Scaling**: Applied **StandardScaler** to standardize numerical features such as CGPA, Age, and Schedule Pressure (zero mean, unit variance).
- **Data Cleaning**: Removed irrelevant or noisy columns that did not contribute to the prediction task.
- **Balancing**: Checked the target (`Depression`) for class imbalance so the model would not be biased toward the majority class.
""")

# Selected features section
st.markdown("""
### ✅ Selected Features:
The following features were retained for training the model, based on correlation analysis and domain relevance:

- Gender
- Age
- Academic Pressure
- Study Satisfaction
- Sleep Duration
- Dietary Habits
- Financial Stress
- CGPA
- Schedule Pressure
- Integration Complexity
""")

# Dropped features section
st.markdown("""
### 🚫 Dropped Features:
- Redundant or low-impact features such as `Job Satisfaction`, `Profession`, and `City`
- Highly correlated features that introduced multicollinearity

The refined dataset was then used to train the **KNN classifier** for depression prediction.
""")

# Navigation between pages of the multipage app
if st.button("Next >>"):
    st.switch_page(r"pages/5 Model Building.py")

if st.button("<< Back"):
    st.switch_page(r"pages/3 EDA.py")