import streamlit as st

# Page Title
st.title("🛠️ Feature Engineering & Feature Selection")
# Feature Engineering Section
st.markdown("""
### ✨ Feature Engineering:
Several transformations were applied to prepare the dataset for modeling:
- **Encoding**: Used **Ordinal Encoding** to convert categorical variables like Gender, Sleep Duration, and Dietary Habits into numerical values.
- **Scaling**: Applied **StandardScaler** to standardize numerical features such as CGPA, Age, and Schedule Pressure to zero mean and unit variance.
- **Data Cleaning**: Removed irrelevant or noisy columns that did not contribute to the prediction task.
- **Balancing**: Checked the target (`Depression`) for class imbalance, since a skewed target can hurt model generalization.
""")
# Selected Features Section
st.markdown("""
### ✅ Selected Features:
The following features were retained for training the model based on correlation analysis and domain relevance:
- Gender
- Age
- Academic Pressure
- Study Satisfaction
- Sleep Duration
- Dietary Habits
- Financial Stress
- CGPA
- Schedule Pressure
- Integration Complexity
""")
# Dropped Features Section
st.markdown("""
### 🚫 Dropped Features:
- Redundant or low-impact features such as `Job Satisfaction`, `Profession`, and `City`
- Highly correlated features that introduced multicollinearity

The refined dataset was then used to train the **KNN classifier** for depression prediction.
""")
# Page navigation
if st.button("Next >>"):
    st.switch_page(r"pages/5 Model Building.py")
if st.button("<< Back"):
    st.switch_page(r"pages/3 EDA.py")