import streamlit as st

# Page Title
st.title("🛠️ Feature Engineering & Feature Selection")

# Feature Engineering Section
st.markdown("""
### ✨ Feature Engineering:
Several transformations were applied to prepare the dataset for modeling:

- **Encoding**: Used **Ordinal Encoding** to convert categorical variables like Gender, Sleep Duration, and Dietary Habits into numerical values.  
- **Scaling**: Applied **StandardScaler** to standardize numerical features such as CGPA, Age, and Schedule Pressure (zero mean, unit variance).  
- **Data Cleaning**: Removed irrelevant or noisy columns that did not contribute to the prediction task.  
- **Balancing**: Checked the target (`Depression`) for class imbalance, since a skewed target can bias the model toward the majority class.
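
As a rough illustration (not the project's actual code), the encoding and scaling steps boil down to the plain-Python sketch below. The category order in `SLEEP_ORDER` and the sample values are assumptions made up for this example; in practice scikit-learn's `OrdinalEncoder` and `StandardScaler` do this work.

```python
from statistics import mean, pstdev

# Hypothetical category order; the order used in the project is not shown on this page.
SLEEP_ORDER = ["Less than 5 hours", "5-6 hours", "7-8 hours", "More than 8 hours"]


def ordinal_encode(values, order):
    # Map each category to its position in the given order.
    index = {category: i for i, category in enumerate(order)}
    return [index[v] for v in values]


def standard_scale(values):
    # Subtract the mean and divide by the population standard deviation,
    # which is the per-column computation StandardScaler performs.
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant column carries no information; return zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Made-up sample values for illustration only.
sleep_codes = ordinal_encode(["5-6 hours", "7-8 hours", "Less than 5 hours"], SLEEP_ORDER)
cgpa_scaled = standard_scale([6.0, 7.0, 8.0])
```

Scaling matters for a distance-based model such as KNN, because an unscaled feature with a large range would otherwise dominate the distance.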
""")

# Selected Features Section
st.markdown("""
### ✅ Selected Features:
The following features were retained for training the model based on correlation analysis and domain relevance:

- Gender  
- Age  
- Academic Pressure  
- Study Satisfaction  
- Sleep Duration  
- Dietary Habits  
- Financial Stress  
- CGPA  
- Schedule Pressure  
- Integration Complexity
""")

# Dropped Features Section
st.markdown("""
### 🚫 Dropped Features:
- Redundant or low-impact features such as `Job Satisfaction`, `Profession`, and `City`  
- Highly correlated features that introduced multicollinearity

The refined dataset was then used to train the **KNN classifier** for depression prediction.
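
A check for highly correlated features might look like the sketch below. It is an assumption-laden illustration rather than the project's code: the `0.9` threshold, the helper names, and the toy data (including the hypothetical `Study Hours` column) are all invented for this example.

```python
from math import sqrt


def pearson(x, y):
    # Pearson correlation coefficient; assumes neither column is constant.
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def drop_correlated(columns, threshold=0.9):
    # Keep a column only if its absolute correlation with every
    # already-kept column stays below the threshold.
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Made-up toy data: "Study Hours" nearly duplicates "Academic Pressure".
toy = {
    "Academic Pressure": [1, 2, 3, 4, 5],
    "Study Hours": [2, 4, 6, 8, 11],
    "Financial Stress": [3, 1, 4, 1, 5],
}
kept = drop_correlated(toy)
```

With this greedy rule, the first column of each correlated pair is the one that survives, so column order affects the result.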
""")

if st.button("Next >>"):
    st.switch_page(r"pages/5 Model Building.py")

if st.button("<< Back"):
    st.switch_page(r"pages/3 EDA.py")