Zekun Wu committed
Commit · 64703c4
1 Parent(s): 7e568f5
update
app.py CHANGED
@@ -1,16 +1,61 @@
 import streamlit as st
 
 st.set_page_config(
-    page_title="
+    page_title="JobFair: Fairness Benchmark",
     page_icon="π",
 )
 
-st.title('JobFair: A Benchmark for Fairness in LLM Employment Decision')
-st.write("Welcome to JobFair! This benchmark is designed to evaluate the fairness of language models in employment decision-making. ")
-
-st.sidebar.success("Select a demo above.")
+st.title('JobFair: A Benchmark for Fairness in LLM Employment Decision-Making')
+st.write("Welcome to JobFair! This benchmark is designed to evaluate the fairness of language models in employment decision-making. Our goal is to provide a comprehensive tool for analyzing potential biases in how language models score resumes and make hiring recommendations.")
 
 st.markdown(
 """
+## About JobFair
+
+The JobFair benchmark enables users to:
+- **Upload and process** resumes to be evaluated by language models.
+- **Analyze fairness** through various statistical tests, correlations, and divergences.
+- **Download detailed evaluation results** for further review and reporting.
+
+### Key Features
+
+- **Fairness Analysis**: Perform a variety of statistical tests to uncover potential biases in language model evaluations.
+- **Comprehensive Reporting**: Generate detailed reports on the fairness of LLMs, including visualizations and downloadable data.
+- **User-Friendly Interface**: Easily upload data, run analyses, and download results through an intuitive web interface.
+
+### How to Use
+
+1. **Upload Data**: Start by uploading a CSV file containing the resumes and their respective scores.
+2. **Run Evaluations**: Use the provided tools to perform statistical analyses and visualize the results.
+3. **Download Results**: Export the analysis results for further examination and reporting.
+
+We hope JobFair helps you in making more informed and fair employment decisions using language models.
 """
-)
+)
+
+# Sidebar content
+st.sidebar.title("Demos")
+
+st.sidebar.subheader("Injection Demo")
+st.sidebar.markdown(
+"""
+In this demo, you can upload a dataset of resumes and use our language models to process and score them based on various parameters.
+
+- **Model Settings**: Configure your model settings by selecting the type of agent (GPTAgent or AzureAgent), and specifying the API key, endpoint URL, model name, temperature, and max tokens.
+- **Data Upload**: Choose to upload your own CSV file or use an example dataset.
+- **Process Data**: Enter the relevant details such as occupation, group name, privilege label, and protect label. Specify the number of runs and process the data to get the model's scores.
+- **Download Results**: After processing, download the generated results as a CSV file.
+"""
+)
+
+st.sidebar.subheader("Evaluation Demo")
+st.sidebar.markdown(
+"""
+In this demo, you can evaluate the fairness of the scores generated by the language models.
+
+- **Upload Results**: Upload the CSV file containing the processed results from the injection demo.
+- **Statistical Tests**: Perform a variety of statistical tests to evaluate potential biases in the scores.
+- **Correlations and Divergences**: Calculate correlations and divergences to further analyze the fairness of the results.
+- **Download Evaluation**: Download the comprehensive evaluation results for further analysis.
+"""
+)
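The page text in this commit describes an upload, evaluate, download flow but does not show the evaluation code itself. The following is a minimal sketch of what such a step could look like, using only the Python standard library. Everything in it is an assumption made for illustration: the `group` and `score` column names, the `privileged` and `protected` labels, the helper names, and the choice of Welch's t statistic and KL divergence as the "statistical test" and "divergence". It is not taken from the JobFair code base.

```python
import csv
import io
import math
import statistics
from collections import defaultdict

# Hypothetical results file: one row per scored resume. The column names
# and group labels are assumptions, not JobFair's actual schema.
EXAMPLE_CSV = """group,score
privileged,8
privileged,7
privileged,9
privileged,8
protected,6
protected,7
protected,5
protected,6
"""


def load_scores(csv_text):
    """Group numeric scores by the value in the 'group' column."""
    scores = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        scores[row["group"]].append(float(row["score"]))
    return dict(scores)


def welch_t(a, b):
    """Welch's t statistic for two samples with possibly unequal variances."""
    var_a, var_b = statistics.variance(a), statistics.variance(b)
    standard_error = math.sqrt(var_a / len(a) + var_b / len(b))
    return (statistics.mean(a) - statistics.mean(b)) / standard_error


def kl_divergence(a, b, bins=range(0, 11), eps=1e-9):
    """KL divergence D(a || b) between score histograms over integer bins."""
    def histogram(sample):
        counts = [sum(1 for s in sample if round(s) == k) for k in bins]
        total = sum(counts)
        # Small additive smoothing keeps the logarithm finite for empty bins.
        return [(c + eps) / (total + eps * len(counts)) for c in counts]

    p, q = histogram(a), histogram(b)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def evaluate(csv_text, privileged="privileged", protected="protected"):
    """Compare the two groups' scores and return a dict of metrics."""
    scores = load_scores(csv_text)
    a, b = scores[privileged], scores[protected]
    return {
        "mean_gap": statistics.mean(a) - statistics.mean(b),
        "welch_t": welch_t(a, b),
        "kl_divergence": kl_divergence(a, b),
    }


def to_csv(results):
    """Serialize the metrics as CSV text, ready to be offered as a download."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["metric", "value"])
    for name, value in results.items():
        writer.writerow([name, f"{value:.4f}"])
    return buffer.getvalue()


if __name__ == "__main__":
    print(to_csv(evaluate(EXAMPLE_CSV)))
```

In a Streamlit page, the CSV text would presumably come from an uploaded file and the string returned by `to_csv` would be handed to a download widget. A large positive `mean_gap` or `welch_t` would suggest the model scores the privileged group higher, though a real analysis would also need a p-value and a far larger sample than this toy data.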