Abhishek Thakur committed
Commit d84ea96 · 1 Parent(s): b565952
docs/source/competition_space.mdx CHANGED
@@ -1 +1,38 @@
- # Competition Space
+ # Competition Space
+
+ A competition space is a Hugging Face Space where the competition itself takes place: participants make submissions, get scores, view the leaderboard, and discuss the competition.
+
+ Check out an example competition space below:
+
+ ![competition space](https://github.com/abhishekkrthakur/public_images/blob/main/competition_space.png?raw=true)
+
+ The competition space consists of the following:
+
+ - Competition description (content is fetched from the private competition repo)
+ - Dataset description (content is fetched from the private competition repo)
+ - Leaderboard (content is fetched from the private competition repo)
+   - Public (available to everyone, all the time)
+   - Private (available to everyone, but only after the competition ends)
+ - Submissions
+   - Submission guidelines (content is fetched from the private competition repo)
+   - My submissions (users can see their own submissions)
+   - New submission (users can make new submissions)
+ - Discussions (accessible via the community tab)
+
+ ### Secrets
+
+ The competition space requires two secrets:
+
+ - `HF_TOKEN`: the Hugging Face write token of the user who created the competition space. This token must remain valid for the duration of the competition; if it expires, the competition space will stop working. If you change/refresh/delete this token, you will need to update this secret.
+ - `COMPETITION_ID`: the path of the private competition repo, e.g. `abhishek/private-competition-data`. If you rename the private competition repo, you will need to update this secret.
+
+ Note: The above two secrets are crucial for the competition space to work!
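+
+ As a minimal sketch (an assumption, not part of the official setup flow): if you manage the Space with the `huggingface_hub` client, both secrets can be set programmatically. The Space id `my-org/my-competition` and the token value below are hypothetical placeholders.
+
+ ```python
+ from huggingface_hub import HfApi
+
+ api = HfApi()  # authenticates with your locally saved token by default
+
+ # Hypothetical Space id and secret values -- replace with your own.
+ api.add_space_secret("my-org/my-competition", "HF_TOKEN", "hf_xxxxxxxx")
+ api.add_space_secret(
+     "my-org/my-competition",
+     "COMPETITION_ID",
+     "abhishek/private-competition-data",
+ )
+ ```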
+
+ ### Public & private competition spaces
+
+ A competition space can be public or private. A public competition space is available to everyone, all the time.
+ A private competition space is only available to members of the organization in which the competition space was created.
+
+ You can make the competition public at any point.
+
+ Generally, we recommend testing every aspect of the competition in a private competition space before making it public.
docs/source/custom_metric.mdx CHANGED
@@ -0,0 +1,138 @@
+ # Custom metric
+
+ If the default scikit-learn metrics don't suit your competition, you can define your own metric.
+
+ Here, we expect the organizer to know Python.
+
+ ### How to define a custom metric
+
+ To define a custom metric, change `EVAL_METRIC` in `conf.json` to `custom`. You must also make sure that `EVAL_HIGHER_IS_BETTER` is set to `1` or `0`, depending on whether a higher value of the metric is better or not.
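+
+ For illustration, the relevant part of `conf.json` could then look like this (a sketch showing only these two keys; your config will contain others):
+
+ ```json
+ {
+     "EVAL_METRIC": "custom",
+     "EVAL_HIGHER_IS_BETTER": 1
+ }
+ ```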
+
+ The second step is to create a file `metric.py` in the private competition repo.
+ The file should contain a `compute` function that takes the competition params as input.
+
+ Here is the part where we check whether the metric is custom and calculate the metric value:
+
+ ```python
+ import importlib
+ import os
+ import sys
+
+ from huggingface_hub import hf_hub_download
+
+
+ def compute_metrics(params):
+     if params.metric == "custom":
+         # Fetch the organizer-provided metric.py from the private competition repo
+         metric_file = hf_hub_download(
+             repo_id=params.competition_id,
+             filename="metric.py",
+             token=params.token,
+             repo_type="dataset",
+         )
+         # Import it as a module and delegate evaluation to its compute function
+         sys.path.append(os.path.dirname(metric_file))
+         metric = importlib.import_module("metric")
+         evaluation = metric.compute(params)
+         ...
+ ```
+
+ You can find the above part in the competitions GitHub repo, in `compute_metrics.py`.
+
+ `params` is defined as:
+
+ ```python
+ from typing import List
+
+ from pydantic import BaseModel
+
+
+ class EvalParams(BaseModel):
+     competition_id: str
+     competition_type: str
+     metric: str
+     token: str
+     team_id: str
+     submission_id: str
+     submission_id_col: str
+     submission_cols: List[str]
+     submission_rows: int
+     output_path: str
+     submission_repo: str
+     time_limit: int
+ ```
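+
+ For local testing of your `compute` function, you could construct a params object by hand. All values below are hypothetical placeholders, not what the platform actually passes:
+
+ ```python
+ params = EvalParams(
+     competition_id="abhishek/private-competition-data",
+     competition_type="generic",
+     metric="custom",
+     token="hf_xxxxxxxx",  # a write token with access to the private repo
+     team_id="team123",
+     submission_id="sub456",
+     submission_id_col="id",
+     submission_cols=["id", "pred"],
+     submission_rows=1000,
+     output_path="/tmp/eval",
+     submission_repo="abhishek/private-competition-data",
+     time_limit=3600,
+ )
+ ```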
+
+ You are free to do whatever you want in the `compute` function.
+ In the end, it must return a dictionary with the following keys:
+
+ ```python
+ {
+     "public_score": metric_value,
+     "private_score": metric_value,
+ }
+ ```
+
+ Public and private scores can be floats, or dictionaries in case you want to use multiple metrics.
+ Example for multiple metrics:
+
+ ```python
+ {
+     "public_score": {
+         "metric1": metric_value,
+         "metric2": metric_value,
+     },
+     "private_score": {
+         "metric1": metric_value,
+         "metric2": metric_value,
+     },
+ }
+ ```
+
+ Note: When using multiple metrics, the base metric (the first one) is used to rank the participants in the competition.
+
+ ### Example of a custom metric
+
+ ```python
+ import pandas as pd
+ from huggingface_hub import hf_hub_download
+
+
+ def compute(params):
+     # Download the solution file from the private competition repo
+     solution_file = hf_hub_download(
+         repo_id=params.competition_id,
+         filename="solution.csv",
+         token=params.token,
+         repo_type="dataset",
+     )
+     solution_df = pd.read_csv(solution_file)
+
+     # Download the participant's submission file
+     submission_filename = f"submissions/{params.team_id}-{params.submission_id}.csv"
+     submission_file = hf_hub_download(
+         repo_id=params.competition_id,
+         filename=submission_filename,
+         token=params.token,
+         repo_type="dataset",
+     )
+     submission_df = pd.read_csv(submission_file)
+
+     # Split ids into public and private leaderboard subsets
+     public_ids = solution_df[solution_df.split == "public"][params.submission_id_col].values
+     private_ids = solution_df[solution_df.split == "private"][params.submission_id_col].values
+
+     public_solution_df = solution_df[solution_df[params.submission_id_col].isin(public_ids)]
+     public_submission_df = submission_df[submission_df[params.submission_id_col].isin(public_ids)]
+
+     private_solution_df = solution_df[solution_df[params.submission_id_col].isin(private_ids)]
+     private_submission_df = submission_df[submission_df[params.submission_id_col].isin(private_ids)]
+
+     # Align solution and submission rows by id before scoring
+     public_solution_df = public_solution_df.sort_values(params.submission_id_col).reset_index(drop=True)
+     public_submission_df = public_submission_df.sort_values(params.submission_id_col).reset_index(drop=True)
+
+     private_solution_df = private_solution_df.sort_values(params.submission_id_col).reset_index(drop=True)
+     private_submission_df = private_submission_df.sort_values(params.submission_id_col).reset_index(drop=True)
+
+     # CALCULATE METRICS HERE.......
+     # _metric = SOME METRIC FUNCTION
+     target_cols = [col for col in solution_df.columns if col not in [params.submission_id_col, "split"]]
+     public_score = _metric(public_solution_df[target_cols], public_submission_df[target_cols])
+     private_score = _metric(private_solution_df[target_cols], private_submission_df[target_cols])
+
+     evaluation = {
+         "public_score": public_score,
+         "private_score": private_score,
+     }
+     return evaluation
+ ```
+
+ Take a careful look at the code above.
+ You can see that we download the solution file and the submission file from the dataset repo,
+ calculate the metric on the public and private splits of the solution and submission files,
+ and finally return the metric values in a dictionary.
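+
+ As one hypothetical choice for the `_metric` placeholder above (assuming a single regression target and scikit-learn available in the evaluation environment):
+
+ ```python
+ from sklearn.metrics import mean_squared_error
+
+
+ def _metric(solution, submission):
+     # RMSE between the solution and submission target columns
+     return mean_squared_error(solution, submission, squared=False)
+ ```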
docs/source/leaderboard.mdx CHANGED
@@ -0,0 +1 @@
+ # Understanding the leaderboard
docs/source/pricing.mdx CHANGED
@@ -2,7 +2,7 @@
 
 Creating a competition is free. However, you will need to pay for the compute resources used to run the competition. The cost of the compute resources depends on the type of competition you create.
 
- - generic: generic competitions are free to run. you can, however, upgrade the compute to cpu-basic to speed up the metric calculation and reduce the waiting time for the participants.
+ - generic: generic competitions are free to run. you can, however, upgrade the compute to cpu-upgrade to speed up the metric calculation and reduce the waiting time for the participants.
 
 - script: script competitions require a compute resource to run the participant's code. you can choose between a variety of cpu and gpu instances (T4, A10G and even A100). the cost of the compute resource is charged per hour.
 
docs/source/submit.mdx CHANGED
@@ -0,0 +1 @@
+ # Making a submission
docs/source/teams.mdx CHANGED
@@ -0,0 +1,3 @@
+ # Teaming up
+
+ Coming soon!