Spaces: agents-course / Students_leaderboard

concept of the final assessment submission?

by crcdng - opened Apr 26

Discussion

crcdng

Apr 26

•

edited Apr 26

There seems to be a problem with the concept of the final assessment submission. Entries 2 to 6 on the current Students leaderboard were submitted by different users, but all point to the same code. I suspect that when you try somebody else's Space and enter your credentials, the result is logged under your name. Or is something else going on? 🧐

Example:

Rank 1: user susmitsil, with code https://huggingface.co/spaces/susmitsil/FinalAgenticAssessment/tree/main Seems legit.

Rank 2: user mmkkkaa, with code https://huggingface.co/spaces/baixianger/RobotPai/tree/main ???
The entries currently ranked 3 to 6 point to the same code (baixianger).
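
One way to check this observation is to group leaderboard rows by their code link and list every link submitted under more than one username. The sketch below works on plain dictionaries with `username` and `code` fields. The function name `shared_code_links` and the row for `placeholder_user` are made up for illustration; only the first two rows come from the example above.

```python
from collections import defaultdict

def shared_code_links(rows):
    """Map each code link to the sorted usernames that submitted it,
    keeping only links used by more than one username."""
    users_by_code = defaultdict(set)
    for row in rows:
        users_by_code[row["code"]].add(row["username"])
    return {
        code: sorted(users)
        for code, users in users_by_code.items()
        if len(users) > 1
    }

# Two rows from the example above plus one hypothetical row.
rows = [
    {"username": "susmitsil",
     "code": "https://huggingface.co/spaces/susmitsil/FinalAgenticAssessment/tree/main"},
    {"username": "mmkkkaa",
     "code": "https://huggingface.co/spaces/baixianger/RobotPai/tree/main"},
    {"username": "placeholder_user",  # hypothetical, not a real entry
     "code": "https://huggingface.co/spaces/baixianger/RobotPai/tree/main"},
]

for code, users in shared_code_links(rows).items():
    print(code, users)
```

A link that shows up here is not proof of cheating on its own; it only flags entries worth a closer look.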

romain-lembo

Apr 28

Indeed, simply running the “baixianger” project allows you to appear on the Students leaderboard.

martinsu

30 days ago

Well, none of this is protected: anyone can find the files with the right answers and submit them in a web request with any values they like.

mriusero

29 days ago

If you want to see your real rank:

Go directly to the dataset (https://huggingface.co/datasets/agents-course/unit4-students-scores).
Run the SQL query below, replacing your_username with your own username:

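-- Keep only submissions whose code link contains the submitter's own username
WITH FilteredTrains AS (
    SELECT *
    FROM train
    WHERE code LIKE '%' || username || '%'
),
-- Rank the remaining submissions by score, highest first
RankedTrains AS (
    SELECT
        code,
        username,
        score,
        RANK() OVER (ORDER BY score DESC) AS rank
    FROM
        FilteredTrains
)
-- Look up your own rank
SELECT
    rank
FROM
    RankedTrains
WHERE
    code LIKE '%' || 'your_username' || '%';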
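
For anyone who prefers to follow the logic outside the SQL console, here is a minimal Python sketch of the same idea on in-memory rows. It assumes rows are dictionaries with the `username`, `code` and `score` fields used in the query. The function name `real_rank` and all sample rows are hypothetical. It uses plain substring matching, so unlike `LIKE` it does not treat `_` or `%` inside a username as wildcards.

```python
def real_rank(rows, your_username):
    """Return the rank(s) of your_username among self-submitted entries."""
    # Mirrors: WHERE code LIKE '%' || username || '%'
    own = [r for r in rows if r["username"] in r["code"]]
    ranks = []
    for r in own:
        if your_username in r["code"]:
            # RANK() semantics: 1 + number of rows with a strictly higher score
            ranks.append(1 + sum(1 for o in own if o["score"] > r["score"]))
    return ranks

# Hypothetical sample data, not taken from the real dataset.
sample = [
    {"username": "alice", "code": "https://huggingface.co/spaces/alice/agent/tree/main", "score": 90},
    {"username": "bob",   "code": "https://huggingface.co/spaces/alice/agent/tree/main", "score": 95},
    {"username": "carol", "code": "https://huggingface.co/spaces/carol/agent/tree/main", "score": 90},
    {"username": "dave",  "code": "https://huggingface.co/spaces/dave/agent/tree/main",  "score": 60},
]

print(real_rank(sample, "alice"))
print(real_rank(sample, "dave"))
```

In the sample, bob's row is dropped because his code link belongs to alice, so it no longer pushes the other entries down. A list is returned because a user may have more than one matching row.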

martinsu

28 days ago

•

edited 28 days ago


That query is not a perfect solution: I ran my code locally, and it left my username out of the code URL when submitting. Yes, my bad on that part.
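
To make that limitation concrete, a submission can be sorted into three buckets instead of two: code link owned by the submitter, code link owned by someone else, or no recognizable owner at all. The sketch below assumes code links follow the `https://huggingface.co/spaces/<owner>/<name>/...` shape seen earlier in this thread. The function names are hypothetical, and what a locally generated link actually looks like is not known here, so anything that does not match the shape is simply reported as "unknown".

```python
from urllib.parse import urlparse

def space_owner(code_url):
    """Return the <owner> part of a /spaces/<owner>/<name> link, else None."""
    parts = [p for p in urlparse(code_url).path.split("/") if p]
    if len(parts) >= 3 and parts[0] == "spaces":
        return parts[1]
    return None

def classify(username, code_url):
    """Classify a submission as 'own', 'other' or 'unknown'."""
    owner = space_owner(code_url)
    if owner is None:
        return "unknown"  # e.g. a link that carries no Space owner
    return "own" if owner == username else "other"

print(classify("susmitsil",
               "https://huggingface.co/spaces/susmitsil/FinalAgenticAssessment/tree/main"))
print(classify("mmkkkaa",
               "https://huggingface.co/spaces/baixianger/RobotPai/tree/main"))
print(classify("anyone", ""))
```

Treating "unknown" separately avoids silently dropping honest local runs the way a pure username filter does.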

I think the best approach would be to build a custom, closed GAIA level 1 Q&A set where nobody knows the real answers, say about 20 questions, given to everyone as a contest within a short timeframe. That would cut cheating a lot.

florinhegedus

12 days ago

•

edited 12 days ago

I was thinking about the exact same problem. Many entries on the leaderboard use the same code; you can see this just by picking random entries. This lowers the credibility of the certification.

GatinhoEducado

6 days ago

•

edited 5 days ago

Yep. I launched someone else's Space just to see how it is supposed to work and ended up on the leaderboard. And there isn't even an option to delete or change the entry (typical Hugging Face). Moreover, I can't even test my agent, because without the PRO version only 10 LLM requests per month are available.
