hieunguyen1053 committed
Commit 441c2e8 · 1 Parent(s): f9521a3

Update README.md
Files changed (1): README.md (+54 −15)
## 1. Important dates

- Aug 05, 2023: Team Registration opens
- Aug 31, 2023: Team Registration closes
- Sep 30, 2023: Release of test samples and evaluation instructions
- Nov 10, 2023: System submission deadline (API only)
- Nov 26, 2023: Technical report submission
- Dec 15, 2023: Result announcement (workshop day)

## 2. Task Description

In recent years, Large Language Models (LLMs) have gained widespread recognition and popularity worldwide, with models such as GPT-X, Bard, and LLaMA making significant strides in natural language processing tasks. In Vietnam, there is also growing interest in developing LLMs specifically tailored to the Vietnamese language. However, unlike for other languages, publicly accessible evaluation data for Vietnamese LLMs is significantly limited, and this scarcity presents a substantial obstacle for organizations seeking to establish uniform evaluation standards. The goal of VLSP2023-VLLMs is to promote the development of large language models for Vietnamese by constructing an evaluation dataset for VLLMs. This dataset differs from conventional datasets for downstream NLP tasks in that it focuses on 4 primary abilities, subdivided into 8 skills, and spanning 9 domains [1, 2].

### Abilities

- Logical thinking
  - Logical Correctness
- Background knowledge
  - Factuality
  - Commonsense Understanding
- Problem handling
  - Comprehension
  - Insightfulness
  - Metacognition
- User alignment
  - Harmlessness
  - Follow the Correct Instruction

### Domains

- Humanities: Communication, Education
- Language: Poetry, Literature
- Social Science: Business, Finance, Law
- History: History
- Culture: Food, Sports, Art, Music
- Technology: Marketing, Electronics, Engineering
- Math: Mathematics, Logic
- Natural Science: Biology, Chemistry, Physics
- Health: Healthcare, Exercise, Nutrition
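For participants who want to tally scores along this taxonomy, the lists above can be encoded as plain mappings. This is only a sketch: the grouping of skills under the four abilities is our reading of the stated 4-abilities/8-skills split, not an official specification.

```python
# Taxonomy from the task description: 4 abilities -> 8 skills, and 9 domains.
# The ability/skill grouping below is an assumed reading of the README's lists.
ABILITIES = {
    "Logical thinking": ["Logical Correctness"],
    "Background knowledge": ["Factuality", "Commonsense Understanding"],
    "Problem handling": ["Comprehension", "Insightfulness", "Metacognition"],
    "User alignment": ["Harmlessness", "Follow the Correct Instruction"],
}

DOMAINS = {
    "Humanities": ["Communication", "Education"],
    "Language": ["Poetry", "Literature"],
    "Social Science": ["Business", "Finance", "Law"],
    "History": ["History"],
    "Culture": ["Food", "Sports", "Art", "Music"],
    "Technology": ["Marketing", "Electronics", "Engineering"],
    "Math": ["Mathematics", "Logic"],
    "Natural Science": ["Biology", "Chemistry", "Physics"],
    "Health": ["Healthcare", "Exercise", "Nutrition"],
}

# Sanity checks against the counts stated in the task description.
assert len(ABILITIES) == 4
assert sum(len(skills) for skills in ABILITIES.values()) == 8
assert len(DOMAINS) == 9
```

A structure like this makes it straightforward to aggregate per-skill or per-domain scores once evaluation results are available.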
The teams participating in this challenge will build their own LLMs for Vietnamese, and they will be provided with a public test dataset and instructions on how to evaluate their models. The models entered in this competition remain the intellectual property of their respective development teams and are not required to be open source.

We will provide the participating teams with the following resources:
- The publicly shared pre-trained LLMs.
- Plain text datasets for Vietnamese.
- Instruction datasets.
- Sample examples of the evaluation dataset.

## 3. Evaluation

Results will be evaluated by both model-based evaluation and human-based evaluation.

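The announcement does not specify how model-based evaluation will be run. A common approach in recent work (e.g. FLASK-style rubric grading [2]) is to prompt a strong judge LLM to score each answer per skill. The sketch below is purely illustrative: the prompt wording, the 1-5 scale, and the stubbed `fake_judge` function are our assumptions, not part of the official protocol.

```python
import re


def build_judge_prompt(question: str, answer: str, skill: str) -> str:
    """Compose a rubric-grading prompt for a judge LLM (wording is illustrative)."""
    return (
        f"You are grading a Vietnamese LLM's answer for the skill: {skill}.\n"
        f"Question: {question}\n"
        f"Answer: {answer}\n"
        "Rate the answer on a 1-5 scale. Reply with 'Score: <n>' and a short reason."
    )


def parse_score(judge_reply: str) -> int:
    """Extract a 1-5 score from the judge's reply; raise if none is found."""
    m = re.search(r"Score:\s*([1-5])\b", judge_reply)
    if m is None:
        raise ValueError(f"no valid score in reply: {judge_reply!r}")
    return int(m.group(1))


# Stubbed judge for illustration; in practice this would call a judge-LLM API.
def fake_judge(prompt: str) -> str:
    return "Score: 4 - mostly factual, minor omissions."


prompt = build_judge_prompt("Thủ đô của Việt Nam là gì?", "Hà Nội.", "Factuality")
score = parse_score(fake_judge(prompt))  # -> 4 with this stub
```

Human-based evaluation would then complement such automatic rubric scores, e.g. to audit a sample of the judge's decisions.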
## 4. Registration

Link:

## 5. Resources

We will provide the participating teams with the following resources:
- The publicly shared pre-trained LLMs.
 
- Lê Anh Cường (TDTU)
- Nguyễn Trọng Hiếu (TDTU)
- Nguyễn Việt Cường (Intelligent Integration Co., Ltd.)
- Le-Minh Nguyen (JAIST, Japan Advanced Institute of Science and Technology)

## References

[1] Long Ouyang, Jeff Wu, Xu Jiang, Diogo Almeida, Carroll L. Wainwright, Pamela Mishkin, Chong Zhang, Sandhini Agarwal, Katarina Slama, Alex Ray, et al. Training language models to follow instructions with human feedback. arXiv preprint arXiv:2203.02155, 2022.

[2] Seonghyeon Ye, Doyoung Kim, Sungdong Kim, Hyeonbin Hwang, Seungone Kim, Yongrae Jo, James Thorne, Juho Kim, and Minjoon Seo. FLASK: Fine-grained Language Model Evaluation based on Alignment Skill Sets. arXiv preprint arXiv:2307.10928, 2023.