Hervé Bredin committed on
Commit
db94671
·
1 Parent(s): e900ca9

feat: rename /paper to /reproducible_research

Files changed (24)
  1. README.md +23 -41
  2. {paper → reproducible_research}/dihard3_custom_split/development.txt +0 -0
  3. {paper → reproducible_research}/dihard3_custom_split/train.txt +0 -0
  4. {paper → reproducible_research}/expected_outputs/osd/AMI.development.rttm +0 -0
  5. {paper → reproducible_research}/expected_outputs/osd/AMI.test.rttm +0 -0
  6. {paper → reproducible_research}/expected_outputs/osd/DIHARD.development.rttm +0 -0
  7. {paper → reproducible_research}/expected_outputs/osd/DIHARD.test.rttm +0 -0
  8. {paper → reproducible_research}/expected_outputs/osd/VoxConverse.development.rttm +0 -0
  9. {paper → reproducible_research}/expected_outputs/osd/VoxConverse.test.rttm +0 -0
  10. {paper → reproducible_research}/expected_outputs/rsg/AMI.development.rttm +0 -0
  11. {paper → reproducible_research}/expected_outputs/rsg/AMI.test.rttm +0 -0
  12. {paper → reproducible_research}/expected_outputs/rsg/DIHARD.development.rttm +0 -0
  13. {paper → reproducible_research}/expected_outputs/rsg/DIHARD.test.rttm +0 -0
  14. {paper → reproducible_research}/expected_outputs/rsg/VoxConverse.development.rttm +0 -0
  15. {paper → reproducible_research}/expected_outputs/vad/AMI.development.rttm +0 -0
  16. {paper → reproducible_research}/expected_outputs/vad/AMI.test.rttm +0 -0
  17. {paper → reproducible_research}/expected_outputs/vad/DIHARD.development.rttm +0 -0
  18. {paper → reproducible_research}/expected_outputs/vad/DIHARD.test.rttm +0 -0
  19. {paper → reproducible_research}/expected_outputs/vad/VoxConverse.development.rttm +0 -0
  20. {paper → reproducible_research}/expected_outputs/vad/VoxConverse.test.rttm +0 -0
  21. {paper → reproducible_research}/expected_outputs/vbx/AMI.rttm +0 -0
  22. {paper → reproducible_research}/expected_outputs/vbx/DIHARD.rttm +0 -0
  23. {paper → reproducible_research}/expected_outputs/vbx/VoxConverse.rttm +0 -0
  24. {paper → reproducible_research}/report.pdf +0 -0
README.md CHANGED
@@ -19,13 +19,9 @@ inference: false
19
 
20
  # pyannote.audio // speaker segmentation
21
 
22
- This model is described in the technical report *[End-to-end speaker segmentation for overlap-aware resegmentation](paper/report.pdf)*, by Hervé Bredin and Antoine Laurent.
23
-
24
  ![Example](example.png)
25
 
26
- ## Citation
27
-
28
- If you use this model for academic research, please consider citing the `pyannote.audio` library:
29
 
30
  ```bibtex
31
  @inproceedings{Bredin2020,
@@ -40,7 +36,8 @@ If you use this model for academic research, please consider citing the `pyannot
40
 
41
  ## Support
42
 
43
- If you (would like to) use this model in commercial products and need help to make the most of it, please contact [me](mailto:[email protected]).
 
44
 
45
  ## Requirements
46
 
@@ -90,16 +87,6 @@ pipeline.instantiate(HYPER_PARAMETERS)
90
  vad = pipeline("audio.wav")
91
  ```
92
 
93
- In order to reproduce results of the [technical report](paper/report.pdf), one should use the following hyper-parameter values:
94
-
95
- Dataset | `onset` | `offset` | `min_duration_on` | `min_duration_off`
96
- ----------------|---------|----------|-------------------|-------------------
97
- AMI Mix-Headset | 0.851 | 0.430 | 0.115 | 0.146
98
- DIHARD3 | 0.855 | 0.292 | 0.036 | 0.001
99
- VoxConverse | 0.883 | 0.688 | 0.106 | 0.526
100
-
101
- We also provide the [expected output](tree/main/paper/expected_outputs/vad) on those three datasets in RTTM format.
102
-
103
  ### Overlapped speech detection
104
 
105
  ```python
@@ -109,16 +96,6 @@ pipeline.instantiate(HYPER_PARAMETERS)
109
  osd = pipeline("audio.wav")
110
  ```
111
 
112
- In order to reproduce results of the [technical report](paper/report.pdf), one should use the following hyper-parameter values:
113
-
114
- Dataset | `onset` | `offset` | `min_duration_on` | `min_duration_off`
115
- ----------------|---------|----------|-------------------|-------------------
116
- AMI Mix-Headset | 0.552 | 0.311 | 0.131 | 0.180
117
- DIHARD3 | 0.564 | 0.264 | 0.158 | 0.080
118
- VoxConverse | 0.617 | 0.387 | 0.367 | 0.334
119
-
120
- We also provide the [expected output](tree/main/paper/expected_outputs/osd) on those three datasets in RTTM format.
121
-
122
  ### Resegmentation
123
 
124
  ```python
@@ -126,27 +103,32 @@ from pyannote.audio.pipelines import Resegmentation
126
  pipeline = Resegmentation(segmentation="pyannote/segmentation",
127
  diarization="baseline")
128
  pipeline.instantiate(HYPER_PARAMETERS)
 
 
129
  ```
130
 
131
- In order to reproduce (VBx) results of the technical report, one should use the following hyper-parameter values:
 
 
 
132
 
133
- Dataset | `onset` | `offset` | `min_duration_on` | `min_duration_off`
134
  ----------------|---------|----------|-------------------|-------------------
135
  AMI Mix-Headset | 0.542 | 0.527 | 0.044 | 0.705
136
  DIHARD3 | 0.592 | 0.489 | 0.163 | 0.182
137
  VoxConverse | 0.537 | 0.724 | 0.410 | 0.563
138
 
139
-
140
-
141
- [VBx RTTM files](tree/main/paper/expected_outputs/vbx) are also provided in this repository for convenience:
142
-
143
- ```python
144
- from pyannote.database.utils import load_rttm
145
- vbx = load_rttm("paper/expected_outputs/vbx/DIHARD.rttm")
146
- resegmented_vbx = pipeline({"audio": "DH_EVAL_000.wav",
147
- "baseline": vbx["DH_EVAL_000"]})
148
- ```
149
-
150
-
151
- We also provide the [expected output](tree/main/paper/expected_outputs/rsg) on those three datasets in RTTM format.
152
 
 
19
 
20
  # pyannote.audio // speaker segmentation
21
 
 
 
22
  ![Example](example.png)
23
 
24
+ Model from *[End-to-end speaker segmentation for overlap-aware resegmentation](reproducible_research/report.pdf)*, by Hervé Bredin and Antoine Laurent.
 
 
25
 
26
  ```bibtex
27
  @inproceedings{Bredin2020,
 
36
 
37
  ## Support
38
 
39
+ For commercial enquiries and scientific consulting, please contact [me](mailto:[email protected]).
40
+ For [technical questions](https://github.com/pyannote/pyannote-audio/discussions) and [bug reports](https://github.com/pyannote/pyannote-audio/issues), please check the [pyannote.audio](https://github.com/pyannote/pyannote-audio) GitHub repository.
41
 
42
  ## Requirements
43
 
 
87
  vad = pipeline("audio.wav")
88
  ```
89
 
90
  ### Overlapped speech detection
91
 
92
  ```python
 
96
  osd = pipeline("audio.wav")
97
  ```
98
 
99
  ### Resegmentation
100
 
101
  ```python
 
103
  pipeline = Resegmentation(segmentation="pyannote/segmentation",
104
  diarization="baseline")
105
  pipeline.instantiate(HYPER_PARAMETERS)
106
+ resegmented_baseline = pipeline({"audio": "audio.wav", "baseline": baseline})
107
+ # where `baseline` should be provided as a pyannote.core.Annotation instance
108
  ```
109
 
110
+ ## Reproducible research
111
+
112
+ In order to reproduce the results of the paper ["End-to-end speaker segmentation for overlap-aware resegmentation
113
+ "](reproducible_research/report.pdf), use the following hyper-parameters:
114
 
115
+ Voice activity detection | `onset` | `offset` | `min_duration_on` | `min_duration_off`
116
+ ----------------|---------|----------|-------------------|-------------------
117
+ AMI Mix-Headset | 0.851 | 0.430 | 0.115 | 0.146
118
+ DIHARD3 | 0.855 | 0.292 | 0.036 | 0.001
119
+ VoxConverse | 0.883 | 0.688 | 0.106 | 0.526
120
+
121
+ Overlapped speech detection | `onset` | `offset` | `min_duration_on` | `min_duration_off`
122
+ ----------------|---------|----------|-------------------|-------------------
123
+ AMI Mix-Headset | 0.552 | 0.311 | 0.131 | 0.180
124
+ DIHARD3 | 0.564 | 0.264 | 0.158 | 0.080
125
+ VoxConverse | 0.617 | 0.387 | 0.367 | 0.334
126
+
127
+ VBx resegmentation | `onset` | `offset` | `min_duration_on` | `min_duration_off`
128
  ----------------|---------|----------|-------------------|-------------------
129
  AMI Mix-Headset | 0.542 | 0.527 | 0.044 | 0.705
130
  DIHARD3 | 0.592 | 0.489 | 0.163 | 0.182
131
  VoxConverse | 0.537 | 0.724 | 0.410 | 0.563
132
 
133
+ Expected outputs (and VBx baseline) are also provided in the `/reproducible_research` sub-directories.
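
The three tables in the new "Reproducible research" section can be transcribed into a small lookup, so the values for a given task and dataset need not be retyped by hand. This is only a sketch: the dictionary layout, the task keys (`vad`, `osd`, `vbx_resegmentation`) and the helper name `hyper_parameters` are illustrative assumptions, not part of the repository. Only the numbers and the four parameter names come from the README.

```python
# Values transcribed from the README tables; the layout below is an assumption.
PARAMETER_NAMES = ("onset", "offset", "min_duration_on", "min_duration_off")

REPORTED_VALUES = {
    "vad": {
        "AMI Mix-Headset": (0.851, 0.430, 0.115, 0.146),
        "DIHARD3": (0.855, 0.292, 0.036, 0.001),
        "VoxConverse": (0.883, 0.688, 0.106, 0.526),
    },
    "osd": {
        "AMI Mix-Headset": (0.552, 0.311, 0.131, 0.180),
        "DIHARD3": (0.564, 0.264, 0.158, 0.080),
        "VoxConverse": (0.617, 0.387, 0.367, 0.334),
    },
    "vbx_resegmentation": {
        "AMI Mix-Headset": (0.542, 0.527, 0.044, 0.705),
        "DIHARD3": (0.592, 0.489, 0.163, 0.182),
        "VoxConverse": (0.537, 0.724, 0.410, 0.563),
    },
}


def hyper_parameters(task: str, dataset: str) -> dict:
    """Return the reported values for one task/dataset pair as a dict."""
    # Raises KeyError for a task or dataset that is not in the tables.
    return dict(zip(PARAMETER_NAMES, REPORTED_VALUES[task][dataset]))


# Hypothetical usage: pick the voice activity detection values for AMI.
HYPER_PARAMETERS = hyper_parameters("vad", "AMI Mix-Headset")
print(HYPER_PARAMETERS)
```

The resulting dictionary is meant to play the role of `HYPER_PARAMETERS` in the `pipeline.instantiate(HYPER_PARAMETERS)` calls shown in the README snippets.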
134
 
{paper → reproducible_research}/dihard3_custom_split/development.txt RENAMED
File without changes
{paper → reproducible_research}/dihard3_custom_split/train.txt RENAMED
File without changes
{paper → reproducible_research}/expected_outputs/osd/AMI.development.rttm RENAMED
File without changes
{paper → reproducible_research}/expected_outputs/osd/AMI.test.rttm RENAMED
File without changes
{paper → reproducible_research}/expected_outputs/osd/DIHARD.development.rttm RENAMED
File without changes
{paper → reproducible_research}/expected_outputs/osd/DIHARD.test.rttm RENAMED
File without changes
{paper → reproducible_research}/expected_outputs/osd/VoxConverse.development.rttm RENAMED
File without changes
{paper → reproducible_research}/expected_outputs/osd/VoxConverse.test.rttm RENAMED
File without changes
{paper → reproducible_research}/expected_outputs/rsg/AMI.development.rttm RENAMED
File without changes
{paper → reproducible_research}/expected_outputs/rsg/AMI.test.rttm RENAMED
File without changes
{paper → reproducible_research}/expected_outputs/rsg/DIHARD.development.rttm RENAMED
File without changes
{paper → reproducible_research}/expected_outputs/rsg/DIHARD.test.rttm RENAMED
File without changes
{paper → reproducible_research}/expected_outputs/rsg/VoxConverse.development.rttm RENAMED
File without changes
{paper → reproducible_research}/expected_outputs/vad/AMI.development.rttm RENAMED
File without changes
{paper → reproducible_research}/expected_outputs/vad/AMI.test.rttm RENAMED
File without changes
{paper → reproducible_research}/expected_outputs/vad/DIHARD.development.rttm RENAMED
File without changes
{paper → reproducible_research}/expected_outputs/vad/DIHARD.test.rttm RENAMED
File without changes
{paper → reproducible_research}/expected_outputs/vad/VoxConverse.development.rttm RENAMED
File without changes
{paper → reproducible_research}/expected_outputs/vad/VoxConverse.test.rttm RENAMED
File without changes
{paper → reproducible_research}/expected_outputs/vbx/AMI.rttm RENAMED
File without changes
{paper → reproducible_research}/expected_outputs/vbx/DIHARD.rttm RENAMED
File without changes
{paper → reproducible_research}/expected_outputs/vbx/VoxConverse.rttm RENAMED
File without changes
{paper → reproducible_research}/report.pdf RENAMED
File without changes
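
The renamed `.rttm` files hold the expected outputs in RTTM, a plain-text format with one `SPEAKER` line per speech turn. For readers who want to inspect such a file without extra dependencies, here is a minimal standard-library sketch. The function name `parse_rttm`, the `Turn` type, and the two sample lines are made up for illustration and are not taken from the repository files; the removed README lines used `load_rttm` from `pyannote.database.utils` for the same purpose.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple


class Turn(NamedTuple):
    start: float      # turn onset, in seconds
    duration: float   # turn duration, in seconds
    speaker: str      # speaker label


def parse_rttm(text: str) -> Dict[str, List[Turn]]:
    """Group SPEAKER turns by file identifier (second RTTM field)."""
    turns = defaultdict(list)
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines and any record type other than SPEAKER.
        if len(fields) < 8 or fields[0] != "SPEAKER":
            continue
        uri = fields[1]
        turns[uri].append(Turn(float(fields[3]), float(fields[4]), fields[7]))
    return dict(turns)


# Illustrative sample only; not copied from the expected outputs.
SAMPLE = """\
SPEAKER DH_EVAL_000 1 0.500 1.250 <NA> <NA> spk0 <NA> <NA>
SPEAKER DH_EVAL_000 1 2.000 0.750 <NA> <NA> spk1 <NA> <NA>
"""

parsed = parse_rttm(SAMPLE)
for turn in parsed["DH_EVAL_000"]:
    end = turn.start + turn.duration
    print(f"{turn.speaker}: {turn.start:.3f}s to {end:.3f}s")
```

To feed a baseline into the `Resegmentation` pipeline, the README asks for a `pyannote.core.Annotation` instance, so a parser like this one is useful for inspection rather than as a drop-in replacement for `load_rttm`.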