Hervé Bredin committed on
Commit
2dbbe55
·
1 Parent(s): 52200fc

doc: update README

Files changed (1)
  1. README.md +30 -28
README.md CHANGED
@@ -31,44 +31,24 @@ Relies on pyannote.audio 2.0 currently in development: see [installation instruc
31
  For commercial enquiries and scientific consulting, please contact [me](mailto:[email protected]).
32
  For [technical questions](https://github.com/pyannote/pyannote-audio/discussions) and [bug reports](https://github.com/pyannote/pyannote-audio/issues), please check [pyannote.audio](https://github.com/pyannote/pyannote-audio) Github repository.
33
 
34
- ## Basic usage
35
 
36
- ```python
37
- from pyannote.audio import Inference
38
- inference = Inference("pyannote/segmentation")
39
- segmentation = inference("audio.wav")
40
- # `segmentation` is a pyannote.core.SlidingWindowFeature
41
- # instance containing raw segmentation scores like the
42
- # one pictured above (output)
43
 
44
- from pyannote.audio.pipelines import Segmentation
45
- pipeline = Segmentation(segmentation="pyannote/segmentation")
 
46
  HYPER_PARAMETERS = {
47
  # onset/offset activation thresholds
48
  "onset": 0.5, "offset": 0.5,
49
- # remove speaker turn shorter than that many seconds.
50
  "min_duration_on": 0.0,
51
- # fill within speaker pauses shorter than that many seconds.
52
  "min_duration_off": 0.0
53
  }
54
-
55
- pipeline.instantiate(HYPER_PARAMETERS)
56
- segmentation = pipeline("audio.wav")
57
- # `segmentation` now is a pyannote.core.Annotation
58
- # instance containing a hard binary segmentation
59
- # like the one picutred above (reference)
60
- ```
61
-
62
-
63
- ## Advanced usage
64
-
65
- ### Voice activity detection
66
-
67
- ```python
68
- from pyannote.audio.pipelines import VoiceActivityDetection
69
- pipeline = VoiceActivityDetection(segmentation="pyannote/segmentation")
70
  pipeline.instantiate(HYPER_PARAMETERS)
71
  vad = pipeline("audio.wav")
 
72
  ```
73
 
74
  ### Overlapped speech detection
@@ -78,6 +58,7 @@ from pyannote.audio.pipelines import OverlappedSpeechDetection
78
  pipeline = OverlappedSpeechDetection(segmentation="pyannote/segmentation")
79
  pipeline.instantiate(HYPER_PARAMETERS)
80
  osd = pipeline("audio.wav")
 
81
  ```
82
 
83
  ### Resegmentation
@@ -91,6 +72,17 @@ resegmented_baseline = pipeline({"audio": "audio.wav", "baseline": baseline})
91
  # where `baseline` should be provided as a pyannote.core.Annotation instance
92
  ```
93
 
94
  ## Reproducible research
95
 
96
  In order to reproduce the results of the paper ["End-to-end speaker segmentation for overlap-aware resegmentation
@@ -118,6 +110,16 @@ Expected outputs (and VBx baseline) are also provided in the `/reproducible_rese
118
 
119
  ## Citation
120
 
121
  ```bibtex
122
  @inproceedings{Bredin2020,
123
  Title = {{pyannote.audio: neural building blocks for speaker diarization}},
 
31
  For commercial enquiries and scientific consulting, please contact [me](mailto:[email protected]).
32
  For [technical questions](https://github.com/pyannote/pyannote-audio/discussions) and [bug reports](https://github.com/pyannote/pyannote-audio/issues), please check [pyannote.audio](https://github.com/pyannote/pyannote-audio) Github repository.
33
 
34
+ ## Usage
35
 
36
+ ### Voice activity detection
37
 
38
+ ```python
39
+ from pyannote.audio.pipelines import VoiceActivityDetection
40
+ pipeline = VoiceActivityDetection(segmentation="pyannote/segmentation")
41
  HYPER_PARAMETERS = {
42
  # onset/offset activation thresholds
43
  "onset": 0.5, "offset": 0.5,
44
+ # remove speech regions shorter than that many seconds.
45
  "min_duration_on": 0.0,
46
+ # fill non-speech regions shorter than that many seconds.
47
  "min_duration_off": 0.0
48
  }
49
  pipeline.instantiate(HYPER_PARAMETERS)
50
  vad = pipeline("audio.wav")
51
+ # `vad` is a pyannote.core.Annotation instance containing speech regions
52
  ```
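The four entries of `HYPER_PARAMETERS` can be read as hysteresis thresholding followed by two clean-up passes. The sketch below illustrates that reading in plain Python on a list of frame-level speech scores. It is not pyannote.audio's implementation: `binarize`, `frame_step` and the toy scores are hypothetical names introduced for illustration, and the order of the clean-up passes (fill gaps first, then drop short regions) is an assumption.

```python
from typing import List, Tuple

Region = Tuple[float, float]  # (start, end) in seconds


def binarize(
    scores: List[float],
    frame_step: float,
    onset: float = 0.5,
    offset: float = 0.5,
    min_duration_on: float = 0.0,
    min_duration_off: float = 0.0,
) -> List[Region]:
    """Turn frame-level scores into (start, end) regions (illustrative only)."""
    regions: List[Region] = []
    active = False
    start = 0.0

    # Hysteresis thresholding: open a region when the score rises above
    # `onset`, close it when the score falls below `offset`.
    for i, score in enumerate(scores):
        t = i * frame_step
        if not active and score > onset:
            active, start = True, t
        elif active and score < offset:
            active = False
            regions.append((start, t))
    if active:
        # Close a region that is still open at the end of the signal.
        regions.append((start, len(scores) * frame_step))

    # Fill gaps shorter than `min_duration_off` by merging neighbours.
    merged: List[Region] = []
    for region in regions:
        if merged and region[0] - merged[-1][1] < min_duration_off:
            merged[-1] = (merged[-1][0], region[1])
        else:
            merged.append(region)

    # Drop regions shorter than `min_duration_on`.
    return [(s, e) for s, e in merged if e - s >= min_duration_on]


toy_scores = [0.1, 0.8, 0.9, 0.2, 0.7, 0.1, 0.1, 0.1, 0.6, 0.1]
print(binarize(toy_scores, frame_step=0.5))
print(binarize(toy_scores, frame_step=0.5, min_duration_off=1.0))
```

With the default values of `0.0` for both durations, neither clean-up pass changes anything, which matches the defaults shown in `HYPER_PARAMETERS`.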
53
 
54
  ### Overlapped speech detection
 
58
  pipeline = OverlappedSpeechDetection(segmentation="pyannote/segmentation")
59
  pipeline.instantiate(HYPER_PARAMETERS)
60
  osd = pipeline("audio.wav")
61
+ # `osd` is a pyannote.core.Annotation instance containing overlapped speech regions
62
  ```
63
 
64
  ### Resegmentation
 
72
  # where `baseline` should be provided as a pyannote.core.Annotation instance
73
  ```
74
 
75
+ ### Raw scores
76
+
77
+ ```python
78
+ from pyannote.audio import Inference
79
+ inference = Inference("pyannote/segmentation")
80
+ segmentation = inference("audio.wav")
81
+ # `segmentation` is a pyannote.core.SlidingWindowFeature
82
+ # instance containing raw segmentation scores like the
83
+ # one pictured above (output)
84
+ ```
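The raw scores hold one activation per speaker per frame, and the voice activity and overlapped speech pipelines can be thought of as reducing them to a single score per frame before thresholding. The sketch below shows one plausible reduction in plain Python: the highest activation for speech, and the second highest for overlap. That reduction is an assumption about how the pipelines work, not something stated in this README, and `speech_score` and `overlap_score` are hypothetical helper names.

```python
from typing import List


def speech_score(frame: List[float]) -> float:
    """Speech is likely as soon as at least one speaker is active."""
    return max(frame) if frame else 0.0


def overlap_score(frame: List[float]) -> float:
    """Overlap is likely only if at least two speakers are active,
    so use the second highest activation."""
    if len(frame) < 2:
        return 0.0
    return sorted(frame)[-2]


# Toy per-frame activations for three speakers (made-up values).
frames = [
    [0.9, 0.1, 0.0],  # one active speaker
    [0.8, 0.7, 0.1],  # two active speakers
    [0.1, 0.0, 0.0],  # silence
]
print([speech_score(f) for f in frames])
print([overlap_score(f) for f in frames])
```

Only the second toy frame yields a high overlap score, while the first two frames both yield a high speech score.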
85
+
86
  ## Reproducible research
87
 
88
  In order to reproduce the results of the paper ["End-to-end speaker segmentation for overlap-aware resegmentation
 
110
 
111
  ## Citation
112
 
113
+ ```bibtex
114
+ @inproceedings{Bredin2021,
115
+ Title = {{End-to-end speaker segmentation for overlap-aware resegmentation}},
116
+ Author = {{Bredin}, Herv{\'e} and {Laurent}, Antoine},
117
+ Booktitle = {Proc. Interspeech 2021},
118
+ Address = {Brno, Czech Republic},
119
+ Month = {August},
120
+ Year = {2021},
121
+ ```
122
+
123
  ```bibtex
124
  @inproceedings{Bredin2020,
125
  Title = {{pyannote.audio: neural building blocks for speaker diarization}},