- Lexical Entries and Baseforms
  (an entry may have multiple pronunciations and may correspond to a pause)
- HTK format
- consistent with acoustic & language models
- morphological analysis by ChaSen
- vocabulary: the most frequent words in four years of Mainichi newspaper articles (1991-1994)
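As a rough illustration of the points above, the sketch below reads a simplified HTK-style dictionary in which each line has the form `WORD [OUTSYM] phone phone ...`. The sample entries, the phone symbols, and the helper name `parse_lexicon` are made up for illustration and are not taken from the actual lexicon. Pronunciation probabilities and HTK quoting rules are ignored.

```python
from collections import defaultdict

def parse_lexicon(lines):
    """Map each lexical entry to a list of (output symbol, baseform) pairs.

    Assumes a simplified HTK-style line: WORD [OUTSYM] phone phone ...
    The bracketed output symbol is optional; "[]" means "output nothing".
    """
    lexicon = defaultdict(list)
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        word, rest = fields[0], fields[1:]
        output = word  # without a bracketed symbol, the entry itself is output
        if rest and rest[0].startswith("[") and rest[0].endswith("]"):
            output = rest[0][1:-1]
            rest = rest[1:]
        # Repeated entries accumulate as alternative pronunciations.
        lexicon[word].append((output, rest))
    return dict(lexicon)

# Hypothetical entries: one word with two baseforms, and a pause entry.
SAMPLE = """
nihon [nihon] n i h o N
nihon [nihon] n i q p o N
<pause> [] sp
"""

lexicon = parse_lexicon(SAMPLE.splitlines())
for word, prons in lexicon.items():
    print(word, prons)
```

Keeping a list of baseforms per entry lets one word carry several pronunciations, and an empty output symbol lets a pause entry be recognized without appearing in the transcription.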
Lexical Coverage
vocabulary size | coverage |
5000 | 88.3% |
20000 | 96.4% |
24000 | 97.0% |
53000 | 99.0% |
60000 | 99.2% |
101000 | 99.7% |
154000 | 99.9% |
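The table can be read as follows, assuming coverage means the fraction of running word tokens in a corpus that belong to the N most frequent words. The sketch below computes that quantity on a toy token list. The function name `coverage` and the data are illustrative only, and the source does not state which text the figures above were measured on.

```python
from collections import Counter

def coverage(tokens, vocab_size):
    """Fraction of tokens covered by the vocab_size most frequent word types."""
    counts = Counter(tokens)
    covered = sum(count for _, count in counts.most_common(vocab_size))
    return covered / len(tokens)

# Toy corpus: "a" occurs 4 times, "b" twice, "c" and "d" once each.
tokens = "a b a c a b d a".split()

for size in (1, 2, 4):
    print(size, coverage(tokens, size))
```

Because word frequencies are heavily skewed, each additional block of vocabulary covers fewer new tokens, which matches the diminishing gains in the table.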
Tatsuya Kawahara
5/31/2000