List of Acoustic Models
model | #states | #mixtures | gender |
monophone | 129 | 4, 8, 16 | GD, GI |
triphone 1000 | 1000 | 4, 8, 16 | GD |
triphone 2000 | 2000 | 4, 8, 16 | GD, GI |
triphone 3000 | 3000 | 4, 8, 16 | GD |
PTM triphone | 3000/129 | 64 | GD, GI |
(GD: gender-dependent, GI: gender-independent)
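The table can be turned into a rough count of Gaussians per model. This is only a sketch: it assumes "#mixtures" means Gaussians per state, and it reads the PTM entry "3000/129" as 3000 tied states sharing 129 Gaussian codebooks, each state keeping its own mixture weights. Neither reading is stated in the table itself, and the names below are illustrative.

```python
# Rows of the model table: (name, tied states, Gaussian codebooks, mixture sizes).
# ASSUMPTION: for the PTM row, "3000/129" is read as 3000 tied states that
# share 129 codebooks; for all other rows each state has its own codebook.
MODELS = [
    ("monophone", 129, 129, (4, 8, 16)),
    ("triphone 1000", 1000, 1000, (4, 8, 16)),
    ("triphone 2000", 2000, 2000, (4, 8, 16)),
    ("triphone 3000", 3000, 3000, (4, 8, 16)),
    ("PTM triphone", 3000, 129, (64,)),
]


def gaussian_count(codebooks, mixtures):
    """Number of Gaussian densities stored in the model."""
    return codebooks * mixtures


def weight_count(states, mixtures):
    """Number of mixture weights (one weight vector per tied state)."""
    return states * mixtures


if __name__ == "__main__":
    for name, states, codebooks, mixture_sizes in MODELS:
        for m in mixture_sizes:
            print(name, m, gaussian_count(codebooks, m), weight_count(states, m))
```

Under these assumptions the PTM model stores far fewer Gaussians than the 16-mixture triphone 3000 model while keeping the same number of state-level weight vectors.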
List of Japanese Phones
a i u e o a: i: u: e: o: N w y |
p py t k ky b by d dy g gy ts ch |
m my n ny h hy f s sh z j r ry |
q sp silB silE (pauses) |
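As a quick consistency check, the phone list can be counted and compared with the monophone entry in the model table. The sketch below assumes every phone, including the pause models, has 3 emitting states, as described in the HMM section; the variable names are illustrative.

```python
# Phone inventory copied from the list above.
PHONES = """
a i u e o a: i: u: e: o: N w y
p py t k ky b by d dy g gy ts ch
m my n ny h hy f s sh z j r ry
q sp silB silE
""".split()

# ASSUMPTION: 3 emitting states for every phone, pauses included.
EMITTING_STATES_PER_PHONE = 3

num_phones = len(PHONES)
num_monophone_states = num_phones * EMITTING_STATES_PER_PHONE

print(num_phones)            # 43
print(num_monophone_states)  # 129
```

The result of 129 agrees with the #states value listed for the monophone model.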
Training: ASJ (Acoustical Society of Japan) databases
20K sentences / 132 speakers for each gender
Acoustic Analysis
A/D | 16 kHz, 16 bit |
frame shift | 10 ms |
analysis | MFCC (12th order), LogPow |
CMN | done for whole utterance |
pattern: MFCC (12) + ΔMFCC (12) + ΔLogPow (1) (25 variables)
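A minimal sketch of how the 25-variable pattern could be assembled from per-frame MFCC and log power. It reads the pattern as 12 MFCC + 12 delta MFCC + 1 delta LogPow, which is an assumption based on the variable count. The delta formula (a two-frame central difference) and the function names are also assumptions; the actual front end may use a different regression window.

```python
import numpy as np

SAMPLE_RATE_HZ = 16_000
FRAME_SHIFT_MS = 10
# 10 ms at 16 kHz corresponds to 160 samples between successive frames.
FRAME_SHIFT_SAMPLES = SAMPLE_RATE_HZ * FRAME_SHIFT_MS // 1000


def delta(x):
    """Central difference over time, (x[t+1] - x[t-1]) / 2, with edge frames repeated.

    ASSUMPTION: the real delta window is not given in the text.
    """
    padded = np.pad(x, ((1, 1), (0, 0)), mode="edge")
    return (padded[2:] - padded[:-2]) / 2.0


def build_pattern(mfcc, log_pow):
    """Combine (T, 12) MFCC and (T,) log power into a (T, 25) feature matrix."""
    mfcc = np.asarray(mfcc, dtype=float)
    log_pow = np.asarray(log_pow, dtype=float).reshape(-1, 1)
    # CMN: subtract the per-coefficient mean computed over the whole utterance.
    mfcc_cmn = mfcc - mfcc.mean(axis=0, keepdims=True)
    return np.hstack([mfcc_cmn, delta(mfcc_cmn), delta(log_pow)])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats = build_pattern(rng.normal(size=(50, 12)), rng.normal(size=50))
    print(feats.shape)
```

Only the static MFCC columns are mean-normalized here; subtracting a constant mean does not change the delta terms.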
HMM
left-to-right 3 states (excluding initial & final)
decision tree-based clustering
(logical triphones: 21000, physical triphones: 8000)
PTM (Phonetic Tied-Mixture) model
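The left-to-right topology with 3 emitting states can be sketched as a transition matrix. The layout below adds one non-emitting entry state and one non-emitting exit state, giving 5 states in total. The self-loop probability is a placeholder, not a trained value, and the absence of skip transitions is an assumption.

```python
import numpy as np


def left_to_right_transitions(n_emitting=3, self_loop=0.6):
    """Build an (n_emitting + 2) x (n_emitting + 2) left-to-right transition matrix.

    State 0 is the non-emitting entry state and the last state is the
    non-emitting exit state. ASSUMPTION: no skip transitions, and
    self_loop is a hypothetical placeholder value.
    """
    n = n_emitting + 2
    a = np.zeros((n, n))
    a[0, 1] = 1.0  # the entry state always moves to the first emitting state
    for s in range(1, n_emitting + 1):
        a[s, s] = self_loop            # stay in the same state
        a[s, s + 1] = 1.0 - self_loop  # advance to the next state
    # The exit state has no outgoing transitions, so its row stays zero.
    return a


if __name__ == "__main__":
    print(left_to_right_transitions())
```

Every non-zero entry lies on the diagonal or directly to its right, which is what makes the model strictly left-to-right.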