Training: Mainichi newspaper article texts
training set | 45 month | 75 month |
period | '91/01-'94/09 | '91/01-'94/09 and '95/01-'97/06 |
data amount | 65M words | 118M words |
Language Model Compression
Baseline model (cutoff-1-1)
List of 20K Language Models
model | 2-gram entries | 3-gram entries |
45month cutoff-1-1 | 1,238,929 | 4,733,916 |
45month cutoff-4-4 | 657,759 | 1,593,020 |
45month compress10% | 1,238,929 | 473,176 |
75month cutoff-1-1 | 1,675,803 | 7,445,209 |
75month cutoff-4-4 | 901,475 | 2,629,605 |
75month compress10% | 1,675,803 | 744,438 |
List of 60K Language Models
model | 2-gram entries | 3-gram entries |
75month cutoff-1-1 | 2,420,231 | 8,368,507 |
75month compress10% | 2,420,231 | 836,852 |
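
The model names can be checked against the entry counts. The sketch below assumes that "compress10%" means the 3-gram table is reduced to about 10% of the cutoff-1-1 baseline built from the same training period and vocabulary; this reading is inferred from the numbers, not stated in the tables. The dictionary and the function name `ratio_to_baseline` are illustrative only.

```python
# 3-gram entry counts copied from the 20K and 60K tables:
# (vocabulary, model name) -> number of 3-gram entries.
TRIGRAM_ENTRIES = {
    ("20K", "45month cutoff-1-1"): 4_733_916,
    ("20K", "45month cutoff-4-4"): 1_593_020,
    ("20K", "45month compress10%"): 473_176,
    ("20K", "75month cutoff-1-1"): 7_445_209,
    ("20K", "75month cutoff-4-4"): 2_629_605,
    ("20K", "75month compress10%"): 744_438,
    ("60K", "75month cutoff-1-1"): 8_368_507,
    ("60K", "75month compress10%"): 836_852,
}


def ratio_to_baseline(vocab: str, months: str, variant: str) -> float:
    """Return the 3-gram size of a model relative to its cutoff-1-1 baseline."""
    baseline = TRIGRAM_ENTRIES[(vocab, f"{months} cutoff-1-1")]
    return TRIGRAM_ENTRIES[(vocab, f"{months} {variant}")] / baseline


if __name__ == "__main__":
    for (vocab, name) in TRIGRAM_ENTRIES:
        months, variant = name.split(" ", 1)
        share = 100 * ratio_to_baseline(vocab, months, variant)
        print(f"{vocab} {name}: {share:.1f}% of baseline 3-gram entries")
```

Under this reading, every compress10% model keeps about one tenth of the baseline 3-gram entries while its 2-gram table is unchanged, whereas the cutoff-4-4 models keep roughly a third of the 3-gram entries and also shrink the 2-gram table.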
backward 3-gram (for forward-backward search)