Uploader: 38588854 | Upload time: 2021-02-20 20:10:20 | File size: 512KB | File type: PDF
Some languages, such as Chinese, do not mark word boundaries with delimiters. To extract words from a sentence in these languages, we usually rely on a dictionary of known words. For unknown words, some approaches rely on a domain-specific dictionary or a tailor-made training data set. However, such resources may not be available. Another direction is to use unsupervised methods. These methods rely on a goodness measure to evaluate how likely the words are meaningful based on a statistical argument on the give
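
To make the dictionary-based idea concrete, here is a minimal sketch of greedy forward maximum matching, one common way to segment text with a dictionary of known words. It is an illustration only, not the method of the paper: the function name `segment_with_dictionary`, the `max_word_len` parameter, and the toy dictionary are all assumptions introduced here. Characters that match no dictionary entry are emitted one at a time, which shows why unknown words are the hard case.

```python
def segment_with_dictionary(sentence, dictionary, max_word_len=4):
    """Greedy forward maximum matching (illustrative sketch).

    At each position, take the longest substring found in `dictionary`;
    if nothing matches, fall back to a single character.
    """
    words = []
    i = 0
    while i < len(sentence):
        # Try the longest candidate first, then shrink.
        longest = min(max_word_len, len(sentence) - i)
        for length in range(longest, 0, -1):
            candidate = sentence[i:i + length]
            # A single character is always accepted as a fallback,
            # so the loop is guaranteed to advance.
            if length == 1 or candidate in dictionary:
                words.append(candidate)
                i += length
                break
    return words


# Hypothetical toy dictionary, for illustration only.
toy_dictionary = {"我们", "喜欢", "自然", "语言", "自然语言", "处理"}

print(segment_with_dictionary("我们喜欢自然语言处理", toy_dictionary))
```

Because the matcher is greedy, it prefers "自然语言" over the shorter entries "自然" and "语言". A word absent from the dictionary is split into single characters, which is the gap that domain-specific dictionaries or unsupervised goodness measures try to close.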