

Alternative Title: Improving the Performance of Statistical Korean Morphological Analyzer

Abstract: Statistical Korean morphological analysis is a new approach that does not require a manually built machine-readable morphological dictionary. Instead, it uses statistical information acquired from a POS-tagged corpus. The acquisition is fully automated, so no human intervention is required. This is the main advantage of the statistical approach to Korean morphological analysis. Its main drawback is low precision: the number of false positives is relatively high. To improve precision, this paper proposes a method for filtering false positives. The proposed method introduces two dictionaries, a one-syllable-morpheme dictionary and a josa-eomi dictionary, which are constructed automatically while statistical information is collected from the POS-tagged corpus. To evaluate the proposed method, 10-fold cross-validation is performed on the 10-million-eojeol Sejong POS-tagged corpus. The experimental results show that precision improves by 5%.
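The dictionary-based filtering idea in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the paper's actual filtering rule, so everything here is an assumption made for illustration: the function names (`build_dictionaries`, `filter_candidates`), the data layout, the toy corpus, and the rule itself, which rejects a candidate analysis if it contains a one-syllable morpheme or a josa/eomi that was never observed with that tag in the training corpus. The only outside fact relied on is that Sejong tags for josa begin with `J` and tags for eomi begin with `E`.

```python
from typing import Iterable, List, Set, Tuple

Morph = Tuple[str, str]    # (surface form, POS tag)
Analysis = List[Morph]     # one candidate analysis of an eojeol


def is_josa_or_eomi(tag: str) -> bool:
    # In the Sejong tag set, josa tags start with "J" and eomi tags with "E".
    return tag.startswith(("J", "E"))


def build_dictionaries(corpus: Iterable[Analysis]) -> Tuple[Set[Morph], Set[Morph]]:
    """Collect both dictionaries in one pass over a POS-tagged corpus."""
    one_syllable: Set[Morph] = set()
    josa_eomi: Set[Morph] = set()
    for analysis in corpus:
        for form, tag in analysis:
            # Assumes NFC text, where one Hangul syllable is one code point.
            if len(form) == 1:
                one_syllable.add((form, tag))
            if is_josa_or_eomi(tag):
                josa_eomi.add((form, tag))
    return one_syllable, josa_eomi


def filter_candidates(
    candidates: Iterable[Analysis],
    one_syllable: Set[Morph],
    josa_eomi: Set[Morph],
) -> List[Analysis]:
    """Keep only candidates whose one-syllable morphemes and josa/eomi
    were all observed in the training corpus (hypothetical rule)."""
    kept = []
    for analysis in candidates:
        ok = all(
            (len(form) != 1 or (form, tag) in one_syllable)
            and (not is_josa_or_eomi(tag) or (form, tag) in josa_eomi)
            for form, tag in analysis
        )
        if ok:
            kept.append(analysis)
    return kept


# Toy training corpus, invented for illustration (not Sejong data).
corpus = [
    [("나", "NP"), ("는", "JX")],
    [("학교", "NNG"), ("에", "JKB")],
    [("가", "VV"), ("았", "EP"), ("다", "EF")],
]
one_syllable, josa_eomi = build_dictionaries(corpus)

# Hypothetical candidate analyses of the eojeol "나는".
candidates = [
    [("나", "NP"), ("는", "JX")],
    [("나", "VV"), ("는", "ETM")],
    [("날", "VV"), ("는", "ETM")],
]
filtered = filter_candidates(candidates, one_syllable, josa_eomi)
```

With this toy data, only the first candidate survives, because `("나", "VV")`, `("날", "VV")`, and `("는", "ETM")` never occur in the training corpus. A filter of this kind trades recall for precision: unseen but valid morphemes are also rejected, so its effect depends on how well the corpus covers short morphemes and function morphemes.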

URI: http://repository.sungshin.ac.kr/handle/2025.oak/7811
https://kiss.kstudy.com/Detail/Ar?key=3419915

