순서형 회귀 모형을 활용한 인공지능 악성댓글탐지 모형의 성능 향상 연구
- Alternative Title
- A Study on the Performance Improvement of Artificial Intelligence Hate Speech Detection Model Using Ordinal Regression Model
- Abstract
- Advances in computer communication technology and the impact of the COVID-19 pandemic have made online activity more prevalent. In particular, with the rapid growth of SNS platforms for consuming online content, such as YouTube and TikTok, users increasingly consume content on these platforms and express their opinions through online comments. Because online spaces guarantee anonymity, freedom of expression is easily abused to write malicious comments containing hate speech or biased remarks. Online malicious comments inflict psychological harm on real people offline. Because malicious comments can drive their targets to extreme choices, preventive measures and regulation are urgently needed.
KOCO (KOrean COmments) is a dataset of Korean comments collected for training Korean malicious comment classification models. Its KOCO-hate subset labels comments by degree of hatefulness as normal, offensive, or severe hate speech. Malicious comment classification is thus a multi-class problem whose classes are ordered by the severity of hate speech, so to exploit this ordering we propose a malicious comment classification model based on ordinal regression, which is effective for classifying ordered classes. Specifically, we apply two ordinal regression frameworks, CORAL (COnsistent RAnk Logits) and CORN (Conditional Ordinal Regression for Neural Networks), to a pretrained Korean natural language processing model. Comparing the classification performance of the baseline, CORAL, and CORN models, we confirmed that the CORAL and CORN models, which use ordinal regression, achieved improved performance.
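The ordinal regression idea named in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the class names, the example scores, and all function names below are assumptions made for illustration, and the encoder that would produce the scores (the pretrained Korean language model) is omitted. The sketch shows how a three-level label is turned into binary "greater than level k" targets, how CORAL derives P(y > k) from one shared score plus per-threshold biases, and how CORN chains conditional probabilities to obtain the same quantities.

```python
import math

# Assumed ordering of the three KOCO-hate labels (names are illustrative).
CLASSES = ["normal", "offensive", "hate"]


def sigmoid(z):
    """Logistic function mapping a real-valued logit to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def to_levels(label, num_classes):
    """Extended binary encoding: entry k is 1 if label > k, else 0."""
    return [1 if label > k else 0 for k in range(num_classes - 1)]


def coral_probs(score, biases):
    """CORAL-style output: one shared score plus K-1 biases gives P(y > k)."""
    return [sigmoid(score + b) for b in biases]


def corn_probs(logits):
    """CORN-style output: each logit models P(y > k | y > k-1);
    the running product gives the unconditional P(y > k)."""
    probs, running = [], 1.0
    for z in logits:
        running *= sigmoid(z)
        probs.append(running)
    return probs


def predict_rank(probs):
    """Predicted class index = number of thresholds passed with P > 0.5."""
    return sum(1 for p in probs if p > 0.5)


# Hypothetical scores, chosen only to exercise the functions.
print(CLASSES[predict_rank(coral_probs(0.0, [1.0, -1.0]))])  # offensive
print(CLASSES[predict_rank(corn_probs([2.0, 2.0]))])         # hate
```

In both variants the model has K-1 binary outputs instead of K softmax outputs, so a prediction of "hate" implies that the "offensive" threshold was also passed, which is how the class ordering enters the classifier.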
- Author(s)
- 이세영
- Issued Date
- 2023
- Awarded Date
- 2023-02
- Type
- Dissertation
- URI
- https://repository.sungshin.ac.kr/handle/2025.oak/4411
http://dcollection.sungshin.ac.kr/common/orgView/000000014555
- Alternative Author(s)
- Lee Se Young
- Affiliation
- Graduate School, Sungshin Women's University
- Department
- Graduate School, Department of Future Convergence Technology Engineering
- Advisor
- 박새롬
- Table Of Contents
- Thesis Overview
Ⅰ. Introduction
1. Research Background and Purpose
2. Thesis Organization
Ⅱ. Korean Malicious Comment Classification
1. Research on Malicious Comment Classification
2. Malicious Comment Datasets
Ⅲ. Natural Language Processing Models
1. Transformer Model
2. BERT Model
3. ELECTRA Model
4. GPT Model
Ⅳ. Design of the Malicious Comment Classification Model
1. Ordinal Regression
2. CORAL
3. CORN
Ⅴ. Experimental Results
1. Experimental Environment
2. Experimental Model Configuration
1) Data Preprocessing
2) Experimental Model Configuration
3. Hate Speech Classification Results
1) Baseline Classification Model Performance
2) CORAL Classification Model Performance
3) CORN Classification Model Performance
4) KOCO-hate Test Performance
Ⅵ. Conclusion and Future Work
ACKNOWLEDGEMENTS
References
ABSTRACT
- Degree
- Master
- Publisher
- Graduate School, Sungshin Women's University
Appears in Collections:
- Department of Future Convergence Technology Engineering > Theses
Items in Repository are protected by copyright, with all rights reserved, unless otherwise indicated.