
This item is licensed under a Creative Commons Attribution (CC BY) license.

Title
Domain Adaptation for Opinion Classification: A Self-Training Approach
Author(s)
Ning Yu
Publication Year
2013-02-28
Abstract
Domain transfer is a widely recognized problem for machine learning algorithms because models built on one data domain generally do not perform well in another. This is especially challenging for tasks such as opinion classification, which often have to deal with insufficient quantities of labeled data. This study investigates the feasibility of self-training for the domain transfer problem in opinion classification by leveraging labeled data from non-target data domain(s) and unlabeled data from the target domain. Specifically, self-training is evaluated for effectiveness in sparse data situations and feasibility for domain adaptation in opinion classification. Three types of Web content are tested: edited news articles, semi-structured movie reviews, and the informal and unstructured content of the blogosphere. The findings suggest that, when labeled data are limited, self-training is a promising approach for opinion classification, although its contribution varies across data domains. Significant improvement was demonstrated for the most challenging data domain, the blogosphere, when a domain transfer-based self-training strategy was implemented.
Keywords
Domain adaptation; Opinion classification; Self-training; Semi-supervised learning; Sentiment analysis; Machine learning
Journal Title
Journal of Information Science Theory and Practice
Citation Volume
1
ISSN
2287-4577
DOI
10.1633/JISTaP.2013.1.1.1
Files in This Item:
E1JSCH_2013_v1n1_10.pdf (332.33 kB)
Appears in Collections:
8. KISTI Publications > JISTaP > Vol. 1 - No. 1
Type
Article
URI
https://repository.kisti.re.kr/handle/10580/8624