Title |
Performance Analyses of Different Text Feature Extraction Algorithms in Restaurant Fake Review Detection |
Authors |
강한솔(Hansol Kang) ; 윤성욱(Seongwook Youn) |
DOI |
https://doi.org/10.5370/KIEE.2020.69.6.924 |
Keywords |
Fake review detection; similarity-based detection; Natural Language Processing; Term Frequency; doc2vec |
Abstract |
As the use of social network services (SNS) increases, blog reviews of restaurants can strongly influence customers’ choices. Consequently, a few restaurant owners pay to have fake positive reviews posted on social networking sites, which can mislead many customers. Because the number of fake reviews is increasing sharply, detecting them has become important for protecting customers. In this paper, we detect fake reviews by measuring their similarity to other fake reviews. To analyze the similarity of review contents, we apply feature extraction algorithms to the documents. Specifically, we compare the performance of three feature extraction algorithms: Term Frequency (TF), Term Frequency-Inverse Document Frequency (TF-IDF), and doc2vec. doc2vec takes into account the meaning of keywords in the documents, much as the well-known word2vec algorithm does, whereas TF and TF-IDF consider only the occurrence of keywords. Accordingly, doc2vec demonstrated the best performance in detecting fake reviews. |
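
The similarity-based idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the whitespace tokenizer, the smoothed IDF formula, the use of cosine similarity, the toy reviews, and all function names (`tokenize`, `tf_vector`, `idf_weights`, `tfidf_vector`, `cosine`, `max_similarity`) are assumptions made for illustration. doc2vec is omitted because it requires a trained embedding model.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: naive lowercase whitespace tokenizer; the abstract does
    # not say how reviews were tokenized.
    return text.lower().split()


def tf_vector(tokens):
    # Term Frequency: raw count of each term in the document.
    return dict(Counter(tokens))


def idf_weights(corpus_tokens):
    # Smoothed inverse document frequency, ln((1 + n) / (1 + df)) + 1.
    # This particular smoothing is an illustrative choice.
    n = len(corpus_tokens)
    df = Counter()
    for tokens in corpus_tokens:
        df.update(set(tokens))
    return {t: math.log((1 + n) / (1 + d)) + 1.0 for t, d in df.items()}


def tfidf_vector(tokens, idf):
    # TF-IDF: term count scaled by the term's IDF weight.
    return {t: c * idf.get(t, 0.0) for t, c in Counter(tokens).items()}


def cosine(a, b):
    # Cosine similarity between two sparse vectors stored as dicts.
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def max_similarity(candidate, known_fakes, use_idf=False):
    # Highest similarity between a candidate review and any known fake.
    cand = tokenize(candidate)
    fakes = [tokenize(d) for d in known_fakes]
    if use_idf:
        idf = idf_weights(fakes + [cand])
        cand_vec = tfidf_vector(cand, idf)
        fake_vecs = [tfidf_vector(f, idf) for f in fakes]
    else:
        cand_vec = tf_vector(cand)
        fake_vecs = [tf_vector(f) for f in fakes]
    return max(cosine(cand_vec, v) for v in fake_vecs)


# Hypothetical toy data, not taken from the paper.
known_fakes = [
    "best pasta in town amazing service must visit",
    "amazing service best steak in town must visit",
]
suspicious = "best pizza in town amazing service must visit"
unrelated = "the soup was cold and the waiter ignored us"

for name, review in [("suspicious", suspicious), ("unrelated", unrelated)]:
    print(name,
          round(max_similarity(review, known_fakes), 3),
          round(max_similarity(review, known_fakes, use_idf=True), 3))
```

Both TF and TF-IDF score a review only by the terms it shares with known fakes, so a paraphrase with no overlapping words scores zero. That limitation is what the abstract contrasts with doc2vec, which represents meaning rather than term occurrence.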