Mining Unstructured Turkish Economy News Articles


KAHYA ÖZYİRMİDOKUZ E.

Procedia Economics and Finance, vol.16, pp.320-328, 2014 (Refereed Journals of Other Institutions)

  • Publication Type: Article
  • Volume: 16
  • Publication Date: 2014
  • Title of Journal: Procedia Economics and Finance
  • Page Numbers: pp.320-328

Abstract

Text mining is the analysis of unstructured data by combining techniques from knowledge discovery in databases, natural language processing, information retrieval, and machine learning. Text mining allows us to analyze web content dynamically to find meaningful patterns within large collections of textual data.

There are too many economic news articles to read in full, so it is necessary to summarize them. In this study, text mining is used to analyze the vast amount of text produced in newspaper articles in Turkey. We mine unstructured economy news with natural language processing techniques, including tokenization, case transformation, stopword filtering, and stemming. Similarity analysis is also used to identify similar documents. Word vectors are extracted, so that the economy news is structured into numeric representations that summarize the articles. In addition, k-means clustering is applied. Consequently, the clusters and similarities of the articles are obtained.
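
The steps named in the abstract can be illustrated with a minimal sketch in plain Python. It is not the study's implementation; the abstract does not state which tools were used. The stopword list, the suffix-stripping rule, the four toy headlines, and the farthest-first centroid initialization are all assumptions made here for illustration, and a real Turkish pipeline would need a proper stemmer and locale-aware lowercasing.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny resources (not from the paper).
STOPWORDS = {"ve", "bir", "bu", "ile", "de", "da"}
SUFFIXES = ("ları", "leri", "lar", "ler")  # crude plural stripping only

def stem(token):
    # Strip one known suffix if at least three characters remain.
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token

def preprocess(text):
    # Case transformation, tokenization, stopword filtering, stemming.
    # Note: str.lower() is not Turkish-aware (it maps "I" to "i", not "ı").
    tokens = re.findall(r"\w+", text.lower())
    return [stem(t) for t in tokens if t not in STOPWORDS]

def word_vectors(token_lists):
    # Term-frequency vectors over a shared, sorted vocabulary.
    vocab = sorted({t for tokens in token_lists for t in tokens})
    vectors = []
    for tokens in token_lists:
        counts = Counter(tokens)
        vectors.append([float(counts[term]) for term in vocab])
    return vocab, vectors

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(vectors, k, iterations=10):
    # Deterministic farthest-first initialization (an assumption of this sketch).
    centroids = [list(vectors[0])]
    while len(centroids) < k:
        farthest = max(vectors, key=lambda v: min(sq_dist(v, c) for c in centroids))
        centroids.append(list(farthest))
    labels = []
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [min(range(k), key=lambda j: sq_dist(v, centroids[j])) for v in vectors]
        # Update step: each centroid becomes the mean of its members.
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Toy headlines invented for this example (not data from the study).
docs = [
    "Dolar kuru yükseldi",
    "Dolar ve euro kurları yükseldi",
    "Borsa endeksi düştü",
    "Borsa endeksleri bugün düştü",
]
token_lists = [preprocess(d) for d in docs]
vocab, vectors = word_vectors(token_lists)
similarity_same_topic = cosine_similarity(vectors[0], vectors[1])
similarity_other_topic = cosine_similarity(vectors[0], vectors[2])
labels = kmeans(vectors, k=2)

print(token_lists)
print(similarity_same_topic, similarity_other_topic)
print(labels)
```

In this toy run the two currency headlines share terms and so score a higher cosine similarity with each other than with the stock-market headlines, and k-means with k=2 separates the two topics. Raw term counts are used for simplicity; TF-IDF weighting is a common alternative for word vectors.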