Insider Trading Fraud Detection: A Data Mining Approach

Bilgiç E., Esen M. F.

10th INTERNATIONAL STATISTICS CONGRESS, Ankara, Turkey, 6 - 8 December 2017, vol.1, pp.93

  • Publication Type: Conference Paper / Summary Text
  • Volume: 1
  • City: Ankara
  • Country: Turkey
  • Page Numbers: pp.93
  • Kayseri University Affiliated: No


Prior research provides evidence that insiders generate significant profits by trading on private information that is unknown to the market. Separating opportunistic insider trades from routine ones is therefore important for detecting fraud. In the literature, there are only a few studies on fraud detection in insiders’ trades [1][2].

In this study, an outlier detection approach is used to detect potential fraud. Outlier detection, also known as anomaly or novelty detection, is the task of finding patterns that do not conform to the normal behaviour of the data. The study is organized to detect outliers with a data mining approach and then to inspect the portfolios of the outlying transactions by estimating abnormal returns, in order to flag potentially fraudulent transactions. Outlier detection is the first step in many data mining applications, as it is in our case. A clustering-based outlier detection method called “peer group analysis” is used in this paper. Peer group analysis was first introduced by Bolton and Hand [3]; it detects individual objects that begin to behave in a way distinct from similar objects over time. Although the logic behind Bolton and Hand’s method and this study is the same, the analysis here differs in that Bolton and Hand additionally consider the time dimension. The procedure in this paper searches for unusual cases (outliers) based on deviations from the norms of their cluster groups. The clustering is based on input variables such as the volume or price of the trade. After the clusters, called “peer groups”, are produced, anomaly indices based on deviations from the peer group norms are calculated. SPSS is used for outlier detection with peer group analysis.

The dataset is obtained from Thomson Reuters Insider Filings and contains 1,244,815 transactions belonging to 61,780 insiders on the NYSE during the period January 2010 - April 2017. First, NPR and NVR values are calculated for each transaction. Note that an insider may have hundreds or even thousands of transactions within that period. Then, outlier detection with peer group analysis is performed on the purchase and sale transaction data separately. In the purchase data, which contain 328,112 transactions, 16,362 outliers are found; however, 4 of them differ significantly from their peer groups. The primary reason for these 4 outliers is the NVR values of the transactions, whereas for the other outliers it is the NPR values. In the sales data, 27,190 outliers are obtained out of 916,703 transactions, and again 4 of them differ significantly from their peer groups. As in the case of purchases, the primary reason for these 4 outliers is the NVR values of the transactions, and for the others the NPR values. Since insiders’ purchases and sales have different characteristics, future work will focus on measuring the returns of purchase and sale portfolios separately for each outlier.
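
The peer group procedure described above (cluster the transactions, then score each one by its deviation from its peer group norm) can be illustrated with a minimal Python sketch. This is not the SPSS procedure used in the study. It assumes a plain k-means clustering with a deterministic seeding, takes the peer group mean as the norm, and defines the anomaly index as a transaction's squared deviation divided by the average squared deviation in its peer group. The function names and the eleven toy transactions are hypothetical, and the two columns are only labelled NPR and NVR for illustration; the values are made up and assumed to be on comparable scales already.

```python
import math
from statistics import mean


def farthest_first(points, k):
    """Deterministic seeding: take the first point, then keep adding
    the point that is farthest from the centers chosen so far."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centers))
        )
    return centers


def kmeans(points, k, iters=100):
    """Plain k-means, used here as a stand-in for the clustering step."""
    centers = farthest_first(points, k)
    labels = []
    for _ in range(iters):
        # Assign every transaction to its nearest center (its peer group).
        labels = [
            min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points
        ]
        # Recompute each peer group norm as the mean of its members.
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                updated.append(tuple(mean(col) for col in zip(*members)))
            else:
                updated.append(centers[c])
        if updated == centers:
            break
        centers = updated
    return labels, centers


def anomaly_indices(points, labels, centers, names):
    """Return (anomaly index, primary reason) for every transaction."""
    # Squared deviation of each variable from the peer group norm.
    per_var = [
        [(x - m) ** 2 for x, m in zip(p, centers[lab])]
        for p, lab in zip(points, labels)
    ]
    totals = [sum(devs) for devs in per_var]
    results = []
    for i, lab in enumerate(labels):
        group_avg = mean([t for t, l in zip(totals, labels) if l == lab])
        index = totals[i] / group_avg if group_avg > 0 else 0.0
        # The variable contributing most to the deviation is the "reason".
        reason = names[max(range(len(names)), key=lambda j: per_var[i][j])]
        results.append((index, reason))
    return results


# Made-up toy transactions: (NPR, NVR). Row 5 has an unusual NVR value.
names = ("NPR", "NVR")
transactions = [
    (1.0, 1.0), (1.2, 0.9), (0.9, 1.1), (1.1, 1.0), (1.0, 1.2), (1.0, 6.0),
    (10.0, 10.0), (10.2, 9.8), (9.8, 10.1), (10.1, 10.2), (9.9, 9.9),
]

labels, centers = kmeans(transactions, k=2)
results = anomaly_indices(transactions, labels, centers, names)
top = max(range(len(results)), key=lambda i: results[i][0])
print(top, results[top])
```

On this toy input the transaction with the unusual NVR value receives the largest anomaly index, and NVR is reported as its primary reason. In practice the number of peer groups, the scaling of the variables, and the cut-off above which an index counts as an outlier would all have to be chosen for the data at hand.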