A Novel Approach to Improve the Performance of the Database Storing Big Data with Time Information

Taşyürek M.

Balkan Journal of Electrical and Computer Engineering, vol.10, no.4, pp.388-396, 2022 (Peer-Reviewed Journal)


Big data refers to data sets that are too large or too complex to be processed by classical data processing methods. Big data analysis is essential because it lets institutions and organizations use their data to make better-informed business moves, run more efficient operations, and earn higher profits. However, large datasets are difficult to analyze because they are produced quickly, require large amounts of storage, and contain diverse data. This study proposes a new denormalization-based approach to shorten database response time in systems that store large volumes of data containing historical (time) information. Denormalization is the process of adding redundant rows or columns to a normalized database in order to increase its read performance. The study uses a large dataset of real spatial data belonging to Kayseri Metropolitan Municipality, which contains temporal information and approximately 96,000,000 rows. Because temporal queries over large volumes of data stored in date format are slow, the proposed approach records the time information as numbers, which shortens query response time. The performance of the proposed method is compared with that of the normalization method using actual data on Microsoft SQL Server and Oracle database systems. In the experimental evaluations, the proposed method ran approximately eight times faster. The results also showed that its advantage over the normalization-based method in query performance grows as the data size increases.
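The idea of recording time information as numbers can be illustrated with a minimal sketch. It is not the paper's implementation: the study used Microsoft SQL Server and Oracle, whereas this example uses SQLite so that it is self-contained. The table name `readings`, the column names, the helper `to_number`, and the `YYYYMMDDHHMMSS` integer encoding are all assumptions made for illustration.

```python
import sqlite3
from datetime import datetime


def to_number(ts: datetime) -> int:
    """Encode a timestamp as a sortable integer (assumed YYYYMMDDHHMMSS layout)."""
    return int(ts.strftime("%Y%m%d%H%M%S"))


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE readings ("
    " id INTEGER PRIMARY KEY,"
    " recorded_at TEXT NOT NULL,"       # original date-format value
    " recorded_num INTEGER NOT NULL)"   # redundant numeric copy (denormalized)
)
# Index the numeric column so range filters compare plain integers.
conn.execute("CREATE INDEX idx_readings_num ON readings (recorded_num)")

# Hypothetical sample rows; the real dataset is not reproduced here.
samples = [
    datetime(2022, 1, 15, 8, 30, 0),
    datetime(2022, 1, 15, 9, 0, 0),
    datetime(2022, 2, 1, 12, 0, 0),
]
conn.executemany(
    "INSERT INTO readings (recorded_at, recorded_num) VALUES (?, ?)",
    [(ts.isoformat(sep=" "), to_number(ts)) for ts in samples],
)


def count_between(start: datetime, end: datetime) -> int:
    """Count rows in a time window using only the numeric column."""
    row = conn.execute(
        "SELECT COUNT(*) FROM readings WHERE recorded_num BETWEEN ? AND ?",
        (to_number(start), to_number(end)),
    ).fetchone()
    return row[0]


january_count = count_between(
    datetime(2022, 1, 1, 0, 0, 0), datetime(2022, 1, 31, 23, 59, 59)
)
print(january_count)
```

The redundant `recorded_num` column is the denormalization step: it duplicates information already held in `recorded_at` so that temporal filters become integer comparisons. Whether this yields a speedup, and of what size, depends on the database engine and the data; the sketch makes no performance claim of its own.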