Authors: Saudi M.M.; Ridzuan F.; Hashim H.A.-B.
Date accessioned: 2024-05-29
Date available: 2024-05-29
Date issued: 2017
ISBN: 9789880000000
ISSN: 20780958
Scopus EID: 2-s2.0-85041207306
Scopus URL: https://www.scopus.com/inward/record.uri?eid=2-s2.0-85041207306&partnerID=40&md5=785c6a2548ac9a6cc19b565132b02ead
Repository URL: https://oarep.usim.edu.my/handle/123456789/10359
Abstract: The growth of data over time, especially in terms of volume, velocity, value, veracity and variety, has led to many challenges, particularly in extracting useful information from it. Furthermore, managing and transforming raw data into a readable format is crucial for subsequent analysis. Therefore, this paper presents a new web server log file classification and an efficient way of transforming raw web log files into a readable format for data mining analysis by using the knowledge database discovery (KDD) technique. An experiment was conducted on raw web log files, in a controlled lab environment, using the KDD technique and the k-nearest neighbor (IBk) algorithm. In this experiment, the IBk algorithm achieved a true positive rate (TPR) of 99.66% and a false positive rate (FPR) of 0.34%, which indicates the efficiency of the new web log file classification and data transformation technique used in this paper.
Language: en-US
Keywords: Big Data; Data transformation; Knowledge database discovery; Log analysis
Indexed keywords: Big data; Blogs; Classification (of information); Metadata; Nearest neighbor search; Data transformation; False positive rates; K-nearest neighbors; Knowledge database; Log analysis; True positive rates; Web log file; Web server logs; Data mining
Title: An efficient data transformation technique for web log
Type: Conference Paper
Pages: 434-439
Volume: 2229