Publication: The significant effect of feature selection methods in spam risk assessment using dendritic cell algorithm
Date
2017
Publisher
Institute of Electrical and Electronics Engineers Inc.
Abstract
The vast amount of online documentation and the growth of the Internet, especially mobile technology, have created a pressing need to handle and organize unstructured data appropriately. Information retrieval, and even knowledge discovery, can be enhanced when properly structured data are available. This paper empirically studies the effect of pre-selected term weighting schemes, namely Term Frequency (TF), Information Gain Ratio (IG Ratio) and Chi-Square (CHI2), on the assessment of a threat's impact loss. The selected features are then fed to the Dendritic Cell Algorithm (DCA), which serves as the classifier, to measure the risk concentration of a spam message. The outcome of this research is expected to help people make decisions once they know the possible impact of a particular spam message. The findings show that TF is the best of the feature selection methods studied and is well suited to use with the DCA, resulting in a high risk classification accuracy rate. © 2017 IEEE.
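As a rough illustration of two of the term weighting schemes named in the abstract, the following minimal sketch scores terms by raw Term Frequency and by the standard two-by-two Chi-Square statistic on a toy labelled corpus. It is not the authors' implementation: the corpus, the labels, and the helper names (`term_frequency`, `chi_square`, `top_k`) are assumptions made for this example, and the IG Ratio scheme and the DCA classifier are not covered.

```python
from collections import Counter

# Hypothetical toy corpus: tokenized messages with assumed class labels.
docs = [
    ["win", "free", "prize"],
    ["free", "offer", "free"],
    ["meeting", "at", "noon"],
    ["lunch", "at", "noon"],
]
labels = ["spam", "spam", "ham", "ham"]


def term_frequency(docs):
    """Count the total occurrences of each term across all documents."""
    counts = Counter()
    for tokens in docs:
        counts.update(tokens)
    return counts


def chi_square(docs, labels, term, positive="spam"):
    """Chi-square statistic of term presence against a binary class label."""
    a = b = c = d = 0  # cells of the 2x2 contingency table
    for tokens, label in zip(docs, labels):
        present = term in tokens
        is_positive = label == positive
        if present and is_positive:
            a += 1
        elif present:
            b += 1
        elif is_positive:
            c += 1
        else:
            d += 1
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0  # term present in all or no documents, or only one class seen
    return n * (a * d - b * c) ** 2 / denom


def top_k(scores, k):
    """Return the k highest-scoring terms, breaking ties alphabetically."""
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


tf_scores = term_frequency(docs)
chi_scores = {term: chi_square(docs, labels, term) for term in tf_scores}
selected_by_tf = top_k(tf_scores, 3)
selected_by_chi = top_k(chi_scores, 3)
```

In a pipeline like the one the abstract describes, the terms kept by `top_k` would form the feature set passed on to the classifier. How those features are mapped onto DCA input signals is not stated in this record and is left out here.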
Keywords
dendritic cell algorithm, feature selection methods, spam risk concentration, spam severity assessment, term weighting schemes, Cells, Feature extraction, Dendritic cell algorithms, Dendritic cell algorithms (DCA), Feature selection methods, Information gain ratio, On-line documentations, Risk classification, Term weighting scheme, Risk assessment