Publication: The Significant Effect of Feature Selection Methods in Spam Risk Assessment Using Dendritic Cell Algorithm
Date
2017
Publisher
IEEE
Abstract
The vast amount of online documentation and the growth of the Internet, especially mobile technology, have created a pressing demand to handle and organize unstructured data appropriately. Information retrieval, and even knowledge discovery, can be enhanced when properly structured data are available. This paper empirically studies the effect of pre-selected term weighting schemes, namely Term Frequency (TF), Information Gain Ratio (IG Ratio) and Chi-Square (CHI2), on the assessment of a threat's impact loss. The selected features are then fed to the Dendritic Cell Algorithm (DCA), which serves as the classifier, to measure the risk concentration of a spam message. The outcome of this research is expected to help people make decisions once they know the possible impact of a particular spam message. The findings show that TF is the best of the three feature selection methods and is well suited for use with the DCA, resulting in a high risk classification accuracy.
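To make two of the term weighting schemes named in the abstract concrete, the following is a minimal Python sketch of TF and chi-square scoring on a toy corpus. It is an illustration under stated assumptions, not the paper's implementation: the messages, labels and the helper names `term_frequency`, `chi_square` and `top_terms` are all hypothetical, the chi-square form is the standard 2x2 contingency statistic, and neither the Information Gain Ratio scheme nor the DCA classifier is shown.

```python
from collections import Counter

# Toy labelled messages (hypothetical data, not taken from the paper).
MESSAGES = [
    ("win free prize now", "spam"),
    ("free prize claim now", "spam"),
    ("meeting at noon", "ham"),
    ("lunch at noon tomorrow", "ham"),
]


def term_frequency(messages):
    """Count how often each term occurs across all messages."""
    counts = Counter()
    for text, _label in messages:
        counts.update(text.split())
    return counts


def chi_square(messages, term, target="spam"):
    """Chi-square statistic of the 2x2 table (term present?) x (label == target?)."""
    a = b = c = d = 0
    for text, label in messages:
        present = term in text.split()
        positive = label == target
        if present and positive:
            a += 1  # term present, target class
        elif present:
            b += 1  # term present, other class
        elif positive:
            c += 1  # term absent, target class
        else:
            d += 1  # term absent, other class
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    # A zero denominator means the term (or the class) never varies.
    return 0.0 if denom == 0 else n * (a * d - b * c) ** 2 / denom


def top_terms(scores, k):
    """Return the k highest-scoring terms, breaking ties alphabetically."""
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


tf_scores = term_frequency(MESSAGES)
chi_scores = {term: chi_square(MESSAGES, term) for term in tf_scores}
```

In this sketch, `top_terms(tf_scores, k)` or `top_terms(chi_scores, k)` would pick the feature subset that is then passed to a classifier. Note that the two schemes can rank terms differently: raw TF ignores the class labels, while chi-square rewards terms whose presence is associated with one class.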
Keywords
term weighting schemes, feature selection methods, dendritic cell algorithm, spam severity assessment, spam risk concentration