A Random Forest Classifier-Based Email Spam Detection Model
Adedoyin S. Adebanjo, Oreoluwa A. Adesegha, Elizabeth Ogungbefun, Faysal O. Aliyu, Emmanuel Mgbeahuruike, Babajide E. Adeoti, and Emmanuel Oyerinde

Abstract

Email spam is a persistent threat to productivity and security, and traditional rule-based filters often struggle to keep up with evolving spam techniques. This study presents a spam detection model based on a Random Forest classifier, developed using a publicly available dataset. We preprocessed the email text with Natural Language Processing (NLP) methods, including tokenization and stop-word removal, and extracted features with term frequency-inverse document frequency (TF-IDF) weighting. We evaluated the model using accuracy, precision, recall, and F1-score. The model achieved 99% accuracy, with 97% precision for legitimate emails and 100% precision for spam, 99% recall for both classes, and F1-scores of 98% for legitimate emails and 99% for spam. These results demonstrate the effectiveness of Random Forest for spam detection and its promise for building reliable, adaptable email filters that improve security and user experience.

Keywords: Email Spam Detection, Ensemble Learning, Random Forest Classifier, Supervised Machine Learning.
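
The pipeline summarized in the abstract (tokenization, stop-word removal, TF-IDF features, a Random Forest classifier, and per-class evaluation) can be illustrated with a minimal sketch. This is not the authors' implementation: the use of scikit-learn, the toy corpus, the label names, the helper `build_spam_pipeline`, and all hyperparameters are assumptions made for illustration only, and the tiny corpus will not reproduce the reported scores.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline

# Hypothetical toy corpus standing in for the paper's dataset (not the real data).
emails = [
    "Win a free prize now, click here",
    "Limited offer: claim your free cash reward",
    "Cheap loans approved instantly, apply now",
    "You have won a lottery, send your bank details",
    "Free gift card waiting, click to claim",
    "Urgent: verify your account to receive cash",
    "Meeting moved to 3pm, see agenda attached",
    "Can you review my draft report by Friday?",
    "Lunch tomorrow with the project team?",
    "Here are the minutes from today's seminar",
    "Please find the invoice for last month attached",
    "Reminder: thesis submission deadline next week",
]
labels = ["spam"] * 6 + ["legitimate"] * 6


def build_spam_pipeline(n_trees=100, seed=0):
    """Hypothetical helper: TF-IDF features feeding a Random Forest."""
    return Pipeline([
        # TfidfVectorizer tokenizes, lowercases, drops English stop words,
        # and weights the remaining terms by TF-IDF.
        ("tfidf", TfidfVectorizer(lowercase=True, stop_words="english")),
        # Assumed hyperparameters; the paper's settings are not given here.
        ("forest", RandomForestClassifier(n_estimators=n_trees, random_state=seed)),
    ])


# Hold out a stratified test set so both classes appear in the evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    emails, labels, test_size=0.25, stratify=labels, random_state=0
)

model = build_spam_pipeline()
model.fit(X_train, y_train)
predictions = model.predict(X_test)

# Overall accuracy plus per-class precision, recall, and F1-score.
class_names = ["legitimate", "spam"]
accuracy = accuracy_score(y_test, predictions)
precision, recall, f1, support = precision_recall_fscore_support(
    y_test, predictions, labels=class_names, zero_division=0
)

print(f"accuracy: {accuracy:.2f}")
for name, p, r, f in zip(class_names, precision, recall, f1):
    print(f"{name}: precision={p:.2f} recall={r:.2f} f1={f:.2f}")
```

Reporting precision, recall, and F1-score per class, as the abstract does, exposes an imbalance between missed spam and wrongly flagged legitimate mail that a single accuracy figure would hide.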
