A SYSTEMATIC LITERATURE REVIEW ON PHISHING DETECTION MODEL
DOI:
https://doi.org/10.24203/7pmk5z83
Keywords:
phishing, algorithms, detection, social media, attack
Abstract
This paper introduces a hybrid crime detection model that uses supervised learning techniques to identify phishing attempts on social media sites. Effective detection systems are urgently needed given the rise in crime on social media, especially phishing. The proposed model combines the strengths of several supervised learning algorithms that are frequently used in analyzing phishing attacks, namely random forest, decision tree, and support vector machine, taking advantage of their capacity to learn patterns from labeled datasets and to spot suspicious behavior suggestive of phishing attempts. Among the reviewed studies, the most commonly used algorithm was Decision Tree (DT), with 14% of the total, followed by Random Forest (RF), Support Vector Machine (SVM), and Naïve Bayes (12%), with 8%. The least popular algorithms were LSTM, SCS, STARMA, AUC, and FURIA, with 2% each.
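As an illustration of how several supervised classifiers might be combined, the sketch below uses hard (majority) voting. This is an assumption made for illustration only: the abstract does not state how the hybrid model merges its base algorithms, and the function names and prediction values here are hypothetical.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label predicted by most base classifiers for one sample."""
    label, _count = Counter(predictions).most_common(1)[0]
    return label


def hybrid_predict(per_model_predictions):
    """Combine one prediction list per model into a single list by hard voting."""
    return [majority_vote(votes) for votes in zip(*per_model_predictions)]


# Hypothetical outputs of three already-trained classifiers on five posts
# (1 = phishing, 0 = legitimate). The values are illustrative, not real results.
decision_tree = [1, 0, 1, 0, 0]
random_forest = [1, 0, 0, 0, 1]
svm = [1, 1, 0, 0, 1]

combined = hybrid_predict([decision_tree, random_forest, svm])
print(combined)
```

With an odd number of base models and two classes, a vote can never tie, which is one reason three-model ensembles are a common starting point.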
Decision trees and Support Vector Machines (SVMs) are often used in phishing attack detection because they excel at classification tasks, which is exactly what phishing detection entails: differentiating between trustworthy and malicious websites or emails. Decision trees also offer a clear and concise representation of the decision-making process.
Decision trees (DT) nevertheless present several gaps that need to be addressed. If important features associated with phishing offenses are omitted or misidentified, the efficacy of the model may be jeopardized. Overfitting and class imbalance are common problems with decision trees, particularly when working with complicated datasets. Overfitting can result in poor generalization to new, unseen data, which makes the model less effective at identifying unusual phishing scams. Phishing data from social media frequently exhibit class imbalance, with comparatively fewer phishing incidents than legitimate activity.
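The effect of class imbalance can be made concrete with a small sketch. The data below is invented for illustration (95 legitimate posts, 5 phishing posts), and the helper names are hypothetical. It shows why accuracy alone is misleading on imbalanced data, and computes the widely used "balanced" class weights, n_samples / (n_classes * class_count), as one possible mitigation rather than the method used in this paper.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred, positive=1):
    """Fraction of truly positive samples that were predicted positive."""
    predicted_for_positives = [p for t, p in zip(y_true, y_pred) if t == positive]
    return sum(p == positive for p in predicted_for_positives) / len(predicted_for_positives)


# Illustrative data only: 95 legitimate posts (0) and 5 phishing posts (1).
y_true = [0] * 95 + [1] * 5
# A degenerate model that labels every post as legitimate.
always_legitimate = [0] * 100

print(accuracy(y_true, always_legitimate))  # high accuracy despite missing all phishing
print(recall(y_true, always_legitimate))    # no phishing post is detected
print(balanced_class_weights(y_true))       # the rare class receives the larger weight
```

A model that never flags phishing still scores 95% accuracy here, so recall on the phishing class (or a similar per-class metric) is the more informative measure when evaluating detectors on imbalanced data.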
Copyright (c) 2025 Nicholas Muriuki Muriithi, Josphat Karani
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
The articles published in International Journal of Computer and Information Technology (IJCIT) are licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.