Scam Detector

Try it out here: https://scamdetectorpublic-1.streamlit.app

During this AI4ALL accelerator project, we tackled the problem of scam and smishing text messages. Using Python, a random forest model, and Streamlit, we developed a program that helps identify whether a text message is a scam or not.

Problem Statement

Scams are increasing and becoming harder to spot, especially with advances in AI-generated messages. To tackle this problem, we explored how machine learning can be trained to recognize these messages and flag them before people interact with them.

Key Results

Achieved 95.8% accuracy in scam message detection
Trained various models to achieve best performance: Random Forest, Decision Tree, Logistic Regression
Deployed the model to a website where users can paste a message and check whether it is a scam

Methodologies

To accomplish this, we used pandas, scikit-learn, and a TF-IDF vectorizer to preprocess the data and train several machine learning models: random forest, decision tree, and logistic regression. We trained the models on 5000+ text messages, each labeled as scam (promotion messages), smishing (dangerous fraud), or ham (safe).
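
The approach described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual training code: the sample messages are made up, the names build_models, scores, and prediction are hypothetical, and the hyperparameters (default TF-IDF settings, 100 trees, a 25% test split) are assumptions. With a toy sample this small, the reported accuracy values are not meaningful.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.tree import DecisionTreeClassifier

# Made-up sample messages for illustration only (NOT the project's dataset).
messages = [
    "Are we still meeting for lunch today?",
    "Can you pick up milk on your way home?",
    "Happy birthday! Hope you have a great day.",
    "Running ten minutes late, see you soon.",
    "Huge sale this weekend, 50% off all shoes!",
    "New menu items available now, order today.",
    "Get a free coffee with your next purchase.",
    "Limited time offer: upgrade your plan and save.",
    "Your bank account is locked, verify at this link now.",
    "You won a prize! Send your card number to claim it.",
    "Urgent: your package is held, pay the fee here.",
    "Security alert: confirm your password immediately.",
]
labels = ["ham"] * 4 + ["scam"] * 4 + ["smishing"] * 4


def build_models(seed=0):
    """Return one TF-IDF + classifier pipeline per model type (assumed settings)."""
    return {
        "random_forest": make_pipeline(
            TfidfVectorizer(),
            RandomForestClassifier(n_estimators=100, random_state=seed),
        ),
        "decision_tree": make_pipeline(
            TfidfVectorizer(),
            DecisionTreeClassifier(random_state=seed),
        ),
        "logistic_regression": make_pipeline(
            TfidfVectorizer(),
            LogisticRegression(max_iter=1000),
        ),
    }


# Hold out a stratified test set so every label appears in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    messages, labels, test_size=0.25, stratify=labels, random_state=0
)

# Fit each pipeline on the training split and score it on the held-out split.
scores = {}
models = build_models()
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test))

# Classify a new, unseen message with one of the fitted pipelines.
prediction = models["random_forest"].predict(["Claim your free prize now"])[0]
print(scores)
print(prediction)
```

Wrapping the vectorizer and classifier in a single pipeline means the same TF-IDF vocabulary learned during training is applied to any new message, which is convenient when the fitted model is later loaded into a web app.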

Data Sources

Mendeley Dataset: Link to Dataset

Technologies Used

Python
pandas
scikit-learn
Random forest, decision trees, logistic regression
Streamlit

Authors

This project was completed in collaboration with:

Datt Patel https://github.com/dattpatel123
Seonyoung Lee https://github.com/Seonyoungsyl
Rupashi Bahl https://github.com/rupashibahl
Ruth Chane https://github.com/Ruth-Ch