Scam Detector
Try it out here: https://scamdetectorpublic-1.streamlit.app
During this AI4ALL accelerator project, we tackled the problem of scam and smishing text messages. Using Python, a random forest model, and Streamlit, we developed a program that helps identify whether a text message is a scam or not.
Problem Statement
Scams are increasing and becoming harder to spot, especially with advances in AI-generated messages. To tackle this problem, we explored how machine learning models can be trained to recognize these messages and flag them before people interact with them.
Key Results
- Achieved 95.8% accuracy in scam message detection
- Trained and compared several models to find the best performer: Random Forest, Decision Tree, Logistic Regression
- Deployed to a website where users can paste a message and check whether it is a scam
Methodologies
To accomplish this, we used pandas, scikit-learn, and a TF-IDF vectorizer to preprocess the data and train several machine learning models: random forest, decision tree, and logistic regression. We trained the models on 5000+ text messages, each labeled as scam (promotion messages), smishing (dangerous fraud), or ham (safe).
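The approach described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual training code: the sample messages, the `build_models` helper, and the hyperparameters are all assumptions made for the example, and the tiny in-line dataset stands in for the real 5000+ labeled messages.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.tree import DecisionTreeClassifier

# Made-up sample messages for illustration only (not from the real dataset).
messages = [
    "Your account is locked. Verify your password at this link now",
    "Urgent: your bank card is suspended, confirm your PIN here",
    "Limited offer! Get 50% off all shoes this weekend",
    "Sale ends today, buy one get one free at our store",
    "Are we still meeting for lunch tomorrow?",
    "Can you pick up milk on your way home?",
]
labels = ["smishing", "smishing", "scam", "scam", "ham", "ham"]


def build_models():
    """Hypothetical helper: one TF-IDF + classifier pipeline per model type."""
    return {
        "random_forest": make_pipeline(
            TfidfVectorizer(),
            RandomForestClassifier(n_estimators=100, random_state=0),
        ),
        "decision_tree": make_pipeline(
            TfidfVectorizer(),
            DecisionTreeClassifier(random_state=0),
        ),
        "logistic_regression": make_pipeline(
            TfidfVectorizer(),
            LogisticRegression(max_iter=1000),
        ),
    }


# Fit every pipeline on the same labeled messages.
models = build_models()
for model in models.values():
    model.fit(messages, labels)

# Classify a new message with the random forest pipeline.
new_message = "Confirm your bank password at this link"
prediction = models["random_forest"].predict([new_message])[0]
print(prediction)
```

In practice, the labeled messages would be loaded with pandas and split into training and test sets (for example with scikit-learn's `train_test_split`) so that each model's accuracy can be measured on messages it has not seen.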
Data Sources
Mendeley Dataset: Link to Dataset
Technologies Used
- Python
- pandas
- scikit-learn
- Random forest, decision trees, logistic regression
- Streamlit
Authors
This project was completed in collaboration with:
- Datt Patel https://github.com/dattpatel123
- Seonyoung Lee https://github.com/Seonyoungsyl
- Rupashi Bahl https://github.com/rupashibahl
- Ruth Chane https://github.com/Ruth-Ch