Machine Learning: Filtering Email for Spam or Ham

You may have seen our previous posts on machine learning: specifically, how to let your code learn from text, and working with stop words, stemming, and spam. Today we’re going to build a machine learning-based spam filter using the tools we walked through in those posts: a tokenizer, a stemmer, and a naive Bayes classifier. We will use the bluebird promise library here, so if you are not used to promises, please take a look at the bluebird API reference.

Training and Testing Dataset

Before we begin, it’s important to have good training data. You can download some here. We are interested in two files:

TR-mails.zip: the raw email corpus.
spam-mail.tr: the correct labels (spam or ham) associated with each training email in TR-mails.zip, where each line tells us…
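
To make the pipeline named in the introduction concrete (tokenizer, then stemmer, then naive Bayes classifier), here is a minimal TypeScript sketch. It is an illustration under stated assumptions, not the article's code: it runs synchronously and does not use bluebird, `stem` is a crude suffix stripper standing in for a real stemmer, and the names `tokenize`, `stem`, and `NaiveBayes`, along with the sample sentences, are invented for this example. It also does not read TR-mails.zip or spam-mail.tr.

```typescript
type Label = "spam" | "ham";

// Lowercase the text and split it into alphabetic tokens.
function tokenize(text: string): string[] {
  return text.toLowerCase().match(/[a-z]+/g) ?? [];
}

// Hypothetical, deliberately crude stemmer: strips a few common suffixes.
// A real stemmer (for example Porter's) handles far more cases.
function stem(token: string): string {
  for (const suffix of ["ing", "ed", "s"]) {
    if (token.length > suffix.length + 2 && token.endsWith(suffix)) {
      return token.slice(0, -suffix.length);
    }
  }
  return token;
}

// Multinomial naive Bayes with add-one (Laplace) smoothing.
class NaiveBayes {
  private docCounts: Record<Label, number> = { spam: 0, ham: 0 };
  private tokenTotals: Record<Label, number> = { spam: 0, ham: 0 };
  private tokenCounts: Record<Label, Map<string, number>> = {
    spam: new Map(),
    ham: new Map(),
  };
  private vocabulary = new Set<string>();

  // Count every stemmed token of the text under the given label.
  train(text: string, label: Label): void {
    this.docCounts[label] += 1;
    const counts = this.tokenCounts[label];
    for (const token of tokenize(text).map(stem)) {
      counts.set(token, (counts.get(token) ?? 0) + 1);
      this.tokenTotals[label] += 1;
      this.vocabulary.add(token);
    }
  }

  // log P(label) + sum of log P(token | label), computed in log space
  // so that long emails do not underflow to zero.
  private logScore(tokens: string[], label: Label): number {
    const totalDocs = this.docCounts.spam + this.docCounts.ham;
    const denominator = this.tokenTotals[label] + this.vocabulary.size;
    let score = Math.log(this.docCounts[label] / totalDocs);
    for (const token of tokens) {
      const count = this.tokenCounts[label].get(token) ?? 0;
      score += Math.log((count + 1) / denominator);
    }
    return score;
  }

  // Pick the label with the higher score; ties fall back to "ham".
  classify(text: string): Label {
    const tokens = tokenize(text).map(stem);
    return this.logScore(tokens, "spam") > this.logScore(tokens, "ham")
      ? "spam"
      : "ham";
  }
}

// Toy usage with made-up training sentences.
const filter = new NaiveBayes();
filter.train("Win free money now, claim your prize", "spam");
filter.train("Free prize waiting, click to win", "spam");
filter.train("Meeting notes attached for tomorrow", "ham");
filter.train("Can we move the meeting to Friday", "ham");

console.log(filter.classify("claim your free prize")); // "spam"
console.log(filter.classify("notes from the meeting")); // "ham"
```

The classifier needs at least one training example per label before `classify` gives meaningful results. In a real run, the training texts and labels would come from the dataset files rather than from hard-coded strings.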


Link to Full Article: Machine Learning: Filtering Email for Spam or Ham
