Blog

Building SMS SPAM Detector and Generating a WordCloud with Kaggle Dataset in JupyterLab

Background problem: At least 97% of Americans send text messages from their mobile phones every day. According to Portio Research, 8.3 trillion messages were exchanged over mobile phones in 2016. That flood of data works out to roughly 23 billion messages per day and 16 million messages per minute. There were around 6.4 billion mobile subscribers worldwide by the end of 2012. According to Portio Research, there will be a CAGR growth of 4.8% o...
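As a quick sanity check on the figures in the excerpt above, the per-day and per-minute rates can be derived from the yearly total. This is only a back-of-the-envelope sketch: the 365-day year is an assumption, and the variable names are illustrative rather than taken from the post.

```python
# Yearly total quoted in the excerpt: 8.3 trillion messages in 2016.
MESSAGES_PER_YEAR = 8.3e12

# Assumption: a 365-day year (using 366 for leap-year 2016 barely changes the result).
DAYS_PER_YEAR = 365
MINUTES_PER_DAY = 24 * 60

per_day = MESSAGES_PER_YEAR / DAYS_PER_YEAR
per_minute = per_day / MINUTES_PER_DAY

print(f"{per_day / 1e9:.1f} billion messages per day")        # 22.7
print(f"{per_minute / 1e6:.1f} million messages per minute")  # 15.8
```

Rounded to whole units, these agree with the 23 billion per day and 16 million per minute quoted in the excerpt.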


Building Email SPAM Detector with Naïve Bayes and AdaBoost Machine Learning Classifiers in JupyterLab

Natural language processing is a sub-branch of artificial intelligence. Building a tool that processes data through natural language processing draws on mathematics, statistics, algorithms, and programming, here in Python. Advanced techniques such as Word2Vec convert words into vectors, which makes it easier to process text with mathematics and deep learning algorithms. Python can then work with the language humans speak, write, and understand. Before we begin ...


K-Nearest Neighbor Machine Learning algorithm

The German credit dataset can be downloaded from the UC Irvine Machine Learning Repository; its outcome variable indicates whether the loan applicant defaulted. This post applies logistic regression with three variables (duration, amount, and installment), K-means clustering, and the K-Nearest Neighbor machine learning algorithm. # Logistic regression # Load the file from the hard disk after setting the work directory germandata # Print dataset to see the pattern of the data g...
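To give a feel for the K-Nearest Neighbor idea named in the excerpt above, here is a minimal pure-Python sketch. It is independent of whatever code the full post uses: the function `knn_predict`, the label names, and the tiny pre-scaled rows are all hypothetical, made up for illustration, and are not drawn from the German credit dataset.

```python
from collections import Counter
from math import dist  # Euclidean distance, Python 3.8+


def knn_predict(train_X, train_y, query, k=3):
    """Label `query` by majority vote among its k nearest training rows."""
    # Indices of the k training rows closest to the query point.
    nearest = sorted(range(len(train_X)), key=lambda i: dist(train_X[i], query))[:k]
    votes = Counter(train_y[i] for i in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical toy rows, already scaled to [0, 1]: (duration, amount, installment).
train_X = [
    (0.10, 0.20, 0.25),
    (0.20, 0.10, 0.50),
    (0.15, 0.25, 0.25),
    (0.80, 0.90, 1.00),
    (0.90, 0.70, 0.75),
    (0.70, 0.80, 1.00),
]
train_y = ["good", "good", "good", "default", "default", "default"]

print(knn_predict(train_X, train_y, (0.85, 0.80, 1.00)))  # default
print(knn_predict(train_X, train_y, (0.12, 0.15, 0.25)))  # good
```

Scaling matters here because KNN compares raw distances: without it, a large-valued feature such as the loan amount would dominate the other two.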


Brace yourself for weathering the data storm: RDBMS

RDBMS stands for relational database management system. In the early 1970s, E. F. Codd introduced the relational model that underlies it. The model advanced the DBMS approaches of the time by bringing cardinality and normalization to database design. Codd also formulated 12 rules for a traditional RDBMS. Once laid out, these rules made databases more flexible and better integrated. (a) RDBMS deems all database management ...
