Popular Blogs

Building Email SPAM Detector with Naïve Bayes and AdaBoost Machine Learning Classifiers in JupyterLab

Natural language processing is a sub-branch of artificial intelligence. Building a machine or tool that processes data through natural language processing draws on mathematics, statistics, algorithms, and Python programming. Advanced techniques such as Word2Vec convert words into vectors, which makes the text easier to process with mathematics and deep learning algorithms. With these tools, Python can handle the language humans speak, write, and understand. Before we begin ...

K-Nearest Neighbor Machine Learning algorithm

The German credit dataset can be downloaded from the UC Irvine Machine Learning Repository; its outcome variable indicates whether the loan applicant defaulted or not. This post applies logistic regression with three variables (duration, amount, and installment), K-means clustering, and the K-Nearest Neighbor machine learning algorithm. # Logistic regression # Load the file from the hard disk after setting the work directory germandata # Print dataset to see the pattern of the data g...

Brace yourself for weathering the data storm: RDBMS

RDBMS stands for relational database management system. In the early 1970s, Codd introduced the relational model on which it is built. It advanced the DBMS movement of that time by bringing cardinality and normalization to the database. Codd conceptualized 12 rules for a traditional RDBMS. Databases that followed these rules became more flexible and better integrated. (a) RDBMS deems all database management ...

The Architecture of Apache Hadoop

The Apache Hadoop platform is highly fault-tolerant against system failures. At its core are the Hadoop Distributed File System (HDFS) and MapReduce, which distribute high-velocity streams of big data across multiple racks of low-priced servers. HDFS itself places no limit on the size of a file for storage, write, and read operations; a limit can arise only from the disk capacity of the machines, not from HDFS. HDFS also...

R has knives out for IBM SPSS and SAS

Introduction: Bell Labs originally conceived the S language in the mid-1970s to resolve data analytics and statistical conundrums. The purpose of the implementation project was to perform statistical analysis for the corporation by leveraging Fortran libraries. The first version of S did not include the functions needed for statistical computing. In the late 1980s, the act of rebuilding the source code in language C reinvented S languag...
