Data Wrangling Part 2: Cleaning up Ohio Crime Data for Machine Learning
In a previous post, I discussed cleaning public Ohio crime data. As I get deeper into the data and go through years...
What is the difference between data science and data analytics?
Perhaps you are at the beginning of your career or making a change in your career and want to know the difference between data...
How to run iPython notebook online for Machine Learning projects
Recently, Google had a Kaggle image contest with test and train image dataset files that were well over a TB in size. My MacBook...
How I wrote a terrible machine learning Nirvana song
We all see articles on how this is successful or that is successful and we get to brag about our successes. But what about...
What is an epoch in machine learning?
An epoch is one pass through an entire dataset. This can be in random order. You can also batch your epoch so that you...
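As a minimal sketch of the idea in this excerpt, the snippet below walks a toy dataset for a fixed number of epochs, shuffling the order each time and splitting each pass into batches. The function name `iterate_epochs`, its parameters, and the toy data are all hypothetical and chosen only for illustration; a real training loop would feed each batch to a model.

```python
import random


def iterate_epochs(dataset, num_epochs, batch_size, seed=0):
    """Yield (epoch, batch) pairs; every epoch visits each sample exactly once.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    for epoch in range(num_epochs):
        indices = list(range(len(dataset)))
        rng.shuffle(indices)  # random order within the epoch
        # Slice the shuffled indices into consecutive batches.
        for start in range(0, len(indices), batch_size):
            batch = [dataset[i] for i in indices[start:start + batch_size]]
            yield epoch, batch


data = list(range(10))  # toy dataset of 10 samples
batches = list(iterate_epochs(data, num_epochs=2, batch_size=4))
first_epoch_samples = sorted(x for epoch, batch in batches if epoch == 0 for x in batch)
```

With 10 samples and a batch size of 4, each epoch produces three batches (the last one is smaller), and every sample still appears exactly once per epoch.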
What is Conditional Probability and formula?
Conditional probability is used to find the probability of some event happening given that some other event has happened. Easy, right?
Therefore, conditional probability...
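For a concrete feel, here is a small sketch that applies the standard definition P(A | B) = P(A and B) / P(B) to a single roll of a fair six-sided die. The die example and the helper names `prob` and `conditional` are assumptions made for illustration and are not taken from the post.

```python
from fractions import Fraction

OUTCOMES = range(1, 7)  # faces of a fair six-sided die (illustrative assumption)


def prob(event):
    """Probability of an event, given as a predicate over equally likely outcomes."""
    favorable = sum(1 for o in OUTCOMES if event(o))
    return Fraction(favorable, len(OUTCOMES))


def conditional(a, b):
    """P(A | B) = P(A and B) / P(B); undefined when P(B) is zero."""
    p_b = prob(b)
    if p_b == 0:
        raise ValueError("P(B) is zero, so P(A | B) is undefined")
    return prob(lambda o: a(o) and b(o)) / p_b


def is_even(o):
    return o % 2 == 0


def greater_than_three(o):
    return o > 3


# Of the outcomes {4, 5, 6}, two are even, so the result is 2/3.
p_even_given_gt3 = conditional(is_even, greater_than_three)
```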
Analyzing NFL Concussion data for Kaggle Data Science Competition
Recently, I entered the NFL Concussion on punt returns contest for data scientists. It wasn't the normal machine learning problem. In fact, it is...
What is a probability mass function?
A probability mass function describes a probability distribution over discrete variables.
First, a probability mass function is conventionally denoted with a capital P.
Second,...
Use Google Colab and Kaggle Data with bonus: fastai2
I was just running through this process and thought it might be helpful for others:
What to do in Kaggle:
Step 1. Go to your Kaggle...
How to make the first row in your spreadsheet or dataframe the header in...
If you have imported a CSV file into your notebook and use pandas to view the dataframe, you might find that the header of...
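One common way to do what the title describes, sketched here under the assumption that the real header ended up as the first data row, is to assign that row to `df.columns` and then drop it. The inline CSV text and column names are made up for illustration; the post itself may take a different route.

```python
import io

import pandas as pd

# Hypothetical CSV read without a header, so the real header lands in row 0.
csv_text = "name,score\nalice,90\nbob,85\n"
df = pd.read_csv(io.StringIO(csv_text), header=None)

df.columns = df.iloc[0]                    # promote the first row to column labels
df = df.iloc[1:].reset_index(drop=True)    # drop that row and renumber from 0
df.columns.name = None                     # clear the leftover index name

# Values stay as strings here because the header row was mixed into each column.
```

If you control the import step, passing `header=0` to `pd.read_csv` (the default behaviour) avoids the problem in the first place.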