Disclaimer: I am not paid by these companies to promote their cards, nor am I suggesting that you have to get them. This is purely my own opinion on which credit cards on the market right now I would personally use.

If you haven’t already bought things for yourself or as gifts during the Black Friday/Cyber Monday sales, then December is going to be a busy month. There will be a lot of spending, and you will want the best deal you can get when you blow your cash. My previous career was in…

There is a lot of visual data out in the world, and it is important that we are able to interpret and make use of it. This project is a baseline step toward computer vision using deep learning techniques. How accurately can we predict the correct name of the celebrity in a given set of images?

The Data

To build an image classification model on faces, data collection and preprocessing are crucial steps in the process. The dataset used for my model was collected from: https://github.com/prateekmehta59/Celebrity-Face-Recognition-Dataset. …

I will cover, step by step, how to build a simple convolutional neural network using the dataset from Kaggle’s digit recognizer. Disclosure: I will not be going over how neural networks work, which is for another post, just how to code them up.

Gather and Set up Data

Download the dataset from Kaggle and save it to your local directory.

Import Libraries

You will first need to import a few libraries before you start coding.

import numpy as np
import pandas as pd
import keras
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten, Conv2D, MaxPooling2D
from keras.utils import to_categorical
from sklearn.model_selection import train_test_split…
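
After the imports, a typical next step is shaping the Kaggle CSV into arrays that a Conv2D layer can accept. The following is only a minimal sketch, not the post's own code. It assumes the digit recognizer layout of one label column followed by 784 pixel columns. The helper name prepare_digits is hypothetical, the one-hot encoding is done with NumPy rather than to_categorical so the block has no Keras dependency, and a small synthetic frame stands in for the downloaded file.

```python
import numpy as np
import pandas as pd


def prepare_digits(df, num_classes=10):
    """Turn a digit-recognizer style frame into CNN-ready arrays (hypothetical helper)."""
    labels = df["label"].to_numpy()
    pixels = df.drop(columns=["label"]).to_numpy(dtype="float32")
    # Scale 0-255 intensities to 0-1 and reshape each 784-value row to 28x28x1
    images = (pixels / 255.0).reshape(-1, 28, 28, 1)
    # One-hot encode the labels by indexing rows of an identity matrix
    one_hot = np.eye(num_classes, dtype="float32")[labels]
    return images, one_hot


# Synthetic stand-in for the real file; swap in pd.read_csv on your local copy
rng = np.random.default_rng(0)
fake = pd.DataFrame(
    rng.integers(0, 256, size=(5, 784)),
    columns=[f"pixel{i}" for i in range(784)],
)
fake.insert(0, "label", [3, 1, 4, 1, 5])

X, y = prepare_digits(fake)
print(X.shape, y.shape)
```

The trailing dimension of 1 is the single grayscale channel that Conv2D expects; the resulting arrays could then be passed to train_test_split before fitting a model.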

Using the basics of Python’s pandas to analyze SAT and ACT data and provide a recommendation to the College Board (hypothetically). My very first take at a data science problem.

According to an article by Education Week, in 2016 more than 2 million students, roughly 64% of high school graduates, took the ACT, compared to 1.64 million students who took the SAT. …

Time series analysis is most commonly seen when monitoring industrial processes or tracking corporate business metrics. The analysis accounts for any correlation, trend, or seasonal variation in data points taken over time. I will discuss one approach to processing the data and making predictions. To demonstrate time series analysis, I did a project on the West Nile Virus dataset from Kaggle.

Every year, people in Chicago are diagnosed with the West Nile Virus. The virus is spread by mosquitoes of the Culex species. While less than one percent of people…

A simple data science classification problem: given two subreddits, will we be able to determine which post came from which subreddit? This blog post explores this in more detail.

Depicts the growth rate of subreddits over the past few years

One of the most widely used natural language processing tasks in business problems is text classification. For educational purposes, let’s say that being able to distinguish which subreddit was more popular is an actual business problem. Data science comes in handy in this situation: we just need to gather Reddit’s data, which is easily accessible through an API request that returns the data in JSON format. …
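
To make the JSON step concrete, here is a minimal sketch of pulling post titles out of a Reddit-style listing once it has been fetched. It is an illustration under assumptions, not the post's own code: the helper name extract_posts is hypothetical, and the sample payload is invented and trimmed to the few fields used here (real responses carry many more).

```python
import json


def extract_posts(listing):
    """Return (subreddit, title) pairs from a Reddit-style listing dict (hypothetical helper)."""
    children = listing.get("data", {}).get("children", [])
    return [(c["data"]["subreddit"], c["data"]["title"]) for c in children]


# Invented sample payload that mimics the nesting of a listing response
sample_text = """
{
  "kind": "Listing",
  "data": {
    "after": "t3_example",
    "children": [
      {"kind": "t3", "data": {"subreddit": "first_sub", "title": "A post title"}},
      {"kind": "t3", "data": {"subreddit": "second_sub", "title": "Another title"}}
    ]
  }
}
"""

posts = extract_posts(json.loads(sample_text))
for subreddit, title in posts:
    print(subreddit, "|", title)
```

Keeping the subreddit name next to each title gives the label and the text that a classifier would later need.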

A project that required data cleaning, feature engineering, exploratory data analysis, model building, and regression techniques to accurately predict sale prices in Ames, IA.

Source: Google Images

The Objective

During my time at General Assembly’s Data Immersive Program, we completed projects to apply and demonstrate what we learned. The goal of the project I will be talking about today is to predict the sale prices of homes in Ames using a data set from Kaggle. The project challenged us on how to preprocess the data so that we could build the best model for predicting housing prices.

Data Preprocessing


Benjamin Cho

Data Science/Business/Project Management Enthusiast
