More interactive visualizations of flights data in D3

This post covers two more interactive visualizations I've made with D3 using flights data. I had data on all domestic flights within the USA in 2016 (courtesy of the Bureau of Transportation). Delay Choropleth The interactive plot is located here – https://shivathudi.github.io/flights-choropleth/. It shows the percentage of flights delayed from states across the US. The color red …

Interactive visualizations using D3.js and D3 wrappers in Shiny

I think D3.js is one of the best libraries out there for interactive visualizations. This post will cover two cases: the first uses D3.js, while the second uses D3 wrappers in Shiny. Domestic Flights by City in the US The interactive plot is located here – https://shivathudi.github.io/flights-chord/. The mouseover interactions can only be seen at […]

Collaborative Filtering on Netflix Data to Predict User Ratings of Movies

Given explicit ratings from users on movies they like, we can use collaborative filtering to recommend other movies which they haven’t watched yet but which have been rated highly by other users with similar interests. The dataset that I will be using is a subset of the movie ratings data from the Netflix Prize, which …

Linear Regression vs k-Nearest Neighbors on Boston Housing Prices and U.S. Monthly Climate Data

In this post, I will compare the results of applying linear regression and k-nearest neighbors to two different datasets. Boston Housing Prices You can download this data from the UCI Machine Learning Repository, at https://archive.ics.uci.edu/ml/datasets/Housing. Alternatively, you can find the Boston data in CSV format at http://github.com/shivathudi/machine-learning/linear_regression_vs_kNN/boston.csv. There are 14 columns and around 500 rows. …

Predicting Life Expectancy using Decision Tree Ensembles

In this post, I will predict life expectancy for the “average” person born in a certain year in one or more countries. I gathered all my data from Gapminder. For the predictor variables, I used birth rates, mortality rates (male, female, under-5, and infant), HIV rates, GNI, and Internet usage rates. You can find this data in a …