Drafting a rookie in Fantasy Football can be a risky move, but it can pay huge dividends if you happen to snag a diamond in the rough. After accounting for a player’s draft position, do physical attributes (height/weight) and combine performance (40-yard dash, bench press, etc.) provide any additional explanatory power for points scored during a player’s first NFL season? I’ll explore this question for rookie Running Backs and Wide Receivers.
Tired of waiting around for your simulations to finish? Run them in parallel! This post covers how to use Spark and ForEach to add parallelism to your R code.
This post focuses on some of my favorite things – football and forecasting – and will outline how to leverage external regressors when creating forecasts. We’ll do some web scraping in R and Python to create our dataset, and then forecast how many people will visit Tom Brady’s Wikipedia page.
Feature selection is an integral part of machine learning and this post explores what happens when lots of irrelevant features are added to the modeling process. We’ll also identify which algorithms are affected the most by such features. These questions will be addressed as we build a classifier and try to predict which wines we’ll like based on their chemical properties. So pour yourself a glass of Pinot Noir and fire up your R terminal!
Sometimes a controlled experiment isn’t an option, yet you still want to establish causality. This post outlines a method for quantifying the effects of an intervention via counterfactual predictions.
That’s a dense title: Monte Carlo Simulation, Power, Mixed-Effect models. Each of these topics could be its own post. However, I’m going to discuss how they relate in the context of experimental power and keep everything high-level. The goal is to get an intuitive idea of how we can leverage simulation to provide sample size estimates for experiments with nested data.
Exception handling is a critical component of any data science workflow. You write code. It breaks. You build logic to deal with the exceptions. Repeat. In my experience, one point of confusion for new R users is how to handle exceptions, which is a bit more intuitive in Python. Accordingly, this post provides a practical overview of how to handle exceptions in R by first illustrating the concept in Python.
Early trend detection is a major area of focus in the analytics realm because it can inform key business strategy, yet it remains an extremely difficult task. This post outlines one trend-detection method in an effort to predict where a stock’s price will go in the future.