Hi! I'm a data scientist at C.H. Robinson, having just finished up my PhD in Neuroscience at the University of Minnesota with David Redish. I'm interested in data science and machine learning - especially in predictive models which capture uncertainty.

On the side, I've been working on ProbFlow, a Python package for building Bayesian neural networks.

All Posts

Comparing Pure Geo Experiments to TBR Causal Effect Analysis

31 Oct 2019 - measurement

Using Google's GeoexperimentsResearch R package to estimate the effectiveness of ad campaigns, and simulating experiments to see how pure geo experiments vs time-based regression analysis perform under different conditions.

Trip Duration Prediction using Bayesian Neural Networks and TensorFlow 2.0

23 Jul 2019 - bayesian, neural networks, uncertainty, tensorflow, and prediction

Using a dual-headed Bayesian density network to predict taxi trip durations, and the uncertainty of those estimates.

Visualizing multiple sources of uncertainty with semitransparent confidence intervals

03 Jul 2019 - visualization

Improving on MATLAB's default plotting tools for uncertainty visualization.

Customer Loyalty Prediction 3: Predictive Modeling

19 Jun 2019 - prediction

Performing hyperparameter optimization, and creating ensemble and stacking models to predict customer loyalty.

Bayesian Gaussian Mixture Modeling with Stochastic Variational Inference

12 Jun 2019 - bayesian and tensorflow

How to fit a Bayesian Gaussian mixture model via stochastic variational inference, using TensorFlow Probability and TensorFlow 2.0 eager execution.

Career Village Question Recommendation System

20 May 2019 - feature engineering and recommendation

Joining and aggregating data across multiple tables, and building a content-based implicit recommendation system to recommend questions asked by students to professionals who can answer them.

Customer Loyalty Prediction 2: Feature Engineering and Feature Selection

04 Apr 2019 - feature engineering and feature selection

Engineering features, performing aggregations with transaction information, and using mutual information and permutation-based feature importance to select features.

Bayesian Hyperparameter Optimization using Gaussian Processes

28 Mar 2019 - bayesian, prediction, and optimization

Finding the best hyperparameters for a predictive model in an automated way using Bayesian optimization.

Customer Loyalty Prediction 1: Data Cleaning and EDA

20 Mar 2019 - data cleaning and eda

Data loading, cleaning, and exploratory data analysis for the Elo customer loyalty prediction challenge.

Representing Categorical Data with Target Encoding

04 Mar 2019 - prediction

Representing high-cardinality categorical variables using target encoding, and mitigating the overfitting often seen with target encoding by using cross-fold and leave-one-out schemes.

Documenting Python Packages with Sphinx and ReadTheDocs

05 Jan 2019 - tools

Writing and generating documentation for Python packages using Sphinx, and hosting and automatically building the documentation with ReadTheDocs.

Prediction Intervals for Taxi Fares using Quantile Loss

15 Dec 2018 - eda, prediction, uncertainty, and visualization

Training gradient boosted decision trees with a quantile loss to predict taxi fares, in Python using CatBoost and Vaex.

Bayesian Regressions with MCMC or Variational Bayes using TensorFlow Probability

03 Dec 2018 - bayesian, tensorflow, and uncertainty

Bayesian regressions via MCMC sampling or variational inference using TensorFlow Probability, a new package for probabilistic model-building and inference.

Multilevel Gaussian Processes and Hidden Markov Models with Stan

15 Nov 2018 - bayesian and stan

Multilevel and multitrial Gaussian Processes and hidden Markov models in R, using Stan and bridge sampling.

Automated Feature Engineering with Featuretools

11 Nov 2018 - feature engineering

Running deep feature synthesis for automated feature engineering, using the Featuretools package for Python.

Home Credit Group Loan Risk Prediction

11 Oct 2018 - data cleaning and prediction

Prediction of loan default using Python, scikit-learn, and XGBoost.

Bayesian Modeling of Gaussian Processes and Hidden Markov Models with Stan

10 Oct 2018 - bayesian and stan

Model comparison between Bayesian fits of Gaussian Processes and hidden Markov models in R, using Stan and bridge sampling.

Running a Docker Container on AWS EC2

30 Aug 2018 - aws, docker, and tools

How to set up an AWS account, launch an instance, run a Docker container in that instance, and upload/download data to and from the container.

Nice Ride Bike Share EDA

02 Aug 2018 - eda and visualization

Exploratory data analysis of Nice Ride MN bike share's system data for 2017.

Fatal Police Shootings EDA

08 Jul 2018 - eda and visualization

Exploratory data analysis of the Washington Post's database of fatal police shootings in the US since 2015.

Multilevel Bayesian Correlations

27 Jun 2018 - bayesian and stan

Fitting Bayesian models of the correlation between two variables to data with multiple observations per subject or group. Using Stan!

Posts tagged "aws"

Running a Docker Container on AWS EC2

30 Aug 2018 - aws, docker, and tools

How to set up an AWS account, launch an instance, run a Docker container in that instance, and upload/download data to and from the container.

Posts tagged "bayesian"

Trip Duration Prediction using Bayesian Neural Networks and TensorFlow 2.0

23 Jul 2019 - bayesian, neural networks, uncertainty, tensorflow, and prediction

Using a dual-headed Bayesian density network to predict taxi trip durations, and the uncertainty of those estimates.

Bayesian Gaussian Mixture Modeling with Stochastic Variational Inference

12 Jun 2019 - bayesian and tensorflow

How to fit a Bayesian Gaussian mixture model via stochastic variational inference, using TensorFlow Probability and TensorFlow 2.0 eager execution.

Bayesian Hyperparameter Optimization using Gaussian Processes

28 Mar 2019 - bayesian, prediction, and optimization

Finding the best hyperparameters for a predictive model in an automated way using Bayesian optimization.

Bayesian Regressions with MCMC or Variational Bayes using TensorFlow Probability

03 Dec 2018 - bayesian, tensorflow, and uncertainty

Bayesian regressions via MCMC sampling or variational inference using TensorFlow Probability, a new package for probabilistic model-building and inference.

Multilevel Gaussian Processes and Hidden Markov Models with Stan

15 Nov 2018 - bayesian and stan

Multilevel and multitrial Gaussian Processes and hidden Markov models in R, using Stan and bridge sampling.

Bayesian Modeling of Gaussian Processes and Hidden Markov Models with Stan

10 Oct 2018 - bayesian and stan

Model comparison between Bayesian fits of Gaussian Processes and hidden Markov models in R, using Stan and bridge sampling.

Multilevel Bayesian Correlations

27 Jun 2018 - bayesian and stan

Fitting Bayesian models of the correlation between two variables to data with multiple observations per subject or group. Using Stan!

Posts tagged "data cleaning"

Customer Loyalty Prediction 1: Data Cleaning and EDA

20 Mar 2019 - data cleaning and eda

Data loading, cleaning, and exploratory data analysis for the Elo customer loyalty prediction challenge.

Home Credit Group Loan Risk Prediction

11 Oct 2018 - data cleaning and prediction

Prediction of loan default using Python, scikit-learn, and XGBoost.

Posts tagged "docker"

Running a Docker Container on AWS EC2

30 Aug 2018 - aws, docker, and tools

How to set up an AWS account, launch an instance, run a Docker container in that instance, and upload/download data to and from the container.

Posts tagged "eda"

Customer Loyalty Prediction 1: Data Cleaning and EDA

20 Mar 2019 - data cleaning and eda

Data loading, cleaning, and exploratory data analysis for the Elo customer loyalty prediction challenge.

Prediction Intervals for Taxi Fares using Quantile Loss

15 Dec 2018 - eda, prediction, uncertainty, and visualization

Training gradient boosted decision trees with a quantile loss to predict taxi fares, in Python using CatBoost and Vaex.

Nice Ride Bike Share EDA

02 Aug 2018 - eda and visualization

Exploratory data analysis of Nice Ride MN bike share's system data for 2017.

Fatal Police Shootings EDA

08 Jul 2018 - eda and visualization

Exploratory data analysis of the Washington Post's database of fatal police shootings in the US since 2015.

Posts tagged "feature engineering"

Career Village Question Recommendation System

20 May 2019 - feature engineering and recommendation

Joining and aggregating data across multiple tables, and building a content-based implicit recommendation system to recommend questions asked by students to professionals who can answer them.

Customer Loyalty Prediction 2: Feature Engineering and Feature Selection

04 Apr 2019 - feature engineering and feature selection

Engineering features, performing aggregations with transaction information, and using mutual information and permutation-based feature importance to select features.

Automated Feature Engineering with Featuretools

11 Nov 2018 - feature engineering

Running deep feature synthesis for automated feature engineering, using the Featuretools package for Python.

Posts tagged "feature selection"

Customer Loyalty Prediction 2: Feature Engineering and Feature Selection

04 Apr 2019 - feature engineering and feature selection

Engineering features, performing aggregations with transaction information, and using mutual information and permutation-based feature importance to select features.

Posts tagged "measurement"

Comparing Pure Geo Experiments to TBR Causal Effect Analysis

31 Oct 2019 - measurement

Using Google's GeoexperimentsResearch R package to estimate the effectiveness of ad campaigns, and simulating experiments to see how pure geo experiments vs time-based regression analysis perform under different conditions.

Posts tagged "neural networks"

Trip Duration Prediction using Bayesian Neural Networks and TensorFlow 2.0

23 Jul 2019 - bayesian, neural networks, uncertainty, tensorflow, and prediction

Using a dual-headed Bayesian density network to predict taxi trip durations, and the uncertainty of those estimates.

Posts tagged "optimization"

Bayesian Hyperparameter Optimization using Gaussian Processes

28 Mar 2019 - bayesian, prediction, and optimization

Finding the best hyperparameters for a predictive model in an automated way using Bayesian optimization.

Posts tagged "prediction"

Trip Duration Prediction using Bayesian Neural Networks and TensorFlow 2.0

23 Jul 2019 - bayesian, neural networks, uncertainty, tensorflow, and prediction

Using a dual-headed Bayesian density network to predict taxi trip durations, and the uncertainty of those estimates.

Customer Loyalty Prediction 3: Predictive Modeling

19 Jun 2019 - prediction

Performing hyperparameter optimization, and creating ensemble and stacking models to predict customer loyalty.

Bayesian Hyperparameter Optimization using Gaussian Processes

28 Mar 2019 - bayesian, prediction, and optimization

Finding the best hyperparameters for a predictive model in an automated way using Bayesian optimization.

Representing Categorical Data with Target Encoding

04 Mar 2019 - prediction

Representing high-cardinality categorical variables using target encoding, and mitigating the overfitting often seen with target encoding by using cross-fold and leave-one-out schemes.

Prediction Intervals for Taxi Fares using Quantile Loss

15 Dec 2018 - eda, prediction, uncertainty, and visualization

Training gradient boosted decision trees with a quantile loss to predict taxi fares, in Python using CatBoost and Vaex.

Home Credit Group Loan Risk Prediction

11 Oct 2018 - data cleaning and prediction

Prediction of loan default using Python, scikit-learn, and XGBoost.

Posts tagged "recommendation"

Career Village Question Recommendation System

20 May 2019 - feature engineering and recommendation

Joining and aggregating data across multiple tables, and building a content-based implicit recommendation system to recommend questions asked by students to professionals who can answer them.

Posts tagged "stan"

Multilevel Gaussian Processes and Hidden Markov Models with Stan

15 Nov 2018 - bayesian and stan

Multilevel and multitrial Gaussian Processes and hidden Markov models in R, using Stan and bridge sampling.

Bayesian Modeling of Gaussian Processes and Hidden Markov Models with Stan

10 Oct 2018 - bayesian and stan

Model comparison between Bayesian fits of Gaussian Processes and hidden Markov models in R, using Stan and bridge sampling.

Multilevel Bayesian Correlations

27 Jun 2018 - bayesian and stan

Fitting Bayesian models of the correlation between two variables to data with multiple observations per subject or group. Using Stan!

Posts tagged "tensorflow"

Trip Duration Prediction using Bayesian Neural Networks and TensorFlow 2.0

23 Jul 2019 - bayesian, neural networks, uncertainty, tensorflow, and prediction

Using a dual-headed Bayesian density network to predict taxi trip durations, and the uncertainty of those estimates.

Bayesian Gaussian Mixture Modeling with Stochastic Variational Inference

12 Jun 2019 - bayesian and tensorflow

How to fit a Bayesian Gaussian mixture model via stochastic variational inference, using TensorFlow Probability and TensorFlow 2.0 eager execution.

Bayesian Regressions with MCMC or Variational Bayes using TensorFlow Probability

03 Dec 2018 - bayesian, tensorflow, and uncertainty

Bayesian regressions via MCMC sampling or variational inference using TensorFlow Probability, a new package for probabilistic model-building and inference.

Posts tagged "tools"

Documenting Python Packages with Sphinx and ReadTheDocs

05 Jan 2019 - tools

Writing and generating documentation for Python packages using Sphinx, and hosting and automatically building the documentation with ReadTheDocs.

Running a Docker Container on AWS EC2

30 Aug 2018 - aws, docker, and tools

How to set up an AWS account, launch an instance, run a Docker container in that instance, and upload/download data to and from the container.

Posts tagged "uncertainty"

Trip Duration Prediction using Bayesian Neural Networks and TensorFlow 2.0

23 Jul 2019 - bayesian, neural networks, uncertainty, tensorflow, and prediction

Using a dual-headed Bayesian density network to predict taxi trip durations, and the uncertainty of those estimates.

Prediction Intervals for Taxi Fares using Quantile Loss

15 Dec 2018 - eda, prediction, uncertainty, and visualization

Training gradient boosted decision trees with a quantile loss to predict taxi fares, in Python using CatBoost and Vaex.

Bayesian Regressions with MCMC or Variational Bayes using TensorFlow Probability

03 Dec 2018 - bayesian, tensorflow, and uncertainty

Bayesian regressions via MCMC sampling or variational inference using TensorFlow Probability, a new package for probabilistic model-building and inference.

Posts tagged "visualization"

Visualizing multiple sources of uncertainty with semitransparent confidence intervals

03 Jul 2019 - visualization

Improving on MATLAB's default plotting tools for uncertainty visualization.

Prediction Intervals for Taxi Fares using Quantile Loss

15 Dec 2018 - eda, prediction, uncertainty, and visualization

Training gradient boosted decision trees with a quantile loss to predict taxi fares, in Python using CatBoost and Vaex.

Nice Ride Bike Share EDA

02 Aug 2018 - eda and visualization

Exploratory data analysis of Nice Ride MN bike share's system data for 2017.

Fatal Police Shootings EDA

08 Jul 2018 - eda and visualization

Exploratory data analysis of the Washington Post's database of fatal police shootings in the US since 2015.