The Bootstrap
Recently I’ve had occasion to use the bootstrap and have been reminded of what a remarkably powerful technique it is despite its simplicity. I thought…
I’ve been learning the Tidymodels framework, pioneered by Max Kuhn and Julia Silge, for building machine learning models in R. After spending a few weeks…
Continue reading → Look what the Cat dragged in: Catboost with Tidymodels
Introduction A very interesting paper was brought to my attention which proposes an adjustment to the traditional Stochastic Gradient Descent approach called Oddball Stochastic Gradient…
Introductory Concepts In the field of statistics, researchers are interested in making inferences from data. The data is collected from a population; the data drawn…
This is a post I've been wanting to write for a while: quadratic forms and definite matrices are everywhere in linear algebra and they…
Recently, I’ve had a chance to play with R’s plumber library and used it to run scripts on a schedule. This post will show how…
Continue reading → Batch Updating with Plumber and Google Scheduler
How many times have you heard someone say they are data-driven or data-centric or that “data is the heart of everything they do”? I’ve…
Continue reading → Why You’re Not as Data-Driven as You Think You Are
Introduction I have a startling admission to make. When I was a student, I scoffed at Dijkstra's algorithm - I had paid it no mind…
Introduction Recently, I've had a chance to play with word embedding models. Word embedding models involve taking a text corpus and generating vector representations for…
Although data science as a job function is relatively new compared to roles like software engineer or database administrator, in the age of “Big Data”,…