One of the core principles of software engineering is “encapsulation”. However, this principle is often overlooked when deploying a machine-learned model. A machine-learned model is a composite of two things: transformations (such as One Hot Encoding, Imputing Values, etc.) and scoring (e.g., Linear Regression). …

Assume you have a table where each row contains various features about customers. These features can be static (eg. account creation time, gender, etc) and/or dynamic (eg. “number of searches”, “last visited on”, etc). One of the challenges encountered in maintaining such a table is that different processes updated these…

While analyzing an experiment data, I encountered an interesting brain teaser. I wanted to use the bootstrap method and for that needed to sample my data with replacement. After some iterations, I got the perfect way to do it. The trick is using UNNEST with sequence operation to duplicate the…

The importance of “Sampling” cannot be overstated. The conclusions we draw from the data as well as the quality of the machine-learned model significantly depend on how we sample the data. However, there are many different ways to sample the data and expressing these different ways of sampling in SQL…

Ritesh Agrawal

Senior Machine Learning Engineer, Varo Money; Contributor and Maintainer of sklearn-pandas library

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store