Authors: Ritesh Agrawal, Mia F, Vinod Pazhettilkambarath, Viral Parikh

Putting ML models in production requires a robust platform. In an earlier post, we discussed the design and architecture of Feature Store, one of the critical components of ML Platform. …

Authors: Ritesh Agrawal, Brandon Lee

At Varo, our mission is to help millions of Americans achieve financial well-being by building a bank for all of us. We are investing in building Varo’s AI/ML platform to build, train and deploy models. While machine learning (ML) involves mathematics and algorithms, it is…

One of the core principles of software engineering is “encapsulation”. However, this principle is often overlooked when deploying a machine-learned model. A machine-learned model is a composite of two things: transformations (such as One Hot Encoding, Imputing Values, etc.) and scoring (e.g., Linear Regression). …

Assume you have a table where each row contains various features about customers. These features can be static (e.g., account creation time, gender, etc.) and/or dynamic (e.g., “number of searches”, “last visited on”, etc.). One of the challenges encountered in maintaining such a table is that different processes updated these…

PyTorch is one of the most widely used libraries for deep learning, but it is also one of the more difficult to understand because of the many side effects that one object can have on another. For instance, calling the “step” method of an optimizer updates the module object’s parameters. Trying to…

One of the first challenges in machine learning on structured data is “Feature Engineering.” It involves deciding whether to treat a variable as a numerical variable or a categorical variable and further to choose various transformations of data such as log transformation, one-hot encoding, target encoding, etc. These decisions are…

From self-driving cars to medical assistants, businesses have found a multitude of ways to leverage machine learning and artificial intelligence to develop smart and responsive products. However, I am often intrigued to notice that the application of machine learning at the infrastructure level is almost nonexistent. For instance, while we have models…

While analyzing data from an experiment, I encountered an interesting brain teaser. I wanted to use the bootstrap method, and for that I needed to sample my data with replacement. After some iterations, I found a clean way to do it. The trick is using UNNEST with a sequence operation to duplicate the…
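To make "sampling with replacement" concrete, here is a minimal Python sketch of the bootstrap idea itself. It is not the SQL approach the post describes; the function names `bootstrap_means` and `percentile_interval` and the sample values are hypothetical, chosen only for illustration.

```python
import random
import statistics


def bootstrap_means(values, n_resamples=1000, seed=0):
    """Return the mean of each bootstrap resample of `values`."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(values)
    means = []
    for _ in range(n_resamples):
        # Sampling WITH replacement: the same observation may be drawn
        # several times, and some observations may not be drawn at all.
        resample = rng.choices(values, k=n)
        means.append(statistics.fmean(resample))
    return means


def percentile_interval(samples, lower=0.025, upper=0.975):
    """Rough percentile interval taken from the sorted bootstrap statistics."""
    ordered = sorted(samples)
    last = len(ordered) - 1
    return ordered[int(lower * last)], ordered[int(upper * last)]


# Hypothetical measurements, made up for this example.
data = [12.0, 15.5, 9.8, 20.1, 14.3, 11.7, 18.2, 13.9]
means = bootstrap_means(data)
low, high = percentile_interval(means)
print(f"approximate 95% interval for the mean: [{low:.2f}, {high:.2f}]")
```

Each resample has the same size as the original data, so the spread of the resampled means gives a rough picture of how uncertain the observed mean is.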

The importance of “Sampling” cannot be overstated. The conclusions we draw from the data as well as the quality of the machine-learned model significantly depend on how we sample the data. However, there are many different ways to sample the data and expressing these different ways of sampling in SQL…

There are many metrics for evaluating a regression model, but they often seem cryptic. Below is an attempt to help understand the intuition behind two such commonly used metrics: mean/median absolute error and R² (the coefficient of determination).
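As a small illustration of these two metrics, the sketch below computes them from their standard definitions using only the Python standard library. The house prices and predictions are made-up numbers, and the function names are hypothetical rather than taken from the post.

```python
import statistics


def absolute_errors(y_true, y_pred):
    """Absolute difference between each actual value and its prediction."""
    return [abs(t - p) for t, p in zip(y_true, y_pred)]


def mean_absolute_error(y_true, y_pred):
    return statistics.fmean(absolute_errors(y_true, y_pred))


def median_absolute_error(y_true, y_pred):
    return statistics.median(absolute_errors(y_true, y_pred))


def r2_score(y_true, y_pred):
    """R^2 = 1 - SS_res / SS_tot, where SS_tot is measured around the mean."""
    mean_true = statistics.fmean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Hypothetical house prices (in thousands) and model predictions.
actual = [200, 300, 400, 500]
predicted = [210, 290, 420, 480]

print(mean_absolute_error(actual, predicted))    # typical size of the error
print(median_absolute_error(actual, predicted))  # less sensitive to outliers
print(r2_score(actual, predicted))               # improvement over predicting the mean
```

The absolute-error metrics are in the same units as the target (here, thousands of dollars), while R² is unitless and compares the model against always predicting the average price.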

Average Accuracy of the Model (Mean/Median Absolute Error)

Let’s assume you got a model that can predict house prices. Naturally…

Ritesh Agrawal

Senior Machine Learning Engineer, Varo Money; Contributor and Maintainer of sklearn-pandas library
