SupplyKick is an innovative e-commerce company that specializes in selling goods on Amazon.com and other online retailers. Their mission is to simplify and revolutionize the world of e-commerce and grow manufacturers’ brands to their full potential through lasting partnerships. The goal of this project was to find a more accurate way to predict sales in order to inform restocking orders and inventory levels. For an Amazon retailer, overstocking leads to high warehouse storage charges, while understocking risks running out of inventory and bringing sales to a halt. This is a very complex problem for humans to solve, as each product has different ordering trends and seasonality.
So, what exactly is Machine Learning?
Machine learning is a growing field in computer science in which scientists attempt to teach computers to learn abstract concepts much as a human would. This is a major paradigm shift from traditional programming, in which programs operate using a set of logical rules that are explicitly stated by programmers. In machine learning, the programmers instead develop a model, a structure that is able to learn from the data that the user feeds into it.
One example that illustrates the difference between traditional programming and machine learning is reading handwriting. With traditional programming, a programmer could attempt to write a set of rules that determine which letter is which, but this would be very time-consuming, and it would be practically impossible to come up with rules that identify all types of handwriting. In machine learning, a programmer creates a model, which can be thought of as a complex mathematical function, and feeds in handwriting data for which the correct answers are known. The model reads in all of this data, makes predictions about what it thinks the answers (labels) are, compares its predictions with the correct answers, and adjusts itself until it is able to make accurate predictions. The machine learning model is not only much less labor-intensive to produce, but also makes much more accurate predictions on many more types of data.
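The predict, compare, and adjust cycle described above can be sketched in a few lines of Python. This is only a toy illustration, not the model used in this project: the "model" is a straight line `y = w * x + b`, and the data, the learning rate, and the function name `train` are all made up for the example.

```python
# Toy illustration of "predict, compare, adjust" (not the project's model).
# The model is y = w * x + b; the data and learning rate are made up.

def train(xs, ys, epochs=1000, lr=0.01):
    w, b = 0.0, 0.0  # untrained model: every prediction starts at 0
    n = len(xs)
    for _ in range(epochs):
        # 1. Predict with the current parameters.
        preds = [w * x + b for x in xs]
        # 2. Compare the predictions with the known correct answers (labels).
        errors = [p - y for p, y in zip(preds, ys)]
        # 3. Adjust the parameters in the direction that reduces the
        #    mean squared error (plain gradient descent).
        grad_w = 2 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2 * sum(errors) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]  # generated from y = 2x + 1
w, b = train(xs, ys, epochs=5000, lr=0.05)
```

After training, `w` and `b` end up very close to 2 and 1, the values that generated the data, even though nobody ever told the program that rule explicitly.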
The model used in this project is a special type of machine learning model called an “artificial neural network”. These types of models have been around for over half a century (the first was theorized in 1958), but recent improvements in computational power have made them practical to use. Essentially, they are modeled on the way brains (or “neural networks”) work, as they are an interconnected structure of “neurons” (or nodes) and “synapses” (or connections). Unlike real neural networks, which send biological compounds called “neurotransmitters” between neurons, neurons in artificial neural networks simply communicate using numbers. As it turns out, this seemingly basic structure is able to pick up complex concepts, similar to humans. It’s worth noting that this project applied a special type of neural network called a “recurrent neural network”, which is specifically designed to read sequences of data.
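To make the idea of a recurrent network reading a sequence a little more concrete, here is a minimal sketch of a single recurrent “neuron” in plain Python. It is an assumption-laden illustration, not the network from this project: the function name `rnn_forward`, the weights, and the choice of `tanh` as the activation are picked only for the example, and real networks use many neurons and learn their weights during training.

```python
import math


def rnn_forward(sequence, w_in, w_rec, w_out, bias=0.0):
    """Run a one-neuron recurrent network over a sequence of numbers."""
    hidden = 0.0  # the network's "memory", empty before the first step
    outputs = []
    for x in sequence:
        # The new hidden state mixes the current input with the previous
        # hidden state, so earlier inputs influence later outputs.
        hidden = math.tanh(w_in * x + w_rec * hidden + bias)
        outputs.append(w_out * hidden)
    return outputs


# Hypothetical weights; a trained network would learn these from data.
out = rnn_forward([1.0, 0.0, 0.0], w_in=0.5, w_rec=0.8, w_out=1.0)
```

Even though the second and third inputs are zero, the corresponding outputs are not, because the hidden state still carries information from the first input. That carried-over state is what lets this kind of network pick up on trends over time.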
The first thing we needed to do was get an actual dataset to work with. We did so using Amazon’s Marketplace Web Services API. We pulled in the responses from this API and stored them as CSV files that are easy for our program to read. We then cleaned the data by filtering out repeat entries and records that didn’t contain any data at all. Because the goal is to predict future sales, we ordered the rows by date and totaled all orders placed on the same day. Finally, through experimentation, we found that the model fits best to data for individual products rather than for all products at once, so we filtered down to only the rows for a specific product. The result of all of this cleaning is plotted in blue.
As we can see, the data is pretty noisy, which makes it hard for the model to achieve a good fit. What we really want is the general trend rather than data that changes drastically from day to day. To get it, we apply a smoothing function, as illustrated in red.
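The cleaning and smoothing steps above could look roughly like the following sketch. Everything specific in it is an assumption made for illustration: the column names (`order_id`, `purchase_date`, `sku`, `quantity`), the sample rows, and the helper names `daily_sales` and `moving_average` are hypothetical, and a trailing moving average is just one common smoothing function, not necessarily the one used in this project.

```python
import csv
import io
from collections import defaultdict

# Hypothetical export; the real column names from the API may differ.
RAW_CSV = """order_id,purchase_date,sku,quantity
1001,2018-06-01,WIDGET-A,2
1001,2018-06-01,WIDGET-A,2
1002,2018-06-01,WIDGET-A,1
1003,2018-06-01,WIDGET-B,5
,,,
1004,2018-06-02,WIDGET-A,9
1005,2018-06-03,WIDGET-A,3
1006,2018-06-04,WIDGET-A,9
"""


def daily_sales(csv_text, sku):
    """Return [(date, units_sold), ...] for one product, ordered by date."""
    seen = set()
    totals = defaultdict(int)
    for row in csv.DictReader(io.StringIO(csv_text)):
        if not any((v or "").strip() for v in row.values()):
            continue  # skip records that contain no data at all
        key = tuple(row.values())
        if key in seen:
            continue  # skip repeat entries
        seen.add(key)
        if row["sku"] != sku:
            continue  # keep only the product we are modeling
        totals[row["purchase_date"]] += int(row["quantity"])
    # ISO-formatted dates sort chronologically as plain strings.
    return sorted(totals.items())


def moving_average(values, window=7):
    """Trailing moving average; early points use the days available so far."""
    smoothed = []
    running = 0.0
    for i, v in enumerate(values):
        running += v
        if i >= window:
            running -= values[i - window]  # drop the day leaving the window
        smoothed.append(running / min(i + 1, window))
    return smoothed


series = daily_sales(RAW_CSV, "WIDGET-A")
smooth = moving_average([units for _, units in series], window=2)
```

A wider window gives a smoother curve but reacts more slowly to real changes in demand, so the window size is something to tune. Note that this sketch does not fill in days with zero orders; a real pipeline would probably want to.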
Let’s try getting some predictions! After no training, here’s how well the model fits to the training data.
It’s pretty terrible, but this is what we expect without any training. This would be like walking into a calculus test without even having heard what “math” is. After training on the data 100 times, we get the following result.
It seems to be picking up the general trends in the data but doesn’t quite hit the mark. Let’s try doing another 900 iterations through the data, for a total of 1000 training runs.
Now the prediction is clearly fitting the actual sales trend. Also, keep in mind that although 1000 iterations sounds like a lot, the entire training was completed in about 15 seconds. However, the real challenge is making good predictions on data that the model has never seen before. Below are the predictions on the “test” dataset. These predictions aren’t quite as good as the ones on the training set, which is expected, but they are still very accurate overall and likely more precise than a human’s predictions.
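For readers wondering what a “test” dataset looks like in practice, here is a small hedged sketch. The post does not say how the data was split or scored, so this assumes a simple chronological split and mean absolute error as the score; the helper names and the sales numbers are made up, and the “model” is only a naive stand-in that repeats the last known value.

```python
def chronological_split(series, test_fraction=0.2):
    """Keep the most recent points for testing; never shuffle a time series."""
    n_test = max(1, round(len(series) * test_fraction))
    cut = len(series) - n_test
    return series[:cut], series[cut:]


def mean_absolute_error(actual, predicted):
    """Average size of the prediction error, in units sold."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Made-up daily sales figures, oldest first.
sales = [5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
train_set, test_set = chronological_split(sales, test_fraction=0.2)

# Stand-in "model": predict that every test day equals the last training day.
baseline = [train_set[-1]] * len(test_set)
error = mean_absolute_error(test_set, baseline)
```

A naive baseline like this is a useful yardstick: a trained model is only worth deploying if its error on the test set is clearly lower than the baseline’s.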
What this could look like for you
Of course, machine learning is cool, but you probably want to see the real business applications for the technology. Here’s an example of how it can be implemented in our services.
This is a screenshot of the dashboard interface we have built for the client. As you can see, it already provides them with lots of useful metrics to inform their business decisions. The end goal is to have the model running on the server, constantly pulling new data from Amazon and retraining itself. It would then be able to show a projection of future sales to help the client reorder products.
For the purposes of this post, the concepts mentioned are heavily abstracted. However, if you’d like to learn more about the field, here are some other resources you may want to look into:
- 3Blue1Brown – YouTube channel with amazing visual representations of abstract mathematical concepts
- Two Minute Papers – YouTube channel that distills recent machine learning research papers down to 2-minute videos
- Siraj Raval – YouTube channel with tutorials on lots of machine learning concepts
- Google Machine Learning Crash Course – Web tutorial series on machine learning (has more of a business/real-world focus)
- Machine Learning Mastery – Website with tons of machine learning tutorials and resources
- Colah’s Blog – In-depth explanations of ML concepts by a Google Brain scientist
- Distill – Explains abstract machine learning concepts through interactive widgets