The Advantages of Making Your Startup Truly Data-Driven

Jun 04 2020

//Jonathan Løw, CEO @ JumpStory, for The Hub.

The private detective Sherlock Holmes says the following famous words in one of the brilliant crime stories by his creator, Sir Arthur Conan Doyle:

“It is a capital mistake to theorize before one has data.”

Interestingly enough, this Sherlock Holmes quote could also be an excellent mantra for startups in 2020. In one of my books, “The GuruBook”, I talk to my entrepreneur-colleague Danny Lange about the importance of being data-driven and using machine learning as part of this.

Danny Lange is the former Head of AI at Uber and Amazon, and we spoke a lot about the importance of data and how to use it in the right way. I’ve since used his insights a lot in my own tech startup, JumpStory, and I would love to share some of them here on The Hub.


Learning to learn 🎒


I spoke with Danny about both Uber & Amazon, and what smaller tech-startups and growth companies can learn from them.

At Uber, metrics and measurements are at the center of everything. They use machine learning to estimate the time of arrival, to pair people up for Uber Pool rides, and to improve the pickup experience by having a computer learn over time where the good pickup spots are in a particular city.

Basically, the core function of the machine learning algorithms is to measure the experience and minimize the friction during a pickup.

For instance, in the United States, there are a lot of situations and places in which an Uber vehicle cannot stop. Therefore, Uber designed a system that learned this and thus was able to offer a problem-free experience by suggesting both driver and customer meet 20–30 yards away from where the car was booked to stop in the app.

In the second business, the mapping business, Uber used machine learning to build maps for the drivers. They made a system that could read street signs and populate the map. This means that where most companies and people build maps by hand, they used machine learning for the same purpose.

As with the core business, the benefit from doing it like this is that you develop a system that keeps learning and improving. Rather than building a system you constantly have to update manually or calculate all possible outcomes, they have a system that learns by itself and continues to improve—without them having to do more than support it a little on the side.


The potential of machine-learning (ML) and being data-driven 💡


Uber is just one of thousands of companies benefiting from combining machine learning with being data-driven, and therefore it’s important to understand how it works. Let me try to explain.

Imagine that you have to build an application that predicts the shipping time for a company. In the old days, you would look at it the following way: There is a place where you pick up the package and a destination address. You then have to build up a complicated set of rules, a rule system, to include the speed of the trucks, planes, delays, and so on. You would try to compute it and maybe end up with two or three hundred rules to try to predict the shipping time.
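To make the rule-based approach concrete, here is a minimal sketch of what a handful of such rules could look like. The Shipment fields, the function name and every number in it are hypothetical and chosen only for illustration; a real rule system would need hundreds of rules like these, each maintained by hand.

```python
from dataclasses import dataclass


@dataclass
class Shipment:
    distance_km: float
    by_air: bool
    weekday: int      # 0 = Monday ... 6 = Sunday
    weight_kg: float


def rule_based_hours(s: Shipment) -> float:
    """Estimate shipping time from hand-written rules (illustrative numbers)."""
    # Rule 1: travel time depends on the mode of transport.
    speed_kmh = 600.0 if s.by_air else 60.0
    hours = s.distance_km / speed_kmh
    # Rule 2: fixed handling time at the airport or depot.
    hours += 6.0 if s.by_air else 2.0
    # Rule 3: packages handed in at the weekend wait an extra day.
    if s.weekday >= 5:
        hours += 24.0
    # Rule 4: heavy packages need special handling.
    if s.weight_kg > 30.0:
        hours += 4.0
    return hours


print(rule_based_hours(Shipment(600.0, False, 1, 5.0)))
```

Every new exception (holidays, weather, strikes) means another rule, which is why this style of system becomes hard to keep up to date.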

In machine learning, you don’t think or work like this. Instead, you will base your system on millions of package deliveries that have already been made; this is the most important thing—your data. Within this data you will have the weekdays, the sizes of the packages, how quickly they were delivered, and so on. Within machine learning you call this the ground truth.

So, ground truthing refers to the process of gathering objective, verifiable data to train and test against. Bayesian spam filtering is a common example. In this system, the algorithm is taught the difference between spam and non-spam from messages that people have labelled by hand. The result depends on the ground truth of the messages used to train the algorithm; inaccuracies in the ground truth lead to inaccuracies in the resulting spam/non-spam verdicts.
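The spam filter described above can be sketched in a few lines. This is a minimal naive Bayes classifier with add-one smoothing, written for illustration only; the function names and the four training messages are made up, and it assumes the training data contains at least one message of each label.

```python
import math
from collections import Counter


def train(labelled):
    """labelled: list of (text, label) pairs; the labels are the ground truth."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    doc_counts = Counter()
    for text, label in labelled:
        doc_counts[label] += 1
        word_counts[label].update(text.lower().split())
    return word_counts, doc_counts


def classify(text, model):
    """Return the label with the higher log-probability for the text."""
    word_counts, doc_counts = model
    vocab = set(word_counts["spam"]) | set(word_counts["ham"])
    total_docs = sum(doc_counts.values())
    scores = {}
    for label in ("spam", "ham"):
        total_words = sum(word_counts[label].values())
        score = math.log(doc_counts[label] / total_docs)
        for word in text.lower().split():
            # Add-one smoothing so unseen words do not zero out the score.
            count = word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)


ground_truth = [
    ("win a free prize now", "spam"),
    ("free money click now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch on monday with the team", "ham"),
]
model = train(ground_truth)
print(classify("free prize", model))

# Same messages with every label flipped: the verdict flips too.
flipped = [(t, "ham" if l == "spam" else "spam") for t, l in ground_truth]
print(classify("free prize", train(flipped)))
```

The second call shows why the ground truth matters: the algorithm is unchanged, but wrong labels produce wrong verdicts.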

In the case of the shipping, you will then have millions of packages delivered, and the computer can learn a statistical model. When you feed in a new delivery, the system will use the statistics to predict the shipping time based on history. What we have learned is that this system will always outperform the rule-based system. We have stopped trying to understand the rules. Instead, we leave it to the machine learning system to do that.
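As a rough illustration of learning from history instead of writing rules, the sketch below averages past delivery times per weekday and package size and uses those averages as predictions. It is a deliberately simple stand-in for a real statistical model; the function names and the four historical deliveries are invented for the example.

```python
from collections import defaultdict
from statistics import mean


def fit(history):
    """history: list of (weekday, size, hours) tuples from past deliveries."""
    buckets = defaultdict(list)
    for weekday, size, hours in history:
        buckets[(weekday, size)].append(hours)
    averages = {key: mean(values) for key, values in buckets.items()}
    overall = mean(hours for _, _, hours in history)
    return averages, overall


def predict(model, weekday, size):
    """Predict hours from matching history, else fall back to the overall mean."""
    averages, overall = model
    return averages.get((weekday, size), overall)


history = [
    ("Mon", "small", 20.0),
    ("Mon", "small", 24.0),
    ("Fri", "large", 50.0),
    ("Fri", "large", 46.0),
]
shipping_model = fit(history)
print(predict(shipping_model, "Mon", "small"))
print(predict(shipping_model, "Sun", "small"))
```

Nobody wrote a rule saying that large Friday packages are slow; the model picked that up from the data, and it will pick up new patterns when it is refitted on newer deliveries.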

Since the world is constantly changing, this will improve the predictions a lot and save a lot of manpower and time. In the case of machine learning, you can monitor the feedback and constantly measure how good your model is.
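One simple way to monitor the feedback is to compare each prediction with what actually happened and track the average error over the most recent deliveries. The class below is a hypothetical sketch of that idea; the window size and the retraining threshold are arbitrary assumptions, not recommended values.

```python
from collections import deque


class ErrorMonitor:
    """Tracks mean absolute error over a sliding window of recent predictions."""

    def __init__(self, window=100, threshold_hours=6.0):
        self.errors = deque(maxlen=window)  # oldest errors drop out automatically
        self.threshold_hours = threshold_hours

    def record(self, predicted_hours, actual_hours):
        self.errors.append(abs(predicted_hours - actual_hours))

    def mean_absolute_error(self):
        if not self.errors:
            return 0.0
        return sum(self.errors) / len(self.errors)

    def needs_retraining(self):
        return self.mean_absolute_error() > self.threshold_hours


monitor = ErrorMonitor(window=3, threshold_hours=5.0)
for predicted, actual in [(24.0, 26.0), (24.0, 25.0), (24.0, 27.0)]:
    monitor.record(predicted, actual)
print(monitor.mean_absolute_error(), monitor.needs_retraining())

# The world changes: a delivery takes far longer than predicted.
monitor.record(24.0, 40.0)
print(monitor.needs_retraining())
```

When the recent error climbs above the threshold, that is the signal to retrain the model on newer data.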


AI and how it fits together with ML and data 📲


Big data, machine learning and AI are often referred to as if they were the same thing, but they are not, so it’s important to understand the difference to get the most out of them in your startup.

Artificial intelligence is about how a system is perceived and how a system presents itself. Predicting a shipping time, taken on its own, is not really intelligence. But when you take an entire organization and everything it does (from predicting shipping times to detecting hazardous materials with computer vision, self-driving trucks, dynamic pricing based on demand, and so on), it actually appears pretty smart and intelligent.

At Amazon, where Danny Lange also used to work, this was the philosophy: that the whole company would start appearing more intelligent to the customer. And, at one point, they could actually claim to be a really smart organization.

Of course, not everyone can be Amazon, but we can all learn from them – even as Scandinavian startups. In the case of Amazon, their mindset is that they need to be able to beat every retailer out there. They do this by knowing you better. Getting you things faster. Giving you more reasonable prices. Offering you more than a billion products. And all of this is only possible by integrating machine learning and AI into every aspect of the business and business model.

Uber is the same. They can run out of San Francisco but beat a taxi company in almost every country. Both companies are using the technology to a scale that has never been seen before. It enables them to run a service in a faraway country better than the people actually living in the country.


You will be in trouble if you don’t start collecting and using data 💾


We’re seeing a development where you will be in trouble 24–36 months from now if you don’t start taking machine learning seriously. It will happen especially in industries such as transportation, shipping, finance, and retail, but all kinds of companies and leaders – including startups – should look into this much deeper.

Of course, the big companies have an advantage due to the amount of data they often have. The startups lack this, and data is increasingly becoming king. For example, you may be able to build a better app with a better backend than Uber, and pay a crew of drivers more money, but if you don’t have the data to deliver a consistently better pickup experience, all of that might not matter at all.

Fortunately, you don’t have to be Uber or Amazon to succeed, but you have to start collecting data and working with machine learning. A lot of startups are running into the problem that they don’t have the data. Currently, we build homes and offices based on the architect’s creativity and our history and experience of building houses. However, in the future, we could use data and AI to totally change the way we think about the design of houses. And this is just one small example.

The core thing is that startups have to become truly data-driven if they aren’t already, and it’s really about culture more than projects. Stop creating projects and test projects. Instead, think about how machine learning and AI can fundamentally change how you are working and innovating. Remember – it’s a capital mistake to theorize before one has data.

