BigQuery. The path to modern data analytics
Ocado: significant results thanks to big data
The world’s largest online-only grocery retailer, Ocado, improves operational efficiency and customer care using machine learning powered by the Google Cloud Platform.
Results of applying Google Cloud and ML
- Responds to urgent customer emails 4x faster
- Increases contact center efficiency by 7%, enabling agents to use the additional time for high-priority tasks
- Delivers insights 80 times faster and 33% cheaper than the old Ocado data warehouse
- Reduces IT costs while providing scalability and flexibility
According to market research firm Mintel, online grocery sales in the United Kingdom are expected to grow from 6% of the current market to 9% by 2021. One of the pioneers of online-only grocery retail is Ocado, based in Hatfield, Hertfordshire, United Kingdom. Since starting commercial deliveries in 2002, the company has grown to 600,000 active customers, 260,000 weekly orders, and £1.39 billion in annual revenue.
Ocado offers an alternative to supermarkets by enabling shoppers to buy groceries online using its familiar web and mobile apps. Items are picked and packed in automated warehouses and delivered directly to the customer within a one-hour slot of the customer’s choosing. Ocado’s delivery punctuality is 95%, and its service footprint covers over 70% of the UK population.
The company has succeeded by building almost all the technology in-house and implementing automation that enhances its e-commerce, workflow, and logistics platform. Ocado has also developed a new platform, the Ocado Smart Platform (OSP), which offers large retailers worldwide access to first-class online grocery sales solutions.
Democratization of machine learning
The customer journey in online grocery retail is very different from other forms of e-commerce. Shoppers often buy dozens of products at a time, a single household can have multiple shoppers using multiple devices, and a single grocery order can be built up over several days.
“We often say that by building a full-cycle platform that can make online grocery sales large and profitable, we can do other forms of online retail, but the reverse is not necessarily true,” says Paul Clark, chief technology officer at Ocado. “The Google Cloud Platform gives us the flexibility and ability to tackle large and complex data challenges unique to our business.”
Ocado’s business model capitalizes on shifting consumer preferences and on the merging of digital technology with the shopping experience.
The company has been integrating machine learning into its systems for more than five years. Until recently, Ocado’s machine learning applications required data scientists, often with a Ph.D. in machine learning, to build these solutions from scratch. They also required specialists to set up the systems, and expensive on-premises infrastructure to train and run them.
However, working with Google as a private alpha tester of the Google Cloud Machine Learning Engine has accelerated Ocado’s adoption of AI.
“We have been considering how the cloud could democratize artificial intelligence,” Paul says. “The Google Cloud Machine Learning Engine gives us the flexibility we need. Our developers can use TensorFlow and see the benefits of machine learning in the cloud firsthand.”
TensorFlow is an open-source machine learning library developed by the Google Brain team. It lets developers deploy computation to servers, desktops, and mobile devices through a single application programming interface (API). Ocado developers, engineers, and data scientists now use TensorFlow for many machine learning projects, and they train and deploy their models with the Google Cloud Machine Learning Engine, which allows them to train faster. Additionally, the Google Cloud Machine Learning Engine integrates easily with other Google Cloud Platform products widely used at Ocado.
What do buyers really want?
The first TensorFlow-based model built at Ocado was a system that tagged and categorized email messages and prioritized responses. The contact center receives thousands of email messages daily, and Ocado wanted to automate the process of determining which ones needed an immediate response and which ones could wait. For example, a first-time shopper who expresses interest in using Ocado doesn’t need an answer as urgently as a shopper whose order is missing a product or who won’t be home at the delivery time.
“We get a lot of emails from customers saying, ‘Your service is great’ or ‘The courier was very courteous,’” says James Donkin, CEO of Ocado. “However, when things like the weather or traffic have the potential to affect deliveries, we often get a surge in urgent matters. Having agents reply to those instead of sorting through less urgent mail improves Ocado’s differentiation and the user experience.”
Using Google Cloud’s Machine Learning Engine, TensorFlow, and large datasets collected over years of manually categorizing customer email messages, Ocado experimented to find the neural network architectures that best prioritize email messages. After testing its models, Ocado deployed the best-performing one and now responds to urgent messages four times faster. The company also found that 7% of its email messages did not need a response, allowing contact center representatives to spend more time on high-priority messages.
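The prioritization step described above can be sketched as a small queue, assuming a classifier has already assigned each email a label. The label names (`urgent`, `routine`, `no_reply`), the `TriageQueue` class, and the sample subjects are hypothetical; the source does not describe Ocado’s actual categories or implementation.

```python
import heapq
from itertools import count

# Hypothetical labels a classifier might emit; not Ocado's real categories.
PRIORITY = {"urgent": 0, "routine": 1}
NO_REPLY = "no_reply"


class TriageQueue:
    """Orders labelled emails so urgent ones are answered first."""

    def __init__(self):
        self._heap = []
        self._order = count()  # tie-breaker: keeps arrival order within a label
        self.skipped = 0       # emails the classifier marked as needing no reply

    def add(self, subject, label):
        if label == NO_REPLY:
            self.skipped += 1
            return
        heapq.heappush(self._heap, (PRIORITY[label], next(self._order), subject))

    def next_email(self):
        """Return the most urgent waiting subject, or None if the queue is empty."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]


queue = TriageQueue()
queue.add("Thanks, the courier was very courteous", "no_reply")
queue.add("How do I sign up?", "routine")
queue.add("I won't be home at the delivery time", "urgent")
print(queue.next_email())  # the urgent message comes out first
```

Messages labelled as needing no reply never reach an agent, which is where the time saving for high-priority messages would come from in a setup like this.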
“Without the Google Cloud Machine Learning Engine, it would have been much harder to succeed on a project like email classification,” says Roland Plazowski, who until recently led several big data projects and initiatives at Ocado. “Even if we invested heavily in infrastructure, it would still be difficult to manage because of the computational intensity. It’s difficult and expensive to run machine learning projects without having an infrastructure that you can scale easily.”
By analyzing order data, Ocado makes shopping as convenient as possible. For example, its ordering systems can pre-populate shopping carts with the products shoppers are most likely to buy, remind shoppers of things they may have forgotten, and offer them package deals during the buying process. Ocado also uses machine learning to predict user behavior and improve the experience, for example around promotions such as buy-one-get-one-free offers. Based on what it has learned from previous purchases, the Ocado system can also suggest new products that customers are likely to enjoy.
“You will regularly see products that are most suitable for you, instead of things offered in a more general way,” says James. “I am a vegetarian, and as such, I am offered discounts on vegetarian products that I usually buy and on new products that I have never bought. And I am much less likely to see things I am not interested in.”
Machines and machine learning
In the area of the Internet of Things (IoT), Ocado is looking to improve its warehouse robots with machine learning. An integral part of OSP is the thousands of robots that continuously stream data to Google Cloud Storage and Google BigQuery.
Data scientists at Ocado are using machine learning to create a form of swarm intelligence that would allow warehouse robots to work cooperatively towards a common goal. The projects include models that examine telemetry data from robots, such as whether a battery pack is operating within standard tolerances or whether firmware has been successfully downloaded, and models that optimize schedules or detect wear patterns.
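As a rough illustration of the telemetry checks mentioned above, the sketch below flags a robot whose battery reading falls outside a tolerance band or whose firmware download failed. It is a plain rule-based check, not the machine learning models Ocado describes, and the field names and the voltage range are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Telemetry:
    # Hypothetical fields; the source does not specify the telemetry schema.
    robot_id: str
    battery_volts: float
    firmware_ok: bool


# Assumed tolerance band, chosen only for illustration.
BATTERY_RANGE = (22.0, 29.4)


def needs_attention(reading, battery_range=BATTERY_RANGE):
    """Return the list of reasons a reading should be reviewed (empty if none)."""
    low, high = battery_range
    reasons = []
    if not low <= reading.battery_volts <= high:
        reasons.append("battery out of tolerance")
    if not reading.firmware_ok:
        reasons.append("firmware download failed")
    return reasons


readings = [
    Telemetry("robot-1", 25.1, True),
    Telemetry("robot-2", 20.3, True),
    Telemetry("robot-3", 26.0, False),
]
for reading in readings:
    print(reading.robot_id, needs_attention(reading))
```

A learned model would replace the fixed band with thresholds or patterns derived from historical telemetry, but the input and output would look similar.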
“Another issue we are considering is how to build machine learning directly into robots so that they become smarter in terms of self-testing, exception handling, and failure recovery,” says Paul. “This is a complex combination of IoT, data analytics, and machine learning, for which we hope Google BigQuery and Google Cloud Machine Learning will be the best fit.”
Scale for new business
Scalability is also a major driver of Ocado’s initiatives, including the migration of all on-premises data to the cloud. Ocado wanted to improve user experience, empower business teams with better insights, and reduce IT overhead, so it consolidated on the Google Cloud Platform.
“The old databases weren’t fast enough,” says Paul. “We need a solution that can scale with the data we generate and understand how we can use it. Google Cloud Storage and Google BigQuery now represent the data backbone of the Ocado Smart Platform.”
Ocado estimates its business, product, and transaction data at approximately two petabytes. Combining data from the buying and supply chains supports both Ocado’s operations team and the company’s ambitions to commercialize OSP.
“Compared to other international expansion opportunities, selling OSP as a managed service allows us to win as customers companies that could otherwise compete with us,” says Paul. “We want to build OSP once and then offer it to a variety of business-to-business buyers.”
Every time Ocado onboards a new client onto OSP, it launches a new custom instance to meet the client’s needs. The capacity and performance of each new OSP instance must scale rapidly, because it serves as the back-end platform for established retailers with a wide range of products, customers, and transactions.
Ocado’s first OSP customer, Morrisons, has already benefited from this decision. Morrisons is one of the UK’s four largest supermarkets and uses OSP to strengthen its online retail business. Using the Google Cloud Platform, Ocado stored, processed, and analyzed terabytes of Morrisons data using dedicated datasets and Google BigQuery.
In addition to using the Google Cloud Platform for OSP, Ocado has moved its own online grocery retail operations onto it. Ocado was an early adopter of the open source Apache Spark and Apache Hadoop frameworks on Google Compute Engine for its data platform. Moving to Google BigQuery frees Ocado business analysts from the complex queries and workflows associated with Spark and Hadoop. It also allows Ocado to share data analysis with suppliers and partners.
Google BigQuery integrates well with TensorFlow on the Google Cloud Machine Learning Engine and with Google Cloud Dataproc, the managed Apache Spark and Apache Hadoop service, allowing Ocado to use open source tooling for batch processing, queries, streaming, and machine learning. Google Cloud Dataflow and Google Cloud Dataproc provide cluster management and an easy-to-use framework, so developers can spend less time and money on administration and more time delivering business-critical work.
Google manages the infrastructure. Moving from Hadoop to Google BigQuery improved both price and performance. For example, Ocado no longer needs to decide how many instances to add to a cluster and then wait for them to launch.
“We just ran our queries and paid for the resources we used,” Roland added. “One big win with Google BigQuery is that no customization is required. The best result we’ve seen is Google BigQuery outperforming our Hadoop cluster by 80x on the largest dataset at only two-thirds of the cost.”
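To make the quoted ratios concrete, the sketch below applies them to a made-up baseline. Only the 80x speedup and the two-thirds cost ratio come from the quote, and they describe the best case Ocado observed; the baseline runtime and cost are hypothetical.

```python
from fractions import Fraction

SPEEDUP = 80                 # from the quote: 80x faster on the largest dataset
COST_RATIO = Fraction(2, 3)  # from the quote: two-thirds of the cost


def project(baseline_seconds, baseline_cost):
    """Apply the quoted best-case ratios to a baseline query."""
    return baseline_seconds / SPEEDUP, float(baseline_cost * COST_RATIO)


# Hypothetical baseline: a two-hour Hadoop job costing 30 currency units.
seconds, cost = project(2 * 60 * 60, 30)
savings_pct = float((1 - COST_RATIO) * 100)
print(seconds, cost, round(savings_pct))  # 90.0 20.0 33
```

Under these assumptions a two-hour job would finish in 90 seconds, and paying two-thirds of the cost is the same as the roughly 33% saving listed in the results summary.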