December 02, 2022

How we upgraded an old, 3PB large, Elasticsearch cluster without downtime. Part 4 - Tokenization and normalization for high recall in all languages

This is part 4 in our series on how we upgraded our Elasticsearch cluster without any downtime and with minimal user impact.

In part 2 we explained that we decided to do a full reindexing of our entire dataset as part of this Elasticsearch upgrade project. This blog post explains some of the changes we made to our documents during that reindexing.

November 25, 2022

How we upgraded an old, 3PB large, Elasticsearch cluster without downtime. Part 3 - Search Performance & Wildcards

This is part 3 in our series on how we upgraded our Elasticsearch cluster without any downtime and with minimal user impact.

As part of the Elasticsearch Upgrade project, we needed to investigate the search performance improvements between the old and the new versions. Running an older version of Elasticsearch has presented many performance issues over the years and we hoped that upgrading to a more recent version would help.

This blog post will describe how we tested the search performance of our new Elasticsearch cluster and the different optimizations we used to improve it. Specifically, we will focus on how we solved the major bottleneck for our use case: wildcards.

November 18, 2022

How we upgraded an old, 3PB large, Elasticsearch cluster without downtime. Part 2 - Two consistent clusters

This is part 2 in our series on how we upgraded our Elasticsearch cluster without downtime and with minimal user impact.

As described in Part 1, our requirements were to provide a smooth transition between two different versions of our system while keeping the option of a rollback open.

With that in mind it was obvious from the beginning that we would have to run two Elasticsearch clusters in parallel and then manage a seamless transition between them. This blog post will describe how we solved the indexing consistency and data migration parts of that problem.

November 11, 2022

How we upgraded an old, 3PB large, Elasticsearch cluster without downtime. Part 1 - Introduction

Back in 2018, we published a blog post describing our 400+ node Elasticsearch cluster. In that post we brought up an important topic:

So far, we have elected to not upgrade the cluster. We would like to, but so far there have been more urgent tasks. How we actually perform the upgrade is undecided, but it might be that we choose to create another cluster rather than upgrading the current one.

Well, the day to upgrade finally came.

August 30, 2022

Knowledge Sharing as a Catalyst for Professional Growth

We had another Devopsicon — our internal engineering (un)conference. We record most of these company-internal sessions to ensure we can share the knowledge with those who were unable to attend, and to build a knowledge base over time.

We have decided to go one step further and share some of these sessions publicly. This post describes how this approach has benefitted both us as a company and the professional growth of our engineers.
