At the heart of Meltwater’s and Fairhair.ai’s information retrieval systems lies a collection of Elasticsearch clusters containing billions of social media posts and editorial articles.
The index shards in our clusters vary greatly in access pattern, workload, and size, which presents some very interesting challenges.
This blog post describes how we use linear optimization modeling to distribute search and indexing workload as evenly as possible across all nodes in our clusters.