Meltwater Blog

Inside Meltwater Engineering.

We build the platforms that help comms teams see around corners. Expect practical engineering lessons, data pipelines at scale, and product thinking from the people behind Meltwater.

Locality-sensitive Hashing in Elixir

elixir erlang profiling locality-sensitive hash simhash deduplication near-duplicate detection LSH

My team and I have built a solution that mines a stream of online articles for real-time insights for our customers. This component’s logic could be dramatically simplified if we could assume that it never receives near-duplicates of articles. While deduplication of identical documents is simple, detection of near-duplicates (i.e. “same thing,...

Monitoring your System’s Heartbeat using Cloudwatch

AWS Cloudwatch logging heartbeat monitoring

Have you implemented a system that is supposed to perform tasks at regular intervals? Does the repeated failure of such a system pose a threat to your quality of service? If so, I am sure you would want to be alerted if your system suddenly stops performing these tasks. We at Meltwater’s...

JUGRI: The JUpyter - GRemlin Interface

gremlin jupyter knowledge graph python data science

Jupyter is a popular web framework used with Python to easily visualize and manipulate data. It can display results from many databases using the Pandas library, but the popular Gremlin graph query language has not been supported. To solve this problem, we created and open-sourced JUGRI to show your Gremlin query results...

Risk-free Deployments with Immutable Web Apps

open source web apps immutablewebapps

Today we are excited to share our Immutable Web Applications methodology with you. Immutable Web Applications is a framework-agnostic methodology for building and deploying static, single-page applications that minimizes the complexity of live releases and enables continuous delivery through simple, flexible, atomic deployments. If you care about building web applications, and want...

Hosting the Elixir Berlin Meetup

elixir berlin

In Meltwater’s Berlin office, we are enthusiastic users and advocates of Elixir and Ruby. Hence we were excited to get the chance to host the Elixir Berlin meetup for the first time this November. It was already the 53rd edition of Elixir Berlin, what a great streak! Besides hosting the event...

Optimal Shard Placement in a Petabyte Scale Elasticsearch Cluster

elasticsearch linear optimization load balancing fairhair.ai

At the heart of Meltwater’s and Fairhair.ai’s information retrieval systems lies a collection of Elasticsearch clusters containing billions of social media posts and editorial articles. The index shards in our clusters vary greatly in their access patterns, workload, and size, which presents some very interesting challenges. This blog post describes how we...

Quitsies - A Minimal Persisted Memcached Replacement

Memcached Persisted Memcached key-value store open source

Quitsies is a distributed, disk-persisted caching system that implements a subset of the Memcached text protocol. It was built as a minimal drop-in replacement for Memcached, and has been running in our production pipelines for over a year. This post explains why we needed Quitsies, and how we went about...

Increase Diversity by Reducing Biases in your Hiring Process

Bias Diversity Gothenburg Hiring Recruiting

Would you agree that your biases are affecting your recruitment process? We have been thinking about this, and we were especially curious how we could improve our recruitment process by working with our biases and learning to disarm them when hiring. In this post we are sharing the tools and processes...

Using Machine Learning to Load Balance Elasticsearch Queries

elasticsearch machine learning ml deep learning load balancing fairhair.ai fhai meltwater

Meltwater recently launched the Fairhair.ai data science platform. The platform includes several large Elasticsearch clusters, which serve insights over billions of social media posts and editorial articles. The nature of the searches that our customers need to run against this data quickly makes the default load balancing behaviour of Elasticsearch...