For our fairhair.ai platform we enrich over 450 million documents such as news articles and social posts per day, with a dependency tree of more than 20 NLP syntactic and semantic enrichment tasks. We ingest these documents as a continuous stream of data and guarantee delivery of enriched documents within 5 minutes of ingestion.
This technical feat required tight collaboration between two specialised teams: data science and platform engineering. Enabling both teams to efficiently work together around a common workflow execution engine was another problem we needed to solve. Hopefully that description fully piqued your interest because our solution (Benthos) is totally boring.