System Design Cheatsheets: ElasticSearch

Understand how and when to use ElasticSearch in systems, with three practical system design examples


What is search, and why is it important?

If you’ve read my previous articles on search, you know how critical search is to an application. Think about it: across all the web and mobile apps you use every day, be it Netflix, Amazon, Swiggy, etc., the search bar is probably the one UI element they all share, usually on the homepage, right at the top. If you are designing a system, ninety-nine times out of a hundred you’ll have to think about how to power search.

Building a search system is no small feat, but a great starting point is ElasticSearch. If you don’t know anything about how search or recommendation systems work, this blog post is a good starting point for you. We will discuss what ElasticSearch is, where it works and where it doesn’t, and three common designs in which ElasticSearch is used. There are a lot more attributes of a search system, but more on that towards the end of the article.

What is ElasticSearch?

ElasticSearch is a popular database that does something that most databases struggle with: Searching. Searching is so core to ElasticSearch, it’s literally in its name!

But if you haven’t heard about ElasticSearch, you’re probably thinking: why is searching so difficult? Why can’t a relational database perform a search? Most relational databases support various ways to search and filter through data, like the WHERE query, the LIKE keyword, or indexes. Or why can’t a document database like MongoDB work? You can write find queries in MongoDB as well.

To understand the answer, imagine you are building a news website. When a user searches for news using your search bar, say for “COVID19 infections in New Delhi”, they are interested in all the articles that talk about COVID infections in New Delhi. In a simple search system, that would mean scanning all the articles in the database and returning those that contain the words “COVID19”, “infections” or “New Delhi”. A relational database is not built for that. It lets you search for articles based on specific attributes, for example, articles written by a particular author or articles published today, but it can’t (at least, not efficiently) scan every single news article (often tens of millions) and return those that contain certain words.

Moreover, there are more intricacies to consider. How do you score these articles? Maybe one article talks about the spread of COVID19 infections and another talks about new infections. How do you know which is more relevant to the user’s query? In other words, how do you sort these articles by relevance?

Answer: ElasticSearch! ElasticSearch can do all this and much much more right out of the box.

But, like everything else in the world, it comes with its fair share of disadvantages. Let’s discuss what ElasticSearch is, when to use it, and most importantly when it doesn’t make sense.


Searching Capabilities

ElasticSearch provides a way to perform a “full-text search”. Full-text search refers to searching for a phrase or a word in a huge corpus of documents. Let’s continue with our previous example: imagine you are building a news website that contains millions of news articles. Each article contains some data, like a heading, subheading, the content of the article, when it was published, etc. In the context of ElasticSearch, each article is stored as a JSON document.

You can load all these documents into ElasticSearch and then search for specific words or phrases within each of these documents in a few milliseconds. So if you load up all the news articles, and then perform a search, “COVID19 infections in Delhi”, ElasticSearch returns all the articles that have the words “COVID19”, “infections”, or “Delhi”.
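To get a feel for how this kind of lookup can be fast, here is a minimal Python sketch of an inverted index, the general data structure behind full-text search. This is a toy under simplifying assumptions (whitespace tokenization, a hypothetical three-document corpus, OR matching), not how ElasticSearch is actually implemented.

```python
from collections import defaultdict

def tokenize(text):
    """Lowercase, split on whitespace, and strip basic punctuation."""
    words = (w.strip(".,!?\"'").lower() for w in text.split())
    return [w for w in words if w]

def build_inverted_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index

def search(index, query):
    """Return ids of documents containing at least one query term."""
    matches = set()
    for term in tokenize(query):
        matches |= index.get(term, set())
    return matches

# Hypothetical mini-corpus, for illustration only
docs = {
    1: "COVID19 infections rise in Delhi",
    2: "New infections decline globally",
    3: "Stock markets rally after election",
}
index = build_inverted_index(docs)
print(sorted(search(index, "COVID19 infections in Delhi")))  # [1, 2]
```

The point of the structure is that a query only touches the entries for its own words instead of reading every document.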

To demonstrate searching in ElasticSearch, let’s set up ElasticSearch and load some data into it. For this post, I will use this News dataset I found on Kaggle (Misra, Rishabh. “News Category Dataset.” arXiv preprint arXiv:2209.11429 (2022)) (Source) (License). The dataset is pretty simple: it contains around 210,000 news articles, with their headlines, short descriptions, authors, and some other fields we don’t care much about. We don’t really need all 210,000 documents, so I will load around 10,000 documents into ES and start searching.

This is an example of a document in the dataset:

{
  "link": "https://www.huffpost.com/entry/new-york-city-board-of-elections-mess_n_60de223ee4b094dd26898361",
  "headline": "Why New York City’s Board Of Elections Is A Mess",
  "short_description": "“There’s a fundamental problem having partisan boards of elections,” said a New York elections attorney.",
  "category": "POLITICS",
  "authors": "Daniel Marans",
  "country": "IN",
  "timestamp": 1689878099
}

Each document represents a news article. Each article contains a link, a headline, a short_description, a category, authors, a country (random values, added by me), and a timestamp (again random values, added by me).

Elasticsearch queries are written in JSON. Instead of diving deep into all the different syntaxes you can use to create search queries, let’s start simple and build from there.

One of the simplest full-text queries is the multi_match query (don’t worry too much about querying data in ElasticSearch; it’s pretty simple and we will talk about it towards the end of the article). The idea is simple: you write a query and ElasticSearch performs a full-text search, finding the documents that contain the words in that query, assigning a score to them, and returning them. Conceptually this checks every document in your database, although in practice ElasticSearch consults its indices rather than scanning documents one by one. For example,

GET news/_search
{
  "query": {
    "multi_match": {
      "query": "COVID19 infections"
    }
  }
}

The above query finds relevant articles for the query “COVID19 infections”. These are the results I got back –

{
  "_index": "news",
  "_id": "czrouIsBC1dvdsZHkGkd",
  "_score": 8.842152,
  "_source": {
    "link": "https://www.huffpost.com/entry/china-shanghai-lockdown-coronavirus_n_62599aa1e4b0723f8018b9c2",
    "headline": "Strict Coronavirus Shutdowns In China Continue As Infections Rise",
    "short_description": "Access to Guangzhou, an industrial center of 19 million people near Hong Kong, was suspended this week.",
    "category": "WORLD NEWS",
    "authors": "Joe McDonald, AP",
    "country": "IN",
    "timestamp": 1695106458
  }
},
{
  "_index": "news",
  "_id": "ODrouIsBC1dvdsZHlmoc",
  "_score": 8.064016,
  "_source": {
    "link": "https://www.huffpost.com/entry/who-covid-19-pandemic-report_n_6228912fe4b07e948aed68f9",
    "headline": "COVID-19 Cases, Deaths Continue To Drop Globally, WHO Says",
    "short_description": "The World Health Organization said new infections declined by 5 percent in the last week, continuing the downward trend in COVID-19 infections globally.",
    "category": "WORLD NEWS",
    "authors": "",
    "country": "US",
    "timestamp": 1695263499
  }
}

As you can see, it returns documents that discuss COVID19 infections. It also returns them sorted in order of relevance (the _score field indicates how relevant a particular document is).
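If you are curious where a relevance score can come from, here is a simplified Python sketch of BM25-style scoring, the family of ranking functions ElasticSearch uses by default. The corpus is hypothetical, tokenization is a plain whitespace split, and the numbers will not match real _score values; treat it as an illustration of the idea (rare words and repeated matches count for more, long documents are penalized), not a reimplementation.

```python
import math

def bm25_scores(docs, query, k1=1.2, b=0.75):
    """Score every document against the query with a simplified BM25 formula."""
    tokenized = {doc_id: text.lower().split() for doc_id, text in docs.items()}
    n_docs = len(tokenized)
    avg_len = sum(len(terms) for terms in tokenized.values()) / n_docs
    scores = {}
    for doc_id, terms in tokenized.items():
        score = 0.0
        for q in query.lower().split():
            tf = terms.count(q)  # how often the query term occurs in this document
            if tf == 0:
                continue
            df = sum(1 for t in tokenized.values() if q in t)  # documents containing the term
            idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))  # rarer terms weigh more
            length_norm = k1 * (1 - b + b * len(terms) / avg_len)
            score += idf * tf * (k1 + 1) / (tf + length_norm)
        scores[doc_id] = score
    return scores

# Hypothetical mini-corpus, for illustration only
docs = {
    1: "covid19 infections rise as covid19 spreads",
    2: "new infections decline",
    3: "markets rally",
}
scores = bm25_scores(docs, "covid19 infections")
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)  # [1, 2, 3]
```

Document 1 ranks first because it matches both query words, including the rarer “covid19”, while document 3 matches nothing and scores zero.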

ElasticSearch has a rich query language with a lot of features, but for now, it is enough to know that building a simple search system is very easy: load all your data into ElasticSearch and use a simple query like the one we discussed. We have a plethora of options to improve, configure, and tweak search performance and relevance (again, more on search queries towards the end of this post).
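As a rough sketch of what “load your data and query it” looks like on the wire, the Python below builds the two request bodies involved: a newline-delimited payload in the shape the _bulk endpoint expects, and a multi_match search body. It only constructs the payloads and sends nothing; the helper names are mine, and the index name `news` and field list are assumptions taken from the example above.

```python
import json

def bulk_payload(index_name, docs):
    """Build an NDJSON body: one action line followed by one document line per doc."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index_name}}))
        lines.append(json.dumps(doc))
    # The bulk format requires a trailing newline after the last line
    return "\n".join(lines) + "\n"

def multi_match_body(text, fields):
    """Build a search body that matches the text against several fields."""
    return {"query": {"multi_match": {"query": text, "fields": fields}}}

# Two trimmed-down documents in the shape of the dataset above
articles = [
    {"headline": "Why New York City's Board Of Elections Is A Mess", "category": "POLITICS"},
    {"headline": "COVID-19 Cases, Deaths Continue To Drop Globally, WHO Says", "category": "WORLD NEWS"},
]
payload = bulk_payload("news", articles)
body = multi_match_body("COVID19 infections", ["headline", "short_description"])
print(payload)
print(json.dumps(body, indent=2))
```

You would send the first payload to the _bulk endpoint and the second to news/_search, using whichever HTTP client or official library you prefer.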

Distributed Architecture

ElasticSearch works as a distributed database. This means that there are multiple nodes in a single ElasticSearch cluster. If a single node becomes unavailable or fails, that doesn’t usually mean downtime for our system, and other nodes would usually pick up the extra work and continue to serve user requests. So multiple nodes facilitate higher availability.

Multiple nodes also help us scale our systems: data and user requests can be divided across these nodes, which leads to less load per node. For example, if you want to store 100 million news articles in ElasticSearch, you can split that data across multiple nodes, with each node storing a certain set of articles. And it’s pretty easy to do. In fact, ElasticSearch comes with built-in features to make this as simple and seamless as possible.


ElasticSearch scales horizontally and is able to partition data across multiple nodes. This means that you can often improve query performance by adding more nodes to your ElasticSearch cluster, provided the data is split in a way that lets the new nodes share the work.

There is a lot more to architecting an ElasticSearch cluster than just running more servers, though. There are different types of nodes, each node hosts “shards” (partitions of an index’s data), and both nodes and shards come in multiple types with their own configuration options. There is a lot to discuss about the architecture of an ElasticSearch cluster and how it works, so I’ve written a complete post on the architecture here if you want to dive deeper into it.

TLDR: you can add more machines to scale your cluster and improve performance. Data and queries are divided across multiple machines, which facilitates better performance and high scalability.
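The idea of dividing data and queries across machines can be sketched in a few lines of Python. This is a toy model: documents are routed to a shard by hashing their id, and a query is sent to every shard and the results merged. The CRC32 hash, the helper names and the document ids are stand-ins I chose for illustration; a real cluster uses its own routing hash and also handles replicas, rebalancing and failures.

```python
import zlib

def shard_for(doc_id, num_shards):
    """Pick a shard deterministically from the document id (CRC32 is a stand-in hash)."""
    return zlib.crc32(doc_id.encode("utf-8")) % num_shards

def assign(doc_ids, num_shards):
    """Distribute document ids across shards."""
    shards = {i: [] for i in range(num_shards)}
    for doc_id in doc_ids:
        shards[shard_for(doc_id, num_shards)].append(doc_id)
    return shards

def scatter_gather(shards, predicate):
    """Run the same filter on every shard and merge the partial results."""
    hits = []
    for shard_docs in shards.values():
        hits.extend(d for d in shard_docs if predicate(d))
    return sorted(hits)

# Hypothetical document ids
doc_ids = [f"article-{i}" for i in range(1000)]
shards = assign(doc_ids, 3)
print({shard: len(ids) for shard, ids in shards.items()})
print(scatter_gather(shards, lambda d: d.endswith("-42")))  # ['article-42']
```

Each shard only holds and searches a slice of the data, which is why adding shards and nodes can reduce the work any single machine has to do.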

Document-based data modeling

ElasticSearch is a document database that stores data as JSON documents, similar to MongoDB. So, in our example, every news article is stored as a JSON document in the cluster.

Real-time data analysis

Real-time data analysis means looking at user actions as they happen and understanding user patterns and behavior. We can chart user behavior to better understand our users and use that to improve our product. For example, let’s say we measure every single click, scroll event, and reading time per user on our news website. We chart these metrics in a dashboard and observe them for a few days. From this, we can collect a lot of actionable insights to improve our news app. Suppose we find that users usually visit the website between 9 and 10 AM, and that they generally click on articles that are relevant to their country. Using this information, we can provision extra resources during peak times (9–10 AM) and maybe show articles from the user’s country on their homepage.

Elasticsearch is well-suited for real-time data analysis due to its distributed architecture and powerful search capabilities. When dealing with real-time data, such as logs, metrics, or social media updates, Elasticsearch efficiently indexes and stores this information. Its near real-time indexing allows data to be searchable almost instantly after ingestion. ElasticSearch also works well with other tools, like Kibana for visualization or Logstash and Beats for collecting metrics.
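As a small illustration of the kind of question a dashboard asks, the sketch below builds a terms aggregation request body (“how many events per country?”) and computes the same answer locally on a few hypothetical events. The field name `country.keyword` is an assumption about how the field is mapped, and the helper names are mine.

```python
from collections import Counter

def events_per_country_body(field="country.keyword", top_n=5):
    """Build an aggregation-only search body: size 0 skips the hits themselves."""
    return {
        "size": 0,
        "aggs": {"events_per_country": {"terms": {"field": field, "size": top_n}}},
    }

def local_terms_agg(events, key, top_n=5):
    """Compute the same kind of bucket counts in plain Python."""
    return Counter(e[key] for e in events).most_common(top_n)

# Hypothetical click events
events = [{"country": "IN"}, {"country": "US"}, {"country": "IN"}]
print(events_per_country_body())
print(local_terms_agg(events, "country"))  # [('IN', 2), ('US', 1)]
```

A tool like Kibana sends aggregation requests of this general shape and charts the buckets that come back.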

Towards the end of the article, we will look at an architecture that facilitates this.


ElasticSearch is expensive to run and maintain; everything good comes at a price. To perform full-text search quickly, ElasticSearch builds complex indices and keeps a large amount of data in RAM. This means it requires a lot of memory to run, which is expensive.

So, in short, it gives you amazing performance when performing full-text search but it ain’t cheap.

When not to use ElasticSearch

ACID compliance

ElasticSearch, like most NoSQL databases, has very limited support for ACID, so if you need strong consistency or transactional support, ElasticSearch might not be the right database for you. One consequence is that when you insert a document into ElasticSearch (called “indexing” a document), it is not searchable immediately: it becomes visible to searches only after the next index refresh, which by default happens about once a second.

Let’s say you are building a banking system; if a user deposits money into their account, you want that data visible instantly to every other transaction that the user performs. On the other hand, if you are using ElasticSearch to power searches on your news website, it’s probably acceptable that a newly published article is not visible to all users for the first second or so.
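Here is a toy Python model of that delay. Newly indexed documents sit in a buffer and only become searchable after a refresh step, which is roughly how near real-time search behaves (to my knowledge, ElasticSearch refreshes about once a second by default). The class and method names are hypothetical and this is not ElasticSearch code.

```python
class ToyIndex:
    """Documents become searchable only after refresh() is called."""

    def __init__(self):
        self._buffer = []       # indexed but not yet searchable
        self._searchable = []   # visible to search

    def index(self, doc):
        """Accept a document; it is not visible to search yet."""
        self._buffer.append(doc)

    def refresh(self):
        """Make everything indexed so far visible to search."""
        self._searchable.extend(self._buffer)
        self._buffer.clear()

    def search(self, word):
        """Return searchable documents containing the word (case-insensitive)."""
        return [d for d in self._searchable if word.lower() in d.lower()]

idx = ToyIndex()
idx.index("New article on elections")
before = idx.search("elections")  # the write has happened, but the search misses it
idx.refresh()
after = idx.search("elections")
print(before, after)  # [] ['New article on elections']
```

For a news site the empty result before the refresh is harmless; for a bank balance it would be a bug.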

When you need complex joins

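ElasticSearch does not support SQL-style JOIN operations or foreign-key relationships across indices (its rough equivalent of tables); it offers only limited alternatives, such as nested objects and parent-child relations within a single index. If you’ve been using relational databases, this might come as a bit of a shock, but most NoSQL databases have limited support for these types of operations.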

If you want to perform JOINs or use foreign keys for highly related structured data, ElasticSearch may not be the best choice for your use case.
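A common workaround when you do put related data in a search index is to denormalize it: do the “join” once, in your application, before indexing. The Python sketch below shows the idea with hypothetical article and author records; the field names and values are made up for illustration.

```python
def denormalize(articles, authors):
    """Embed author details into each article so no join is needed at query time."""
    author_by_id = {a["id"]: a for a in authors}
    docs = []
    for article in articles:
        author = author_by_id[article["author_id"]]
        docs.append({
            "headline": article["headline"],
            # Copy only the author fields we want to search or filter on
            "author": {"name": author["name"], "country": author["country"]},
        })
    return docs

# Hypothetical normalized records, as they might sit in a relational database
authors = [{"id": 1, "name": "Daniel Marans", "country": "US"}]
articles = [{"headline": "Why New York City's Board Of Elections Is A Mess", "author_id": 1}]

docs = denormalize(articles, authors)
print(docs[0]["author"]["name"])  # Daniel Marans
```

The trade-off is duplication: if an author’s details change, every document that embeds them has to be re-indexed, which is one reason highly related data is better left in a relational database.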

Small dataset or simple query needs

ElasticSearch is complex and costly. Running and managing a large ElasticSearch cluster not only requires the knowledge and skill of software engineers and DevOps engineers but might even require specialists who excel at managing and architecting ElasticSearch clusters, called “ElasticSearch Architects”. There is a plethora of configuration options and architectural choices to play around with and each one of them has a significant impact on your queries and ingestion, thus having an indirect impact on user experience on core flows in your system.

If you only need to execute simple queries or have relatively little data, then a simpler database might be a better fit for your application.

How to use ElasticSearch in your system design

A single software system would usually require multiple databases, each powering a different set of functionalities. Let’s take an example to understand the design choices of using ElasticSearch better.

Let’s say you want to build a video streaming service, something like Netflix. Let’s see where ElasticSearch can fit in in this example.

As a Search system

A very common use case of ElasticSearch is as a secondary database powering full-text search queries. This is very useful for our video streaming application. We can’t store the videos in ElasticSearch, and we probably don’t want to store data related to billing or users in ElasticSearch either.

For that, we can have other databases, but we can store the titles of movies, along with their description, genres, ratings, etc. in ElasticSearch.

We can have an architecture similar to this:

Image by author

We can ingest data on which we want to power full-text search into ElasticSearch. When the user performs a search operation, we can query the ElasticSearch cluster. This way we get the full-text search capabilities of ElasticSearch and when we want to update user information, we can perform those updates in our primary storage.
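The flow above can be sketched in Python with an in-memory SQLite database standing in for the primary storage and a plain dictionary standing in for the ElasticSearch cluster. Everything here (table layout, function names, sample movies) is hypothetical; the point is only the split of responsibilities between the two stores.

```python
import sqlite3

def setup_primary():
    """Create the primary store, which owns the full record."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE movies (id INTEGER PRIMARY KEY, title TEXT, description TEXT, price REAL)"
    )
    return conn

def save_movie(conn, search_index, movie_id, title, description, price):
    """Write the full record to primary storage and only searchable text to the search store."""
    conn.execute("INSERT INTO movies VALUES (?, ?, ?, ?)", (movie_id, title, description, price))
    conn.commit()
    search_index[movie_id] = f"{title} {description}".lower()

def search_movies(conn, search_index, text):
    """Ask the search store for matching ids, then load the records from primary storage."""
    words = text.lower().split()
    ids = [mid for mid, blob in search_index.items() if any(w in blob.split() for w in words)]
    if not ids:
        return []
    placeholders = ",".join("?" for _ in ids)
    rows = conn.execute(
        f"SELECT id, title FROM movies WHERE id IN ({placeholders}) ORDER BY id", ids
    )
    return rows.fetchall()

conn = setup_primary()
search_index = {}  # stand-in for the ElasticSearch cluster
save_movie(conn, search_index, 1, "Space Odyssey", "A journey beyond Jupiter", 3.99)
save_movie(conn, search_index, 2, "Kitchen Tales", "A chef opens a restaurant", 2.99)
print(search_movies(conn, search_index, "space"))  # [(1, 'Space Odyssey')]
```

Notice that the price never reaches the search store: billing-style data stays in the primary database, and the search store holds only what users search on.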

As a real-time data analysis pipeline

As we discussed, understanding user behavior and patterns is an essential step in deciding how to evolve the product. We can publish events, such as clickstream events, and scroll events to better understand how our users use our product.

For example, in our video streaming application, we can publish an event with user and movie data whenever a user clicks on a movie or a show. We can then analyze and chart aggregations to better understand how users are using our product. For example, we might notice that users use our product more in the evening than in the afternoon or that users may prefer shows or movies in their local language over other languages. Using this, we can develop our product to improve user experience.
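To make this concrete, here is a small Python sketch that builds click events and buckets them by hour of day, which is the kind of aggregation a dashboard would chart. The event fields and helper names are hypothetical, and the bucketing is done locally rather than by ElasticSearch.

```python
from collections import Counter
from datetime import datetime, timezone

def click_event(user_id, movie_id, language, ts):
    """Build the document we would publish when a user clicks a title."""
    return {
        "type": "click",
        "user_id": user_id,
        "movie_id": movie_id,
        "language": language,
        "timestamp": ts,  # seconds since the Unix epoch
    }

def clicks_by_hour(events):
    """Count events per hour of day (UTC)."""
    hours = Counter()
    for event in events:
        hour = datetime.fromtimestamp(event["timestamp"], tz=timezone.utc).hour
        hours[hour] += 1
    return dict(hours)

# Hypothetical events; 1700000000 is 22:13:20 UTC on 14 November 2023
events = [
    click_event("u1", "m1", "hi", 1700000000),
    click_event("u2", "m1", "en", 1700000100),
    click_event("u1", "m2", "hi", 1700003600),
]
print(clicks_by_hour(events))  # {22: 2, 23: 1}
```

In a real pipeline the events would be indexed as documents and the hourly counts would come from an aggregation query instead of a local loop.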

This is what a basic system for real-time data analysis using ElasticSearch and Kibana (a dashboarding tool that works well with ElasticSearch) would look like:

Image by author

As a recommendations system

We can build queries in ElasticSearch that give more preference (called boosting) to certain attributes, instead of using a simple query that treats every attribute equally.

We can build basic recommendation systems with ElasticSearch. We can store information about the user, such as the user’s country, age, preferences, etc., and generate queries to get popular movie shows or series for that user.
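As a hedged sketch of what such a generated query might look like, the Python function below builds a search body that boosts matches in the title over matches in the description, and nudges up titles from the user’s country. The field names (`title`, `description`, `country`) and boost values are assumptions for this example, not values from a real system.

```python
import json

def recommendation_query(text, user_country, country_boost=2.0):
    """Build a search body that prefers title matches and titles from the user's country."""
    return {
        "query": {
            "bool": {
                # Documents must match the text; "title^3" weights title matches 3x
                "must": [
                    {"multi_match": {"query": text, "fields": ["title^3", "description"]}}
                ],
                # Optional clause: matching it raises the score but is not required
                "should": [
                    {"term": {"country": {"value": user_country, "boost": country_boost}}}
                ],
            }
        }
    }

body = recommendation_query("crime thriller", "IN")
print(json.dumps(body, indent=2))
```

Because the country clause sits under `should`, titles from other countries still appear in the results; they just rank lower than otherwise similar local ones.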

Understanding the query language and how to boost certain fields, and perform aggregations is a large topic in itself, but I’ve written a blog post covering the basics here:

Mastering Elasticsearch: A Beginner’s Guide to Powerful Searches and Precision — Part 1


How to Architect ElasticSearch Clusters?

Architecting an ElasticSearch cluster is no easy feat. It requires knowledge of nodes, shards, indexes, and how to orchestrate all of them. There are near-infinite architectural choices to make, and the field is constantly evolving (even more so with the popularity of AI and AI-powered search). To discuss it further, I’ve written a complete blog post that goes from the very basics to everything you’d need to know to architect a search cluster:

System Design Series: ElasticSearch, Architecting for search

Understanding Search Queries and Improving Search Systems

Search is complex, very complex. There are a lot of ways we can improve search systems, making them more powerful and understanding of user needs. You have already learned about ElasticSearch and what it is. Continue this journey as we start from here, build a basic search query, understand the problems in the query and our system, and evolve and improve the system, step-by-step with examples.

Mastering Elasticsearch: A Beginner’s Guide to Powerful Searches and Precision — Part 1

Context-aware Searching

I recently read a great analogy on search systems. You can think of the search system we have discussed so far as a mechanical, rigid search. When a user enters a word, we find all the documents where the word appears and return them.

Or you can think of a search system as a librarian. When a user asks a question, let’s say, “What was Winston Churchill’s role in the Second World War?”, the librarian doesn’t just list the books that contain the words “Winston”, “Churchill” or “Second World War”. Instead, the librarian evaluates and understands the user and the context. Maybe it’s a school kid, so instead of recommending a huge textbook, she finds a book more suitable for a younger reader. Or maybe she doesn’t have any book with Winston Churchill in the title, so she finds a book that talks about the Second World War or British prime ministers and recommends that instead. The librarian may even recommend different books for exams than for summer vacation homework (some of you may not know this, but in some countries, you are given a huge amount of homework for summer vacations).

This is easy to understand for you and me but how would our system know that Winston Churchill was a British prime minister and recommend books on Britain during the Second World War, or how would our system understand the context of the discussion, understand the user, and recommend appropriate books?

As difficult as it may seem, it’s actually not so hard. It’s called Semantic Search and it is how most big tech companies build their search systems.
Semantic search is a set of search techniques that aims to understand the meaning behind user queries and the context of content, enabling more accurate and contextually relevant search results by considering the relationships between words and the intent behind the search.
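One common building block for this is comparing vector representations (“embeddings”) of the query and the documents. The Python sketch below ranks three book titles by cosine similarity to a query vector. The three-dimensional vectors are hand-made and entirely hypothetical (their axes loosely stand for war, Britain and cooking); real systems obtain much larger vectors from a trained model.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def semantic_search(query_vec, doc_vecs, top_k=2):
    """Return the top_k document names most similar to the query vector."""
    ranked = sorted(doc_vecs, key=lambda name: cosine(query_vec, doc_vecs[name]), reverse=True)
    return ranked[:top_k]

# Hand-made "embeddings", for illustration only
doc_vecs = {
    "Britain in the Second World War": [0.9, 0.8, 0.0],
    "British Prime Ministers": [0.3, 0.9, 0.0],
    "Italian Cooking Basics": [0.0, 0.1, 0.9],
}
# Stands in for the embedding of "Winston Churchill's role in the Second World War"
query_vec = [0.8, 0.7, 0.0]

print(semantic_search(query_vec, doc_vecs))
# ['Britain in the Second World War', 'British Prime Ministers']
```

None of these titles contains the word “Churchill”, yet the two relevant books come out on top because their vectors point in a similar direction to the query’s.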

It’s a large topic, and I am still reading and understanding more about it, but a blog post that starts at the basics is coming soon, so if you want to know more about this topic, follow me here on Medium.

Other databases

I write about system design concepts, like databases, queues, and pub-sub systems, so follow me here on Medium for similar articles. I also write a lot of byte-sized content on LinkedIn (for example, this post on the differences between RabbitMQ and Kafka), so follow me on LinkedIn for shorter forms of content here.

Meanwhile, you can check out my blog posts on other databases and system design concepts-

Sanil Khurana on Medium curated some lists

System Design Cheatsheets: ElasticSearch was originally published in Towards Data Science on Medium, where people are continuing the conversation by highlighting and responding to this story.
