Why is Solr preferred over Lucene?

Did you know that the Lucene search engine will be deprecated in later releases of Sitecore, and that Sitecore recommends against using it?

But why is Solr preferred over Lucene?

The answer is scalability. Let us see how Solr scales with the following example.

Suppose we have a website that has 1 CD server.

As the traffic on our website increases, we may need to scale horizontally by deploying more CD servers.

Suppose we have scaled and now we have 3 CD servers.

But what about the indexes?

The CD servers don't each need their own index. They can all reference the same index, and because every server reads from that one index, there is no data inconsistency between them.

Suppose there is one Solr server, connected to the CM server and the 3 CD servers.

Now, when an index rebuild takes place, the Solr index gets updated. To keep downtime minimal, Solr can follow a master-slave architecture: while the CM updates the master index, the CD servers reference the slave index, so the data stays available on the CD servers and is the same across all of them. Once the rebuild completes, the slave index pulls the delta from the master index.
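To make the flow above concrete, here is a minimal toy model in Python. It is only a sketch and does not use Solr or any Solr API: the class names MasterIndex and SlaveIndex, the pull_delta method, and the sample documents are all hypothetical, and plain dictionaries stand in for the real indexes.

```python
class MasterIndex:
    """Toy stand-in for the master index that the CM writes to."""

    def __init__(self):
        self.docs = {}

    def update(self, doc_id, content):
        self.docs[doc_id] = content


class SlaveIndex:
    """Toy stand-in for the slave index that the CD servers read from."""

    def __init__(self):
        self.docs = {}

    def search(self, doc_id):
        return self.docs.get(doc_id)

    def pull_delta(self, master):
        # Copy only the entries that are new or changed on the master.
        changed = {k: v for k, v in master.docs.items() if self.docs.get(k) != v}
        self.docs.update(changed)
        # Drop entries that no longer exist on the master.
        for doc_id in set(self.docs) - set(master.docs):
            del self.docs[doc_id]
        return len(changed)


cd_servers = ("cd1", "cd2", "cd3")  # hypothetical server names
master = MasterIndex()
slave = SlaveIndex()

# Initial state: one document, already copied to the slave.
master.update("home", "Welcome v1")
slave.pull_delta(master)

# Rebuild in progress: the CM writes to the master only.
master.update("home", "Welcome v2")
master.update("news", "Launch day")

# All three CD servers read the slave, so they see the same (older) data.
during_rebuild = [slave.search("home") for _ in cd_servers]

# Rebuild finished: the slave takes the delta from the master.
copied = slave.pull_delta(master)
after_rebuild = [slave.search("home") for _ in cd_servers]
```

During the rebuild every CD server still gets an answer, and they all get the same one. After the delta is pulled, they all move to the new content together.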

Had we used Lucene instead, each server would have its own index. Due to latency, there may be delays in syncing the indexes on each server, which can result in data inconsistency between servers and in data being unavailable during an index rebuild.
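The inconsistency described here can be illustrated with a small Python sketch. It is a simplified model under stated assumptions, not a measurement: the server names, the delay values, and the visible_version helper are all hypothetical, and the only thing modelled is when an update becomes visible to each server.

```python
def visible_version(update_time, sync_delay, now):
    """Return 2 once the update has reached the index a server reads, else 1."""
    return 2 if now >= update_time + sync_delay else 1


# Hypothetical delays (in seconds) before each server's local index is in sync.
local_delays = {"cd1": 0.5, "cd2": 2.0, "cd3": 4.0}

# Hypothetical delay before the single shared index is updated.
shared_delay = 2.0

now = 1.0  # one second after the content was published at time 0.0

# Per-server indexes: each server catches up at a different moment.
local = {cd: visible_version(0.0, delay, now) for cd, delay in local_delays.items()}

# Shared index: every server reads the same index, so they change together.
shared = {cd: visible_version(0.0, shared_delay, now) for cd in local_delays}
```

At this moment the servers with local indexes disagree (cd1 already serves version 2 while cd2 and cd3 still serve version 1), whereas the servers reading the shared index all return the same version.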