Kelvin Tan - Solr/Elasticsearch Consultant - Idea: 2-stage recovery of corrupt Solr/Lucene indexes

I was recently onsite with a client who happened to have a corrupt Solr/Lucene index. The CheckIndex tool (Lucene 2.4+) diagnosed the problem and offered the option of fixing it.

Except… fixing the index in this case meant losing the corrupt segment, which also happened to be the one containing over 90% of the documents.

Because Solr has the concept of a doc uid (which Lucene doesn't have), I wrote a tool for them that dumps the uids in the corrupted segment into a text file. After recovering the index, they were able to reindex the docs that were lost with that segment.
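To make the second stage concrete, here is a minimal sketch of how the dumped uid file could be turned into a reindex list. It is not the tool I wrote for the client and it does not touch Lucene at all. It assumes two plain text files with one uid per line: the dump from the corrupt segment, and a list of uids still present in the recovered index (exported however is convenient). The class name `ReindexPlanner`, the method `uidsToReindex` and both file arguments are hypothetical. The check against the surviving uids is my own assumption: a doc in the lost segment may have been updated later and so still exist in a healthy segment.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper: works on plain uid lists, not on the index itself.
class ReindexPlanner {

    // Returns the dumped uids that are absent from the recovered index,
    // in dump order, with blank lines and duplicates dropped.
    static List<String> uidsToReindex(Collection<String> dumpedUids, Set<String> survivingUids) {
        Set<String> missing = new LinkedHashSet<>();
        for (String raw : dumpedUids) {
            String uid = raw.trim();
            if (!uid.isEmpty() && !survivingUids.contains(uid)) {
                missing.add(uid);
            }
        }
        return new ArrayList<>(missing);
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: ReindexPlanner <dumped-uids.txt> <surviving-uids.txt>");
            return;
        }
        // Stage 1 output: uids dumped from the corrupt segment before it was dropped.
        List<String> dumped = Files.readAllLines(Paths.get(args[0]), StandardCharsets.UTF_8);

        // Uids still searchable after the index was fixed (assumed to be exported separately).
        Set<String> surviving = new HashSet<>();
        for (String line : Files.readAllLines(Paths.get(args[1]), StandardCharsets.UTF_8)) {
            surviving.add(line.trim());
        }

        // Stage 2 input: one uid per line, ready to feed to whatever does the reindexing.
        for (String uid : uidsToReindex(dumped, surviving)) {
            System.out.println(uid);
        }
    }
}
```

The output is just a list of uids, so it can be piped into whatever process normally pulls docs from the source system and posts them to Solr.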
