solr – Make Me Engineer

Apache Solr string field or text field?

June 9, 2023 by Tarik

The string and text field types defined by default in the Solr schema are vastly different. String stores a word or sentence as an exact string, without tokenization or any other analysis. It is commonly useful for exact matches, e.g., for faceting. Text typically performs tokenization and secondary processing (such as lower-casing). Useful for all scenarios when we want to match part of … Read more
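To make the difference concrete, here is a minimal sketch in plain Java. It does not call Solr or Lucene; the class and method names (FieldTypeDemo, matchesAsString, analyze, matchesAsText) are made up for illustration, and the "analysis" is a simplified stand-in (split on non-alphanumeric characters, then lower-case) rather than any real Solr analyzer chain. Requiring every query term to match is also just an assumption of this toy.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

// Toy model only: this does not call Solr or Lucene.
class FieldTypeDemo {

    // "string"-like behaviour: the whole value is a single term,
    // so only an exact, case-sensitive match hits.
    static boolean matchesAsString(String fieldValue, String query) {
        return fieldValue.equals(query);
    }

    // Simplified "text"-like analysis: split on anything that is not
    // a letter or digit, drop empty tokens, and lower-case the rest.
    static List<String> analyze(String value) {
        return Arrays.stream(value.split("[^\\p{L}\\p{N}]+"))
                .filter(token -> !token.isEmpty())
                .map(token -> token.toLowerCase(Locale.ROOT))
                .collect(Collectors.toList());
    }

    // "text"-like behaviour: every analyzed query term must appear
    // among the analyzed terms of the field value (an assumption of this sketch).
    static boolean matchesAsText(String fieldValue, String query) {
        return analyze(fieldValue).containsAll(analyze(query));
    }

    public static void main(String[] args) {
        String title = "Apache Solr Reference Guide";
        System.out.println(matchesAsString(title, "solr")); // false
        System.out.println(matchesAsString(title, title));  // true
        System.out.println(matchesAsText(title, "solr"));   // true
        System.out.println(analyze(title)); // [apache, solr, reference, guide]
    }
}
```

The string-style match only succeeds on the full, identical value, which is why it suits faceting, while the text-style match finds a single word inside the value regardless of case.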

Solr Custom Similarity

June 2, 2023 by Tarik

I figured it out on my own. I stored my own implementation of DefaultSimilarity under the /dist/ folder in Solr. Then I added <lib dir="../../../dist/org/apache/lucene/search/similarities/" regex=".*\.jar"/> to my solrconfig.xml and everything works fine. package org.apache.lucene.search.similarities; import org.apache.lucene.index.FieldInvertState; import org.apache.lucene.search.similarities.DefaultSimilarity; public class MyNewSimilarityClass extends DefaultSimilarity { @Override public float coord(int overlap, int maxOverlap) { return 1.0f; … Read more

How can I tell Solr to return the hit search terms per document?

May 16, 2023 by Tarik

Kind of depends on your requirements, but as far as I know there is no specific support for this in Solr. You can, however, hack it together in a few other ways. Not sure what performance you can expect from these, though. Use highlighting: If you use highlighting, you can parse the returned highlighted … Read more
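As a rough illustration of the highlighting idea, the sketch below pulls the marked-up terms out of a highlighted snippet on the client side. It assumes the snippet wraps hits in <em>…</em> (which, as far as I know, is Solr's default marker) and that the snippet string has already been read from the response. The class name HighlightTerms, the method hitTerms, and the sample snippet are hypothetical.

```java
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class HighlightTerms {

    // Assumes hits are wrapped in <em>...</em>; change the pattern
    // if the highlighter is configured with different markers.
    private static final Pattern EM = Pattern.compile("<em>(.*?)</em>");

    // Collects the distinct highlighted terms, lower-cased,
    // in the order they first appear in the snippet.
    static Set<String> hitTerms(String snippet) {
        Set<String> terms = new LinkedHashSet<>();
        Matcher matcher = EM.matcher(snippet);
        while (matcher.find()) {
            terms.add(matcher.group(1).toLowerCase(Locale.ROOT));
        }
        return terms;
    }

    public static void main(String[] args) {
        // Made-up snippet standing in for one highlighted field value.
        String snippet =
            "Using <em>Solr</em> to index text; <em>solr</em> also supports <em>faceting</em>.";
        System.out.println(hitTerms(snippet)); // [solr, faceting]
    }
}
```

Running this per document over its highlighted snippets gives a per-document list of matched terms, at the cost of enabling highlighting for every field you care about.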

Solr index vs stored

May 9, 2023 by Tarik

That is correct. Typically you will want your field to be either indexed or stored or both. If you set both to false, that field will not be available in your Solr docs (either for searching or for displaying). See Alexandre’s answer for the special cases when you will want to set both to false. … Read more
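A toy model can make the distinction easier to see. The sketch below is not how Solr is implemented; it only assumes that "indexed" feeds a term lookup structure used for searching and "stored" keeps the original value for display. All names (IndexedVsStored, addField, search, display) and the sample values are hypothetical.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Toy model only: a field can be searchable, displayable, both, or neither.
class IndexedVsStored {

    // "field:term" -> ids of documents containing it (fed when indexed = true)
    private final Map<String, Set<Integer>> invertedIndex = new HashMap<>();

    // doc id -> field name -> original value (kept when stored = true)
    private final Map<Integer, Map<String, String>> storedFields = new HashMap<>();

    void addField(int docId, String field, String value, boolean indexed, boolean stored) {
        if (indexed) {
            // Whitespace tokenization and lower-casing, purely for illustration.
            for (String term : value.toLowerCase(Locale.ROOT).split("\\s+")) {
                invertedIndex.computeIfAbsent(field + ":" + term, k -> new HashSet<>()).add(docId);
            }
        }
        if (stored) {
            storedFields.computeIfAbsent(docId, k -> new HashMap<>()).put(field, value);
        }
    }

    // Searching only sees indexed fields.
    Set<Integer> search(String field, String term) {
        return invertedIndex.getOrDefault(field + ":" + term.toLowerCase(Locale.ROOT), Set.of());
    }

    // Displaying only sees stored fields; returns null when nothing was stored.
    String display(int docId, String field) {
        return storedFields.getOrDefault(docId, Map.of()).get(field);
    }

    public static void main(String[] args) {
        IndexedVsStored index = new IndexedVsStored();
        index.addField(1, "title", "Solr Basics", true, false);         // searchable only
        index.addField(1, "url", "http://example.org/a", false, true);  // displayable only
        index.addField(1, "scratch", "ignored value", false, false);    // neither

        System.out.println(index.search("title", "solr"));   // [1]
        System.out.println(index.display(1, "title"));       // null
        System.out.println(index.display(1, "url"));         // http://example.org/a
        System.out.println(index.search("scratch", "ignored")); // []
    }
}
```

The third field shows the case from the excerpt: with both flags false, the value can be neither searched nor displayed.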

ElasticSearch, Sphinx, Lucene, Solr, Xapian. Which fits for which usage? [closed]

August 6, 2022 by Tarik

As the creator of ElasticSearch, maybe I can give you some reasoning on why I went ahead and created it in the first place :). Using pure Lucene is challenging. There are many things that you need to take care of if you want it to really perform well, and also, it's a library, so … Read more