Description
It seems that for SpanOrQuery, the IDF of terms belonging to subqueries that do not match a given document still affects that document's score.
I observed this on an index containing 3 documents:
doc1:
field: something
doc2:
field: nothing
doc3:
field: anything
And I issue the following query:
spanOr([Contents:something, Contents:nothing])
If you check the score explanation, you will notice that the IDF of both terms contributes to the score of each matching document, even though only one of the terms matches each document.
This is an example of the explanation of the first document's score:
3.9616547 = weight(spanOr([Contents:something, Contents:nothing]) in 0) [AsBM25Similarity], result of:
  3.9616547 = score(freq=1.0), computed as boost * idf * tf from:
    51.0 = boost
    3.9616585 = idf, sum of:
      1.9808292 = idf for term nothing, computed as log(1 + (docCount - docFreq + 0.5) / (docFreq + 0.5)) + 1 from:
        1 = docFreq
        3 = docCount
      1.9808292 = idf for term something, computed as log(1 + (docCount - docFreq + 0.5) / (docFreq + 0.5)) + 1 from:
        1 = docFreq
        3 = docCount
    0.019607842 = tf, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl)) from:
      1.0 = phraseFreq=1.0
      50.0 = k1, term saturation parameter
      0.0 = b, length normalization parameter
      1.0 = dl, length of field
      2.0 = avgdl, average length of field
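To make the effect concrete, here is a minimal sketch that redoes the arithmetic from the explanation in plain Python. It does not use Lucene. The function names (`idf`, `tf`, `score`) are hypothetical helpers, and the formulas are copied from the explanation text, including the trailing `+ 1` in the idf, which appears to come from the custom AsBM25Similarity. The "matching term only" figure is what the score would be under the assumption that only the idf of the term present in doc1 should count; whether that is the intended behaviour is the question this issue raises.

```python
import math


def idf(doc_freq, doc_count):
    # Formula as printed in the explanation:
    # log(1 + (docCount - docFreq + 0.5) / (docFreq + 0.5)) + 1
    return math.log(1 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5)) + 1


def tf(freq, k1, b, dl, avgdl):
    # Formula as printed in the explanation:
    # freq / (freq + k1 * (1 - b + b * dl / avgdl))
    return freq / (freq + k1 * (1 - b + b * dl / avgdl))


def score(boost, idf_values, tf_value):
    # The explanation reports score = boost * idf * tf, where idf is a sum.
    return boost * sum(idf_values) * tf_value


# Values taken from the explanation for the first document.
idf_something = idf(doc_freq=1, doc_count=3)
idf_nothing = idf(doc_freq=1, doc_count=3)
tf_doc1 = tf(freq=1.0, k1=50.0, b=0.0, dl=1.0, avgdl=2.0)

# What the explanation shows: both idf values are summed.
reported = score(51.0, [idf_nothing, idf_something], tf_doc1)

# Assumption: only the term that actually matches doc1 contributes.
matching_only = score(51.0, [idf_something], tf_doc1)

print(f"both terms summed:  {reported:.5f}")       # approx. 3.96166
print(f"matching term only: {matching_only:.5f}")  # approx. 1.98083
```

The summed value agrees with the reported 3.9616547 up to single-precision rounding, and it is twice the value obtained from the matching term alone.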
Version and environment details
Lucene 9.7.0, used through Solr 9.3.0