gnes.score_fn.chunk module

class gnes.score_fn.chunk.BM25ChunkScoreFn(threshold=0.8, *args, **kwargs)
Bases: gnes.score_fn.base.CombinedScoreFn
score = relevance * idf(q_chunk) * tf(q_chunk) * (k1 + 1) / (tf(q_chunk) + k1 * (1 - b + b * (chunk_in_doc / avg_chunk_in_doc)))

As in the BM25 algorithm:

idf(q_chunk) = log(1 + (doc_count - f(q_chunk) + 0.5) / (f(q_chunk) + 0.5))

where f(q_chunk) is the number of docs that contain q_chunk. In this system, that is the number of docs appearing in the query results.
Elasticsearch uses b = 0.75 and k1 = 1.2.
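The BM25 formula above can be sketched in plain Python. This is a minimal illustration, not the GNES implementation: the names `bm25_idf`, `bm25_chunk_score`, and `docs_with_chunk` are hypothetical, the logarithm is assumed to be natural, and `tf` is passed in directly rather than derived from query results.

```python
import math


def bm25_idf(doc_count: int, docs_with_chunk: int) -> float:
    """idf(q_chunk) as written in the docstring; natural log is an assumption."""
    return math.log(1 + (doc_count - docs_with_chunk + 0.5) / (docs_with_chunk + 0.5))


def bm25_chunk_score(
    relevance: float,
    tf: float,
    doc_count: int,
    docs_with_chunk: int,
    chunk_in_doc: int,
    avg_chunk_in_doc: float,
    k1: float = 1.2,   # value quoted for Elasticsearch
    b: float = 0.75,   # value quoted for Elasticsearch
) -> float:
    """Hypothetical helper that evaluates the documented BM25 chunk score."""
    idf = bm25_idf(doc_count, docs_with_chunk)
    # Length normalisation: docs with more chunks than average are penalised.
    length_norm = 1 - b + b * (chunk_in_doc / avg_chunk_in_doc)
    return relevance * idf * tf * (k1 + 1) / (tf + k1 * length_norm)


score = bm25_chunk_score(
    relevance=0.9, tf=1, doc_count=10, docs_with_chunk=5,
    chunk_in_doc=4, avg_chunk_in_doc=4.0,
)
```

With `tf = 1` and a doc of exactly average length, the saturation term `tf * (k1 + 1) / (tf + k1 * length_norm)` equals 1, so the score reduces to `relevance * idf`.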

train(*args, **kwargs)
Train the model. Needs to be overridden.

class gnes.score_fn.chunk.CoordChunkScoreFn(score_mode='multiply', *args, **kwargs)
Bases: gnes.score_fn.base.CombinedScoreFn
score = relevance * query_coordination

query_coordination = (number of chunks returned) / (number of chunks in this doc, i.e. the query doc)

Parameters: score_mode (str) – specifies how the computed scores are combined
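As a rough illustration of the coordination factor, the sketch below evaluates the formula with plain numbers. The function name and its arguments are hypothetical and are not part of the GNES API; how GNES counts the returned chunks is not shown here.

```python
def coord_chunk_score(relevance: float, chunks_returned: int, chunks_in_query_doc: int) -> float:
    """Hypothetical helper: relevance scaled by the query coordination ratio."""
    if chunks_in_query_doc <= 0:
        raise ValueError("the query doc must contain at least one chunk")
    query_coordination = chunks_returned / chunks_in_query_doc
    return relevance * query_coordination


# 3 of the 4 chunks in the query doc were returned.
coord_score = coord_chunk_score(relevance=0.8, chunks_returned=3, chunks_in_query_doc=4)
```

The more chunks of the query doc that are returned, the closer the factor gets to 1 and the less the relevance is discounted.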

train(*args, **kwargs)
Train the model. Needs to be overridden.

class gnes.score_fn.chunk.TFIDFChunkScoreFn(threshold=0.8, *args, **kwargs)
Bases: gnes.score_fn.base.CombinedScoreFn
score = relevance * tf(q_chunk) * (idf(q_chunk)**2)

tf(q_chunk) is calculated based on the relevance of the query results:

tf(q_chunk) = number of queried chunks where relevance >= threshold
idf(q_chunk) = log(total_chunks / tf(q_chunk) + 1)
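The definitions above can be sketched as follows. This is an illustration under stated assumptions, not the GNES implementation: the function name and arguments are hypothetical, the logarithm is assumed to be natural, the idf expression is read with ordinary operator precedence as `(total_chunks / tf) + 1`, and returning 0 when no chunk reaches the threshold is a choice made here to avoid dividing by zero.

```python
import math
from typing import Sequence


def tfidf_chunk_score(
    relevance: float,
    queried_relevances: Sequence[float],
    total_chunks: int,
    threshold: float = 0.8,
) -> float:
    """Hypothetical helper that evaluates the documented TF-IDF chunk score."""
    # tf: number of queried chunks whose relevance reaches the threshold.
    tf = sum(1 for r in queried_relevances if r >= threshold)
    if tf == 0:
        return 0.0  # assumption: no qualifying chunk means no score
    idf = math.log(total_chunks / tf + 1)
    return relevance * tf * idf ** 2


# Two of the three queried chunks pass the 0.8 threshold, so tf = 2.
tfidf_score = tfidf_chunk_score(0.9, [0.9, 0.85, 0.5], total_chunks=6)
```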

train(*args, **kwargs)
Train the model. Needs to be overridden.

class gnes.score_fn.chunk.WeightedChunkOffsetScoreFn(score_mode='multiply', *args, **kwargs)
Bases: gnes.score_fn.base.CombinedScoreFn
score = d_chunk.weight * relevance * offset_divergence * q_chunk.weight

offset_divergence is calculated based on doc_type:
TEXT, VIDEO, and AUDIO: offset is 1-D
IMAGE: offset is 2-D

Parameters: score_mode (str) – specifies how the computed scores are combined

train(*args, **kwargs)
Train the model. Needs to be overridden.

class gnes.score_fn.chunk.WeightedChunkScoreFn(score_mode='multiply', *args, **kwargs)
Bases: gnes.score_fn.base.CombinedScoreFn
score = d_chunk.weight * relevance * q_chunk.weight

Parameters: score_mode (str) – specifies how the computed scores are combined
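A small sketch of the weighted score, using a stand-in chunk type. The `Chunk` class below is hypothetical and only carries a `weight` attribute; it is not the GNES chunk message, and `weighted_chunk_score` is not a GNES function.

```python
class Chunk:
    """Hypothetical stand-in for a chunk that exposes only a weight."""

    def __init__(self, weight: float) -> None:
        self.weight = weight


def weighted_chunk_score(relevance: float, q_chunk: Chunk, d_chunk: Chunk) -> float:
    """Relevance scaled by the weights of the doc chunk and the query chunk."""
    return d_chunk.weight * relevance * q_chunk.weight


weighted_score = weighted_chunk_score(0.8, q_chunk=Chunk(0.25), d_chunk=Chunk(0.5))
```

Because both weights multiply the relevance, a low-weight chunk on either side pulls the score down proportionally.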

train(*args, **kwargs)
Train the model. Needs to be overridden.