Trigram-based algorithm for Addok.
Addok-trigrams
Alternative indexation pattern for Addok, based on trigrams.
Installation
    pip install addok-trigrams
Configuration
In your local configuration file:
- remove unwanted RESULTS_COLLECTORS_PYPATHS:

    from addok.config.default import RESULTS_COLLECTORS_PYPATHS
    RESULTS_COLLECTORS_PYPATHS.remove('addok.helpers.collectors.extend_results_reducing_tokens')
    RESULTS_COLLECTORS_PYPATHS.remove('addok.autocomplete.only_commons_but_geohash_try_autocomplete_collector')
    RESULTS_COLLECTORS_PYPATHS.remove('addok.autocomplete.no_meaningful_but_common_try_autocomplete_collector')
    RESULTS_COLLECTORS_PYPATHS.remove('addok.autocomplete.only_commons_try_autocomplete_collector')
    RESULTS_COLLECTORS_PYPATHS.remove('addok.autocomplete.autocomplete_meaningful_collector')
    RESULTS_COLLECTORS_PYPATHS.remove('addok.fuzzy.fuzzy_collector')

- remove all autocomplete and fuzzy RESULTS_COLLECTORS_PYPATHS, then add the new ones:

    RESULTS_COLLECTORS_PYPATHS += [
        'addok_trigrams.extend_results_removing_numbers',
        'addok_trigrams.extend_results_removing_one_whole_word',
        'addok_trigrams.extend_results_removing_successive_trigrams',
    ]

- add trigramize to PROCESSORS_PYPATHS:

    from addok.config.default import PROCESSORS_PYPATHS
    PROCESSORS_PYPATHS += [
        'addok_trigrams.trigramize',
    ]

- remove pairs and autocomplete indexers from INDEXERS_PYPATHS:

    from addok.config.default import INDEXERS_PYPATHS
    INDEXERS_PYPATHS.remove('addok.pairs.PairsIndexer')
    INDEXERS_PYPATHS.remove('addok.autocomplete.EdgeNgramIndexer')

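Python's `list.remove` raises `ValueError` when the entry is absent, which may happen if the defaults of your Addok version differ from the ones listed above. The following is a minimal sketch of a more tolerant removal, not something Addok provides: `remove_if_present` is a hypothetical helper, and the list shown is a stand-in for the one you would import from `addok.config.default`.

```python
def remove_if_present(paths, unwanted):
    """Remove each unwanted entry from `paths` in place, skipping absent ones.

    Hypothetical helper, not part of Addok. Returns the entries actually removed.
    """
    removed = []
    for path in unwanted:
        if path in paths:
            paths.remove(path)
            removed.append(path)
    return removed


# Stand-in for `from addok.config.default import INDEXERS_PYPATHS`;
# the first entry is a made-up placeholder, not a real Addok indexer.
INDEXERS_PYPATHS = [
    'example.SomeOtherIndexer',
    'addok.pairs.PairsIndexer',
    'addok.autocomplete.EdgeNgramIndexer',
]

removed = remove_if_present(INDEXERS_PYPATHS, [
    'addok.pairs.PairsIndexer',
    'addok.autocomplete.EdgeNgramIndexer',
    'example.not_in_the_list',  # absent, so it is skipped without error
])
```

The returned list makes it easy to log which defaults were really dropped, so a silent mismatch with the installed Addok version does not go unnoticed.
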
By default, digit-only words are not turned into trigrams. To turn them into trigrams as well, set TRIGRAM_SKIP_DIGIT=False.
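To illustrate what this setting changes, here is a minimal sketch of trigram splitting with and without the digit rule. It is not the code of `addok_trigrams.trigramize`: the function names, the `skip_digit` parameter and the handling of words shorter than three characters are assumptions made for the example.

```python
def trigrams(word):
    """Return the successive 3-character windows of `word`."""
    if len(word) < 3:
        return [word]  # assumption: words too short to split are kept whole
    return [word[i:i + 3] for i in range(len(word) - 2)]


def trigramize_tokens(tokens, skip_digit=True):
    """Expand each token into trigrams.

    When `skip_digit` is true (mirroring the documented default of
    TRIGRAM_SKIP_DIGIT), digit-only tokens are passed through unchanged.
    """
    out = []
    for token in tokens:
        if skip_digit and token.isdigit():
            out.append(token)
        else:
            out.extend(trigrams(token))
    return out


default_result = trigramize_tokens(['12345', 'lilas'])
no_skip_result = trigramize_tokens(['12345', 'lilas'], skip_digit=False)
```

With the default, `12345` stays a single token while `lilas` becomes `lil`, `ila`, `las`; with `skip_digit=False`, `12345` is also split, into `123`, `234`, `345`.
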
Usage
Use addok batch to import documents, just as with stock Addok. There is no need to run addok ngrams afterwards, since ngrams are already part of the index strategy.