Static memory-efficient and fast Trie-like structures for Python.
marisa-trie
Static memory-efficient Trie-like structures for Python (2.7 and 3.4+) based on the marisa-trie C++ library.
String data in a MARISA-trie may take up to 50x-100x less memory than in a standard Python dict, and the raw lookup speed is comparable. The trie also provides fast advanced methods such as prefix search.
Installation
python -m pip install -U marisa-trie
Usage
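The following is a minimal sketch of typical use, assuming the package has been installed with the pip command above. The keys, the "<H?" record format and the variable names are illustrative choices and do not come from the project itself.

```python
import marisa_trie  # third-party package: python -m pip install -U marisa-trie

# A Trie maps each unicode key to an automatically assigned integer id.
trie = marisa_trie.Trie([u"key1", u"key2", u"key12"])

has_key = u"key1" in trie              # membership test
key_id = trie[u"key2"]                 # same as trie.key_id(u"key2")
round_trip = trie.restore_key(key_id)  # id back to key
prefixes = trie.prefixes(u"key12")     # stored keys that are prefixes of the argument
matches = trie.keys(u"key1")           # stored keys that start with the given prefix

# A RecordTrie maps keys to tuples packed with a struct format string.
# "<H?" (unsigned short plus bool) is an arbitrary format chosen for this sketch.
records = marisa_trie.RecordTrie(
    "<H?", [(u"foo", (1, True)), (u"bar", (2, False))]
)
foo_values = records[u"foo"]           # a list, because one key may hold several values
```

The tries are static: all keys are passed to the constructor, and nothing can be added afterwards.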
Current limitations
The library is not tested with the mingw32 compiler;
the .prefixes() method of BytesTrie and RecordTrie is quite slow and has no iterator counterpart;
read() and write() methods don’t work with file-like objects (they work only with real files; pickling works fine for file-like objects);
there are keys() and items() methods but no values() method.
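Two of these limitations have simple workarounds, sketched below. The helper names (iter_values, dump_to_fileobj, load_from_fileobj) are hypothetical and not part of the library. They rely only on items() and on pickling, both of which the list above says are available, so they are written against any object with that interface.

```python
import io
import pickle


def iter_values(trie):
    """Yield the values of any object whose items() returns (key, value) pairs."""
    for _key, value in trie.items():
        yield value


def dump_to_fileobj(trie, fileobj):
    """Write a picklable object to a file-like object instead of a real file."""
    pickle.dump(trie, fileobj, protocol=pickle.HIGHEST_PROTOCOL)


def load_from_fileobj(fileobj):
    """Read back an object written by dump_to_fileobj.

    Only unpickle data from a trusted source.
    """
    return pickle.load(fileobj)


def roundtrip_in_memory(trie):
    """Serialize to an in-memory buffer and load a copy back."""
    buffer = io.BytesIO()
    dump_to_fileobj(trie, buffer)
    buffer.seek(0)
    return load_from_fileobj(buffer)
```

Passing a BytesTrie or RecordTrie to iter_values yields the stored values; for a plain Trie the "values" are the integer key ids.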
License
Wrapper code is licensed under MIT License.
The bundled marisa-trie C++ library is dual-licensed under the LGPL and BSD 2-clause licenses.
CHANGES
0.7.6 (2021-07-28)
Wheels are now published for all platforms.
Fixed ResourceWarning: unclosed file in setup.py.
Ran black on the entire source code.
Moved the QA/CI to GitHub.
Rebuilt Cython wrapper with Cython 0.29.24.
Updated libmarisa-trie to the latest version (0.2.6).
Fixed failing tests and usage of deprecated methods.
Expanded supported Python versions (2.7, 3.4 - 3.10).
0.7.5 (2018-04-10)
Removed redundant DeprecationWarning messages in Trie.save and Trie.load.
Dropped support for Python 2.6.
Rebuilt Cython wrapper with Cython 0.28.1.
0.7.4 (2017-03-27)
Fixed packaging issue, MANIFEST.in was not updated after libmarisa-trie became a submodule.
0.7.3 (2017-02-14)
Added BinaryTrie for storing arbitrary sequences of bytes, e.g. IP addresses (thanks Tomasz Melcer);
Deprecated Trie.has_keys_with_prefix which can be trivially implemented in terms of Trie.iterkeys;
Deprecated Trie.read and Trie.write, which only work for “real” files and duplicate the functionality of load and save. See issue #31 on GitHub;
Updated libmarisa-trie to the latest version. Yay, 64-bit Windows support.
Rebuilt Cython wrapper with Cython 0.25.2.
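The 0.7.3 notes say that has_keys_with_prefix can be expressed through Trie.iterkeys. One way to do that is sketched below, assuming iterkeys(prefix) lazily yields the keys that start with the prefix. The free function is an illustration, not the library's own code.

```python
def has_keys_with_prefix(trie, prefix=u""):
    """Return True if at least one key in the trie starts with prefix.

    any() stops at the first key produced by the iterator, so the full
    list of matching keys is never built.
    """
    return any(True for _ in trie.iterkeys(prefix))
```

Because the helper consumes at most one item from the iterator, its cost does not grow with the number of matching keys.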
0.7.2 (2015-04-21)
packaging issue is fixed.
0.7.1 (2015-04-21)
setup.py is switched to setuptools;
a tiny speedup;
wrapper is rebuilt with Cython 0.22.
0.7 (2014-12-15)
trie1 == trie2 and trie1 != trie2 now work (thanks Sergei Lebedev);
for key in trie: is fixed (thanks Sergei Lebedev);
wrapper is rebuilt with Cython 0.21.1 (thanks Sergei Lebedev);
https://bitbucket.org/kmike/marisa-trie repo is no longer supported.
0.6 (2014-02-22)
New Trie methods: __getitem__, get, items, iteritems. trie[u'key'] is now the same as trie.key_id(u'key').
small optimization for BytesTrie.get.
wrapper is rebuilt with Cython 0.20.1.
0.5.3 (2014-02-08)
small Trie.restore_key optimization (it should work 5-15% faster)
0.5.2 (2014-02-08)
fix Trie.restore_key method - it was reading past declared string length;
rebuild wrapper with Cython 0.20.
0.5.1 (2013-10-03)
has_keys_with_prefix(prefix) method (thanks Matt Hickford)
0.5 (2013-05-07)
BytesTrie.iterkeys, BytesTrie.iteritems, RecordTrie.iterkeys and RecordTrie.iteritems methods;
wrapper is rebuilt with Cython 0.19;
value_separator parameter for BytesTrie and RecordTrie.
0.4 (2013-02-28)
improved trie building: weights optional parameter;
improved trie building: unnecessary input sorting is removed;
wrapper is rebuilt with Cython 0.18;
bundled marisa-trie C++ library is updated to svn r133.
0.3.8 (2013-01-03)
Rebuild wrapper with Cython pre-0.18;
update benchmarks.
0.3.7 (2012-09-21)
Update bundled marisa-trie C++ library (this may fix more mingw issues);
Python 3.3 support is back.
0.3.6 (2012-09-05)
much faster (3x-7x) .items() and .keys() methods for all tries; faster (up to 3x) .prefixes() method for Trie.
0.3.5 (2012-08-30)
Pickling of RecordTrie is fixed (thanks lazarou for the report);
error messages should become more useful.
0.3.4 (2012-08-29)
Issues with mingw32 should be resolved (thanks Susumu Yata).
0.3.3 (2012-08-27)
.get(key, default=None) method for BytesTrie and RecordTrie;
small README improvements.
0.3.2 (2012-08-26)
Small code cleanup;
load, read and mmap methods return ‘self’;
I can’t run tests (via tox) under Python 3.3 so it is removed from supported versions for now.
0.3.1 (2012-08-23)
.prefixes() support for RecordTrie and BytesTrie.
0.3 (2012-08-23)
RecordTrie and BytesTrie are introduced;
IntTrie class is removed (probably temporary?);
dumps/loads methods are renamed to tobytes/frombytes;
benchmark & tests improvements;
support for MARISA-trie config options is added.
0.2 (2012-08-19)
Pickling/unpickling support;
dumps/loads methods;
python 3.3 workaround;
improved tests;
benchmarks.
0.1 (2012-08-17)
Initial release.
Download files
Built Distributions
Hashes for marisa_trie-0.7.6-pp37-pypy37_pp73-win_amd64.whl
Algorithm | Hash digest
---|---
SHA256 | 214aebad004050bf11b99296bdd4e4f75bd743e4d0bab9b3b2853e830f595801
MD5 | 0a6fab169dc98fbdc2751c449f63b739
BLAKE2b-256 | 1249003eb15179ac89ae94bca4499a5772d82293e5c5db40c52e32ef190bde66
Hashes for marisa_trie-0.7.6-pp37-pypy37_pp73-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | b05c83c99c92a88eaedc587294e3a47777d6baa906fccc84938930c85b6ae4c1
MD5 | 266bab030272a3d09f523931ab6f228d
BLAKE2b-256 | 02ab4c6554b2a0311d83717cb1fb766c70e6b1f9df538652c78a3a00ceca2ba0
Hashes for marisa_trie-0.7.6-pp37-pypy37_pp73-manylinux_2_5_i686.manylinux1_i686.manylinux_2_12_i686.manylinux2010_i686.whl
Algorithm | Hash digest
---|---
SHA256 | c69d70603e13624f9c5dcb2db959acddb12cf5da494056aa19852ac7863c222b
MD5 | 86b866a398553ec0595d31a434d765b5
BLAKE2b-256 | 17985fb8791dc10badc07ba60d97fde109521ac4a33f7bf8425b146ec8b43c5a
Hashes for marisa_trie-0.7.6-pp37-pypy37_pp73-macosx_10_9_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | d0a3803ef1c571a7be10ee4934cb6d777fadc1d7ed968f827c7803f68dcba30e
MD5 | 59f600af773c3177df1f2ad7b5d83238
BLAKE2b-256 | b78949fc385fe0163079f3e1d193162b6e3148b2131e8ddd8932bfa40f313266
Hashes for marisa_trie-0.7.6-cp39-cp39-win_amd64.whl
Algorithm | Hash digest
---|---
SHA256 | ba26adc7ce099c4187bdeb2320882c2cde1d515e3e246e399f9b73225e0bc464
MD5 | c04bec11eb953376ae64597028b97c0c
BLAKE2b-256 | bb109333e7e8b5a881d06bd94e9db4b2594ec215b5064c62fbe8b2ccaf4ff5b6
Hashes for marisa_trie-0.7.6-cp39-cp39-win32.whl
Algorithm | Hash digest
---|---
SHA256 | 7047565f86c6561e0b8e9239ad6d3e33efb99c6f36cbed2642945b113667450a
MD5 | ff165b022fd07d264620914f793ea78e
BLAKE2b-256 | 0eac007a37de0b8cd4634b578efb077fc1a9848a2e692789f93cea296d6c7252
Hashes for marisa_trie-0.7.6-cp39-cp39-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | 4138078253092df5e703d0806da18a8fa2b0ddcbe16fc8df2fd3abbb40d40f70
MD5 | 1faae87ae932ca06fe49291392f1254b
BLAKE2b-256 | d72e9d8851851a05629190bc6bf0f2a2a099b23cb39f4bb0acee7c794bcff9e3
Hashes for marisa_trie-0.7.6-cp39-cp39-manylinux_2_5_i686.manylinux1_i686.manylinux_2_12_i686.manylinux2010_i686.whl
Algorithm | Hash digest
---|---
SHA256 | 8413278bb2090bbf655a12df87970804a50c4a59cc32f52c180fb6c678690ee1
MD5 | fe8dc7f3d4a54664d4552daa619d856c
BLAKE2b-256 | 788c32697fdf2b87adba59465681254a3c860c4bb0ad129b4a934f1f575f2742
Hashes for marisa_trie-0.7.6-cp39-cp39-macosx_11_0_arm64.whl
Algorithm | Hash digest
---|---
SHA256 | dc7c1e5b62ac59599afe59331bf36bc0db160974e304b279cbee5b814468857f
MD5 | b6dddaebb8cfd89adc8722187f6cbe98
BLAKE2b-256 | 1e631facf294b51b45931151b4629c65710d64e43afe6c3ce7fb8d60528abcac
Hashes for marisa_trie-0.7.6-cp39-cp39-macosx_10_9_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | aad342601a80bbd24e15e79983f9dc21c532cef435c57e5de9cb03122462c842
MD5 | 6e9ee22fa8e59b3a7cddd7653dc7ef09
BLAKE2b-256 | c9700ecc8d4c78164a2ea3a8e762bb31b576a11e5a30a94221199a39109ae5f6
Hashes for marisa_trie-0.7.6-cp39-cp39-macosx_10_9_universal2.whl
Algorithm | Hash digest
---|---
SHA256 | ff383350a984f358643a423cc9aebfff86c4bee8be24594511b89caf9dc23438
MD5 | 84c89998858ad6f1020aa2744249d76c
BLAKE2b-256 | 3ba1d1cb1630d7369ab646e3fc6177c91ab40fc2bd88dff30d52abc2d00e7c08
Hashes for marisa_trie-0.7.6-cp38-cp38-win_amd64.whl
Algorithm | Hash digest
---|---
SHA256 | bb5140253ccd42236ad5f96f7bb0280b628878a0ad4092e837a59a62ceac0a6b
MD5 | 6c5c0ac6e1454dd3d40513b00ea91798
BLAKE2b-256 | 7d3a9da0b04dbae267faf1535d554cac63bab82cdb74ef22d6bac4cd14f36d21
Hashes for marisa_trie-0.7.6-cp38-cp38-win32.whl
Algorithm | Hash digest
---|---
SHA256 | 2c6dc882b9eb06659f03edf5fbbbc51b9023508da548f88b9d4451fbc20c3f70
MD5 | 9e5dfcc1c993d03a2777c9f0e5a6acd3
BLAKE2b-256 | 0050907fb99afc0f3cf89ef4c7422b1b9ac4ae14ad483afa27a3bd2513c94758
Hashes for marisa_trie-0.7.6-cp38-cp38-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | 240ac005f782482abb59e0b726a7270a49b1d930f475c39ddb446a9892ae460e
MD5 | d4fd9acf482c2902da130999298dcc87
BLAKE2b-256 | b3881347572d0e0b0ad42bb4d5a01f4cd78f92f828441b9d25848ffbc560da66
Hashes for marisa_trie-0.7.6-cp38-cp38-manylinux_2_5_i686.manylinux1_i686.manylinux_2_12_i686.manylinux2010_i686.whl
Algorithm | Hash digest
---|---
SHA256 | 9cbfaf1a97fa739c4505a57e20a72abe658f742839390bc3d4dbbfe17d79e740
MD5 | 0b5923816ecdae5ee38095829e479cb7
BLAKE2b-256 | ff5b3884caa91fb1dd3f8db458b7a89e58b2ee01649d1ba5d0da569e606fab11
Hashes for marisa_trie-0.7.6-cp38-cp38-macosx_11_0_arm64.whl
Algorithm | Hash digest
---|---
SHA256 | d94e44dbd7949a0b3c4d204862696a280594a354e57abe42592eb91c3a41e394
MD5 | 4bd1685e6830d1b27ffeb323e9c05674
BLAKE2b-256 | eed282fa8581154eb12d9459580ba87dc5f9213596797bd58b1b6c5153648732
Hashes for marisa_trie-0.7.6-cp38-cp38-macosx_10_9_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | ea335d3118cf51b3b3d6dc45aa9763673a20b1aa135dd72eceec652389349337
MD5 | 6e47349091afd9de2a6e8cb6b087f765
BLAKE2b-256 | 24d1faf1a7d17a2f751c3931d076c19e1b631b9898828c3ec2de4dabc75052f8
Hashes for marisa_trie-0.7.6-cp38-cp38-macosx_10_9_universal2.whl
Algorithm | Hash digest
---|---
SHA256 | 4cd46c27874d0523422147d76e73ef55e0d7ad77e91540f10edd8cd12dd16ff1
MD5 | 997b85ee9806ea13c8197d0a026ede75
BLAKE2b-256 | 615851de926ccfbb09e07212ad9992b5f000c6f8f75e2d486dbe3683250808b5
Hashes for marisa_trie-0.7.6-cp37-cp37m-win_amd64.whl
Algorithm | Hash digest
---|---
SHA256 | 9644257518a471cc39df7a5a05c1c013330eb052b4021cd283f6520fdc61c397
MD5 | 84b94a50d08df8bd263bbcb1e821f508
BLAKE2b-256 | 454768844823a837b6e3608c6a91a61ebd383f8fda7967f1f808af46360660ed
Hashes for marisa_trie-0.7.6-cp37-cp37m-win32.whl
Algorithm | Hash digest
---|---
SHA256 | 9a2de8ddcaf3e89283a450af1c4d69632513b74fa6314cd22d0230cf363d3d37
MD5 | 10c1969c7aa24a21279f92f5104103eb
BLAKE2b-256 | 857d45bdf4add4fc717ce6cb8c1ebc8badf6257735c387868c7cd47e73f23549
Hashes for marisa_trie-0.7.6-cp37-cp37m-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | 0ee127d768f8ba42265306ed9b1d78ebcc6dbde1c0f923493ab154156d269e76
MD5 | 6e6f5b7e7e4ada802f2297422cbd9b64
BLAKE2b-256 | 0d0a7613a461bce3e30bcecbfaa13a8afbb026f443df9e0e8558ebd4d45dfba7
Hashes for marisa_trie-0.7.6-cp37-cp37m-manylinux_2_5_i686.manylinux1_i686.manylinux_2_12_i686.manylinux2010_i686.whl
Algorithm | Hash digest
---|---
SHA256 | e2964ef24282815b2320f9c5790476e71791c80dc225d8b3ba7e2a7fa68fca9e
MD5 | dc23b50c7a22a3aa06b2757f27827bd5
BLAKE2b-256 | 1f8a3f0a1da421acd52534f5e037d18be0232f0f33a2f67748818afaeb6748db
Hashes for marisa_trie-0.7.6-cp37-cp37m-macosx_10_9_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | 0e5853dae802b7974893110ba018ab0b6a07bec21fff61e409f6427d4431b83c
MD5 | 559088b8e8c19c9d04c9cc1c6ad2e894
BLAKE2b-256 | 5594c494b184ed5b3136e0b88706435923519e33d4a2aae450d514b356ca5dd8
Hashes for marisa_trie-0.7.6-cp36-cp36m-win_amd64.whl
Algorithm | Hash digest
---|---
SHA256 | 4b9b599b602ba9cc868fad9b8b406bd68f3fc89f27980c088c1a3f9383586fca
MD5 | 5329b38aa8f665095946a8d527e0498a
BLAKE2b-256 | 1b5226f8b923483b6200317b32fecaeac5bdd47ab9fc733214eab646f6adfe56
Hashes for marisa_trie-0.7.6-cp36-cp36m-win32.whl
Algorithm | Hash digest
---|---
SHA256 | d9ce01df065dd0deb69e9050aac2d948e8764c14e184aec7e4380bf1deab1666
MD5 | 4e948a64735ec7d208eeef4ffe8bbdd6
BLAKE2b-256 | f72d35fb22f437abffeea1a0604bd595cabae5877187934fa4c764ed4eadc785
Hashes for marisa_trie-0.7.6-cp36-cp36m-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | 3bc958ed4290b47f32888990cfb338265fa7947be4b2d54fb575892100285b38
MD5 | 211eba624ce87e303c00f2991ee19c20
BLAKE2b-256 | ea929641597dcda49326edd2c76972e8b0d799673f947b03860cf663feab3c82
Hashes for marisa_trie-0.7.6-cp36-cp36m-manylinux_2_5_i686.manylinux1_i686.manylinux_2_12_i686.manylinux2010_i686.whl
Algorithm | Hash digest
---|---
SHA256 | 3c756b7a53caeb3d3afd11e98f03763d43dce36c9c5962b223cf28052999121d
MD5 | 1f9b641ec153a6bccc026d56fe6869ff
BLAKE2b-256 | 4026464853765fd9e1a514d367b0845b044e1cebc816e2bcf12cac2c6fcb8833
Hashes for marisa_trie-0.7.6-cp36-cp36m-macosx_10_9_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | 68f832478f4d84e8b3dae77e63b31ef4e23541d5d2adc9c794c135d8b0ef5f5e
MD5 | 1f9ef1d2db5d6c14a2e41cd15936ea40
BLAKE2b-256 | 064b4d0b06e4fc07c89f5f5dc738bf0c10f0b61903308f8b418c39efab26ae5b
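A downloaded wheel can be checked against the SHA256 digests listed above using only the standard library. The sketch below is one way to do it; the function names and the chunk size are illustrative choices.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps calling read() until it returns b"".
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, expected_hex):
    """Compare a file's SHA256 digest with a published hex string.

    Surrounding whitespace and letter case in expected_hex are ignored.
    """
    return sha256_of_file(path) == expected_hex.strip().lower()
```

To use it, pass the local path of the wheel together with the SHA256 value copied from the matching table; a False result means the file differs from the published one.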