A fast tool to calculate Hamming distances
A small C++ tool to calculate pairwise Hamming distances between gene sequences given in FASTA format.
Python interface
To use the Python interface, you should install it from PyPI:
python -m pip install hammingdist
Distance matrix
You can then use it from Python, for example as follows:
import hammingdist
# To see the different optional arguments available:
help(hammingdist.from_fasta)
# To import all sequences from a fasta file
data = hammingdist.from_fasta("example.fasta")
# To import only the first 100 sequences from a fasta file
data = hammingdist.from_fasta("example.fasta", n=100)
# To import all sequences and remove any duplicates
data = hammingdist.from_fasta("example.fasta", remove_duplicates=True)
# To import all sequences from a fasta file, also treating 'X' as a valid character
data = hammingdist.from_fasta("example.fasta", include_x=True)
# The distance data can be accessed point-wise, though looping over all distances might be quite inefficient
print(data[14,42])
# The data can be written to disk in csv format (default `distance` Ripser format) and retrieved:
data.dump("backup.csv")
retrieval = hammingdist.from_csv("backup.csv")
# It can also be written in lower triangular format (comma-delimited row-major, `lower-distance` Ripser format):
data.dump_lower_triangular("lt.txt")
retrieval = hammingdist.from_lower_triangular("lt.txt")
# If the `remove_duplicates` option was used, the sequence indices can also be written.
# For each input sequence, this writes the corresponding index in the output:
data.dump_sequence_indices("indices.txt")
# Finally, we can pass the data as a list of strings in Python:
data = hammingdist.from_stringlist(["ACGTACGT", "ACGTAGGT", "ATTTACGT"])
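To make the matrix and the lower triangular layout concrete, here is a minimal pure-Python sketch that computes plain Hamming distances for the three strings above. It is not the library's implementation: the helper names (`hamming`, `pairwise_matrix`, `lower_triangular_lines`) are hypothetical, any special handling of characters such as 'X' is ignored, and leaving out the diagonal follows Ripser's `lower-distance` convention, which is assumed here rather than checked against hammingdist's output.

```python
def hamming(a: str, b: str) -> int:
    """Count the positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def pairwise_matrix(seqs):
    """Full symmetric matrix of plain Hamming distances."""
    return [[hamming(a, b) for b in seqs] for a in seqs]


def lower_triangular_lines(matrix):
    """Entries below the diagonal, comma-delimited, one matrix row per line."""
    return [
        ",".join(str(d) for d in row[:i])
        for i, row in enumerate(matrix)
        if i > 0  # the first row has no entries below the diagonal
    ]


seqs = ["ACGTACGT", "ACGTAGGT", "ATTTACGT"]
matrix = pairwise_matrix(seqs)

# Point-wise access, analogous to data[0, 2] on the library object
print(matrix[0][2])

# Lower triangular, row-major, comma-delimited text
print("\n".join(lower_triangular_lines(matrix)))
```

The pure-Python version is quadratic in the number of sequences and only useful for checking small inputs by hand.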
Distances from a reference sequence
The distance of each sequence in a FASTA file from a given reference sequence can be calculated using:
import hammingdist
# `sequence` is the reference sequence and `fasta_file` is the fasta file to compare against it
distances = hammingdist.fasta_reference_distances(sequence, fasta_file, include_x=True)
This function returns a NumPy array that contains the distance of each sequence from the reference sequence.
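As an illustration of what such a result contains, the following pure-Python sketch computes the distance of each sequence from a reference and picks the closest one. It works on an in-memory list and returns a plain list instead of a NumPy array; the function names and example sequences are hypothetical, and any special treatment of characters such as 'X' is not modelled.

```python
def hamming(a: str, b: str) -> int:
    """Count the positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def reference_distances(reference, sequences):
    """Distance of every sequence from the reference, in input order."""
    return [hamming(reference, s) for s in sequences]


reference = "ACGTAGGT"
sequences = ["ACGTACGT", "ACGTAGGT", "ATTTACGT"]
distances = reference_distances(reference, sequences)

# Index of the sequence closest to the reference (first one on ties)
closest = min(range(len(distances)), key=distances.__getitem__)
print(distances, closest)
```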
You can also calculate the distance between two individual sequences:
import hammingdist
distance = hammingdist.distance("ACGTX", "AAGTX", include_x=True)
OpenMP on Linux
The latest versions of hammingdist on Linux are built with OpenMP (multithreading) support. If this causes any issues, you can install a previous version of hammingdist without OpenMP support:
pip install hammingdist==0.11.0