A fast tool to calculate Hamming distances
A small C++ tool that calculates pairwise Hamming distances between gene sequences given in FASTA format.
Python interface
To use the Python interface, you should install it from PyPI:
python -m pip install hammingdist
Distance matrix
Once installed, you can use it from Python, for example like this:
import hammingdist
# To see the different optional arguments available:
help(hammingdist.from_fasta)
# To import all sequences from a fasta file
data = hammingdist.from_fasta("example.fasta")
# To import only the first 100 sequences from a fasta file
data = hammingdist.from_fasta("example.fasta", n=100)
# To import all sequences and remove any duplicates
data = hammingdist.from_fasta("example.fasta", remove_duplicates=True)
# To import all sequences from a fasta file, also treating 'X' as a valid character
data = hammingdist.from_fasta("example.fasta", include_x=True)
# The distance data can be accessed point-wise, though looping over all distances might be quite inefficient
print(data[14,42])
# The data can be written to disk in csv format (default `distance` Ripser format) and retrieved:
data.dump("backup.csv")
retrieval = hammingdist.from_csv("backup.csv")
# It can also be written in lower triangular format (comma-delimited row-major, `lower-distance` Ripser format):
data.dump_lower_triangular("lt.txt")
retrieval = hammingdist.from_lower_triangular("lt.txt")
# If the `remove_duplicates` option was used, the sequence indices can also be written.
# For each input sequence, this prints the corresponding index in the output:
data.dump_sequence_indices("indices.txt")
# Finally, we can pass the data as a list of strings in Python:
data = hammingdist.from_stringlist(["ACGTACGT", "ACGTAGGT", "ATTTACGT"])
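To make it concrete what the distance matrix contains, here is a minimal pure-Python sketch of a plain Hamming distance matrix for the three strings above. It is an illustration only, not how hammingdist is implemented: the helper names `hamming`, `pairwise_matrix` and `lower_triangular_rows` are hypothetical, any special character handling the library applies (such as the `include_x` option) is ignored, and the lower-triangular layout assumes the strictly lower triangle is written one row per line.

```python
from itertools import combinations


def hamming(a, b):
    """Count the positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def pairwise_matrix(seqs):
    """Build the full symmetric distance matrix for a list of sequences."""
    n = len(seqs)
    matrix = [[0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        matrix[i][j] = matrix[j][i] = hamming(seqs[i], seqs[j])
    return matrix


def lower_triangular_rows(matrix):
    """Comma-delimited rows of the strictly lower triangle (assumed layout)."""
    return [",".join(str(d) for d in row[:i]) for i, row in enumerate(matrix) if i > 0]


matrix = pairwise_matrix(["ACGTACGT", "ACGTAGGT", "ATTTACGT"])
print(matrix)                         # [[0, 1, 2], [1, 0, 3], [2, 3, 0]]
print(lower_triangular_rows(matrix))  # ['1', '2,3']
```

The nested Python loop is quadratic in the number of sequences and slow for real data sets, which is the case the compiled library is meant to handle.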
Distances from a reference sequence
The distance of each sequence in a FASTA file from a given reference sequence can be calculated using:
import hammingdist
distances = hammingdist.fasta_reference_distances(sequence, fasta_file, include_x=True)
This function returns a NumPy array containing the distance of each sequence from the reference sequence.
You can also calculate the distance between two individual sequences:
import hammingdist
distance = hammingdist.distance("ACGTX", "AAGTX", include_x=True)
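As a rough picture of what a reference-distance calculation involves, the sketch below reads sequences from a FASTA stream and counts mismatches against a reference using only the standard library. The helpers `read_fasta` and `reference_distances` are hypothetical names introduced here, the result is a plain list rather than a NumPy array, and the library's own character handling (including `include_x`) is not reproduced.

```python
import io


def read_fasta(handle):
    """Yield sequences from a FASTA stream, joining wrapped sequence lines."""
    chunks = []
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            # A header line starts a new record; emit the previous one.
            if chunks:
                yield "".join(chunks)
            chunks = []
        elif line:
            chunks.append(line)
    if chunks:
        yield "".join(chunks)


def reference_distances(reference, handle):
    """Plain Hamming distance of every sequence in the stream from the reference."""
    distances = []
    for seq in read_fasta(handle):
        if len(seq) != len(reference):
            raise ValueError("all sequences must match the reference length")
        distances.append(sum(a != b for a, b in zip(reference, seq)))
    return distances


fasta = io.StringIO(">s1\nACGTACGT\n>s2\nACGT\nAGGT\n>s3\nATTTACGT\n")
print(reference_distances("ACGTACGT", fasta))  # [0, 1, 2]
```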
OpenMP on Linux
Recent versions of hammingdist on Linux are built with OpenMP (multithreading) support. If this causes any issues, you can install an earlier version of hammingdist without OpenMP support:
python -m pip install hammingdist==0.11.0