Repackaging of Google's Diff Match and Patch libraries.
diff-match-patch
Google's Diff Match and Patch library, packaged for modern Python.
Since August 2024, Google's diff-match-patch library has been archived, and this project now tracks the maintained fork.
Install
diff-match-patch is supported on Python 3.7 or newer. You can install it from PyPI:
python -m pip install diff-match-patch
Usage
Generating a patchset (analogous to unified diff) between two texts:
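from diff_match_patch import diff_match_patch
dmp = diff_match_patch()
# text1 is the original string, text2 the modified string
patches = dmp.patch_make(text1, text2)
# Serialize the patches to their text form
diff = dmp.patch_toText(patches)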
Applying a patchset to a text can then be done with:
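from diff_match_patch import diff_match_patch
dmp = diff_match_patch()
# diff is the serialized patchset, text is the string to patch
patches = dmp.patch_fromText(diff)
# patch_apply returns the patched text and a list of booleans, one per patch
new_text, _ = dmp.patch_apply(patches, text)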
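The two snippets above can be combined into a single round trip. The following is a minimal sketch with made-up sample strings (`old_text` and `new_text` are illustrative, not part of the library):

```python
from diff_match_patch import diff_match_patch

# Illustrative sample texts (assumptions for this sketch).
old_text = "The quick brown fox jumps over the lazy dog."
new_text = "The quick red fox leaps over the lazy dog."

dmp = diff_match_patch()

# Build the patches and serialize them to text, e.g. for storage or transport.
patches = dmp.patch_make(old_text, new_text)
serialized = dmp.patch_toText(patches)

# Later, or elsewhere: parse the serialized patches and apply them.
restored = dmp.patch_fromText(serialized)
patched_text, results = dmp.patch_apply(restored, old_text)

print(serialized)
print(patched_text == new_text)
print(results)  # one boolean per patch, True if that patch was applied
```

Checking `results` rather than discarding it is usually worthwhile, since a patch that cannot be placed is skipped without raising an error.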
Original README
The Diff Match and Patch libraries offer robust algorithms to perform the operations required for synchronizing plain text.
- Diff:
  - Compare two blocks of plain text and efficiently return a list of differences.
  - Diff Demo
- Match:
  - Given a search string, find its best fuzzy match in a block of plain text. Weighted for both accuracy and location.
  - Match Demo
- Patch:
  - Apply a list of patches onto plain text. Use best-effort to apply patch even when the underlying text doesn't match.
  - Patch Demo
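As a rough illustration of the three operations listed above, here is a short sketch using the Python port. All sample strings are made up for the example, and the patch step assumes the default matching settings:

```python
from diff_match_patch import diff_match_patch

dmp = diff_match_patch()

# Diff: a list of (operation, text) tuples; -1 = delete, 0 = equal, 1 = insert.
diffs = dmp.diff_main("Hello World", "Goodbye World")
dmp.diff_cleanupSemantic(diffs)  # optional: make the result easier to read
print(diffs)

# Match: look for the pattern at or near the expected location (index 0 here).
sentence = "The quick brown fox jumps over the lazy dog"
position = dmp.match_main(sentence, "fox", 0)
print(position)  # index of the best match, or -1 if none is found

# Patch: build patches from one pair of texts, then apply them to a text
# that has drifted from the original since the patches were made.
base = "The quick brown fox jumps over the lazy dog."
edited = "The quick brown fox jumps over the sleepy dog."
drifted = "Yesterday, the quick brown fox jumps over the lazy dog."
patches = dmp.patch_make(base, edited)
patched, results = dmp.patch_apply(patches, drifted)
print(patched)
print(results)  # one boolean per patch, True if that patch was applied
```

The patch step succeeds here because the changed region and its surrounding context still appear in `drifted`, only at a shifted offset.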
Originally built in 2006 to power Google Docs, this library is now available in C++, C#, Dart, Java, JavaScript, Lua, Objective C, and Python.
Reference
- API - Common API across all languages.
- Line or Word Diffs - Less detailed diffs.
- Plain Text vs. Structured Content - How to deal with data like XML.
- Unidiff - The patch serialization format.
- Support - Newsgroup for developers.
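For the "Line or Word Diffs" entry above, a line-level diff can be sketched with the port's `diff_linesToChars` and `diff_charsToLines` helpers. The wrapper name `diff_line_mode` and the sample texts are assumptions made for this example:

```python
from diff_match_patch import diff_match_patch


def diff_line_mode(text1, text2):
    """Return a diff whose chunks are whole lines rather than characters."""
    dmp = diff_match_patch()
    # Map each unique line to a single character so the diff works on lines.
    chars1, chars2, line_array = dmp.diff_linesToChars(text1, text2)
    # checklines=False: the inputs are already line-encoded.
    diffs = dmp.diff_main(chars1, chars2, False)
    # Expand the single characters back into the original lines (in place).
    dmp.diff_charsToLines(diffs, line_array)
    return diffs


before = "alpha\nbeta\ngamma\n"
after = "alpha\nBETA\ngamma\n"
line_diffs = diff_line_mode(before, after)
for op, chunk in line_diffs:
    print(op, repr(chunk))
```

Each chunk in the result is made of complete lines, which is usually easier to read than a character-level diff when only a few lines changed.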
Languages
Although each language port of Diff Match Patch uses the same API, there are some language-specific notes.
A standardized speed test tracks the relative performance of diffs in each language.
Algorithms
This library implements Myers' diff algorithm, which is generally considered to be the best general-purpose diff. A layer of pre-diff speedups and post-diff cleanups surrounds the diff algorithm, improving both performance and output quality.
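In the Python port, the post-diff cleanups are separate calls made after `diff_main`. The sketch below compares the raw diff with its semantic and efficiency cleanups; the sample sentences are invented, and the exact chunks printed depend on the inputs:

```python
from diff_match_patch import diff_match_patch

dmp = diff_match_patch()
dmp.Diff_Timeout = 0  # 0 disables the time limit; the default is 1 second

old = "The cat sat on the mat."
new = "The dog lay on the rug."

# Raw diff: minimal, but it may be fragmented into small coincidental matches.
raw = dmp.diff_main(old, new)

# Semantic cleanup: rewrites the list in place for human readability.
semantic = list(raw)
dmp.diff_cleanupSemantic(semantic)

# Efficiency cleanup: rewrites the list in place to reduce the number of
# edit operations, controlled by dmp.Diff_EditCost.
efficient = list(raw)
dmp.diff_cleanupEfficiency(efficient)

print(raw)
print(semantic)
print(efficient)
```

All three lists describe the same transformation from `old` to `new`; the cleanups only change how the edits are grouped.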
This library also implements a Bitap matching algorithm at the heart of a flexible matching and patching strategy.
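How tolerant the matcher is can be tuned through the `Match_Threshold` and `Match_Distance` attributes. The following sketch compares a strict and a lenient threshold on the same search; the threshold values are chosen only for illustration:

```python
from diff_match_patch import diff_match_patch

text = "I am the very model of a modern major general."
pattern = " that berry "  # does not occur verbatim in the text

dmp = diff_match_patch()
dmp.Match_Distance = 1000  # how far from the expected location to look (default)

# Threshold 0.0 accepts only exact matches.
dmp.Match_Threshold = 0.0
strict = dmp.match_main(text, pattern, 5)

# A higher threshold tolerates more errors in the match (the default is 0.5).
dmp.Match_Threshold = 0.7
fuzzy = dmp.match_main(text, pattern, 5)

print(strict)  # -1 means no acceptable match was found
print(fuzzy)   # index of the best fuzzy match near position 5
```

The same settings also influence `patch_apply`, which relies on the matcher to locate each patch in the target text.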
Hashes for diff_match_patch-20241021.tar.gz
Algorithm | Hash digest
---|---
SHA256 | beae57a99fa48084532935ee2968b8661db861862ec82c6f21f4acdd6d835073
MD5 | dd6de83ff4bcda48d424f5a73bc6888c
BLAKE2b-256 | 0ead32e1777dd57d8e85fa31e3a243af66c538245b8d64b7265bec9a61f2ca33
Hashes for diff_match_patch-20241021-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 93cea333fb8b2bc0d181b0de5e16df50dd344ce64828226bda07728818936782
MD5 | 737f2e9d2ce38efbfffa97d66683c675
BLAKE2b-256 | f7bb2aa9b46a01197398b901e458974c20ed107935c26e44e37ad5b0e5511e44