I am using difflib.HtmlDiff to compare two files. I want the differences to be highlighted in the generated HTML.
This already works when at most two characters differ on a line:
a = "2.000"
b = "2.120"
But when more characters differ on a line, the output marks the whole line red (on the left side of the table) or green (on the right side):
a = "2.000"
b = "2.123"
Is this behaviour configurable? That is, can I set the number of differing characters at which a line is marked as deleted / added?
EDIT:
Example:
import difflib
diff=difflib.HtmlDiff()
print(diff.make_file(
'''
2.000
2.000
2.000
'''.splitlines(),
'''
2.001
2.010
2.011
'''.splitlines()))
Gives me this output:
Line 2 is the output I want: it highlights the differences in yellow. Line 3 looks odd to me because it does not detect the one-character change but instead shows it as a delete / add. Line 4 is the same as line 3, except that the whole line is marked.
difflib's algorithm does not claim to yield minimal edit sequences.
Although that statement comes from the docs for SequenceMatcher, I suspect it applies to difflib in general, and HtmlDiff in particular.
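To get a feel for how differently difflib scores your example pairs, you can print the similarity ratio that SequenceMatcher reports for each of them. This is only a sketch: the helper name `similarity` is made up for illustration, and the idea that HtmlDiff switches between "changed" and "deleted / added" based on a score like this, with a cutoff that is not exposed as a setting, is my reading of the implementation rather than documented behaviour.

```python
import difflib


def similarity(a, b):
    """Return SequenceMatcher's similarity ratio (0.0 to 1.0) for two strings."""
    return difflib.SequenceMatcher(None, a, b).ratio()


# Pairs taken from the question; the last one is from the EDIT example.
pairs = [("2.000", "2.120"), ("2.000", "2.123"), ("2.000", "2.001")]

for a, b in pairs:
    print(f"{a!r} vs {b!r}: {similarity(a, b):.2f}")
```

The more characters differ, the lower the ratio, which would be consistent with whole lines being marked once they are too dissimilar.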
While googling around for "python alternative difflib minimal edit" I found google-diff-match-patch. If you try out their demo for Diff with your example strings, it yields
Although the output is not exactly what you requested, it does show that it found the minimal edits.
The API docs state
diff_prettyHtml(diffs) => html
Takes a diff array and returns a pretty HTML sequence. This function is mainly intended as an example from which to write ones own display functions.
which suggests looking at the source code for diff_prettyHtml might be a good starting point from which to build the HTML table you are looking for.
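If you would rather stay within the standard library, a similar "write your own display function" approach can be sketched with difflib.SequenceMatcher.get_opcodes(). Note that this is a swapped-in technique, not diff-match-patch and not what HtmlDiff does internally. The function names `highlight_pair` and `diff_table` are hypothetical, and the sketch assumes that lines can simply be paired by position, which is true for your example but not for files with inserted or removed lines.

```python
import difflib
from html import escape
from itertools import zip_longest


def highlight_pair(old, new):
    """Return (old_html, new_html) with differing characters wrapped in <del>/<ins>."""
    matcher = difflib.SequenceMatcher(None, old, new, autojunk=False)
    old_parts, new_parts = [], []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        old_chunk = escape(old[i1:i2])
        new_chunk = escape(new[j1:j2])
        if tag == "equal":
            old_parts.append(old_chunk)
            new_parts.append(new_chunk)
        else:
            # 'replace', 'delete' and 'insert' are all handled here;
            # an empty chunk means that side has nothing to mark.
            if old_chunk:
                old_parts.append(f"<del>{old_chunk}</del>")
            if new_chunk:
                new_parts.append(f"<ins>{new_chunk}</ins>")
    return "".join(old_parts), "".join(new_parts)


def diff_table(old_lines, new_lines):
    """Pair lines by position and render a two-column HTML table."""
    rows = []
    for old, new in zip_longest(old_lines, new_lines, fillvalue=""):
        left, right = highlight_pair(old, new)
        rows.append(f"<tr><td>{left}</td><td>{right}</td></tr>")
    return "<table>\n" + "\n".join(rows) + "\n</table>"


print(diff_table(["2.000", "2.000", "2.000"], ["2.001", "2.010", "2.011"]))
```

Because every line pair is compared character by character regardless of how similar the lines are, no line is ever marked as wholly deleted or added; styling the `<del>` and `<ins>` tags with CSS gives the red / green highlighting.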