
diff'ing without grouping unrelated blocks

Developer https://www.devze.com 2022-12-17 22:45 Source: web

Is there a diff algorithm that does not group unrelated blocks?

For example:

hello world
lorem ipsum dolor sit amet

vs.

Hello World
Lorem Ipsum Dolor Sit Amet

Comparing these (e.g. with standard Unix diff) generally results in the following:

< hello world
< lorem ipsum dolor sit amet
---
> Hello World
> Lorem Ipsum Dolor Sit Amet

However, a line-by-line comparison like the following would seem more sensible:

< hello world
---
> Hello World

< lorem ipsum dolor sit amet
---
> Lorem Ipsum Dolor Sit Amet

The latter, IMO, makes it much easier to analyze minor changes. (Note that I'm concerned with human readability here, not machine readability.)

I understand diff'ing is a complex issue, but this often leaves me puzzled nonetheless.


diff behaves like that intentionally, but you can change it by inserting blank lines between the lines of each file. That gives the result you want.

1:

hello world

lorem ipsum dolor sit amet

Same

2:

Hello World

Lorem Ipsum Dolor Sit Amet

Same

The reported line numbers then have to be mapped back to the original files, though (n/2 + 1, using integer division).

1c1
< hello world
---
> Hello World
3c3
< lorem ipsum dolor sit amet
---
> Lorem Ipsum Dolor Sit Amet

If several lines are replaced by a single line (or vice versa), the output may still not be what you want:

1,3c1
< hello world
<
< lorem ipsum dolor sit amet
---
> Hello World
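
The blank-line trick and the line-number correction from this answer can be sketched in a few lines of Python. This is only an illustration under assumptions: it uses Python's standard difflib module rather than the diff command, the helper names (pad_with_blank_lines, original_line_number, changed_line_pairs) are made up for this example, and it appends a blank line after every line, including the last one.

```python
import difflib

def pad_with_blank_lines(lines):
    """Follow every line with an empty line so changed lines are never adjacent."""
    padded = []
    for line in lines:
        padded.extend([line, ""])
    return padded

def original_line_number(n):
    """Map a 1-based line number of the padded file back to the original file.

    This is the n/2 + 1 correction with integer division; it is only
    meaningful for odd n, i.e. for the real (non-blank) lines.
    """
    return n // 2 + 1

def changed_line_pairs(old, new):
    """Return (original_line_no, old_text, new_text) for each one-line change."""
    a, b = pad_with_blank_lines(old), pad_with_blank_lines(new)
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    pairs = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        # The shared blank lines split the change into separate one-line hunks.
        if tag == "replace" and i2 - i1 == 1 and j2 - j1 == 1:
            pairs.append((original_line_number(i1 + 1), a[i1], b[j1]))
    return pairs

old = ["hello world", "lorem ipsum dolor sit amet"]
new = ["Hello World", "Lorem Ipsum Dolor Sit Amet"]
pairs = changed_line_pairs(old, new)
for line_no, before, after in pairs:
    print(f"{line_no}c{line_no}")
    print(f"< {before}")
    print("---")
    print(f"> {after}")
```

As the answer notes, this only helps when lines are changed one for one; hunks with differing line counts are simply skipped by this sketch.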


The diff algorithm is a solution to the longest common subsequence problem. However, it seems you're not really after a different algorithm: related or not, both lines have changed, and what you are talking about is how the difference is presented as text.
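
To illustrate that this is a presentation question rather than an algorithmic one, here is a rough Python sketch that takes the grouped result and only changes how it is printed. It assumes Python's standard difflib module (not the diff command itself), and the function names per_line_hunks and render are hypothetical. A replaced block is split line by line only when both sides have the same number of lines; pairing them by position is a guess at what a human reader would consider "related".

```python
import difflib

def per_line_hunks(old, new):
    """Yield (removed_lines, added_lines) hunks.

    Replacements with the same number of lines on both sides are split
    into one hunk per line; everything else is passed through grouped.
    """
    matcher = difflib.SequenceMatcher(None, old, new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue
        if tag == "replace" and (i2 - i1) == (j2 - j1):
            for k in range(i2 - i1):
                yield [old[i1 + k]], [new[j1 + k]]
        else:
            yield old[i1:i2], new[j1:j2]

def render(hunks):
    """Format hunks in the '<' / '---' / '>' style, one blank line between hunks."""
    blocks = []
    for removed, added in hunks:
        lines = [f"< {line}" for line in removed]
        if removed and added:
            lines.append("---")
        lines += [f"> {line}" for line in added]
        blocks.append("\n".join(lines))
    return "\n\n".join(blocks)

old = ["hello world", "lorem ipsum dolor sit amet"]
new = ["Hello World", "Lorem Ipsum Dolor Sit Amet"]
report = render(per_line_hunks(old, new))
print(report)
```

The underlying comparison still reports a single changed block of two lines; only the rendering step separates it into the per-line layout asked for in the question.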

Thomas Jung showed the original format, and Wikipedia shows a few variations. It is worth taking the time to experiment a bit.

diff original new

Will produce the original format.

diff -c original new

Will produce the context format.

diff -u original new

Will produce the unified format. As a bit of trivia, this is the most commonly used one: patches to open source projects are more often than not requested in this format.
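
If you would rather experiment from a script than from the shell, Python's standard difflib module can generate unified and context output that resembles what diff -u and diff -c produce. This is just a sketch for comparison: difflib is a separate implementation, so its output is not guaranteed to match the diff command byte for byte, and the labels "original" and "new" are simply the file names used above.

```python
import difflib

old = ["hello world", "lorem ipsum dolor sit amet"]
new = ["Hello World", "Lorem Ipsum Dolor Sit Amet"]

# Roughly comparable to: diff -u original new
unified = list(difflib.unified_diff(old, new, "original", "new", lineterm=""))

# Roughly comparable to: diff -c original new
context = list(difflib.context_diff(old, new, "original", "new", lineterm=""))

print("\n".join(unified))
print()
print("\n".join(context))
```

Note that both formats still show the two changed lines as one block, which is the grouping the question is about.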

Of course, if the way the difference is presented to you is crucial, I think you will find any of the diff viewers vastly superior.
