Post History


Q&A How to create a useful diff of a markdown writing project iteration

posted 7y ago by Boondoggle‭ · last activity 5y ago by System‭

Answer

#3: Attribution notice added by

System‭ · 2019-12-08T07:36:40Z (over 5 years ago)

Source: https://writers.stackexchange.com/a/32241
License name: CC BY-SA 3.0
License URL: https://creativecommons.org/licenses/by-sa/3.0/

#2: Initial revision by

Boondoggle‭ · 2019-12-08T07:36:40Z (over 5 years ago)

I'm not sure about using Git commit. I don't use it much.

How about tokenizing your text in Python? You could then calculate the entropy of each sentence and compare the values between versions: a change in a sentence's entropy indicates that the sentence was edited. Alternatively, you could do this per word or per n-gram. The output can be stored in a list and written to a text file.

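    # Python 3: Shannon entropy (bits per symbol) of a string or token list
    import math
    from collections import Counter

    def entropy(s):
        # Count each symbol, then sum -p * log2(p) over the relative frequencies
        counts, n = Counter(s), float(len(s))
        return -sum(c / n * math.log(c / n, 2) for c in counts.values())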
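
To make the per-sentence idea concrete, here is a minimal sketch that scores each sentence of two drafts and flags the positions where the entropy differs. The helper names (`split_sentences`, `sentence_entropies`, `changed_positions`), the regex-based sentence splitter, the sample drafts, and the tolerance are all assumptions for illustration. It also assumes both drafts have their sentences in the same order. Keep in mind that two different sentences can have the same entropy, so this is a rough signal rather than a real diff.

```python
import math
import re
from collections import Counter

def entropy(symbols):
    """Shannon entropy in bits per symbol of a string or token list."""
    counts, n = Counter(symbols), len(symbols)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def split_sentences(text):
    # Naive splitter (assumption): break after ., ! or ? followed by whitespace
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

def sentence_entropies(text):
    # Character-level entropy of each sentence, kept alongside the sentence
    return [(s, entropy(s)) for s in split_sentences(text)]

def changed_positions(old_text, new_text, tol=1e-9):
    # Compare sentences pairwise by position; extra sentences are ignored
    pairs = zip(sentence_entropies(old_text), sentence_entropies(new_text))
    return [i for i, ((_, h_old), (_, h_new)) in enumerate(pairs)
            if abs(h_old - h_new) > tol]

old_draft = "The cat sat on the mat. It was warm."
new_draft = "The cat sat on the rug. It was warm."

flagged = changed_positions(old_draft, new_draft)
for i in flagged:
    print("Sentence", i, "changed:", split_sentences(new_draft)[i])
```

To score words or n-grams instead of characters, pass a list of tokens (for example `sentence.split()`) to `entropy`.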

Or perhaps use a fuzzy-matching algorithm to find sentences that are not an exact match but a close one (e.g. a single substituted character).
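
One possible way to try the fuzzy-match idea is Python's standard `difflib` module. The answer does not name a specific algorithm, so the choice of `difflib`, the function name `classify_sentences`, the sample sentences, and the `cutoff=0.8` threshold are assumptions for this sketch. It labels each sentence of the new draft as unchanged, edited (close to an old sentence) or added.

```python
import difflib

def classify_sentences(old_sentences, new_sentences, cutoff=0.8):
    """Label each new sentence as 'unchanged', 'edited' or 'added'."""
    old_set = set(old_sentences)
    report = []
    for sentence in new_sentences:
        if sentence in old_set:
            # Exact match: the sentence survived untouched
            report.append(("unchanged", sentence, sentence))
            continue
        # Best old sentence whose similarity ratio is at least `cutoff`
        close = difflib.get_close_matches(sentence, old_sentences,
                                          n=1, cutoff=cutoff)
        if close:
            report.append(("edited", close[0], sentence))
        else:
            report.append(("added", None, sentence))
    return report

old = ["The cat sat on the mat.", "It was warm."]
new = ["The cat sat on the rug.", "It was warm.", "Rain fell all night."]

report = classify_sentences(old, new)
for status, before, after in report:
    print(status, "|", before, "->", after)
```

Raising `cutoff` makes the match stricter, so only very small edits count as "edited"; lowering it will pair up more heavily rewritten sentences, at the risk of false matches.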

Perhaps even easier, though less powerful: Google Docs keeps a version history in which you can view the edits made on each day you worked on the file. It is also possible to export to Markdown.

Let us know your solution.

#1: Imported from external source by

System‭ · 2017-12-29T13:54:11Z (over 7 years ago)

Original score: 1
