Post History

Q&A How to create a useful diff of a markdown writing project iteration

posted 7y ago by Boondoggle‭  ·  last activity 5y ago by System‭

Answer
#3: Attribution notice added by System · 2019-12-08T07:36:40Z (almost 5 years ago)
Source: https://writers.stackexchange.com/a/32241
License name: CC BY-SA 3.0
License URL: https://creativecommons.org/licenses/by-sa/3.0/
#2: Initial revision by Boondoggle · 2019-12-08T07:36:40Z (almost 5 years ago)
I'm not sure about using Git commits, as I don't use Git much.

How about tokenizing your text in Python and then calculating the entropy of each sentence? A change in a sentence's entropy between two versions indicates that the sentence was edited. Alternatively, you could do this for words or n-grams. The output can be stored in a list and written to a text file.

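    # Python 3: Shannon entropy (bits per character) of a string s
    import math
    from collections import Counter

    def entropy(s):
        # Count each character, then sum -p * log2(p) over the counts
        p, lns = Counter(s), float(len(s))
        return -sum(count/lns * math.log(count/lns, 2) for count in p.values())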
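
To make the per-sentence idea concrete, here is a rough sketch of how it might look end to end. The function names (`sentence_entropy`, `split_sentences`, `entropy_table`, `changed_sentences`) are my own, the regex splitter is a naive assumption rather than a proper tokenizer, and pairing sentences by position is also an assumption that breaks as soon as a sentence is inserted or deleted.

```python
import math
import re
from collections import Counter

def sentence_entropy(sentence):
    """Shannon entropy in bits per character; 0.0 for an empty string."""
    if not sentence:
        return 0.0
    counts, total = Counter(sentence), len(sentence)
    return -sum(c / total * math.log2(c / total) for c in counts.values())

def split_sentences(text):
    """Naive splitter (an assumption): break after ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def entropy_table(text):
    """Return (sentence, entropy) pairs for every sentence in the text."""
    return [(s, sentence_entropy(s)) for s in split_sentences(text)]

def changed_sentences(old_text, new_text):
    """Pair sentences by position and report those whose entropy differs."""
    old, new = entropy_table(old_text), entropy_table(new_text)
    return [(i, o[0], n[0]) for i, (o, n) in enumerate(zip(old, new))
            if not math.isclose(o[1], n[1])]

# Made-up sample text, for illustration only
old = "The cat sat on the mat. It was warm."
new = "The cat sat on the rug. It was warm."
for index, before, after in changed_sentences(old, new):
    print(index, before, "->", after)
```

One caveat: character entropy only depends on how often each character occurs, so an edit that keeps the same characters (for example, swapping two words) would go unnoticed. Treat it as a quick signal, not a full diff.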

Or perhaps use one of the fuzzy search algorithms to find sentences that are not an exact match but are a close one (e.g. one substituted character).
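
As one possible way to do the fuzzy matching, Python's standard-library `difflib` can score how similar two strings are. This is only a sketch: I'm picking `difflib` as a convenient option, not because it is the best algorithm, and the helper name `fuzzy_matches` and the `0.8` threshold are arbitrary choices of mine.

```python
import difflib

def fuzzy_matches(old_sentences, new_sentences, threshold=0.8):
    """For each new sentence with no exact counterpart, find its closest old sentence.

    Returns (old_sentence, new_sentence, similarity) tuples. The threshold is an
    assumed cutoff between "edited sentence" and "entirely new sentence".
    """
    results = []
    old_set = set(old_sentences)
    for sentence in new_sentences:
        if sentence in old_set:
            continue  # exact match, so the sentence is unchanged
        best = difflib.get_close_matches(sentence, old_sentences, n=1, cutoff=threshold)
        if best:
            ratio = difflib.SequenceMatcher(None, best[0], sentence).ratio()
            results.append((best[0], sentence, ratio))
    return results

# Made-up sample sentences, for illustration only
old_version = ["The snake oil sold well.", "It rained all day."]
new_version = ["The snake oil sold wall.", "It rained all day.", "A brand new sentence."]
for before, after, score in fuzzy_matches(old_version, new_version):
    print(f"{score:.2f}  {before!r} -> {after!r}")
```

Sentences that fall below the threshold are simply not reported here, so you would probably want to list those separately as additions.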

An easier, though less powerful, option is Google Docs: its version history lets you view all edits for each day you have edited the file. It is also possible to export to Markdown.

Let us know your solution.

#1: Imported from external source by System · 2017-12-29T13:54:11Z (almost 7 years ago)
Original score: 1