Post History

Q&A When to ask for constructive criticism?

posted 5y ago by _X_‭  ·  last activity 5y ago by System‭

Answer
#4: Attribution notice removed by System · 2019-12-18T21:34:25Z (about 5 years ago)
Source: https://writers.stackexchange.com/a/46642
License name: CC BY-SA 3.0
License URL: https://creativecommons.org/licenses/by-sa/3.0/
#3: Attribution notice added by System · 2019-12-08T12:27:31Z (about 5 years ago)
Source: https://writers.stackexchange.com/a/46642
License name: CC BY-SA 3.0
License URL: https://creativecommons.org/licenses/by-sa/3.0/
#2: Initial revision by (deleted user) · 2019-12-08T12:27:31Z (about 5 years ago)
# Ask the Hungarian

**TL;DR**

1. Unsurprisingly, this question has a mathematical solution based on the [Hungarian algorithm](https://en.wikipedia.org/wiki/Hungarian_algorithm).

2. I have not done the calculations, but I imagine that<sup>1</sup> the typical answer is: 

<sub>1: this is based on the guess that the time availability of beta-readers does not scale exponentially relative to their feedback quality ranking. See below for details.</sub>

* * *

# A far-from-formal discussion of the problem

## The setup

You have an assignment problem, with a series of tasks to be split between agents. The task is to provide feedback on a text whose length increases over time, up to an upper limit of which you have a rough estimate (and the accuracy of the estimate increases over time). The agents are a number of beta readers with varying degrees of professionalism (i.e. quality of the feedback), patience (i.e. size of the task that they can digest), and available time (i.e. how many times you can ask them to read your writings).

## The goal

Distribute the workload between beta readers across time to maximize the benefit of their feedback. In math terms, I see this as maximizing the impact of the feedback, that is, the product of the quality of the feedback and the certainty that the text given to the reader is close to the final draft. It makes no sense to have high-quality feedback on text which you are going to rewrite anyway.

## The assumptions

1. You cannot distinguish your beta-readers by taste. You can tell them apart by their reviewing skills, patience, and availability, but they will all like (or hate) the same written works.
2. Beta-readers read at the same speed.
3. Beta-readers can read unit chunks of text and do not need to read the entire work at once.

## Estimating the total length of the text

This total length is the total number of words that you have written. If you never rewrite your work, then it is exactly the number of words in your novel. If you rewrite parts of it once, then it is the sum of the words in the initial draft, plus the edited parts.

That being said, imagine that your initial writeup goal is 60k words. If we simplify the problem _a lot_, we can think that for every word there is a certain chance that you will have to rewrite it. Each time you rewrite a word, the chance that you will have to rewrite that same spot yet again shrinks. For simplicity, imagine that when you start writing there is a 50% chance that you will rewrite any given word, and that if you do rewrite it, there is again a 50% chance that you will rewrite the new one. How many words will you end up writing? The original word, plus 1 word with 50% chance, plus another word with 25% chance, plus another word with 12.5% chance... If you continue forever and weight the sum by the probabilities, a 50% rewriting chance results in writing 2 words for every word in the final manuscript: writing a 40k novel will require writing _on average_ 80k words. Obviously, [a higher rewriting chance requires writing more words](https://en.wikipedia.org/wiki/Geometric_progression#Geometric_series).
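The geometric-series argument above can be sketched in a few lines. This is only an illustration under the same simplification (a fixed, independent rewriting chance per word); the function name `expected_words_written` is made up for this example.

```python
def expected_words_written(final_length: int, rewrite_chance: float) -> float:
    """Expected total number of words typed for a manuscript of final_length words.

    Assumes every word, including every rewritten word, is rewritten with the
    same fixed chance, so the total per final word is the geometric series
    1 + p + p^2 + ... = 1 / (1 - p).
    """
    if not 0 <= rewrite_chance < 1:
        raise ValueError("rewrite_chance must be in [0, 1)")
    return final_length / (1 - rewrite_chance)


# The example from the text: a 40k novel with a 50% rewriting chance.
print(expected_words_written(40_000, 0.5))  # 80000.0
```

With a 75% rewriting chance the same call gives four words written per final word, which is the "higher chance, more words" effect mentioned above.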

## The beta-readers

The beta readers are characterized by:

- the quality of their feedback, which we can treat as a ranking (1: amateur, 100: professional literary critic)
- their patience, i.e. the longest chunk of text that they are willing to read in one go
- their available time, i.e. the total amount of text that they are willing to read

We can simplify this enormously by splitting each beta-reader into sub-beta-readers, where each sub-reader is simply one unit of a beta-reader's working time, for instance a half-hour of beta-reading. By assumption 2, this corresponds to a fixed number of words. Now your most willing beta-readers correspond to an army of tiny half-hour sub-readers, while your busy beta-readers may be represented by only one sub-beta-reader. By assumption 3, for the purpose of determining task assignments, there is no distinction between a single beta-reader and an army of sub-beta-readers that read the same amount of text. The only practical distinction is that sub-beta-readers cannot read in parallel, as they may correspond to the same person. Considering how long it takes to craft a novel, this is an irrelevant detail.
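A minimal sketch of this splitting step follows. The `BetaReader` record, the example readers, and the half-hour slot are illustrative assumptions; patience is left out because, under assumption 3, readers accept unit chunks.

```python
from dataclasses import dataclass


@dataclass
class BetaReader:
    name: str
    quality: int            # 1 (amateur) to 100 (professional literary critic)
    available_hours: float  # total time the reader is willing to spend


def split_into_sub_readers(readers, slot_hours=0.5):
    """Return one (name, quality) pair per full time slot a reader can offer."""
    sub_readers = []
    for reader in readers:
        slots = int(reader.available_hours // slot_hours)
        sub_readers.extend([(reader.name, reader.quality)] * slots)
    return sub_readers


# Hypothetical readers: a busy expert and a more available amateur.
readers = [BetaReader("Alice", 90, 1.0), BetaReader("Bob", 40, 2.5)]
sub_readers = split_into_sub_readers(readers)
print(len(sub_readers))  # 7 sub-readers: 2 for Alice, 5 for Bob
```

Keeping the reader's name on each sub-reader makes it easy to map an assignment back to actual people later.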

## The tasks

Once you have defined your sub-beta-readers, you also know the length of the chunk of text that each sub-beta-reader will read: if you have half-hour sub-beta-readers, then it is the hourly reading speed divided by two. How many tasks do you have? The total text length divided by the length of a task.

Each chunk of text has a certainty score, which is the inverse of the chance that you will edit it. For instance, if the initial chance of rewriting a word was 50%, then the chunks from the first pass have a certainty score of 2 (1 divided by 50%), the chunks from the first round of editing have a score of 4 (1 divided by 25%), and so forth. This means that the certainty score is an exponential function, i.e. it grows very rapidly with the number of edits.
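The chunk length, the task count, and the certainty score can be computed as below. This is a sketch only: the reading speed of 15,000 words per hour is an assumed figure, the helper names are invented for the example, and rounding the task count up is a choice made here so that no text is left uncovered.

```python
import math


def chunk_length(words_per_hour: int, slot_hours: float = 0.5) -> int:
    """Words that one sub-reader covers in one time slot."""
    return int(words_per_hour * slot_hours)


def number_of_tasks(total_words: int, words_per_chunk: int) -> int:
    """Number of chunks needed to cover the text (rounded up)."""
    return math.ceil(total_words / words_per_chunk)


def certainty_score(rewrite_chance: float, rewrites: int) -> float:
    """Inverse of the chance that the chunk is edited again.

    rewrites=0 is first-pass text (score 1/p); each round of editing
    multiplies the score by another factor of 1/p.
    """
    return 1 / rewrite_chance ** (rewrites + 1)


words_per_chunk = chunk_length(15_000)          # 7500 words per half hour
print(number_of_tasks(80_000, words_per_chunk)) # 11
print([certainty_score(0.5, k) for k in range(3)])  # [2.0, 4.0, 8.0]
```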

## The measure to optimize

Each sub-beta-reader reads a chunk of your text. The reader has a quality associated with them, given by their relative ranking, as we said above. The chunk of text assigned to the reader has a certainty score. The product of these two numbers is higher for chunks of which you are more certain and which are reviewed by more qualified people.
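As an illustration, the measure can be laid out as a matrix with one row per sub-beta-reader and one column per chunk. The function name and the numbers are made up for this sketch; the qualities follow the 1 to 100 ranking and the certainties follow the 50% rewriting example.

```python
def impact_matrix(qualities, certainties):
    """impact[i][j] = quality of sub-reader i times certainty of chunk j."""
    return [[q * c for c in certainties] for q in qualities]


qualities = [90, 40, 10]   # three hypothetical sub-readers
certainties = [2, 4, 8]    # first pass, one edit, two edits
for row in impact_matrix(qualities, certainties):
    print(row)
# [180, 360, 720]
# [80, 160, 320]
# [20, 40, 80]
```

The quantity to maximize is the sum of the chosen entries when each sub-reader is given exactly one chunk and no chunk is given twice.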

## And now what?

Plug all this into the [Hungarian algorithm](https://en.wikipedia.org/wiki/Hungarian_algorithm): the agents are the sub-beta readers, and the chunks of text are the tasks. Map the sub-beta-readers back to your beta-readers and the output of the algorithm should tell you which piece of text to give to which reader.
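For readers who want to try this, here is a compact pure-Python sketch of the Hungarian method in its common "potentials" formulation. It is one possible implementation, not the only one: the function names are invented for this example, the numbers are hypothetical, and it assumes there are at most as many sub-readers (rows) as chunks (columns). Since the method minimizes cost, the impact values are negated to maximize instead.

```python
def hungarian_min(cost):
    """Minimum-cost assignment for an n x m cost matrix with n <= m.

    Returns a list where entry i is the column assigned to row i.
    """
    n, m = len(cost), len(cost[0])
    INF = float("inf")
    u = [0] * (n + 1)    # row potentials (1-based)
    v = [0] * (m + 1)    # column potentials (1-based)
    p = [0] * (m + 1)    # p[j] = row currently matched to column j (0 = free)
    way = [0] * (m + 1)  # previous column on the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [INF] * (m + 1)
        used = [False] * (m + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], INF, 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        while j0:  # flip the matching along the augmenting path
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    assignment = [-1] * n
    for j in range(1, m + 1):
        if p[j]:
            assignment[p[j] - 1] = j - 1
    return assignment


def assign_chunks(impact):
    """Maximize total impact by minimizing the negated matrix."""
    return hungarian_min([[-x for x in row] for row in impact])


impact = [[180, 360, 720], [80, 160, 320], [20, 40, 80]]
assignment = assign_chunks(impact)
total = sum(impact[i][j] for i, j in enumerate(assignment))
print(assignment, total)  # [2, 1, 0] 900
```

In this toy case the best reviewer gets the most settled chunk and the weakest reviewer gets the first-pass text, which matches the intuition behind the measure.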

I hope that helps.

## Addenda, Notes and PS

- **I'm a discovery writer! I don't know how long my novel will be from the start!** : Worry not. You can use the average novel length for your genre as an initial estimate, or you could use a [Fermi estimate](https://en.wikipedia.org/wiki/Fermi_problem), or a [German tank estimation approach](https://en.wikipedia.org/wiki/German_tank_problem) (this may work surprisingly well if you write random scenes to be glued together in the end), or use as an upper bound the maximum total length of text that all your beta readers may be willing to read.

- **I'm bad at estimating chances of rewriting!** : Again, do not worry. Start with a guess, like 50%. Then you can apply an expectation-maximization algorithm to tune the chance of rewriting as you actually rewrite. In simple terms, at each round of rewriting, you can plug in the number of words you have actually rewritten and come up with a new guess for what your rewriting chance is. Be careful to track how many times you have rewritten each passage, though.

- **What to do if you write more text than your available readers can cover?** : Add pseudo-readers. The ideal case would be to do a random sampling with replacement from your actual beta-readers. A simple and effective way to do that is just to add a bunch of low-quality beta-readers. This will push reading revisions to later stages. Re-run the algorithm if you get new beta-readers.

- **What if my readers do not have a similar reading taste?** : Group readers into super-readers, i.e. taste groups with identical revision quality. Assign chunks of text to each super-reader. Then rerun the algorithm for the readers within each super-reader group, limiting the text to the collection of chunks assigned to that particular super-reader.

- **My readers do not read at the same speed** : The conversion to sub-beta readers is done in amount of text per unit of available time. You can instead convert a reader to sub-beta readers by the number of text chunks of a given length that they can read. As a formula: `available_time_for_reader * reading_speed / length_of_basic_text_chunk`

- **I have already written 40k words! How do I go from here?** Subtract these 40k words from the total length of the words that need beta-reading. Distribute the work over the remaining text.

- **When should I run this algorithm?** You can run it even before you start writing. You can re-run it every time a beta-reader reads a chunk of your text: you remove that amount of text from the total text to be read, remove the corresponding sub-beta-reader, update the estimated total length and the chance of rewriting, and rerun the algorithm. Doing so will allow you to distribute your work more precisely and in a more timely fashion. Once you have it set up, it is probably a matter of a couple of clicks.

- **I want to give my readers single chapters, rather than chunks of text, and they are of different lengths** : As long as the readers consider all chapters to be equal in terms of workload, replace the text chunks with chapters in the computation above. Each reader is divided into sub-readers corresponding to the total number of chapters that they are willing to read. Your novel length is given in chapters, and the probability of rewriting refers to a chapter and not to a single word anymore. Leave the rest unchanged. A word of advice: do not disclose whether a chapter is longer or shorter than average to your beta-readers unless you are sure that they are not easily offended by workload inequality.

- **Is there readily available software to calculate this?** Yes and no, and sadly the margin is too narrow for me to put it here :)
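As a rough illustration of the note on re-estimating the chance of rewriting, the sketch below uses a plain frequency estimate (rewritten words divided by words that were exposed to rewriting) rather than a full expectation-maximization procedure. The function name, the per-pass word counts, and the fallback to the initial guess are all assumptions made for this example.

```python
def estimate_rewrite_chance(words_per_pass, initial_guess=0.5):
    """Re-estimate the rewriting chance from tracked word counts.

    words_per_pass[0] is the first-draft word count, words_per_pass[k] the
    number of words rewritten in editing round k. Every word written in one
    pass had one opportunity to be rewritten in the next pass.
    """
    if len(words_per_pass) < 2:
        return initial_guess           # no rewriting data yet
    exposed = sum(words_per_pass[:-1])   # words that went through an edit round
    rewritten = sum(words_per_pass[1:])  # words actually replaced
    if exposed == 0:
        return initial_guess
    return rewritten / exposed


# Hypothetical log: 40k first draft, 20k rewritten, then 10k rewritten again.
print(estimate_rewrite_chance([40_000, 20_000, 10_000]))  # 0.5
```

Feeding the updated value back into the length estimate and the certainty scores before rerunning the assignment keeps the plan in step with how you actually edit.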

#1: Imported from external source by System · 2019-07-15T16:24:10Z (over 5 years ago)
Original score: 2