
How should we go from Stack Exchange Q/A to publishable PDF with the least hassle?

+0
−0

Over on another site we're talking about taking some of our content (on a particular theme) and re-packaging it as a printable PDF. (The primary use case is paper.) This wouldn't be a straight dump of the original posts; sometimes you want to edit some for a different audience, links don't work, and so on. We're currently thinking about using meta posts to facilitate this editing (so we can crowd-source that part of the work).

My question is: what's the best way to get from those posts to the final product, preserving as much formatting as possible so we don't have to re-do it? One could work with the Markdown (are there translators for that to other formats?), or with the generated HTML (the actual web page). Or one could cut/paste into one's favorite document-creation tool, which sounds like an unfortunate choice because it's labor-intensive and the formatting wouldn't follow. An additional consideration is that some of our content is in Hebrew (so non-ASCII).

I realize that I'm treading dangerously close to "too localized", but it seems like the same techniques that are used for wikis and blogs might apply here too.


2 answers

+1
−0

If you only want to convert a handful of pages into PDF, then you can do that in Microsoft Word and you will probably be ok.

If you want to convert a large number of web pages into PDFs while keeping them editable and stripping out unnecessary information, I strongly suggest the following:

Export the webpage with the source information as HTML. Open the saved page in Adobe Dreamweaver (or similar) and make all the changes to the text and page layout in HTML and then save the new content again as HTML. When all final changes have been made, then create the PDF from the HTML in Adobe Acrobat (or similar).
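For anyone who would rather script part of this than do it by hand, here is a minimal sketch of the "save the new content again as HTML" step. It is not what I actually used; it just assumes you have already pulled out the HTML fragment for one post, and wraps it in a standalone document that declares UTF-8 (which matters for the Hebrew content) and carries a small print stylesheet. The function name `wrap_for_print`, the CSS values, and the output filename are all made up for illustration.

```python
from html import escape

# Hypothetical print stylesheet; adjust page size and margins to taste.
PRINT_CSS = """
@page { size: A4; margin: 2cm; }
body { font-family: serif; line-height: 1.4; }
a { color: inherit; text-decoration: none; }
"""


def wrap_for_print(fragment: str, title: str, lang: str = "en", rtl: bool = False) -> str:
    """Wrap an already-edited HTML fragment in a standalone, print-oriented page.

    The fragment is inserted as-is; only the title and lang are escaped.
    """
    direction = "rtl" if rtl else "ltr"
    return (
        "<!DOCTYPE html>\n"
        f'<html lang="{escape(lang)}" dir="{direction}">\n'
        "<head>\n"
        '<meta charset="utf-8">\n'
        f"<title>{escape(title)}</title>\n"
        f"<style>{PRINT_CSS}</style>\n"
        "</head>\n"
        f"<body>\n{fragment}\n</body>\n"
        "</html>\n"
    )


if __name__ == "__main__":
    page = wrap_for_print("<p>שלום</p>", "Sample post", lang="he", rtl=True)
    # Write with an explicit encoding so non-ASCII text survives the round trip.
    with open("sample_post.html", "w", encoding="utf-8") as out:
        out.write(page)
```

The resulting file can then be opened in whatever HTML-to-PDF tool you prefer (Acrobat, a browser's print dialog, and so on).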

Why this and not Word? Two reasons.

One, I find that Word tends to get mucked up when you've cut and pasted from the web. Things tend to flow incorrectly, and un-mucking them tends to be quite frustrating. Your experience may vary.

Two, if you wish to be forward thinking and eventually want to create an ePub, a Kindle book, or what have you, I have found that you get better results when you create your e-book from HTML rather than from MS Word or even PDF.

In that case, it's better to handle your product once through HTML editing than to handle it twice through Word.


This post was sourced from https://writers.stackexchange.com/a/7426. It is licensed under CC BY-SA 3.0.


+0
−0

We have learned through experimentation that a new-enough version of Microsoft Word (we tested with 2010) supports format-preserving cut-and-paste from Stack Exchange posts. We drew up some formatting guidelines to get the content into shape (e.g. de-linkifying, since this is for paper). This still involves manually cutting and pasting from the browser into some other program (e.g. Word), but it turns out we don't need to extract the source HTML or Markdown after all, so for a small project we can live with that.
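We did the de-linkifying by hand, but if someone does end up working from saved HTML instead, that one guideline is easy to automate. The following is only a sketch using Python's standard `html.parser`; the class name `LinkStripper` and the function `delinkify` are invented for this example. It re-emits the markup unchanged except that `<a>` tags are dropped while their text is kept. Comments and doctype declarations are not re-emitted, which is fine for post fragments.

```python
from html.parser import HTMLParser


class LinkStripper(HTMLParser):
    """Re-emit HTML, dropping <a> tags but keeping the text inside them."""

    def __init__(self):
        # convert_charrefs=False so entities such as &amp; pass through untouched.
        super().__init__(convert_charrefs=False)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            # get_starttag_text() returns the tag exactly as it appeared in the input.
            self.parts.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        if tag != "a":
            self.parts.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag != "a":
            self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        self.parts.append(data)

    def handle_entityref(self, name):
        self.parts.append(f"&{name};")

    def handle_charref(self, name):
        self.parts.append(f"&#{name};")


def delinkify(html: str) -> str:
    """Return the HTML with every link replaced by its plain text."""
    parser = LinkStripper()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)


if __name__ == "__main__":
    sample = '<p>See <a href="https://example.com">this post</a> and שלום.</p>'
    print(delinkify(sample))
```

Non-ASCII text such as Hebrew passes through as ordinary character data, so nothing special is needed for it here.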
