Meta

How shall we handle our old (imported) content?

+6
−0

When we created this site, I made the executive decision to import all our content from Stack Exchange instead of starting with a blank slate. I did that for a few reasons:

  • We have a lot of good content there, and we should continue to have ready access to, and curate, that content.

  • I felt there would be a stronger incentive for SE users to come here if they could bring their work here. Having some of your content there and some here would be a pain, and I feared we'd lose some people because managing two sites is a hassle.

  • I wanted there to be a front page full of questions when we invited people here.

I considered asking ArtOfCode to run queries that would pull in only some of the content, like only upvoted non-closed questions or only questions with answers or other things. But that could get complicated (especially when we still want people to be able to have all their content if they want it), and Art was already doing us a big favor in setting up this community for us while we wait for the Codidact software to be ready.

We now have people here (yay!), and as we look through existing posts and (re)cast votes (1) and edit, we're seeing that there is in fact a lot of stuff here. And a lot of it is good, and we should give it the attention it deserves! And some of it is, maybe, not so good, and we should give it the attention it deserves too.

What I, and I think some others, have been doing is to kind of meander through the site, reading and voting. I've tried to review all the answers to all of my own questions, and in the process made some improvements. I also use tags as a starting point, though I'm nowhere near through all the questions on my favorite tags yet. And sometimes I just pick a page of questions and go. I encourage others to do any or all of these, too.

But my question is: how should we be curating this content?

Specifically:

  • What should we do with questions that were closed at the time of import? I reopened one yesterday, but most should probably stay closed. Some of them are of historical significance (we don't have locks here yet, sorry) and some were well-received if ultimately closed. At the other end of the spectrum, there might be some that have no answers or are downvoted, and maybe those should be deleted -- they can be re-asked if applicable.

  • What should we do if we come across answers that don't answer the question or are link-only answers? We have a couple moderator flags about this already, which we haven't handled pending some community consensus.

  • Assuming Art is willing to run some queries, should we do any systematic culling, and if so what? (Downvoted unanswered questions?)

  • How does our community feel about moderators making unilateral deletion decisions? We don't have auditing tools for this right now, but we can keep a list of deleted posts here on meta and lower the rep threshold for being able to view them. (That threshold is currently 1000, which nobody has.)

  • Other issues or suggestions?


(1) When we imported the content we reset scores to zero. We did this for two reasons: first, we do not have access to data about who voted, so we can't track your individual votes from that content. Second, we felt that in this respect a new site called for a fresh start, and that the people here should cast the votes that affect the ranking of the content here. We have the original scores available (though not, I think, the upvote/downvote split); if you think we should revisit this decision, please raise it.

0 comments

6 answers

+7
−0

The model on SE was moderation, not curation. Nothing was ever removed. Duplicates were marked, but never resolved. The only way any kind of curation occurred at all was through voting, and voting was not based on the expertise of the voter. Bad advice was supposed to sink to the bottom of the page, and good advice to rise to the top, through voting. But this assumed that the mass of voters were reliable arbiters of quality and accuracy.

It did not always work. Votes often went to the first answer posted, while the question was fresh and attracting eyeballs. A much better answer posted a week later would have very little chance of ever rising to the top because the question would just not get as many views a week after it was posted, and many would not read all the way down to the new answer if they were satisfied with the inferior but highly rated answer at the top. This effect tended to be worse for writing than for more technical stacks, because most of the answers are not provable mechanically in the way programming answers are, for instance.

No model is perfect, of course. But I think that the questions you raise about handling the old content really come down to this distinction between curation and moderation. Moderators deal with behavior. Curators deal with content. Anything we do with the imported content is curation.

No method of managing content is perfect. Community curation through voting is an interesting model, and clearly performs well in some cases. But it also clearly leads to the accumulation of an immense amount of duplication and cruft. And to start the process over again for a body of content as large as this will clearly mean that it will be months, and perhaps years, before the curation effect of voting really kicks in.

So if you really want to do anything with the current content, what you are really talking about is curation. And maybe that is not such a bad thing. Vast numbers of poor answers, silly questions, and duplicate questions and answers could be removed with little controversy by a reasonable curator or team of curators.

That curation effort would yield a site that is far easier to navigate, and thus far more useful and more likely to attract traffic. Fixing up question titles so that they actually reflected the question asked -- making them actually be questions -- would, by itself, make a huge difference.

And maybe it is worth thinking about whether active curation, alone or in combination with voting, should be a permanent feature of the new site. After all, if you want to draw traffic to this site despite its having fewer active users, making it easier to use would be a good draw.

But if we go that route, it seems to me that curators and moderators should be two distinct roles. Moderators should be focussed on behavior and the topics that are active right now. Curators should probably not get involved until the questions have cooled a little, and they should deal strictly with the content.

4 comments

I am talking about curation, yes. On Stack Exchange, high-rep users can vote to delete. Here we don't have that; only moderators (I think) can delete, so to implement this type of curation we need moderator action. (By the way, you might be interested to know that Codidact is planning a different answer-ranking system, including giving new answers to older questions initial priority.) Monica Cellio 7 months ago

Right, but there is potentially far more to curation than simple deletion of the egregiously bad. Consolidation and pruning could make a huge difference to the quality of the information set. On the other hand, they could offend the contributors and make reputation counting more complex. Not easy choices. Mark Baker 7 months ago

@MonicaCellio, I do like the idea of giving priority to the new, though. It would be useful, as a user, to be able to view the site in "What's new" mode or in "What's best" mode, depending on the reason for my visit. Mark Baker 7 months ago

Yes, agreed -- deletion is part of curation, but so is editing and even writing new (better) answers to old questions. I was trying to address curation in all its forms; sorry for being unclear. Monica Cellio 7 months ago

+3
−0

To be honest, part of the reason why I asked this question on the Codidact forum was because of how writing.codidact.com (Writing.CO) looks a lot like a clone site.

There's 366-ish pages of questions. There's maybe 360-ish pages of cloned questions and answers. Writing.CO is basically a clone site. There's going to be a difficult decision here: What percentage of cloned material do we want on Writing.CO?

The questions you ask now will set the tone and topic for a long time to come. Try not to "seed" your site too much or the whole thing is going to start to look staid and forced. That will not make for an interesting site.
(Robert Cartaino; see also Your New Site: Asking the First Questions, 2010)

There are non-negligible drawbacks to being a clone site. Yet, there are drawbacks to not cloning (e.g. what if someone asks a duplicate of a Writing.SE question?). I don't see an easy answer to this problem.

Also, while people assert "QPixel is not Codidact", the URL (writing.codidact.com) overrides this. Practically, Codidact is QPixel, and QPixel is basically a Stack Exchange clone. (Moreover, it's likely that Writing.CO is going to be the make-or-break site for Codidact.)

2 comments

I wonder how often people look beyond the first page. If there are 360ish pages but we mostly look at the first one, then the old stuff is there but unnoticed until somebody does a search or looks at something from someone's profile. Maybe that's ok but we should try to improve what we find when we do that? Thinking out loud. Monica Cellio 7 months ago

I think active participants are certainly going to look; maybe passers-by don't look. It may be possible to consider these posts "inactivated" [activated by voting, or a user arriving], but that would require some careful thought and implementation. becky82 7 months ago

+3
−0

One element of curating is which answer got accepted. It relates to what Amadeus says about the first answer getting all the votes: the OP presumably sees all answers, and picks the one that helped most. That helps draw attention to that answer, which might not be the first one. I suppose we're going to have the "accept answer" feature here eventually? Then marking which answer to each older question was accepted shouldn't be an issue - we know who did the accepting.

With regards to the volume, I think with better visibility of older questions, they'll curate themselves eventually in the same way new questions get curated. For that, I believe we need some sort of suggestion mechanism, similar to what we had on SE, especially when we start asking a new question. (Such a mechanism is extremely useful for other reasons as well.)

We also need the curating tools: vote to close, vote to reopen, vote to delete. It shouldn't all be on the moderators.

0 comments

+2
−0

I'd say if they (Q) were closed before, open them and give them a downvote, and if they get "enough" downvotes (3? 5?) close them. I would like the same for both Q and A. SE had a "review queue", ours would just be downvoted questions.

For answers, I'd say any answer with 2+ downvotes should be hidden (viewable with a click) and available in a review queue; deleted after five downvotes. As a programmer myself, I'd put a counter on the user's account to see how often they get hammered for Q and A on review; at some point they are just here to offend, or work out their psychological aggression, and are candidates for expulsion.
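To make the thresholds above concrete, here is a rough sketch in Python. It is not how QPixel or SE actually works; the names (`Answer`, `answer_state`, `flagged_counts`) are made up for illustration, and I'm assuming "deleted after five downvotes" means five or more.

```python
from dataclasses import dataclass

# Thresholds taken from the proposal above; both are assumptions, not site policy.
HIDE_AT = 2    # "2+ downvotes should be hidden"
DELETE_AT = 5  # "deleted after five downvotes", read here as 5 or more


@dataclass
class Answer:
    author: str
    downvotes: int


def answer_state(answer: Answer) -> str:
    """Classify an answer as visible, hidden (one click away), or deleted."""
    if answer.downvotes >= DELETE_AT:
        return "deleted"
    if answer.downvotes >= HIDE_AT:
        return "hidden"
    return "visible"


def flagged_counts(answers):
    """Count, per author, how many answers were hidden or deleted.

    This is the hypothetical per-user counter: a high count suggests
    the account deserves a closer look.
    """
    counts = {}
    for a in answers:
        if answer_state(a) != "visible":
            counts[a.author] = counts.get(a.author, 0) + 1
    return counts
```

The same function would serve for questions if the thresholds were swapped for whatever number of downvotes counts as "enough".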

I agree with Mark about the structural issues with voting; i.e. the best answers (mine!) are often too late to get the most attention. The things I can think of to address that are to put the highest rated or accepted answer LAST, or to leave them in answer order but reversed: the newest answer is on top, whether or not it is accepted or has the highest score. Or use the default SE sort and provide buttons to sort the answers by score or age. (I see this was discussed in the comments on Mark's answer.)

I don't mind moderators making unilateral decisions on OLD Q or A or comments. I wouldn't want to see this site get into the autocratic control mode of SE, but I do agree that we don't need to see spam, insults, racism, homophobia, misogyny, etc (unless it is a legitimate question or answer about racism, etc).

There is a rather fuzzy line to be drawn somewhere!

I like the idea of community moderation and curation, supplemented by more professional eyes at times. The only problem would be self-serving moderation; voting down Q or A as part of the "game" of getting the most points. SE tried to fix that in their gamification by making the downvote cost you two points, but an upvote costs you nothing.

I don't think discouraging downvotes is an intelligent answer, though; then you have to sacrifice points to be altruistic and serve as a community moderator. A better answer might be requiring a comment to explain a downvote, instead of just accepting it. Review of the lame excuses a user makes for their downvotes could reveal that self-serving downvotes are the real pattern. So it doesn't cost points to downvote, but it takes more of an effort than upvoting: you do have to publicly explain yourself. Which might, when moderating is the true motivator, help educate the poster as to what was wrong with their Q or A.

2 comments

Just in case I wasn't clear, with this meta post I'm more focusing on the bootstrapping issue with our large body of imported content. I'm not talking about new content or site policies months from now. Monica Cellio 7 months ago

@MonicaCellio Still, I'd say just add them with a downvote and leave them open; or stick them in a review queue. SE gets too complicated, it is really just another tab for questions. You could make it just like searching for a tag; questions with negative scores, sorted by how negative. Then we can visit the pages to help out with down/up votes, and you (admin) could close them manually for now by deleting whatever gets to <= -3. Doing the automatic thing can happen later, or never. Amadeus 7 months ago
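A minimal sketch of the "just another tab" idea from the comment above, in Python. The `Question` type and both function names are hypothetical, and the -3 cutoff is the one suggested in the comment, not an agreed rule.

```python
from dataclasses import dataclass

CLOSE_AT = -3  # cutoff suggested in the comment above; an assumption, not policy


@dataclass
class Question:
    title: str
    score: int


def review_queue(questions):
    """Return questions with negative scores, most negative first."""
    negative = [q for q in questions if q.score < 0]
    return sorted(negative, key=lambda q: q.score)


def ready_to_close(questions):
    """Return queued questions at or below the cutoff, for manual closing."""
    return [q for q in review_queue(questions) if q.score <= CLOSE_AT]
```

Nothing here is automatic: the queue only lists candidates, and a moderator would still act on them by hand.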

+2
−0

I just joined, so I'm a bit late to the party, and to begin with I wasn't exactly a particularly prominent SE user. I'd just like to quietly express interest in the idea of also importing vote information from SE, for the following reasons that I think match up well with the reasons you gave for importing the other content:

  • We do have a lot of good content there, and many people here (myself included) have already done some curation of that content. As far as I can tell, the same voting rules apply here as they did on SE, and hypothetically the same users. Given that the situation is more-or-less identical, I don't see why the votes would be invalidated.

  • I firmly believe that the strong incentive for content transfer applies equally, if not more so, to votes. People value their "points" on SE a great deal, and it's a decent way to evaluate contributions to a site. Understandably we would want some way to identify old contributions to SE vs. new contributions to Codidact, but that could potentially be solved by having two fields for old vs. new points. That way people can still enjoy having their points while also feeling encouraged to help bring life to the new site.

  • I think having a front page full of questions is a great idea that helps things feel more alive, but in my opinion it kind of serves the opposite purpose if all of the posts are totally devoid of all votes. It sets a strange precedent for new users, I think, and makes it obvious where content is "lifeless".

Hopefully this feedback isn't totally useless. I really like what you guys are doing here.

P.S. A 'Preview' for creating a post would be really nice :)

5 comments

We imported scores but didn't apply them - click the history link under any imported post, and you'll be able to find the score under the "imported from external source" event. ArtOfCode 6 months ago

@ArtOfCode In that case I still think that my arguments apply towards applying them, as well, since hiding them still has the same issues as described, I feel. OnyZ 6 months ago

Thank you for this input (definitely not useless!). One reason we didn't import the votes is that we can't connect them to users; from the data dump we can only tell how many votes (in each direction) a post had, but we can't tell where those votes came from. Rather than having people voting twice (you voted there and then see the question and vote again here), we decided to reset. The vote totals are available, and maybe we should figure out how to make them more visible. Monica Cellio 6 months ago

@MonicaCellio Ahh, if the vote totals can't be assigned to people, that does complicate things... In that case as you say, I guess the best way to alleviate the issues as I see it would be to increase the visibility of these "Archived" votes? OnyZ 6 months ago

It occurs to me that one of the ways in which social proof militates against curation is that any reduction of duplication or elimination of inferior answers involves reducing reputation (unless reputation is separated from individual content items somehow). Mark Baker 6 months ago

+1
−0

I think on the whole, anything that was marked "keeping only for historical value" -- those are probably delete-able. In general, how many of the oldest questions ARE worth keeping as they are? Should any be merged towards a NEWER duplicate instead of the newer ones going to the older ones?

Also, do we plan to handle duplicates the exact same way SE did? I'd love if there were a way to "cluster" them -- instead of saying "this one is a duplicate and thus not needed," say "this appears to be a duplicate of that -- people may have several ways of coming to this question, so we'll CLUSTER it with the core one, but it can stay as its own branch"? Something like that.

1 comment

Duplicates are kept (not deleted) specifically because of what you say -- people ask questions in different ways so don't find the original via search, and the duplicate links help bring them together. The "cluster" idea is interesting; right now we have unidirectional links, and that seems like something to improve. Monica Cellio 6 months ago
