Comments on What new data should we import from SE?

Parent

What new data should we import from SE?

+1
−0

When we set up this site we imported from SE as of the December data dump (the latest we had at the time). We didn't have a way to get the delta; the import code didn't use the API.

We now have better data-import tools, and there's been a new data dump. We can, therefore, import stuff that was posted on SE between December and now -- through sometime in March via the data dump, and through the API for the rest. We can also now specify what we want via a SQL query, so we can make import decisions based on tags, question status, votes, whether the question has answers, and more.

This is for new posts; we can't integrate edits.

I think it's worth pulling in more posts, as some of our users here were still active there during this period of time. I also think it's worth being a little more picky than we were on the initial data load. I'm thinking about excluding anything that's downvoted on SE, and any question that's closed. Is that a good approach? If not, what should we do instead?
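To make the proposed filter concrete, here is a minimal sketch of what such a selection query might look like. It runs against a made-up, simplified `posts` table in SQLite; the table layout, the column names, and the `importable_question_ids` helper are all assumptions for illustration, not the actual import tooling or the real data-dump schema.

```python
import sqlite3

# Hypothetical, simplified posts table. The column names are assumptions
# made for this sketch, not the schema the real import tools use.
SCHEMA = """
CREATE TABLE posts (
    id        INTEGER PRIMARY KEY,
    post_type TEXT    NOT NULL,           -- 'question' or 'answer'
    parent_id INTEGER,                    -- question id, for answers
    score     INTEGER NOT NULL,
    closed    INTEGER NOT NULL DEFAULT 0, -- 1 if the question is closed
    created   TEXT    NOT NULL            -- ISO date, e.g. '2020-01-15'
);
"""

# Keep only questions that are new since the cutoff, not downvoted
# (net score of zero or more), and not closed.
SELECT_QUESTIONS = """
SELECT id
FROM posts
WHERE post_type = 'question'
  AND score >= 0
  AND closed = 0
  AND created >= :since
ORDER BY id
"""


def importable_question_ids(conn, since):
    """Return ids of questions that pass the proposed import filter."""
    rows = conn.execute(SELECT_QUESTIONS, {"since": since})
    return [row[0] for row in rows]
```

The cutoff is passed in as a parameter, so the same query would work whether the delta comes from the data dump or from the API. Extra conditions on tags or on whether a question has answers could be added to the `WHERE` clause in the same way.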

0 comment threads

Post
+3
−0

I'm honestly inclined to agree that what Writing Codidact needs isn't really another data import. It's to get actual people to come here, read, contribute, and remain.

People who have posted on Writing SE are by no means legally prevented from posting that same content here, since it's still their content and SE only has a license to use it (for a wide variety of purposes, including redistributing it under a CC license). The SE terms of service clearly acknowledge this. Edits can be slightly trickier, but certainly the initial revision could be copied verbatim if the person who posted it wants to do so, and we can then tweak it here.

Also, there's a small amount of original content here that wasn't imported from SE at all. Another import from Writing SE seems likely to bury that content, which strikes me as absolutely the wrong thing to do when we're trying to get people to come here. We should be giving people a reason to come here specifically, and I don't see how importing more content from elsewhere does that.

1 comment thread

General comments (2 comments)
Monica Cellio wrote over 4 years ago

Good point about burying the original content, which we need more of (and to be more findable). And we should continue to prune stuff from the original import that is not helping us -- downvotes and flags are helpful there.

Mark Baker wrote over 4 years ago

Agreed. The only way this place thrives is if it becomes known as the best place to ask questions, and that only happens if it becomes known for having the best answers. Otherwise SE's first mover advantage will be insurmountable. More vigorous curation could certainly do a lot to improve the quality of answers here. It may not be enough by itself, but without some distinguishing property in the model, I don't see how the mouse bests the elephant.