Random thought: I wonder if sentence classification could be a low-cost tool for moderators in a system that tries to spot potential arguments. Start with a baseline of what's typical on most threads. Then a flurry of exclamatory and interrogative sentences, relative to that baseline, might indicate one form of argument.
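
To make that concrete, here's a rough sketch of the idea. It assumes sentence type can be guessed from end punctuation alone, which is crude, and the function names and the `factor` threshold are made up for illustration:

```python
import re

def sentence_kinds(text):
    """Label each sentence by its end punctuation (a crude stand-in for a real classifier)."""
    kinds = []
    for match in re.finditer(r"[^.!?]+([.!?]+)", text):
        end = match.group(1)
        if "!" in end:
            kinds.append("exclamatory")
        elif "?" in end:
            kinds.append("interrogative")
        else:
            kinds.append("declarative")
    return kinds

def heated_ratio(text):
    """Fraction of sentences that are exclamatory or interrogative."""
    kinds = sentence_kinds(text)
    if not kinds:
        return 0.0
    return sum(k != "declarative" for k in kinds) / len(kinds)

def looks_like_argument(thread_text, baseline_ratio, factor=2.0):
    """Flag a thread whose ratio is well above the site-wide baseline (factor is a guess)."""
    return heated_ratio(thread_text) > baseline_ratio * factor
```

The baseline would come from running `heated_ratio` over a large sample of ordinary threads and averaging.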

I'm sure there'd be a high false-positive rate. One could always run more reliable, costlier techniques on whatever these quick-and-dirty methods flag, so there are a few stages in the process before a human sees it. Votes and flags would obviously still be there. This is just an extra, preemptive technique.
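
The staging could look something like this. It's only a sketch: `cheap_score` and `costly_score` are hypothetical callables returning a number between 0 and 1, and both cutoffs are arbitrary:

```python
def triage(posts, cheap_score, costly_score, cheap_cutoff=0.5, costly_cutoff=0.8):
    """Return the posts that pass both stages and should reach a human moderator."""
    queue = []
    for post in posts:
        if cheap_score(post) < cheap_cutoff:
            continue  # most posts stop here, so the costly model rarely runs
        if costly_score(post) >= costly_cutoff:
            queue.append(post)
    return queue
```

The point is that the expensive scorer only ever sees what the cheap one lets through.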

The main issue is actually data. I've hand-labeled much of the training set for my website:

For a problem such as argument identification (which I think is totally possible), it would likely take a hundred thousand or more labeled examples, meaning it would likely cost thousands of dollars.

Now... if you have people who flag content provide a reason, you have the labels generated for you ;)
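
Roughly what I mean, as a sketch. The `(post_id, reason)` tuple format and the `min_votes` agreement threshold are my own assumptions, not any real forum's schema:

```python
from collections import Counter, defaultdict

def labels_from_flags(flags, min_votes=2):
    """Turn (post_id, reason) flag records into {post_id: majority_reason}."""
    by_post = defaultdict(Counter)
    for post_id, reason in flags:
        # Normalize so "Trolling" and "trolling " count as the same label.
        by_post[post_id][reason.strip().lower()] += 1
    labels = {}
    for post_id, counts in by_post.items():
        reason, votes = counts.most_common(1)[0]
        if votes >= min_votes:  # skip posts where too few flaggers agree
            labels[post_id] = reason
    return labels
```

Requiring a couple of agreeing flags per post is a cheap way to filter out noisy or spiteful labels.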


Good point. Good idea. You might even have people guiding the process, with many folks in decent communities doing it over time. I imagine almost all decent ones, regardless of group norms, will be against a specific subset of behavior such as trolling, obvious hate words, and specific patterns of heated, pointless argument. So even a lowest common denominator across many forums' downvotes and flags might be useful. Then forum-specific methods take over from there.


