Azure Jane Lunatic (Azz) 🌺 ([personal profile] azurelunatic) wrote in [site community profile] dw_suggestions 2012-08-28 01:05 am

Sitewide antispam capability: comment CAPTCHAs for inactive accounts

Title:
Sitewide antispam capability: comment CAPTCHAs for inactive accounts

Area:
comments, anonymous users, inactive users, antispam

Summary:
When an account becomes inactive (what constitutes "inactive" for the purposes of this concept is discussed below), require anonymous commenters to solve a CAPTCHA. If/when the account becomes active again, revert to the user's settings. This would not delete anything already in the journal, would not stop logged-in users from commenting, and would allow anonymous users who could solve the CAPTCHA to comment.

Description:
Spun off the comments on http://dw-suggestions.dreamwidth.org/1374810.html --

Spam comments are a woe that should be discouraged, while not discouraging comments of legitimate discourse from real sentient beings.

Most comment spam on Dreamwidth is anonymous. One of the places that spammers strike is the accounts of people who have become inactive. If someone's not around for any particular reason, they generally can't get rid of any spam that shows up in their journal. Emboldened by the way their first overtures have not been repelled or cleaned up, the spammer strikes again, and again, and again.

Anonymous spam is present until cleaned up by the journal owner (or someone logged in as them). Registered user/OpenID spam is only present until someone (someone else hit by the spammer, or a good neighbor) reports the spammer and the spammer is suspended; all of the spam comments left by that user will then go away across the site.

For this reason, it is more important to attempt to repel anonymous spam in the event that the journal owner is not around and therefore not able to take action.

When the journal owner becomes inactive, the journal allows anonymous comments, the journal owner does not already present anonymous commenters with a CAPTCHA, and anonymous comments are not screened by default, then there should be a sitewide setting to put up a CAPTCHA on attempted anonymous comments to those journals. (Screening leaves the anonymous comments invisible to search engines unless the journal owner comes through and unscreens them; by that time, first, the owner is active, and second, the owner is hardly going to unscreen spam on purpose unless there's a bigger problem.)
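
The conditions in the paragraph above can be sketched as a single predicate. This is only an illustration in Python, not Dreamwidth's actual code: the `JournalSettings` fields and the `needs_forced_captcha` name are hypothetical, chosen to mirror the four conditions as stated.

```python
from dataclasses import dataclass


@dataclass
class JournalSettings:
    """Hypothetical per-journal settings; field names are illustrative only."""
    owner_inactive: bool         # result of whatever activity check the site uses
    allows_anonymous: bool       # anonymous comments are permitted at all
    captcha_for_anonymous: bool  # owner already shows anonymous commenters a CAPTCHA
    screens_anonymous: bool      # anonymous comments are screened by default


def needs_forced_captcha(journal: JournalSettings, commenter_logged_in: bool) -> bool:
    """Return True when the sitewide setting should add a CAPTCHA to this comment."""
    if commenter_logged_in:
        # Logged-in users are never affected by this suggestion.
        return False
    return (
        journal.owner_inactive
        and journal.allows_anonymous
        and not journal.captcha_for_anonymous
        and not journal.screens_anonymous
    )
```

Because the predicate is evaluated per comment attempt rather than stored, the journal falls back to the owner's own settings as soon as `owner_inactive` becomes false again.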

Now for the definition of "inactive", meaning that the owner is probably not actively gardening journal comments. This should be something that can be adjusted on the administrative end of things should it not be got right on the first try. As a first attempt:

No new or edited entries in personal journal
No new or edited community entries? (Can we track this?)
No new or edited comments (from the journal, not to the journal, either in their own journal or abroad) (can we track this?)
No active login sessions

... for at least 60 days? Doing anything that touches one of the above things would start the clock over again. If someone logs in, leaves a comment in a community, deletes the comment, and logs out, that would restart the clock. If someone leaves themselves logged in after doing that, they would have until that login automatically expires before the clock starts.
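
As a rough sketch of the clock described above (in Python, with hypothetical names and inputs, not the site's real activity check): take the most recent of the tracked events, treat an unexpired login session as "clock not started", and compare the gap against an adjustable threshold. Treating an account with no recorded activity at all as inactive is an assumption of this sketch.

```python
from datetime import datetime, timedelta
from typing import Optional

# First-attempt value from the suggestion; meant to be adjustable by admins.
INACTIVITY_THRESHOLD = timedelta(days=60)


def is_inactive(
    now: datetime,
    last_journal_entry: Optional[datetime],
    last_community_entry: Optional[datetime],
    last_comment: Optional[datetime],
    session_expires: Optional[datetime],
    threshold: timedelta = INACTIVITY_THRESHOLD,
) -> bool:
    """Hypothetical check; each argument is the latest timestamp of that kind, or None."""
    # An unexpired login session means the clock has not started yet.
    if session_expires is not None and session_expires > now:
        return False
    # Otherwise the clock runs from the latest event, including session expiry.
    events = [last_journal_entry, last_community_entry, last_comment, session_expires]
    seen = [t for t in events if t is not None]
    if not seen:
        return True  # assumption: no recorded activity counts as inactive
    return now - max(seen) >= threshold
```

Passing a different `threshold` (for example `timedelta(days=120)`) changes the window without touching the rest of the logic.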

Poll #11570 Sitewide antispam capability: comment CAPTCHAs for inactive accounts
Open to: Registered Users, detailed results viewable to: All, participants: 54


This suggestion:


Should be implemented as-is.
39 (72.2%)

Should be implemented with changes. (please comment)
8 (14.8%)

Shouldn't be implemented.
0 (0.0%)

(I have no opinion)
7 (13.0%)

(Other: please comment)
0 (0.0%)


[personal profile] matgb 2012-08-29 01:38 pm (UTC)
From /stats
Total Accounts: 1712622
That are active in some way: 75715
That have ever posted an entry: 160079
That have posted an entry in last 30 days: 33277
That have posted an entry in the last 7 days: 18548
That have posted an entry in the last 24 hours: 7767
For the sake of simplicity, whatever the definition of "active in some way" is should be the definition for this as well; any account not active in some way counts as inactive.

So, IIRC, I count as active because I check my reading page and comment even though I've not posted for nearly a year, but someone who hasn't logged in in a period of time gets set inactive.

FWIW, that's what I had in mind when I said "inactive" in the comment thread you link to.

[personal profile] swaldman 2012-08-29 01:59 pm (UTC)
Seems like a good idea in principle. I would be in favour of notifying the user somehow (email? no point in an inbox notification if they're not using DW) when they become "inactive", on the principle of not giving people captchas when they have asked not to have them without explaining why. But, the notification might be perceived as spammy...

If we already have the last login date easily accessible in the database, would this be an acceptable way to determine whether somebody is active? I have little idea what I'm talking about here, but I imagine that it would be easier to implement (but a counter such as you describe might also be easy. I'll leave that to the people who have a clue ;-))

[staff profile] denise 2012-08-29 02:05 pm (UTC)
[personal profile] matgb is correct above: there's already an activity check for determining "active" for a bunch of things. :)

[personal profile] amadi 2012-08-29 05:22 pm (UTC)
I said "with changes" only because 60 days seems light, and IME this type of comment spam tends toward entries that are at least six months old, if not older. 120 days seems a more reasonable length of time.

[personal profile] ninetydegrees 2012-08-29 08:06 pm (UTC)
This. I would have said three months at least.
Also +1 to notifying the user that apparent inactivity status triggered the turning on of CAPTCHA tests for anonymous comments (if they're allowed).

[personal profile] the_shoshanna 2012-08-29 09:01 pm (UTC)
+1

[personal profile] aedifica 2012-08-29 05:45 pm (UTC)
YES.

[personal profile] daweaver 2012-08-29 07:33 pm (UTC)
Spam comments are a woe that should be discouraged, while not discouraging comments of legitimate discourse from real sentient beings.


I don't agree that captchae serve the purpose outlined above. Specifically, there is some evidence that these tests do discourage comments from actual humans. Furthermore, the original poster appears to be claiming that humans who cannot solve such tests are either unreal or not thinking. Are the visually-impaired to be forever disadvantaged?

From the tenor of this and other recent suggestions, Dreamwidth seems to treat captchae as a magic bullet, something that will prevent spam in all its forms. Are there no other tools in the arsenal?

There's a question of social expectations. Dreamwidth has presented itself as a company that honours its contracts, and one that will not change policies without evidence. It would be wrong to present a captcha in any circumstance where the account holder has chosen against them.

Can the original poster provide evidence that this is a widespread problem? That the harm from spam is greater than the harm from overturning deliberate journal decisions? Absent such evidence, I could not support the proposal in any form.

[staff profile] denise 2012-08-29 07:42 pm (UTC)
The original suggester is the head of the antispam team, just for the record. :)

Spam is definitely a medium problem now, and if you look at similar services, it's a huge problem there: unless a service is diligent and vigilant about the issue as aggressively as possible, it becomes a major spam target and once that happens, it's a lot harder to address. (Look at InsaneJournal, for instance; I don't know what percentage of their activity is spam, but judging by comments to their news posts, the results of their random journal search, and the fact that whenever you load their stats page, you have pretty good chances of hitting over half spammers in the "recently created" and "recently updated" sections, I'd venture a guess of anywhere from 50-75% of their activity is spam no matter how hard they try to squash it.) The only way to keep a service from being a major spam magnet is to diligently and vehemently address even the smallest bits of spam and prove to spam networks that spending time trying to spam the service is not a good return on investment.

DW's default CAPTCHA implementation is no longer image-based, by the way; individual journal owners can choose to switch back, but by default site-wide, we use text-based captchas with a significantly lower false-block rate and a much, much higher standard of accessibility. They also have the advantage of being more resistant to the proxying attack that's the standard way of breaking captchas these days.

[personal profile] daweaver 2012-08-31 03:25 pm (UTC)
My preference is always to respond to the suggestion as it is presented. It wasn't at all obvious that Azure Lunatic was anything other than an interested observer.

Thank you also for clarifying the change in captcha format. I very much doubt that data exists to show how many (or few) humans are deterred from interacting by this form of test, or how many humans are deterred by the presence of any test.

[personal profile] deborah 2012-09-05 03:30 pm (UTC)
There is constant research going into accessible captcha alternatives (as you know), and I would definitely like us to be looking into them at least once a year or so, whether we implement this suggestion or not. Just periodic checkups to make sure that there isn't something usable which is better than what we have. I keep a folder of them but tend not to spend the resources on looking into them -- I suppose that is kind of my job. ;-)

[staff profile] denise 2012-09-05 05:10 pm (UTC)
Oh, absolutely. Now that we have the framework for alternate captcha implementations (and soon, hopefully, we will have a viewer preference as well as a by-journal preference) it should be easy to toss in more if somebody comes up with something better.

[staff profile] denise 2012-08-29 07:47 pm (UTC)
I should also add, we have been very clear and up-front from the very beginning that there will be times when we need to place restrictions or make alterations to the site as a whole for the benefit of the service as a whole, and preventing the site from being overrun by spammers is definitely something I think qualifies.

Also: do you really think that someone who hasn't been active on the site in a year or whatever is going to be that upset over the service requiring commenters to solve a captcha in order to comment, when on multiple other services of similar reach and remit, not being active for a year is enough to get your entire account deleted?

[personal profile] daweaver 2012-08-31 03:34 pm (UTC)
Boiled down to basics, the original poster wants to test that the person making the comment is a person, and not a robospammer. Current technology does not permit a direct test, so proxy tests have to be used, one of which is the captcha. Proxy tests are poor and imperfect.

I don't disagree that Dreamwidth should make reasonable efforts to prevent spam posts from appearing on its servers. The original poster makes good points that the specific circumstances (anonymous comments on inactive journals) warrant some action.

It remains unclear that the circumstances outlined in the suggestion are particularly common. I'm not familiar enough with the Dreamwidth defaults to know if no-captcha-for-anon-comments can arise by accepting defaults. If that is the default case, I'm somewhat easier about Dreamwidth making minor changes to site behaviour than if the owner has consciously chosen not to show a test.

On further reflection, and acknowledging the evil nature of captchae, I would not raise tremendous objections to this as an interim resolution, pending a better (and fully automated) internal spam-check process.

There would need to be publicity about this, possibly including a message to the effect of "You're seeing this test because the journal owner hasn't logged in for some months." Such a message might also prod the owner to log on again.

[personal profile] cesy 2012-08-29 08:15 pm (UTC)
I agree with using the existing definition of "active in some way", and emailing the user to notify them of the change and that logging in or posting a comment or entry will turn it back off.

[personal profile] msilverstar 2012-08-30 01:23 am (UTC)
+1

[personal profile] zeborah 2012-08-30 11:44 am (UTC)
+1