azurelunatic: A glittery black pin badge with a blue holographic star in the middle. (Default)
Azure Jane Lunatic (Azz) 🌺 ([personal profile] azurelunatic) wrote in [site community profile] dw_suggestions 2010-07-20 10:50 am

Hide specific referer information from locked posts

Title:
Hide specific referer information from locked posts

Area:
privacy, locked entries, driving webmasters up the wall

Summary:
Optionally redirect URLs in entries that are posted locked, to give less incidental information about locked entries to external entities.

Description:
I may be technically wrong in some of this; I beg your indulgence and crave correction of any misunderstandings.

When a person follows a link, the person's browser sends information to the linked website including the referer, or the place that the person just was. (Unless, of course, the person has instructed their browser not to give referer information.) This generally includes the URL of the page on which the link was clicked. In the case of a locked entry, this means that one's browser is cheerfully giving out the location of a locked entry that links somewhere else, to the webmaster of the destination page. Sometimes this is perfectly innocuous; sometimes, this betrays too much information and can start drama.

A quick look at the topic shows that the usual server-side way to strip a referer of private information is to send it through a redirection service, which shows the URL of the redirection service rather than the URL of the actual source. This seems a reasonable enough way it could be done -- set up a redirector on Dreamwidth in a dedicated subdomain, and have a brief explanation/link to the FAQ on that page if someone goes there without a redirection argument to see what this place is and why they're getting traffic from it.
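A rough sketch of what such a redirector might look like, with Python used purely for illustration. The `to` query argument, the explanation text, and the function name are all made up for the example. One assumption is worth flagging loudly: as far as I know, browsers generally keep the original referer across a plain HTTP 3xx redirect, so this sketch serves a small intermediate page with a meta refresh instead, which is how referer-hiding services commonly work.

```python
import html
from urllib.parse import parse_qs, urlsplit

# Hypothetical explanation text shown when no target is supplied.
EXPLANATION = (
    "This address only forwards links from locked entries, "
    "so that the entry's own URL is not sent as the referer. See the FAQ."
)


def handle_redirect(query_string):
    """Return (status_code, body) for a request to the hypothetical redirector."""
    # parse_qs percent-decodes the value of the assumed "to" argument.
    target = parse_qs(query_string).get("to", [""])[0]
    if not target:
        # Someone arrived without a redirection argument: explain what this is.
        return 200, EXPLANATION
    if urlsplit(target).scheme not in ("http", "https"):
        # Refuse javascript:, data:, and other non-web schemes.
        return 400, "Unsupported link."
    safe = html.escape(target, quote=True)
    # Intermediate page: the destination should see this page, not the
    # locked entry, as the place the visitor came from.
    body = (
        f'<meta http-equiv="refresh" content="0; url={safe}">'
        f'<a href="{safe}">Continue to the linked page</a>'
    )
    return 200, body
```

The plain fallback link is there for browsers that ignore the refresh.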

There are a couple of different ways this could actually be done.

1, obvious: Upon detecting the magic combination of links and a locked entry, prompt and offer an actual rewrite of the links, to include the redirect. Advantage: transparent to the journal owner, the code that's entered is the code that comes out. Disadvantage: Harder for DW to deal with if things change somewhere down the line; harder for an individual user to go back and manually change URLs when security of entries changes.

2, reader-centric: A setting on the reader's end to automatically insert (probably via the HTML cleaner, when the entry is called up and displayed) a redirect on all in-entry/in-comment links from locked entries (it would be silly to have a redirect on things like the comment link, or on usernames in comment metadata). Advantages: the reader gets to set it, which means that if on something like a mobile device, it could be unset if it causes delays; since the reader sets it, the reader won't be surprised by it; if things on DW's end change, it's changing it in one place; if entries change security, there's no need to edit the entry. Disadvantage: that puts some more of the journal owner's privacy in the reader's hands, less transparent.

3, owner-centric: The journal owner sets whether a redirect will appear for any locked entries. Advantages: the journal owner is in control, and again, if things change in the way DW wants to handle it, it's just working with the HTML cleaner, not risking breaking links in entries permanently; if entries change security, there's no need to edit the entry. Disadvantages: potentially bad for readers with slow devices, perhaps confusing to readers who don't know about the feature.

[Edited to add: 4, global: Done for all locked entries, will-they-nil-they.]

Advantages to all: more privacy.
Disadvantages to all: less information to the owner of the remote website (which would include other Dreamwidth-hosted journals); link redirection is not what DW particularly wants to specialize in; the redirector could possibly get abused by people who don't have DW journals but are always glad to see an unguarded redirection service.
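
Options 2 and 3 both come down to rewriting links at display time rather than in the stored entry. A minimal sketch of that step, assuming a hypothetical redirector address and a simple regular expression in place of a real HTML cleaner (none of these names come from Dreamwidth's code):

```python
import html
import re
from urllib.parse import quote

# Hypothetical redirector address; the "to" argument is an assumption.
REDIRECTOR = "https://redirect.example.org/?to="

# Matches the href value of an <a> tag written with double quotes.
HREF_RE = re.compile(r'(<a\b[^>]*?\bhref=")([^"]*)(")', re.IGNORECASE)


def rewrite_links(entry_html, locked, enabled=True):
    """Route absolute http(s) links through the redirector for locked entries.

    `enabled` stands in for the reader's setting (option 2) or the
    journal owner's setting (option 3). The stored entry is never changed.
    """
    if not (locked and enabled):
        return entry_html

    def repl(match):
        url = html.unescape(match.group(2))
        if not url.lower().startswith(("http://", "https://")):
            return match.group(0)  # leave relative and mailto: links alone
        return match.group(1) + REDIRECTOR + quote(url, safe="") + match.group(3)

    return HREF_RE.sub(repl, entry_html)
```

Because the rewrite happens on output, changing an entry's security or dropping the feature later needs no edits to the entry itself.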

Poll #3893 Hide specific referer information from locked posts
Open to: Registered Users, detailed results viewable to: All, participants: 42


This suggestion:

Should be implemented as-is.
10 (23.8%)

Should be implemented with changes. (please comment)
6 (14.3%)

Shouldn't be implemented.
5 (11.9%)

(I have no opinion)
21 (50.0%)

(Other: please comment)
0 (0.0%)

charmian: a snowy owl (Default)

[personal profile] charmian 2010-07-22 08:57 pm (UTC)(link)
I vote for #3. This is how wordpress.com does it.
charmian: a snowy owl (Default)

[personal profile] charmian 2010-07-22 11:26 pm (UTC)(link)
I am definitely against that option. IMHO it should be up to the user and they must clearly choose to opt-in. I think wp.com implements it in a good way. It's an option, but you don't HAVE to do it if you don't want to.
charmian: a snowy owl (Default)

[personal profile] charmian 2010-07-23 12:12 am (UTC)(link)
http://www.facebook.com/note.php?note_id=392382738919

Also, perhaps FB's implementation could provide some hints.
charmian: a snowy owl (Default)

[personal profile] charmian 2010-07-23 01:16 am (UTC)(link)
Yeah, it is a problem.

This wouldn't work for single entry view, but what if for the read page DW were to do what Tumblr or Posterous does, and instead of it being username.dreamwidth.org/read, it was www.dreamwidth.org/read? That way there would be less information.

(Also, selfishly I would like this changed because when it comes to my Google Analytics page, it becomes clogged with referrals from people's read pages, when what I really want to know is whether entries are linking to me. I have a rant about how I find Google Analytics annoying that I never do finish.)
charmian: a snowy owl (Default)

[personal profile] charmian 2010-07-23 01:45 am (UTC)(link)
Yeah, it wouldn't be as bad if Google analytics didn't show only the subdomain and not the exact URL.

Heh, ok.
pauamma: Cartooney crab holding drink (Default)

[personal profile] pauamma 2010-07-22 09:23 pm (UTC)(link)
If we ever get around to implementing vanity URLs properly, will this cause problems in conjunction with them?
kyrielle: A photo of kyrielle, in profile, turned slightly toward the viewer (Default)

[personal profile] kyrielle 2010-07-22 09:30 pm (UTC)(link)
#3, perhaps with something to allow the redirector to be used only from within the DW domain to avoid the abuse factor if that's possible.

Not sure if there's any feature that would make it useful to redirect links leading into the DW domain from elsewhere within it, either, so could maybe avoid extra steps on those if there's not a reason to need it.
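
A sketch of both checks, assuming the redirector can see the Referer header of the incoming request and that `dreamwidth.org` is the only site domain involved (function names are invented for the example). One caveat: readers whose browsers send no referer at all would be turned away by the first check, so a real version would need some fallback.

```python
from urllib.parse import urlsplit

SITE_DOMAIN = "dreamwidth.org"


def on_site(url):
    """True if the URL's host is the site domain or one of its subdomains."""
    host = (urlsplit(url).hostname or "").lower()
    return host == SITE_DOMAIN or host.endswith("." + SITE_DOMAIN)


def redirector_should_serve(referer_header):
    """Only forward requests that arrive from a page on the site."""
    return bool(referer_header) and on_site(referer_header)


def needs_redirect(link_url):
    """Links that stay within the site could skip the extra hop."""
    return not on_site(link_url)
```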
turlough: deckchairs on Brighton Beach, June 2013 (Default)

[personal profile] turlough 2010-07-22 10:47 pm (UTC)(link)
+1
matgb: Artwork of 19th century upper class anarchist, text: MatGB (Default)

[personal profile] matgb 2010-07-22 10:33 pm (UTC)(link)
Definitely need something. Not just the location, but also the page title, so if I had a locked post titled "sometimes I really hate Az" and linked to you, you could pick that up if anyone followed the links.

There are services, like anonym.to, that you can use, but I don't want that hardcoded on my page, and it wouldn't stop sidebar links &c.

So yes, I approve this idea, but it'd need to parse the whole page, not just the entry. And I have no clue how to do it.
charmian: a snowy owl (Default)

[personal profile] charmian 2010-07-22 11:34 pm (UTC)(link)
Maybe there's something I'm not getting, but how could they pick that up? The referring URL would still be username.dreamwidth.org/######.html wouldn't it?
matgb: Artwork of 19th century upper class anarchist, text: MatGB (Default)

[personal profile] matgb 2010-07-22 11:58 pm (UTC)(link)
yup, but referer data passed on includes page title as well, not just url.
charmian: a snowy owl (Default)

[personal profile] charmian 2010-07-22 11:59 pm (UTC)(link)
Huh, I didn't know that.
damned_colonial: Convicts in Sydney, being spoken to by a guard/soldier (Default)

[personal profile] damned_colonial 2010-07-24 11:31 pm (UTC)(link)
What? Got a cite for that? I've never heard of it... The referer header in the HTTP request just gives the URL, doesn't it? And that's certainly all that's shown in the default setup of Apache's combined server logs.

ETA: http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html gives the spec for the HTTP referer [sic] header. It gives a URL, and nothing else. A quick google for "http referer title" doesn't suggest that the title is passed.
Edited 2010-07-25 01:08 (UTC)
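
To make the log point concrete, here is a small sketch that pulls the referer field out of a line in Apache's combined log format. The sample line is invented; the field holds a URL and nothing else, so there is no slot for a page title.

```python
import re

# Apache "combined" format:
# %h %l %u %t "%r" %>s %b "%{Referer}i" "%{User-agent}i"
COMBINED = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+) "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def referer_of(log_line):
    """Return the referer URL recorded in a combined-format line, or None."""
    match = COMBINED.match(log_line)
    return match.group("referer") if match else None


# Invented sample line for illustration only.
SAMPLE = (
    '203.0.113.5 - - [25/Jul/2010:01:08:00 +0000] "GET /story.html HTTP/1.1" '
    '200 5120 "http://username.dreamwidth.org/12345.html" "ExampleBrowser/1.0"'
)
```

With this sample, `referer_of(SAMPLE)` gives back the entry URL, which is exactly the piece of information the suggestion wants to keep away from the remote site.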
matgb: Artwork of 19th century upper class anarchist, text: MatGB (Default)

[personal profile] matgb 2010-07-25 01:55 pm (UTC)(link)
Yes, I'm wrong.

I don't know why I had convinced myself it was true, but I was sure that my old stats package (statcounter) told me referral titles. Appears I was wrong and conflating that with pages viewed.

It's been a long time since I used it (took multiple attempts to find the password because while it was one of my standard mid-secure passwords, it was a 5 year old one).

Thanks for correcting me.
melannen: Commander Valentine of Alpha Squad Seven, a red-haired female Nick Fury in space, smoking contemplatively (Default)

[personal profile] melannen 2010-07-22 11:16 pm (UTC)(link)
This gets "no opinion" from me not because I don't care, but because I'm *utterly* of two minds about it. I can't decide if the privacy gain is worth the disadvantages.
aedifica: Me looking down at laptop (off screen).  Short hair. (Default)

[personal profile] aedifica 2010-07-26 02:38 pm (UTC)(link)
+1
msilverstar: (craig)

[personal profile] msilverstar 2010-07-22 11:37 pm (UTC)(link)
I actually did a redirect like that via my personal web server, and it was a pain in the ass. So automation would be good.

bit.ly and such redirectors do pass on the referer so the target page will get the SEO credit for it in search results. Which means it's not good enough for our purposes.

1, 3, or 4 works for me
zeborah: Map of New Zealand with a zebra salient (Default)

[personal profile] zeborah 2010-07-23 06:08 am (UTC)(link)
I vote #3. #4 would be my second choice - it has disadvantages but I think the disadvantages of the other options are worse.
noracharles: (Default)

[personal profile] noracharles 2010-07-23 07:44 am (UTC)(link)
I voted no, but I could accept a solution that was opt-in on a post by post basis.

All the suggestions have disadvantages, but #3 is the least objectionable. A simple way of doing it is to write links like this: linkIdontwantreferedataon.com (copy and paste), but it's not as accessible and I can see why journal owners might not want to inconvenience their readers like that. On the other hand, I find it annoying when I'm redirected when I click a link.

So, opt-in :-)
zvi: self-portrait: short, fat, black dyke in bunny slippers (Default)

[personal profile] zvi 2010-07-23 04:22 pm (UTC)(link)
I vote for number 4. It's a security hole and, look, there's a fix!

ETA: well, okay, privacy hole, not security. But still, good privacy protection is one of our things, and this would add to it.
Edited (expansion/correction) 2010-07-23 16:25 (UTC)
noracharles: (Default)

[personal profile] noracharles 2010-07-23 06:16 pm (UTC)(link)
You're right that it's a privacy hole, when users may be unaware of how referer information works, and unwittingly have their links reveal private information about their locked posts.

On the other hand, I may link to something on my special access filter for writing about a private issue and want there to be referer information, because the site I'm an affiliate of isn't going to act awkward around me because I have [rare disease], or the writer of the [controversial kink] fic I just recced won't think less of me for having that kink, or the friend who's already on that access filter would feel better knowing that all that traffic to her entry is coming from me and not from someone mocking her.

So no to global hiding of referer information on locked entries.
noracharles: (Default)

[personal profile] noracharles 2010-07-24 04:10 pm (UTC)(link)
No, they don't depend on the referrer, but I'm an affiliate of a few bookstores through my website, and it's in the terms of service that I may only post affiliate links from the URL connected to my account. I assume that's so I can't pull in a lot of traffic using content which they disallow in connection with their brand and hide it from them by not having it on the vetted site.
daweaver:   (Default)

[personal profile] daweaver 2010-07-25 11:00 am (UTC)(link)
Let me see if I've got this right:
* Person A makes secure post, including link α.
* Dreamwidth servers do something to transform α into β.
* Person B reads secure post, clicks on link β.
* Link β is to a redirection engine on Dreamwidth, which then goes to the original link α with different HTTP headers; the effect is to not disclose information about A's entry, including that it's from A.
* The owner of α is aware that they're getting traffic from Dreamwidth, but nothing more specific. They will also be aware that they're getting a visit from B.

From a privacy point of view, this is good. To deal with the last point, I don't see that there's a possible way to stop the owner of α from finding out some information about B's visit, short of Dreamwidth providing a browsing proxy, and *that* is outside the scope of this suggestion.

If such a plan were to be implemented, it would only alter disclosure about A's journal. I would therefore counsel that A is the only person who can set (and potentially unset) this redirection. This may also save on computational power, as Dreamwidth need only perform the translation once per entry, and store the post containing transformed links. Option 2 would require each post to be changed on the fly, which is much more expensive.

Dreamwidth is going to have to keep fuel for the redirection engine: the translation between α and β will have to be stored somewhere. This requires additional computational power, and provides an additional point of failure and point of compromise. I don't see it as a significantly greater problem than keeping data on Dreamwidth in the first place. If the translation table is leaked, all that will be available is a list of URLs and their Dreamwidth codes.
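
A toy version of such a translation table, assuming an in-memory store and random codes (the class and method names are invented, and a real store would be persistent). It holds only URLs and codes, with no journal or entry identifiers, so a leak would expose no more than the list described above.

```python
import secrets


class LinkTable:
    """Maps each original URL (α) to an opaque code used in the β link."""

    def __init__(self):
        self._by_code = {}
        self._by_url = {}

    def code_for(self, url):
        """Return the code for a URL, creating one on first use."""
        code = self._by_url.get(url)
        if code is None:
            code = secrets.token_urlsafe(6)
            while code in self._by_code:  # retry on the rare collision
                code = secrets.token_urlsafe(6)
            self._by_code[code] = url
            self._by_url[url] = code
        return code

    def url_for(self, code):
        """Return the original URL for a code, or None if unknown."""
        return self._by_code.get(code)
```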

In principle, the redirection could be done on a link-by-link basis, but it may be easier to require authors to use the feature on a post-by-post basis. Such a setting would also enable the server to determine whether to use the real or translated URLs of page furniture links. It would not be appropriate to have this option available for public posts, as they're already public.

Use of this setting must be under the control of the individual poster; I've no firm opinion about the default setting.
sophie: A cartoon-like representation of a girl standing on a hill, with brown hair, blue eyes, a flowery top, and blue skirt. ☀ (Default)

[personal profile] sophie 2010-07-27 12:13 am (UTC)(link)
They will also be aware that they're getting a visit from B.


Well, they're aware that they're getting a visit from someone, but the only information they'd get from that which would be potentially useful in identifying the person is the IP address (and maybe the browser used, if the browser is sufficiently rare, with extra domain knowledge, to tie to a single person).

The username wouldn't be given.