Finer grained control of robots.txt
Title:
Finer grained control of robots.txt
Area:
privacy, account settings
Summary:
Allow users to upload their own robots.txt file
Description:
The current robots.txt file generated when you check "Minimize my journal's inclusion in search engine results" forbids all automated robots from indexing your journal.
However, not all robots belong to search engines. For instance, someone who does not want their journal to show up on Google might still want the backup provided by archive.org. (Or perhaps it's the other way around: you do want Google to index your journal, but you don't want a permanent record.) Browsershots.org uses a robot to render a web page in a variety of browsers. I'm sure there are other web services out there (translation? found art?) that use robots and obey robots.txt directives.
I suggest letting users upload a robots.txt file for their journal, so they can have finer-grained control of their privacy.
If storing a custom robots.txt file for every user is too expensive, let users list the individual robots they want to allow, and have the service automatically generate a robots.txt that permits those user-agents on that person's journal.
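To make the fallback idea concrete, here is a rough sketch of how a per-journal robots.txt could be generated from a list of allowed robots. The function name `build_robots_txt` is made up for illustration, and `ia_archiver` is only an assumed example of a user-agent token (it has historically been associated with the Internet Archive's crawler). This is not how the service actually builds its file.

```python
def build_robots_txt(allowed_agents):
    """Return robots.txt text that admits only the listed user-agents.

    Hypothetical helper for illustration; not part of any real codebase.
    """
    lines = []
    for agent in allowed_agents:
        agent = agent.strip()
        # Reject empty names and embedded line breaks so a user cannot
        # smuggle extra directives into the generated file.
        if not agent or "\n" in agent or "\r" in agent:
            raise ValueError(f"bad user-agent name: {agent!r}")
        # An empty Disallow value means nothing is off-limits to this robot.
        lines += [f"User-agent: {agent}", "Disallow:", ""]
    # Every robot not named above falls through to the catch-all group.
    lines += ["User-agent: *", "Disallow: /"]
    return "\n".join(lines) + "\n"


# "ia_archiver" is an assumed example token, not a recommendation.
print(build_robots_txt(["ia_archiver"]))
```

Storing only the short list of allowed names per journal, rather than a whole uploaded file, would keep the cost low and avoid having to validate arbitrary user-supplied directives.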
This suggestion:
Should be implemented as-is: 24 (60.0%)
Should be implemented with changes: 6 (15.0%)
Shouldn't be implemented: 1 (2.5%)
(I have no opinion): 8 (20.0%)
(Other: please comment): 1 (2.5%)

no subject
On the one hand, it would be good to at least have some sort of parser so you could see what would get blocked and what would not.
On the other hand, there are a lot of things that are Deep Magic That One Oughtn't Mess With Unless One Knows. (Though in any case there should be a way to quickly switch back to the default.)
On the gripping hand, impolite robots do not even abide by robots.txt, and everything a bot can reach is public content anyway, unless said bot has an account that has been granted access.
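For the "what would get blocked" preview, something along these lines might do as a starting point. It leans on Python's standard `urllib.robotparser`; the `preview` helper, the sample file, and the user-agent names are all assumptions for illustration, and it only tells you what a polite robot would do.

```python
from urllib.robotparser import RobotFileParser


def preview(robots_txt, agents, url="/"):
    """Map each user-agent name to whether robots_txt lets it fetch url.

    Hypothetical helper for illustration only.
    """
    parser = RobotFileParser()
    # parse() takes the file as a list of lines; no network access is needed.
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


# Sample file: one named robot allowed everywhere, everyone else blocked.
SAMPLE = """\
User-agent: ia_archiver
Disallow:

User-agent: *
Disallow: /
"""

print(preview(SAMPLE, ["ia_archiver", "Googlebot"]))
```

Running the same check against the default file would also give the "quickly turn back on the default" button something to show the user before they commit.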
no subject
I dunno. I presume it works the same for community accounts as it does for personal accounts, at least on the robot side? Would it be the same for both on the DW side too?
no subject
To get a semblance of privacy, journal entries have to have controlled access. And even then, it's not totally reliable.