flo: A lovely, purple-shaded teapot. (Default)
flo ([personal profile] flo) wrote in [site community profile] dw_suggestions 2011-06-14 12:24 pm

Revamping the 'Export Journal' tool to allow exporting in larger time intervals

Title:
Revamping the 'Export Journal' tool to allow exporting in larger time intervals

Area:
exporting, interoperability, backups

Summary:
Revamping the 'Export Journal' tool at http://www.dreamwidth.org/export so that you can export entries from your journal on a yearly basis, and also export it all as one giant file. Export options on that page could also use some clarification so that you understand why or why not to tick them.

Description:
Basically, I want the basic export tool to let me export my journal as XML on a yearly basis as well, or even all at once. Currently, the export tool at http://www.dreamwidth.org/export only allows you to export on a monthly basis, which means lots and lots of repetitive action if you post even occasionally on your journal. As far as I know, people who want to back up their Dreamwidth journals generally resort to something like LJ Archive, which is thorough, but can be inexplicably buggy, and is not maintained by or otherwise affiliated with Dreamwidth. I would much prefer being able to export my journal with an on-site tool that I know will be fixed or improved as necessary, and I would be willing to go without exported comments if the Dreamwidth tool provides a complete set of all my entries.

It would also be rather nice if the more esoteric options on the tool's page had more explanatory labels. Currently, this is approximately what the "Fields" options look like:

- ID Number
- Event Time (from your clock) *
- Log Time (from system's clock) *
- Subject
- Event *
- Security Level
- Allow Mask *
- Current Mood & Music

Some of these fields (the unstarred ones) are fairly self-explanatory. The rest are somewhat confusing to me: "Event Time" and "Log Time" seem self-explanatory at first glance, but then you also have "Event", and before I actually looked at an export file, I honestly wasn't sure what all those fields would mean when taken together. "Allow Mask" is also really badly named, and probably should not be an option at all, since that field is basically the bit that is used to determine whether an entry is public or access-only. Not exporting that field just removes part of the distinction made between private and public entries in the export file, which really doesn't serve any purpose since the tool currently exports everything. A fine-grained option to control which entries are exported would be nice, but it probably should not be at all related to which fields get exported.

Another confusing option on the page is: "Don't translate between encodings". Is this something an end user is supposed to just know? I know what encodings are, and that option does not really tell me exactly what it does. Does ticking the option mean that it will just leave any entries that are not encoded as X (where X is the encoding you selected from the preceding combo box) as they currently are? Is this an option that's more for debugging or troubleshooting export files that don't import elsewhere correctly? Should it even be on the page at all?

As far as using the tool goes...well. There is nothing on the page that indicates that the tool will only export by month, so unless you got to it from the FAQ, you won't even know about that. When I tried putting in just the year to see if I would get an explanatory error, it basically just gave me an empty CSV or XML file named after the year. Putting in the wrong month/year combination also gave me an empty file, which, remember, I won't know about until I actually look within the file. Considering that the tool is supposed to let you export entries, I really think it should let you know when your settings mean that you will not be exporting any entries at all.

In summary, what I want is for the export tool to
a) Allow you to export entries by year
b) Allow you to export all your journal's entries at once
c) Provide understandable and useful options that will not make your export useless or incomplete in ways that you did not mean (e.g. if, for some reason, you do not select "Event", none of your entry text will be added to the export file)
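
To make points a) and b) concrete, here is a minimal sketch of the selection logic, not Dreamwidth's actual code. The entry dictionaries, the `eventtime` key, `select_entries`, and `EmptyExportError` are all hypothetical names made up for illustration. The idea is that one function covers a month, a whole year, or the entire journal, and refuses to hand back an empty result silently.

```python
from datetime import datetime


class EmptyExportError(Exception):
    """Raised when the chosen period contains no entries (hypothetical)."""


def select_entries(entries, year=None, month=None):
    """Return entries for one month, one year, or (no arguments) everything."""
    if month is not None and year is None:
        raise ValueError("a month needs a year to go with it")
    selected = [
        e for e in entries
        if (year is None or e["eventtime"].year == year)
        and (month is None or e["eventtime"].month == month)
    ]
    if not selected:
        # Tell the user up front instead of producing an empty file.
        raise EmptyExportError("No entries match; nothing would be exported.")
    return selected


# Made-up sample data standing in for a journal.
entries = [
    {"itemid": 1, "eventtime": datetime(2010, 12, 31, 23, 0), "subject": "Old year"},
    {"itemid": 2, "eventtime": datetime(2011, 1, 5, 9, 30), "subject": "New year"},
    {"itemid": 3, "eventtime": datetime(2011, 6, 14, 12, 24), "subject": "Export"},
]

print(len(select_entries(entries, year=2011, month=6)))  # one month
print(len(select_entries(entries, year=2011)))           # a whole year
print(len(select_entries(entries)))                      # the whole journal
```

Asking for a period with no entries (say, `year=2009` here) raises `EmptyExportError` rather than returning an empty list, which is the behaviour I'd want the page to surface as a message.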

If there is a question of the tool putting strain on the servers, the number of times that free users can export their journals can be restricted (e.g. once a month, once every couple months, etc).

Poll #7709 Revamping the 'Export Journal' tool to allow exporting in larger time intervals
Open to: Registered Users, detailed results viewable to: All, participants: 46


This suggestion:

Should be implemented as-is.
21 (45.7%)

Should be implemented with changes. (please comment)
9 (19.6%)

Shouldn't be implemented.
0 (0.0%)

(I have no opinion)
16 (34.8%)

(Other: please comment)
0 (0.0%)

msilverstar: (corset)

[personal profile] msilverstar 2011-08-09 02:25 am (UTC)
Please don't remove the flag for public vs. more private, as the setting for a particular entry means a lot to me, even in an archive. It's hugely important if someone's going to export to another system.

In fact, I'd say make the access field correspond to the entry options, so export whether it's public, private, locked to accessors, or locked to custom access groups (and if so, which ones).
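
A rough sketch of what that could look like, under some loud assumptions: that the export keeps LiveJournal-style fields, where "Security Level" is one of `public`, `private`, or `usemask`, and "Allow Mask" is a bitmask in which bit 0 means everyone on the access list and bits 1 to 30 each stand for one custom group. The function name and the group-name lookup are hypothetical.

```python
def describe_access(security, allowmask=0, group_names=None):
    """Turn a security level plus allow mask into a readable access label.

    Assumes LiveJournal-style semantics: bit 0 = whole access list,
    bits 1-30 = individual custom access groups.
    """
    group_names = group_names or {}
    if security in ("public", "private"):
        return security
    if security != "usemask":
        raise ValueError(f"unknown security level: {security!r}")
    if allowmask & 1:
        return "access list"
    groups = [
        group_names.get(bit, f"group {bit}")
        for bit in range(1, 31)
        if allowmask & (1 << bit)
    ]
    return "custom: " + (", ".join(groups) if groups else "none")


print(describe_access("public"))
print(describe_access("usemask", 1))
print(describe_access("usemask", 0b110, {1: "family"}))
```

This also shows why dropping the mask from an export loses information: without it, every `usemask` entry looks the same, and you can no longer tell which groups it was locked to.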

zvi: self-portrait: short, fat, black dyke in bunny slippers (Default)

[personal profile] zvi 2011-08-09 03:52 pm (UTC)
I believe what the OP proposed was removing the option to not include that information in the export, i.e. to force all exports to include that information.
Edited (word choice) 2011-08-09 15:54 (UTC)
solitarywalker: (Default)

[personal profile] solitarywalker 2011-08-09 02:45 am (UTC)
You seem to be making a lot of suggestions here.

I definitely agree that it'd be helpful to be able to export by year instead of having to go by month.

Rather than adding explanations to the exportable fields, why not just offer a link to the relevant FAQ? Easy access for people with questions, out of the way for people who already know what they're doing.

I don't think any labels should be changed, but if they are, please change them only on the export page, and NOT in the header row of the export files themselves.

I strongly disagree about removing options (e.g. Allow Mask). If it's not useful to you, fine, don't export it. But don't keep everyone else from being able to use it.

Showing an error message if the called-for export is empty makes sense.

If exporting a year's worth of entries is significantly more resource-intensive than exporting twelve months individually (unlikely), and so many people use/abuse it that it's causing a problem, make annual exports a paid feature, and continue allowing everyone to do monthly exports as they can now, without frequency limitations.
solitarywalker: (Default)

[personal profile] solitarywalker 2011-08-10 11:55 pm (UTC)
I can imagine someone caring only about the main text of the entry and not about the security level, e.g. for the purpose of printing a book of one's journal. Personally, I want the allow mask because I'm making a backup. Whether something is "useless" depends on the intended usage.

Much as I hate tooltips, your suggested usage of them here makes sense.

Labels probably shouldn't be changed; I'm guessing it'd be nontrivial, and anyway there's no need if tooltips (or whatever) are implemented to explain them. But to answer your question: Changing the labels of the header rows would disrupt importing into other systems that use the header rows to match things up. (I store all my old entries, LJ and here, in a database; surely I'm not the only one.) Adding new columns (with appropriate headers) isn't disruptive; the data in the unrecognized column would just be ignored (or support could be added for it).
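
For what it's worth, here is a minimal sketch of the kind of header-matching import I mean. The column names are illustrative guesses, not necessarily the real export headers, and `read_export` is a made-up helper.

```python
import csv
import io

# Columns this (hypothetical) importer knows how to store.
KNOWN_COLUMNS = ("itemid", "eventtime", "subject", "event", "security", "allowmask")


def read_export(csv_text):
    """Match columns by header name; unknown columns are simply ignored."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [{name: row.get(name, "") for name in KNOWN_COLUMNS} for row in reader]


old = "itemid,eventtime,subject,event\n1,2011-06-14 12:24:00,Hello,First post\n"
# Same data with an extra, unrecognised column added in the middle.
added = "itemid,location,eventtime,subject,event\n1,Rome,2011-06-14 12:24:00,Hello,First post\n"
# Same data with one header renamed.
renamed = "itemid,eventtime,title,event\n1,2011-06-14 12:24:00,Hello,First post\n"

print(read_export(old) == read_export(added))  # extra column changes nothing
print(read_export(renamed)[0]["subject"])      # renamed header: subject is lost
```

The added column is harmless because lookup is by name, not position, while the renamed header quietly produces rows with an empty subject, which is exactly the breakage I'd like to avoid.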

[personal profile] voldsom 2011-08-09 06:22 am (UTC)
I'd really like to see better facilities for exporting journals, and I hope that it's on Dreamwidth's wish list. The Export Journal tool is a reasonable baseline, but it needs a little polish.

I agree with the suggestion of being able to export a year at a time. That's something that I would find very useful. I'd like to retain the ability to export monthly, but in terms of catching up with my backlog and also in terms of permanent archiving, I'd like to see a year's worth of data, though I know this would cause additional load.

I'd be considerably more cautious with a full journal export, because it would need much more consideration for how people are going to use it. Unless you're going to re-implement it from the ground up, you risk the scenario of people backing up their entire journal on a monthly basis, which isn't actually beneficial to anybody.

The UI is, I believe, exactly the same as LiveJournal's; it is half-assed and has way too much of a technical emphasis. It needs a massive usability pass. A little more on-page help clarifying the information, or even direct linking to the FAQ, would help here, but the whole fields section needs a little love. I don't think there are any fields that actively need removing, though.

The other thing that would be nice would be to change the Proceed option to a two-stage option. So you hit Verify, and you get additional on-screen information indicating the number of entries that will be exported, giving the user needed feedback before they start exporting empty files, and then Proceed to actually generate and save the file. (It means an additional db hit, but I hope that's a fair trade-off for the added usability.)
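
Roughly what I have in mind, as a sketch only: `verify` and `proceed` are hypothetical names, and the list of dictionaries stands in for whatever the real database query would return.

```python
from datetime import date


def verify(entries, year, month=None):
    """Stage 1 (Verify): summarise what would be exported, without building a file."""
    dates = sorted(
        e["date"] for e in entries
        if e["date"].year == year and (month is None or e["date"].month == month)
    )
    return {
        "count": len(dates),
        "first": dates[0] if dates else None,
        "last": dates[-1] if dates else None,
    }


def proceed(summary):
    """Stage 2 (Proceed): only allowed once stage 1 has found something."""
    if summary["count"] == 0:
        raise ValueError("Nothing to export for this period.")
    return f"exporting {summary['count']} entries"


# Made-up sample data.
entries = [
    {"date": date(2011, 8, 9)},
    {"date": date(2011, 8, 11)},
    {"date": date(2010, 3, 1)},
]

summary = verify(entries, 2011, 8)
print(summary["count"], summary["first"], summary["last"])
print(proceed(summary))
```

The extra query in stage 1 only needs dates, not entry bodies, so it should be cheap compared with generating the file.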
susanreads: my avatar, a white woman with brown hair and glasses (Default)

[personal profile] susanreads 2011-08-09 11:48 am (UTC)
I voted "no opinion" because I don't understand the question, but what you say here, especially the two stage option, sounds like a jolly good idea.

[personal profile] feathertail 2011-08-09 05:08 pm (UTC)
What if instead of a two-stage thing we just hid the advanced options behind a cut tag or similar, the way the proposed entry post overhaul does? That's my +1 with changes.

I also think there should be the option to do a yearly or complete journal export. There already is, technically; just roll your own Dreamwidth-code journal and import everything from this one. As long as that's an option that we encourage people to have, well ...

I think that the way things are right now is a holdover from LJ days. Not sure it's the best way to manage things going forward, but it might depend on server load and what features we want to make paid-only options. Making backups really tedious for free users doesn't seem very Dreamwidth-y though.
Edited (Typo) 2011-08-09 17:11 (UTC)

[personal profile] voldsom 2011-08-09 05:53 pm (UTC)
Thank you.

I see your point with the whole-journal backup. I think my concern is not so much from a load perspective issue, but... It feels wrong from a [wannabe] developer perspective. I guess the issue is that we have no control over the file we're writing to. If we did, or perhaps in some other really clever way, you could make use of the existing sync options so that you only export changes (I know external tools like ljarchive use this, I don't know if the web export...which probably pre-dates it, could or even should consider this).

It feels like there should be an alternative way of doing it, but I guess that's the point, there. It would be an alternative way. :) I think if the existing export was to allow full journals, then from the perspective of one who writes too much, I'd want to add to my entry count display an estimate of the size of the file that would be generated... And, if possible (reaching for the stars) the ability to compress the file and download it in a compressed form, to mitigate data transfer issues on both sides.
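
As a sketch of the size-plus-compression idea (not how the real exporter works): the XML layout and `render_xml` below are invented for illustration, and the size is measured by actually rendering the file, where a real tool would more likely estimate it from stored entry lengths.

```python
import gzip
from xml.sax.saxutils import escape


def render_xml(entries):
    """Render entries as a small XML document (made-up layout) and return bytes."""
    parts = ["<?xml version='1.0' encoding='utf-8'?>", "<journal>"]
    for e in entries:
        parts.append(
            "<entry><subject>%s</subject><event>%s</event></entry>"
            % (escape(e["subject"]), escape(e["event"]))
        )
    parts.append("</journal>")
    return "\n".join(parts).encode("utf-8")


# Made-up journal: 200 wordy, repetitive entries.
entries = [
    {"subject": f"Day {n}", "event": "Tea & biscuits again. " * 20}
    for n in range(200)
]

raw = render_xml(entries)
packed = gzip.compress(raw)
print(f"{len(entries)} entries, {len(raw)} bytes raw, {len(packed)} bytes gzipped")
```

Journal text tends to compress well, so shipping the gzipped form would cut transfer on both ends, and the raw size could be shown next to the entry count before the user commits to the download.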
denise: Image: Me, facing away from camera, on top of the Castel Sant'Angelo in Rome (Default)

[staff profile] denise 2011-08-10 12:28 am (UTC)
Just FYI, LJ Archive uses the same export functionality that our importer uses, so yeah, it has access to comments. (Friend data too, I think, via a different API. I'm pretty sure the only thing we have to screen-scrape from LJ is the profile.)

There are two "export" functions: the one at http://www.dreamwidth.org/export (which is really old, really bad UI, and not very fully featured) and the export/backup API, which is only accessible via client. The on-site export function doesn't do incremental; the export/backup API does.
cesy: "Cesy" - An old-fashioned quill and ink (Default)

[personal profile] cesy 2011-08-10 06:27 pm (UTC)
Importing from a backup file carries the risk that you can edit that backup file manually to change other people's words, which was the original reason for not allowing import from the export formats that already exist.
denise: Image: Me, facing away from camera, on top of the Castel Sant'Angelo in Rome (Default)

[staff profile] denise 2011-08-11 03:43 pm (UTC)
I don't think there'd be a problem with exporting comments to a file, just with re-importing them once they'd been exported.
matgb: Artwork of 19th century upper class anarchist, text: MatGB (Default)

[personal profile] matgb 2011-08-11 06:10 pm (UTC)
I've always wondered why that objection is there anyway: it effectively means an import from Blogger or WordPress is impossible as well, as a blog owner can edit anything on their blog on both those platforms. If the plan continues to be to allow imports from them, then the difference between that and importing from file is tiny. Sure, there might be some abuse cases, but they'd be done by people who'd do abusive stuff anyway.
azurelunatic: Vivid pink Alaskan wild rose. (Default)

[personal profile] azurelunatic 2011-08-11 06:17 pm (UTC)
I believe the traditionally-suggested thing involves labeling such imports so that their source is apparent and clear (like an "(imported from file) (?)" note), thus allowing anyone who saw it to make up their own mind about how much trust to put in the words as published, complete with nice prominent FAQ link about what exactly "imported from file" meant and implied, for those who did not already know on their own.

Not sure how much that would be wp-beans to people in possession of files to import from, however. Oh well.
Edited ([citation needed]) 2011-08-11 18:20 (UTC)
denise: Image: Me, facing away from camera, on top of the Castel Sant'Angelo in Rome (Default)

[staff profile] denise 2011-08-11 06:18 pm (UTC)
Comments from those services will likely be a different case, as they're essentially anonymous comments and the commenter has no expectation of being able to manage their comment once it's been made. (Not to mention, yeah, the owner of the blog can edit comments/change comments/yadda.) So, we'll likely import those as anonymous, not as authenticated; they're a different use case and a different community expectation.
montuos: cartoon portrait of myself (Default)

[personal profile] montuos 2011-08-09 05:36 pm (UTC)
Wow. I didn't even realize that there was an export tool! I am all for revamping it for increased usability, including all of the following:

In summary, what I want is for the export tool to
a) Allow you to export entries by year
b) Allow you to export all your journal's entries at once
c) Provide understandable and useful options that will not make your export useless or incomplete in ways that you did not mean (e.g. if, for some reason, you do not select "Event", none of your entry text will be added to the export file)


Details of c) as suggested in other comments before I got here:
c1) Include tooltips on each option and field to explain what it does or is
c2) Include a direct link to the FAQ for more detailed help
c3) Include a verification step to check whether the selected options result in an empty file

And I'd like to add the following to the wishlist on my own account (basically, anything you can import, you should be able to export):
d) Allow you to export the full richness of entries (if the exporter already does this, it isn't at all clear that it does):
d1) icon, tags, mood, music, location, age restriction and reason, etc.
d2) comments (possibly translating users to openid?)
e) Allow you to export your journal basics
e1) bio
e2) all your icons
e3) all the people you're subscribed to, and those you give access to (possibly translating users to openid?)
e4) all your custom access groups (possibly translating users to openid?)

And yes, I would expect some of these options to be paid features if implemented at all.

But seriously, the UI desperately needs to be translated to plain English so Joe Not-a-Geek can use it too!
aedifica: Me with my hair as it is in 2020: long, with blue tips (Default)

[personal profile] aedifica 2011-08-10 08:15 pm (UTC)
I'm in favor of an option to back up a whole year at a time, and no opinion on the rest.