Revamping the 'Export Journal' tool to allow exporting in larger time intervals
Title:
Revamping the 'Export Journal' tool to allow exporting in larger time intervals
Area:
exporting, interoperability, backups
Summary:
Revamping the 'Export Journal' tool at http://www.dreamwidth.org/export so that you can export entries from your journal on a yearly basis, and also export it all as one giant file. Export options on that page could also use some clarification so that you understand why or why not to tick them.
Description:
Basically, I want the basic export tool to allow me to export my journal as XML on a yearly basis as well, or even just all at once. Currently, the export tool at http://www.dreamwidth.org/export only allows you to export on a monthly basis, which means lots and lots of repetitive action if you post even occasionally on your journal. As far as I know, people who want to back up their Dreamwidth journals generally resort to something like LJ Archive, which is thorough, but can be inexplicably buggy, and is not maintained by or otherwise affiliated with Dreamwidth. I would much prefer being able to export my journal with an on-site tool that I know will be fixed or improved as necessary, and I would be willing to go without exported comments if the Dreamwidth tool will provide a complete set of all my entries.
It would also be rather nice if the more esoteric options on the tool's page had more explanatory labels. Currently, this is approximately what the "Fields" options look like:
- ID Number
- Event Time (from your clock) *
- Log Time (from system's clock) *
- Subject
- Event *
- Security Level
- Allow Mask *
- Current Mood & Music
Some of these fields (the unstarred ones) are fairly self-explanatory. The rest are somewhat confusing to me: "Event Time" and "Log Time" seem self-explanatory at first glance, but then you also have "Event", and before I actually looked at an export file, I honestly wasn't sure what all those fields would mean when taken together. "Allow Mask" is also really badly named, and probably should not be an option at all, since that field is basically the bit that is used to determine whether an entry is public or access-only. Not exporting that field just removes part of the distinction between private and public entries in the export file, which doesn't serve any purpose since the tool currently exports everything. A fine-grained option to control which entries are exported would be nice, but it probably should not be tied to which fields get exported.
Another confusing option on the page is: "Don't translate between encodings". Is this something an end user is supposed to just know? I know what encodings are, and that option does not really tell me exactly what it does. Does ticking the option mean that it will just leave any entries that are not encoded as X (where X is the encoding you selected from the preceding combo box) as they currently are? Is this an option that's more for debugging or troubleshooting export files that don't import elsewhere correctly? Should it even be on the page at all?
As far as using the tool goes...well. There is nothing on the page that indicates that the tool will only export by month, so unless you got to it from the FAQ, you won't even know about that. When I tried putting in just the year to see if I would get an explanatory error, it basically just gave me an empty CSV or XML file named after the year. Putting in the wrong month/year combination also gave me an empty file, which, remember, I won't know about until I actually look within the file. Considering that the tool is supposed to let you export entries, I really think it should let you know when your settings mean that you will not be exporting any entries at all.
In summary, what I want is for the export tool to
a) Allow you to export entries by year
b) Allow you to export all your journal's entries at once
c) Provide understandable and useful options that will not make your export useless or incomplete in ways that you did not mean (e.g. if, for some reason, you do not select "Event", none of your entry text will be added to the export file)
If there is a question of the tool putting strain on the servers, the number of times that free users can export their journals can be restricted (e.g. once a month, once every couple months, etc).
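For what it's worth, until something like (a) exists, twelve monthly files can be stitched together by hand. Here is a minimal sketch of that, assuming each monthly XML export has a single root element whose direct children are the entries. The `livejournal`, `entry`, and `itemid` names in the sample data are assumptions for illustration, not taken from a real export file.

```python
import xml.etree.ElementTree as ET


def merge_exports(xml_documents):
    """Combine several monthly export documents into one element tree.

    Assumes (hypothetically) that every document uses the same root tag
    and that each direct child of the root is one journal entry.
    """
    merged_root = None
    for text in xml_documents:
        root = ET.fromstring(text)
        if merged_root is None:
            # Reuse the root tag of the first document for the merged file.
            merged_root = ET.Element(root.tag)
        merged_root.extend(list(root))
    if merged_root is None:
        raise ValueError("no export documents given")
    return merged_root


# Made-up monthly exports standing in for real downloaded files.
jan = "<livejournal><entry><itemid>1</itemid></entry></livejournal>"
feb = (
    "<livejournal>"
    "<entry><itemid>2</itemid></entry>"
    "<entry><itemid>3</itemid></entry>"
    "</livejournal>"
)
year = merge_exports([jan, feb])
```

The point is only that a yearly file is the monthly files concatenated under one root, so the on-site tool would not need a new format to offer it.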
This suggestion:
Should be implemented as-is.
21 (45.7%)
Should be implemented with changes. (please comment)
9 (19.6%)
Shouldn't be implemented.
0 (0.0%)
(I have no opinion)
16 (34.8%)
(Other: please comment)
0 (0.0%)
no subject
In fact, I'd say make the access field correspond to the entry options, so export whether it's public, private, locked to accessors, or locked to custom access groups (and if so, which ones).
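Roughly what I mean, as a sketch: it assumes the LiveJournal-style convention where an entry has a security value of `public`, `private`, or `usemask`, and where bit 0 of the mask means "everyone on the access list" while higher bits stand for custom groups. I haven't checked that Dreamwidth's export uses exactly this layout, and `describe_access` and the group names are made up for illustration.

```python
def describe_access(security, allowmask=0, group_names=None):
    """Turn a security value plus bitmask into a readable access label.

    Assumed layout (not confirmed against Dreamwidth): bit 0 set means
    the whole access list; bits 1-30 each stand for one custom group.
    """
    group_names = group_names or {}
    if security in ("public", "private"):
        return security
    if allowmask & 1:
        return "locked to access list"
    bits = [b for b in range(1, 31) if allowmask & (1 << b)]
    if not bits:
        # A mask with no bits set lets nobody in, so treat it as private.
        return "private"
    names = [group_names.get(b, f"group {b}") for b in bits]
    return "locked to custom groups: " + ", ".join(names)


label = describe_access("usemask", 0b110, {1: "family"})
```

An export column holding that label would be a lot easier to read than the raw mask number.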
no subject
I definitely agree that it'd be helpful to be able to export by year instead of having to go by month.
Rather than adding explanations to the exportable fields, why not just offer a link to the relevant FAQ? Easy access for people with questions, out of the way for people who already know what they're doing.
I don't think any labels should be changed, but if they are, please change them only on the export page, and NOT in the header row of the export files themselves.
I strongly disagree about removing options (e.g. Allow Mask). If it's not useful to you, fine, don't export it. But don't keep everyone else from being able to use it.
Showing an error message if the called-for export is empty makes sense.
If exporting a year's worth of entries is significantly more resource-intensive than exporting twelve months individually (unlikely), and so many people use/abuse it that it's causing a problem, make annual exports a paid feature, and continue allowing everyone to do monthly exports as they can now, without frequency limitations.
no subject
In that vein, I feel like you shouldn't need an FAQ to understand how to get a basic, complete export out of any system. Why not add the relevant information to the actual form or make the options there self-explanatory instead of putting all that on a separate page? If the goal is to make sure that people understand how to use it, most of the instructions should be right there in front of them when they are trying to use it. As for how to mask some of the information for advanced users, I'm thinking of the little question marks by fields in the update form, the ones that show helpful tooltips when you hover over them. That would be a better way to handle extra explanations, and it would be more usable than having to go elsewhere.
Also, why change labels on the form but not in the export file? I don't think the current export formats are actually compatible with any blogging services (Wordpress, etc), and I can't think of any clients that actually use it to produce exports or anything. And if they did implement a more granular access field that distinguishes between levels of access (access only, custom filters, etc) they would have to change things in the export file to match. Basically I'm just curious as to why you think those should be kept the same.
no subject
Much as I hate tooltips, your suggested usage of them here makes sense.
Labels probably shouldn't be changed; I'm guessing it'd be nontrivial, and anyway there's no need if tooltips (or whatever) are implemented to explain them. But to answer your question: Changing the labels of the header rows would disrupt importing into other systems that use the header rows to match things up. (I store all my old entries, LJ and here, in a database; surely I'm not the only one.) Adding new columns (with appropriate headers) isn't disruptive; the data in the unrecognized column would just be ignored (or support could be added for it).
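To illustrate the header-row point, here is a minimal sketch of an importer that matches columns by name. The column names are hypothetical stand-ins, not the real export headers.

```python
import csv
import io

# Hypothetical header names that this importer knows how to store.
KNOWN_COLUMNS = {"itemid", "subject", "event"}


def load_entries(csv_text):
    """Read an export and keep only the columns the importer recognizes."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        {name: value for name, value in row.items() if name in KNOWN_COLUMNS}
        for row in reader
    ]


old_format = "itemid,subject,event\n1,Hello,First post\n"
# An added column is silently ignored by the importer.
added_column = "itemid,subject,access_detail,event\n1,Hello,public,First post\n"
# A renamed header means the importer no longer finds the entry text.
renamed_header = "itemid,subject,entry_text\n1,Hello,First post\n"

same = load_entries(old_format) == load_entries(added_column)
lost = "event" not in load_entries(renamed_header)[0]
```

With an importer like this, the added column changes nothing, while the renamed header quietly drops the entry text until the importer is updated.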
no subject
Now, the reason why I want them to change the labels is that they are confusing at first glance. There wouldn't be much of a need for tooltips if the labels of the fields they are changing were made to be descriptive. I'm not talking about changing labels in their gigantic 'Entries' table or whatever, mind, I'm talking about the labels on the form and possibly also the ones in the export file, since those are what the users will see.
I do understand that changing the header labels in the export file would be disruptive for personal backup systems. That said, I still think that changing them would be fine so long as they let us know about it when they do. Compatibility with external systems like other blogging services and so forth would be more of a priority, IMO, but of course it's up to Dreamwidth to decide what they want to do. Also, if you use a database for backup, why wouldn't it be easy to change field names? Just curious here, since the databases I can think of that you would set up for that would allow you to change stuff like that pretty easily.
no subject
I agree with the suggestion of being able to export a year at a time. That's something that I would find very useful. I'd like to retain the ability to export monthly, but in terms of catching up with my backlog and also in terms of permanent archiving, I'd like to see a year's worth of data, though I know this would cause additional load.
I'd be considerably more cautious with a full journal export, because it would need much more consideration for how people are going to use it. Unless you're going to re-implement it from the ground up, you risk the scenario of people backing up their entire journal on a monthly basis, which isn't actually beneficial to anybody.
The UI is, I believe, exactly the same one as from LiveJournal, and it has way too much of a technical emphasis. It needs a massive usability pass. A little more on-page help clarifying the information, or even direct linking to the FAQ, would help here, but the whole fields section needs a little love. I don't think there are any fields that actively need removing, though.
The other thing that would be nice would be to change the Proceed option to a two-stage option. So you hit Verify, and you get additional on-screen information indicating the number of entries that will be exported, giving the user needed feedback before they start exporting empty files, and then Proceed to actually generate and save the file. (It means an additional db hit, but I hope that's a fair trade-off for the added usability.)
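A rough sketch of what the Verify stage could check, assuming the tool can get at a list of entry dates for the journal. The function names and messages are made up; this is not how the current exporter works.

```python
from datetime import date


def count_matching(entry_dates, year, month=None):
    """Count entries in the requested year, or year and month if given."""
    return sum(
        1
        for d in entry_dates
        if d.year == year and (month is None or d.month == month)
    )


def verify(entry_dates, year, month=None):
    """Return the message a Verify step could show before Proceed."""
    matches = count_matching(entry_dates, year, month)
    if matches == 0:
        return "No entries match these settings; the export would be empty."
    return f"Entries to export: {matches}"


# Made-up posting dates for illustration.
posted = [date(2009, 5, 1), date(2009, 5, 20), date(2010, 1, 3)]
message_ok = verify(posted, 2009, 5)
message_empty = verify(posted, 2009, 6)
```

The count is the extra db hit; everything else is just showing the number to the user before any file is generated.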
no subject
I do disagree about whole-journal backups not being useful, though. If you're backing up your journal regularly just to make sure you have everything you've posted, it is easier to grab the whole thing in one go rather than piecing together which months you missed or risking overlap when you grab a new month. If it would be too much load for the site to allow total backups for everyone, then that should probably be made a paid option.
no subject
I also think there should be the option to do a yearly or complete journal export. There already is, technically; just roll your own Dreamwidth-code journal and import everything from this one. As long as that's an option that we encourage people to have, well ...
I think that the way things are right now is a holdover from LJ days. Not sure it's the best way to manage things going forward, but it might depend on server load and what features we want to make paid-only options. Making backups really tedious for free users doesn't seem very Dreamwidth-y though.
no subject
Also, a cut tag would be great as well for hiding some of the instructions. It would blend in nicely, and allow the second page of the process to just be given over to reporting errors, showing you what exactly you should be getting in the export file, etc.
no subject
I see your point with the whole-journal backup. I think my concern is not so much from a load perspective issue, but... It feels wrong from a [wannabe] developer perspective. I guess the issue is that we have no control over the file we're writing to. If we did, or perhaps in some other really clever way, you could make use of the existing sync options so that you only export changes (I know external tools like ljarchive use this, I don't know if the web export...which probably pre-dates it, could or even should consider this).
It feels like there should be an alternative way of doing it, but I guess that's the point, there. It would be an alternative way. :) I think if the existing export were to allow full journals, then from the perspective of one who writes too much, I'd want to add to my entry count display an estimate of the size of the file that would be generated... And, if possible (reaching for the stars), the ability to compress the file and download it in a compressed form, to mitigate data transfer issues on both sides.
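As a minimal sketch of the size estimate plus compression idea, using Python's standard `gzip` module on a made-up export string (the real exporter isn't written in Python, so this only shows the principle):

```python
import gzip


def compress_export(xml_text):
    """Return the raw and gzip-compressed bytes of an export document."""
    raw = xml_text.encode("utf-8")
    packed = gzip.compress(raw)
    return raw, packed


# Made-up, repetitive entry text standing in for a large journal export.
sample = "<entry><event>" + "la la la " * 2000 + "</event></entry>"
raw, packed = compress_export(sample)

# Both sizes could be shown next to the entry count before download.
size_report = f"uncompressed: {len(raw)} bytes, compressed: {len(packed)} bytes"
```

Journal text is mostly prose and markup, which generally compresses well, so showing both numbers up front would tell the user what they're in for.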
no subject
As for tools like ljarchive, I think they just go through /interface/flat with the user's login credentials. They certainly can't get comments through the export API, and I don't think friend/community data can be gotten that way either. With tools like that, you can already sync new entries and back them up individually, so it's something the right client is able to do.
Lastly, you could certainly tweak the exporter to allow only exporting changes, but then you'd have the question of how users would apply those changes. If you have them copying or merging the new data into their old file, you might as well have just given them the whole file right away. Less confusing that way, IMO. Yeah, it would require compression for large journals, but then even a yearly export could possibly end up needing that. Even if you don't write a ton, all those posts can add up; when last I went through my old journal, I had waaay more than I expected :P
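To show what "applying those changes" would involve on the user's side, here is a small sketch that merges a changes-only export into an older full export, keyed on entry ID. The dictionary layout and the `itemid` key are assumptions for illustration.

```python
def apply_changes(full_export, changes):
    """Merge changed or new entries into an older full export.

    Entries are plain dicts keyed by a hypothetical 'itemid'; a changed
    entry replaces the old copy, and a new entry is added.
    """
    merged = {entry["itemid"]: entry for entry in full_export}
    for entry in changes:
        merged[entry["itemid"]] = entry
    return sorted(merged.values(), key=lambda entry: entry["itemid"])


old_backup = [
    {"itemid": 1, "subject": "First"},
    {"itemid": 2, "subject": "Second"},
]
delta = [
    {"itemid": 2, "subject": "Second (edited)"},
    {"itemid": 3, "subject": "Third"},
]
current = apply_changes(old_backup, delta)
```

It's not hard for a script, but it's a step most users would have to do by hand, which is why handing over the whole file seems less confusing.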
no subject
There are two "export" functions: the one at http://www.dreamwidth.org/export (which is really old, really bad UI, and not very fully featured) and the export/backup API, which is only accessible via client. The on-site export function doesn't do incremental; the export/backup API does.
no subject
I don't know where I stand wrt the severity of the issue of being able to change/edit people's comments before an import, but it does strike me as an artifact of how comments are treated here in general. They're in this nebulous space where they're technically owned by the commenter, but not actually transportable with them or directly under their control. That said, taking just my entries with me in useful formats would be an improvement over what I can do now.
no subject
Not sure how much that would be wp-beans to people in possession of files to import from, however. Oh well.
no subject
Details of c) as suggested in other comments before I got here:
c1) Include tooltips on each option and field to explain what it does or is
c2) Include a direct link to the FAQ for more detailed help
c3) Include a verification step to check whether the selected options result in an empty file
And I'd like to add the following to the wishlist on my own account (basically, anything you can import, you should be able to export):
d) Allow you to export the full richness of entries (if the exporter already does this, it isn't at all clear that it does):
d1) icon, tags, mood, music, location, age restriction and reason, etc.
d2) comments (possibly translating users to openid?)
e) Allow you to export your journal basics
e1) bio
e2) all your icons
e3) all the people you're subscribed to, and those you give access to (possibly translating users to openid?)
e4) all your custom access groups (possibly translating users to openid?)
And yes, I would expect some of these options to be paid features if implemented at all.
But seriously, the UI desperately needs to be translated to plain English so Joe Not-a-Geek can use it too!