Exporting LiveJournal

For a long time now, I’ve wanted to export my LiveJournal account to PDF files so that I have a local copy of it. But LiveJournal has no export feature. There are sites like BlogBooker, which (for a fee and my LJ login) will generate PDFs for me; there are also other sites which (if I give them my LJ login) will import my LJ posts and comments. But I don’t trust any of those services to get everything. Plus, I wanted to find a solution on my own.

Attempt #1: use Automator (a macOS tool) to go through the site, scrape the URLs, visit the pages, and print them. And Automator has built-in functions to do all this! Problem is, LJ requires a login for displaying my restricted-access posts, and Automator can’t log in, so it misses a lot of my posts. And Automator insists on doing all the work itself instead of going through Safari. Oh well.

Attempt #2: use Selenium to control Safari to go through the site and print the pages. I’ve used Selenium at work for automated web page tests; it’s a powerful tool. Only, it insists on launching a clean instance of Safari every time (not reusing my login). I tried binding it to my existing Safari instance, but that just throws errors. Oh well.

Attempt #3: maybe I can write an AppleScript to send JavaScript to Safari to scrape links from pages, visit them, and select File -> Export as PDF… automatically. And actually, as I was working on this, I had a better idea: why scrape links at all? I could just point AppleScript at the first post in my LJ, code it to use the Export as PDF… menu item, and code it to click the “next entry” link on the page. It could then sequentially go through my entire journal, exporting posts as it went.

So that’s the approach I decided to go with, and it worked fine.

Here’s the AppleScript I came up with, in case it helps anyone else out there. I ran it from Script Editor, driving Safari, on macOS Catalina.

-- This script will export LiveJournal pages to PDF, one by one.
-- Start on the first LiveJournal entry page that you want to export to PDF.
-- (An entry, not a date. The URL needs to end in ".html".)
-- The script will save it (filenames will be sequential numbers) then click "Next Entry" and repeat.

-- Update postNumber here if the script fails and you need to restart it in the middle of your journal.
set postNumber to 1

-- Make sure this directory exists first.
set savePdfPath to "~/Desktop/pdfs/"

-- You may need to change the title to whatever your journal style uses.
set nextEntryLink to "document.querySelector('[title=\"next entry\"]')"

-- Coordinates of the Safari window: left, top, right, bottom
-- Only the width really matters, because that affects the width of the PDFs.
-- (We want to keep the width consistent across your PDFs,
-- and not too wide or the text will be tiny if you ever print it out.)
tell application "Safari" to set the bounds of the first window to {100, 100, 1115, 1000}

set done to false
repeat until done
	-- save this page as a PDF by using the "Export as PDF…" menu item
	tell application "System Events"
		tell process "Safari"
			set frontmost to true
			-- if you want 3-digit numbers, change -4 to -3
			set numberAsString to text -4 thru -1 of ("0000" & postNumber)
			repeat until exists sheet 1 of window 1 -- loop because it misses the click sometimes
				-- note that the menu item text uses an ellipsis character, not three periods
				click menu item "Export as PDF…" of menu "File" of menu bar 1
				delay 1
			end repeat
			keystroke "g" using {command down, shift down} -- go to folder
			repeat until exists sheet 1 of sheet 1 of window 1
				delay 0.02
			end repeat
			tell sheet 1 of sheet 1 of window 1
				set value of combo box 1 to savePdfPath
				click button "Go"
			end tell
			set value of text field 1 of sheet 1 of window 1 to numberAsString
			click button "Save" of sheet 1 of window 1
		end tell
	end tell
	-- make sure we have a link to the next page
	tell application "Safari" to set hasNext to (do JavaScript nextEntryLink & " !== null;" in document 1)
	if hasNext is false then
		set done to true
		exit repeat
	end if
	-- go to the next page
	tell application "Safari" to (do JavaScript nextEntryLink & ".click();" in document 1)
	set postNumber to postNumber + 1
	delay 2 -- give us a chance to leave the previous page first
	-- then wait until JavaScript says the page finished loading (though I don't know if this is reliable)
	tell application "Safari"
		tell document 1 to repeat
			do JavaScript "document.readyState"
			if the result = "complete" then exit repeat
			delay 0.5
		end repeat
	end tell
end repeat

display notification "Finished exporting your LiveJournal to PDF."
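
If the script exports one page and then stops, the likely cause is that the nextEntryLink selector does not match the link in your journal style. Here is a minimal sketch for checking that, meant to be pasted into Safari's Web Inspector console while an entry page is open. The helper name findNextEntryLink and every candidate title other than "next entry" are my own guesses for illustration, not anything LiveJournal documents.

```javascript
// Hypothetical helper: returns the first [title="..."] selector that matches
// an element in `doc`, or null if none of the candidate titles match.
function findNextEntryLink(doc, titles) {
  for (const title of titles) {
    const selector = `[title="${title}"]`;
    if (doc.querySelector(selector) !== null) {
      return selector;
    }
  }
  return null;
}

// Only "next entry" comes from the script above; the others are assumptions.
// Attribute values are matched case-sensitively, so the variants are distinct.
const candidateTitles = ["next entry", "Next Entry", "Next"];

// In the Web Inspector console, `document` is the page you are looking at.
if (typeof document !== "undefined") {
  console.log(findNextEntryLink(document, candidateTitles));
}
```

If it prints a selector, that string can replace the one inside querySelector(...) in nextEntryLink (keep the backslash-escaped double quotes that AppleScript needs). If it prints null, inspect the link in the page and add its title to the list.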

4 thoughts on “Exporting LiveJournal”

  1. This is genius. I had done an xml export a million years ago with a tool that probably no longer exists, and I’ve always found it striking that there’s no easy way to do an export natively. I’ve been slowly cleaning mine out for over a year, just a few posts at a time, when I think of it. I miss the community aspect of LJ very much, but most folks have long since wandered off.

    • Thank you very much! This was a fun project and I’m glad it works as well as it does. I also really do miss LiveJournal, for the community aspect as well as the long-form writing. Maybe I’ll get back into it again one of these days…

      But next up is modifying my script to delete posts.

    • Slightly different use cases.

      His goal was to export his LJ in a way that could then be brought into WP. Doing it that way, you’ve always got to wonder if you got everything and if you handled all the edge cases correctly, and you have to decide how you want the data represented on WP, where the feature set is somewhat different.

      My goal was to get a pretty archive of my LJ, with the custom layout it uses on LJ, so that the PDFs look like the original LJ pages.

      It’s like the difference between being an organ donor and going to a Sears portrait session. Both involve representations of who you are, but they differ greatly in what the end result gives you.
