Download Xmarks bookmarks in Python

Xmarks is currently the only viable solution for synchronizing bookmarks over the Internet across all web browsers, including Firefox / Iceweasel, Google Chrome, and Apple Safari. I attempted to set up the Mozilla Sync server, but it is so poorly packaged that it cannot be installed cleanly without considerable effort, so Xmarks remains the only viable option. Xmarks works great, but my problem is that all the data is stored on their servers, with no public API to access it. This article introduces a small Python 2.6+ script, released under the AGPL3+ license, that retrieves your bookmarks from Xmarks as an HTML page. The complete script is available here.

Since Xmarks doesn’t provide a proper public API, the script first logs in like any end user, so that Xmarks sets the cookies it uses for authentication. This is achieved by opening all URLs through a urllib2 opener associated with a cookielib CookieJar, which temporarily stores those cookies:

import cookielib
import urllib2
cookie_jar = cookielib.CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cookie_jar))
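
On Python 3 the same modules live under different names: cookielib became http.cookiejar, and urllib2 was folded into urllib.request. A minimal sketch of the equivalent setup follows; the helper name build_cookie_opener is only an illustration and is not part of the original script:

```python
import http.cookiejar
import urllib.request


def build_cookie_opener():
    """Return an (opener, cookie_jar) pair that shares cookies across requests."""
    cookie_jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(cookie_jar))
    return opener, cookie_jar


opener, cookie_jar = build_cookie_opener()
# The jar starts empty; it fills up as responses carry Set-Cookie headers.
print(len(cookie_jar))
```

Returning the jar alongside the opener makes it easy to inspect which cookies the server has set after logging in.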

This opener is then used to read the login page and extract a hidden encrypted token from the login form. Xmarks requires this token along with the username and password, to limit repeated automated login attempts:

# First get an authentication token from the login page.
url = 'https://login.xmarks.com/login/login'
first_login_page = opener.open(url).read()
auth_token_match = re.search(
    '',
    first_login_page)
auth_token = auth_token_match.group(1)

# Then authenticate for real.
params = {'token': auth_token,
          'username': username,
          'password': password}
opener.open(url, urllib.urlencode(params)).read()
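
Extracting the token depends entirely on the markup of the login page, which is not reproduced in this article. The following Python 3 sketch therefore rests on assumptions: it supposes the token sits in a hidden input field named token (matching the key used in params), and the sample page, the class HiddenFieldParser and the function extract_hidden_field are all made up for illustration. It uses the standard html.parser module rather than a regular expression, which tolerates changes in attribute order:

```python
from html.parser import HTMLParser
from urllib.parse import urlencode


class HiddenFieldParser(HTMLParser):
    """Collect the name/value pairs of all <input type="hidden"> elements."""

    def __init__(self):
        super().__init__()
        self.hidden_fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != 'input':
            return
        attributes = dict(attrs)
        is_hidden = (attributes.get('type') or '').lower() == 'hidden'
        if is_hidden and attributes.get('name'):
            self.hidden_fields[attributes['name']] = attributes.get('value') or ''


def extract_hidden_field(html, field_name):
    """Return the value of the named hidden field, or raise if it is absent."""
    parser = HiddenFieldParser()
    parser.feed(html)
    if field_name not in parser.hidden_fields:
        raise ValueError('hidden field %r not found in login page' % field_name)
    return parser.hidden_fields[field_name]


# Made-up markup for illustration; the real login form may differ.
sample_page = '''
<form method="post" action="/login/login">
  <input type="hidden" name="token" value="abc123">
  <input type="text" name="username">
  <input type="password" name="password">
</form>
'''
auth_token = extract_hidden_field(sample_page, 'token')
# On Python 3 the POST body passed to opener.open() must be bytes.
post_body = urlencode({'token': auth_token,
                       'username': 'alice',
                       'password': 's3cret'}).encode('ascii')
print(post_body)
```

Raising an explicit error when the field is missing gives a clearer failure than calling group(1) on a failed match, which is useful if the login page ever changes.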

Once the token, username and password have been sent to the login page, and the user is successfully logged in, Xmarks sets the right cookies for authenticating this user, and all data can then be accessed using the same opener. Here, we print out the latest snapshot of the bookmarks as HTML:

url_format = 'https://my.xmarks.com/bookmarks/export_to_html/0/' \
    '{username}-bookmarks-{date}.html'
url = url_format.format(username=username,
    date=datetime.date.today().isoformat())
print '%s' % (opener.open(url).read(),)
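
To keep a copy of the snapshot instead of printing it, the export can be written to a file. This is a Python 3 sketch under assumptions: the helper names build_export_url and save_export are hypothetical, and save_export expects an opener that has already been logged in as described above:

```python
import datetime

EXPORT_URL_FORMAT = ('https://my.xmarks.com/bookmarks/export_to_html/0/'
                     '{username}-bookmarks-{date}.html')


def build_export_url(username, day=None):
    """Build the HTML export URL for the given user and date (default: today)."""
    if day is None:
        day = datetime.date.today()
    return EXPORT_URL_FORMAT.format(username=username, date=day.isoformat())


def save_export(opener, username, path):
    """Download the export with a logged-in opener and write it to path."""
    with opener.open(build_export_url(username)) as response:
        data = response.read()
    # Write bytes as received, so no character encoding has to be guessed.
    with open(path, 'wb') as output:
        output.write(data)
    return len(data)


print(build_export_url('alice', datetime.date(2012, 3, 4)))
# prints https://my.xmarks.com/bookmarks/export_to_html/0/alice-bookmarks-2012-03-04.html
```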

Any other URL could be accessed, e.g. to add or modify bookmarks. Finally, as a good citizen, log out to free the server-side session stored in association with the authentication cookies:

url = 'https://login.xmarks.com/logout'
opener.open(url)
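
If the download fails halfway, the logout request would be skipped. One way to guarantee it, sketched here in Python 3, is a context manager that logs out in a finally clause. The names logged_in and RecordingOpener are hypothetical; RecordingOpener is only a stand-in so the example runs without network access, and login_body is assumed to be the already encoded token, username and password:

```python
import contextlib
import io

LOGIN_URL = 'https://login.xmarks.com/login/login'
LOGOUT_URL = 'https://login.xmarks.com/logout'


@contextlib.contextmanager
def logged_in(opener, login_body):
    """Log in on entry and always log out on exit, even after an error."""
    opener.open(LOGIN_URL, login_body).read()
    try:
        yield opener
    finally:
        opener.open(LOGOUT_URL).read()


class RecordingOpener:
    """Stand-in for a real opener: records requested URLs, sends nothing."""

    def __init__(self):
        self.urls = []

    def open(self, url, data=None):
        self.urls.append(url)
        return io.BytesIO(b'')


fake_opener = RecordingOpener()
try:
    with logged_in(fake_opener, b'placeholder-login-body'):
        raise RuntimeError('simulated download failure')
except RuntimeError:
    pass
# The logout URL was still requested despite the simulated failure.
print(fake_opener.urls[-1])
```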

The complete script additionally provides command-line help, a secure password prompt, etc.
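
The command-line handling of the complete script is not shown here, so the following is only a sketch of how such a front end might look, not the script's actual interface. It assumes a single positional username argument and uses argparse (available from Python 2.7 and 3.2) together with getpass, which reads the password without echoing it. The prompt parameter is a hypothetical hook that lets the example run non-interactively:

```python
import argparse
import getpass


def parse_arguments(argv=None):
    """Parse the command line; --help is generated automatically by argparse."""
    parser = argparse.ArgumentParser(
        description='Download Xmarks bookmarks as an HTML page.')
    parser.add_argument('username', help='Xmarks account name')
    return parser.parse_args(argv)


def read_credentials(argv=None, prompt=getpass.getpass):
    """Return (username, password); the password is asked for interactively
    so that it never appears on the command line or in the shell history."""
    args = parse_arguments(argv)
    password = prompt('Xmarks password for %s: ' % args.username)
    return args.username, password


# A stub prompt replaces getpass here so the example does not block on input.
print(read_credentials(['alice'], prompt=lambda message: 'not-a-real-password'))
```

In real use, read_credentials() would be called without arguments, so that sys.argv is parsed and getpass.getpass asks for the password on the terminal.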

Overall, this is not complicated, and the only challenge was to find the right standard Python libraries to use. Even today, many services still don’t provide a proper public API, or any authentication scheme that doesn’t use cookies. The technique presented in this article can easily be adapted to those cases. For instance, I have used very similar code to access a Redmine server that is set up to authenticate using cookies.