my reclaimed content workflow

Alan’s post this morning got me thinking about what my reclaimed/co-claimed/com-plained content publishing workflow has evolved into. At a high level, it looks like this:

Content publishing workflow

Basically, I host as much of the stuff I care about as possible. My blog (and a handful of tools running on subdomains on the same server) serves as the primary place where I post stuff. Much of what I publish doesn’t show up on my blog’s front page or RSS feed, but it’s there for me, and I use it all daily.

Of the whole thing, I consider 2 parts absolutely essential: the WordPress-powered[1] blog/site running at, and my Aperture library living on my home laptop. If I ever lost either of those, I’d be out of action. So I back them up somewhat rigorously (but could definitely do a better job of it).

The rest of the workflow I treat as less critical, even ephemeral. If some of it disappeared, I may not even notice. If the third-party stuff vanished (or I decided not to renew subscriptions), I’d feel it, but I’d be able to move on. Evernote is probably the one piece of third-party kit that I’d find hardest to live without – it serves as the glue holding all of my work stuff together: notes scrawled on iPads, snapshots from my phone, notes from meetings, etc., all pulled into a platform-agnostic holding pen.

  1. running a few plugins and a customized theme to make it behave the way I want it.

reclaiming website search

I’ve been withdrawing from relying on Google wherever possible, for various reasons. One place where I was still stuck in the Googleverse was the embedded site search on my self-hosted static-file photo gallery site. That was one of the few places where I couldn’t find a decent replacement for Google, so it stayed. And I wasn’t comfortable with that – I don’t think Google needs to be informed every time someone visits a page I host[1]. I use that embedded search pretty regularly, and cringe every time the page loads.

There had to be a good search utility that could be self-hosted. I went looking, and tried a few. My requirements were pretty basic – I don’t need multiple administrators, or sharded database replication, or multiple crawling schedulers, etc. I don’t want to have to install a new application framework or runtime environment just for a search engine. I want a simple install – ideally either a simple CGI script or something that can trivially drop onto a standard LAMP server.

Today, I installed a website indexer on a fresh new subdomain. Currently, the only website it indexes is, but I can add any site to it, and then index and search on my own terms, without feeding data into or out of Google (or any other third party).

The search tool is powered by Sphider and seems pretty decent. It’s a simple installation process, and it uses a MySQL database to store the index. It seems pretty fast – at least on my single-site index, with one user (me).

The biggest flaw I’ve found with Sphider so far is in how it handles relative links. Say you have a website structure like this:

  • index.html
    • page1.html
    • page2.html

If index.html uses a simple relative link like <a href="page1.html">Page 1</a>, Sphider skips it – unless the page has a <base> element to tell Sphider explicitly how to rebuild full URLs from the relative links. Something like this:

<base href="" />

Sphider can then use that to turn relative links into fully resolved absolute links.
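Resolving a relative link against a page’s URL is well-trodden territory – here’s a quick sketch of the same logic Sphider should be applying, using Python’s urllib.parse.urljoin (the gallery URL is made up):

```python
from urllib.parse import urljoin

# Pretend this is the URL of the page being crawled
# (or the value of its <base href>)
page_url = "http://gallery.example.com/2012/index.html"

# A relative link found in that page resolves against the page URL...
print(urljoin(page_url, "page1.html"))
# http://gallery.example.com/2012/page1.html

# ...and an already-absolute link passes through untouched
print(urljoin(page_url, "http://example.com/other.html"))
# http://example.com/other.html
```

That’s all a crawler needs to do – which is why skipping relative links feels like a bug rather than a design decision.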

But this is strange – I had 2 choices:

  1. hack the Sphider code to teach it how to behave properly (and then re-hack the code if there’s an update)
  2. update each gallery menu page to add the <base> head element

I chose #2, because I just didn’t have the energy to fix Sphider, and the HTML fix was simple enough. It definitely feels like a bug – editing every page to add a <base> element shouldn’t be required, but whatever.
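For reference, the per-page fix is just one extra line in each gallery menu page’s head (the URL here is a made-up example – it needs to be the real URL of the directory the page lives in):

```html
<head>
  <!-- hypothetical gallery URL; Sphider resolves relative links against this -->
  <base href="http://gallery.example.com/2012/" />
  <title>Gallery menu</title>
</head>
```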

Bottom line, Sphider works perfectly for my needs. It’s now powering the site search for my photo gallery site, and works quite well for that. And, it’s going to be available to index any of my other projects if needed.

  1. as would happen when the embedded search JavaScript is loaded – that activity data could then be tracked/stored/analyzed by Google to better model what you’re interested in, who you know, etc…

giving up on owncloud (for now)

I’ve really been loving running my own Dropbox clone, using owncloud on my Hippie Hosting Co-op account. It’s (mostly) seamless and automatic, and (usually) Just Works™. It’s not as polished as Dropbox’s UI, but that’s not critical (although status badges on files and folders would be nice…).

But, over the last week or 2, I’ve been noticing that owncloud on my work computer gets wedged. Digging into the status, I found the URL had changed from my owncloud instance to something intercepted by browser-based wifi authentication. Just changing the URL back in the configuration doesn’t seem to solve it. I have to nuke my owncloud settings, add a new config, delete it because it insists on syncing to /clientsync rather than /, and then re-add it manually. Then, delete the /clientsync folder on the server. Annoying. I just need this to work.

So. I’m back to Dropbox for a while. I don’t have time to fart around with this stuff right now. I need my file sync service to Really Just Work™. I’ll try owncloud again when I have some downtime to muck about with it.

Reclaim Project: 2 steps forward, 1 step back.

google kills iGoogle (slowly)

The iGoogle service lets people put together rich dashboard-style home pages, with widgets sucking data from various places into one handy location. Great stuff. I know lots of people use it as their home page, and use it daily.

But, Google has decided it’s (almost) time to kill it, turning it off in November 2013.

I shifted off of a hosted homepage long ago, because I didn’t like the idea of feeding the tracking databases every time I opened a browser. So I set up a vintage 1997-style static homepage, but with some live data widgets powered by Feed2JS.
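Feed2JS works by wrapping an RSS feed in a JavaScript include – each widget on the homepage is just a script tag pointing at the Feed2JS service, with the feed URL and display options in the query string. A sketch (the host and parameter values here are illustrative; the Feed2JS build page generates the real embed code):

```html
<!-- one homepage widget: recent items from a feed -->
<!-- src = URL-encoded feed address, num = item count, utf = UTF-8 output -->
<script type="text/javascript"
        src="https://feed2js.org/feed2js.php?src=https%3A%2F%2Fexample.com%2Ffeed&num=5&utf=y"></script>
<noscript>
  <a href="https://example.com/feed">View the feed directly</a>
</noscript>
```

Because the page itself is static HTML, the only moving part is the Feed2JS endpoint – nothing about the visit gets reported back to the feed sources or to Google.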

Google’s dead-service-walking iGoogle:

My always-on, never-tracking, even-more-useful self-hosted homepage dashboard:

So, the iGoogle shutdown won’t impact me. But I’m wondering why anyone would come to rely on any Google service. They have a history of killing services that have fallen from grace with Google Corporate, even if there are still diehard users who have come to depend on them because they are free and Do No Evil. SketchUp comes to mind – lots of teachers were building stuff with their students in it, until Google decided it didn’t like it anymore. Now, iGoogle.

From the Techcrunch article, here’s a list of abandoned/killed Google projects:

Google Video, Google Mini, Google Bookmarks Lists, Google Friend Connect, Google Gears, Google Search Timeline, Google Wave, Knol, Renewable Energy Cheaper than Coal (RE-C), Aardvark, Desktop, Fast Flip, Google Maps API for Flash, Google Pack, Google Web Security, Image Labeler, Notebook, Sidewiki, Subscribed Links, Google Flu Vaccine Finder, Google Related, Google Sync for BlackBerry, the mobile web app for Google Talk, One Pass, Patent Search, Picasa for Linux, Picasa Web Albums Uploader for Mac and Picasa Web Albums Plugin for iPhoto, and all Slide products.

How long until Google Reader is put down (who uses RSS anymore, anyway)? GMail? Google Docs? Search?