Searching PDF with ht://Dig

I’ve just enabled indexing and searching of .pdf documents on the Learning Commons website.

We’re using ht://Dig as our search engine, and it’s quite flexible. It accepts external parsers that teach it to read file formats other than plain text. There are libraries available for .rtf, .pdf, .ps, .doc, .swf, .xls, and even .ppt files.

For now, I’ve only added the .pdf parser, using the Xpdf library. There was no binary available for MacOSX, so I had to compile from source. Here’s a link to the compiled binaries for MacOSX (compiled without support for the X11 windowing system – these are just the command line utilities). Just drop them in /usr/local/bin and enjoy!
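For anyone wiring up something similar, here’s a rough sketch of how a converter like this can be hooked into ht://Dig. It isn’t necessarily the exact setup on the Learning Commons server: the wrapper name pdf2text.sh and its install path are made up for illustration, and it assumes ht://Dig calls external converters with the document’s temporary file name as the first argument and reads the converted text from standard output. pdftotext is one of the Xpdf command line utilities; giving it a dash as the output file sends the text to stdout.

```shell
#!/bin/sh
# pdf2text.sh: hypothetical wrapper so ht://Dig can index PDF files.
#
# Assumed htdig.conf entry (the path is an example, not the real server config):
#   external_parsers: application/pdf->text/plain /usr/local/bin/pdf2text.sh

# Convert the PDF named in $1 to plain text on stdout.
# Any extra arguments (content type, URL, config file) are ignored.
pdf_to_text() {
    pdftotext "$1" -
}

# Only run the conversion when a file name was actually passed in.
if [ "$#" -gt 0 ]; then
    pdf_to_text "$@"
fi
```

Keeping the conversion in a tiny wrapper means the same pattern should work for other formats by swapping in a different command line tool.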

Using SubEthaEdit for shared notes in Pachyderm Training

I’ve opened up a SubEthaEdit document to serve as a shared workspace for the Pachyderm training session today. Not sure if anyone’s going to use it, but it might be a cool way to get a rough draft of documentation, on the fly (especially since most Pachydermers are in San Francisco, and there are a few stragglers – myself included – scattered around the continent).

The direct link to the shared document is here, and the SubEthaTrack listing is here.

ImageMagick Script to Generate Pachyderm Images

I’ve written a simple ImageMagick shell script to batch convert a bunch of images into the various sizes required by Pachyderm. Man, ImageMagick is pretty sweet. Installed in a few minutes using Fink, then I was off and running. I call this script to generate the images, which can then be fed to Generator to create the .swf files used by Pachyderm:

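#!/bin/sh

# Resize every .jpg in the current directory into each Pachyderm size directory.
# Each copy keeps the source file name.
for img in *.jpg
do
    convert -sample 46x36 "$img" "../11/$img"
    convert -sample 56x56 "$img" "../12/$img"
    convert -sample 72x72 "$img" "../13/$img"
    convert -sample 160x160 "$img" "../14/$img"
    convert -sample 280x200 "$img" "../15/$img"
    convert -sample 300x260 "$img" "../16/$img"
    convert -sample 480x380 "$img" "../17/$img"
    convert -sample 790x540 "$img" "../18/$img"
    convert -sample 1280x1024 "$img" "../19/$img"
    convert -sample 72x72 "$img" "../23/$img"
done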
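One gotcha with a script like this: convert won’t create the output directories for you, so the numbered directories have to exist before the first run. Here’s a small sketch that sets them up. The function name make_size_dirs and the optional base-directory argument are my own additions for illustration; the directory numbers are the ones used in the script above.

```shell
#!/bin/sh
# Create the numbered output directories used by the resize script.
# make_size_dirs is a hypothetical helper; $1 is the parent directory
# (defaults to "..", matching the ../11/ style paths above).
make_size_dirs() {
    base="${1:-..}"
    for dir in 11 12 13 14 15 16 17 18 19 23
    do
        mkdir -p "$base/$dir" || return 1
    done
}
```

Calling make_size_dirs with no arguments from the directory holding the source .jpg files creates ../11 through ../19 and ../23. Since mkdir -p leaves existing directories alone, it’s safe to run more than once.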

WODev: WebObjects Wiki Engine

I just got an email from someone asking about the wiki engine I used for the CAREO wiki stuff. He was wondering if there were any WebObjects wiki engines out there. I thought I’d seen one, a long time ago, so I did a quick Google for it.

After filtering the usual noise (some from my own blog. doh.) I found a page by Pierre Bernard, with a link to WODev.

WODev is a wiki on WebObjects, implemented in WebObjects. And, it’s Open Source, so code is available etc…

I’d selected PHPWiki for a couple of reasons. First, it’s pretty solid, so I wouldn’t have to worry too much about it. More importantly, though, it allowed me to use full URLs as page names, so I could use it to create Wiki pages for any object in CAREO (or SciQ, or Learning Commons Teaching Resources, and so on). I just did a test in WODev Wiki, and it supports this too, so it should work just fine. Not sure what the limit on characters in the page name is, but it would be trivial to change if it’s too small…

When I get a chance, I’ll look at migrating from PHPWiki to WODev Wiki, so I can integrate with the upcoming APOLLO stuff a little better…

Collaboration at a Distance

Michelle made a good point in an email. I’d overlooked the value of collaboration at a distance, because I really take it for granted now. I’ve been working with folks over the ‘net for years, but much more intensely over the past year.

The Learning Object Syndication with RSS presentation(s) (here and here) wouldn’t have been possible without iChatAV, wikis, and weblogs.

And the Pachyderm install would have cost a few orders of magnitude more without these tools (well, we really only used iChatAV/Trillian). The cost of travel between Calgary and California would have been waaaay too high, since it would have meant a few trips to get it running.

The Pachyderm Has a Pulse!

Josh and I have been poking at the Pachyderm for the last few days, and have finally convinced it to do something other than stare blankly at us.

Josh wrote some PHP script mojo to suck the FileMaker database (used to manage assets) into the SQL Server DB (used to author presentations). Works like a charm, once you know where the hidden landmines are. Like, say, a field name has the wrong case. Or, say, records in one (and only one) table aren’t actually saved when the rest of the database is (so they have to be manually re-entered every time the database is opened. No hassle there…)

Long story short: Pachyderm.ucalgary.ca is up and running (empty, but up and running).

I’ve added an image of Evan, and created a Pachyderm screen and presentation, just to show there isn’t that much smoke and mirror juju going on. The presentation is available here. It’s not terribly exciting, because I have no idea how to actually use Pachyderm, but it serves as HELLO, WORLD! which is just fine by me. Warning: the presentations currently use a rather evil “browser detection” script. I say that in quotes, because it doesn’t really detect anything useful anymore, and really just gets in the way unless you’re using Windows/IE.

[Image: spaghettievan_pachyderm.jpg]

Pachyderm progress

Josh and I have been slogging through the Pachyderm installation/configuration process. It’s been a whole lot of one-step-forward, two-steps-back, but sometimes things just kinda work.

The common thread we’ve come across is basically a version of “never, EVER use Windows on a server. Or, on a desktop, if it can be avoided.”

Anyway, we’ve got it mostly working, thanks largely to Josh’s fancy PHP scripts to import data from the Pachyderm FileMaker database into the SQL Server database. That part works like a charm.

The rest feels decidedly duct-taped. MacGyver would be proud. Except his stuff usually worked.

Windows Sucks Much Ass

Grrr.

I’ve been installing WinXP Pro for most of the day now. After switching back to the “classic” UI so my retinas stopped bleeding, I’ve been working through installing firewall, virus protection, SQL Server, IIS, etc…

Holy crap.

They must lock their engineers into a room and tell them to figure out the most annoying, non-intuitive ways of doing things (and presenting info to the user).

This started as a rant on our Mantis bugtracker, but it’s just gotten worse since I wrote this:

Still installing and updating WinXP. Holy crap that’s one bad OS. Ugleeeee. Designed by folks that should be kept in a server room somewhere…

What do I care that an update download is for KB826939, or that it is 5.300000000001 MB? How do they measure the .000000000001MB? Why on earth wouldn’t you just round that off? At a glance, it doesn’t even look like a meaningful number. I thought it was a serial number or something silly like that, but looking closer, it’s just providing waaaaay too much insignificant detail. Good freaking lord, they’re a bunch of rabid morons.

Little things, like, say, turning on the web server. I had to poke around for settings, run a network config wizard, and Add Software (which has been running for over an HOUR now – I could have compiled Apache FROM SOURCE in that time!). I’m guessing it’s going to take some mojo to actually get IIS running, after it’s been installed.

On MacOSX, it’s a quick trip to the System Preferences application, click a checkbox, and IT’S DONE. Web server enabled, with public directories for each user, and a big honkin’ Documents directory for the whole server (with CGI-EXECUTABLES and everything).

Wow. No wonder there is such a market for Windows IT folks. Someone could make a career out of installing Windows.

It takes me less than an hour to install MacOSX from scratch. With all services I need enabled and running. Securely.

UPDATE: It’s been “configuring components” for the last half hour. Getting a little nervous that installing a simple web server needs this much configuring. What exactly is getting installed, anyway? It’s just supposed to be a service that spits out text and/or binary files over HTTP on request. yeesh.