Thoughts on the mythical School Aggregator (EduGlu?)

I’ve been giving some thought to the “school aggregator” that grew out of the discussions around Northern Voice. What kinds of things will it have to be able to do? Types of interfaces? Explicit and implicit data and metadata? How to manage caching of items, and manage displaying the potentially hundreds of thousands of bits of content that will be pulled into the system over the course of a year? And how to present cohorts/classes/years within this? How to allow students to add multiple data sources, and tag it for use in whatever class context(s)? How to let students and teachers mine the aggregated data to get what they need/want? Lots of stuff to chew on here.

Brian’s students have been off to a great start in their AggRSSive project – and they have plans to make it even more kick-ass. Tyler’s described plans were pretty much spot-on to what is needed. Not complete, but a darned good start. Can’t wait to get my hands on that…

At Northern Voice, someone from a genetic engineering organization (didn’t catch the name of the guy or the agency – I was sure it was Genentech, but others were sure it wasn’t) was describing a mind-blowing RSS-based workflow that he’s using to tie research, automated lab results, individual publishing, and lots of other sources together into one interface. They have some AI ninjas crunching everything to make sure it gets to where it needs to go, and to start making connections between stuff. He mentioned that as a result of this AI-based aggregation, they were able to make a completely new discovery that linked two previously unrelated topics (proteomics and something else…) Very cool stuff. We should see if they can give a tour, and if they’d be willing to share with the rest of the class.

In the meantime, I just took a quick romp through SourceForge to see what else has been done in the area. Not a lot, unfortunately. But I did come across a rather cool server-side aggregator that I hadn’t heard of before. sux0r appears initially to behave like others (Feed on Feeds, etc…) but – it doesn’t have categories. Well, it does, but not in the traditional sense. You create a set of tags, and then proceed to flag aggregated posts as belonging to any of these tags. After a while, the Bayesian magic has enough to chew on, and it begins to automatically tag incoming posts. Latent semantic analysis to apply folksonomies?
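To make the "flag a few posts, then let it tag on its own" idea concrete, here is a minimal sketch of a naive Bayes tagger. This is not sux0r's code; the class name `BayesTagger`, its methods, and the tokenizer are all made up for illustration, and a real system would need far more training data than a handful of posts.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase a post and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


class BayesTagger:
    """Toy multinomial naive Bayes tagger (hypothetical, not sux0r's code)."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # tag -> word -> count
        self.post_counts = Counter()             # tag -> number of flagged posts
        self.vocabulary = set()

    def train(self, text, tag):
        """Record one post that a human flagged with the given tag."""
        words = tokenize(text)
        self.word_counts[tag].update(words)
        self.post_counts[tag] += 1
        self.vocabulary.update(words)

    def scores(self, text):
        """Return a log-probability score per known tag for a new post."""
        words = tokenize(text)
        total_posts = sum(self.post_counts.values())
        vocab_size = len(self.vocabulary)
        result = {}
        for tag, n_posts in self.post_counts.items():
            counts = self.word_counts[tag]
            total_words = sum(counts.values())
            score = math.log(n_posts / total_posts)
            for word in words:
                # Laplace smoothing so unseen words do not zero out a tag.
                score += math.log((counts[word] + 1) / (total_words + vocab_size))
            result[tag] = score
        return result

    def suggest(self, text):
        """Return the best-scoring tag, or None before any training."""
        scores = self.scores(text)
        return max(scores, key=scores.get) if scores else None
```

You would call `train()` each time someone flags a post, then `suggest()` on each incoming item. Note that this picks a single best tag; tagging a post with several tags at once would need a per-tag threshold instead.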

Anyway, although the concept is cool, and will form an important part of EduGlu, the current incarnation in sux0r won’t scale to thousands of feeds in thousands of categories, over dozens of years.

OK. Enough thinking about this stuff for now. Back to work…

45 thoughts on “Thoughts on the mythical School Aggregator (EduGlu?)”

  1. Peerkat, which Rael Dornfest wrote and discarded like, five years ago, was a great example of the kind of app you’re talking about. I did a proof of concept along these lines a couple years ago using Plone, when it looked like I might become unemployed, but stopped working on it when I found a job. Worked nicely with Plone though, using the CMFSin plugin.

    In reality, sophisticated applications of RSS have been pushed back about a decade by the collective decision to not use RDF.

  2. Bill, D’Arcy, keep me in the loop on this one.

    As you guys know, my current grad class is at http://technorati.com/tag/span505.

    I should say that what I would most like is also RSS feeds (aggregated or not) of the comments. That might also help the students comment, which they’ve been very slow to take up so far. I doubt, however, that’s possible because different software offers very different possibilities with commenting. E.g. it would work if they were all using WordPress or Haloscan, but part of the principle is not to force such homogenization on them.

    (Meanwhile, my solution will probably be just to instruct them to comment, or else…)

  3. One more valuable thing about Northern Voice

    To get a sense of how getting people together to share ideas and have fun (I have those priorities listed in the wrong order, but I’m in the office right now) can pay off with enhanced capacity, check out D’Arcy’s latest post on the as-yet nonexiste…

  4. @Bill – I’m thinking of a combination of boolean tags (tagged with “ucalgary” and “chem355” and “assignment” – or something like that) and perhaps tying into some kind of identity mapping system (this student is represented by these RSS feeds, and is enrolled in these institutions in these courses…)

    @Scott – Yeah! That’s the guy. Very cool stuff. He mentioned that they have a half-time programmer position dedicated to babysitting the AI, so I’m guessing it’s a non-trivial beast to run.

  5. So the guy you are talking about was Mark Mayo from the Genome Sciences Centre (http://www.bcgsc.ca/), part of the BC Cancer Agency. I know because I accosted him in the hallway and pleaded that he at the very least publish descriptions of what they are doing, if not the code itself, as it is totally mind-blowing. The cool thing to me was that they were using a combination of AI and recommender-type approaches; because they are a ‘closed’ system in the sense that all the users are using the same software in a common environment, they can build in things like a way to watch which links people follow, and which links their ‘neighbors’ follow, and constantly refine the quality of the algorithms that produce the feeds. The middle piece, the aggregator/sorter, is some heavy-duty coding that is quite domain specific by the sounds of it, but it is cool that they are tying all the pieces together, all the people, the aggregator, and the reports from the scientific devices, through RSS. Systems integration… psshaw! Loosely coupled yet still heavy duty is more like it. Very inspiring.

  6. I’m doing a presentation at UVIC on Thursday and this concept has bumped a couple others that I was going to cover. The concept of class tags as demonstrated by Jon was a missing piece for me.

    The initial step has to be getting the tagging happening so you can find the stuff (with Google if necessary). That means finding a unique identifier for the group that can be relied upon over a period of time.

    It is becoming clear to me that the tools for aggregation of content between members of any online community will get better. Right now all I need is ‘good enough’. While it is a little sloppy the current combinations of Technorati, Flickr, and others may be ‘good enough’ for now when aggregated through a basic GLU tool.

    I’d like better but I can’t wait around for perfect.

  7. @Gardner – that was actually the point the genetics dude was trying to make – they use the AI as a starting point to feed stuff toward humans who do the real work with it…

    @Boris – That’s a good start, too. But we need to be able to give students and teachers a set of tools to let them actively mine the aggregation. I’m thinking of the aggregation as a big soup, with tools letting users do meaningful stuff with the soup. I’ll grab a copy of Agg2 to play with it…

  8. Maybe you want to actually keep first class content? Like, grab the existing RSS 2.0 categories and pull them in locally, so you can have a local tag cloud?

    Hey, how about our favourite platform? See Aggregator2 for Drupal.

  9. No! Not enough thinking about this stuff! This edu-aggregator idea is exactly what we’re thinking about down here. Your post gives us all sorts of useful pointers and exciting ideas. (One thought: AI works for aggregation, but not for synthesis/integration–that takes augmented human intelligence.)

    I will be following this post up, and how. Thanks, D’Arcy!

  10. @ Darcy – the concept of multiple tags together makes sense although it is harder once you get outside an environment that has formal support for tags – which is one thing I am considering. There is discussion about some ISBN for digital content that might apply here too.

    @Jon – Going to be using tag/span505 as an example on Thursday to a bunch of pre-service teachers. I am also going to be quoting you on the complexity of using wikis as source references – that was a cool thing too.

    + + new thought
    With these comment threads I immediately find difficulty in putting them in one place, as I would like to attach these thoughts to Brian’s from today as well as a couple others. I could post it to my own blog but it fragments the conversation – gotta solve that too.

  11. More Post NV2006 stuff

    We’re starting a good conversation over at Darcy’s blog, Thoughts on the mythical School Aggregator (EduGlu?) at D’Arcy Norman Dot Net,
    which is being backed up by Brian’s contribution One more valuable thing about Northern Voic…

  12. Did Boris just suggest Drupal, no way 😉

    Hang on I think I can fix it with WPMU 😉 😉

    Seriously though I don’t know if looking at something which will harness ‘the web’ is going to get you anywhere… just seems too darn difficult. Maybe.

    But aggregating a site which has all the schoolies blogging on together with integration with a funky aggregator plugin like this http://www.ozpolitics.info/blog/?p=87 or something similar based on feed2js bringing in appropriate outside content (I’m convinced this has to be manual), then, did I mention WPMU by the way…

    And then you’ve got the question of individual aggregation vs. public river of news, and then…

  13. […] I first discovered the Amazon Plog when a message from Clark Aldrich greeted me at my Amazon home page. I’m trying to sort out how the Plog is special… or if it is special. I suppose this is something like the Personal Start Page I have at Walden University, on which announcements from courses (or the bursars office or whatever) that relate to me are posted. In an educational setting this might look something like D’Arcy’s much linked to mythical aggregator, which is now ever so slightly less mythical. In keeping with the naming trend, perhaps such dynamically generated personal blogs for students would be eduplogs? […]

  14. I’m trying to marry what I’m reading here with what I’m reading from the CogDog. Then I get a menage a trois (time to drop the metaphor) when I think about “Lifebook,” a super-AI-enabled-augmenting university aggregator/interface/portal (heck, it could be a mmorpg) that gives students a compelling experience of, and reflection on, the reconceptualization of self that is at the heart of education.

    I feel Leviathan at the other end of the fishing line! Can we land it?

  15. I’ve been thinking about this sucker overnight – even set up a project on Eduforge.org – and I think it’s totally doable. I don’t think Drupal or WPMU can just be dropped in, though.

    What I’m seeing is a cross between UBC’s AggRSSive, Feedburner, and something that groks cohorts and classes. Individuals are registered in the system (either self- or auto) and then enter a list of feeds that they want to represent them. Blogs. Flickr. Del.icio.us, MySpace, LiveJournal, CoComment, etc… Then, they post stuff into any of those feeds with a tag/category/label of a “class” id – they’d get a tag cloud for them to use when logging into EduGlu – and the system takes care of the rest. It will only “save” entries flagged appropriately (not going to provide a permanent cache of everything a person does – just the stuff that’s “relevant”)

    Then, you can go to EduGlu and view a list of your “classes” and see a river of news for each, or a combined RSS feed, or run some kind of “Smart Folder” query, or whatever.

    You can view the combined RSS feed(s) anywhere you want, or come to EduGlu to see them.

    Just the start, though. If either Drupal or WPMU can do this, it’d save a lot of time 🙂
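The flow described in this comment (register people, attach their feeds, keep only entries tagged with a class id, show a river of news per class) could be sketched roughly as below. Everything here is hypothetical: the names `Glu`, `Person`, `Item`, `ingest` and `river` are invented for the sketch, it keeps everything in memory, and it skips the actual feed fetching and parsing.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    feed_url: str
    title: str
    published: str      # ISO 8601 date string, so plain string sorting works
    tags: frozenset     # tags/categories/labels carried by the feed entry


@dataclass
class Person:
    name: str
    feeds: set = field(default_factory=set)


class Glu:
    """Hypothetical in-memory store; a real one would need a database."""

    def __init__(self, class_ids):
        self.class_ids = set(class_ids)
        self.people = {}    # name -> Person
        self.saved = []     # only items flagged with a class id

    def register(self, name, feed_urls):
        """Add a person (if new) and attach the feeds that represent them."""
        person = self.people.setdefault(name, Person(name))
        person.feeds.update(feed_urls)
        return person

    def owner_of(self, feed_url):
        """Return the name of the person who registered this feed, if any."""
        for person in self.people.values():
            if feed_url in person.feeds:
                return person.name
        return None

    def ingest(self, item):
        """Keep an item only if its feed is registered and it carries a class id."""
        if self.owner_of(item.feed_url) is None:
            return False
        if not (item.tags & self.class_ids):
            return False
        self.saved.append(item)
        return True

    def river(self, class_id):
        """Newest-first list of saved items for one class."""
        items = [i for i in self.saved if class_id in i.tags]
        return sorted(items, key=lambda i: i.published, reverse=True)
```

Because `ingest()` drops anything without a class id, a student's weekend photos never get cached, and because it drops anything from an unregistered feed, nobody outside the community can post into a class just by using its tag.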

  16. Some thoughts:

    How mobile will an individual’s EduGlu be? One critique of Web 2.0 apps is that they are often centralized. I could see campus eduglus, then, which remain hooked to that DNS once a student/faculty member/staffer leaves. In contrast, imagine third-party hosting for ‘glus, or (better yet) a p2p solution.

    BCGSC: that kind of ninja-powered concept connection is very valuable. Is there a semantic search version of this out there?

    Das Lifebook: it should be explicable as an e-portfolio solution integrated with web content management tool and information literacy service.

    The specter of copyright. Has there been case law about users appropriating content through an RSS feed? Moreover, users will certainly create new media in actionable ways. Will campuses have an incentive to either lock down, restrict, or even forbid eduglus (cf the TEACH Act structure of the BlackWeb, or the popular fate of torrent clients: blocked or throttled)?

    PS: am leaving this comment as yet another test of CoComment.

  17. Bryan – I’m thinking this needs to be an institutional aggregation. There would be nothing stopping an individual from registering with other off-campus Glus as well, but the School will need a copy of whatever is used for assessment – currently they have to retain paper copies of stuff for X many years. Having a version of stuff available in the school glu repository would help there as well. The EduGlu software would be open source, so anyone would be free to run their own.

    Haven’t even thought about the copyright spectre. Oy. And what happens if Faculty use the aggregation as part of a research project? Ethics issues ensue. Ick…

  18. Much of this is over my head, but…

    1) Assessment issues aren’t such a big deal, I think. For one, students can be asked to print out a blog entry for assessment purposes. For another, in fact not all assignments are kept by universities. Typically, only final exams. There’s certainly plenty that is not retained.

    2) Next year I’m scheduled to teach a class on (essentially) recent and contemporary Latin American politics. Specifically it’s on human rights and democratization in the region. This could be a test for you guys out there. Because here, unlike my current class, there’s actually a lot floating around the web that could be of interest to the class’s concerns, and which one might want to aggregate in some way. Also it’s for undergrads so it is, I think, that much more likely that some of the students may already have their own blogs.

  19. “Individuals are registered in the system (either self- or auto) and then enter a list of feeds that they want to represent them.”

    Not bad so far, I think it’s very important that this is something which can be ‘owned’ by an institution. I’d work on, in the first instance, the system recognising their group/s and aggregating automatically that data into the area (an aggregation and repost would be best as this would mean that they get to archive it individually, a la FeedWP).

    I think this has potential as a bloglines-esque one stop student shop, integration with LDAP, BB etc. is up there almost before what it actually does 🙂

    Cos:

    “Blogs. Flickr. Del.icio.us, MySpace, LiveJournal, CoComment, etc…”

    easy peasy (do I say that too much?)

    “Then, they post stuff into any of those feeds with a tag/category/label of a “class” id – they’d get a tag cloud for them to use when logging into EduGlu – and the system takes care of the rest.”

    This, I reckon is the hard bit… I think that you’re in danger of wheel re-invention here when you can get something like WPMU (there I go again) doing this alongside with categories related to the group/s (or alternatively some sort of structured blogging hack)… being all things could be dangerous but I don’t know how useful it’d be if it just ads on the side, I think that the one stop to lots of other thing integrated through aggregation is what you want here.

    Or should I rephrase that, ‘what I want’ 😉

  20. If WPMU (or Drupal, or anything else) can be convinced to understand the concepts of individuals, cohorts, classes, feeds and tags, then that’s awesome. I’m thinking of fleshing out the ideas without any particular implementation in mind. If you start with a hammer, everything starts to look like pumpkins. Or nails. Or thumbs. If we start by fleshing out the ideas it will help identify what bits may already be done, what needs effort, and what’s off the mark. Easy peasy.

  21. I am currently working on a paper that focuses on a user-constructed component-based learning management system (as opposed to a formal for-profit LMS). EduGlu seems to be the “elegant” solution many of us have been hoping for.
    In your humble opinion, how does the EduGlu solution compare to the PLE?
    Will EduGlu allow for a plug and play architecture, i.e., allow me to use furl rather than delicious, .wmv vs .mov, etc.?
    This discussion has me shaking with delight!

    -cds

  22. In my head, at least, EduGlu can’t be tied to any specific service(s). We can’t possibly predict what students and teachers will be using to publish stuff. So, I’m hoping to just include anything that can squirt out an RSS feed. You’d go to your “My EduGlu Sources” page or something and just start giving it your various feeds, which will be associated with your identity for association with classes and cohorts etc…

    I guess one thing that might be different between EduGlu and PLE is that EduGlu will have no facilities for publishing content on its own. You won’t post something to EduGlu – you’ll post it to whatever service you’re using, and EduGlu will pull it into the data bucket for it to be routed to whoever cares about you and/or your topic/tag.

    Clear as mud, I know… 🙂

  23. Re Lifebook (cf. Pete Townshend’s Lifehouse): Yes, but it also needs to be able to aggregate or frame or explicitly invite re-presentation of activity on Facebook, Flickr, or whatever. In short, Lifebook should be ready and willing to erase the boundary between “school work” and “life.”

    There once was a book, pure and easy, playing so free like a breath rippling by.

    Or to put it another way, Lifebook will make visible the essential promises of work as play for mortal stakes (cf. Frost, “Two Tramps in Mudtime”). Seriously. This could happen.

  24. This is hugely appealing, in terms of supporting personalised and collaborative learning using a whole host of different applications. I see you’ve spotted my query over on Moodle.org about getting Moodle to publish more RSS feeds. The good people at Elgg have been working on rss feed aggregation, including the ability to add external rss feeds straight into a blog, which is a powerful technique.

    Whilst aggregating tagged entries from all the individuals in a course would be great, aggregating all content tagged with the course id string would need some sort of moderation or filtering to stop individuals or companies polluting the ‘river’ with spam or worse.

  25. Miles, what I’m thinking of is having users in a community manually add their feeds, rather than relying on an open system such as Technorati, which becomes polluted pretty regularly. It’s one part school-aggregator, one part small-scale-technorati, and one part custom-query-engine to let you create your own slices of views of the aggregated data…

  26. I’ve recently been looking at Gregarius (http://gregarius.net/) for potentially aggregating feeds from multiple users. Currently though it appears as if there is only one login for adding feeds, but it does have nice capacity for tagging and categories. Could probably act as a nice base for a multi-user system though.

  27. Brent, thanks for the tip about gregarius! It seems like a pretty cool app – but it doesn’t quite do what I have in mind… It doesn’t inherit tags or categories associated with items in the feeds – it looks like things need to be re-tagged and re-categorized after aggregation. Searches aren’t persistent – no way to create a “smart folder” or “saved search”. Output of the aggregation doesn’t include an RSS generator – which makes it hard to subscribe to categories etc… since you have to go to gregarius to see stuff.

    It does a lot of stuff really well, though. Feed management is pretty cool. I love the concept of having a view of all items posted by a user – but I’m not sure how that user’s identity is interpreted – simple name matching? Could get messy if there are a few Bobs or Teds being aggregated.

    I’ve installed a local copy, and fed it my Edublogs .opml feed so it’s got 100+ feeds to chew on. If nothing else, it will be food for thought…
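The two missing pieces mentioned in this comment, persistent searches and RSS output, can be sketched with nothing but the Python standard library. This is only an illustration under my own assumptions: the item dictionaries, the JSON query format with `all_tags` and `author` keys, and the function names are all invented here and have nothing to do with how Gregarius stores things.

```python
import json
import xml.etree.ElementTree as ET


def matches(item, query):
    """True if the item carries every required tag and, if given, the author."""
    if not set(query["all_tags"]) <= set(item["tags"]):
        return False
    author = query.get("author")
    return author is None or item["author"] == author


def run_saved_search(items, saved_json):
    """Re-run a search stored as JSON, so it persists between visits."""
    query = json.loads(saved_json)
    return [item for item in items if matches(item, query)]


def to_rss(title, link, items):
    """Render results as a bare-bones RSS 2.0 document string."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = "Saved search results"
    for item in items:
        node = ET.SubElement(channel, "item")
        ET.SubElement(node, "title").text = item["title"]
        ET.SubElement(node, "link").text = item["link"]
        # Pass the original tags through as RSS categories.
        for tag in sorted(item["tags"]):
            ET.SubElement(node, "category").text = tag
    return ET.tostring(rss, encoding="unicode")
```

Storing the query rather than its results is what makes it a "smart folder": every time someone subscribes to the resulting feed, the search runs again against whatever has been aggregated since.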

  28. Actually, with a couple of plugins, gregarius gets pretty darned close to EduGlu right out of the gate. Just need to teach it a few concepts (People, Cohorts, Classes…) and it’s good to go…

    Autotagger plugin sounds like it will inherit tags from tags or categories on the items in a feed.
    http://plugins.gregarius.net/index.php?req=info&id=22

    RSS View sounds like it will add RSS output for any page.
    http://plugins.gregarius.net/index.php?req=info&id=7

    Now to figure out persistent searches, and a way to tie multiple feeds to an individual…

  29. Brent, thanks! I’d installed the RSS View and AutoTag plugins – RSS seems to work, but AutoTag appears to have borked the ability for Gregarius to update feeds. Doh… 🙂

    Multiple users is a feature on the roadmap for the mid-range future development plans. That’s pretty curious, but at least it’s on their radar…

  30. Since you mentioned the genetic research being done using RSS, I thought I would share that we have been doing this in the open science UsefulChem project, involving mainly the synthesis of anti-malarial compounds. RSS feeds are available to the lab notebook, higher level discussions of synthetic strategy or basic information processing of molecule data. This enables undergraduate students taking my organic chemistry classes a way to contribute to a real research project in a modular way, without needing to understand the big picture first.

    I have not found the feeds of the genetic project so I assume they are password protected somehow. If anyone finds them I would be very interested in taking a look.

    more info:
    http://drexel-coas-elearning.blogspot.com/2006/02/blogger-as-lab-notebook.html
    http://usefulchem.wikispaces.com

  31. I think the genetics project they mentioned at Northern Voice is an internal tool, likely pretty locked down. I haven’t gone searching for public info about the project – the only times I’ve heard about it have been at Northern Voice (both ’05 and ’06 were attended by someone from the project)

  32. Navigate your del.icio.us tags without leaving your site

    Thanks to one of our UBC whiz kids, Enej Bajgoric, for developing and sharing some code that lets me not only render my del.icio.us tag cloud, but also navigate it for resources and related tags — all within my site, maintaining my own look and feel. I’…

  33. I just found this post via Scott Leslie. Thanks Scott! Anyhow, I started to compose a comment here but had to leave for home so it was easier to save it as a draft on my blog and send you a pingback. I’ve just done that.

    I’m glad my small contribution to the session could stimulate some thinking. As I told Scott in the hallway, I don’t think there’s very much from our effort that would be in any way useful outside our organisation. In particular it doesn’t really address any of the issues that would arise in the EduBlogging space where the aggregation challenges are much more generic. Generic is always hard. *sigh*

    That being said, if some of my experience can be applied I’m very happy to share what I can (which is unfortunately somewhat limited due to the nature of research funding these days…).

  34. Mark, thanks for chiming in! The system you were describing was pretty specifically tailored for your group’s workflow, but your description of how it worked sparked a lot of ideas for me (and many others) about what a more generalized tool could do for education (and, I suppose, other areas as well).

    But, you’re right – generalized is hard 🙂 I’ve got somewhat limited time to devote to this as well, since it’s officially “off the books” and therefore a personal/pet project. With a 3-year-old at home, the amount of time I have to dedicate to writing code is stunningly small at the moment.

  35. Todd, thanks for the link. PlanetPlanet is just a simple aggregator, though… What I’ve got in mind is an aggregator plus query engine, letting you create custom views on the aggregated data rather than just having predefined “Rivers of News” views. It also needs to understand the concept of a Person (student or teacher) and be able to associate a Person with their Feeds and whatever educational contexts are relevant (institution, faculty, department, class, section, cohort, group, etc…)
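As a rough sketch of what "understanding a Person" might look like, the snippet below ties each person to their feeds and to a set of educational contexts, then builds a custom view from those links. The `Context` and `Person` classes, their fields, and the `view()` function are assumptions made up for this example, not a design anyone has committed to.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Context:
    kind: str   # e.g. "institution", "class", "section", "cohort"
    name: str


@dataclass
class Person:
    person_id: str      # unique id, so two people named Bob stay distinct
    display_name: str
    role: str           # "student" or "teacher"
    feeds: set = field(default_factory=set)
    contexts: set = field(default_factory=set)


def feeds_for_context(people, context):
    """Collect every feed URL belonging to people attached to a context."""
    urls = set()
    for person in people:
        if context in person.contexts:
            urls |= person.feeds
    return urls


def view(items, people, context, required_tags=()):
    """Custom view: items from a context's feeds that carry all required tags."""
    urls = feeds_for_context(people, context)
    wanted = set(required_tags)
    return [
        item for item in items
        if item["feed"] in urls and wanted <= set(item["tags"])
    ]
```

Keying everything on `person_id` instead of a display name sidesteps name matching entirely, and treating every context the same way means a view over a cohort and a view over a whole institution are the same query with a different `Context`.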

  36. I had a hacked version of FeedOnFeeds (http://feedonfeeds.com/) which allowed me to define a person and services they used using FOAF, and aggregate their output in a number of ways (all feeds created by this person, all feeds generated by this service for people I’m subscribed to, select which feeds to go into my river of news, etc.), plus the ability to auto-detect existing FOAF files on a feed’s HTML page (the URL in the link element of an RSS channel) and the option to subscribe to the output of any other services they might have listed there.

    It was only single user though and I’ve just moved to Gregarius, so at some point I’ll probably port it to that.

  37. I really believe that this kind of aggregation opens the door for a whole new competency. (Well, at least an old competency that can be revived.)

    I’m talking about the creation of customized learning threads — based on The Long Tail concepts — where a knowledgeable person guides a learner in which components of the whole would be best to reach their goals. (Can you say “reference librarian”?) By taking knowledge of the learner’s current state, applying ID and expertise in the subject matter, a skilled guide could easily create a customized learning path mapped exactly to each learner.

    And if you do a great job for me, I talk about it and build your brand. Soon, the value you add is recognized (and rewarded with $$$) because of the time saved in achieving competency. If you’re a poor guide, you fail and go on to some other type of career — real estate, used cars, or despot.

    Is anyone aware of software tools that can implement a model like this? Or do we need to build them?
