on visualizing online discussions

For my MSc thesis research, I’m working with a bunch of data collected through online discussions during a blended course. Part of the discussions took place using Blackboard’s discussion board feature, and part took place on students’ blogs. One of the things I need to do is document how the discussions played out, to try and tease out any differences between the two venues. I’ll be using the Community of Inquiry model to describe the social/teaching/cognitive components of posts, but I’ve been wanting to describe the flow of discussion as well. How do the discussions occur? Are there patterns of activity, in time or size of responses? I’ve been struggling with how to document these. My thesis is really just a glorified case study, so I’ve had to constantly force myself to stop thinking of it as controlled experimental data. What I’m doing is describing the activity within a single course, in two venues of online discussion.

I had a bit of an epiphany this afternoon, while working through some preliminary work to prep for CoI coding. I thought about Hans Rosling’s statistical visualizations and how he was able to incorporate several axes of data into a graph by using size, colour, shape, etc…

And then it hit me – it would be relatively straightforward to apply that approach to the data documenting an online discussion. The timestamp data is there. The info about the individual is there. Basic “demographic” data is there (number of words, types of things included – images, links, attachments, media, etc…), and if I combine those, I get something like this:

In this rough mockup, time is the vertical axis, transformed into a simple “number of days” integer. The horizontal axis is “threads of discussion.” The mockup shows the discussion in a “FAQ” discussion board used in the course. There were 9 primary threads (plus one forked thread).

Each circle represents a post. The size of the circle represents the number of words in a post or response. In this mockup, I just did a simple conversion where the number of words translates directly into the width of the circle (a post with 100 words is 1.00″, a post with 50 words is .50″, a post with 150 words is 1.50″, etc…). The colour of the circle indicates the person who posted it. White circles are the instructor. Black circles are anonymous students (who did not provide consent to participate in the research, so the content of their posts was deleted from my working archive), and other colours indicate individual students.
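For illustration, the mapping described above could be sketched in Python roughly like this. The mockup itself was drawn by hand, so everything here is an assumption made for the example: the `Post` fields, the function names, and the idea of keeping a `student_colours` lookup are all hypothetical. Only the scale (100 words to 1.00″) and the white/black colour convention come from the description.

```python
from dataclasses import dataclass
from datetime import date

WORDS_PER_INCH = 100  # mockup scale: a 100-word post is drawn 1.00" wide


@dataclass
class Post:
    thread: int      # lane on the horizontal axis (hypothetical field name)
    author: str      # "instructor", "anonymous", or a student label
    posted_on: date  # calendar date of the post
    words: int       # word count of the post


def day_number(posted_on: date, first_day: date) -> int:
    """Vertical position: whole days since the first day of discussion."""
    return (posted_on - first_day).days


def circle_width(words: int) -> float:
    """Circle width in inches, directly proportional to word count."""
    return words / WORDS_PER_INCH


def circle_colour(author: str, student_colours: dict) -> str:
    """White for the instructor, black for anonymous, else a per-student colour."""
    if author == "instructor":
        return "white"
    if author == "anonymous":
        return "black"
    return student_colours[author]
```

Each post then reduces to four numbers or labels (lane, day, width, colour), which is all a drawing tool needs to place the circle.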

This is a very rough mockup. I’m hoping to refine it a bit more, to include a way to represent the CoI coding for each message – an indicator of the relative social/cognitive/teaching aspect of the post, as well as a way to indicate other interesting things about a post (how many images/links/attachments/embedded media were included? etc…)

Problems with the mockup:

  1. It’s messy when posts occur close together. Overlap makes the circles obscure each other.
  2. The literal translation of wordcount to size means larger posts overwhelm the other posts in the diagram, in a way that over-represents the difference as seen in the actual discussion (a post that is 5x the size of another post doesn’t necessarily drown out the other posts, but it is given prominent emphasis in the diagram…)
  3. Forking of threads could get confusing – how to best indicate the branch points? I tried with a dotted line, but it’s unclear which post/circle it originates from…
  4. Threads that are displayed beside each other may not be directly related, but they may appear to be intertwined because of the overlap of circles (a large post in thread 6 overlaps threads 5 and 7, etc…)
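Part of problem 2 is geometric: when word count sets the width of a circle, the area grows with the square of the word count, so a post with 5x the words covers 25x the ink. One common alternative, which is not what the mockup does, is to scale the width by the square root of the word count so that area tracks words instead. A minimal sketch, with hypothetical function names and the same 100-words-per-inch reference point assumed:

```python
import math


def linear_width(words: float, words_per_inch: float = 100) -> float:
    """The mockup's rule: width in inches is proportional to word count."""
    return words / words_per_inch


def area_scaled_width(words: float, words_per_inch: float = 100) -> float:
    """Alternative rule: circle AREA is proportional to word count.

    A 100-word post is still 1.00" wide, but longer posts grow more slowly.
    """
    return math.sqrt(words / words_per_inch)


def circle_area(width: float) -> float:
    """Area of a circle with the given width (diameter)."""
    return math.pi * (width / 2) ** 2


if __name__ == "__main__":
    # Compare how much visual area a 500-word post gets relative to a 100-word post.
    for rule in (linear_width, area_scaled_width):
        ratio = circle_area(rule(500)) / circle_area(rule(100))
        print(f"{rule.__name__}: area ratio = {ratio:.1f}")
```

Narrower circles would also ease the overlap in problems 1 and 4, though they would not remove it when posts land on the same day.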

I’d like to extend the mockup, after figuring out ways to get around these issues, to show all posts in all discussions in the entire course. It should be interesting to see the temporal overlap between discussions, and see some data about patterns of interaction from participants across the entire thing – does a given participant start most threads? do they respond with giant posts? do they stay in one CoI aspect, or do they cover the whole thing? etc…

I would love to see a large visualization, with vertical lanes for each thread in an entire course, across all venues of online discussion, with posts displayed as shown above, and with the CoI coding indicated. What better way to compare activity across discussions in a course?

It strikes me that this visualization is extremely simple – perhaps too simple? perhaps so obvious in hindsight that someone else has already come up with a solution? Scott Leslie sent me a link to Boardtracker, which looks extremely interesting, but it looks like it’s strictly based on time and not threads, and doesn’t appear to handle representing individual contributions. Also, it appears to be under construction…

update: I was thinking about the overly-large-circle problem, and wondered what the diagram would look like if it was laid out more like an autoradiogram, with opacity of a block indicating the “size” of a contribution, and symbols overlaid to represent data like contributor and potentially coding info…

Size of contribution (wordcount) is the opacity of each block. The coloured circle represents the contributor (white is instructor, black is anonymous, etc…). This representation is harder to read at a glance, but probably displays the conversation more accurately.
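The wordcount-to-opacity step could be computed along these lines. This is only a sketch under assumptions: the linear scale relative to the longest post, the `floor` that keeps very short posts faintly visible, and the function name are all choices made for this example, not details of the diagram.

```python
def block_opacity(words: int, max_words: int, floor: float = 0.1) -> float:
    """Map a word count to an opacity between `floor` and 1.0.

    The longest post (max_words) is fully opaque; an empty post sits at the
    floor so its block does not vanish from the diagram entirely.
    """
    if max_words <= 0:
        return floor
    share = min(words, max_words) / max_words  # 0.0 .. 1.0
    return floor + (1.0 - floor) * share
```

Because every block has the same footprint, a long post can no longer spill into neighbouring threads; the cost is that differences in opacity are harder to judge by eye than differences in size.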

update 2: working in some of Tim’s suggestions via his comment, I came up with this version. It’s a little closer to Rosling’s work. Now, I need to figure out how to indicate the CoI coding for each post…

update 3: I put all of the metadata from the Blackboard discussions, and one WordPress site, into OmniGraphSketcher to see what it would look like. Some interesting things become apparent:

Blackboard posts (and responses) are circles, and WordPress posts (and comments) are diamonds. At a glance, discussion board interactions appear to be briefer (fewer words) and more immediate (posts usually occur within a few days, and then stop). Blog posts appear to be longer (more words), and extend the conversation over a longer period, with several days being common between post and comment. The WordPress blog posts also appear to have elicited longer responses via comments (at least in the first WordPress site I entered data for…)
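Those at-a-glance impressions could be checked against the same metadata with a small per-venue summary. The sketch below assumes a hypothetical record layout of `(venue, word_count, days_since_parent)`, where the last value is `None` for a post that starts a thread; the sample rows are made-up placeholders, not the course data.

```python
from collections import defaultdict
from statistics import median


def summarize_by_venue(records):
    """Median word count and median reply lag (in days) for each venue."""
    words = defaultdict(list)
    lags = defaultdict(list)
    for venue, word_count, days_since_parent in records:
        words[venue].append(word_count)
        if days_since_parent is not None:  # replies and comments only
            lags[venue].append(days_since_parent)
    return {
        venue: {
            "median_words": median(words[venue]),
            "median_reply_lag_days": median(lags[venue]) if lags[venue] else None,
        }
        for venue in words
    }


if __name__ == "__main__":
    # Placeholder rows purely to show the shape of the input.
    sample = [
        ("blackboard", 40, None),
        ("blackboard", 60, 1),
        ("wordpress", 300, None),
        ("wordpress", 100, 4),
    ]
    for venue, stats in summarize_by_venue(sample).items():
        print(venue, stats)
```

Medians are used here rather than means so that one very long post does not dominate the summary, which is the same concern as the oversized circles in the diagram.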

Visualization tools that may be useful:

  • SNAPP – works with major LMS applications, but appears to not like our old version of Blackboard (Bb8), and doesn’t grok WordPress, so couldn’t be used to visualize my entire data set.
  • Meerkat – sounds like it might support custom data imports. I’ve signed up for an account so I can try it out.
  • AGNA
  • DiscoverText