Saturday, April 4, 2009

Words, Tags, Bones and Trees: the Value of Clouds

I've looked at clouds from both sides now,
From up and down, and still somehow,
It's cloud illusions I recall,
I really don't know clouds, at all.

Another post on data in the clouds? No, not today. Instead, a presentation about extending word clouds into trees got me thinking about the value of presenting data in the shape of clouds. Clouds and trees and... fishbone diagrams. For those new to word clouds, it's time to try a tool for visualizing text and the possibilities it opens up for understanding and mining that text. In that vein, I enjoyed a presentation (linked below) by Gambette and Veronis, not because they offer some fairly complicated home-grown software for generating a version of tag clouds via tree structures, but because they explain the value of analyzing text through cloud structures.

In teaching, simple tag/word cloud tools like Wordle or IBM's Many Eyes provide a way to visualize a student's own text or another author's. For visual learners, this can offer insight into, and a synthesis of, ideas that might otherwise prove elusive. As the presentation below describes, tag clouds can support objective literature analysis, discourse analysis, text mining for meaning, or an exploration of natural language processing, which the authors describe as text disambiguation. That is more complicated than most of us need when a picture is worth 1,000 words, but first one needs access to the picture.
Wordle: TagClouds
So explore Wordle first with a favorite piece of your own text (as I did with this blog post, at right; click on the thumbnail to view it). It's as simple as copying and pasting into the Wordle window. See if it doesn't provide immediate insight into the text and a new option for exploring meaning. Then, wander over to Many Eyes and explore the many, many visual and structural options for doing the same. Both are free to use. If you have ideas for using them in your teaching, post them here and I'll share them in my work. For a more detailed and scholarly look at the use of cloud analysis, follow along in the Slideshare presentation: Visualising a text with a tree cloud
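For the curious, the counting behind a basic word cloud can be sketched in a few lines of Python. This is only my rough illustration of the idea (drop common words, count the rest, size each word by how often it appears), not how Wordle or Many Eyes actually work internally. The function name `word_weights`, the tiny stop list, and the font-size range are all assumptions I made up for the example.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop list; real tools use much longer ones.
STOP_WORDS = {
    "a", "an", "and", "the", "of", "to", "in", "at", "from", "all",
    "i", "i've", "it", "it's", "don't", "now", "up", "down", "still",
}


def word_weights(text, top_n=10, min_size=10, max_size=48):
    """Return (word, count, font_size) tuples for the most frequent words."""
    # Lowercase the text and pull out words, keeping contractions together.
    words = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS)
    top = counts.most_common(top_n)
    if not top:
        return []
    highest = top[0][1]
    # Scale font size linearly with the count; the top word gets max_size.
    return [
        (word, count, min_size + (max_size - min_size) * count // highest)
        for word, count in top
    ]


if __name__ == "__main__":
    lyric = (
        "I've looked at clouds from both sides now, "
        "From up and down, and still somehow, "
        "It's cloud illusions I recall, "
        "I really don't know clouds, at all."
    )
    for word, count, size in word_weights(lyric, top_n=5):
        print(f"{word:<12} count={count} size={size}")
```

Run on the lyric at the top of this post, "clouds" comes out as the biggest word, which is about what you would expect. A real tool then handles the hard part: laying the words out attractively.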

1 comment:

Philippe said...

Here are two tree clouds of your blog post, if you want to embed them: one where color reflects frequency, and another where color reflects average position (red for words near the beginning of the blog post, blue for words near the end). Admittedly, using the command-line version of TreeCloud is not very easy, although it is not that complicated either: "C:\Python26\Python.exe" C:\Treecloud\Treecloud.py stoplist=C:\Treecloud\StoplistEnglish.txt minnb=2 distance=hyperlex unit=1 color=chronology C:\Treecloud\Examples\GridKnowledge.txt. I'm preparing a graphical interface for people allergic to Linux or MS-DOS.