Visualizing trees using nested rectangles
Abstract:
Working with the attributes of large data trees with more than a few hundred entries can be unwieldy and tedious. Their visual representation is often space-wasting and uninformative.
A Treemap is produced by a simple recursive algorithm that draws coloured nested rectangles proportional to the attributes of your choosing. With it you can quickly examine the overall structure and attributes of your data.
Components:
- My obsession
  - Attended a conference
  - Played with it, no automation
  - Yuck, java
  - Make it in perl!
  - Slow start
  - Took a week off, and wrote it
  - Nathan Torkington, YAPC::NA, Boom, CPAN
- The problem with my data
  - Visibility
    - In the beginning, disks were 10 megabytes, and it was good. Now we have 300 GB drives. As my father says, garbage expands to meet the space available.
  - Scalability
    - dir /s, du -k, du the tree form, well, that's all fine and dandy, if you have around 100 files. I don't know about you, but my home directory contains 8 GB in 45,000 files.
Step 1
- Transforming my data
  - dir C:\*.* /s
  - du -k /
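As a rough illustration of this transformation step, the sketch below turns `du -k` output into a nested tree. It is written in Python purely for illustration and is not the Treemap module's interface; the function name `parse_du` and the `name`/`size`/`children` keys are assumptions made for this example. It assumes each line has the form `<kilobytes><whitespace><path>`, which is what `du -k` normally prints.

```python
import posixpath


def parse_du(text):
    """Build a nested dict tree from `du -k` output.

    Hypothetical helper for illustration only. Each node looks like
    {"name": path, "size": kilobytes, "children": [...]}.
    """
    nodes = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        kb, path = line.split(None, 1)      # split size from path once
        if len(path) > 1:
            path = path.rstrip("/")         # normalise "dir/" to "dir"
        nodes[path] = {"name": path, "size": int(kb), "children": []}

    root = None
    for path, node in nodes.items():
        parent = posixpath.dirname(path)
        if parent != path and parent in nodes:
            nodes[parent]["children"].append(node)
        else:
            root = node                     # no known parent: treat as root
    return root


sample = "4\t./docs/notes\n12\t./docs\n20\t./src\n40\t.\n"
tree = parse_du(sample)
print(tree["name"], tree["size"], [c["name"] for c in tree["children"]])
```

Plain `du -k` usually reports directories only, so a directory's size can exceed the sum of its children; the difference is the space taken by files directly inside it.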
Step 2
- Invoking the infant module
- Imager quirk on Sun
- Feeding data into the mapper
Step 3
- Understanding the output
- What the heck am I looking at?
- Examining different attributes of your data
  - Node size based on: X
  - Colouring based on: Y
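To make the idea of choosing X and Y concrete, here is a small Python sketch in which one attribute decides each node's share of the area and another decides its colour. Everything in it is hypothetical: the attribute names `kb` and `age_days`, the helper names, and the blue-to-red gradient are assumptions for illustration, not the module's behaviour.

```python
def area_shares(nodes, size_key):
    """Return each node's fraction of the total for the chosen size attribute."""
    total = sum(n[size_key] for n in nodes)
    return [n[size_key] / total for n in nodes]


def heat_colour(value, lo, hi):
    """Map value in [lo, hi] linearly onto a blue (low) to red (high) RGB tuple."""
    t = 0.0 if hi == lo else (value - lo) / (hi - lo)
    t = min(1.0, max(0.0, t))               # clamp out-of-range values
    return (round(255 * t), 0, round(255 * (1 - t)))


# Hypothetical leaf nodes: size drives area, age drives colour.
files = [
    {"name": "old.log", "kb": 300, "age_days": 400},
    {"name": "new.txt", "kb": 100, "age_days": 0},
]
shares = area_shares(files, "kb")
colours = [heat_colour(f["age_days"], 0, 400) for f in files]
for f, s, c in zip(files, shares, colours):
    print(f["name"], s, c)
```

Swapping the keys (for example, sizing by age and colouring by size) gives a different view of the same tree without touching the layout code.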
Understanding what happened
- Description of the two algorithms implemented
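This outline does not name the two algorithms, so the Python sketch below should not be read as either of them. It shows the classic "slice-and-dice" treemap layout as one simple instance of the recursive nested-rectangle idea: each node's rectangle is divided among its children in proportion to their size, alternating the split direction at each level. The function name and the `name`/`size`/`children` node shape are assumptions for this example.

```python
def slice_and_dice(node, x, y, w, h, vertical=True, out=None):
    """Assign every node a rectangle (x, y, w, h) inside its parent's rectangle.

    Children share the parent's rectangle in proportion to their "size".
    Returns a list of (name, x, y, w, h) tuples in pre-order.
    """
    if out is None:
        out = []
    out.append((node["name"], x, y, w, h))
    children = node.get("children", [])
    total = sum(c["size"] for c in children)
    if total <= 0:
        return out                          # leaf, or nothing to divide
    offset = 0.0
    for child in children:
        share = child["size"] / total
        if vertical:                        # cut the width, left to right
            slice_and_dice(child, x + offset, y, w * share, h, False, out)
            offset += w * share
        else:                               # cut the height, top to bottom
            slice_and_dice(child, x, y + offset, w, h * share, True, out)
            offset += h * share
    return out


tree = {"name": "root", "size": 128, "children": [
    {"name": "a", "size": 96, "children": [
        {"name": "a1", "size": 72},
        {"name": "a2", "size": 24},
    ]},
    {"name": "b", "size": 32},
]}
for rect in slice_and_dice(tree, 0, 0, 400, 300):
    print(rect)
```

In this sketch the children always fill the whole parent rectangle, so any space used by the parent itself (beyond its children) is not drawn. Slice-and-dice tends to produce long thin rectangles on deep trees, which is one reason other layout algorithms exist.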
Visit a website to crunch your data:
http://treemaps.uc.org/map.cgi
? Upload
(*) dir C:\*.* /s format
( ) du -k / format
References:
Perl Module:
http://search.cpan.org/~spditner/Treemap-0.2/
Perl Module Website:
http://orange.uc.org/read/treemaps
Treemap History:
http://www.cs.umd.edu/hcil/treemaps
Acknowledgements:
HCIL @ UMD
About the Author:
Code Listing:
- Examples of nasty data
- What can be done
Main topic: What is the central theme of your article? What is the one thing you want to tell the reader?
Significant insight into the overall structure of large data trees can be gained from a single screen-sized raster graphic using proportional nested rectangles and colour.
Your connection: How are you connected to the main topic? Are you the main developer?
I have created a Perl module implementing two Treemap algorithms to assist in my examination of a variety of large data sets.
Audience: Who should read your article? What background should they have? What do you expect them to already know?
Anyone who finds themselves paging through screens of tree data would find this useful. I expect them to know how to install Perl modules and how to transform their own data into XML or hash-reference trees.
Motivation: What will the reader learn, and why can't they learn the same thing elsewhere?
They will learn how to transform the output of 'du -k /' or 'dir c:\*.* /s' into a visual representation that reveals their system's directory structure, space utilization, and file type distribution down to the file level in a single image.
They will also learn how the Treemap module can easily be used to visualize other data sets such as company budgets, website activity, and newsgroup activity.