Research in a Connected World

Visualization Matters

Module by: Martin Turner. Edited by: Alex Voss, Elizabeth Vander Meer

Key Concepts

  • Themes in the science of visualization
  • Simulation models
  • Visualization tools – graphs created using Excel and MATLAB
  • Distributed visualization
  • Metadata and Paradata for scientific visualization

Introduction

"We don’t see with our eyes. We see with our brains", Paul Bach-y-Rita.

In the last thirty years computer-based visualization has moved from an informal, ad hoc tool for producing particular results to a science in its own right. Universal generalisations, specifications and best-practice guidelines are now available. Visualization methods are studied as a topic within courses and modules at all levels, from undergraduate to postgraduate, and visualization is the basis of numerous PhD titles, research projects and programmes, funded across all the research councils and the HE/FE infrastructure funding agencies. This research and development has produced a large toolkit for general use, as well as individual methodologies for specialist user data sets, and has helped in understanding the barriers between the computer display and the human visual system. Visualization, it should be emphasised, is as much about gaining investigative insight as it is about enhancing presentations to tell a clearly specified story.

The science of visualization is commonly split into three themes. Information visualization studies methods for representing large-scale collections of often non-numerical information, together with recommendations for graphical techniques that aid the analysis of data. Scientific visualization, the second theme, developed from earlier, often natural and experimental, methods of displaying data; it has seen an explosion of users due to the deluge of in-silico experimental data (e.g. supercomputing and high-throughput computing results) as well as real experimental capture equipment (e.g. 3D medical scanners, climate sensor data and astrophysical telescopes). Results often mimic reality, for example virtual wind-tunnel visualizations, but can be abstract, for example visualizing six-dimensional tensor components using different geometric shapes (as in Figure 1). Visual analytics, the third theme, merges the first two to focus on the user's analytical reasoning; it often involves interactive visual interfaces and commonly employs data-mining techniques as well as combining data across different databases.

This chapter introduces examples within these visualization themes, first providing an overview of simulation models and then specific examples from the creation of graphs using popular tools such as Excel and MATLAB. It then moves on to present the complexities of distributed visualization, as well as the role of adding metadata and paradata.

Figure 1: Visualization examples: information visualization example showing the content of the ½ million files on my hard disc (Sequoiaview); and two scientific visualizations, the first showing climate modelling using various animated glyphs to show flow strength (Avizo); and the second a selection of interactive superquadric glyphs selecting various forms from the six dimensions available within tensor stress components (AVS/Express).
(a) (b) (c)
Figure 1(a) (graphics1.png)Figure 1(b) (graphics2.jpg)Figure 1(c) (graphics3.png)

The Human Visual System: The User is Key

Will Schroeder et al., in The Visualization Toolkit (Schroeder et al. 1998), stated: “informally visualisation is the transformation of data or information into pictures. Visualisation engages the primal human sensory apparatus, vision, as well as the processing power of the human mind. The result is a simple and effective medium for communicating complex and/or voluminous information.” Because it draws on the considerable processing power of the human visual system, which accounts for roughly a third of the total brain size, visualization has been shown to be one of the best, and sometimes the only, way of conveying a huge amount of data in a short period of time. One of the key reasons visualization emerged as a field of study in its own right was the rapid increase in the quantity of data produced by supercomputer simulations of physical, natural and theoretical problems. This has been termed the data-deluge problem; the volumes involved are frequently so large that graphical representations offer the only viable way to assimilate the data.

The simulation models themselves have also been increasing in complexity, involving large numbers of independent and dependent variables whose relationships need to be understood. In climate modelling, for example, we may wish to explore how temperatures, water vapour content, pressure, and wind directions and velocities vary within a 3D region over time, all at once. The process of visualization is therefore concerned both with ways to represent the data and with tools for interactively exploring these multidimensional, multivariate models. One early active research area was linking the visualization process with interactive control of the simulations themselves, opening up completely new possibilities for interactive exploration and understanding of complex phenomena. Over the years a number of visualization systems have emerged which provide a framework for this kind of model exploration.
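As a minimal sketch of this multidimensional, multivariate exploration problem, the toy code below stores a "climate" field keyed by (x, y, z, t) grid coordinates and extracts a 2D slice at fixed height and time for display. The field function and all names are invented for illustration, not taken from any real climate model.

```python
def temperature(x, y, z, t):
    """Stand-in for a simulation output: temperature at one grid point."""
    return 15.0 + 0.5 * x - 0.3 * z + 0.1 * t

# Build a small 4D grid: 4 x 4 horizontal points, 3 heights, 2 time steps.
field = {(x, y, z, t): temperature(x, y, z, t)
         for x in range(4) for y in range(4)
         for z in range(3) for t in range(2)}

# Interactive exploration amounts to choosing which dimensions to fix:
# here the surface (z = 0) at the first time step (t = 0) is sliced out,
# leaving a 2D map that a plotting tool could display directly.
surface_now = {(x, y): v for (x, y, z, t), v in field.items()
               if z == 0 and t == 0}
```

A real visualization system would let the user drag the fixed z and t values and redraw the slice, which is exactly the kind of interactive control of a running model described above.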

Visualization Tools: Evaluating a Graph

Plenty of literature and course notes are now available, but as a simple example a few rules on how to create an effective graph are presented next. A graph should present a reasonable amount of data, say something about the behaviour of that data, and avoid giving a false impression of it. In other words, the graph must communicate something. Tukey (1977, pp. 128, 157) said there cannot be too much emphasis on our need to see behaviour: “Graphs force us to note the unexpected; nothing could be more important”.

Excel and MATLAB are two of the most popular visualization tools currently in use, even though users may not think of them as such. They produce numerous 2D and 3D graphs of different sizes and dimensions, but the visualization choices involved are rarely thought about. Figure 2 shows two MATLAB views of a simple formula, y = (x-1)^-2 + 3(x-2)^-2. Both show the same numerically sampled data, but the second, by cropping the y-axis to a limited range [0:50], could be said to present a large amount of extra information by highlighting an important area. This process has been termed focus-and-context zoom interaction.

Figure 2: The default and a cropped version of the numerical evaluation of the MATLAB plot command for the formula y = (x-1)^-2 + 3(x-2)^-2.
(a) (b)
Figure 2(a) (graphics4.jpg)Figure 2(b) (graphics5.jpg)
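The cropping in Figure 2(b) can be sketched in a few lines. This is a hedged illustration in Python rather than MATLAB: the function below samples y = (x-1)^-2 + 3(x-2)^-2 and then keeps only the points inside a limited y-range, mimicking what restricting the axis to [0:50] does.

```python
def f(x):
    """The example formula y = (x-1)^-2 + 3(x-2)^-2 from Figure 2."""
    return (x - 1) ** -2 + 3 * (x - 2) ** -2

# Sample x in [-1, 4], skipping the poles at x = 1 and x = 2
# where the formula is undefined.
xs = [i / 100 for i in range(-100, 401) if i not in (100, 200)]
ys = [f(x) for x in xs]

# Cropping the y-axis to [0, 50] discards the huge spikes near the poles,
# so the behaviour of the curve away from them becomes visible: the
# focus-and-context idea described in the text.
visible = [(x, y) for x, y in zip(xs, ys) if 0 <= y <= 50]
```

In MATLAB the same crop would be a single axis-limit command after plotting; the point is that the sampled data never changes, only the chosen window onto it.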

Users make choices about the data to be used and its visualization, and these affect both the quality and the quantity of the information presented. Good visualization requires graphical integrity and there are many standard techniques to help quantify and qualify between different versions. As a short exercise three well used simple objective tests (adapted from Tufte (2001)) are presented here as applied to the data shown in Figure 3.

  • Objective Test 1: The Lie Factor quantifies how a graphic exaggerates the variation in the data, which can cause misleading interpretations. The variation in height between the largest and smallest bars in the left-hand graph is (74 - 56) / (63 - 56) = 2.57, whereas the variation in the data is 74 / 63 = 1.17. The difference between these two numbers indicates how much more extreme the variation appears visually in the left-hand graph.
  • Objective Test 2: Data Ink is the non-erasable core of a graphic, the ink that carries non-redundant information. The horizontal grid lines, tick marks and the frame around the graph are all erasable, and can be removed within reason, as they may distract more than guide the observer.
  • Objective Test 3: Data Density is the ratio of the number of entries in the data matrix to the area of the data graphic. In this case, removing the frames slightly increases the data density of the right-hand graph.
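Objective Test 1 can be written down as a small calculation. The sketch below uses the seal-dive numbers quoted above (axis baseline 56, smallest bar 63, largest bar 74); the function names are my own, not Tufte's.

```python
def visual_variation(largest, smallest, axis_start):
    """Ratio of bar heights as drawn above a truncated axis baseline."""
    return (largest - axis_start) / (smallest - axis_start)

def data_variation(largest, smallest):
    """Ratio of the underlying data values themselves."""
    return largest / smallest

# Numbers from the Figure 3 bar chart: baseline 56, bars 63 and 74.
shown = visual_variation(74, 63, 56)   # about 2.57
actual = data_variation(74, 63)        # about 1.17
```

Whenever `shown` is noticeably larger than `actual`, the truncated baseline is making the differences between bars look more dramatic than the data warrants, which is exactly the misleading impression the Lie Factor test is designed to catch.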

Figure 3: An Excel bar chart example showing the statistics of the number of dives for a Female Elephant Seal in early February 1991 (numbers and example adapted from Tufte 1997). The graph on the left is the default format.
(a) (b)
Figure 3(a) (graphics6.png)Figure 3(b) (graphics7.png)

It is both possible and recommended to try these kinds of tests, and many others, on any images, including those found in national newspapers.

Distributed Visualization: Massive Datasets

The visualization of large datasets has become a key bottleneck in applications where results, or data acquired from scientific equipment, must be validated at an early stage. Such validation allows the correctness of methods (such as the set-up of a physical experiment) to be determined before further computational or imaging machine resources are spent. Datasets can far exceed the capabilities of modern graphics hardware (GPUs), so visualization systems are turning to parallel compute facilities to render them.

Figure 4 shows a use case of a system currently under development. Multiple render processes each render a small section of a decomposed dataset (right-hand side). In this case the GPU output from each render process is shown; usually these windows are hidden and only the composite image on the left is displayed, but showing them conveys the idea of distributed rendering, with the final composited image viewable by the user. This final real-time interactive image can be transmitted across the internet at usable rates (in our experience about 15 frames per second between countries), to be displayed within an application window (as shown), within a portal, or within a Virtual Learning or Research Environment. There is currently no national visualization service in the UK, but various test services exist within JISC and research-council funded projects, including on the National Grid Service (NGS, http://www.ngs.ac.uk/), and two initiatives currently running on the UK national supercomputing service (HECToR, http://www.hector.ac.uk/) are leading the way.
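The decompose-render-composite pattern in Figure 4 can be sketched in miniature. In this toy version a trivial intensity function stands in for a real raycaster, four "render processes" each produce one tile of the image, and the tiles are stitched back into a single composite; all function names are illustrative, not from the MRBV project.

```python
def render_tile(x0, y0, w, h):
    """Pretend render process: one greyscale tile of the full image.
    A real system would raycast its section of the decomposed volume."""
    return [[(x + y) % 256 for x in range(x0, x0 + w)]
            for y in range(y0, y0 + h)]

def composite(tiles, tile_w, tile_h, cols):
    """Stitch row-major tiles back into a single 2D image."""
    rows = len(tiles) // cols
    image = []
    for tr in range(rows):
        for y in range(tile_h):
            row = []
            for tc in range(cols):
                row.extend(tiles[tr * cols + tc][y])
            image.append(row)
    return image

# Four render processes in a 2 x 2 layout, each producing a 4 x 4 tile.
tiles = [render_tile(tx * 4, ty * 4, 4, 4)
         for ty in range(2) for tx in range(2)]
image = composite(tiles, 4, 4, cols=2)
```

The check that matters is that the composite is pixel-identical to rendering the whole domain in one process; in the real system only this composited image travels over the network to the viewer.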

Figure 4: End-user volume viewer application (left) displays a composited image from raycasting volume rendering processes running in parallel on the four cluster nodes (right). AVS/Express pre-test version for the MRBV (Massive Remote Batch Visualizer) project running on a CRAY XT4.
Figure 4 (graphics10.png)

Making Choices: Metadata and Paradata

Rules can be broken with the addition of appropriate metadata, and this has been known for a long time. Adding good metadata, including all forms of annotation, is very important, even if it takes time and careful thought. Metadata can include details describing the source of the data, the methods used to pre-process the data and create the visualization, the contact details of the author, the creation date, and so on. Tools have recently been developed to help record this process. These include software developed within e-Science that creates a set of middleware linking computing resources together, adds semantic tags which give meaning to these components, and creates ontologies describing how human terms relate to computer terms.

One proposal is to add paradata, which extends the concept of metadata to cover issues of choice and alternatives by recording the subjective decisions made. For example, Figure 5 shows fourteen different visualization variations of a simple vortex fluid-flow data set. Often only one or two images will be used to illustrate a specific scientific phenomenon, yet the decisions behind them are rarely considered in detail, and it is even rarer for the reasons a particular version was chosen to be written down. Paradata would allow, and even oblige, authors to describe the reasons for their choices.

It is said that an image is worth a thousand words, but we can rephrase this: a good visualization may need a thousand words of annotation, in both metadata and paradata, to be properly described.

Figure 5: Fourteen different versions of scientific visualizations for the same data flow field (McDerby 2007).
Figure 5 (graphics11.png)

Two solutions that address these issues are recordable and shareable workflows (myExperiment) and the controlled recording of researchers' choices to create a visualization provenance (VisTrails). These and similar tools are becoming more widely available within VREs (Virtual Research Environments), which are already adopting collaborative environments, with an emphasis on the generic Web 2.0 principle of being able to store and annotate everything.

Conclusions: “Lying” with Visualizations

It is often said that you can lie with statistics; you can equally lie with visualizations. This is especially true because visualizations can not only be selective in their choice of data but, since they engage the human visual system, can also create visual illusions. Often this is not deliberate but accidental, caused by authors who have space for only a few visualizations and make quick, possibly uninformed, decisions.

We have presented a few very simple examples of how small changes can improve the presentation of information, and warned that without defining and describing the choices made, through metadata and possibly paradata, confusion can arise. Fortunately, methods to help with this process are now starting to be introduced, although more need to be actively used, tested and developed.

Acknowledgement

At the University of Manchester, Research Computing Services, starting with the Computer Graphics Unit, has for over 30 years been considering how to create and present visual stimuli efficiently, and is still learning the best ways to integrate and transfer information from computer source to human user. Thanks to all those who have indirectly contributed ideas to this short article from numerous sources (including the MSc module taught at Manchester). Readers are encouraged to explore further, as this article has barely covered the topic. Dr Martin J. Turner, the University of Manchester, and part of the JISC-funded national vizNET support network, Martin.Turner@manchester.ac.uk.

References

Schroeder, W., Martin, K. and Lorensen, B. (1998) The Visualization Toolkit, 2nd edition. Prentice Hall.

Tukey, J. W. (1977) Exploratory Data Analysis. Addison-Wesley, Reading, MA.

Tufte, E. R. (2001) The Visual Display of Quantitative Information, 2nd edition. Graphics Press, Cheshire, CT.

Tufte, E. R. (1997) Visual Explanations: Images and Quantities, Evidence and Narrative. Graphics Press, Cheshire, CT.
