
Importance-Driven Time-Varying Data Visualization
Chaoli Wang, Hongfeng Yu, Kwan-Liu Ma
University of California, Davis
Importance-Driven Volume Rendering

[Viola et al. 04]


Differences

• Importance-driven volume rendering [Viola et al. 04]
  • Medical or anatomical data sets
  • Pre-segmented objects
  • Importance assignment
  • Focus on rendering
• This work
  • Time-varying scientific data sets
  • No segmentation or pre-defined objects are given
  • Importance measurement
  • Focus on data analysis
Questions

• How to capture the important aspects of the data?


• Importance – amount of change, or “unusualness”
• How to utilize the importance measure?
• Data classification
• Abnormality detection
• Time budget allocation
• Time step selection
Related Work

• Time-varying data visualization


• Spatial and temporal coherence
[Shen et al. 94, Westermann 95, Shen et al. 99]
• Compression, rendering, presentation
[Guthe et al. 02, Lum et al. 02, Woodring et al. 03]
• Transfer function specification
[Jankun-Kelly et al. 01, Akiba et al. 06]
• Time-activity curve (TAC) [Fang et al. 07]
• Local statistical complexity (LSC) [Jänicke et al. 07]
Importance Analysis

• Block-wise approach
• Importance evaluation
• Amount of information a block contains by itself
• New information w.r.t. other blocks in the time series
• Information theory
• Entropy
• Mutual information
• Conditional entropy
Information Theory

• Entropy
$H(X) = -\sum_{x \in X} p(x) \log p(x)$

• Mutual information
$I(X;Y) = \sum_{x \in X} \sum_{y \in Y} p(x,y) \log \frac{p(x,y)}{p(x)\,p(y)}$
• Conditional entropy
$H(X \mid Y) = H(X) - I(X;Y)$

p(x), p(y): Marginal probability distribution functions
p(x,y): Joint probability distribution function
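
A minimal NumPy sketch (not the authors' code) of the three quantities above, computed from a joint histogram; using log base 2 is an assumption and gives values in bits.

```python
import numpy as np

def entropy(p):
    """H(X) = -sum_x p(x) log p(x); empty bins contribute nothing."""
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def mutual_information(p_xy):
    """I(X;Y) from a joint distribution p(x,y) stored as a 2D array."""
    p_x = p_xy.sum(axis=1)                      # marginal p(x)
    p_y = p_xy.sum(axis=0)                      # marginal p(y)
    mask = p_xy > 0
    ratio = p_xy[mask] / np.outer(p_x, p_y)[mask]
    return np.sum(p_xy[mask] * np.log2(ratio))

def conditional_entropy(p_xy):
    """H(X|Y) = H(X) - I(X;Y)."""
    return entropy(p_xy.sum(axis=1)) - mutual_information(p_xy)

# Example: normalize a table of joint bin counts into p(x,y).
counts = np.array([[8.0, 2.0], [1.0, 9.0]])
p_xy = counts / counts.sum()
print(entropy(p_xy.sum(axis=1)), mutual_information(p_xy), conditional_entropy(p_xy))
```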
Relations with Venn Diagram

[Venn diagram: H(X) and H(Y) overlap in I(X;Y); the non-overlapping parts are H(X|Y) and H(Y|X)]

I(X;Y) = I(Y;X), but H(X|Y) ≠ H(Y|X)


Entropy in Multidimensional Feature Space

• Feature vector F = (f1, f2, f3, …)
  • Data value
  • Gradient magnitude or other derivatives
  • Domain-specific quantities
• Multidimensional histogram
  • Use the normalized bin count as probability p(x)

$H(X) = -\sum_{x \in X} p(x) \log p(x)$
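
A sketch of the per-block entropy described above, assuming `feature_vectors` is an (n_voxels, 3) array holding F = (f1, f2, f3) for one data block; the bin counts are illustrative, not the paper's settings.

```python
import numpy as np

def block_entropy(feature_vectors, bins=(16, 16, 16)):
    # Multidimensional histogram over the feature space.
    counts, _ = np.histogramdd(feature_vectors, bins=bins)
    # Normalized bin counts serve as the probability p(x).
    p = counts.ravel() / counts.sum()
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

# e.g., block_entropy(np.random.rand(5000, 3)) for a synthetic block
```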
Importance in Joint Feature-Temporal Space

• Consider two data blocks X and Y, with feature vectors F = (f1, f2, f3, …), at
  • the same spatial location
  • neighboring time steps
• Use joint feature-temporal histogram
• Use the normalized bin count as probability p(x,y)
• Run-length encode the histogram
$I(X;Y) = \sum_{x \in X} \sum_{y \in Y} p(x,y) \log \frac{p(x,y)}{p(x)\,p(y)}$
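
A simplified sketch of the joint feature-temporal histogram: each voxel's feature vector in blocks X and Y (same location, neighboring time steps) is quantized to a bin id, and the 2D histogram of (id_X, id_Y) pairs gives p(x,y). The run-length encoding mentioned above is omitted, and the quantization scheme and bin count are my assumptions.

```python
import numpy as np

def quantize(features, bins=8):
    """Map each voxel's feature vector to a single flat histogram bin id."""
    lo, hi = features.min(axis=0), features.max(axis=0)
    span = (hi - lo) + 1e-12                      # avoid division by zero
    idx = np.minimum(((features - lo) / span * bins).astype(int), bins - 1)
    return np.ravel_multi_index(idx.T, (bins,) * features.shape[1])

def joint_distribution(block_x, block_y, bins=8):
    """Joint feature-temporal histogram of two blocks, normalized to p(x, y)."""
    ids_x = quantize(block_x, bins)
    ids_y = quantize(block_y, bins)
    n = bins ** block_x.shape[1]
    counts = np.zeros((n, n))
    np.add.at(counts, (ids_x, ids_y), 1.0)        # one count per voxel pair
    return counts / counts.sum()
```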
Importance Value Calculation

• Consider a time window for neighboring blocks

• Importance of a data block Xj at time step t:


$A_{X_j,t} = \sum_{i=1}^{M} w_i \, H(X_{j,t} \mid Y_{j,i})$

• Importance of time step t:


$A_t = \sum_{j=1}^{N} A_{X_j,t}$
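
A sketch of the importance sums above, reusing the `entropy`, `mutual_information`, and `joint_distribution` helpers from the earlier sketches; uniform weights w_i are an assumption, not the paper's choice.

```python
import numpy as np

def block_importance(block_t, neighbor_blocks, weights=None, bins=8):
    """A_{X_j,t}: weighted sum of H(X_{j,t} | Y_{j,i}) over the time window."""
    M = len(neighbor_blocks)
    w = np.full(M, 1.0 / M) if weights is None else np.asarray(weights, float)
    total = 0.0
    for w_i, block_i in zip(w, neighbor_blocks):
        p_xy = joint_distribution(block_t, block_i, bins)
        h_x = entropy(p_xy.sum(axis=1))                     # H(X_{j,t})
        total += w_i * (h_x - mutual_information(p_xy))     # H(X_{j,t} | Y_{j,i})
    return total

def time_step_importance(block_importances):
    """A_t: sum of block importances over the N blocks of one time step."""
    return float(np.sum(block_importances))
```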
Importance Curve – Earthquake Data Set

[Importance curve: importance (I) over time (T) shows a regular pattern]
Importance Curve – Climate Data Set

[Importance curve: importance (I) over time (T) shows a periodic pattern]
Importance Curve – Vortex Data Set

[Importance curve: importance (I) over time (T) shows a turbulent pattern]
Clustering Importance Curves

• Hybrid k-means clustering [Kanungo et al. 02]


• Lloyd’s algorithm
• Local search by swapping centroids
• Avoid getting trapped in local minima
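
For illustration only: the paper uses the hybrid k-means of [Kanungo et al. 02] (Lloyd's algorithm plus a centroid-swapping local search), while this stand-in uses scikit-learn's standard k-means just to show the data layout. `curves` is assumed to be an (n_blocks, n_time_steps) array, one importance curve per block.

```python
import numpy as np
from sklearn.cluster import KMeans

def cluster_importance_curves(curves, n_clusters=4, seed=0):
    """Group blocks whose importance curves behave similarly over time."""
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    labels = km.fit_predict(np.asarray(curves))   # cluster id per block
    return labels, km.cluster_centers_            # representative curves
```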
Clustering All Time Steps vs. Time Segments

• 599 time steps, grouped into 50 segments
• 1200 time steps, grouped into 120 segments
• 90 time steps, 90 segments
Cluster Highlighting – Earthquake Data Set
Cluster Highlighting – Hurricane Data Set
Cluster Highlighting – Climate Data Set
Cluster Highlighting – Vortex Data Set
Cluster Highlighting – Combustion Data Set
Abnormality Detection

A: El Niño
B: La Niña
Time Budget Allocation

• Allocate time budget based on importance value


$\Delta_t = \Delta \cdot \frac{A_t}{\sum_{i=1}^{T} A_i}$

• Animation time
• Non-even allocation
• Rendering time
• Assign to each time step (and each block in a time step)
• Adjust the sampling spacing accordingly
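
A sketch of the allocation rule above: each time step receives a share of the total budget (animation seconds, or rendering samples) proportional to its importance A_t. Variable names are mine, not the paper's.

```python
import numpy as np

def allocate_budget(importance, total_budget):
    """Split total_budget across time steps in proportion to A_t."""
    a = np.asarray(importance, dtype=float)
    return total_budget * a / a.sum()

# e.g., seconds_per_step = allocate_budget(A, 60.0) for a 60-second animation
```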
Time Step Selection

• Uniform selection
• Importance-driven selection
• Select the first time step
• Partition the remaining time steps into (K-1) segments
• In each time segment, select the time step that maximizes the joint entropy with the time steps already selected
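
A simplified sketch of the selection loop: keep the first time step, split the rest into (K-1) segments, and in each segment pick the highest-scoring candidate. As an assumption, the score here is the conditional entropy with respect to the most recently selected time step (a cheap stand-in for maximizing the joint entropy of the whole selection); `cond_entropy` is a user-supplied function.

```python
import numpy as np

def select_time_steps(n_steps, k, cond_entropy):
    """Pick k representative time steps out of n_steps (importance-driven)."""
    selected = [0]                                     # always keep the first step
    for segment in np.array_split(np.arange(1, n_steps), k - 1):
        scores = [cond_entropy(t, selected[-1]) for t in segment]
        selected.append(int(segment[int(np.argmax(scores))]))
    return selected
```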
Precomputation and Clustering Performance

• The test data sets with their parameter settings, sizes of joint feature-temporal histograms,
and timings for histogram calculation.

• Timing for clustering all time steps of the five test data sets.
Choices of Window and Bin Sizes

• The importance curve of the vortex data set with different time window sizes (W) and
numbers of bins for feature components F = (f1, f2, f3).
Choices of # of Clusters and Block Size

[Panels vary the number of clusters (3, 4, 5) and the block size (50×50×20, 20×20×20, 10×10×20)]

• The cluster of the highest importance values under different choices of number of clusters
and block size. Top row: color adjustment only. Bottom row: color and opacity adjustment.
Artifacts Along Block Boundaries

[Block sizes 20×20×20 and 10×10×20]
Summary

• Importance-driven data analysis and visualization


• Quantify data importance using conditional entropy
• Cluster the importance curves
• Leverage the importance in visualization
• Limitations
• Block-based classification
• Size of joint feature-temporal histogram
• Extensions
• Non-uniform data partition
• Incorporate domain knowledge
• Dimension reduction
Acknowledgements

• NSF
• CCF-0634913, CNS-0551727, OCI-0325934, OCI-0749227, and OCI-0749217
• DOE SciDAC Program
• DE-FC02-06ER25777, DE-FG02-08ER54956, and DE-FG02-05ER54817
• Data sets
• Combustion: Jacqueline H. Chen, SNL
• Climate: Andrew T. Wittenberg, NOAA
• Earthquake: CMU quake group
• Hurricane: NSF, IEEE Visualization 2004 Contest
