
Style Space: How to compare image sets and follow their evolution (part 1)


text by Lev Manovich (August 4-6, 2011). Projects and visualizations are collaborations by members of Software Studies Initiative (credits appear under the images on Flickr). Batch image processing software: Sunsern Cheamanunkul and Jeremy Douglass. ImagePlot visualization software: Lev Manovich, Jeremy Douglass, Nadia Xiangfei Zeng. ImagePlot documentation: Tara Zepel. Statistical analysis of manga images and data: Sunsern Cheamanunkul, Bertrand Grandgeorge, Lev Manovich. Research described in this article was supported by Calit2 UCSD Division, Center for Research in Computing and the Arts (CRCA), NEH Office of Digital Humanities, and National University of Singapore.

AN EXAMPLE: VAN GOGH'S PARIS AND ARLES PAINTINGS

Let's start with an example. We want to compare van Gogh paintings created when the artist lived in Paris (1886-1888) and in Arles (1888). We have digital images of most of the paintings done by the artist in these two places: 199 for Paris, and 161 for Arles. (We did not include the paintings done after the ear incident which took place at the end of 1888 - although van Gogh continued to be in Arles for a few months, he was in and out of the hospital and his productivity was severely diminished.)

The following visualizations project each of the image sets into the same coordinate space. X-axis represents average brightness; Y-axis represents average saturation. (We use median rather than mean since it is less affected by outlier values. The measurements are done with a free open source digital image analysis application ImageJ.)

Here are Paris paintings:
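Before looking at the plots, the measurement step described above can be sketched in a few lines. This is only an illustration, not the ImageJ procedure itself: it assumes each image is available as a list of (r, g, b) pixel values, takes brightness and saturation from the HSV color model, and rescales both to 0-255. The function name and the toy pixel data are hypothetical.

```python
import colorsys
from statistics import median

def median_brightness_saturation(pixels):
    """Return (median brightness, median saturation) on a 0-255 scale.

    `pixels` is an iterable of (r, g, b) tuples with channels in 0-255.
    Brightness and saturation are taken from the HSV color model.
    """
    brightness = []
    saturation = []
    for r, g, b in pixels:
        _hue, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        brightness.append(v * 255.0)
        saturation.append(s * 255.0)
    return median(brightness), median(saturation)

# A tiny hypothetical "painting": two dark brown pixels and one bright yellow one.
toy_painting = [(60, 40, 20), (70, 50, 30), (250, 220, 40)]
x, y = median_brightness_saturation(toy_painting)
print(f"X (brightness median) = {x:.1f}, Y (saturation median) = {y:.1f}")
```

Each painting then becomes one (X, Y) point, and the single bright pixel in the toy data does not pull the median the way it would pull a mean.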


And here are Arles paintings:

Projecting sets of paintings done in two places into the same coordinate space allows us to better see the similarities and differences between the two periods on brightness/saturation dimensions. We see the parts of the space of visual possibilities explored in each period. We also see the relative distributions of their works: the more dense and the more sparse areas, the presence or absence of clusters, the outliers, etc.

Arles paintings are much less spread out than Paris paintings. Their cluster is higher and to the right of the cluster formed by Paris paintings (higher saturation, higher brightness). But these are not absolute differences. The two clusters overlap significantly. In other words: while some Arles paintings are exploring a new visual territory, others are not. Traces of van Gogh's earlier pre-Paris styles are also still visible: a significant number of Paris paintings and a number of Arles paintings are quite dark (left quarter of each visualization).

STYLE SPACE: DEFINITION

A style space is a projection of quantified properties of a set of cultural artifacts (or their parts) into a 2D plane. X and Y represent the properties (or their combinations). The position of each artifact is determined by its values for these properties.

Since the rest of this discussion deals with images, we can rephrase this definition as follows: A style space is a projection of quantified visual properties of images into a 2D plane. Visual differences are translated into spatial distances. Images which are visually similar will be close; images which are different will be further away.

In the example above, the two visualizations compare van Gogh's Paris and Arles paintings according to their average brightness and average saturation.

Here is another example of a style space concept application. We compare 128 paintings by Piet Mondrian (1905-1917) and 151 paintings by Mark Rothko (1944-1957). The two image visualizations are placed side by side, so they share the same X axis. X axis represents average brightness, and Y axis represents average saturation. (X-axis: brightness mean. Y-axis: saturation mean.) (For a discussion of this example, see Mondrian vs Rothko: footprints and evolution in style space.)

We are not claiming that such representations can capture all aspects of a visual style. Of course, two or three properties can't capture all the aspects of a visual style. A "style space" representation is a tool for exploring image sets. (It is particularly effective for large sets.) It allows us to compare all images in a set (or sets) according to their visual values. Separating a "style" into distinct visual dimensions and organizing images according to their values on these dimensions allows us to see more clearly the differences between the images in a set.

Since images have many different visual properties, we can create many 2D visualizations, each using a different combination of visual properties. We can also use three visual properties to map images in a three-dimensional space.

Now, consider a style space where min and max of each axis are set to the smallest and biggest possible visual values. All images which were already created, and all possible images which can be created in the future, will lie within the boundaries set by these min and max values.

To illustrate this, we placed a set of specially created black and white images in a simple style space (X-axis = brightness mean, Y-axis = brightness standard deviation):

Because brightness mean and brightness standard deviation variables are correlated, all possible images will lie within a half ellipse, defined by these coordinates: 0, 0 (left); 127.5, 126.6 (top); 255, 0 (right).

The images of a particular artist, a particular artistic school, all ads created by a company, the pages of a comic, or any other cultural image set will typically occupy only a part of this ellipse. The following example maps pages from nine manga titles according to their brightness mean (X) and brightness standard deviation (Y).

The pages make visible the ellipse shape. Most pages fall within a particular part of the ellipse. These pages form a pretty tight cluster; outside of the cluster, the ellipse is only sparsely populated.
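The half-ellipse boundary described above can be checked numerically. The sketch below is an assumption-laden illustration rather than a reconstruction of the original test images: it builds hypothetical images containing only black (0) and white (255) pixels and compares their brightness standard deviation with the general upper bound sqrt((m - 0) * (255 - m)) that holds for any values in 0-255 with mean m. The helper names `bw_image` and `max_std_for_mean` are made up for this example.

```python
from math import sqrt
from statistics import mean, pstdev

def bw_image(white_fraction, n_pixels=1000):
    """Brightness values of a hypothetical image made only of black (0) and white (255) pixels."""
    n_white = round(white_fraction * n_pixels)
    return [255] * n_white + [0] * (n_pixels - n_white)

def max_std_for_mean(m, lo=0.0, hi=255.0):
    """Upper bound on the standard deviation of values in [lo, hi] whose mean is m."""
    return sqrt((m - lo) * (hi - m))

for fraction in (0.0, 0.25, 0.5, 0.75, 1.0):
    img = bw_image(fraction)
    m, s = mean(img), pstdev(img)
    print(f"mean={m:6.2f}  std={s:6.2f}  bound={max_std_for_mean(m):6.2f}")
```

Pure black-and-white images sit exactly on the boundary, while any image with intermediate gray tones falls inside it. In this idealized sketch the top of the curve is at a standard deviation of 127.5; the slightly lower figure quoted in the text presumably reflects the particular test images that were used.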

(Note: A manga narrative can be referred to as both a "title" and a "series," if it consists of many chapters. In this text we use the word "title" but you may also find the word "series" in descriptions of our visualizations on Flickr linked here.)

We can refer to a particular part of a style space occupied by a set of images as a footprint of this set. Informally, we can characterize a footprint using its center and shape. Formal descriptions are available in statistics. If we consider measurements of a single visual dimension (i.e. a single visual property such as brightness mean), we can characterize their distribution, the central tendency and the dispersion (see http://en.wikipedia.org/wiki/Descriptive_statistics). If we want to analyze multiple features together, we can apply the techniques of multivariate statistics.

FEATURES

The visualizations above use simple visual features - brightness and saturation. Digital image processing allows us to measure images on hundreds of other visual dimensions: colors, textures, lines, shapes, etc. In computer science, such measurements are often called "image features." We can map images into a space defined by any combination of these features.

For example, the following visualization of 128 Mondrian paintings created between 1905 and 1917 uses measures of average brightness as X, and average hue as Y (a median average of colors of every pixel represented on 0-255 scale).

Although an average value of all pixels' colors may seem like a strange idea, this feature measurement turns out to be quite meaningful: it reveals that almost all of the 128 Mondrian paintings created between 1905 and 1917 fall into two groups: those dominated by brown and red (bottom) and those dominated by blue and violet (top).
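The idea of describing a footprint by its center and shape, introduced earlier in this section, can be made concrete with basic multivariate statistics. The following is a minimal sketch under stated assumptions: the center is taken as the mean of each coordinate, the shape as the 2x2 population covariance matrix (dividing by n, i.e. treating the set as complete), and `footprint_summary` is a hypothetical helper name, not part of any software mentioned in the text.

```python
from statistics import mean

def footprint_summary(points):
    """Center (mean x, mean y) and 2x2 population covariance matrix of (x, y) points."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    cx, cy = mean(xs), mean(ys)
    n = len(points)
    var_x = sum((x - cx) ** 2 for x in xs) / n
    var_y = sum((y - cy) ** 2 for y in ys) / n
    cov_xy = sum((x - cx) * (y - cy) for x, y in points) / n
    return (cx, cy), [[var_x, cov_xy], [cov_xy, var_y]]

# Hypothetical (brightness, saturation) measurements for five paintings.
paintings = [(110, 60), (120, 75), (130, 70), (125, 90), (140, 95)]
center, covariance = footprint_summary(paintings)
print("center:", center)
print("covariance:", covariance)
```

The diagonal entries say how spread out the footprint is along each axis, and the off-diagonal entry says whether it is tilted, that is, whether brighter images in the set also tend to be more saturated.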

IMAGE FEATURES AND STYLE

To what extent do basic properties of visual cultural artifacts (i.e. features) represent "dimensions" of style? In many cases, the basic "low-level" properties correspond to "high-level" stylistic attributes. For instance, in the case of many modern abstract artists such as Mondrian and Rothko, measurements of color saturation and hues are meaningful and can reveal interesting patterns in the evolution of the artists.

Here is another example of how a low-level feature captures a high-level style attribute. This feature is entropy - a measure of unpredictability. If an image consists mostly of flat areas - i.e. a singular gray tone or color without much variation or texture - it will have low entropy. If an image has lots of details and/or textures, it will have high entropy (since it is hard to predict the values of a pixel based on the values of its neighbours).

This visualization maps one million manga pages according to their entropy (Y-axis) and standard deviation (X-axis). (Both entropy and standard deviation are measured using pixels' brightness values.)
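To make the entropy feature described above less abstract, here is one common way to compute it: the Shannon entropy of the distribution of brightness values. This is an assumption, since the text does not say which formula was used, and this histogram-based variant ignores where pixels sit in the image. The two toy "pages" are hypothetical.

```python
from collections import Counter
from math import log2

def brightness_entropy(values):
    """Shannon entropy (in bits) of the distribution of brightness values."""
    counts = Counter(values)
    total = len(values)
    return -sum((c / total) * log2(c / total) for c in counts.values())

flat_page = [200] * 64            # one gray tone everywhere
detailed_page = list(range(64))   # 64 different tones, each used once

print("flat page entropy:", brightness_entropy(flat_page))
print("detailed page entropy:", brightness_entropy(detailed_page))
```

A page with a single tone is perfectly predictable and gets zero entropy, while a page whose tones are all different gets the highest entropy possible for that number of tones.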

The pages in the bottom part of the visualization are the most graphic and have the least amount of detail. The pages in the upper right have lots of detail and texture. The pages with the highest contrast are on the right, while pages with the least contrast are on the left. In between these four extremes, we find every possible stylistic variation.

In other words: the footprint of our sample of one million pages almost completely covers the complete space of possible values in entropy/standard deviation space. In addition, the large part of this footprint is very dense, i.e. the distances between neighbour pages are very tiny. We can call this dense area a "core."

This suggests that our concept of "style" as it is commonly used may not be appropriate when we consider large cultural data sets. The concept assumes that we can partition a set of cultural artifacts into a small number of discrete categories. In the case of our one million pages set, we find practically infinite graphical variations. If we try to divide this space into discrete stylistic categories, any such attempt will be arbitrary.

How does the statement we just made, that "our basic concept of 'style' may not be appropriate when we consider large cultural data sets," fit with the concept of a "style space"? A "style space" is simply a space of all possible values of particular visual features (either single features or their combinations) mapped into X and Y. Since we can measure visual properties of any images, we can represent any image set in such a space. Such a visualization reveals if it is meaningful to speak about a "style" shared by this image set (or its parts), or not. If an image set is spread out across the space, we can't talk about their distinct style. If an image set forms a cluster which only occupies a small part of the space, we may be able to.

In the case of one million manga images, they completely fill the whole range of possible values on the entropy dimension (little texture/detail - lots of texture/detail). But with Mondrian and Rothko image sets, the paintings produced by each artist in a particular period we are considering only cover a smaller area of brightness/saturation space, so it is meaningful to talk about a "style" of each period. (If we measure and visualize numbers and characteristics of shapes in paintings of each artist produced in their later years, the footprints will be even smaller.)
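The contrast drawn above, between an image set that is spread across the whole space and one that occupies a small cluster, can be given a rough number. The sketch below uses a simple, hypothetical measure that is not taken from the article: divide the space into a grid and report the fraction of cells containing at least one image. It assumes all points lie inside the given axis ranges.

```python
def coverage(points, x_range, y_range, bins=10):
    """Fraction of cells in a bins x bins grid that contain at least one point."""
    (x_min, x_max), (y_min, y_max) = x_range, y_range
    occupied = set()
    for x, y in points:
        # min(...) keeps points lying exactly on the upper edge in the last cell.
        col = min(int((x - x_min) / (x_max - x_min) * bins), bins - 1)
        row = min(int((y - y_min) / (y_max - y_min) * bins), bins - 1)
        occupied.add((col, row))
    return len(occupied) / (bins * bins)

# Hypothetical data: one set spread over the whole space, one tight cluster.
spread_out = [(i * 10 + 5, j * 10 + 5) for i in range(10) for j in range(10)]
tight_cluster = [(41, 52), (43, 55), (44, 51), (46, 58)]

print(coverage(spread_out, (0, 100), (0, 100)))     # covers every cell
print(coverage(tight_cluster, (0, 100), (0, 100)))  # covers a single cell
```

A value near 1 corresponds to a footprint that fills the space; a small value corresponds to a compact footprint for which talking about a shared "style" is more plausible. The result depends on the grid size, so it is only meaningful when comparing sets measured the same way.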

(For more details about our manga data set, see Douglass, Jeremy, William Huber, Lev Manovich. Understanding scanlation: how to read one million fan-translated manga pages. 2011.)

DENSITY

Mapping all images in a set into a space defined by some of their visual features can be very revealing, but it has one limitation: sometimes it makes it hard to see the varying density of the images' footprint. Therefore a visualization which shows images can be supplemented by a visualization which represents images as points and uses transparency. The following visualization shows the same one million manga data set mapped in the same way using points. (The initial plot was created in free Mondrian software, and then colorized in Photoshop.)

Another way to visualize density is by graphing values of images on each single dimension separately. The following graphs show the distributions of brightness mean and brightness standard deviation averages calculated per each title in our manga set.
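Graphing a single dimension separately, as described above, amounts to building a frequency histogram. Here is a minimal sketch with a text-only rendering; the per-title values are invented for illustration, and the function is a hypothetical helper rather than the software used for the graphs in this article.

```python
def histogram(values, lo, hi, bins):
    """Count how many values fall into each of `bins` equal-width intervals of [lo, hi]."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        index = min(int((v - lo) / width), bins - 1)  # put v == hi in the last bin
        counts[index] += 1
    return counts

# Hypothetical per-title brightness means on a 0-255 scale.
title_means = [180, 185, 190, 192, 200, 205, 210, 240, 90]
counts = histogram(title_means, 0, 255, bins=5)
for i, c in enumerate(counts):
    print(f"{i * 51:3d}-{(i + 1) * 51:3d} | {'#' * c}")
```

Tall bars mark the dense part of the footprint along that dimension, and empty or short bars mark the sparse part.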

(In statistical terms, each feature is a "random variable." The values of a single feature of all images in a given set can be described using univariate statistics: measures of central tendency such as mean or median; measures of dispersion such as range, variance, and standard deviation; graphs of frequency distribution. If we can fit the data to some well-known distribution such as the normal distribution, we can characterize what we informally called "density" more precisely using a probability density function.)

(Note: when using statistics to describe measures of visual features, we need to always be clear if we treat our image set as a complete population or as a sample from a larger population. For example, we can think of one million manga pages as a sample of a larger population of all manga. In the case of van Gogh paintings, a set of all his paintings can be taken as a complete population.)

End of part 1. Continue to Part 2.

Posted by Lev Manovich on Saturday, August 06, 2011

Software Studies Initiative | UC San Diego | 9500 Gilman Drive MC 0037 | La Jolla, CA | 92093-0037

Style space: How to compare image sets and follow their evolution (part 2)

This is part 2 of a three-part article. Part 1 is here.

text: Lev Manovich

PATTERNS IN STYLE SPACE

If we visualize all van Gogh paintings according to their brightness and saturation values, what is the shape of their distribution? According to the estimates, van Gogh produced approximately 900 paintings. The following visualization plots images of 776 paintings (86% of the total number) which were created between 1881 and 1890. X-axis = brightness median. Y-axis = saturation median.

The distribution has two clusters: earlier dark paintings on the left, and lighter later paintings in the center and on the right. The clusters are not symmetrical: one side is dense, another is more spread out. If we only plot the paintings done in Arles in 1888 (i.e. up to the "ear" episode), we get a more symmetrical shape.

Many social and natural processes follow a familiar bell curve (normal distribution). What are the shapes of distributions of large cultural data sets? Because digital humanities only recently started to work with big data sets, it is too early to make any generalizations. It would not be surprising to find that the distributions of features of very large image sets do follow the familiar bell curve pattern: a dense cluster containing most of the data, gradually falling off to the side, and a large very sparse area. With smaller data sets we analyzed, we often see some asymmetry. A few really large data sets we analyzed and visualized in our lab have different distributions depending on what features we consider. Some of them look like normal distributions, but mathematical tests show that they are actually not. For an example, see this graph showing distributions of values of eight visual features for 1.074.625 manga pages.

Consider this visualization of 587 Google logos (1998-2007). Each logo version was analyzed to extract a number of visual features. The visualization uses these features to situate all logos in 2D space according to their difference from the original logo, which would appear at X = 0. Horizontal distance from 0 on the X-axis indicates the degree of visual difference; vertical position indicates if modifications are in the upper part of the logo, or the bottom part.
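One simple way to quantify the kind of asymmetry mentioned above is skewness, which is zero for a symmetric distribution such as the normal one. The sketch below is a quick descriptive check on invented numbers, not the mathematical tests referred to in the text; it uses the population form of the statistic and assumes the values are not all identical (otherwise the standard deviation is zero).

```python
from statistics import mean, pstdev

def skewness(values):
    """Population skewness: 0 for symmetric data, > 0 for a longer right tail."""
    m = mean(values)
    s = pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)

symmetric = [1, 2, 3, 4, 5]
right_tailed = [1, 1, 1, 1, 10]   # most values low, one far to the right

print("symmetric:", skewness(symmetric))
print("right-tailed:", skewness(right_tailed))
```

A clearly non-zero value is a hint that a bell curve is a poor description of the feature; a value near zero does not by itself show that the distribution is normal.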

At first it may appear that the distribution of the Google logos follows the familiar bell curve. However a closer look reveals that the "cloud" of logos extends to the left more. As Google became one of the most recognized brands in the world, the designers started taking more chances with the logo, modifying it more dramatically. These "anti-logos," so to speak, started to appear after 2007; in our visualization they occupy the right most part, breaking the symmetry of the previously established bell-shaped pattern of graphic variability. The function of the Google logo changed: from identifying the company to surprising Google users by how much designers can depart from the original logos.

VISUALIZING AN IMAGE SET IN RELATION TO A SPACE OF ALL POSSIBLE IMAGES

If we want to visually compare two or more image sets to each other in relation to two visual properties, we can project them into a 2D space defined by these visual properties as we did with Piet Mondrian's and Mark Rothko's paintings in part 1. Using min and max values of the measured properties of all images in our sets combined as the boundaries of the visualization will allow us to use the visualization area most efficiently.

However, if we want to understand the footprint of each image set in relation to the absolute min and max - i.e. the lowest and highest possible values of visual features of all possible images - we need to map our images differently. Min and max of X and Y in the visualization should be set to their lowest and highest absolute possible values. For example, if we measure brightness on a 0-255 scale, min should be set to 0, and max should be set to 255.

The following visualization of Mondrian and Rothko paintings uses this idea. X-axis = brightness mean. Min = 0. Max = 255. Y-axis = brightness standard deviation. Min = 0. Max = 126.7. To make visualizations easier to see, we have added small white squares in the corners; black text inside each square indicates X and Y coordinates of a point in the center of a square.
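The difference between the two ways of setting axis limits discussed above can be shown with a small mapping function. This is an illustrative sketch only: `to_canvas`, the canvas width, and the brightness values are hypothetical and do not come from ImagePlot.

```python
def to_canvas(value, lo, hi, size):
    """Linearly map `value` from the range [lo, hi] to a pixel position in [0, size]."""
    return (value - lo) / (hi - lo) * size

brightness_means = [96.0, 120.0, 144.0]   # hypothetical measurements
canvas_width = 1000

# Relative limits: the data's own min and max fill the whole canvas.
lo, hi = min(brightness_means), max(brightness_means)
relative = [to_canvas(v, lo, hi, canvas_width) for v in brightness_means]

# Absolute limits: 0-255, the full range of possible brightness values.
absolute = [to_canvas(v, 0.0, 255.0, canvas_width) for v in brightness_means]

print("relative:", relative)
print("absolute:", [round(p, 1) for p in absolute])
```

With relative limits the three images stretch from one edge of the canvas to the other, which uses the area efficiently; with absolute limits they sit close together near the middle, which shows how small their footprint is compared with the space of all possible images.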

VISUALIZING PARTS OF AN IMAGE SET IN RELATION TO THE WHOLE SET

A related idea is to render parts of an image set over the background showing the complete set. This allows us to see the footprint of these parts in relation to the larger footprint of all images. In the next example we compare pages of two manga titles from our complete set of 883 titles comprising 1.074.790 pages. (See Manga.viz for more details about this project.)

First, let's render a larger number of titles to get an idea of the shape of the manga distribution. We visualize pages of the nine most popular titles on onemanga.com. X-axis = brightness mean. Y-axis = brightness standard deviation:

(The visualization uses transparency, so the pages rendered first remain visible; the drawback is that the contrast of every page is diminished. Here is an example of manga pages visualization without transparency.)

Now let's look at just two titles. The pages of each title are rendered as color points. All other pages are rendered as grey points.

Pink points: title: Ga on-Bi; artist: Ju Deo; intended audience: shounen (teenage boys); genre tags (from onemanga.com): action, supernatural.
Blue points: title: Aozora Pop; artist: Ouchi Natsumi; intended audience: shoujo (teenage girls).

As can be seen, a few pages of the titles overlap, but the rest form two distinct clusters.

(This work is a part of the larger project to find if Japanese manga aimed at different audiences has different footprints in the style space; to map this space more comprehensively, we will use 400 features - as opposed to just two features used in all visualization examples in this article.)

Posted by Lev Manovich on Thursday, August 11, 2011
