Technical Assignment

Get Real! Peter Melis 3641872 Master NMDC WG1: Ann-Sophie Lehmann

The Issue Crawler and colink analysis
As Lev Manovich writes in his article “Data Visualisation as New Abstraction and Anti-Sublime”, examples of data visualization can already be found in the eighteenth century. Nowadays, with the aid of computer technology, we can visualize more and more data, in more ways than we can imagine (4). This is why data visualization is a hot topic right now. Manovich makes a clear distinction between visualization and mapping. He uses “the term visualization for the situations when quantified data which by itself is not visual (…). Visualization (…) can be thought of as a particular subset of mapping in which a data set is mapped into an image” (3-4).

The subject I want to talk about is the Issue Crawler. The Issue Crawler is web network location and visualization software. It consists of crawlers, analysis engines and visualisation modules. It is server-side software that crawls specified sites and captures the outlinks from the specified sites. One of the methods used to visualize a network is colink analysis. In the case of the Issue Crawler, colink analysis means that the crawler is given a number of URLs to start with (seed URLs). The Issue Crawler crawls these URLs and retains the pages that receive at least two links from these seeds. Together with the visualization modules, this gives you a mapping of, for example, the websites surrounding a specific subject or problem. Below you can find a cropped example of a visualization I made with the Issue Crawler to give you an idea.
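The retention rule described above, keeping only the pages that receive links from at least two seeds, can be sketched in a few lines of Python. This is only an illustration of the principle and not the Issue Crawler's own code: the function name `colinked_pages`, the threshold parameter `min_seed_links` and all the `.example` addresses are made up for this sketch, and the outlinks are typed in by hand instead of being crawled.

```python
from collections import defaultdict


def colinked_pages(outlinks_by_seed, min_seed_links=2):
    """Return the pages linked to by at least `min_seed_links` distinct seeds.

    `outlinks_by_seed` maps each seed URL to the list of pages it links to.
    The result maps each retained page to the set of seeds linking to it.
    """
    linking_seeds = defaultdict(set)
    for seed, targets in outlinks_by_seed.items():
        for target in set(targets):      # count each seed once per target
            if target != seed:           # ignore links from a seed to itself
                linking_seeds[target].add(seed)
    return {
        page: seeds
        for page, seeds in linking_seeds.items()
        if len(seeds) >= min_seed_links
    }


# Hypothetical outlinks, as if captured from three seed sites.
outlinks_by_seed = {
    "seed-a.example": ["ngo.example", "blog.example"],
    "seed-b.example": ["ngo.example", "news.example"],
    "seed-c.example": ["news.example", "ngo.example", "shop.example"],
}

retained = colinked_pages(outlinks_by_seed)
for page in sorted(retained):
    print(page, "is linked to by", sorted(retained[page]))
```

In this made-up data only the pages linked from two or more seeds survive; pages that a single seed links to are dropped, which is what keeps the resulting map focused on the shared neighbourhood of the seeds.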

The concept of web colink analysis (WCA) comes from the concept of author cocitation analysis (ACA). Where ACA is concerned with which author is well cited in what field of research, WCA generally results in web pages or sites that are well linked from other pages or sites (Zuccala 1488). There are two types of colinks: colinks that are based on inlinks and colinks that are based on outlinks (Qiu et al. 327). The quickest way to explain this is via the pictures below:

[Figure: two diagrams, labelled Co-inlink (left) and Co-outlink (right), each showing Website A, Website B and Website C.]

In the left example B and C are colinked because A links to both. In the right example A is colinked because both B and C link to A. The Issue Crawler makes use of the co-inlink principle.

This example of different ways to look at colinking is just the beginning. Different methods are being used and developed. All methods are based on a basic sequence of steps (see McCain in Qiu et al. 329):

1. Selection of the core set of items for the study.
2. Retrieval of co-citation frequency information for the core set.
3. Compilation of the raw co-citation frequency matrix.
4. Correlation analysis to convert the raw frequencies into correlation coefficients.
5. Multivariate analysis of the correlation matrix using principal components analysis, cluster analysis or multidimensional scaling techniques.
6. Interpretation of the resulting “map” and validation.

The first example of a decision to make when performing colink analysis is whether you let your search or crawl look only at the page that the links point to, or at the whole site that the linked page is on. This means choosing, for example, between looking only at the links on a homepage and looking at the links on the entire website. Vaughan and You write that for research in business relations the links found on a homepage represent the relations better than all the links on the entire website (436). Lang et al. believe that it is important to take into account not just the links on the homepages of your seed URLs, but also the links on the rest of the website, if you want to avoid the restricted data problems found in analyzing smaller networks (161-162).

Furthermore, researchers have used keyword searches in combination with colink analysis to keep websites and companies that aren’t actually involved in the same issues out of the results. They only took into account the websites in the results that mentioned a specific keyword on their homepages, to make sure all the companies were involved in the same issue (Vaughan and You 440-441).

My last example of a customization of the colink analysis method is the method developed by Qiu et al. They stress that it is important to be critical about the quality of links. They state that most colink analysis is done by taking all found links into account. A link can be substantive (“they have their own real meanings and incentives such as agreement and recommendation”) or non-substantive (those that do not) (328).
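Steps 3 and 4 of the basic sequence listed in this section (compiling the raw frequency matrix and converting it into correlation coefficients) can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure of McCain or Qiu et al.: the site names A to E and the linking pages p1 to p5 are invented, the colink count of two sites is taken to be the number of pages that link to both, and the correlation between two sites skips the cells that involve the pair itself, which is only one of several possible ways to deal with the diagonal of the matrix.

```python
from itertools import combinations
from math import sqrt


def colink_matrix(inlinkers):
    """Step 3: count, for every pair of sites, the pages that link to both."""
    sites = sorted(inlinkers)
    matrix = {a: {b: 0 for b in sites} for a in sites}
    for a, b in combinations(sites, 2):
        shared = len(inlinkers[a] & inlinkers[b])
        matrix[a][b] = shared
        matrix[b][a] = shared
    return sites, matrix


def pearson(xs, ys):
    """Pearson correlation of two equally long lists, or None if undefined."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None  # one profile is constant, so no correlation exists
    return cov / sqrt(var_x * var_y)


def correlation_matrix(sites, matrix):
    """Step 4: correlate the colink profiles of every pair of sites.

    Assumption of this sketch: for a pair (a, b) only the counts with the
    other sites are compared, so the diagonal cells are left out.
    """
    corr = {a: {b: None for b in sites} for a in sites}
    for a in sites:
        corr[a][a] = 1.0
    for a, b in combinations(sites, 2):
        others = [s for s in sites if s not in (a, b)]
        r = pearson([matrix[a][s] for s in others],
                    [matrix[b][s] for s in others])
        corr[a][b] = corr[b][a] = r
    return corr


# Hypothetical core set: for each site, the set of pages that link to it.
inlinkers = {
    "A": {"p1", "p2", "p3"},
    "B": {"p1", "p2", "p4"},
    "C": {"p3", "p5"},
    "D": {"p4", "p5"},
    "E": {"p1", "p5"},
}

sites, matrix = colink_matrix(inlinkers)
corr = correlation_matrix(sites, matrix)
for a, b in combinations(sites, 2):
    print(a, b, "colinks:", matrix[a][b], "correlation:", corr[a][b])
```

The correlation matrix produced here is what step 5 would take as input for clustering or multidimensional scaling. With a core set this small many correlations are unstable or undefined, which is the kind of restricted data problem that Lang et al. describe for small networks.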

The case study by Qiu et al. shows that when you only look at substantive links, you get more in-depth results that give a better representation of the relations between websites. As you can see, there are a lot of considerations to take into account when performing colink analysis. It is Manovich who points us in this direction when he talks about data visualization and art:

“Since computers allow us to easily map any data set into another set, I often wonder, why did the artist choose this or that form of visualization or mapping when endless other choices were also possible? Even the very best works that use mapping suffer from this fundamental problem. This is the “dark side” of the operation of mapping and of computer media in general – its built-in existential angst. By allowing us to map anything onto anything else, to construct an infinite number of different interfaces to a media object, to follow infinite trajectories through the object, and so on, computer media simultaneously make all these choices appear arbitrary – unless the artist uses special strategies to motivate her or his choice” (7).

This is why I think colink analysis is very interesting to research, in combination with the fact that I wasn’t happy with the results from my crawl. Whether this is because I did it wrong or because the Issue Crawler doesn’t give correct representations, I don’t know yet.

Literature

Lang, P. et al. “Site co-link analysis applied to small networks.” Scientometrics 83 (2010): 157-166.
Manovich, L. “Data Visualisation as New Abstraction and Anti-Sublime.” Small Tech. The Culture of Digital Tools. Eds. B. Hawk et al. Minneapolis: University of Minnesota Press, 2008. 3-9.
Qiu, J. et al. “An exploratory study on substantive co-link analysis.” Scientometrics 76-2 (2008): 327-341.
Vaughan, L. and J. You. “Content assisted web co-link analysis for competitive intelligence.” Scientometrics 77-3 (2008): 433-444.
Zuccala, A. “Author Cocitation Analysis is to intellectual structure as Web Colink Analysis is to …?” Journal of the American Society for Information Science and Technology 57-11 (2006): 1487-1502.

Other sources

www.govcom. (Issue Crawler)

