Bank of America
ABSTRACT

Large financial institutions such as Bank of America handle hundreds of thousands of wire transactions per day. Although most transactions are legitimate, these institutions have legal and financial obligations in discovering those that are suspicious. With the methods of fraudulent activities ever changing, searching on predefined patterns is often insufficient in detecting previously undiscovered methods. In this paper, we present a set of coordinated visualizations based on identifying specific keywords within the wire transactions. The different views used in our system depict relationships among keywords and accounts over time. Furthermore, we introduce a search-by-example technique which extracts accounts that show similar transaction patterns. In collaboration with the Anti-Money Laundering division at Bank of America, we demonstrate that using our tool, investigators are able to detect accounts and transactions that exhibit suspicious behaviors.

Keywords: Fraud detection, financial data visualization, categorical and time-varying data

Index Terms: I.3.3 [Computer Graphics]: Picture/Image Generation—Line and Curve Generation

1 INTRODUCTION
Large American banks handle hundreds of thousands to millions of wire transfers per day. While most of these transactions are perfectly legal, a small number are performed as part of criminal endeavors such as money laundering. The enormous amount of generated activity and the unconstrained nature of the data make it very difficult to find these few instances among all the legitimate ones. At the same time, strict regulations require banks to spend considerable effort to find and report these activities, or face significant fines or even being shut down.

The problems faced by risk managers and fraud analysts are exacerbated by the fact that an increasing number of transactions are purely digital and often involve a web of financial institutions around the world. Thus a bank's wire transfers may come from and go to individuals or businesses who are not the bank's customers. Often the bank is just a middle man for transactions that originate in different countries. In these circumstances, banks may know little about the individuals or businesses involved other than what is in the transaction record. Yet they must still exercise due diligence in discovering and reporting suspicious activity. For a large financial institution, this means monitoring hundreds of thousands of transactions per day, then investigating possibly suspicious ones in depth at considerable expense (and risk, if the monitoring is not effective). The problem is overwhelming and growing worse.

Hierarchical interactive visual analysis with multiple linked views can effectively attack this problem because it is geared toward the visualization and interactive exploration of massive datasets, integrating multiple methods from various disciplines such as information visualization, human-computer interaction, and statistics.

In this paper, we present WireVis, a multiview approach that assists analysts in exploring large amounts of categorical, time-varying data containing wire transactions. Our method is highly interactive, and combines a keyword network view, a heatmap, a search-by-example tool, and a new visualization called Strings and Beads. These four views together depict the relationships among accounts, time, and keywords within the transactions, and present the user with a global overview of the data, providing the ability to aggregate and organize groups of transactions for better investigation and analysis and the ability to drill down into and compare individual records.

Although the examples and results in this paper concentrate on wire transaction data, the approach is general and applicable to any type of financial transaction data. This method should be effective for any keyword-based data, semi-structured or not, with varying but substantial levels of activity over time.

This work presents substantial qualitative advances over current practice in investigating financial transactions, which involves blind queries followed by painstaking analysis of spreadsheets.
• It provides an overview that scales to hundreds of thousands to millions of transactions over any desired length of time.
• It provides tightly integrated views that look at patterns of activity over time and over keywords for clusters, sub-clusters, and so on.
• It replaces blind queries with contextual exploration, clustering, reclustering, and in-place analysis.
• It introduces powerful search-by-example techniques.

2 MONITORING WIRE TRANSACTIONS
In collaboration with Bank of America, we have tackled the problem of monitoring wire transactions. As we describe the nature of the data and the current practice in monitoring wire transactions, we will shed light on the requirements of this problem and how interactive visual analysis can bring about a drastic improvement in financial investigation.

Normalizing data in the various fields of a wire transaction is difficult as these fields are frequently open to interpretation at the
point of origination. Roughly speaking, a wire transaction corresponds to a certain amount of money sent by a payer to a payee via a chain of intermediaries, with or without additional comments or instructions (Figure 1). The transferred amount could come from the sender's account directly or via a third party. It could also be sent to the receiver's account directly or handled through a third party, with additional information as to what to do with the money. The payee could be the real beneficiary (e.g., a person or business) or his account holder. Other information such as the address of the sender and receiver, additional comments, and instructions may be appended. In summary, a wire transaction could be best seen as a semi-structured data record with numerous optional free text fields.

Figure 1: The flow of money and associated information in a wire transaction.

Currently, financial analysts probe wire transactions based on various considerations:

• Official rules governing which transactions must be reported, e.g., when the amount of money exceeds a certain threshold.
• A potential risk inherent to the transaction, e.g., when the destination corresponds to a high risk country or organization or when the transaction relates to a high risk activity.

While a set of filters can easily be set up to catch transactions matching a limited set of rules and automate a report generation process, risky or suspicious transactions are more evasive and uncertain. They are governed by an ever changing context where, for instance, geopolitical, economic and strategic motivations, and various actors are in play and where methods to hide illegal activity are constantly evolving as older methods are discovered and stymied. An increase in false negatives would cast serious doubts on the financial institutions who would appear as purposely harboring fraudulent activities, not to mention having to pay large fines, whereas too many false positives could harm their relationships with their clients or irritate official agencies who would be wasting their time and resources on paranoid reports.

To hide from scrutiny, customers engaged in fraudulent activities could follow common temporal patterns of legal activity or try to break away from any fixed pattern. Fraudulent activity could also be distributed over a variable number of senders and receivers. In fact, such distributed frauds are mostly beyond the reach of financial investigators. As a matter of fact, at present they do not have the capability to investigate all the patterns and activities they probably should. Effective analysis tools need to take into account this wide variety of scenarios and allow the user to see the transactions and patterns that match them.

Based on intelligence reports and previous analyses, investigators create a large list of keywords that best fit the international state of affairs. Hence, in the current practice, analysts query the transactional data over a time period. All transactions are filtered through this list of keywords, and the transactions that contain one or more keywords are displayed using a spreadsheet for investigation with transactions raising multiple red flags to be scrutinized more thoroughly. Additional information channels could then contribute to the investigation. For instance, the bank's own records, publicly available databases, home-grown expertise, and search engines could provide evidence for or against further action.

At present, financial analysts are faced with hundreds of thousands of transactions bearing predefined keywords over periods of time. This set of data is categorical due to the classification using keywords, and is by nature time-varying. Adding to the complexity of the problem, the data are also semi-structured and certain records are unstructured. Moreover, the data includes one-time transactions in great numbers as well as repeated transactions. Furthermore, since most transactions contain a limited number of keywords, the data are therefore sparse when viewed as a relationship matrix between transactions and keywords. The users need to find trends and identify patterns in these datasets.

3 RELATED WORK

Currently, analysts use spreadsheets to look at large data tables of transactions, looking for certain keywords that may be indicative of high risk. Spreadsheets support various operations on rows and columns and give a detailed account of the data. However, they are not effective at providing a clear overview of trends and correlations. To fill this gap, several works in the information visualization field have been proposed such as TableLens and data sheets and are now part of commercial information visualization suites.

Similarly, heatmap visualizations or correlation matrices have been used successfully in various application domains and are now shipped in several commercial tools such as Spotfire Decision Site. In genomics, this metaphor has been used for the visualization of massive gene arrays and provides a compact overview of the data as well as a drill-down capability for detailed information. It has also been used to visualize social network data as well as co-activity graphs in the arena of software visualization. In the latter, structural and temporal patterns could be exhibited on correlation matrices. Although the orderability of matrices can enhance user performance with regards to certain tasks, finding the appropriate order on a correlation matrix is a task-dependent question that will be receiving increasing interest [18, 10].

The need to follow temporal patterns in transactional data suggests the use of time-series data visualizations. Time-series data capture measurable quantities that change over time, such as stock values, climate data, etc. Several works have focused on the display of this type of data, such as DiskTrees and TimeTubes, and provided periodic views [12, 20] supporting pattern identification. Recently more efforts have been put into querying such data directly through user interaction. However, transactional datasets such as wire transfers could be more challenging to these kinds of tools due to the sparsity of the data points, as most accounts would fire a transaction only once in a while.

Monitoring keyword-tagged transactions over time also suggests the use of corpora exploration and visualization tools such as ThemeRiver. However, this work can only handle a small set of topics or keywords simultaneously, whereas wire data bear on tens or hundreds of corporate-defined keywords. ThemeRiver is geared towards the display of overall trends and temporal changes in topics, rather than outliers and isolated behaviors, or, in other words, rare anomalous transactions.

Systems offering multiple coordinated views have arisen as a suitable solution for complex multi-faceted datasets. Recent works such as Snap-Together visualization have studied the feasibility of such systems. Other authors provided guidelines for using multiple-views systems for information visualization, provided a taxonomy and model of multiple-views systems, and studied the benefits and tradeoffs of such systems from a user perspective. In the current work, we set out building a multiple coordinated views system in view of capturing as many aspects of transactional datasets as desired.
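The keyword-filtering practice described in Section 2 (filter every transaction through a keyword list, then scrutinize the ones raising multiple red flags first) can be sketched roughly as follows. This is only an illustration: the field names, the sample keywords, and the case-insensitive substring matching are assumptions, since the paper does not say how matching is implemented at the bank.

```python
def keyword_hits(transaction, keywords):
    """Return the set of watch-list keywords found in a transaction's fields.

    Assumption: a transaction is a dict of free-text fields, and a keyword
    "hits" when it occurs as a case-insensitive substring of any field.
    """
    text = " ".join(str(value) for value in transaction.values()).lower()
    return {kw for kw in keywords if kw.lower() in text}


def flag_transactions(transactions, keywords):
    """Keep transactions with at least one hit, most red flags first."""
    flagged = []
    for tx in transactions:
        hits = keyword_hits(tx, keywords)
        if hits:
            flagged.append((tx, hits))
    # Stable sort: transactions with equal hit counts keep their input order.
    flagged.sort(key=lambda pair: len(pair[1]), reverse=True)
    return flagged


# Hypothetical sample data, for illustration only.
sample_transactions = [
    {"id": 1, "memo": "payment for machinery", "dest": "Country A"},
    {"id": 2, "memo": "consulting fee", "dest": "Country B"},
    {"id": 3, "memo": "machinery parts, urgent", "dest": "Country B"},
    {"id": 4, "memo": "rent", "dest": "Country C"},
]
sample_keywords = ["machinery", "Country B"]
flagged = flag_transactions(sample_transactions, sample_keywords)
```

Because most transactions match only a few keywords, the resulting transaction-by-keyword relation is sparse, which is the property the text above points out.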
Figure 2: A view of the entire system showing the heatmap (top left), search by example (top right), keyword graph (lower right), and strings and beads (lower left).

4 VISUAL ANALYTICS TOOLS FOR MONITORING WIRE TRANSACTIONS

WireVis uses four tightly coordinated views of transaction activity. The keyword network view is used to represent the relationships between keywords, the heatmap view shows relationships between accounts and keywords (see section 2), the search-by-example tool helps discover accounts of similar activities, and lastly, Strings and Beads depicts the transactions over time. All views rely on high interactivity along with the ability to see global trends and capabilities to drill down into specific transaction records.

4.1 User-centric Design

The design of WireVis was based on an analysis of the current work of fraud analysts with their existing tools. During the design phase of the project, we interviewed and communicated with members of the Risk Management, Compliance, and WireWatch (analysis) divisions of Bank of America on their current practices as well as their needs for monitoring fraudulent wire transfers. Furthermore, throughout the development phase, we maintained close communication with these groups and routinely showed them our progress and received feedback.

Currently, analysts first filter the data by geographic region using a set of specific keywords and other criteria (like amounts). This data is then inspected by hand, with additional tools like search engines to find out if businesses are legitimate, etc. For WireVis, we wanted to keep as many of the working aspects of the existing system, while simultaneously enhancing the shortcomings. We therefore defined the following list of requirements for the system:

Interactivity. Despite the large amounts of data, WireVis must be highly interactive and respond to user input immediately.

Filtering. The current work method of filtering data using predefined keywords and other criteria must be kept to increase acceptance of the system. At the same time, we cannot rely on the data being filtered down, but must be able to show and work with all the data at once.

Overview and Detail. It must be possible to see aggregated views of all transactions of a day, week, or month in an overview, and then to drill down to the level of individual transactions when needed for the investigation. This provides the scalability needed, thus the high level overview that is imperative, and useful levels of abstraction.

Coordinated Multiple Views. No single view could fulfill all the requirements and show all the necessary data, so a system of coordinated views was designed that would allow the user to see different data, while being able to understand the connections between the views easily.

4.2 Data Aggregation

Since one cannot usually detect suspicious activity from single wire transactions, we need to visualize the activities of the corresponding accounts in order to detect suspicious behaviors. The transaction data are therefore first grouped according to the sending and receiving accounts, and the heatmap visualization tools show the accounts instead of each individual transaction. Even so, the number of accounts still range in the tens of thousands or more. To manage the enormous amount of information, we hierarchically cluster the accounts. The complexity of the clustering algorithm is
crucial because the analysts often need to perform reclustering as patterns in the transactions are discovered. Most existing clustering techniques such as k-means O(kn) or single-link clustering O(n^2) require minutes to hours to compute as n becomes large, which is unacceptable in our case. Instead, we use a simple "binning" technique to find groupings of accounts based on frequency of keywords that occur in the transactions of the accounts. We treat each account as a point in k-dimensional space (where k is the number of keywords), and group the accounts based on their distances to the average point of all accounts. This method has the complexity of O(3n) and can cluster tens of thousands of accounts in seconds.

The wire analysts have approved this as a crude but effective way to explore the transaction space. The effectiveness is significantly enhanced because the method provides fast, dynamic overviews of the data while relying on the high interactivity and multiple views to assist in exploring the data space. Also, if this approach was not hierarchical, its value would be limited. However, the clusters are often not optimal for the analyst's current purpose, so interactive reclustering has proved highly useful in exploratory visualization since it will provide much better clusters for further exploratory analysis. Our procedure is to apply the hierarchical binning method as a preprocessing step, thus organizing all the data for a selected time period before launching interactive exploration. Then the user can use a strategy of selecting reasonably sized subsets for reclustering, thus maintaining interactivity. We have found that this strategy works well in practice. An alternative could be to provide the user with a more exact but more time-consuming reclustering approach to be used at any point in the exploration.

Our tools are enhanced with a user-configurable search-by-example capability (Section 5) that helps the analysts find accounts, respective to keywords, that are similar to a reference account. Search-by-example is a powerful tool for exploratory analysis, since it permits the user to quickly identify and search for behaviors of interest without having to specify those behaviors in detail.

4.3 Keyword Network View

Depicting relationships between keywords is important for identifying questionable transactions. If a transaction contains two keywords that should not be related in the context of a wire transfer, it should be quickly identified and further inspected by an investigator. To show the relationships between keywords, we use a simple network graph as shown in Figure 4. A keyword is said to be related to another if both of them appear in the same transaction. The appearance of keywords in the same transaction forms the basis of the underlying relationship matrix in which the distances between the keywords are calculated based on the number of times that they appear together in transactions. The most frequent keywords appear in the middle of the view, while the less frequent ones appear on the outskirts of the circle. When a user highlights a specific keyword, lines are drawn between the highlighted keyword to all relating keywords. In many cases, such coupling of keywords can be accounted for easily (e.g., Paris and France), as the former occur in transactions involving the latter. In other situations, the coupling of two remotely related keywords will trigger further investigation.

Figure 4: The keyword network view shows the relationship between keywords. In Figure 4, keywords closer to the center of the keyword network view are the most frequently appearing keywords, whereas keywords on the outskirts of the circle appear less frequently. When a keyword is highlighted, lines are drawn from that keyword to all relating keywords.

4.4 Overview of Keyword-to-Account Relationships

We use a heatmap to display statistical measurements relating keywords (see section 2) and bank accounts. Since analysts must try to grasp hundreds of thousands of transactions involving as many accounts, it is necessary that the heatmap be scalable in this dimension. Our heatmap uses a grid whose columns are the keywords of interest and whose rows are clusters of bank accounts being scrutinized. Our tools perform clustering as described in Section 4.2, providing a high level abstraction of the data as a first overview. Then, the user can drill down any set of clusters or keywords through direct interaction by selecting the desired subgrid or by expanding a cluster of his choice.

At the intersection of a given row and column, we color-code a value such as the number of hits for that keyword/column with regards to that account/row in the time-period encompassed in the data. Depending on the nature of the measurement displayed in the grid (e.g., sequential or diverging), various color schemes can be applied to the visualization. We use a simple scheme where the saturation of the cell is proportional to the number of times a keyword appears for that set of transactions, since it is intuitively related to the heatmap concept.

Then, the heatmap view makes it possible to visually compare patterns of behavior across different accounts. Therefore, the user can spot at a glance the accounts that are more frequently related to a given keyword or set of keywords (e.g., a high risk country heavily involved in money laundering). Likewise, analysts can detect keywords displaying a similar activity, i.e., hitting the same accounts. He can also see common keywords (full columns in the grid) which are likely to be filtered out in an investigative process. Moreover, accounts hitting all keywords (full rows in the grid) usually correspond to financial institutions rather than individual accounts and would also be deemed irrelevant to investigative work.

The user can also hover on the heatmap and overlay the value associated to each cell.

Figure 3: Heatmap with close-up view on a cell histogram displaying the number of keywords for each account. Each horizontal (green) line in the histogram represents an account in the cluster. The x-axis shows the number of "hits" of the keyword for that account. By looking at the histogram, the user can quickly identify if the accounts in the same cluster contribute evenly to a specific keyword, or if there are abnormal distributions within the cluster. The purpose of showing the histogram is to combat the effect of using a cumulative sum in the Heatmap.
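To make the binning of Section 4.2 and the heatmap rows of Section 4.4 more concrete, the following sketch treats each account as a point of keyword hit counts, bins accounts by their Euclidean distance to the average point, and sums the hits per bin to obtain one heatmap row per cluster. The paper only states that accounts are grouped by distance to the average point; the equal-width bins, the number of bins, and the function names used here are assumptions made for illustration.

```python
import math


def bin_accounts(counts, n_bins=4):
    """Group accounts by Euclidean distance to the mean point.

    counts: dict mapping account id -> list of k keyword hit counts.
    Returns a dict mapping bin index -> list of account ids.
    Assumption: equal-width distance bins; the paper does not specify this.
    """
    accounts = list(counts)
    k = len(counts[accounts[0]])
    # Step 1: average point over all accounts.
    mean = [sum(counts[a][j] for a in accounts) / len(accounts) for j in range(k)]
    # Step 2: distance of every account to the average point.
    dist = {a: math.dist(counts[a], mean) for a in accounts}
    max_d = max(dist.values()) or 1.0  # avoid division by zero
    # Step 3: assign each account to a distance bin.
    bins = {}
    for a in accounts:
        b = min(int(dist[a] / max_d * n_bins), n_bins - 1)
        bins.setdefault(b, []).append(a)
    return bins


def heatmap_rows(counts, bins):
    """Sum keyword hits over the accounts of each bin (one row per bin)."""
    return {
        b: [sum(col) for col in zip(*(counts[a] for a in members))]
        for b, members in bins.items()
    }


# Hypothetical hit counts for two keywords, for illustration only.
example_counts = {"a": [1, 0], "b": [1, 0], "c": [0, 1], "bank": [50, 40]}
example_bins = bin_accounts(example_counts)
example_rows = heatmap_rows(example_counts, example_bins)
```

Calling `bin_accounts` again on the members of a single bin gives sub-clusters, which is one simple way to obtain the hierarchical behavior and interactive reclustering the text describes. The summed rows are the cumulative values whose limitations the per-cell histogram of Figure 3 is meant to offset.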
In case of aggregate data, the user can also overlay a glyph, e.g., a histogram, displaying the real distribution of low-level values that add up to the aggregate value as in Figure 3. By doing so, we strive to quickly provide the user with more detail at key points to enhance both exploration and enlightened decision-making.

4.5 The Strings and Beads Visualization

The ability to look for suspicious activities over a period of time is crucial to the analysts. To illustrate, when a terrorist attack took place somewhere in the Middle East, an intelligence agency had reasons to believe that the attack was supported and funded by individuals in the US. It requested bank wire analysts to search for wire transfers between the US and the location of the attack. By looking for wire activities over a narrow range of time prior to the attack and on selected keywords, the analysts were eventually able to identify the culprit and report the incident. In addition to this event-driven scenario, it is also crucial to look at patterns of activity over periods of time (typically months), as these can bring up unusual behavior.

In order to support visualization of wire activity over time, we create the Strings and Beads view in which the strings refer to the accounts or cluster of accounts over time, and the beads refer to specific transactions on a given day. The x-axis of the view shows the progression of time, and the y-axis shows the "value" of the transaction, where value can be the amounts of the transactions, the frequency of activities, etc. Together, the strings and the beads show the overall trends of the activities as well as the individual transactions. Figure 5 shows that the Strings and Beads view is quite effective in giving an overview on top of showing specific detail. The strings shows the overall activities for selected accounts or clusters for an entire year, and the beads depict the details of a handful of transactions for that day.

Since transactions do not take place over weekends or holidays and often vary drastically in amounts or frequency, representing the strings as line segments creates jagged lines, making it difficult to distinguish between different strings. Instead, we choose to represent strings as splines instead of disjointed line segments, which gives a good overview of trends over time. Due to the fluctuation of the data, we smooth out the strings as splines with the option to change the number of control points. At any point, the user can display the original transaction values for detailed analysis, as shown in Figure 6.

To facilitate fast interactions with the Strings and Beads view, the analysts can quickly zoom in to the time period in question by brushing a range of time. To further examine the details of a specific wire transaction, the analysts can double-click on a bead to bring up the original wire information in a separate window as shown in Figure 5.

Figure 5: Double-clicking on a specific bead brings up the related wire transactions in a separate window.

Figure 6: Turning on the option showing the original data, the user is able to see the real data points (shown as bars) on top of the smoothed String.

4.6 Coordination Between Views

The resulting clusters from binning provide a foundation for the coordination between the four views. Since the underlying data structure is the same, message passing between the views becomes trivial. With the four views coordinated together so that an action performed in one view affects all other windows, the analysts can now interact with accounts, keywords, time, account values, etc., and see how the selections correlate in all dimensions. This is significantly more powerful than using the views separately.

For example, when an analyst hovers the mouse over the keyword names in the heatmap view, all the beads in Strings and Beads are highlighted to show when the transactions with such keyword occur (See Figure 7a). At the same time, all cells in the heatmap are highlighted along with the number of occurrences of the keyword in each account cluster. This highlighting technique allows the analyst to search for suspicious keywords and see when these keywords occur over time.

Similarly, the analyst can hover over the dates in the Strings and Beads view, highlighting all the beads of a particular day. The heatmap then reacts by highlighting all the cells that contain the transactions of these beads and displaying the number of occurrences of the keyword for that day (See Figure 7b). This allows the analyst to focus on specific dates and observe which accounts are transacting over what keywords over that period of time.

Such tight integrations occur throughout the keyword network view, the heatmap view, and the Strings and Beads views: selecting a string highlights a row in the heatmap, selecting a cell shows all the beads that contain such keyword, etc. Finally, both the heatmap and strings and beads view react to zooming (e.g., the cells in the heatmap view will only contain values over a certain time range when a user zooms into a time period in the Strings and Beads view).

Our coupled interface allows the user to quickly find these temporal patterns and activities. All these quick actions permit rapid exploration over many accounts, keywords, events, and time ranges in terms of animated patterns, at an almost subliminal level. The user can then pause, slow down, or go back to observe a pattern more closely. The user can interactively drill down into individual transactions, and time ranges via zooming if necessary. As shown in the video, the analysts can see a global trend of the account activities over time by simple highlighting, but has the ability to further investigate specific incidences.

5 SEARCH BY EXAMPLE

Because of the complex structure of the data and the observed patterns, it is difficult to define transaction patterns that one is looking
for a priori. Once an interesting pattern is found, it is usually necessary to find not just transactions from and to the involved accounts, but also accounts that show similar activities. An important feature of WireVis is therefore to search by example (see Figure 8). The user can select a prototype cluster/account as well as keywords of interest. The program shows all similar clusters according to a user-controlled similarity threshold.

In a separate view, the user is shown the currently selected cluster or account to use as the prototype for a new search. Bars represent the number of hits for all the defined keywords, which the user can select as relevant criteria by clicking them. A slider is used to define the maximum difference for identifying a cluster as similar to the prototype. As the user moves the slider, the number of similar accounts grows or shrinks, giving the user a feeling for the space to explore. There is no separate search button, the search is performed whenever the user changes the criteria or the threshold, and results are shown immediately. The user can then select particular results, which are shown in other views for further investigation.

(a) Brushing Keywords (b) Brushing Time

Figure 7: (a) Highlighting keywords in the heatmap view shows the corresponding beads in the Strings and Beads view. The user can interactively see keyword occurrences over time. (b) Hovering in the time axis in Strings and Beads highlights the beads of that particular day, as well as the related cells in the heatmap. The user can quickly browse through time and see keyword and account activity.

6 CASE STUDIES

In order to assess the usefulness of our tools, we employed a sanitized dataset containing transactions sampled over twelve months. For privacy and proprietary reasons, account numbers, keywords and personal information have all been stripped. While certain members of the Risk Management and Compliance groups have used the system, our expert evaluators at WireWatch were not able to do so. This is due to the fact that the WireWatch group is located in California and has specific security-related hardware and software restrictions that make the installation of our tool impossible in the scope of the first phase of the project. In order to receive feedback from these key collaborators, we asked James Price, Senior Vice President in Bank of America's Global Anti-Money Laundering organization, to observe video of interaction with our system during a teleconferencing session and provide his interpretation of the visualizations. We categorize his observations into two groups: seeing normal, unsuspicious behavior and detecting activity that may indicate fraud.

6.1 Seeing Normal Behavior

The clustering provides an obvious separation between large corporations or financial institutions from small businesses and individuals. The first row of the heatmap contains accounts that transact in large amounts (as can be seen in the Strings and Beads view) in high frequency over a large range of keywords. These typically represent large institutions and can often be filtered out from consideration. The last row of the heatmap contain individuals that only have exactly one transaction over the course of the year (which can be verified by drilling down into the sub-clusters), and can sometimes be filtered out because they might not contain sufficient indications of suspicious behaviors.

The keywords in the heatmap view are sorted based on their frequency in all transactions over all accounts. From this view, the most frequently occurring keyword appears furthest left in the view. Thus, one strategy could be, for example, to filter out the keywords that occur with expected frequencies, then recluster all accounts using the remaining keywords to provide a more focused overview of the activities.

6.2 Detecting Suspicious Activity

First, we identify a keyword that shows abnormal temporal patterns by hovering over the list of keywords in the heatmap view. During the hovering, we see in the Strings and Beads view that keyword 58 occurs only in the second half of the year as can be seen in Figure 9a (which is a detail of Figure 7a). This peculiar time-based behavior prompts further investigation into the transactions involving that keyword. By rubber-banding in the heatmap view and zooming into the column of keyword 58, we notice that not many transactions involve this word, and all those are in the second half of the year.

Switching to the Strings and Beads view, we change the y-axis to show the amount of the transactions instead of the number (Figure 9b). There is a transaction near the end of the year of approximately three million dollars. This transaction is peculiar because all other transactions involving this keyword have transaction amounts in the [...]

Double-clicking on the bead representing that transaction reveals details about the transaction (Figure 9c). Clicking on the receiver's name brings up the Search By Example window where we see that the receiving account most likely belongs to a bank or a large institution because of the number of transactions and the range of keywords it is involved in. Not only is this transaction of a very large amount, the combination of all facets of this transaction lead to further investigation.

[...] (c) The receiving account's information in the search-by-example view shows that this is likely a bank or large company.

It is only an example of how a fast clustering algorithm can enhance the analysts' ability to interact with the data. Depending on the need of the analyst, other clustering techniques can be used.

Although we provided four views that show relationships between accounts, [...] Some examples of the views include a matrix or graph view showing the relationship between accounts and a geographical view showing the location of the senders and receivers.

During our discussions with analysts, they find the use of visual metaphors to identify questionable behaviors very interesting and promising. [...] the WireWatch group has asked to be able to install and use the system on live data for further validation, and is one of our top priority items for the next phase of our project.

Finally, the WireVis approach described here can be applied to any transactional data or, [...]

WireVis can be made significantly more powerful by adding an unstructured text analysis capability, such as that used in INSPIRE. Important relations could be uncovered and even new keywords found. We expect to develop WireVis along this line.
we enable the experts to take a very analytical but still much less constrained approach than using other tools. By providing exploratory tools for the very speciﬁc data of wire transfers. Considering that the only tools the analysts currently have are lists of text. and therefore is not of interest to the investigator. While WireVis does not provide many of the other methods currently used in fraud detection. as indicated in the beginning. they are not the only four possible views that could be used. However. on the other hand. keywords. but it also involves a keyword showing an abnormal temporal pattern. its tools are complementary and useful. The fact that this particular transaction is of an amount much larger than others makes it stand out as an outlier. to . Any statistical indicators and metrics can be plugged in depending on the need of the analysts or the latest intelligence information. we see that it only appears a few times. this large institutions is an intermediary handler of the transaction. Although there is no single attribute of this transaction that would warrant an investigation into the origination account. This would permit relating any words in the transaction ﬁelds to the keywords. Figure 8: Search by Example. clicking on the sender shows that that this account has had only one transaction over the past year (Figure 9d). More than likely.a) When brushing over keyword 58. We understand that making the integration of WireVis into their daily practice as seamless as possible is very important. and time. WireVis is designed to be naturally extensible. our clustering technique using binning is not intended to be the only solution to grouping accounts. However. indeed. has only used this one keyword. We will explore these possibilities in the future. we have identiﬁed other views that can be helpful. 
7 D ISCUSSION AND F UTURE W ORK b) Changing the Y-axis to show the amount of the transactions and double-clicking on the highest bead shows the transaction details. Figure 9: Case study using sanitized real-world data. d) The sender. range of tens to hundreds of thousands of dollars. Fraud analysts found the combination of suspicious patterns sufﬁcient to launch a full investigation. and with a very high sum. Since seamlessly supporting the work of the analysts is such a key aspect of this effort. For example.
Interactively exploring hierarchical clustering results. North and B. March. Pitkow.  C.  N. page 51. ACM. ACM Press. H. In IEEE Transactions on Visualization and Computer Graphics. In SoftVis ’05: Proceedings of the 2005 ACM symposium on Software visualization. Dynamic query tools for time series data sets: timebox widgets for interactive exploration. volume 12. A. Information Visualization. time.  R. M. Rao and S. T. A. pages 76–85. S. A User Interface for Coordinating Visualizations Based on Relational Schemata: Snap-Together Visualization.  C. check cashing. pages 677–684. Matrixexplorer: a dual-representation system to explore social networks. Constructing and reconstructing the rea orderable matrix. pages 318–322. 1994. Thomas. University of Maryland. 8 C ONCLUSION We demonstrate that using interactive visualization techniques coupled with hierarchical analyses in searching for suspicious ﬁnancial transactions signiﬁcantly enhances the analysts’ ability to see global trends as well as quickly narrow down to individual activities. 2003. Nowell. Visualizing the non-visual: spatial analysis and interaction with information from text documents. G. 2000. 4(2):114–135. ACKNOWLEDGEMENTS The authors would like to thank James Price and his team of investigators in the Global Anti-Money Laundering organization at Bank of America. In AVI ’00: Proceedings of the working conference on Advanced visual interfaces. Visual discovery and analysis.-D. Cambazard. Siirtola and E. a U. Baldonado.com/. In AVI ’00: Proceedings of the working conference on Advanced visual interfaces. 2000. Mackinlay. Dumais. Matrix zoom: A visual interface to semiexternal graphs. Journal of Computational and Graphical Statistics. We will be expanding WireVis by coupling to analysis tools that search for money service businesses (i. 2004. under the auspices of the SouthEast Regional Visualization and Analytics Center. 
The table lens: merging graphical and symbolic representations in an interactive focus + context visualization for tabular information. and V. J. http://www. 2000. R EFERENCES  J. Environmental Systems Research. 2005. Pottier. Card. P. Department of Energy Ofﬁce of Science laboratory. North. van Wijk and E. and P.  D.  Spotﬁre. 1996.e. pages 128–135. Ghoniem. editors. Adelson. Information Visualization. J. A coordination model for exploratory multi-view visualization.  H. R.  C. Crow. Kuchinsky. IEEE Transactions on Visualization and Computer Graphics. Keim. money orders. D.-D. In CHI ’98: Proceedings of the SIGCHI conference on Human factors in computing systems. Whitney. J. 2006. CHI. J. 35(7):80–86. Jussien. Computer. Roberts. W. 2005. Themeriver: Visualizing thematic changes in large document collections. 2004. Shneiderman. Boukhelifa. currency exchange. Fekete. IEEE Computer Society. Snap-together visualization: can users construct and operate coordinated visualizations? International Journal of Human-Computer Studies. 2002. and P. 53(5):715–739.  M. Castagliola. Fekete.  N. ACM Press/Addison-Wesley Publishing Co. In INFOVIS: Proceedings of the IEEE Symposium on Information Visualization. J. IEEE. Abello and F. businesses that deal in money transmission. 2005.  J. Shneiderman. This allows the analysts to see a complete relationship between accounts. pages 4–9. pages 183–190. Information Visualization.  J. J. a U. pages 110–119. keywords. NVAC is operated by the Paciﬁc Northwest National Laboratory (PNNL). Decision site for functional genomics. With this combination. Seo and B. 2002.  M. E. P. 2000. Rodgers. 2005. van Ham. 4(1):32–48.). 2004. Havre. Pennock. and S. Brewer.  C. K. Shneiderman. PhD thesis. and A.S. 6(1):44–58. A. van Selow. . Guidelines for using multiple views in information visualization. Pixel-oriented visualization techniques for exploring very large databases. K. and J. Ghoniem.. 
the analysts can highlight elements in one view and see that element depicted in a different way in the other. Card. 3(1):1–18. In INFOVIS ’95: Proceedings of the 1995 IEEE Symposium on Information Visualization. 1998. IEEE Computer Society.  E. Fekete. Lantrip. IEEE Computer Society.  S. pages 400–407. S. and N. Woodruff. We create our visualization tools based on the principles of high interactivity and coordinated multiviews. This work was performed with support from Bank of America and the National Visualization and Analytics Center (NVACTM ).. Olson. Visualizing the evolution of web ecologies. A. etc. Schur. On the readability of graphs using node-link and matrix-based representations: a controlled experiment and statistical analysis. In INFOVIS ’99: Proceedings of the 1999 IEEE Symposium on Information Visualization.any keyword-based data over time. In Proceedings of the International Conference on Coordinated and Multiple Views in Exploratory Visualization (CMV 2003). Gossweiler. IEEE Transactions on Visualization and Computer Graphics. pages 27–36. IEEE Computer Society.S. Chi. Shneiderman. M¨ kinen.  J.  S. Henry and J.spotﬁre. C. Department of Homeland Security Program.  H. and patterns of activity. North and B. K. R. Designing Better Maps: A Guide for Gis Users. Peeking in solver strategies using explanations visualization of dynamic graphs for constraint programming. J. and L.  M. J. Cluster and calendar based visualization of time series data. Pirolli. H. Hochheiser and B. 1995. 2000. ACM Press. ACM Press. Snap-together visualization: a user interface for coordinating visualizations via relational schemata. Eick. Q. 1999.-D. Wise. 8(1):9–20. Hetzler. In B.