Link analysis: What is it and why does it matter?

Businesses today are relying on more and more data for their operations and decision making. The data they depend on is also becoming increasingly complex, with all kinds of dependencies within it.

In this context, more organizations are turning to new analytics techniques like graph analytics or link analysis. 

Link analysis helps connect the dots within your data to gain previously unseen insights. No matter the industry, understanding hidden connections and relationships can help drive better business decisions.

This article explores what link analysis is and what the advantages are of using it. We’ll also look at a real-world example of how link analysis can be applied in the context of fraud detection.

What is link analysis?

Link analysis is an analytics technique used to identify, evaluate, and understand the connections within data. The data used for link analysis is stored in a graph database and then displayed as a graph visualization, also called a network visualization.

Individual data points in a graph data model are represented by nodes. These entities are connected by edges, also called relationships, that represent the links between nodes.

Both nodes and edges can have properties that store key information. For a node, this might be the name of a person or a business. For an edge, it could be the amount of a transaction. 
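As a rough illustration of the node, edge, and property model described above, here is a minimal sketch in plain Python. The names, amounts, and field names (`type`, `amount`, `PAID`, `KNOWS`) are hypothetical and only serve the example; a real graph database would expose its own schema and query language.

```python
# Hypothetical toy data: names, amounts, and field names are illustrative only.
nodes = {
    "p1": {"type": "person", "name": "Alice"},
    "p2": {"type": "person", "name": "Bob"},
    "c1": {"type": "company", "name": "Acme Ltd"},
}

# Each edge links two node ids and carries its own properties.
edges = [
    {"source": "p1", "target": "c1", "type": "PAID", "amount": 1200.0},
    {"source": "p2", "target": "c1", "type": "PAID", "amount": 300.0},
    {"source": "p1", "target": "p2", "type": "KNOWS"},
]


def neighbors(node_id):
    """Return the ids of all nodes linked to node_id, ignoring edge direction."""
    linked = set()
    for edge in edges:
        if edge["source"] == node_id:
            linked.add(edge["target"])
        elif edge["target"] == node_id:
            linked.add(edge["source"])
    return linked


def total_paid_to(node_id):
    """Sum the 'amount' property of every PAID edge pointing at node_id."""
    return sum(e["amount"] for e in edges
               if e["type"] == "PAID" and e["target"] == node_id)
```

Calling `neighbors("c1")` returns the two people connected to the company, and `total_paid_to("c1")` adds up the transaction amounts stored on the edges.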

a simple graph visualization showing how different data points are connected

By representing graph data as a network, it becomes much easier to look for information within that data, and spot important trends, patterns, or anomalies.

What are the advantages of using link analysis?

Traditional databases, as opposed to graph databases, store data in rows and columns. It’s easy to search for or verify simple information.

But when you’re working with large datasets or multiple data sources, the relationships within the data can be highly complex. And the more complex the data, the more difficult and time consuming it becomes to run analysis in a traditional database. Traditional analytics are also not designed to analyze the relationships within the data.

image showing a relational database vs a graph database

With graph data and link analysis, on the other hand, data from multiple sources is represented all at once. The connections between entities are treated as first-class citizens, so you can quickly understand what data is connected, and what the nature of those connections is.

These characteristics of link analysis make it an especially advantageous solution for organizations that need to understand the relationships within their data. It comes with several key benefits.

Fast discovery of insights

The human brain can process visual information many times faster than written or numerical information. By displaying your data as a visual network, link analysis can enable you to find exactly the information you need in a matter of seconds.

Scalable data analysis

Link analysis can display and analyze huge amounts of data from multiple sources. Even when the quantity of data is very large, querying data stored in a graph is quick, enabling you to scale any project.

Accessible and intuitive analytics

Displaying data as a network presents an easy and intuitive way for even non-technical users to understand and explore complex connections.

Link analysis use cases

Link analysis provides a powerful tool for identifying patterns and connections that might be difficult - or even impossible - to detect using other methods. By analyzing the relationships between entities, analysts can gain insights into complex networks, surfacing hidden insights or anomalies, or identifying areas for further investigation.

Here are some common use cases for link analysis.

Law enforcement

In a law enforcement context, agencies can use link analysis to identify connections between individual entities or organizations involved in criminal activities: drug trafficking, organized crime, etc.

Using link analysis, investigators can identify individuals who are connected to multiple crimes, for example, or surface connections between criminal suspects and their accomplices.

Cybersecurity

Cybersecurity analysts need to be able to quickly identify security threats and risks to effectively protect their organization.

Link analysis can help them identify patterns of behavior that may indicate suspicious activity. It can be used, for example, to identify connections between malicious domains or IP addresses, or to spot patterns of behavior that indicate phishing or another type of cyber attack.

Intelligence analysis

Intelligence agencies commonly use link analysis to identify connections between individuals or organizations involved in criminal or terrorist activities.

Investigators may use link analysis to flag individuals who are connected to known participants in a terrorist organization or to spot connections between criminals and their associates.

Let’s take a deeper dive into the use case of fraud detection to see in detail how link analysis can be applied.

Fraud detection with link analysis

Fraud detection is one use case where link analysis performs especially well.

Fraud is an increasingly complex problem for financial institutions, insurers, and other businesses to manage. Fraudsters often operate in networks that resemble professional organizations. Oftentimes they work across borders. Fraudsters are also experts in evading prevention and detection systems, quickly evolving their criminal techniques to get around new prevention tools. 

The data analysts use to detect possible cases of fraud is therefore full of complexity, and comes from multiple sources. Link analysis enables you to visualize and explore all that data in one place. Patterns indicative of fraud become much more apparent.

Take the example of fraudulent car accident claims filed with an insurance company. Staging fake car accidents at scale requires multiple policyholders, multiple cars, and multiple car passengers.

Link analysis can help identify whether several people in an insurance company’s client database are interconnected through cars, individuals, repair shops, or claims. The larger the network, the more likely it is that fraud is going on.

link analysis diagram of fake car accident fraud

You can also see individuals who may be at a high risk for fraud at a glance. Say for example that your bank is onboarding a new customer. Everything about her seems normal. But link analysis shows you that she’s connected to a known fraudster by an IP address. Traditional detection systems may have missed this red flag.
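The onboarding scenario can be sketched as a graph traversal: link every customer to the attributes they share (IP addresses, claims, cars), then check whether a new customer ends up in the same connected group as a known fraudster. The records, identifiers, and function names below are hypothetical, and a production system would use far richer data and rules.

```python
from collections import defaultdict, deque

# Hypothetical records: each pair links an entity to an attribute it uses.
links = [
    ("customer:new", "ip:203.0.113.7"),
    ("customer:fraudster", "ip:203.0.113.7"),
    ("customer:fraudster", "claim:77"),
    ("customer:other", "ip:198.51.100.2"),
]
known_fraudsters = {"customer:fraudster"}


def build_adjacency(pairs):
    """Build an undirected adjacency map from (a, b) pairs."""
    adj = defaultdict(set)
    for a, b in pairs:
        adj[a].add(b)
        adj[b].add(a)
    return adj


def connected_component(adj, start):
    """Breadth-first search: every node reachable from start."""
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return seen


def linked_fraudsters(adj, customer):
    """Known fraudsters in the same connected group as the customer."""
    component = connected_component(adj, customer)
    return (component & known_fraudsters) - {customer}


adj = build_adjacency(links)
print(linked_fraudsters(adj, "customer:new"))
```

Here the new customer is flagged because she shares an IP address node with the known fraudster, while `customer:other` is not connected to anyone suspicious.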

Fraud detection is just one example of link analysis applications. This analytics technique can also be applied with impressive results to supply chain, IT, intelligence, cybersecurity, and many other use cases.


About Linkurious and Linkurious Enterprise

Linkurious is a software company providing technical and non-technical users alike with detection and investigation solutions powered by graph technology. Linkurious Enterprise helps more than 3000 analysts and investigators in Global 2000 companies, governmental agencies, and non-profit organizations find insights otherwise hidden in complex connected data, so they can make more informed decisions faster.


PLOS ONE

Link Prediction in Criminal Networks: A Tool for Criminal Intelligence Analysis

Giulia Berlusconi

1 Università Cattolica del Sacro Cuore and Transcrime, Milano, Italy

Francesco Calderoni

Nicola Parolini

2 MOX, Department of Mathematics, Politecnico di Milano, Milano, Italy

Marco Verani

Carlo Piccardi

3 Department of Electronics, Information and Bioengineering, Politecnico di Milano, Milano, Italy

Conceived and designed the experiments: GB FC NP MV CP. Performed the experiments: GB FC NP MV CP. Analyzed the data: GB FC NP MV CP. Contributed reagents/materials/analysis tools: GB FC NP MV CP. Wrote the paper: GB FC NP MV CP.

Associated Data

All network data are available at Figshare ( https://dx.doi.org/10.6084/m9.figshare.3156067 ).

The problem of link prediction has recently received increasing attention from scholars in network science. In social network analysis, one of its aims is to recover missing links, namely connections among actors which are likely to exist but have not been reported because data are incomplete or subject to various types of uncertainty. In the field of criminal investigations, problems of incomplete information are encountered almost by definition, given the obvious anti-detection strategies set up by criminals and the limited investigative resources. In this paper, we work on a specific dataset obtained from a real investigation, and we propose a strategy to identify missing links in a criminal network on the basis of the topological analysis of the links classified as marginal, i.e. removed during the investigation procedure. The main assumption is that missing links should have opposite features with respect to marginal ones. Measures of node similarity turn out to provide the best characterization in this sense. The inspection of the judicial source documents confirms that the predicted links, in most instances, do relate actors with large likelihood of co-participation in illicit activities.

Introduction

Criminal intelligence analysis aims at supporting investigations, e.g. by producing link charts to identify and target key actors. Law enforcement agencies increasingly use Social Network Analysis (SNA) for criminal intelligence, analyzing the relations among individuals based on information on activities, events, and places derived from various investigative activities [ 1 – 3 ]. SNA provides added value compared to more traditional approaches like link analysis, by enabling in-depth assessment of the internal structure of criminal groups and by providing strategic and tactical advantages. For instance, SNA can inform law enforcement officers in the identification of aliases during large investigations and in the collection of evidence for prosecution [ 2 ]. Furthermore, the network analysis of criminal groups under investigation may help identify effective strategies to achieve network destabilization or disruption [ 3 , 4 ].

Given the sensitivity and implications of criminal proceedings, criminal intelligence and investigations strive to achieve the most accurate representation of each case. Information gathering and selection are crucial steps, due to the implications of both type I (false positive) and type II (false negative) errors. A number of controls and procedural safeguards are in place to prevent false positives (i.e. wrong accusations). Investigators, prosecutors, and courts routinely deal with irrelevant information by discarding it throughout the proceedings and keeping only material useful to build a case [ 5 ]. Conversely, the inherently covert nature of criminal activities makes investigations more vulnerable to false negatives (i.e. missing information), with very limited solutions available to the law enforcement agencies due to time and resource constraints.

Missing information is also the main challenge for SNA of criminal networks. Law enforcement data from wiretap or other investigative sources are inevitably incomplete. Criminals often use communication and protection methods to decrease the effectiveness of law enforcement action [ 6 ]. Investigators rely on data-gathering methods (e.g. observations, archives, informants, witnesses) that result in incomplete information and thus a partial vision of the network under investigation [ 5 , 7 – 9 ]. The lack of data generates problems of uncertain information, potentially jeopardizing the effectiveness of the investigations [ 3 ]. In the analysis of criminal networks, missing data can refer to missing nodes and/or missing links [ 7 ].

Missing nodes often depend on the scope and focus of the investigations. In turn, these may affect the specification of network boundaries, i.e. the definition of rules of inclusion of actors and their relations in the network [ 10 – 12 ]. Law enforcement agencies may overlook some important actors, especially if they take precautions against detection [ 5 ]. Research has shown that some skilled criminals assume a strategic position in criminal networks by balancing security and active involvement. Whereas intensive interaction with others normally increases the criminals’ performance, it also affects their visibility and consequently the vulnerability to law enforcement targeting. Some key players (e.g. the boss in a mafia) will avoid direct involvement in the illicit activities to reduce the risk of identification and arrest [ 13 – 15 ]. Nevertheless, the literature points out that even the most skilled criminals may hardly avoid detection in long lasting and intensive investigations, particularly if they have an important role in a criminal group [ 15 ].

Missing links instead refer to the lack of information on the relations between two known criminals. The police may miss meetings, conversations, and plans about criminal activities [ 5 , 16 ]. For instance, criminals may use different telephone lines, according to the nature of the conversation and the interlocutor, and investigators may be able to identify only some of them. The frequent change of mobile phones and SIM cards and the use of particular lines to communicate with high-ranking affiliates may also prevent law enforcement agencies from identifying all conversations among suspects [ 17 ]. This results in incomplete information which may hinder or mislead investigations. Scholars and practitioners in criminology and criminal justice have often acknowledged the problem of missing links [ 5 , 7 , 16 , 18 ]. Yet, studies on their identification in criminal networks are still rare [ 19 – 21 ]. This is surprising, not only given the significant growth of works on missing links in other fields with the development of a number of different strategies [ 22 – 26 ], but also given that criminal investigations face the problem of missing links almost by definition, due to the scarcity of investigative resources and the anti-detection strategies by criminals [ 20 ].

This paper proposes an innovative strategy to identify possible missing links in a criminal network. It draws from the literature on link prediction and applies it to a unique dataset based on a real investigation. Unlike previous studies, the main assumption is that missing links may have characteristics contrary to those of marginal links discarded during the investigation. Indeed, while some links are ordinarily removed from a criminal network due to their marginality, other links with opposite characteristics may be missing due to lack of information. The analysis thus infers missing links a contrario from the characteristics of marginal links actually removed throughout the proceedings. The possible missing links so detected are highly probable social ties whose existence should be investigated by law enforcement agencies. Their identification during ongoing investigations may support law enforcement agencies in the allocation of scarce investigative resources, especially in the case of large criminal networks, and therefore improve the law enforcement action.

The Oversize dataset

The analysis relies on a unique dataset from operation Oversize, an Italian criminal case against a mafia group. The investigation lasted from 2000 to 2006, and targeted more than 50 suspects involved in international drug trafficking, homicides, and robberies. The trial started in 2007 and lasted until 2009, when the judgment was passed, and the main suspects were convicted with penalties from 5 to 22 years of imprisonment. Most suspects were affiliated to the ‘Ndrangheta, a mafia from Calabria (a southern Italian region) with ramifications in other regions and abroad [ 27 , 28 ].

Contrarily to most empirical studies on criminal networks, which rely on data derived from a single source of information, Oversize’s peculiarity lies in the availability of three networks from three judicial documents corresponding to three different stages of the criminal proceedings [ 16 ]: the wiretap records (WR), the arrest warrant (AW), and the judgment (JU). The wiretap records include all wiretap conversations transcribed by the police and considered relevant at first glance. The arrest warrant contains a selection of the transcripts and other relevant information from informants and other investigative activities (e.g. physical surveillance). The judgment summarizes the trial and includes information from several sources of evidence, including wiretapping and audio surveillance. It is worth mentioning that the documents related to the arrest warrant and judgment are public [ 29 , 30 ], whereas wiretap records are not publicly available because they report private conversations involving people other than suspects (access was obtained by the authors through a special permission). Nonetheless, the three networks, derived from a thorough, exhaustive analysis of the textual judicial documents [ 16 ], can be made public because no personal or sensitive information is reported (see Data Availability Statement).

Most studies on criminal networks focus on one or a small number of case studies, and rely on a single source of information [ 4 , 7 , 16 , 18 , 28 , 31 ], because access to data is difficult to obtain, particularly in the case of wiretap records. The main limitation of a case study approach concerns the external validity of the findings, i.e. the extent to which the results can be generalized beyond the case studies [ 32 ]. The analysis of the Oversize dataset focuses on a single criminal network thus sharing similar limitations on external validity with previous studies. The peculiarity of the dataset (i.e. the availability of three networks) prevents replication on other cases. Yet, it simultaneously constitutes the strength and innovation of the current study because it enables observation of the discarded marginal links and the prediction of possible missing links.

The individuals involved in illicit activities constitute the nodes of the networks; the links indicate a relation between any two actors. We restrict the analysis to the undirected case, i.e. we neglect the directionality of links. The three networks are formally defined by $N_i = (V, E_i)$, $i = WR, AW, JU$, where $V$ is the set of nodes (the same for all networks, with $|V| = 182$ nodes) and $E_i$ is the set of links of network $i$. We denote by $(x, y)$, with $x, y \in V$, any pair of nodes of the network, be they connected by a link, i.e. $(x, y) \in E_i$, or not. The number $c_{xy}$ of telephone calls recorded between individuals (nodes) $(x, y)$ is available for all $(x, y) \in E_i$. Table 1 summarizes the main statistics of the three networks. We recall that the degree $k_x$ of a node $x$ is the number of links incident to $x$, i.e. the number of neighbor nodes. A node is isolated if $k_x = 0$. The density of the network is the ratio between the number of existing links $|E_i|$ and their maximum possible number $|V|(|V| - 1)/2$.
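The degree and density definitions can be checked with a few lines of Python. The sketch below is illustrative only: the function names are assumptions, and the only figures taken from the paper are the 182 nodes and the 247 links reported in the text for the wiretap-records network.

```python
def density(num_nodes, num_links):
    """Ratio of existing links to the maximum possible number |V|(|V|-1)/2."""
    max_links = num_nodes * (num_nodes - 1) / 2
    return num_links / max_links


def degrees(nodes, links):
    """Degree of each node in an undirected network given as (x, y) pairs."""
    deg = {x: 0 for x in nodes}
    for x, y in links:
        deg[x] += 1
        deg[y] += 1
    return deg


# Wiretap-records network: 182 nodes, 247 links (figures from the text).
d_wr = density(182, 247)
print(round(d_wr, 4))  # 0.015
```

With 182 nodes there are 16471 possible links, so 247 observed links give a density of about 1.5%, i.e. a very sparse network.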

The networks are simultaneously displayed in Fig 1. The Oversize networks show some features typical of illicit networks. Many criminal organizations analyzed in the literature exhibit the presence of a core of few highly-connected nodes and a large number of peripheral actors [ 4 , 7 , 28 , 33 – 35 ]. Fig 1 highlights (through node coloring) the result of the $k$-shell core-periphery analysis [ 36 ]: nodes are partitioned into “concentric” layers (or shells), starting from the periphery and arriving at the core of the network. Each node is assigned to a shell: the 1-shell contains the most peripheral nodes, the 2-shell those in the layer immediately more internal, and so on. In more detail, the algorithm for $k$-shell decomposition can be summarized as follows [ 36 ]: put in the 1-shell (and remove) all nodes with degree $k_x = 1$, and then all nodes having $k_x \le 1$ after removal of the former; put in the 2-shell (and remove) all nodes with $k_x = 2$, and then all nodes having $k_x \le 2$ after removal of the former; etc. The procedure stops when all nodes have been classified in a $k$-shell. In the $N_{WR}$ network, 4 shells are identified: they include, from the periphery to the core, 123, 33, 19, and 7 nodes, respectively, thus confirming the presence of a core of few actors and a large number of peripheral individuals.
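The peeling procedure described above can be sketched in plain Python as follows. This is an illustrative implementation, not the authors' code: the adjacency-map input format is an assumption, and isolated nodes (degree 0), which the text does not discuss, are placed in a shell labeled 0.

```python
def k_shell_decomposition(adj):
    """Assign each node to a k-shell by iteratively peeling low-degree nodes.

    adj maps each node to the set of its neighbors (undirected, no self-loops).
    Assumption: isolated nodes, not covered in the text, get shell 0.
    """
    remaining = {u: set(vs) for u, vs in adj.items()}  # working copy
    shell = {}
    k = 0
    while remaining:
        # Nodes whose current degree is at most k belong to the k-shell.
        peel = [u for u, vs in remaining.items() if len(vs) <= k]
        if not peel:
            k += 1  # nothing left to peel at this level: move inward
            continue
        for u in peel:
            for v in remaining.pop(u):
                remaining[v].discard(u)  # removing u lowers v's degree
            shell[u] = k
    return shell


# Toy network: a triangle a-b-c with a two-node tail a-d-e.
toy = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a", "e"},
    "e": {"d"},
}
print(k_shell_decomposition(toy))
```

In the toy network, node `e` is peeled first (degree 1); node `d` then drops to degree 1 and joins the 1-shell as well, which is the "after removal of the former" step. The triangle forms the 2-shell.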

Fig 1. The links removed in passing from $N_{WR}$ (Wiretap Records) to $N_{AW}$ (Arrest Warrant) (above), or from $N_{WR}$ to $N_{JU}$ (Judgment) (below), are highlighted in red. Nodes are colored according to their coreness, based on the $k$-shell analysis of the $N_{WR}$ network: 1 = white (most peripheral), 2 = yellow, 3 = orange, 4 = brown (most central).

Network reduction and marginal links

In passing from $N_{WR}$ to $N_{AW}$, 58 of the 247 links of $N_{WR}$ are removed (thus $E_{WR} \supset E_{AW}$), creating 36 isolated nodes. Similarly, in passing from $N_{WR}$ to $N_{JU}$, 134 links are removed ($E_{WR} \supset E_{JU}$), creating 93 isolated nodes. However, the links of $N_{JU}$ are not a subset of those of $N_{AW}$, i.e. the two reductions are not in cascade. This is normal, as subsequent phases of the criminal proceedings may generate new information, e.g. from witnesses or additional investigative activities.
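The bookkeeping behind a network reduction (which links were removed, which were added, and which nodes became isolated) amounts to set operations on the link sets. The sketch below uses a hypothetical helper name and a toy network; it does not reproduce the Oversize data.

```python
def reduction_stats(nodes, links_full, links_reduced):
    """Compare two undirected link sets defined on the same node set.

    Returns (removed links, added links, nodes isolated only after reduction).
    """
    full = {frozenset(link) for link in links_full}
    reduced = {frozenset(link) for link in links_reduced}

    def isolated(link_set):
        touched = set().union(*link_set)  # every node incident to some link
        return set(nodes) - touched

    removed = full - reduced
    added = reduced - full  # empty when the reduced set is a subset
    newly_isolated = isolated(reduced) - isolated(full)
    return removed, added, newly_isolated


# Toy example: dropping the link c-d leaves d isolated.
nodes = ["a", "b", "c", "d"]
full_links = [("a", "b"), ("b", "c"), ("c", "d")]
reduced_links = [("a", "b"), ("b", "c")]
removed, added, newly_isolated = reduction_stats(nodes, full_links, reduced_links)
print(removed, added, newly_isolated)
```

A non-empty `added` set signals that the second link set is not a subset of the first, which is the situation the text describes for the judgment network relative to the arrest-warrant network.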

Fig 1 highlights the links removed in the network reduction processes (i.e. from $N_{WR}$ to $N_{AW}$, and from $N_{WR}$ to $N_{JU}$). The removed links are in most cases associated with a small number of telephone calls (Fig 2). In the original network $N_{WR}$, the number of calls $c_{xy}$ ranges from 1 to 52, with average value $\langle c_{xy} \rangle = 3.95$. On the other hand, the sets of removed links have $\langle c_{xy} \rangle_{WR \to AW} = 1.59$ (ranging from 1 to 6) and $\langle c_{xy} \rangle_{WR \to JU} = 2.53$ (from 1 to 20). None of the links with the highest number of calls is removed. To substantiate this observation, we repeatedly select at random (for $10^5$ repetitions) 58 or 134 links from $N_{WR}$, namely the same number of links removed, respectively, from $N_{WR}$ to $N_{AW}$ and from $N_{WR}$ to $N_{JU}$. It turns out (see the right panels in Fig 2) that the average number of calls of the links actually removed is extremely small, such that the probability of randomly selecting a smaller value is almost zero in both cases. It can safely be claimed that the link removal process tends to be biased by the intensity of the contacts between individuals, as the links with lower intensity are more likely to be removed.
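The random-selection check described above can be sketched as a simple Monte Carlo estimate: draw many random samples of the same size as the removed set and count how often their average falls below the observed one. The function names, the fixed seed, and the toy call counts below are assumptions for illustration; the actual call data are not reproduced here.

```python
import random


def mean(values):
    return sum(values) / len(values)


def fraction_below(all_values, observed_mean, sample_size,
                   repetitions=100_000, seed=0):
    """Estimate the probability that a random sample of links (drawn without
    replacement) has a smaller average value than the observed one."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    below = 0
    for _ in range(repetitions):
        sample = rng.sample(all_values, sample_size)
        if mean(sample) < observed_mean:
            below += 1
    return below / repetitions


# Toy call counts (hypothetical): 5 weak links and 20 strong ones.
calls = [1] * 5 + [10] * 20
removed = [1] * 5  # suppose exactly the weak links were removed
p = fraction_below(calls, mean(removed), len(removed), repetitions=1000)
print(p)
```

A value of `p` close to zero indicates that the removed links have an unusually low average compared with random selections, which is the pattern the text reports for the number of calls.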

Fig 2. Left panels: in green, the histogram of the number of calls of the links of $N_{WR}$. In yellow, the number of calls of the links removed in passing from $N_{WR}$ to $N_{AW}$ (above, 58 removals), and from $N_{WR}$ to $N_{JU}$ (below, 134 removals). Right panels: the distribution of the average number of calls of a random sample of 58 links (above) or 134 links (below) of $N_{WR}$, compared with the average number of calls (red vertical line) of the links actually removed from $N_{WR}$ to $N_{AW}$ (above) and from $N_{WR}$ to $N_{JU}$ (below).

The removed links often connect two individuals who had occasional contacts during the two-year investigation. In some cases, they concern pairs of actors who had telephone conversations on a few occasions and for very specific purposes (e.g. the purchase of small quantities of drugs). For instance, n63 (we refer to individuals by means of their anonymized label) was involved in a small number of telephone conversations with different retailers to arrange the purchase of small quantities of drugs on different occasions during the investigation. However, since he was not involved in the trafficking activities, nor in other serious crimes, the links between him and the other alleged criminals were discarded by the police when passing from $N_{WR}$ to $N_{AW}$. On other occasions, the links removed from the network involved at least one individual who had not been identified by the police. Indeed, 11 out of the 58 links discarded from $N_{WR}$ to $N_{AW}$ involved individuals who participated in (minor) illicit activities and are reported in the judicial documents with the initials “V.M.” (male voice) or “V.F.” (female voice), or with the name or nickname mentioned in the telephone conversations. The same applies to 32 links out of the 134 removed when passing from $N_{WR}$ to $N_{JU}$.

Other removed links with low intensity concern conversations about issues unrelated to the main illicit activities conducted by the members of the criminal group. Two examples are indicative of this type of link. On one occasion, n40 and n39 discuss the debts that a third person has towards n39; on another occasion, n49 informs n26 of the arrest of another member of the group. In both cases, the links are formed as a consequence of an occasional communication between two individuals. Such communications may be useful to have a complete picture of the criminal network, but the links did not represent stable communication channels or relations among network members, nor did they add any relevant information to the investigation process, and they were discarded by the police.

Although it is certainly true that many removals involve peripheral nodes (especially in the $N_{WR}$ to $N_{JU}$ case), the visual inspection of Fig 1 reveals that many removals concern links which are instead connected, on one or both sides, to nodes with medium/large coreness. It cannot be claimed, therefore, that the network reduction is a process trivially involving the network periphery only. On the other hand, we already pointed out that the intensity of the contacts (number of telephone calls) seems to be associated with the classification of marginal links by the police (see Fig 2). However, this quantity cannot be used for link prediction, since it cannot be associated in a straightforward way with a potential (undetected) link whose likelihood we want to quantify. As a matter of fact, there is no obvious way to associate a “predicted weight” with a predicted (thus unobserved) link.

In the following, we will assess two topological indicators, namely link betweenness and node similarity, for their ability to characterize the links which are marginal and thus, a contrario, to predict the links which have not been detected but are likely to exist (missing links). These two quantities can indeed be used for this exercise, since their value can be naturally associated with a non-existing (predicted) link, unlike the link weight (i.e. the number of calls). For this analysis, we will work on the unweighted (binary) network, i.e. we will neglect the information on the number of calls, both because we want to assess the predictive capability of the pure topological information (e.g. who is in contact with whom), and because the actual benefit of using weights in link prediction is known to be questionable [ 37 ].

Link betweenness

Our first hypothesis is that removed links are characterized by low betweenness. This means that they are redundant in the sense that they connect individuals who are already connected in some way in the network and do not significantly improve the flow of information. Networks are generally composed of subgroups (or communities) connected by one or a few links that bridge between them. “Structural holes”[ 38 ] are non-redundant contacts that lie in a brokerage position between otherwise disconnected components and thus facilitate the exchange of information and ideas. Links connecting different communities have high link betweenness (a generalization of Freeman’s node betweenness [ 39 , 40 ]), since they are crucial to connect different parts of a network. Conversely, within-community links are to some extent redundant and their removal is likely to have little impact on the network. Our first hypothesis is thus tested through the computation of the link betweenness for both removed and non-removed links. We recall that, given a link ( x , y ) ∈ E i connecting nodes x and y , the link betweenness b xy is the number of shortest paths passing through ( x , y ), among those connecting all node pairs ( s , t ) of the network. More precisely:

$$b_{xy} = \sum_{(s,t)} \frac{B_{st}^{xy}}{B_{st}},$$

where the sum runs over the node pairs $(s, t)$ of the network, $B_{st}$ is the number of (equivalently) shortest paths connecting $(s, t)$, and $B_{st}^{xy}$ is the number of such paths passing through $(x, y)$. Betweenness thus emphasizes those links that favor the exchange of information among network members. The first hypothesis thus assumes that marginal links may have low betweenness and this may explain why they were discarded throughout the proceedings.
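To make the definition concrete, the sketch below computes link betweenness directly from it: for every node pair, it counts the shortest paths through each link and divides by the total number of shortest paths for that pair. This is a brute-force illustration rather than the authors' implementation. It assumes an undirected network without self-loops and counts each unordered pair once; counting ordered pairs would simply double every value.

```python
from collections import deque
from itertools import combinations


def bfs_counts(adj, s):
    """Distances from s and number of shortest paths from s to each node."""
    dist = {s: 0}
    sigma = {s: 1}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:  # first time v is reached
                dist[v] = dist[u] + 1
                sigma[v] = 0
                queue.append(v)
            if dist[v] == dist[u] + 1:  # u precedes v on a shortest path
                sigma[v] += sigma[u]
    return dist, sigma


def link_betweenness(adj):
    """Sum over node pairs of (shortest paths through link) / (all shortest paths)."""
    info = {s: bfs_counts(adj, s) for s in adj}
    links = {frozenset((u, v)) for u in adj for v in adj[u]}
    b = {link: 0.0 for link in links}
    for s, t in combinations(adj, 2):
        dist_s, sigma_s = info[s]
        dist_t, sigma_t = info[t]
        if t not in dist_s:
            continue  # s and t are disconnected
        d_st, paths_st = dist_s[t], sigma_s[t]
        for link in links:
            x, y = tuple(link)
            for a, c in ((x, y), (y, x)):  # link traversed as a -> c
                if (a in dist_s and c in dist_t
                        and dist_s[a] + 1 + dist_t[c] == d_st):
                    b[link] += sigma_s[a] * sigma_t[c] / paths_st
    return b


# Toy network: two triangles joined by the bridge c-d.
toy = {
    "a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"},
    "d": {"c", "e", "f"}, "e": {"d", "f"}, "f": {"d", "e"},
}
scores = link_betweenness(toy)
print(scores[frozenset(("c", "d"))], scores[frozenset(("a", "b"))])
```

In the toy network, the bridge between the two triangles carries all nine cross-group shortest paths, whereas the within-group link a-b carries only one, which illustrates why links connecting different communities score high.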

Node similarity

Our second, alternative hypothesis is based on the literature on link prediction. Several studies have applied different link prediction methods to a number of networks. They show that nodes are more likely to be connected when they are similar and share a number of features [ 22 , 23 ]. According to the second hypothesis, thus, marginal links connect structurally dissimilar nodes, i.e. individuals who occasionally collaborate but are dissimilar in terms of interests, background, and involvement in criminal activities. Therefore, these connections are not crucial for the criminal conducts. The literature proposes several analytical strategies for link prediction, with new methods constantly added, mostly based on measures of node similarity [ 24 – 26 ]. Given the small size of the Oversize networks, such strategies are a viable option, since the exhaustive calculation of similarities for all node pairs is computationally feasible. The hypothesis is that marginal links have low similarity scores and this would explain their removal.

Node similarity approaches attribute a score $s_{xy}$ to all node pairs $(x, y)$ and, consequently, induce a ranking of all node pairs. Notice that, if $(x, y) \in E_i$ (the set of links), $s_{xy}$ can be interpreted as a score attributed to the link. Thus, node similarity actually yields a ranking of all the links $E_i$. Among the many possible similarity scores, the simplest one amounts to counting the number of Common Neighbors (CN) of nodes $(x, y)$:

$$s_{xy}^{CN} = |\Gamma(x) \cap \Gamma(y)|,$$

where $\Gamma(z)$ denotes the set of neighbors of node $z$. The rationale is that $(x, y)$ must have common features, interests, etc., if they have many common acquaintances. Thus it is likely that they are directly connected, or that they will be in the near future. Empirical evidence for this assumption has been found in many instances [ 41 , 42 ].

The CN similarity score can be refined in many ways, e.g. by weighting—not simply counting—the number of common neighbors. One of these ways leads to the definition of the Resource Allocation (RA) similarity score:

$$s_{xy}^{RA} = \sum_{z \in \Gamma(x) \cap \Gamma(y)} \frac{1}{k_z},$$

where $k_z = |\Gamma(z)|$ is the degree of node $z$. Here, the role of the common neighbor $z$ in connecting $(x, y)$ is diluted if $z$ has many connections, since it will have fewer resources allocated to bridge $(x, y)$.
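Both similarity scores are straightforward to compute from neighbor sets: Common Neighbors counts the shared neighbors of a pair, and Resource Allocation sums the inverse degree of each shared neighbor. The sketch below is illustrative; the function names, the adjacency-map format, and the toy network are assumptions, not the authors' code.

```python
from itertools import combinations


def common_neighbors(adj, x, y):
    """CN score: number of neighbors shared by x and y."""
    return len(adj[x] & adj[y])


def resource_allocation(adj, x, y):
    """RA score: each shared neighbor z contributes 1 / degree(z)."""
    return sum(1 / len(adj[z]) for z in adj[x] & adj[y])


def rank_pairs(adj, score):
    """Rank all node pairs by decreasing similarity score."""
    pairs = combinations(sorted(adj), 2)
    return sorted(((score(adj, x, y), x, y) for x, y in pairs), reverse=True)


# Toy network: x and y share neighbors z1 and z2; z2 is also linked to w.
toy = {
    "x": {"z1", "z2"},
    "y": {"z1", "z2"},
    "z1": {"x", "y"},
    "z2": {"x", "y", "w"},
    "w": {"z2"},
}
print(common_neighbors(toy, "x", "y"))     # 2
print(resource_allocation(toy, "x", "y"))  # 1/2 + 1/3
print(rank_pairs(toy, resource_allocation)[0])
```

In the toy network, `z2` has three connections, so it contributes less to the RA score of the pair (x, y) than `z1` does. Ranking all pairs, including unconnected ones, is what allows such scores to be used for link prediction: the top-ranked pair here, (z1, z2), is not linked in the toy network.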

CN and RA are widely used to quantify node similarity. Extensive tests of the capability of a broad set of indicators (including the two above) to solve the link prediction problem found that CN performs very well despite its extreme simplicity, whereas RA ranks as one of the best indicators on a large set of benchmark tests [24].

Fig 3 shows the relationship between the number of calls c_xy, the betweenness b_xy, and the similarity score s_xy, both for the whole network N_WR and for the marginal links (for brevity, we only consider the reduction from N_WR to N_AW). The figure reveals that all the removed links lie among those with low similarity score, whereas the removed links are spread throughout the entire betweenness range. On the basis of this preliminary observation, we now consider the two hypotheses discussed above.

Fig 3. Each blue cross corresponds to a link (x, y) ∈ E_WR. Red circles highlight the links removed in passing from N_WR to N_AW. The horizontal axis is truncated to improve readability: only 4 links out of 247 have c_xy > 20, none of which is removed.

To check the first hypothesis (i.e. the removed links have low betweenness), we compute the betweenness of all links of the network N_WR and compare their statistics to those of the links removed in passing to N_AW or, respectively, N_JU. The results are summarized in Fig 4. The average betweenness of the links of N_WR is ⟨b_xy⟩_WR = 249.4, and the averages for the removed links are not markedly different, namely ⟨b_xy⟩_WR→AW = 300.7 and ⟨b_xy⟩_WR→JU = 238.0, respectively. Incidentally, some of the removed links have a betweenness of the order of the highest values found in the network (left panels in Fig 4). Furthermore, if we repeatedly select at random (for 10^5 repetitions) 58 or, respectively, 134 links to remove (the numbers of links removed from N_WR to N_AW and from N_WR to N_JU), we find that the average betweenness of the links actually removed is by no means anomalously small; in the N_WR to N_AW case it is even larger than average (right panels in Fig 4). This leads to the rejection of our first hypothesis.

Fig 4. Left panels: in green, the histogram of the betweenness of the links of N_WR. In yellow, the betweenness of the links removed in passing from N_WR to N_AW (above, 58 removals), and from N_WR to N_JU (below, 134 removals). Right panels: the distribution of the average betweenness of a random sample of 58 links (above) or 134 links (below) of N_WR, compared with the average betweenness (red vertical line) of the links actually removed from N_WR to N_AW (above) and from N_WR to N_JU (below).

We now move to our second hypothesis (i.e. the removed links connect structurally dissimilar nodes) and adopt a strategy common in the research on missing links, i.e. node similarity. We compute the similarity score s_xy (i.e. the similarity of the node pair (x, y)) of all the links of the network N_WR, and we compare their statistics to those of the links removed in passing to N_AW or, respectively, N_JU. The results are summarized in Fig 5 for the CN similarity score (Eq (2)). The average score of the links of N_WR is ⟨s_xy⟩_WR = 0.789, whereas those of the removed links are much smaller, namely ⟨s_xy⟩_WR→AW = 0.397 and ⟨s_xy⟩_WR→JU = 0.455, respectively. None of the removed links has a score of the order of the highest values found in N_WR (left panels in Fig 5).

Fig 5. Left panels: in green, the histogram of the score of the links of N_WR (247 links). In yellow, the score of the links removed in passing from N_WR to N_AW (above, 58 removals), and from N_WR to N_JU (below, 134 removals). Right panels: the distribution of the average score of a random sample of 58 links (above) or 134 links (below) of N_WR, compared with the average score (red vertical line) of the links actually removed from N_WR to N_AW (above) and from N_WR to N_JU (below).

To assess the statistical significance of the above observation, we repeatedly select at random (for 10^5 repetitions) the same number of links as were removed from N_WR to N_AW and, respectively, from N_WR to N_JU (58 or 134 links). The average score of the links actually removed is extremely small: the probability of randomly selecting a set with a smaller average score is p < 0.01 in both cases (right panels in Fig 5). This means that the link removal process, if assessed in terms of the similarity score s_xy, appears to be strongly biased towards the links with the lowest scores. In this respect, the number of calls c_xy and the score s_xy associated with links seem to play a similar role in driving the removal process. However, as already pointed out, the former cannot be used for link prediction purposes.
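The resampling test described above can be sketched as follows. This is a minimal illustration under stated assumptions rather than the code used in the study: the function name `empirical_p_value`, the toy scores, and the reduced number of repetitions are all hypothetical, and links are sampled without replacement.

```python
import random


def empirical_p_value(all_scores, removed_scores, repetitions=100_000, seed=0):
    """Estimate the probability that a random set of links, of the same size as
    the removed set, has a smaller average score than the removed links."""
    rng = random.Random(seed)
    size = len(removed_scores)
    observed_mean = sum(removed_scores) / size

    smaller = 0
    for _ in range(repetitions):
        # Draw `size` links at random, without replacement, from the whole network.
        sample = rng.sample(all_scores, size)
        if sum(sample) / size < observed_mean:
            smaller += 1
    return smaller / repetitions


# Hypothetical scores for a 20-link network (not the Oversize data).
toy_scores = [0] * 6 + [1] * 6 + [2] * 4 + [3] * 4
p_low = empirical_p_value(toy_scores, [0, 0, 0, 0, 0], repetitions=2000)
p_high = empirical_p_value(toy_scores, [3, 3, 3, 3], repetitions=2000)
```

A removed set made only of the lowest scores cannot be undercut by any random sample, so its estimated probability is 0; a removed set made only of the highest scores is undercut by almost every sample.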

The above results are confirmed if we instead adopt the RA similarity score (Eq (3)). Here the average score of the links of N_WR is ⟨s_xy⟩_WR = 0.124, whereas those of the removed links are ⟨s_xy⟩_WR→AW = 0.046 and ⟨s_xy⟩_WR→JU = 0.067. Again, the probability of randomly selecting a smaller average score is p < 0.01 in both cases. Therefore, the hypothesis that removed links connect individuals who are structurally dissimilar (i.e. individuals who occasionally collaborate but are different in terms of tasks and involvement in criminal activities) can be accepted. Node similarity scores can thus be adopted to identify missing links within the Oversize network.

Prediction of missing links

Our goal is now to identify the possible missing links in the Oversize network by inferring them a contrario, on the basis of the characteristics of the marginal links (i.e. links removed along the criminal proceedings) identified through the testing of the two hypotheses above. Given that the link removal process proved to be strongly biased towards the smallest similarity scores, it is reasonable to presume that unobserved links (i.e. pairs of actors) with large similarity scores might be connected by missing links. In other words, if a small similarity between two connected actors reveals the marginality of their link, a large similarity should be indicative of a connection, even when the link was not identified by law enforcement agencies. Attributing a large likelihood of existence to links connecting highly similar nodes is the basis of network reconstruction in all those fields where the knowledge of the complex set of interactions among agents is admittedly largely incomplete, such as social [12] or biological networks [43].

Let us first consider the CN score, defined by Eq (2). If we compute the similarity s_xy of the 247 links of the network N_WR, we find that the values range from 0 to 7, with average ⟨s_xy⟩ = 0.789. On the other hand, if we compute s_xy for all (x, y) ∉ E_WR, i.e. for all node pairs not directly connected, we find values ranging from 0 to 5, but a much smaller average ⟨s_xy⟩ = 0.123. Indeed, if we exhaustively consider all the combinations of a pair (x, y) ∈ E_WR with another pair (x′, y′) ∉ E_WR, we find that the latter has a higher similarity score than the former in only 19.7% of the cases.
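The exhaustive comparison between connected and unconnected pairs can be sketched as below. The function name `fraction_nonlink_higher` and the toy scores are hypothetical; the sketch counts only strict inequalities, which is an assumption about how ties were handled.

```python
def fraction_nonlink_higher(link_scores, nonlink_scores):
    """Fraction of (link, non-link) combinations in which the unconnected pair
    has a strictly higher similarity score than the connected pair."""
    higher = sum(
        1
        for s_link in link_scores
        for s_non in nonlink_scores
        if s_non > s_link
    )
    return higher / (len(link_scores) * len(nonlink_scores))


# Hypothetical scores (not the Oversize data).
toy_link_scores = [0, 1, 2, 3]
toy_nonlink_scores = [0, 0, 1, 2]
fraction = fraction_nonlink_higher(toy_link_scores, toy_nonlink_scores)
```

For a network of the size discussed here (247 links and 16224 unconnected pairs) this amounts to about four million comparisons, which is cheap to enumerate directly.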

Since s_xy is significantly higher for directly connected pairs (x, y), it is reasonable to presume that those pairs (x, y) ∉ E_WR with extremely large s_xy are actually connected by a missing link, i.e. a link that exists but was not observed. More precisely, if we set a threshold value S (typically large), we can compute the fraction α (typically small) of existing links (x, y) ∈ E_WR with s_xy ≥ S. If we now take a pair (x′, y′) ∉ E_WR such that s_x′y′ ≥ S, then the probability that s_x′y′ ≥ s_xy is at least 1 − α (i.e. typically large) for a randomly chosen (x, y) ∈ E_WR; in other words, the predicted link (x′, y′) falls among the node pairs with the highest similarity.
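The thresholding procedure can be sketched as follows, assuming the scores of existing links are given as a list and the scores of unconnected pairs as a dictionary keyed by node pair. The function name `reliability_and_predictions` and the toy values are hypothetical.

```python
def reliability_and_predictions(link_scores, nonlink_scores, threshold):
    """Return (1 - alpha, predicted pairs) for a similarity threshold S.

    alpha is the fraction of existing links whose score is >= threshold;
    predicted pairs are the unconnected pairs whose score is >= threshold.
    """
    alpha = sum(1 for s in link_scores if s >= threshold) / len(link_scores)
    predicted = sorted(pair for pair, s in nonlink_scores.items() if s >= threshold)
    return 1.0 - alpha, predicted


# Hypothetical scores (not the Oversize data).
toy_link_scores = [0, 0, 0, 1, 1, 1, 2, 2, 3, 4]
toy_nonlink_scores = {("a", "b"): 3, ("a", "c"): 1, ("b", "d"): 5}
reliability, predicted = reliability_and_predictions(
    toy_link_scores, toy_nonlink_scores, threshold=3
)
```

Raising the threshold increases the reliability 1 − α but shrinks the list of predicted links, which is the trade-off behind the choice of S.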

Fig 6 reports the relationship between the similarity threshold S, the number of predicted links N_pred, and the link “reliability” 1 − α. In the following we focus our discussion on S = 3, a value which corresponds to 1 − α ≈ 0.90 and to 17 predicted links (among the |V|(|V| − 1)/2 − |E_WR| = 16224 pairs not directly connected). It is a reasonable trade-off between a threshold that is too tight (S = 4, with N_pred = 3) and one that is too loose (S = 2, with N_pred = 100), as the number of predicted links is of the order of roughly one tenth of the existing links. The predicted links are highlighted in Fig 7. Notice that they mostly connect nodes with large centrality (i.e. k-shell coreness), and thus they could represent important, yet overlooked, relationships among key individuals.
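As a quick consistency check on the count of candidate pairs, the expression for the number of unconnected pairs can be inverted to recover the network size. The value |V| = 182 below is inferred from the figures quoted in this paragraph (16224 unconnected pairs and 247 links); it is not stated explicitly in this section.

```latex
\begin{align*}
\frac{|V|(|V|-1)}{2} &= 16224 + |E_{WR}| = 16224 + 247 = 16471 \\
|V|^2 - |V| - 32942 &= 0 \\
|V| &= \frac{1 + \sqrt{1 + 4 \cdot 32942}}{2} = \frac{1 + 363}{2} = 182
\end{align*}
```

With these numbers, the 17 predicted links amount to about 0.1% of the candidate pairs (17/16224 ≈ 0.00105).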

Fig 6. The plot visualizes the relationship between the number of predicted links N_pred, the link reliability 1 − α, and the similarity threshold S. The inset replicates the part of the plot with the highest reliability values.

Fig 7. The Oversize network N_WR of the Wiretap Records (nodes and links in grey), with the 17 predicted links with largest CN similarity score s_xy (in blue). Nodes are colored according to their coreness based on the k-shell analysis (1 = white (most peripheral), 2 = yellow, 3 = orange, 4 = brown (most central)). The two parts of the network most relevant for link prediction are magnified at the bottom.

In light of this, we carried out a new round of analysis of the judicial documents to look for clues of possible connections among the relevant individuals: the results are discussed below and summarized in Table 2. It should be emphasized that the absence of the predicted links from the original network N_WR essentially means that those connections did not correspond to any recorded telephone call in the period of investigation (see the Discussion section for an overview of possible reasons). This does not exclude, however, the existence of a social connection of some other nature, which is crucial to discover in order to obtain the most complete picture possible of the criminal network.

For all predicted links, with the only exception of (n5, n39), the analysis of the judicial documents finds evidence of the likelihood of a social tie.

Node similarity predicts a link between n49 and n27, two of the main traffickers within the criminal network; n49 is the son of the boss and, with his father in jail, he was in charge of the trafficking activities, the management of the criminal group, and the investment of the proceeds of crime in both legal and illegal activities; n27 was heavily involved in the drug trafficking activities; in particular, he was charged with being responsible for the purchase and retail of large quantities of cocaine. Considering their roles within the criminal group, it is highly probable that the two knew each other personally and had contacts. Similar considerations apply to the missing link identified between n49 and n48, who was in charge of the wholesale distribution of drugs in the province of Lecco, in the north of Italy. The judicial documents suggest that they collaborated through the mediation of other members of the criminal organization. However, n49 and n48 lived in the same area and both had key roles in the drug distribution chain, increasing the likelihood of a link between the two, as identified by the node similarity scores.

Node similarity also predicts a link between n50 and n160. The former is n49’s brother, also involved in drug trafficking activities. The latter is a fugitive who acted as a broker in the wholesale of drugs. His life on the run was facilitated by n49, who provided him with constant support. Considering the strong link between n49 and n160, and the close relationship between n49 and n50, it is likely that n50 and n160 also knew each other personally. Another predicted link is the one between n118, who is the wife of n45, and n36. Indeed, n118 is one of the few women suspected of being involved in the illicit activities of the criminal group. She was aware of her husband’s involvement in drug trafficking, and her telephone calls discussing drug debts were intercepted by the police. The husband of n118 used to buy cocaine from n36 on behalf of other members of the criminal organization. The two men’s frequent contacts and n118’s involvement in illicit activities indicate that n118 and n36 may have known each other. The likelihood of a link between n13 and n43, also predicted by node similarity measures, is supported by a telephone call intercepted by Italian law enforcement agencies during the investigation. No conversations were recorded between the two alleged criminals; however, in June 2004 n13 informed another member of the organization of n43’s arrest, indicating that n13 and n43 knew each other.

Other links predicted by the CN similarity score include those forming a closed triad among n40, n53, and n147. The three suspects were involved in drug retail in the province of Lecco and used to buy drugs from the same wholesalers. As for n49 and n48, sharing drug distribution channels and operating in the same area justifies high node similarity scores. Nodes n40, n53, and n147 all share a missing link with n48’s boss n19, a drug trafficker involved in the wholesale of cocaine in the province of Lecco. A direct link between n19 and the three retailers was never confirmed by the police; however, the four suspects had trade relationships through n19’s subordinates, including n48, and they may have known each other personally. Two links were also predicted between n19’s assistant n24, and n48 and n147, respectively. The need to balance security and efficiency may have resulted in a division of labor between n19 and n24, with the former dealing cocaine with n48 and, indirectly, n147, and the latter having contacts with other wholesalers and retailers in the Lecco province. The strong relationship between n19 and n24, however, makes the predicted links very likely to have existed in the criminal organization. Another closed triad is formed by predicted links among n28, n26, and n140: n28 is n27’s younger brother, and his activities included blending and hiding cocaine before its sale. The drug was then distributed by n27 with the help of n26, n140 and other wholesalers. Although no conversations or meetings were recorded among n28, n26, and n140, it is thus likely that they knew each other or had contacts in the past.

Overall, the thorough analysis of the judicial documents allowed us to validate, with a reasonable degree of reliability, 16 out of 17 of the links predicted by the CN similarity scores.

We now investigate the predictive capability of the RA similarity score, defined by Eq (3). The relationship between the similarity threshold S, the number of predicted links N_pred, and the link “reliability” 1 − α is not only qualitatively, but also quantitatively very similar to that displayed in Fig 6 for the CN score (we omit the figure for the sake of conciseness). In particular, to facilitate a direct comparison with the CN results, we again select a threshold value (in this case S = 0.45) such that 17 links are predicted with a reliability 1 − α ≈ 0.90. It turns out that the links predicted by RA only partially overlap with those predicted by CN, since only 5 links out of 17 are designated by both methods. The attempt to validate the 12 new links through the analysis of the judicial documents, however, was not conclusive: no strong evidence was found for them, contrary to what was described above for the CN score.

It therefore seems that the RA similarity score, in this specific case, has a weaker predictive capability than the CN score. To interpret this fact, we focus on the 12 links predicted by RA but not by CN: notice that, having selected S = 3 for CN, they necessarily correspond to node pairs having exactly 1 or 2 common neighbors. Unconnected pairs, i.e. (x, y) ∉ E_WR, have a maximum RA score of about s_xy = 0.625. In view of Eq (3), to get a top-ranking RA score it is sufficient to have a common neighbor that is exclusive to the node pair (i.e. a degree-2 node), since this contributes 1/k_z = 1/2 and thus guarantees s_xy ≥ 0.5 (only 12 node pairs out of 16224 meet this inequality). This represents a peculiar form of connection, especially if we compare it with the typical scenario of CN top-ranking pairs, which are instead connected by 4 or 5 common neighbors. Fig 8 displays two representative cases of predicted links which are in the top-ranking positions for CN and RA, respectively, but are not predicted by the other method. The local network structure appears to be strongly different: in the CN case, the predicted link is immersed in a dense community, contrary to the RA case. Indeed, if we compute the average clustering coefficient of the nodes connecting the predicted links which are not in common between the two methods, we find c_avg = 0.431 for CN and c_avg = 0.090 for RA, a clear indication of a different local topology. On the other hand, the local topology around the link predicted by RA suggests that n149 is likely to have the peculiar role of brokering two important subnetworks (notice the large number of neighbors of n9 and n43). If so, it is not surprising that no direct connection exists, since the intermediation is carried out precisely by n149.

Fig 8. The left panel portrays the portion of the N_WR network around the link (n19, n147), predicted by the CN score (incidentally, (n13, n43) is also a predicted link). The right panel portrays the portion of the network around the link (n9, n43) predicted by the RA score.
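The contrast in local topology discussed above rests on the clustering coefficient, which can be sketched as follows. The function name `local_clustering` and the toy network are hypothetical, and assigning 0 to nodes with fewer than two neighbors is a convention assumed here, not one stated in the text.

```python
def local_clustering(edges):
    """Return {node: local clustering coefficient} for an undirected edge list.

    For a node with k >= 2 neighbors, the coefficient is the number of links
    among its neighbors divided by the k(k-1)/2 possible ones.
    """
    neighbors = {}
    for x, y in edges:
        neighbors.setdefault(x, set()).add(y)
        neighbors.setdefault(y, set()).add(x)

    coefficients = {}
    for node, nbrs in neighbors.items():
        k = len(nbrs)
        if k < 2:
            coefficients[node] = 0.0  # assumed convention for degree 0 or 1
            continue
        # Count each link among the neighbors once (u < v).
        links = sum(1 for u in nbrs for v in nbrs if u < v and v in neighbors[u])
        coefficients[node] = 2.0 * links / (k * (k - 1))
    return coefficients


# Hypothetical toy network: a dense triangle (a, b, c) and a broker z
# whose two neighbors (c and y) are not linked to each other.
toy_edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "z"), ("z", "y")]
clustering = local_clustering(toy_edges)
```

Here the triangle member a has coefficient 1, while the broker z has coefficient 0, mirroring the difference between a link embedded in a dense community and one mediated by a single intermediary.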

To further explore which link prediction methods are appropriate in this specific case, we broaden the scope of the analysis by testing two additional methods, namely the Katz index similarity (e.g., [24]) and the Structural Perturbation Method (SPM) [25]. Both of them are global, i.e., the likelihood of a predicted link depends on the entire network. This is not the case for the CN and RA methods, which are based on a similarity score s_xy whose value only depends on the local structure of the network around (x, y).

Given an undirected, unweighted network with adjacency matrix A, the Katz index defines the similarity of nodes (x, y) by

$$ s_{xy} = \sum_{k=1}^{\infty} \beta^k (A^k)_{xy} \qquad (4) $$

where 0 < β < 1/λ_max(A) to ensure convergence. Recalling that (A^k)_xy is the number of paths of length k connecting (x, y), and noting that A_xy = 0 if the link (x, y) does not exist (which is the case when we quantify the likelihood of (x, y) for prediction), we interpret Eq (4) as a generalization of the CN score, since it considers the paths of all lengths connecting (x, y) instead of only those of length 2, which are those passing through the common neighbors.
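The Katz index can be sketched by summing the weighted powers of the adjacency matrix directly. Truncating the series at a finite length is a simplification assumed here for illustration (the series converges for β below 1/λ_max, so the neglected tail is small); the names `mat_mul`, `katz_scores`, and `max_length` are hypothetical.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def katz_scores(adj, beta, max_length=50):
    """Approximate the Katz index by summing beta^k * A^k for k = 1..max_length."""
    n = len(adj)
    scores = [[0.0] * n for _ in range(n)]
    power = [[float(i == j) for j in range(n)] for i in range(n)]  # A^0 = identity
    weight = 1.0
    for _ in range(max_length):
        power = mat_mul(power, adj)  # now holds A^k
        weight *= beta               # now holds beta^k
        for i in range(n):
            for j in range(n):
                scores[i][j] += weight * power[i][j]
    return scores


# Hypothetical toy network: the path 0 - 1 - 2, where nodes 0 and 2 are
# unconnected and share the single common neighbor 1.
path_adj = [[0, 1, 0],
            [1, 0, 1],
            [0, 1, 0]]
katz = katz_scores(path_adj, beta=0.1)
```

For this path the leading term of the score of (0, 2) is β² times its CN score, i.e. 0.01, and the longer paths add only a small correction, which illustrates why Katz and CN rankings agree when β is small.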

For the network N_WR we have λ_max(A) = 7.07 and thus 0 < β < 0.141. To facilitate the comparison with the results discussed above, we again select the top-17 predicted links according to the index of Eq (4). It turns out that the 17 predicted links are the same as those of CN in the range 0 < β < 0.060, while for β = 0.100 the links predicted in common by Katz and CN reduce to 13 (but only 4 are in common between Katz and RA). Interestingly, the 4 new links predicted by Katz, namely (n9, n39), (n13, n40), (n24, n40), and (n43, n143), are, in the CN ranking, in the set immediately below the top-17. Most notably, we were able to find in the judicial documents clear evidence of the likelihood of these social ties (we omit the details for brevity). Overall, we can safely claim that the results of the global link prediction method based on Katz similarity are consistent with those of the CN approach and, as such, they depart significantly from those obtained by the RA method.

The SPM considers the set of predicted links as a perturbation of the nominal network (coded by the adjacency matrix A) which, however, preserves its structural features (see [25] for details). To quantify the sensitivity to perturbations, a small portion of links is randomly selected and removed, so that we can write A = A^R + ΔA, with the (symmetric) matrix ΔA containing the removed links. Then A^R is decomposed according to its eigenbasis:

$$ A^R = \sum_{k=1}^{|V|} \lambda_k v_k v_k^T \qquad (5) $$

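where |V| is the number of nodes, and λ_k and v_k are the eigenvalues of A^R and the corresponding orthonormal eigenvectors, respectively. The perturbed matrix is obtained as

$$ \tilde{A} = \sum_{k=1}^{|V|} (\lambda_k + \Delta\lambda_k) v_k v_k^T, \qquad \Delta\lambda_k \approx \frac{v_k^T \Delta A \, v_k}{v_k^T v_k} \qquad (6) $$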

which can be interpreted as an approximation of A in a linear expansion based on A^R. In practice, Ã is obtained as the average of many instances of Eq (6), each one computed for a different random removal ΔA. Finally, the predicted links (x, y) are those with the largest Ã_xy among the node pairs not connected in the original network, i.e., those with A_xy = 0.
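The procedure can be sketched with NumPy as follows. This is an illustrative simplification, not the reference SPM implementation: it uses the first-order eigenvalue shift v_k^T ΔA v_k for orthonormal eigenvectors and ignores the special handling of degenerate eigenvalues described in [25]. The names `spm_scores`, `top_predicted`, `removal_fraction`, and the toy matrix are hypothetical.

```python
import numpy as np


def spm_scores(adj, removal_fraction=0.1, repetitions=20, seed=0):
    """Average perturbed matrix over several random link removals."""
    rng = np.random.default_rng(seed)
    adj = np.asarray(adj, dtype=float)
    rows, cols = np.triu_indices_from(adj, k=1)
    is_link = adj[rows, cols] > 0
    link_rows, link_cols = rows[is_link], cols[is_link]
    n_remove = max(1, int(round(removal_fraction * link_rows.size)))

    total = np.zeros_like(adj)
    for _ in range(repetitions):
        chosen = rng.choice(link_rows.size, size=n_remove, replace=False)
        delta = np.zeros_like(adj)  # symmetric matrix of removed links
        delta[link_rows[chosen], link_cols[chosen]] = 1.0
        delta[link_cols[chosen], link_rows[chosen]] = 1.0
        reduced = adj - delta       # network after the random removal
        eigvals, eigvecs = np.linalg.eigh(reduced)  # columns are eigenvectors
        # First-order shift of each eigenvalue: v_k^T (delta) v_k.
        shifts = np.einsum("ik,ij,jk->k", eigvecs, delta, eigvecs)
        # Rebuild the matrix with shifted eigenvalues and unchanged eigenvectors.
        total += (eigvecs * (eigvals + shifts)) @ eigvecs.T
    return total / repetitions


def top_predicted(adj, scores, count):
    """Return the `count` unconnected pairs with the largest score."""
    adj = np.asarray(adj)
    rows, cols = np.triu_indices_from(adj, k=1)
    candidates = [(float(scores[i, j]), (int(i), int(j)))
                  for i, j in zip(rows, cols) if adj[i, j] == 0]
    candidates.sort(reverse=True)
    return [pair for _, pair in candidates[:count]]


# Hypothetical toy network with 7 links and 3 unconnected pairs.
toy_adj = np.array([[0, 1, 1, 1, 0],
                    [1, 0, 1, 1, 0],
                    [1, 1, 0, 0, 1],
                    [1, 1, 0, 0, 1],
                    [0, 0, 1, 1, 0]])
spm = spm_scores(toy_adj, repetitions=50)
spm_predicted = top_predicted(toy_adj, spm, count=2)
```

The number of random removals and the fraction of removed links are the two parameters of the method; both are set to arbitrary values in this sketch.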

If we apply the SPM to the adjacency matrix A of the network N_WR, we find a set of top-17 predicted links which overlaps with that of the CN method by 11 to 14 links, depending on the parametrization (number of random removals and fraction of removed links). The links in common with RA, instead, are never more than 4. In all instances, the new links predicted by SPM turn out to be, in the CN ranking, in the set immediately below the top-17. As for the Katz index described above, the results of the SPM prediction are largely consistent with those of the CN approach and, on the contrary, depart significantly from those obtained by the RA method.

To summarize the results of the link prediction analysis, we have found three different methods (one local, CN, and two global, Katz index and SPM) whose results largely overlap. Most notably, these results find significant validation in the judicial documentation, since they correspond to social ties not included in wiretap records but nonetheless very likely to exist. On the other hand, the fourth method, RA, does not seem an appropriate tool for link prediction in this specific case: its results diverge from those of the other methods and, moreover, its predicted links cannot be validated through the available documents. Of course, the more general question of which other methods, among the many available [22–26, 44], are appropriate in this specific context remains open. However, our analysis indicates that a few methods able to provide reliable predictions do exist. Among them, CN should certainly be appreciated for its conceptual simplicity and ease of computation.

The rejection of the first hypothesis, according to which marginal (i.e. discarded) links are those with low betweenness, has some interesting implications. From a network analysis standpoint, it is a fact that the criminal justice system discarded as marginal a number of links with high betweenness. This may appear surprising, as these links connected not only peripheral nodes but also nodes with medium-high coreness. Thus, they may appear to bridge the “structural holes” within the criminal group [ 38 ]. In fact, a careful analysis reveals that links with high betweenness include a few occasional contacts or communications unrelated to the illicit activities. Despite their apparent bridging function, from a criminal intelligence standpoint these links are marginal. Overall, we must conclude that link betweenness proved to be unable to discriminate between marginal and important links in the criminal network.

The second hypothesis, based on node similarity, performed markedly better in the identification of marginal links. The link removal process independently conducted by the criminal justice system focused on links with low similarity, whereas in all instances it considered as relevant those links with high similarity. This demonstrates that node similarity matters beyond purely topological analysis, as we have evidence that it is also naturally embedded in the activities of the law enforcement agencies.

The specific nature of the criminal case and the design of the study prevent an exhaustive and conclusive verification of the predicted links. In this study, instead, it is possible to verify the prediction through independent analysis of the judicial sources. The information of the case shows that social ties corresponding to the predicted links are, in almost all instances, very likely to have existed in the criminal organization, although undetected by investigators and thus not annotated in the Oversize networks. Reasons for overlooking predicted links include suspects’ use of communication and protection methods, investigators’ limited time and resources, and reliance on imperfect data-gathering methods (e.g. covert observations, informants, witnesses)[ 5 , 7 – 9 ]. It is also worth noting that strong empirical evidence from wiretaps or other investigative sources must be available to include a link between any two suspects in the judicial documents. Investigators may have suspected some of the predicted links without being able to demonstrate their actual existence. At the same time, since criminals face a trade-off between efficiency and security, they may have deployed several security strategies against law enforcement surveillance, thus impeding the detection of their interactions [ 13 , 15 ].

Conclusions

Previous studies suggest that various fields of law enforcement may benefit from SNA: identification of suitable targets for network destabilization and prediction of the impact of their removal; detection of aliases through the analysis of actors with similar patterns of connections; and identification of potential defectors according to their position in the network [ 2 , 45 ]. In this paper, we show how SNA may support criminal intelligence analysis and ongoing investigations by identifying missing links among suspects.

This study demonstrated that node similarity, already applied in different fields for link prediction, can also identify possible missing links in criminal networks, where information is noisy or incomplete almost by definition. The criminal justice system deploys a number of guarantees against false positives such as incorrect accusations and interactions unrelated to criminal conduct. Conversely, effective strategies to prevent false negatives, such as missing information, are scarce. Due to constrained data collection resources, law enforcement agencies may indeed miss some actors and links, with negative consequences for intelligence and investigation activities. This applies to drug trafficking networks, such as the Oversize network, as well as to other types of covert networks including street gangs and terrorist groups. These criminal organizations can all be conceived as networks of relations among co-offenders based on kinship or criminal collaboration. Since the social network approach to crime focuses on the relationships among co-offenders rather than, e.g., their illicit activities [46], SNA can be used to analyze any type of criminal network, from small and flexible groups of collaborating criminals to more structured organizations.

Node similarity measures helped identify the characteristics of the links independently removed throughout the criminal proceedings: the removal process was strongly biased towards the links with the lowest node similarity scores. This provided support for the hypothesis that links discarded by the investigators throughout the criminal proceedings connect individuals that are structurally dissimilar, i.e. they link individuals who occasionally collaborate but are dissimilar in terms of tasks and involvement in criminal activities. Therefore, the removed links are not crucial for the criminal conduct. Consequently, node similarity enabled prediction of links that are likely to exist, but that were undetected by the police. Missing links were inferred a contrario from the characteristics of removed links, on the assumption that pairs of unconnected actors with large node similarity scores were likely connected by missing links that, for several reasons, went unnoticed by law enforcement agencies. Content analysis of the judicial sources independently corroborates the likelihood of predicted links. Moreover, the comparative analysis of different similarity scores reveals that not all of them have the same predictive capability: we argue that the reason lies in the different topological properties they highlight.

In conclusion, the results show that node similarity measures can inform ongoing criminal investigations. On one hand, the independent link reduction conducted by the law enforcement agencies confirms node similarity as an important property of relevant links. On the other hand, link prediction may point out where to direct the scarce investigative resources for more effective investigations or even uncover relevant patterns overlooked by law enforcement authorities, especially in the case of investigations targeting large networks or criminal organizations with sophisticated communication and protection methods. Besides their practical implications, the results extend the prediction of missing links to a field largely neglected so far.

Funding Statement

The Polisocial Award program is the only funding source and it supports the authors NP and MV. No other fund was available. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Data Availability

Using Link Analysis To Define Data Relationships In Investigations

Clarifying The Who Behind The What And When

Today’s digital investigations are being powered by link analysis. Link analysis is an analytical process whereby data points, often referred to as “nodes”, are used to identify relationships and connections between disparate data sources. The power behind link analysis, and the reason for its rapid adoption in today’s era of big data, is that it enables data visualization, data clustering, charting, timelining and more through data aggregation. Given the copious amounts of data that can be acquired throughout the course of a legal investigation, the process is invaluable for identifying patterns and context across a vast number of seemingly disconnected data sources.

Subruta Paul, in his 2013 article entitled “On Some Aspects of Link Analysis and Informal Network in Social Network Platform”, explains different linking types, which include explicit links and aggregate links. Explicit links are those that are created between nodes which correspond to a specific defined entity. One example, as provided by Paul, is a phone call. When a phone call is placed, there is a defined link between the originating phone number and the destination phone number. When all of the phone calls between two specific phone numbers are combined, the result is an aggregate link, representing all of the placed calls.

Leveraging explicit and aggregate link analysis is invaluable to digital investigators seeking to establish contextual relationships and behavior patterns. Explicit linking, beyond that of just the phone numbers themselves, can be taken a step further helping to define the behavior of the individual being investigated, providing further context. Let’s explore this concept.

If we gather data from someone’s smartphone in the course of an investigation, we can examine the call log and extract all calls placed and received by that specific device. The phone number of the device can then be explicitly linked to an individual via the IMSI (International Mobile Subscriber Identity), defining the relationship between the user and the phone. We can then aggregate the data along with the underlying metadata, such as the duration and date of each call. Given this additional explicit link, we can now identify the phone numbers that this individual contacted most often or engaged with for the longest time.
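The aggregation step described above might look something like the following sketch. The call-log fields and values are hypothetical; a real extraction would carry whatever metadata the forensic tool exports.

```python
from collections import defaultdict

# Hypothetical call-log rows: (other party, direction, duration in seconds, ISO date).
call_log = [
    ("555-0199", "outgoing", 620, "2021-03-01"),
    ("555-0142", "incoming", 45, "2021-03-01"),
    ("555-0199", "incoming", 1310, "2021-03-02"),
    ("555-0142", "outgoing", 30, "2021-03-03"),
    ("555-0142", "outgoing", 60, "2021-03-04"),
]


def summarize_contacts(rows):
    """Return {number: {"calls": n, "seconds": total}} for one device's call log."""
    summary = defaultdict(lambda: {"calls": 0, "seconds": 0})
    for number, _direction, seconds, _date in rows:
        summary[number]["calls"] += 1
        summary[number]["seconds"] += seconds
    return dict(summary)


summary = summarize_contacts(call_log)
# The number contacted most often and the number with the longest total talk time
# are not necessarily the same contact.
most_contacted = max(summary, key=lambda n: summary[n]["calls"])
longest_engaged = max(summary, key=lambda n: summary[n]["seconds"])
```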

This example extends to a plethora of other potential explicit links. The individual’s phone may have geolocation artifacts, text messages, app data, transactions and even connected device data (often referred to as the Internet of Things or “IoT”) recorded in its device history. Performing link analysis on these additional nodes by linking the phone number, the device’s IMEI (International Mobile Equipment Identity), or the user’s IMSI can yield a cornucopia of links to examine.

Our example barely scratches the surface of how identifying explicit links and aggregating data for analysis can aid an investigation. This type of analysis can establish context, and possibly even intent. This is where tools like ESI Analyst can help refine an investigation by presenting a series of events in a timeline and showing their relationships to a given individual or set of actors. Link analysis is a proven and effective technique that enables robust data visualization and, most importantly, a clear and comprehensive understanding of the data being analyzed. If you would like to learn more, please reach out and arrange a demonstration today.

© 2018 - 2021 ~ TIDAL CHANGE TECHNOLOGIES, INC. ~ ALL RIGHTS RESERVED

DataWalk

Link Analysis For Intelligence Investigations: A Step-By-Step Example

Link analysis software such as DataWalk enables you to identify and analyze the relationships ("connections") between data objects. In this article we'll walk through a detailed example in the DataWalk application.

Figure 1. The Universe Viewer visualizes all of an organization’s data and the interconnections.

In this hypothetical scenario, a data set was manually created (shown with green-arrow) using open-source content describing the Jalisco New Generation Cartel (Cártel de Jalisco Nueva Generación, CJNG). CJNG is a criminal group based in Jalisco, Mexico and headed by Nemesio Oseguera Cervantes ("El Mencho"), one of Mexico's most-wanted drug lords. The primary content of this set was assembled from DOJ poster boards of the key members, leadership, and familial relationships.

Entity Search

In our example, an analyst wants to search for a specific person related to the CJNG investigation. Using the basic “SEARCH” option from the primary DataWalk menus, a list of available fields is presented, which can be adjusted and configured to suit the content being searched. The user is looking for information on ABIGAEL GONZALEZ VALENCIA. Not knowing the exact spelling of his name, the analyst applies a Soundex transformation to a common spelling of Abigail by simply typing in SX(Abigail) to identify variations such as Abigael, Abigail, Abigale, Abigayle, Abegail, etc. More advanced matching for nicknames such as Gabby, Gail, Abbie, and Gayle could be applied with advanced features [shown later].
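To see why a Soundex search catches these spellings, here is a minimal sketch of the classic American Soundex algorithm. It is not DataWalk's SX() implementation, whose internals are not described in this article; the function name and the candidate list are assumptions for illustration.

```python
def soundex(name: str) -> str:
    """Return the four-character American Soundex code for a name."""
    codes = {}
    for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                           ("L", "4"), ("MN", "5"), ("R", "6")):
        for letter in letters:
            codes[letter] = digit

    letters_only = [ch for ch in name.upper() if ch.isalpha()]
    if not letters_only:
        return ""

    result = letters_only[0]
    previous = codes.get(letters_only[0], "")
    for ch in letters_only[1:]:
        digit = codes.get(ch, "")
        if digit and digit != previous:
            result += digit
        # H and W do not separate letters with the same code; vowels do.
        if ch not in "HW":
            previous = digit
    return (result + "000")[:4]


candidates = ["Abigael", "Abigail", "Abigale", "Abigayle", "Abegail", "Nemesio"]
matches = [n for n in candidates if soundex(n) == soundex("Abigail")]
```

Because vowels are dropped and similar consonants share a digit, every listed variant of Abigail reduces to the same code, while an unrelated name does not.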

Figure 2. Starting the investigation with Soundex search on name of interest.

DataWalk supports a range of different search-options to help identify variations in the data. The (i) icon at the top of the menu (upper-left) provides a balloon-help cheat-sheet for different functions. These functions include wildcards (*), SX (Soundex), TYPO (letter errors), STEM, AND/OR, and REGEXP (to handle special configurations). Additional functions can be added per user-needs (e.g., Metaphone).

The search runs across fields from the different loaded sets to identify potential matches. DataWalk returns results from four (4) sets: CJNG-People, Arrests, FBARs, and MSBs. Each result is categorized by source and shown with a sample set of fields to help the user identify which records/entities best match their selection. In this case, the user selects the name in the CJNG set and brings the results into a Link Chart for link analysis.

Link Analysis

Using visually appealing icons, glyphs, colors, and related components, DataWalk link charts offer a viewpoint into the data, emphasizing key content, important information, and critical connections. In this example, the blue star (image below, upper-left) indicates that the entity holds a leadership position within the organization; it is set according to a value defined in the underlying data. Additionally, the red circle (upper-right) signifies the status of the entity. In this case the letter “I” indicates that he is currently “incarcerated/imprisoned”; other values are (A) arrested, (F) fugitive, and (D) deceased. These markers are easy to customize to meet the needs of any investigation.

Figure 3. Sample visualization of an object in a DataWalk link analysis.

The next step is to see how this entity relates to other members of the CJNG organization. The analyst continues the link analysis by selecting the entity and invoking the “Add Linked Objects” menu on the right-side of the link chart interface. From here there are several options available including 1st and 2nd degree connections and a list of any connected sets. For this example, the CJNG-People set is selected and "Add Objects" is initiated to show the next-level of connections.
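Conceptually, adding 1st and 2nd degree connections is a bounded breadth-first traversal from the selected entity. The sketch below shows that idea on a toy graph; the adjacency structure, entity names, and function name are placeholders, not DataWalk's API or data from this investigation.

```python
from collections import deque

# Toy undirected graph; entity names are placeholders, not data from the article.
links = {
    "A": {"B", "C"},
    "B": {"A", "D"},
    "C": {"A"},
    "D": {"B", "E"},
    "E": {"D"},
}


def linked_objects(graph, start, max_degree):
    """Return {entity: degree} for everything within max_degree hops of start."""
    degree = {start: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if degree[current] == max_degree:
            continue  # do not expand beyond the requested degree
        for neighbor in graph.get(current, ()):
            if neighbor not in degree:
                degree[neighbor] = degree[current] + 1
                queue.append(neighbor)
    del degree[start]
    return degree


first_degree = linked_objects(links, "A", 1)
second_degree = linked_objects(links, "A", 2)
```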

Figure 4. Adding objects to a DataWalk link chart.

Based on the set connections defined in the Universe Viewer (UV), the system follows all selected connections and brings back any new entities. The results shown below quickly depict the relationships between Abigael and other cartel members and family. Being able to add, visualize, and analyze this data reflects a fundamental value of link analysis. Every connection also displays the type/name of the relationship for these ten (10) new entities. The analyst quickly sees that NEMESIO OSEGUERA CERVANTES is another leader in the CJNG cartel and is currently a wanted (F) fugitive. Additionally, another entity (JENNIFER BEANEY CAMACHO CÁZARES), with an image, shows as his wife. Any desired level of detail can be shown in the labels or comments, and the diagram can be arranged using different placement techniques.

Figure 5. DataWalk link analysis showing connections of an entity.

All the entities are selected and Add Linked Objects is reapplied to show the next level of connections. The results are easily arranged using different placement methods to minimize link cross-over.

Figure 6. Extending the link analysis

At this point the analyst realizes that some information is missing from the diagram, specifically the link between the wife/mother and her children. The analyst decides to create a new link between JENNIFER and NOEMI using the “add connection” feature available in the DataWalk top-level menu. Being able to manually create such links is a basic feature of a link analysis system.

Figure 7. Initiating the creation of a manual link in DataWalk.

Add Entity/Connection

Once this mode is activated, the analyst clicks on one of the entities, holds down the mouse button (the link follows the cursor), moves to the second entity, and releases the button to establish the link. In this case, the “direction” of the link is not important, but in other cases the order of connection defines the “flow” of the relationship.

Figure 8. Creating a manual connection in a link analysis.

Once the mouse is released, a pop-up menu requires the analyst to select what type of link to create. In our CJNG model there is only a single type of connection called “related” used to define the role and connect all cartel members. In other models, there can be multiple types of connections based on different needs and requirements, and it is a simple process to add additional link-types to the model.

Figure 9. Selecting connection type for a manual connection in DataWalk

In this specific model, the “related” connection allows the analyst to enter additional information and details regarding the linkage. As entities can have different types, roles, and relationships over a period of time, it is important to capture all of the details to ensure the proper fidelity is maintained for the analytics. In Figure 10, the set of attributes is fairly basic, and it can easily be extended or changed to meet evolving needs.

Figure 10. Specifying a relationship with a manual connection in DataWalk.

The analyst enters the type of relationship (mother/child). Often these values are defined as an enumerated type (i.e., a predefined list) and chosen from a pick-list. Different types of components (e.g., date selectors, selection-boxes, spinners, etc.) are used to simplify data input. Once completed, the analyst “saves” the results and Figure 11 shows the new connection between the selected entities.
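One way to picture a connection whose relationship type comes from a pick-list is an enumerated type attached to a small record. The class names, fields, and pick-list values below are hypothetical and only sketch the idea; they do not reflect DataWalk's data model.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import Optional


class RelationType(Enum):
    # Hypothetical pick-list values; a real model would define its own.
    MOTHER_CHILD = "mother/child"
    SPOUSE = "spouse"
    ASSOCIATE = "associate"


@dataclass(frozen=True)
class Connection:
    source: str
    target: str
    relation: RelationType
    start_date: Optional[date] = None
    comment: str = ""


def make_connection(source, target, relation_label, **details):
    """Build a link, rejecting any relation label that is not in the pick-list."""
    relation = RelationType(relation_label)  # raises ValueError if unknown
    return Connection(source, target, relation, **details)


link = make_connection("JENNIFER", "NOEMI", "mother/child", comment="added manually")
```

Restricting the relation to an enumeration keeps free-text variants such as "mom" or "parent" from fragmenting later queries.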

Figure 11. The new manual connection visualized in DataWalk.

In this environment, a special configuration provides a “supervisor” with a notification that new information is added to the system. The system automatically detects this change and then signals an alert to the designated personnel regarding this situation. In the upper-right part of the display, an icon (red bell) visually displays an active alert with a count of the total number of outstanding (unread) alerts. Optionally, the supervisor can receive an email notification (or other notification via an external ticketing system) of this alert. 

Figure 12. Example alert notification in DataWalk.

The supervisor can review the alert by logging into the system and opening the “Workspace” dashboard to see which alert was triggered. The alert is identified by the same ringing (animated) red bell, and the supervisor clicks on the “New Objects” tab, which shows the one (1) new entry available for review. In this case, the system requires the supervisor to “Approve” the data change before any other users can see this information. Note: the analyst who originally creates the data can always see it in their own sandbox, but others are excluded until it is approved. In this example, the supervisor has three options: approve, deny, or request more information. The selectable values are configurable to meet various agency or investigative needs.

Figure 13. Supervisor approval of a new object in DataWalk.

Note: this same process is applied to the creation of new entities (e.g., cartel members). Once the supervisor approves the new data, all other authorized users will see the data next time they query the system.
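The visibility rule described here (the author always sees their own change, everyone else only after approval) can be sketched as a small state model. The status names, class, and functions are assumptions made for illustration; how DataWalk handles denied changes is not stated in the article.

```python
from dataclasses import dataclass

PENDING, APPROVED, DENIED, MORE_INFO = "pending", "approved", "denied", "more-info"


@dataclass
class Change:
    author: str
    description: str
    status: str = PENDING


def visible_changes(changes, user):
    """A user sees approved changes plus any change they authored themselves."""
    return [c for c in changes if c.status == APPROVED or c.author == user]


def review(change, decision):
    """Record a supervisor decision; only the three options from the text are allowed."""
    if decision not in (APPROVED, DENIED, MORE_INFO):
        raise ValueError(f"unknown decision: {decision}")
    change.status = decision


changes = [Change("analyst1", "link JENNIFER -> NOEMI (mother/child)")]
before = visible_changes(changes, "analyst2")
own_view = visible_changes(changes, "analyst1")
review(changes[0], APPROVED)
after = visible_changes(changes, "analyst2")
```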

Expand The Network (Walk Data)

At this point, the analyst continues the link analysis by expanding out the cartel network showing additional levels and relationships among its membership. The highlighted entities in the following diagram show those added entities.

Figure 14. New entries being added to a DataWalk link analysis

Using the various placement techniques available within DataWalk, the analyst can define the best format to meet their analytical needs. The screenshot below shows the “hierarchy-top” layout, which positions each of the three (3) leaders at the top of the diagram and lets their connections flow downward. This helps the analyst understand the different roles and significance of members in the cartel. Although this example is limited in size, there can be many levels represented.

Figure 15. Link analysis with hierarchical layout.

At this point, the CJNG set is exhausted, as there is no additional data available to expand the network. However, in the Universe Viewer (UV), the CJNG set is connected to the “People” set, which is composed of the names of people derived from many different sets (investigations, corrections, financials, watchlists, arrests, registrations, etc.). The analyst selects the People set to see any new connections and uncovers two (2) matches, Zachary Manning (cousin) and Denise Cook (friend), both stemming from LILIANA ROSA CAMBA, located in the lower-right of the cartel network diagram.

Figure 16. Connections to data from the “People” data set

The People set has connections to a wide range of other sets and contains much more robust content. The analyst does a drill-down on both Zachary and Denise to see more specific details about their backgrounds. Then using the “Add Linked Objects” panel, the analyst chooses all of the available sets to see any additional connections.  Note: the values for names, social security numbers, addresses, phones, and other personal details used in these examples are "synthetic" and are not intended to reflect any real-world person.

Figure 17. Sample data available for an object in a link analysis.

The system accesses each selected set to pull out any connections for either Zachary or Denise, as shown in Figure 18 below:

Figure 18. Further extending the link analysis.

At this point there are two viable options to pursue to determine additional connections, behaviors, or related activities.

  • Zachary was previously arrested for bank fraud, has a social media profile and a valid social security number, is involved in various BSA (Bank Secrecy Act) activity, owns a BMW 325i, and lives in Compton, CA.
  • Denise has records in the Relativity set (an authoritative document management platform), BSA records, a registered phone, a valid SSN, and an address located in La Mesa, CA.

In the expanded view of Zachary, the BSA set shows both SAR (Suspicious Activity Report) and CTR (Currency Transaction Report) filings and presents them geographically on a map, showing his activity relative to his home address. We see a heavy concentration of SARs at two specific banks and wider usage of banks for CTR deposits.

Figure 19. Connecting financial transactions with people in a link analysis.

Figure 20. Identifying transaction locations on a map.

The number of people related to this address in the network diagram suggests some type of “safehouse” usage. Right-clicking on the home address and invoking the street-view option provides an automatic link to Google Street View to validate the address. As seen in the screen capture, this property also has a large number of vehicles present. Drilling down further (expanding the network) on the other people shows they all have additional BSA transactions. The analyst classifies this group as “money mules” and will investigate further.

Figure 21. Google Street View generated via a link analysis

Switching back to Denise and expanding her BSA records shows a similar number of SAR and CTR transactions. All the SARs are under $10,000, indicating some type of structuring behavior.

Figure 22. Link analysis connecting Denise Cook with reported transactions.

Showing the timelines for both SAR (green sphere) and CTR (orange sphere) transactions indicates there was a mix of both types of filings in the March-August timeframe. The analyst knows that people change their behaviors when their actions are being recorded. Beginning in July, Denise started to structure her cash deposits under the $10k limit (around $8k) to avoid the CTR filing forms. From that point on, the bank filed only SARs to document this suspicious behavior.
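The shift the analyst spots on the timeline can be approximated with a simple screen for deposits that sit just under the reporting limit. The sketch below is illustrative only: the deposit records are invented, and the 75% cut-off for “just under” is an assumption, not a regulatory or DataWalk rule.

```python
from collections import Counter
from datetime import date

CTR_THRESHOLD = 10_000   # reporting limit mentioned in the text
NEAR_FRACTION = 0.75     # assumption: "just under" means at least 75% of the limit

# Invented deposits: (date, cash amount in dollars).
deposits = [
    (date(2021, 3, 5), 14_500),
    (date(2021, 5, 19), 12_000),
    (date(2021, 7, 2), 8_000),
    (date(2021, 7, 16), 8_200),
    (date(2021, 8, 3), 7_900),
]


def near_threshold(amount, limit=CTR_THRESHOLD, fraction=NEAR_FRACTION):
    """True when a deposit sits just under the reporting limit."""
    return fraction * limit <= amount < limit


def monthly_near_threshold_counts(rows):
    """Count near-threshold deposits per (year, month)."""
    counts = Counter()
    for day, amount in rows:
        if near_threshold(amount):
            counts[(day.year, day.month)] += 1
    return counts


flags = monthly_near_threshold_counts(deposits)
```

A run of months with near-threshold deposits following months of larger ones mirrors the behavior change described above and is a prompt for review, not proof of structuring.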

Figure 23. Time series analysis visualizes frequency of transactions.

When her activity is presented on a geospatial map, we can see that her transactions are clearly conducted at locations along the US/Mexico border (US side) at a variety of banks and institutions. Most SARs are reported from one specific location, while the CTRs are reported by a number of different banks. Clearly there is some type of explicit intent for Denise to travel almost 30 miles to make her cash deposits. The analyst will further review this information.
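Distances such as the roughly 30 miles noted above can be estimated from coordinates with the haversine formula. The sketch below uses placeholder coordinates rather than the locations from this investigation, and the helper names are assumptions.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two latitude/longitude points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = radians(lon2 - lon1)
    a = sin(d_phi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(d_lambda / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(a))


def trips_beyond(home, branches, min_miles):
    """Return (branch name, distance) for branches farther than min_miles from home."""
    far = []
    for name, lat, lon in branches:
        distance = haversine_miles(home[0], home[1], lat, lon)
        if distance > min_miles:
            far.append((name, round(distance, 1)))
    return far


# Placeholder coordinates, not the locations from the investigation.
home = (32.0, -117.0)
branches = [("Branch A", 32.05, -117.0), ("Branch B", 32.5, -117.0)]
far_branches = trips_beyond(home, branches, 25)
```

Great-circle distance understates driving distance, so it works best as a first filter before checking actual routes.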

Figure 24. Visualizing locations of financial transactions.

Figure 25. Visualizing locations of financial transactions.

When performing a Google Street View of her address, the results show a high-end estate nearing the end of its construction. The value of this property is $1.1M.

Figure 26. Extending link analysis using Google Street View

Figure 27. Image from Zillow.com showing value of home

The analyst returns to the link chart to determine whether the other connections will return any additional entities. When choosing Add Linked Objects, there is an option to Show Object Counts that provides the total number of entities that will be returned if the query is run. In this case the Intercepts set shows there are 707 records available for the phone. To avoid cluttering the display with this new data, the analyst right-clicks on the phone number entity and copies it into a new Link Chart display.

Figure 28. Identifying available phone intercepts data in DataWalk

In this new display, the analyst expands the network using the Intercepts set resulting in a large concentration of connections. This type of data is not often used for “network” analysis but is much better suited for geographical (lat/long) and temporal (date/time) analyses.

Figure 29. Example of a link analysis with a high density of objects.

These intercepts are the location records tied to Denise’s mobile phone. When displayed using the heatmap option, they show a heavy concentration of activity in Brooklyn, NY. Each sphere represents a location reference. The darker colored spheres indicate higher concentration of activity (e.g., stay over, lingering, stopped).
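A heatmap of this kind boils down to counting location fixes per map cell. The following sketch bins invented coordinates into a fixed grid; the 0.01-degree cell size and the function names are arbitrary choices for illustration, not how DataWalk renders its heatmap.

```python
from collections import Counter
from math import floor


def grid_cell(lat, lon, cell_size=0.01):
    """Map a coordinate to integer grid indices; 0.01 degrees is an arbitrary cell size."""
    return (floor(lat / cell_size), floor(lon / cell_size))


def heatmap_counts(points, cell_size=0.01):
    """Count how many location fixes fall into each grid cell."""
    counts = Counter()
    for lat, lon in points:
        counts[grid_cell(lat, lon, cell_size)] += 1
    return counts


# Invented location fixes.
points = [
    (40.6782, -73.9442),
    (40.6785, -73.9448),
    (40.6789, -73.9441),
    (40.7128, -74.0060),
]
heat = heatmap_counts(points)
hottest_cell, hits = heat.most_common(1)[0]
```

Cells with high counts correspond to the darker spheres: places where the device lingered or returned repeatedly.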

Figure 30. Heatmap showing geographical concentration of objects.

When placed on a time chart, the analyst sees the activity occurred over a 5-day period: March 13-18. Each spike in the timeline shows the relative activity for that period and zooming-in provides better resolution (hours/mins).
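The spikes on such a time chart are counts of records per time bucket, and zooming in amounts to choosing a finer bucket. A minimal sketch, with invented timestamps and an assumed helper name:

```python
from collections import Counter
from datetime import datetime


def activity_by_bucket(timestamps, fmt="%Y-%m-%d"):
    """Count events per time bucket; fmt controls the resolution (day, hour, ...)."""
    return Counter(ts.strftime(fmt) for ts in timestamps)


# Invented timestamps.
events = [
    datetime(2021, 3, 13, 16, 35),
    datetime(2021, 3, 13, 17, 40),
    datetime(2021, 3, 13, 17, 55),
    datetime(2021, 3, 14, 9, 10),
]
per_day = activity_by_bucket(events)
per_hour = activity_by_bucket(events, "%Y-%m-%d %H:00")
```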

Figure 31. Time series analysis showing activity for a select 5-day period.

As the analyst manipulates the timeline and focuses on the first spike, it becomes clear that Denise flew to New York and landed at John F. Kennedy (JFK) airport around 4:30, taking about an hour to get her bags and hail a taxi (or rideshare). It appears she went directly from the airport to the Upper West Side in Manhattan by way of the Bronx (on the Major Deegan Expressway), where she stayed for approximately 1 hour.

Figure 32. Extending the link analysis with further geospatial analysis

Over the next several days, she concentrated her movements around Brooklyn, spent time in Staten Island, traveled out to Islip and Medford on Long Island, and also Queens. The analyst can infer the specific locations visited by Denise to cross reference them with other data sets to determine if they are significant or have any additional intelligence value.

Returning to the original link chart, the analyst further expands the phone number and finds a match against the Federal Firearms License (FFL) set. It connects with Fine Jewelry Inc. with locations in Southern California including San Diego, Solana Beach, Oceanside, Rancho Bernardo, and La Mesa (the location where Denise lives). Based on the connection between the Jeweler and the Cartel, there may be some type of high-end luxury buying, money laundering, or a front business, or some combination thereof. The analyst will continue to do research and find additional content to determine the nature of these relationships.

Figure 33. Identifying link analysis connections indicating potential money laundering

Fuzzy Matching (Aliases)

As a last step, the analyst checks to see if there are any potential “alias” entities matching Denise. For this configuration, the system is set up to identify entities that match on several conditions including: same gender, same race, same ethnicity, same year-of-birth, similar last names (Soundex), and live within 25 miles of each other (via zip code). In this set, there are three (3) matches generated: Didi Cooke, Densie Cooks, and Deniece Cooks. Any type of condition can be defined for “fuzzy” matching and can vary from set-to-set.
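Alias matching of this sort can be modeled as a list of rules that a candidate must satisfy in full. The sketch below uses a subset of the conditions named above; the field names, attribute values, zip codes, and distance lookup are all invented, and each record is assumed to already carry a precomputed Soundex code for the last name.

```python
# Each rule compares two records and returns True when they agree on that condition.
# Field names are hypothetical; "last_name_sx" is assumed to hold a precomputed Soundex code.
def same(field):
    return lambda a, b: a[field] == b[field]


def within_miles(limit, zip_distance):
    # zip_distance is an assumed lookup of distances between zip codes, in miles.
    def rule(a, b):
        key = frozenset((a["zip"], b["zip"]))
        return a["zip"] == b["zip"] or zip_distance.get(key, float("inf")) <= limit
    return rule


def find_aliases(target, records, rules):
    """Return the names of records that satisfy every rule against the target."""
    return [r["name"] for r in records
            if r is not target and all(rule(target, r) for rule in rules)]


# Placeholder zip codes and distances.
zip_distance = {
    frozenset(("00001", "00002")): 8.0,
    frozenset(("00001", "00009")): 140.0,
}
target = {"name": "Denise Cook", "gender": "F", "birth_year": 1985,
          "last_name_sx": "C200", "zip": "00001"}
records = [
    target,
    {"name": "Didi Cooke", "gender": "F", "birth_year": 1985,
     "last_name_sx": "C200", "zip": "00002"},
    {"name": "Deniece Cooks", "gender": "F", "birth_year": 1985,
     "last_name_sx": "C200", "zip": "00001"},
    {"name": "Dana Cook", "gender": "F", "birth_year": 1985,
     "last_name_sx": "C200", "zip": "00009"},
]
rules = [same("gender"), same("birth_year"), same("last_name_sx"),
         within_miles(25, zip_distance)]
aliases = find_aliases(target, records, rules)
```

Keeping the rules in a list makes it straightforward to vary the conditions from set to set, as the text describes.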

Figure 34. Checking for aliases via link analysis.

The final diagram shows all the entities and their connections. At the bottom of the display, a series of thumbnails shows the 19 steps used to go from the original entity to the final results. When the chart is saved (or restored), all of these steps are available for review. Thus, if the analyst is asked about the process used to find these results, it can easily be played back using the history. Other analysts who access this chart will also see this history.

Final Report

Finally, if the analyst needs to send a report (aka targeting package or dossier) to another person without access to the system, they simply create a PDF report with all of the relevant details. These reports are configured to client specifications to include headers/footers, watermarks, disclosure statements, and even agency logos. The analyst generates a report from the Folder associated with Denise.

Figure 35. Saved steps in a link analysis enable storytelling of the investigation.

Figure 36. Sample investigation report.

DataWalk is a trademark of DataWalk S.A. Microsoft and Excel are trademarks of Microsoft Corporation. Hadoop is a registered trademark of the Apache Software Foundation.

This article is derived from an article previously published by the author on LinkedIn.

An official website of the United States government, Department of Justice.

Cross-Site Analysis and Case Study of STOP Program Grantee Perspectives on Violence Prevention and Mental Health Training Program Implementation

This paper explores STOP Program grantee perspectives on violence prevention and mental health training program implementation.

In this study, researchers examined factors that influence the implementation of violence prevention and mental health training programs in schools, with a particular focus on implementation readiness and school mental health capacity. Several promising results were found that may contribute to ongoing efforts to improve school safety. In response to the Students, Teachers, and Officers Preventing (STOP) School Violence Act of 2018 (H.R. 4909), 128 grantees across the U.S. were awarded funding through the Bureau of Justice Assistance (BJA) in 2018 and 2019 to improve school safety by implementing programs in the Violence Prevention and Mental Health Training category. The major goals of this study were to 1) understand the challenges and facilitators of implementing violence prevention and mental health training programs through a broad cross-site analysis, 2) assess contextual factors influencing implementation, as well as regional and population variances through targeted, comprehensive case studies, and 3) provide evidence to inform program implementation in violence prevention and mental health programs in schools to improve program outcomes and sustainability. Understanding the environment of implementation, grantees’ capacity to carry out planned activities, and the perspectives of implementation team members are critical components to learning what factors support and inhibit implementation and ultimately, the extent to which programming will be replicable and scalable as federal funding continues to support mental health and violence prevention initiatives. The study was conducted at two levels: 1) a cross-site analysis of grantees who have been awarded funding in the Violence Prevention and Mental Health Training category over the two award years (2018 and 2019), and 2) a case study analysis of six grantee sites.

Data visualization for fraud detection tools

The fraud detection visualization challenge.

Fraud management has changed massively in recent years, with the advance of digital technologies and AI creating new opportunities and techniques for fraudsters to commit crime faster, and with more agility.

To detect and investigate it effectively, you need to see connections – between people, accounts, transactions, and dates – and understand complex sequences of events.

That means analyzing a lot of data.

The visualization-AI intelligence cycle

A diagram showing the visualization-AI intelligence cycle

A successful fraud investigation follows the visualization-AI intelligence cycle, combining the different strengths of visual analytics, AI and human reasoning.

Detection: AI software uses machine learning and pattern recognition to make recommendations and raise alerts.

Investigation: Interactive graph visualization presents insights in a way that’s easy for human investigators to navigate, analyze, and gain actionable intelligence.

Prevention: Investigators use what they’ve learned to inform the next set of queries and rules they feed into the system.

As patterns of fraud are detected, analysts can use the new insight to update and improve their automated systems.

Detecting fraud

Fraud detection is an increasingly automated process, as analysts are often looking for familiar patterns of activity. Automated systems score each case or transaction and assign it to a category, often using machine learning to process events.

The increase in scams means a higher volume of alerts fall into the ‘unsure’ category. No matter how advanced automated fraud detection is today, a flagged transaction needs fast analytical expertise from a human investigator. Visual graph and timeline analysis makes that possible.

Here’s a visual graph analysis chart showing a vehicle insurance claim that’s been flagged for review. Nodes represent claims, vehicles, people, and addresses. An automatic hierarchy layout makes it easy to spot dependencies.

An unusual connection stands out right away: the witness, Everett Page, shares an address with Walter Stewart, who has a previous claim relating to the same vehicle involved in this incident.

This is enough for the analyst to flag this claim for deeper investigation.
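The pattern the analyst spots visually can also be expressed as a graph query. The sketch below encodes the example as a handful of typed edges and looks for a witness who shares an address with the claimant of another claim on the same vehicle. The identifiers, relation labels, and function names are invented; only the two personal names come from the example above.

```python
# Toy claim graph as (node, relation, node) edges.
edges = [
    ("Everett Page", "witness_of", "claim-2"),
    ("Everett Page", "lives_at", "address-1"),
    ("Walter Stewart", "lives_at", "address-1"),
    ("Walter Stewart", "claimant_of", "claim-1"),
    ("claim-1", "involves", "vehicle-1"),
    ("claim-2", "involves", "vehicle-1"),
]


def build_index(edges):
    """Index edges as {relation: {source: set(targets)}}."""
    index = {}
    for source, relation, target in edges:
        index.setdefault(relation, {}).setdefault(source, set()).add(target)
    return index


def targets(index, relation, node):
    """Return the nodes reached from `node` over `relation` (empty set if none)."""
    return index.get(relation, {}).get(node, set())


def suspicious_witnesses(edges, claim):
    """Find (witness, claimant) pairs where a witness of `claim` shares an address
    with the claimant of another claim involving the same vehicle."""
    index = build_index(edges)
    vehicles = targets(index, "involves", claim)
    hits = []
    for witness, witnessed in index.get("witness_of", {}).items():
        if claim not in witnessed:
            continue
        for claimant, claims in index.get("claimant_of", {}).items():
            if claimant == witness:
                continue
            shares_address = bool(targets(index, "lives_at", witness)
                                  & targets(index, "lives_at", claimant))
            same_vehicle = any(targets(index, "involves", c) & vehicles
                               for c in claims if c != claim)
            if shares_address and same_vehicle:
                hits.append((witness, claimant))
    return hits


hits = suspicious_witnesses(edges, "claim-2")
```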

using link analysis to investigate known fraud using a fraud detection tool built with KeyLines

Timeline visualization adds a time dimension, making it easy to understand the sequence in which events unfolded.

This dataset contains a record of credit card transactions. We can easily pick out those which are disputed (in red) and identify the patterns around them.

Using timeline visualization to investigate known fraud using a fraud detection tool built with KronoGraph

Investigating fraud

More complex cases, for example those that might involve coordinated fraud rings and organized crime, require more complex human involvement. Here, link analysis and timeline visualizations are investigation tools. They present larger volumes of data for investigators to navigate and turn into actionable intelligence.

Understanding patterns of fraud relies on the analyst’s domain knowledge and investigative skills, which are both enhanced with visual analysis.

This fictional but typical dataset includes links between nodes representing policies, policyholder details, insurance claims, vehicle damage, doctors, witnesses, and mechanics. When you visualize a lot of cases at once, it’s easy to pick out ordinary claims – they’re the Y-shaped structures dotted around the chart – from the more complex, potentially fraudulent claims.
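One simple way to separate small, self-contained claims from larger tangled ones is to compute connected components and flag those above a size limit. The sketch below assumes an ordinary claim forms a small star of about four nodes; the edge list, node names, and the size cut-off are invented for illustration.

```python
def connected_components(edges):
    """Group nodes into connected components using an iterative depth-first search."""
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    seen, components = set(), []
    for start in neighbors:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbors[node] - component)
        seen |= component
        components.append(component)
    return components


# Two ordinary Y-shaped claims and one cluster joined by a shared mechanic and witness.
claim_edges = [
    ("claim-1", "policy-1"), ("claim-1", "vehicle-1"), ("claim-1", "witness-1"),
    ("claim-2", "policy-2"), ("claim-2", "vehicle-2"), ("claim-2", "witness-2"),
    ("claim-3", "policy-3"), ("claim-3", "mechanic-9"), ("claim-3", "witness-3"),
    ("claim-4", "policy-4"), ("claim-4", "mechanic-9"), ("claim-4", "witness-3"),
]
MAX_ORDINARY_SIZE = 4  # assumption: an ordinary claim touches about four nodes
components = connected_components(claim_edges)
flagged = [c for c in components if len(c) > MAX_ORDINARY_SIZE]
```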

using link analysis to detect fraud

Timeline visualization makes it easy to see how the relationships between traders developed through time.

In this insider trading example, we see Shany Keebler buying shares just before a big profit announcement. Combine that with communications data and unusual patterns start to emerge.
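The timing pattern in this example can be screened for with a simple window check: buys placed shortly before an announcement. The trade records, dates, and the five-day window below are invented; only the trader name comes from the example above.

```python
from datetime import date, timedelta


def trades_before_announcement(trades, announcement, window_days=5):
    """Return buy trades placed within window_days before the announcement date."""
    earliest = announcement - timedelta(days=window_days)
    return [t for t in trades
            if t["side"] == "buy" and earliest <= t["date"] < announcement]


# Invented trades and dates.
trades = [
    {"trader": "Shany Keebler", "side": "buy", "date": date(2021, 6, 8)},
    {"trader": "Shany Keebler", "side": "sell", "date": date(2021, 6, 9)},
    {"trader": "Trader B", "side": "buy", "date": date(2021, 5, 1)},
]
flagged_trades = trades_before_announcement(trades, date(2021, 6, 10))
```

A hit from this check is only a starting point; as the text notes, it becomes meaningful when combined with communications data.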

Timeline analysis for insider trading investigation

Preventing fraud

The third stage of the visualization-AI intelligence cycle is prevention – where data science teams use new information to train their models. It’s also an opportunity to close loopholes or vulnerabilities in the system.

Here, link analysis provides an overview of investigation outputs and operational data from multiple silos.

Armed with a single intuitive view, data scientists can uncover patterns and trends, and recommend model and process changes to prevent future scams.

using link analysis to detect fraud

Customize your fraud detection tool

Organizations worldwide trust our link analysis and timeline visualization technologies to join the dots in their fraud detection and investigation processes. Here’s why they choose us.

comprehensive data visualization for your fraud detection tool

See the full picture

Interact with data from across the organization in a single chart. Our products’ flexible approach means you can overcome data silos and gain insight into fraud information from multiple sources, giving you a clearer, more complete picture of events.

scalable data visualization for your fraud detection tool

Visualization that scales

Our toolkits support fraud analysis at scale. Whether you have a large, distributed team or huge volumes of data to analyze, we’ve designed our products to scale up to organizations and operations of any size.

fast data visualization for your fraud detection tool

Get answers faster

Discover more intuitive ways to understand your fraud data. Timeline and link analysis tools reveal fraud insight more effectively than other automated or manual processes, leading to faster and better decisions and fewer missed incidents of fraud.

customized data visualization for your fraud detection tool

A custom fraud detection tool

Visualization tools, custom designed for your fraud analysts and the data they need to understand, empower anyone to find insight in complex data. The result: insightful and straightforward tools that people want to use.

Want to try it for yourself?

Graph visualization for JavaScript developers

A screen showing a KeyLines graph visualization featuring a network of email communications between employees

Code how you like and build link analysis apps that work with any stack.

Graph visualization for React developers

A screen showing a ReGraph graph visualization featuring a network of email communications between employees

Use a simple data-driven API to build customized graph visualization apps in React.

Timeline visualization for JavaScript or React developers

A screen showing a timeline visualization featuring an investigation into suspected fraud by individuals against US stores

Design interactive timelines to explore patterns and unfolding events.

Posts from our blog about fraud detection tools

  • Credit card fraud visualization: AI detection strategies that work
  • Data visualization, AI and fraud detection
  • Data visualization and AI for healthcare fraud detection


Registered in England and Wales with Company Number 07625370 | VAT Number 113 1740 61 | 6-8 Hills Road, Cambridge, CB2 1JP. All material © Cambridge Intelligence 2024.


Latest science news, discoveries and analysis

  • China's Moon atlas is the most detailed ever made
  • ‘Shut up and calculate’: how Einstein lost the battle to explain quantum reality
  • Rat neurons repair mouse brains — and restore sense of smell
  • Mini-colon and brain 'organoids' shed light on cancer and other diseases
  • Scientists urged to collect royalties from the ‘magic money tree’
  • First glowing animals lit up the oceans half a billion years ago
  • Plastic pollution: three numbers that support a crackdown
  • The Maldives is racing to create new land. Why are so many people concerned?
  • Ecologists: don’t lose touch with the joy of fieldwork (Chris Mantegna)
  • Should the Maldives be creating new land?
  • Lethal AI weapons are here: how can we control them?
  • Algorithm ranks peer reviewers by reputation — but critics warn of bias
  • How gliding marsupials got their ‘wings’
  • Bird flu in US cows: is the milk supply safe?
  • NATO is boosting AI and climate research as scientific diplomacy remains on ice
  • Hello puffins, goodbye belugas: changing Arctic fjord hints at our climate future
  • NIH pay raise for postdocs and PhD students could have US ripple effect
  • Retractions are part of science, but misconduct isn’t — lessons from a superconductivity lab
  • Any plan to make smoking obsolete is the right step
  • Citizenship privilege harms science
  • European ruling linking climate change to human rights could be a game changer — here’s how (Charlotte E. Blattner)
  • Will AI accelerate or delay the race to net-zero emissions?

Current issue

  • Surprise hybrid origins of a butterfly species
  • Stripped-envelope supernova light curves argue for central engine activity
  • Optical clocks at sea

Research analysis

  • Ancient DNA traces family lines and political shifts in the Avar empire
  • A chemical method for selective labelling of the key amino acid tryptophan
  • Robust optical clocks promise stable timing in a portable package
  • Targeting RNA opens therapeutic avenues for Timothy syndrome
  • Bioengineered ‘mini-colons’ shed light on cancer progression
  • Galaxy found napping in the primordial universe
  • Tumours form without genetic mutations
  • Marsupial genomes reveal how a skin membrane for gliding evolved
  • Breaking ice, and helicopter drops: winning photos of working scientists
  • Shrouded in secrecy: how science is harmed by the bullying and harassment rumour mill
  • How ground glass might save crops from drought on a Caribbean island
  • Londoners see what a scientist looks like up close in 50 photographs

Books & culture

  • How volcanoes shaped our planet — and why we need to be ready for the next big eruption
  • Dogwhistles, drilling and the roots of Western civilization: Books in brief
  • Cosmic rentals
  • Las Borinqueñas remembers the forgotten Puerto Rican women who tested the first pill
  • Dad always mows on summer Saturday mornings

The Impact of Digitalization on Production Management Practices: A Multiple Case Study

  • Conference paper
  • First Online: 26 April 2024

  • Ruggero Colombari,
  • Jasmina Berbegal Mirabent &
  • Paolo Neirotti

Part of the book series: Lecture Notes on Data Engineering and Communications Technologies (LNDECT, volume 206)

Included in the following conference series:

  • International Conference on Industrial Engineering and Industrial Management (ICIEIM) – Congreso de Ingeniería de Organización

With the diffusion of Industry 4.0, manufacturing firms can decentralize their operational decisions and enable real-time data-driven decision-making. Using a socio-technical approach and the manufacturing shop-floor as a unit of analysis, this article studies the changes induced by digitalization on operational decision-making, organizational structures, and individual competencies. A cross-country multiple case study conducted in the automotive sector suggests three main areas on which firms have to focus: decentralized data-driven decision-making, front-line managers’ upskilling, and production workers’ involvement. The successful implementation of digitalization and the actual decentralization of decision-making depend on individual factors related to the competencies of front-line managers, who acquire a central role in this skill-biased technological and organizational change.

Author information

Authors and affiliations

  • Ruggero Colombari: Dept. d’Economia i Organització d’Empreses, Facultat de Ciències Econòmiques i Socials, Universitat Internacional de Catalunya, C/ Immaculada, 22, 08017, Barcelona, Spain
  • Jasmina Berbegal Mirabent: Dept. d’Organització d’Empreses, Escola Politècnica Superior d’Enginyeria, Universitat Politècnica de Catalunya, Av Victor Balaguer, s/n, 08800, Vilanova i la Geltrú, Spain
  • Paolo Neirotti: Dipartimento di Ingegneria Gestionale e della Produzione (DIGEP), Politecnico di Torino, Corso Duca degli Abruzzi, 24, 10129, Torino, Italia

Corresponding author

Correspondence to Ruggero Colombari.

Editor information

Editors and affiliations.

ETSEIB, Universitat Politècnica de Catalunya, Barcelona, Spain

Joaquín Bautista-Valhondo

Manuel Mateo-Doll

Rafael Pastor-Moreno

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper.

Colombari, R., Berbegal Mirabent, J., Neirotti, P. (2024). The Impact of Digitalization on Production Management Practices: A Multiple Case Study. In: Bautista-Valhondo, J., Mateo-Doll, M., Lusa, A., Pastor-Moreno, R. (eds) Proceedings of the 17th International Conference on Industrial Engineering and Industrial Management (ICIEIM) – XXVII Congreso de Ingeniería de Organización (CIO2023). CIO 2023. Lecture Notes on Data Engineering and Communications Technologies, vol 206. Springer, Cham. https://doi.org/10.1007/978-3-031-57996-7_44

  • DOI: https://doi.org/10.1007/978-3-031-57996-7_44
  • Published: 26 April 2024
  • Publisher Name: Springer, Cham
  • Print ISBN: 978-3-031-57995-0
  • Online ISBN: 978-3-031-57996-7
  • eBook Packages: Engineering, Engineering (R0)

Will a Mountain of Evidence Be Enough to Convict Trump?

Monday will see opening statements in the People of the State of New York v. Donald J. Trump. The state’s case seems strong, but a conviction is far from assured.

Donald Trump sits at the defense table between two lawyers.

By Ben Protess and Jonah E. Bromwich

  • April 21, 2024

In the official record, the case is known as the People of the State of New York v. Donald J. Trump, and, for now, the people have the stronger hand: They have insider witnesses, a favorable jury pool and a lurid set of facts about a presidential candidate, a payoff and a porn star.

On Monday, the prosecutors will formally introduce the case to 12 all-important jurors, embarking on the first prosecution of an American president. The trial, which could brand Mr. Trump a felon as he mounts another White House run, will reverberate throughout the nation and test the durability of the justice system that Mr. Trump is attacking in a way that no other defendant would be allowed to do.

Though the district attorney, Alvin L. Bragg, has assembled a mountain of evidence, a conviction is hardly assured. Over the next six weeks, Mr. Trump’s lawyers will seize on three apparent weak points: a key witness’s credibility, a president’s culpability and the case’s legal complexity.

Prosecutors will seek to maneuver around those vulnerabilities, dazzling the jury with a tale that mixes politics and sex, as they confront a shrewd defendant with a decades-long track record of skirting legal consequences. They will also seek to bolster the credibility of that key witness, Michael D. Cohen, a former fixer to Mr. Trump who previously pleaded guilty to federal crimes for paying the porn star, Stormy Daniels.

Daniel J. Horwitz, a veteran defense lawyer who previously worked in the Manhattan district attorney’s office prosecuting white-collar cases, said prosecutors can be expected to corroborate Mr. Cohen’s story wherever possible.

“The prosecution has layers upon layers of evidence to back up what Michael Cohen says,” Mr. Horwitz said.

Both sides will lay out their cases in opening statements on Monday, offering dueling interpretations of the evidence some six years after the payoff to Ms. Daniels entered the public consciousness and briefly imperiled Mr. Trump’s presidency.

But in previewing the case for prospective jurors last week, Manhattan prosecutors emphasized neither the payoff that secured Ms. Daniels’s silence, nor the sex scandal that was buried in the process. One prosecutor, Joshua Steinglass, instead distilled the trial’s stakes to a fundamental question: “This case is about the rule of law and whether or not Donald Trump broke it.”

Mr. Steinglass’s boss, Mr. Bragg, has offered a loftier interpretation, casting Mr. Trump’s actions as election interference. Although Mr. Trump’s lawyers might claim he was merely trying to hide embarrassing stories from his family, Mr. Bragg says Mr. Trump orchestrated a scheme to conceal simmering sex scandals from voters as they headed to the polls in 2016. All told, his allies struck three hush-money deals, paying off people who had stories to tell — stories that could have derailed Mr. Trump’s candidacy.

Mr. Bragg’s prosecutors will seek to turn that 2016 campaign strategy against him: The tactics that helped propel Mr. Trump to victory will be admitted as evidence and reconsidered far beyond the courtroom. Aides and friends who lied on Mr. Trump’s behalf will take the witness stand to testify against him.

They include: David Pecker, the tabloid publisher who bought and buried damaging stories about Mr. Trump; Hope Hicks, a spokeswoman who tried to spin reporters; and Mr. Cohen, the fixer who paid Ms. Daniels. Mr. Pecker, who ran the company that owned The National Enquirer, is set to go first, and is expected to recount for the jury several conversations with Mr. Trump about the hush money, according to a person familiar with the plan.

Mr. Trump faces 34 felony counts, and up to four years behind bars, but more than just his freedom is at stake. If convicted, he might lose the right to vote, including to cast a ballot for himself. If he were to win back the White House, he would be the first convicted criminal to serve as commander in chief. And the question of how he might serve a prison sentence, should it come to that if he does not receive probation, could throw the country into turmoil.

America has grown accustomed to seeing Mr. Trump smash through its customs and is now witnessing a phenomenon that is a first in the 248 years of its history. Presidents have been impeached, driven from office and rejected at the polls. Mr. Trump is about to be the first to have his fate decided not just by voters, but by 12 citizens in a jury box.

And they all hail from Manhattan, the borough that made Mr. Trump famous, and where he is now deeply unpopular. A favorable jury pool, legal experts say, has given Mr. Bragg a leg up at the trial.

Yet the jury, which was made final on Friday and includes six alternates, is no rubber stamp: It includes at least two people who have expressed some affection for the former president, and it takes only one skeptical member to force a mistrial, an outcome that Mr. Trump would celebrate as a win.

The stakes are high for Mr. Bragg as well. He is betting his career and his legacy on a prosecution he inherited, rejected and then transformed.

When he took office in 2022, he declined to bring a financial fraud case against Mr. Trump that his predecessor had prepared, prompting an uproar when two prosecutors resigned in protest.

But Mr. Bragg continued to investigate and soon revisited the hush-money deal — an episode that had become known internally as “the zombie case,” because it kept coming back to life. Little more than a year after taking office, Mr. Bragg indicted the former president.

Three other indictments followed in three other cities, but with those cases mired in delay, Mr. Bragg’s trial may now be the only one that Mr. Trump will face before Election Day.

The Manhattan case comprises the three hush-money deals: with Ms. Daniels, with a former Playboy model and with a onetime doorman who told a tale of Mr. Trump fathering a child out of wedlock.

Mr. Pecker and his tabloid bought the silence of the doorman, whose story turned out to be false. They also bought the rights to the story told by the model, Karen McDougal, and then never wrote it, a practice known as “catch and kill.”

Then there was Ms. Daniels, who was interested in selling her story of a sexual encounter with Mr. Trump. Mr. Pecker drew the line there: Her price was too high.

Instead, he and a top editor alerted Mr. Cohen, who soon paid Ms. Daniels $130,000 not to tell her story about a sexual encounter with Mr. Trump a decade earlier.

Mr. Cohen has said he acted at Mr. Trump’s direction, but the former president is not charged over the payment itself. Instead, he stands accused of covering up the transaction by disguising reimbursements to Mr. Cohen.

In internal records, Mr. Trump’s company marked those payments as legal expenses, citing a retainer agreement. Yet no such expenses existed, prosecutors say, and the retainer agreement was fictional.

Mr. Trump is accused of engineering — or, at least, approving — the coverup. His company, prosecutors argue, produced 34 false records that underpin the counts against him: 11 checks, 11 monthly invoices Mr. Cohen submitted and 12 entries in the general ledger for Mr. Trump’s trust.

Mr. Trump signed several of the checks in the White House, as prosecutors will surely point out at the trial.

But directly linking Mr. Trump to the plot to falsify those records is another matter altogether.

His lawyers are likely to argue that he was oblivious, and that Mr. Cohen handled the specifics. Mr. Cohen hashed out the reimbursement plan with Mr. Trump’s chief financial officer, Allen H. Weisselberg, who is serving jail time for perjury and will not testify, records show.

The lack of a firsthand witness to confirm Mr. Cohen’s account is a potential flaw in the case, but it may not be fatal. Prosecutors plan to introduce a document containing Mr. Weisselberg’s handwritten notes about the reimbursements — a key piece of evidence demonstrating that Mr. Cohen did not act alone.

And under the law, the prosecutors need not prove that Mr. Trump personally falsified the records. Already during the trial’s first week, Mr. Steinglass laid the groundwork with a simple analogy: He asked prospective jurors whether they could accept that, if a husband hired a hit man to murder his wife, the husband was just as guilty as the man who pulled the trigger.

“Can you all follow the same kind of logic in this case?” Mr. Steinglass asked the prospective jurors. Many said they could.

Mr. Cohen is expected to offer the closest thing this case has to a smoking gun: He is likely to say that, in early 2017, he and Mr. Trump discussed the repayment scheme in the Oval Office.

If Mr. Trump testifies in his own defense, that could pit Mr. Cohen’s word against Mr. Trump’s — a he-said, he-said story, with two questionable narrators.

Whether or not Mr. Trump takes the stand, the trial could become a referendum on Mr. Cohen’s credibility, with the verdict possibly hinging on a convincing performance.

In 2018, Mr. Cohen pleaded guilty to a variety of federal crimes, admitting to participating in the hush-money deals with Ms. Daniels and Ms. McDougal and lying to Congress about plans for a Trump business deal in Russia. Mr. Trump’s lawyers will seek to emphasize Mr. Cohen’s checkered past at every turn.

And, on cross-examination, Mr. Trump’s lawyers are likely to portray Mr. Cohen as a serial liar with a grudge against his former boss.

Susan Necheles, one of Mr. Trump’s lawyers, began that campaign during jury selection. She referenced Mr. Cohen’s 2022 book “Revenge,” questioning the credibility of “someone who says that they want revenge against President Trump.”

Yet the prosecution is expected to note that Mr. Cohen told many of his lies for Mr. Trump. And prosecutors will offer evidence corroborating the broad strokes of Mr. Cohen’s story, which could persuade jurors when they are weighing his testimony about the crucial Oval Office meeting.

Mr. Trump’s White House executive assistant, Madeleine Westerhout, who has been identified as a potential witness, could confirm that Mr. Cohen did indeed meet with Mr. Trump, even if she cannot confirm what they discussed. Mr. Pecker can support at least some of Mr. Cohen’s testimony about Mr. Trump’s involvement in the hush-money deals. And a recording Mr. Cohen made of a call he had with Mr. Trump will capture the former president discussing the deal with Ms. McDougal.

“The prosecution’s argument is that you can trust Michael Cohen beyond a reasonable doubt as to their isolated conversation,” said Mr. Horwitz, the former prosecutor. He called the approach “Prosecuting 101.”

William K. Rashbaum, Maggie Haberman, Jonathan Swan and Michael Rothfeld contributed reporting.

Ben Protess is an investigative reporter at The Times, writing about public corruption. He has been covering the various criminal investigations into former President Trump and his allies.

Jonah E. Bromwich covers criminal justice in New York, with a focus on the Manhattan district attorney’s office and state criminal courts in Manhattan.

April 17, 2024

Ice age climate analysis reduces worst-case warming expected from rising CO2

four woolly mammoths on frozen ground

This artist’s rendition shows woolly mammoths in northern Spain. These animals lived in Europe and North America during the last glacial period, around 21,000 years ago. A new study used updated climate maps from that period, when atmospheric carbon dioxide was lower, to better predict future warming under rising CO2. Mauricio Anton

As carbon dioxide accumulates in the atmosphere, the Earth will get hotter. But exactly how much warming will result from a certain increase in CO2 is under study. The relationship between CO2 and warming, known as climate sensitivity, determines what future we should expect as CO2 levels continue to climb.

New research led by the University of Washington analyzes the most recent ice age, when a large swath of North America was covered in ice, to better understand the relationship between CO2 and global temperature. It finds that while most future warming estimates remain unchanged, the absolute worst-case scenario is unlikely.

The open-access study was published April 17 in Science Advances.

“The main contribution from our study is narrowing the estimate of climate sensitivity, improving our ability to make future warming projections,” said lead author Vince Cooper, a UW doctoral student in atmospheric sciences. “By looking at how much colder Earth was in the ancient past with lower levels of greenhouse gases, we can estimate how much warmer the current climate will get with higher levels of greenhouse gases.”

The new paper doesn’t change the best-case warming scenario from doubling CO2 — about 2 degrees Celsius average temperature increase worldwide — or the most likely estimate, which is about 3 degrees Celsius. But it reduces the worst-case scenario for doubling of CO2 by a full degree, from 5 degrees Celsius to 4 degrees Celsius. (For reference, CO2 is currently at 425 ppm, or about 1.5 times preindustrial levels, and unless emissions drop is headed toward double preindustrial levels before the end of this century.)
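To make the arithmetic behind these figures concrete, the sketch below applies the commonly used approximation that equilibrium warming scales with the logarithm of the CO2 concentration ratio. It assumes a preindustrial level of 280 ppm, a value not stated in this article, and it illustrates the scaling only. It is not the method used in the study.

```python
import math

PREINDUSTRIAL_PPM = 280.0  # assumption: not stated in the article
CURRENT_PPM = 425.0        # current level quoted in the article


def equilibrium_warming(ppm, sensitivity_c, baseline_ppm=PREINDUSTRIAL_PPM):
    """Eventual warming in degrees Celsius at a given CO2 level, where
    `sensitivity_c` is the warming per doubling of CO2 (logarithmic scaling)."""
    return sensitivity_c * math.log2(ppm / baseline_ppm)


# Ratio of current to (assumed) preindustrial CO2; the article quotes about 1.5.
ratio = CURRENT_PPM / PREINDUSTRIAL_PPM

# Warming per doubling: best case, most likely, and the revised worst case.
for label, sensitivity in [("best case", 2.0), ("most likely", 3.0), ("worst case", 4.0)]:
    now = equilibrium_warming(CURRENT_PPM, sensitivity)
    doubled = equilibrium_warming(2 * PREINDUSTRIAL_PPM, sensitivity)
    print(f"{label}: {now:.1f} C at {CURRENT_PPM:.0f} ppm, {doubled:.1f} C at doubling")
```

These are long-run equilibrium values under the stated approximation. They are not predictions of warming observed to date, which also depends on ocean heat uptake and other forcings.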

As our planet heads toward a doubling of CO2, the authors caution that the recent decades are not a good predictor of the future under global warming. Shorter-term climate cycles and atmospheric pollution’s effects are just some reasons that recent trends can’t reliably predict the rest of this century.

“The spatial pattern of global warming in the most recent 40 years doesn’t look like the long-term pattern we expect in the future — the recent past is a bad analog for future global warming,” said senior author Kyle Armour, a UW associate professor of atmospheric sciences and of oceanography.

Instead, the new study focused on a period 21,000 years ago, known as the Last Glacial Maximum, when Earth was on average 6 degrees Celsius cooler than today. Ice core records show that atmospheric CO2 then was less than half of today’s levels, at about 190 parts per million.

“The paleoclimate record includes long periods that were on average much warmer or colder than the current climate, and we know that there were big climate forcings from ice sheets and greenhouse gases during those periods,” Cooper said. “If we know roughly what the past temperature changes were and what caused them, then we know what to expect in the future.”

Researchers including co-author Gregory Hakim, a UW professor of atmospheric sciences, have created new statistical modeling techniques that allow paleoclimate records to be assimilated into computer models of Earth’s climate, similar to today’s weather forecasting models. The result is more realistic temperature maps from previous millennia.

For the new study the authors combined prehistoric climate records — including ocean sediments, ice cores, and preserved pollen — with computer models of Earth’s climate to simulate the weather of the Last Glacial Maximum. When much of North America was covered with ice, the ice sheet didn’t just cool the planet by reflecting summer sunlight off the continents, as previous studies had considered.

The left panel shows the sea surface temperature map during the most recent ice age, 21,000 years ago, compared to modern preindustrial temperatures. This new, more detailed analysis shows that the strong cooling over the northern oceans, caused by the North American ice sheet, contributed substantially to total global cooling. The right panel shows that the warming of the ocean’s surface expected under future doubling of atmospheric CO2 displays a different pattern of temperature change, with a lower expectation for globally averaged warming than previous worst-case estimates. Image credit: Cooper et al./Science Advances

By altering wind patterns and ocean currents, the ice sheet also caused the northern Pacific and Atlantic oceans to become especially cold and cloudy. Analysis in the new study shows that these cloud changes over the oceans compounded the glacier’s global cooling effects by reflecting even more sunlight.

In short, the study shows that CO2 played a smaller role in setting ice age temperatures than previously estimated. The flip side is that the most dire predictions for warming from rising CO2 are less likely over coming decades.

“This paper allows us to produce more confident predictions because it really brings down the upper end of future warming, and says that the most extreme scenario is less likely,” Armour said. “It doesn’t really change the lower end, or the average estimate, which remain consistent with all the other lines of evidence.”

The research was funded by the National Science Foundation, the Department of Defense’s National Defense Science and Engineering Graduate Fellowship, the Alfred P. Sloan Foundation, the National Oceanic and Atmospheric Administration and the European Union’s Horizon 2020 program. Other co-authors are Jessica Tierney at the University of Arizona; Matthew Osman at the University of Cambridge in the U.K.; Cristian Proistosescu and Philip Chmielowiec at the University of Illinois Urbana-Champaign; Yue Dong at the University of Colorado; Natalie Burls at George Mason University; Timothy Andrews at the U.K. Met Office Hadley Centre; Daniel Amrhein and Jiang Zhu at the NSF National Center for Atmospheric Research in Boulder; Wenhao Dong at the University Corporation for Atmospheric Research in Boulder and Geophysical Fluid Dynamics Laboratory; and Yi Ming at Boston College.

For more information, contact Cooper at [email protected] or Armour at [email protected].

© 2024 University of Washington | Seattle, WA

COMMENTS

  1. PDF Link Analysis

    Link Analysis One of the biggest changes in our lives in the decade following the turn of the century was the availability of efficient and accurate Web search, through search engines such as Google. While Google was not the first search engine, it was the first able to defeat the spammers who had made search almost useless.

  2. Link analysis for fraud detection: a step-by-step example

    Step 1: Load a claim. This claim folder involves two vehicles and three claimants, associated with three separate addresses. Our first step is to load the disputed claim in a link analysis chart, using the sequential layout to simplify the view. In this example, we have two people (Stephen Porter and Julia Rodriguez) claiming for damage to ...

  3. Interpreting social science link analysis research: A theoretical

    Link analysis in various forms is now an established technique in many different subjects, reflecting the perceived importance of links and of the Web. A critical but very difficult issue is how to interpret the results of social science link analyses. It is argued that the dynamic nature of the Web, its lack of quality control, and the online ...

  4. Journal of Investigative Psychology and Offender Profiling

    The case study is outlined, followed by the step-by-step process of conducting linkage analysis. Notably, in this case study, the analyst was not allowed to testify as to whether they believed the cases were linked but rather allowed to suggest how similar the offences were, and that a similar signature existed in each of them. Keppel (1995)

  5. The Application of Link Analysis to Police Intelligence

    Abstract. Link analysis procedures were developed and evaluated to aid law-enforcement agencies integrate collected information and develop hypotheses leading to the prevention and control of organized crime. The procedures were designed to portray the relationships among suspected criminals, to determine the structure of criminal organizations ...

  6. Link analysis: a guide

    Link analysis is an analytics technique used to identify, evaluate and understand the connections within data. The data to perform link analysis is stored in a graph database and then is displayed as a graph visualization, also called a network visualization. Individual data points in a graph data model are represented by nodes. These entities are connected with edges - also called ...

  7. Link Prediction in Criminal Networks: A Tool for Criminal Intelligence

    The main limitation of a case study approach concerns the external validity of the findings, i.e. the extent to which the results can be generalized beyond the case studies. The analysis of the Oversize dataset focuses on a single criminal network thus sharing similar limitations on external validity with previous studies.

  8. PDF Stable Algorithms for Link Analysis

    into ways of designing stable link analysis methods. This in turn motivates two new algorithms, whose performance we study empirically using citation data and web hyperlink data. 1. INTRODUCTION From its origins in bibliometric analysis [11], the analysis of cross-referencing patterns ("link analysis") has come to play an

  9. Using Link Analysis To Define Data Relationships In Investigations

    Today's digital investigations are being powered by link analysis. Link analysis is an analytical process whereby data points, often referred to as "nodes", are used to identify relationships and connections between disparate data sources. The power behind link analysis and its rapid adoption in today's era of big data is that it ...

  10. Link Analysis

    Abstract. Link analysis has been recognized as an effective technique in data science to explore the relationships of objects. The objects can be social events, people, organization and even business transactions. This chapter reports the practical models of link analysis in various data-driven application areas.

  11. Link analysis

    In network theory, link analysis is a data-analysis technique used to evaluate relationships between nodes. Relationships may be identified among various types of nodes, including organizations, people and transactions. Link analysis has been used for investigation of criminal activity (fraud, counterterrorism, and intelligence ...

  12. Link Analysis

    The Text Link Analysis node extracts concepts and also identifies relationships between concepts based on known patterns within the text. Pattern extraction can be used to discover relationships between your concepts, as well as any opinions or qualifiers attached to these concepts. ... Case Study. The goal of this tutorial is to demonstrate ...

  13. Link Analysis

    Link Analysis. Link analysis is a process of finding connections between different entities, such as connecting customers to other customers or customer to products. From: Using Information to Develop a Culture of Customer Centricity, 2013. Related terms: User Requirement; Social Network; Association Rules; Operating Systems; Case Study ...

  14. Understanding Link Analysis and Using it in Investigations

    The Bottom Line. Link Analysis can be an invaluable tool for investigators by enabling users to draw conclusions more precisely through the visual analysis of connections. It helps with analytical tasks where target-centric link analysis is key. (Life-style Analysis, Analysis of Friends of Friends, Etc.).

  15. PDF Centrality Measures and Link Analysis

    Case study; Centrality measures; Case study: Stability of centrality measures in weighted graphs; Centrality, link analysis and web search; A primer on Markov chains; PageRank as a random walk; PageRank algorithm leveraging Markov chain structure.

  16. Tutorial 3: Mini Link Analysis Research Project Case Study

    Overview. This tutorial introduction goes through the stages of a very small pretend link analysis research project, from the initial crawling to analyzing the link data. This project is designed to give you an easy way to learn how to use SocSciBot 4 and SocSciBot Tools for a standard ...

  17. Case Study Methodology of Qualitative Research: Key Attributes and

    A case study is one of the most commonly used methodologies of social research. This article attempts to look into the various dimensions of a case study research strategy, the different epistemological strands which determine the particular case study type and approach adopted in the field, discusses the factors which can enhance the effectiveness of a case study research, and the debate ...

  18. Intelligence Investigations Link Analysis Example

    Figure 29. Example of a link analysis with a high density of objects. These intercepts are the location records tied to Denise's mobile phone and when displayed using the heatmap option, it shows a heavy concentration of activity in Brooklyn, NY. Each sphere represents a location reference.

  19. Generalizing link prediction: Collaboration at the University of

    For the case study in this article we take K training = K test = 1, thereby only ignoring all isolates. LP on another basis than time. The canonical case of LP is time-based: G and G Test represent the same network at different points in time. One can, however, also imagine cases where one wants to predict unknown relations between entities on ...

  20. Cross-Site Analysis and Case Study of STOP Program Grantee Perspectives

    The major goals of this study were to 1) understand the challenges and facilitators of implementing violence prevention and mental health training programs through a broad cross-site analysis, 2) assess contextual factors influencing implementation, as well as regional and population variances through targeted, comprehensive case studies, and 3 ...

  21. Fraud detection tool & data visualization essentials

    Fraud detection is an increasingly automated process, as analysts are often looking for familiar patterns of activity. They automatically score each case or transaction, and assign it to a category - often using machine learning to process events. The increase in scams means a higher volume of alerts fall into the 'unsure' category.

  22. The Missing Link of Job Analysis: A Case Study

    There are various steps involved in conducting job analysis; they can be broadly defined as follows: Collecting background and available information about the roles. Identifying the representative roles to be included in the study of job analysis. Conducting job analysis and gathering relevant data.

  23. Case Study Method: A Step-by-Step Guide for Business Researchers

    Case study protocol is a formal document capturing the entire set of procedures involved in the collection of empirical material. It extends direction to researchers for gathering evidences, empirical material analysis, and case study reporting. This section includes a step-by-step guide that is used for the execution of the actual study.

  24. Link Building Case Study: How We Built Backlinks With a 'Stats' Page

    To do this, we first went to the Backlinks report for the page, toggled the "one link per domain" filter, then searched for each statistic in the link anchors and surrounding texts. For the "93%" statistic, over 700 websites were linking to the page. That's the first box checked.

  25. Reading alphabetic and nonalphabetic writing systems: A case study of

    This case study investigates the reading processes of two bilingual teachers who speak English as a second language and use different first languages—Mandarin Chinese and Korean. The two participants read researcher-selected digital texts in English and in their respective first language, retold the texts, and answered comprehension questions ...

  26. Latest science news, discoveries and analysis

    Find breaking science news and analysis from the world's leading research journal.

  27. Sustainability

    With the implementation of China's rural revitalization strategy, the sustainable preservation of traditional dwellings has become a research priority. Moreover, with the aging population in the countryside increasing, the limited mobility of the elderly may result in them receiving daily corneal illuminance too low for a healthy circadian stimulus. This work aims to explore the relationship ...

  28. The Impact of Digitalization on Production Management ...

    Therefore, the chosen methodology is multiple-case study with embedded unit of analysis (Yin 2003). The empirical setting chosen for this study is the automotive sector, a manufacturing industry that relies on team-based work practices and has widely implemented digitization and data-integration technologies.

  29. Will a Mountain of Evidence Be Enough to Convict Trump?

    Monday will see opening statements in the People of the State of New York v. Donald J. Trump. The state's case seems strong, but a conviction is far from assured. By Ben Protess and Jonah E ...

  30. Ice age climate analysis reduces worst-case warming expected from

    The open-access study was published April 17 in Science Advances. "The main contribution from our study is narrowing the estimate of climate sensitivity, improving our ability to make future warming projections," said lead author Vince Cooper, a UW doctoral student in atmospheric sciences. "By looking at how much colder Earth was in the ...