Network Analysis
- Section 2 of the network mapping paper
It has become fashionable to talk of networks of organisations, people, computers, transport and so on. In organisations there is talk of being more “networky” and getting away from the older more hierarchical ways of doing things. Conferences are organised around “networking” both formal and informal.
Yet, the more that you listen to this network talk the more you realise that people mean very different things by the term “network”. The purpose of this paper is to explore what network thinking means and how networks can be mapped and analysed.
Why is this important and useful? The structure of a network will affect how influence and information are distributed. Certain members will be potentially more influential because of their position in the network. Mapping the network can give guidance on the easiest ways to distribute information, the links that should be there to improve the network, how to avoid bottlenecking and so on. Such network maps are used by commercial and government organisations to plot situations as diverse as:
- Structures of trust, advice and communication within an organisation or group of organisations
- Planning the development of a network
- Improving the functioning of project teams
- Mapping communities of interest or expertise
- Identifying centres of expertise
- Indicating key organisations and links to encourage community cohesion
What is a network?
The first thing to be said is that a network is not a list. The term implies a set of connections between its members. These connections may consist of flows of information, power, money and so on, but the implication is that an influence of some sort is passing from one to the other.
Networks can be dense or sparse - meaning that the number of connections is great or small. The total number of possible connections in any group of n members is given by the formula:
n x (n - 1) / 2
Thus, a network of 10 members has a total of 45 possible connections. The density of a network can be measured by comparing the number of actual links with the number of possible links and expressing this as a percentage. However, this measure should be used with care. The number of possible links increases dramatically with the number of nodes. In any real world network there will be a natural limit on the number of connections that any node is able to handle. Thus large networks will show much lower densities than small ones, so networks of different sizes can’t be compared using such a density measure. A better gauge of density is the average number of connections per node, as this can be applied across all scales.
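As a rough illustration of why density is hard to compare across sizes, the sketch below computes both measures for three networks in which every member averages four connections. The function names and the example sizes are our own assumptions, chosen only for illustration.

```python
def possible_links(n):
    """Maximum number of links among n members: n x (n - 1) / 2."""
    return n * (n - 1) // 2


def density(n, actual_links):
    """Actual links as a percentage of possible links."""
    return 100.0 * actual_links / possible_links(n)


def average_connections(n, actual_links):
    """Average connections per node; each link joins two nodes."""
    return 2.0 * actual_links / n


# Hypothetical networks in which each member averages 4 connections,
# which means 2 x n links in total.
for n in (10, 100, 1000):
    links = 2 * n
    print(n, round(density(n, links), 2), average_connections(n, links))
```

The average number of connections stays at 4 in every case, while the density percentage falls sharply as the network grows.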
For all the members to be connected into a network structure the number of links must be at least n - 1. Research shows that the best connected organisations are arranged so that any member can connect to another within three steps.
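The minimum of n - 1 links only guarantees that everyone can be reached, not that they can be reached in a few steps. The sketch below counts the largest number of steps between any two members using a breadth-first search. The two five-member networks (a chain and a star) are hypothetical examples, and the function names are our own.

```python
from collections import deque


def build(links):
    """Turn a list of (a, b) links into a lookup of neighbours."""
    adj = {}
    for a, b in links:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def steps_from(adj, start):
    """Number of steps from start to every member it can reach."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist


def max_steps(adj):
    """Largest number of steps between any two members,
    or None if some members cannot reach each other at all."""
    worst = 0
    for v in adj:
        dist = steps_from(adj, v)
        if len(dist) < len(adj):
            return None
        worst = max(worst, max(dist.values()))
    return worst


# Both networks have 5 members and the minimum 4 links.
chain = build([(1, 2), (2, 3), (3, 4), (4, 5)])
star = build([(1, 2), (1, 3), (1, 4), (1, 5)])
print(max_steps(chain), max_steps(star))
```

With the same number of links, the star keeps everyone within two steps of each other while the chain needs four, so the arrangement of links matters as much as their number.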
Measuring Centrality
A key concept in the analysis of networks is centrality, which holds that nodes in a network will have influence because of their position. There are several types of centrality. The following examples show the “kite” diagram developed by Professor David Krackhardt of Carnegie Mellon University to illustrate some of the basic properties of networks. Ten people make up the network and they are related in ways shown by the linking lines. Darker shading indicates how members score under various network measures.
Figure 1 - Degree Centrality is a measure of how many connections members have. In the diagram below, Diane has six connections. Fernando and Garth have 4. Carol, Andre, Beverly, Ed and Heather have 3. Ike has 2 and Jane has 1. Diane has the top degree centrality score.
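Degree centrality is simple enough to count directly from a list of links. The sketch below does this for the kite network. The link list is an assumption: it follows the commonly reproduced version of Krackhardt's kite and may differ by a link or two from the figure in this paper, so individual counts need not match those quoted above.

```python
from collections import defaultdict

# Assumed link list for the kite network (commonly reproduced version).
KITE_LINKS = [
    ("Andre", "Beverly"), ("Andre", "Carol"), ("Andre", "Diane"),
    ("Andre", "Fernando"), ("Beverly", "Diane"), ("Beverly", "Ed"),
    ("Beverly", "Garth"), ("Carol", "Diane"), ("Carol", "Fernando"),
    ("Diane", "Ed"), ("Diane", "Fernando"), ("Diane", "Garth"),
    ("Ed", "Garth"), ("Fernando", "Garth"), ("Fernando", "Heather"),
    ("Garth", "Heather"), ("Heather", "Ike"), ("Ike", "Jane"),
]


def degree_centrality(links):
    """Count how many connections each member has."""
    degree = defaultdict(int)
    for a, b in links:
        degree[a] += 1
        degree[b] += 1
    return dict(degree)


degrees = degree_centrality(KITE_LINKS)
# List members from most to least connected.
for name, count in sorted(degrees.items(), key=lambda kv: (-kv[1], kv[0])):
    print(name, count)
```

Under this assumed link list Diane still comes out on top with six connections, and Jane has one.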
Figure 2 - Closeness Centrality measures how few steps a member needs to reach everyone else, and so how quickly influence can spread from them through the network. It indicates a member's potential for influence because of their position in the network. In the diagram below, Fernando and Garth share the top score. Heather and Diane are the next most influential, followed by Andre and Beverly, then Carol and Ed, then Ike, and finally Jane, who is the least influential.
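One common way of scoring closeness is to add up the number of steps from a member to every other member and divide the number of other members by that total, so that shorter routes give higher scores. The sketch below applies this to the kite network. The link list is an assumed, commonly reproduced version of the kite, and the code assumes every member can reach every other.

```python
from collections import deque

# Assumed link list for the kite network (commonly reproduced version).
KITE_LINKS = [
    ("Andre", "Beverly"), ("Andre", "Carol"), ("Andre", "Diane"),
    ("Andre", "Fernando"), ("Beverly", "Diane"), ("Beverly", "Ed"),
    ("Beverly", "Garth"), ("Carol", "Diane"), ("Carol", "Fernando"),
    ("Diane", "Ed"), ("Diane", "Fernando"), ("Diane", "Garth"),
    ("Ed", "Garth"), ("Fernando", "Garth"), ("Fernando", "Heather"),
    ("Garth", "Heather"), ("Heather", "Ike"), ("Ike", "Jane"),
]


def closeness_centrality(links):
    """(members - 1) / total steps to all other members.
    Assumes the network is fully connected."""
    adj = {}
    for a, b in links:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    scores = {}
    for start in adj:
        # Breadth-first search gives the fewest steps to each member.
        dist = {start: 0}
        queue = deque([start])
        while queue:
            v = queue.popleft()
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        scores[start] = (len(adj) - 1) / sum(dist.values())
    return scores


closeness = closeness_centrality(KITE_LINKS)
for name, score in sorted(closeness.items(), key=lambda kv: (-kv[1], kv[0])):
    print(name, round(score, 3))
```

With this link list the ranking follows the order described above: Fernando and Garth tie for first place and Jane comes last.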
Figure 3 - Betweenness Centrality refers to the way that some nodes will control the access to parts of the network. Thus, in the diagram below, Heather is the only access to Ike and Jane. Such nodes are “gatekeepers” and can either restrict or facilitate the way that influence spreads to a cluster of nodes.
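Betweenness is usually scored by counting how often a member lies on the shortest routes between pairs of other members. The sketch below does this for the kite network using the approach of Brandes' algorithm. The link list is an assumed, commonly reproduced version of the kite, and the scores are raw counts rather than percentages.

```python
from collections import deque

# Assumed link list for the kite network (commonly reproduced version).
KITE_LINKS = [
    ("Andre", "Beverly"), ("Andre", "Carol"), ("Andre", "Diane"),
    ("Andre", "Fernando"), ("Beverly", "Diane"), ("Beverly", "Ed"),
    ("Beverly", "Garth"), ("Carol", "Diane"), ("Carol", "Fernando"),
    ("Diane", "Ed"), ("Diane", "Fernando"), ("Diane", "Garth"),
    ("Ed", "Garth"), ("Fernando", "Garth"), ("Fernando", "Heather"),
    ("Garth", "Heather"), ("Heather", "Ike"), ("Ike", "Jane"),
]


def betweenness_centrality(links):
    """Share of shortest routes between other pairs that pass through each member."""
    adj = {}
    for a, b in links:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    score = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s, counting shortest routes (sigma).
        order = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Work back from the farthest members, passing credit to predecessors.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Links have no direction, so every pair was counted twice.
    return {v: total / 2 for v, total in score.items()}


betweenness = betweenness_centrality(KITE_LINKS)
for name, value in sorted(betweenness.items(), key=lambda kv: (-kv[1], kv[0])):
    print(name, round(value, 2))
```

Heather scores highest here because every route from Ike or Jane to the other seven members has to pass through her.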
Figure 4 - Clusters of nodes can be identified in a network. A cluster is identified where the nodes connect more to each other than they do to the rest of the network. Thus, Ike, Jane and Heather form a cluster, to which Heather is the gatekeeper.
So, understanding the way that various nodes control the spread of information and influence in a network can be very useful in deciding where to exert pressure for change, how best to introduce ideas or information and how the network structure might be improved.
Snowballing
Many of the examples of social network analysis show vast maps with several thousand nodes and huge numbers of connections. Often, these will have been prepared using data mining techniques - the automatic recording of connections through email records or online social networks. These are compiled into matrices and the data set fed into the SNA software. The results are often hard to interpret.
When this approach is applied to the study of communities who use a mix of different connections with others - online, offline conversations, meetings, publications, letters and so on - the collection of data can be quite difficult. Commonly it will be based on interviews, surveys, workshops and other methods that will take time and staff commitment to organise and facilitate. So if you are studying a community with 3,000 members you will need to access these individually and in managed groups. This can be a colossal task and recent UK examples have gained the method a reputation for being difficult to apply in real communities.
There is another approach that uses the characteristics of local networks to assist in the initial gathering of information. We must ask ourselves the question: “what do we want to use the analysis for?” If it's to thoroughly understand the intricacies of local social networks, then we probably have to follow the route described above. If, on the other hand, we want to identify which individuals and organisations are potentially most central to the workings of community networks, there is a simpler method.
The SNA practitioner will usually be asked by a particular agency or group to carry out an analysis. This client will have their own set of contacts in the community as a starting point. Through interviews, questionnaires etc, this group will be asked to cite those they have most contact with in terms of the focus of the study - working relationships, information spread, political influence and so on.
The map and centrality analysis produced by this will certainly have been skewed in favour of the client's own network of contacts. Unsurprisingly, the client will turn out to be the most central node in the network. However the map will also show the contacts cited by those interviewed and connections between interviewees. A number of these will be linked with nodes other than the client list and score highly in terms of centrality. These become the targets for a second round of interviews or questionnaires concentrating on the nodes that were cited by contacts but not interviewed in the first wave. A large community may need several rounds of this process. Eventually the high centrality scores will emerge clearly without having to engage the whole community in the survey. This approach is sometimes known as “snowballing” in SNA literature.
The diagram illustrated above shows a network revealed in a two stage survey:
- The client (black node) gives the investigator a list of contacts (orange nodes) who are then interviewed or provided with a simple questionnaire asking who they are most in contact with relative to the subject of study.
- The results of this first survey are plotted and shown by the blue nodes and by links to other orange nodes.
- A centrality analysis is carried out to determine the key blue nodes, who are then selected for survey, revealing connections to red nodes.
This process removes the skewing of the survey by the initial client contact selection and quickly focusses on the most central nodes without having to survey the whole community. Effectively it is using a networking approach to carry out the network survey. Complex and extensive networks may need several iterations of this method.
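The survey rounds described above can be sketched as a simple loop. This is only an illustration under stated assumptions: the `ask` function stands in for an interview or questionnaire, the names and responses are invented, and the number of times a person is cited is used as a crude stand-in for the fuller centrality analysis.

```python
from collections import Counter


def snowball(ask, client_contacts, rounds=2, per_round=1):
    """Survey the client's contacts, then repeatedly survey the
    most-cited people who have not yet been surveyed."""
    links = set()
    surveyed = set()
    wave = list(client_contacts)
    for _ in range(rounds):
        for person in wave:
            surveyed.add(person)
            for contact in ask(person):
                links.add((person, contact))
        # Count citations received by people not yet surveyed.
        cited = Counter(c for _, c in links if c not in surveyed)
        # Most cited first; ties broken by name so the result is repeatable.
        wave = sorted(cited, key=lambda p: (-cited[p], p))[:per_round]
        if not wave:
            break
    return links, surveyed


# Hypothetical responses: who each person says they are most in contact with.
RESPONSES = {
    "A": ["B", "X"],
    "B": ["A", "X", "Y"],
    "C": ["X"],
    "X": ["Z", "Y"],
    "Y": ["X"],
}

links, surveyed = snowball(lambda p: RESPONSES.get(p, []), ["A", "B", "C"])
print(sorted(surveyed))
print(sorted(links))
```

In this invented example X is not on the client's list but is cited by all three first-round contacts, so X is surveyed in the second round and reveals the link to Z.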
Software
The study of networks has increased in popularity over the last few years. Until recently most of the examples of network analysis came from the US and Australia, but this is changing. A recent study of community networks in New Cross by the Royal Society of Arts gained a lot of interest as did a paper for the same organisation by Paul Ormerod stressing the understanding of network effects in economics. In the field of Public Health, the publication of “Connected” by Christakis and Fowler has raised interest in the network effects in alcoholism, smoking, sexually transmitted diseases and obesity.
The software used to study such phenomena and measure the centrality of network nodes has generally been derived from academic models in the US. UCINET, PAJEK, GEPHI and AGNA are popular in the academic community. Commercially, Valdis Krebs’ INFLOW and Karen Stephenson’s NETFORM are protected and can only be used under licence. Generally these programs are PC based and read their data from spreadsheet input. However, we find that the act of drawing makes networks come alive in a way that spreadsheet input does not.
Most of the examples that you will find at the end of this document are created in yEd, a Java based program that is used to create diagrams by drawing. For the last year, we have been using a web based application called Kumu which allows the storage of information directly in drawn nodes and links and lets you use that information to:
- Structure the diagram - size of nodes, thickness of links etc
- Carry out complex searches to highlight and cluster nodes
Of course the drawing input of data is appropriate to a certain size of network. Up to around 20 nodes, you can pretty well sketch the diagram on a bit of paper and gauge the centrality of nodes by hand - although in a really connected network this can become more difficult. The number of possible links in a network is n(n-1)/2, so 20 nodes have a possible 190 links.
The number of possible links rises steeply as the number of nodes increases as shown below:
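- 20 nodes: 190 possible links
- 50 nodes: 1,225 possible links
- 100 nodes: 4,950 possible links
- 200 nodes: 19,900 possible links
- 1,000 nodes: 499,500 possible links
- 3,000 nodes: 4,498,500 possible links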
Up to around 200 nodes, drawn input is OK. Beyond that it becomes more difficult.
However, this 200 node upper limit suits most organisational situations. It may not work for the mapping of sizeable communities (the recent mapping of the New Cross community in London contained around 3,000 nodes). Most of the work that we have done with local organisations yields fewer than 200 nodes.
Network mapping questionnaire
The sheet above shows a simple questionnaire for eliciting information from an individual about:
- Which organisations / individuals they work with most or refer to for advice
- The characteristics of the cited organisations (including their own)
- The degree to which organisations share Skills and Resources
This allows us to build a network map, carry out a Social Network Analysis to identify the most central individuals / organisations and to compare assets held with their position in the network.
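As a minimal sketch of how questionnaire returns might be turned into a network map and set against the assets held, the example below uses invented organisations and assumed field names (`respondent`, `works_with`, `skills`). It is not the format of the actual questionnaire.

```python
from collections import Counter

# Hypothetical questionnaire returns; names and field names are assumptions.
RETURNS = [
    {"respondent": "Food Bank", "works_with": ["Library", "Council"],
     "skills": ["logistics"]},
    {"respondent": "Library", "works_with": ["Council"],
     "skills": ["training", "venue"]},
    {"respondent": "Youth Club", "works_with": ["Library"],
     "skills": ["outreach"]},
]


def build_links(returns):
    """Collapse citations into links; A citing B and B citing A count once."""
    links = set()
    for r in returns:
        for other in r["works_with"]:
            if other != r["respondent"]:
                links.add(frozenset((r["respondent"], other)))
    return links


def connection_counts(links):
    """Number of links each organisation takes part in."""
    counts = Counter()
    for link in links:
        for member in link:
            counts[member] += 1
    return counts


links = build_links(RETURNS)
counts = connection_counts(links)
skills = {r["respondent"]: r["skills"] for r in RETURNS}

# Most connected first, with the skills each one reported (if surveyed).
for name, n in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
    print(name, n, skills.get(name, []))
```

Organisations that are cited but have not themselves returned a questionnaire, such as the Council here, appear on the map with no recorded skills, which flags them as candidates for a follow-up survey.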
Recently, we have been using online services to collect network information and to link that with audits of assets held by the various individuals and organisations. Using the right web based system can allow material to be collected in the field by tablet or smart phone. The information is returned online to a central database that we can use to construct a network map.