Link Analysis_1_Basic Elements

1. Edge Attributes

1.1 Ways to categorize graphs

1.1.1 Three basic categories, by edge direction and by whether several edges may join the same pair of nodes:

import networkx as nx

G = nx.DiGraph() # 1. directed

G = nx.Graph() # 2. undirected

G = nx.MultiGraph() # 3. undirected, allows several parallel edges (layers of relationships) between two nodes
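
A minimal sketch of how the three types differ in practice. The node names and the `relation` values are made up for illustration.

```python
import networkx as nx

# Directed: the edge A -> B does not imply B -> A.
D = nx.DiGraph()
D.add_edge('A', 'B')
directed_back = D.has_edge('B', 'A')

# Undirected: one edge serves both directions.
U = nx.Graph()
U.add_edge('A', 'B')
undirected_back = U.has_edge('B', 'A')

# MultiGraph: two parallel edges between the same pair, one per relationship.
M = nx.MultiGraph()
M.add_edge('A', 'B', relation='family')
M.add_edge('A', 'B', relation='coworker')
parallel_count = M.number_of_edges('A', 'B')

print(directed_back, undirected_back, parallel_count)
```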

1.1.2 Categories by structure, e.g., bipartite graphs, whose nodes split into two groups with edges only between the groups:

from networkx.algorithms import bipartite

B = nx.Graph() # start with an empty graph; the two node sets are added next

B.add_nodes_from(['H', 'I', 'J', 'K', 'L'], bipartite = 0) # label the first group

B.add_nodes_from([7, 8, 9, 10], bipartite = 1) # label the second group

# add a list of edges at one time

B.add_edges_from([('H', 7), ('I', 7), ('J', 9),('K', 8), ('K', 10), ('L', 10)])
# Check whether the graph is bipartite
bipartite.is_bipartite(B)

A bipartite graph cannot contain a cycle with an odd number of nodes.
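
The odd-cycle rule can be checked on the example graph. The sketch below rebuilds B and then adds one extra edge, H-I (not part of the original example), which closes the 3-cycle H, I, 7.

```python
import networkx as nx
from networkx.algorithms import bipartite

# Rebuild the bipartite example graph from above.
B = nx.Graph()
B.add_nodes_from(['H', 'I', 'J', 'K', 'L'], bipartite=0)
B.add_nodes_from([7, 8, 9, 10], bipartite=1)
B.add_edges_from([('H', 7), ('I', 7), ('J', 9), ('K', 8), ('K', 10), ('L', 10)])
before = bipartite.is_bipartite(B)

# Hypothetical extra edge inside the first group: creates the odd cycle H-I-7.
B.add_edge('H', 'I')
after = bipartite.is_bipartite(B)

print(before, after)
```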

1.2 Edges can carry attributes:

G.add_edge('A', 'B', weight = 6, relation = 'family', sign = '+')

G.remove_edge('A', 'B') # remove edge

1.3 Access edges:

G.edges() # list of all edges

G.edges(data = True) # list of all edges with their attribute dictionaries

G.edges(data = 'relation') # list of all edges with only the 'relation' attribute
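
A small sketch of what these three calls return. The second edge (B-C with its attribute values) is invented for the example.

```python
import networkx as nx

G = nx.Graph()
G.add_edge('A', 'B', weight=6, relation='family', sign='+')
G.add_edge('B', 'C', weight=13, relation='friend', sign='-')  # hypothetical edge

all_edges = list(G.edges())
with_data = list(G.edges(data=True))
relations = list(G.edges(data='relation'))

# A single attribute of a single edge can be read directly.
weight_ab = G.edges['A', 'B']['weight']

print(all_edges)
print(relations)
print(weight_ab)
```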

2. Node Attributes

2.1 Nodes can carry attributes, such as a name.

G.add_node('A', name = 'Sophie')

G.add_node('B', name = 'Cumberbatch')

G.add_node('C', name = 'Miko') # pet dog

2.2 Access nodes:

G.nodes['A']['name'] # the older G.node accessor is no longer available in recent NetworkX releases
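
A short sketch of reading node attributes back, using `G.nodes` and `nx.get_node_attributes`, both of which exist in current NetworkX.

```python
import networkx as nx

G = nx.Graph()
G.add_node('A', name='Sophie')
G.add_node('B', name='Cumberbatch')
G.add_node('C', name='Miko')

name_a = G.nodes['A']['name']                  # one attribute of one node
all_names = nx.get_node_attributes(G, 'name')  # dict: node -> name
nodes_with_data = list(G.nodes(data=True))     # (node, attribute dict) pairs

print(name_a)
print(all_names)
```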

3. Network Connectivity

3.1 Triadic Closure: the tendency for people who share connections to become connected themselves, i.e., to cluster.

3.1.1 Local Clustering Coefficient

# local clustering is computed on a simple graph; nx.clustering is not defined for multigraphs

G = nx.Graph()

G.add_edges_from([('A', 'K'), ('A', 'B'), ('A', 'C'),
                  ('B', 'C'), ('B', 'K'), ('C', 'E'),
                  ('C', 'F'), ('D', 'E'), ('E', 'F'),
                  ('E', 'H'), ('F', 'G'), ('I', 'J')])

nx.clustering(G, 'A')

0.6666666666666666

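Check: 2 / C(3, 2) = 2 / 3. A has 3 neighbors (K, B, C), so there are 3 × 2 ÷ 2 = 3 possible pairs of neighbors, and 2 of them (K-B and B-C) are actually connected.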
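
The same ratio can be computed by hand. This is a sketch; `local_clustering` is a helper written for this note, not a NetworkX function.

```python
import itertools
import networkx as nx

G = nx.Graph()
G.add_edges_from([('A', 'K'), ('A', 'B'), ('A', 'C'), ('B', 'C'),
                  ('B', 'K'), ('C', 'E'), ('C', 'F'), ('D', 'E'),
                  ('E', 'F'), ('E', 'H'), ('F', 'G'), ('I', 'J')])

def local_clustering(graph, node):
    """Connected pairs of neighbors divided by all possible pairs of neighbors."""
    neighbors = list(graph[node])
    possible = len(neighbors) * (len(neighbors) - 1) / 2
    if possible == 0:
        return 0.0  # nodes with fewer than 2 neighbors get 0
    actual = sum(1 for u, v in itertools.combinations(neighbors, 2)
                 if graph.has_edge(u, v))
    return actual / possible

manual_a = local_clustering(G, 'A')
print(manual_a, nx.clustering(G, 'A'))
```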

3.1.2 Global Clustering Coefficient

# Method 1: Take average of all local clustering coefficients.

nx.average_clustering(G)

0.28787878787878785

# Method 2: Transitivity, the fraction of connected triads that are closed into triangles

# Triangle: 3 nodes connected by 3 edges

# Open triad: 3 nodes connected by only 2 edges

# Transitivity = (3 * number of triangles) / number of connected triads, where each triangle counts as 3 triads

nx.transitivity(G)

0.4090909090909091

Method 2 puts more weight on high-degree nodes.
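
A sketch of the transitivity formula on the example graph. Triads are counted as pairs of edges meeting at a node, so a node of degree d contributes d(d-1)/2 of them.

```python
import networkx as nx

G = nx.Graph()
G.add_edges_from([('A', 'K'), ('A', 'B'), ('A', 'C'), ('B', 'C'),
                  ('B', 'K'), ('C', 'E'), ('C', 'F'), ('D', 'E'),
                  ('E', 'F'), ('E', 'H'), ('F', 'G'), ('I', 'J')])

# nx.triangles counts triangles per node, so each triangle is counted 3 times.
n_triangles = sum(nx.triangles(G).values()) // 3

# Connected triads: pairs of edges sharing a center node.
n_triads = sum(d * (d - 1) // 2 for _, d in G.degree())

manual_transitivity = 3 * n_triangles / n_triads
print(n_triangles, n_triads, manual_transitivity)
```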

3.2 Distances

3.2.1 Single Pair Pattern:

Find the shortest path between two nodes and its length.

nx.shortest_path(G, 'A', 'H')

['A', 'C', 'E', 'H']

nx.shortest_path_length(G, 'A', 'H')

3.2.2 One Node to All Others Pattern:

Breadth-first search: discover nodes layer by layer, starting from the source node.

T = nx.bfs_tree(G, 'A')

T.edges() # to get the tree

OutEdgeView([('A', 'K'), ('A', 'B'), ('A', 'C'), ('C', 'E'), ('C', 'F'), ('E', 'D'), ('E', 'H'), ('F', 'G')])

nx.shortest_path_length(G, 'A') # get dictionary of distances from A to others

{'A': 0, 'K': 1, 'B': 1, 'C': 1, 'E': 2, 'F': 2, 'D': 3, 'H': 3, 'G': 3}
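
A sketch of breadth-first search written out by hand, to show where these distances come from. `bfs_distances` is a helper defined here for illustration.

```python
from collections import deque
import networkx as nx

G = nx.Graph()
G.add_edges_from([('A', 'K'), ('A', 'B'), ('A', 'C'), ('B', 'C'),
                  ('B', 'K'), ('C', 'E'), ('C', 'F'), ('D', 'E'),
                  ('E', 'F'), ('E', 'H'), ('F', 'G'), ('I', 'J')])

def bfs_distances(graph, source):
    """Return {node: hop count from source} for every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:       # first visit gives the shortest distance
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist

manual = bfs_distances(G, 'A')
print(manual)
```

Nodes I and J do not appear in the result because they cannot be reached from A.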

3.2.3 Measures of Distance Patterns

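# Average shortest-path length over all pairs of nodes (requires a connected graph)

nx.average_shortest_path_length(G)

# Maximum distance between any pair of nodes (requires a connected graph)

nx.diameter(G)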

Eccentricity of a node n is the largest distance between n and all other nodes.

Radius is the minimum eccentricity over all nodes.

Periphery is the set of nodes that have eccentricity equal to the diameter.

Center is the set of nodes with eccentricity equal to radius.

nx.eccentricity(G)
nx.radius(G)
nx.periphery(G)
nx.center(G)
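
The example graph from 3.1.1 is not connected (I and J form their own component), and these distance measures are only defined on a connected graph. One possible workaround, sketched below, is to restrict the graph to its largest connected component first. The name `H_main` is chosen here for illustration.

```python
import networkx as nx

G = nx.Graph()
G.add_edges_from([('A', 'K'), ('A', 'B'), ('A', 'C'), ('B', 'C'),
                  ('B', 'K'), ('C', 'E'), ('C', 'F'), ('D', 'E'),
                  ('E', 'F'), ('E', 'H'), ('F', 'G'), ('I', 'J')])

connected = nx.is_connected(G)

# Keep only the largest connected component.
largest = max(nx.connected_components(G), key=len)
H_main = G.subgraph(largest).copy()

diameter = nx.diameter(H_main)
radius = nx.radius(H_main)
center = nx.center(H_main)
periphery = nx.periphery(H_main)

print(connected, diameter, radius, center, sorted(periphery))
```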

3.2.4 Application

import numpy as np

import pandas as pd

# IPython/Jupyter magic; leave this line out when running as a plain Python script

%matplotlib notebook

# Instantiate the graph

G = nx.karate_club_graph()

nx.draw_networkx(G)

4. Connectivity

4.1 Connectivity in Undirected Graphs

# find the number of communities (connected components)

nx.number_connected_components(G)

# list them

sorted(nx.connected_components(G))

# find the connected component to which 'M' belongs

nx.node_connected_component(G, 'M')
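
The graph containing node 'M' is not shown in these notes, so the sketch below runs the same calls on the example graph from 3.1.1 and looks up node 'I' instead.

```python
import networkx as nx

G = nx.Graph()
G.add_edges_from([('A', 'K'), ('A', 'B'), ('A', 'C'), ('B', 'C'),
                  ('B', 'K'), ('C', 'E'), ('C', 'F'), ('D', 'E'),
                  ('E', 'F'), ('E', 'H'), ('F', 'G'), ('I', 'J')])

n_components = nx.number_connected_components(G)

# Sort by size so the largest component comes first.
components = sorted(nx.connected_components(G), key=len, reverse=True)

component_of_i = nx.node_connected_component(G, 'I')

print(n_components, components, component_of_i)
```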

4.2 Connectivity in Directed Graphs

# find the strongly connected components: subsets in which every node has a directed path to

# every other node, and to which no further node can be added without breaking that property

sorted(nx.strongly_connected_components(G))
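
A sketch on a small made-up directed graph: A, B and C form a directed cycle, while D can be reached from the cycle but has no path back.

```python
import networkx as nx

G = nx.DiGraph()
G.add_edges_from([('A', 'B'), ('B', 'C'), ('C', 'A'), ('C', 'D')])  # hypothetical edges

strong = sorted(nx.strongly_connected_components(G), key=len, reverse=True)
strongly = nx.is_strongly_connected(G)   # every node reaches every other node?
weakly = nx.is_weakly_connected(G)       # connected once directions are ignored?

print(strong, strongly, weakly)
```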

5. Network Robustness

5.1 Definition: the ability of a network to maintain its general structural properties (such as connectivity) when faced with attacks (removal of edges or nodes).

# smallest number of nodes that must be removed to disconnect the graph

nx.node_connectivity(G_un)

# which nodes

nx.minimum_node_cut(G_un)

# smallest number of edges needed to disconnect

nx.edge_connectivity(G_un)

# which edges

nx.minimum_edge_cut(G_un)

5.2 Node Connectivity

# all ways to deliver a message from 'G' to 'L'

sorted(nx.all_simple_paths(G, 'G', 'L'))

# to block every such path, how many nodes need to be removed

nx.node_connectivity(G, 'G', 'L')

# which nodes

nx.minimum_node_cut(G, 'G', 'L')

5.3 Edge Connectivity

# how many edges need to be removed to block every path from 'G' to 'L'

nx.edge_connectivity(G, 'G', 'L')

# which edges

nx.minimum_edge_cut(G, 'G', 'L')
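
Neither `G_un` nor the graph containing 'G' and 'L' is defined in these notes, so the sketch below uses a made-up graph: two triangles joined by the single edge 3-4. Removing that edge, or either of its endpoints, disconnects the graph.

```python
import networkx as nx

# Hypothetical graph: triangle 1-2-3 and triangle 4-5-6 joined by the bridge 3-4.
G = nx.Graph()
G.add_edges_from([(1, 2), (1, 3), (2, 3), (3, 4), (4, 5), (4, 6), (5, 6)])

# Whole-graph robustness.
k_nodes = nx.node_connectivity(G)
node_cut = nx.minimum_node_cut(G)
k_edges = nx.edge_connectivity(G)
edge_cut = nx.minimum_edge_cut(G)

# Robustness of the connection between one pair of nodes.
paths_1_6 = sorted(nx.all_simple_paths(G, 1, 6))
k_nodes_1_6 = nx.node_connectivity(G, 1, 6)
node_cut_1_6 = nx.minimum_node_cut(G, 1, 6)
k_edges_1_6 = nx.edge_connectivity(G, 1, 6)

print(k_nodes, node_cut, k_edges, edge_cut)
print(paths_1_6)
```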

6. Centrality

6.1 Degree Centrality

6.1.1 Undirected Network

G = nx.karate_club_graph()

G = nx.convert_node_labels_to_integers(G, first_label = 1)

degCent = nx.degree_centrality(G)

degCent[34]

0.5151515151515151

6.1.2 Directed Network

# G must be a directed graph (nx.DiGraph) here

indegCent = nx.in_degree_centrality(G)

outdegCent = nx.out_degree_centrality(G)
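
A sketch on a small made-up directed graph. Both measures divide a node's in-degree or out-degree by n - 1, the largest value it could take.

```python
import networkx as nx

G = nx.DiGraph()
G.add_edges_from([('A', 'B'), ('A', 'C'), ('B', 'C'), ('D', 'C')])  # hypothetical edges

indegCent = nx.in_degree_centrality(G)
outdegCent = nx.out_degree_centrality(G)

# C receives edges from all 3 other nodes; A sends edges to 2 of the 3 others.
print(indegCent['C'], outdegCent['A'])
```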

6.2 Closeness Centrality

6.2.1 Calculation: a node is central if it is a short distance away from all other nodes.

closeCent = nx.closeness_centrality(G)

closeCent[34]

0.55

sum(nx.shortest_path_length(G, 34).values()) # total distance from node 34 to all other nodes

60

# Closeness is equivalent to (number of nodes - 1) / total distance

(len(G.nodes()) - 1)/60

0.55
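
A sketch that recomputes closeness for node 34 by hand, as (number of nodes - 1) divided by the sum of its shortest-path distances, and compares it with NetworkX.

```python
import networkx as nx

G = nx.karate_club_graph()
G = nx.convert_node_labels_to_integers(G, first_label=1)

# Sum of hop-count distances from node 34 to every node (itself contributes 0).
total_distance = sum(nx.shortest_path_length(G, 34).values())

manual_closeness = (len(G) - 1) / total_distance
library_closeness = nx.closeness_centrality(G)[34]

print(total_distance, manual_closeness, library_closeness)
```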

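6.2.2 Measuring Disconnected Nodes

Method One

# consider only the nodes reachable from the node; a node whose only reachable node is one direct neighbor gets closeness 1

# (this option was called normalized = False in NetworkX 1.x; current releases use wf_improved)

nx.closeness_centrality(G, wf_improved = False)

Method Two

# additionally scale by the fraction of nodes reached, i.e., (reachable nodes) / (total nodes - 1)

nx.closeness_centrality(G, wf_improved = True)

0.071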
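
The graph behind the 0.071 value is not shown, so the sketch below builds a hypothetical one that gives the same numbers: 15 nodes in total, where 'L' is connected only to 'M' and the other 13 nodes form a separate path. It assumes a NetworkX release in which the scaling option is named `wf_improved`.

```python
import networkx as nx

G = nx.Graph()
G.add_edge('L', 'M')                                 # isolated pair
G.add_edges_from((i, i + 1) for i in range(12))      # separate path on nodes 0..12

# Method One: only reachable nodes count, so L (one neighbor at distance 1) gets 1.
unscaled = nx.closeness_centrality(G, u='L', wf_improved=False)

# Method Two: multiply by reachable / (n - 1) = 1 / 14.
scaled = nx.closeness_centrality(G, u='L', wf_improved=True)

print(len(G), unscaled, round(scaled, 3))
```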

6.3 Betweenness Centrality (computationally expensive)

Essence: find the nodes that show up on many of the shortest paths between pairs of other nodes.
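
A sketch on a made-up 5-node path A-B-C-D-E, where the counts are easy to verify by hand: C lies on the shortest paths of 4 pairs (A-D, A-E, B-D, B-E), and normalizing divides by the (n-1)(n-2)/2 = 6 pairs that do not include C.

```python
import networkx as nx

G = nx.Graph()
G.add_edges_from([('A', 'B'), ('B', 'C'), ('C', 'D'), ('D', 'E')])  # hypothetical path

raw = nx.betweenness_centrality(G, normalized=False, endpoints=False)
norm = nx.betweenness_centrality(G, normalized=True, endpoints=False)

print(raw)
print(norm['C'])
```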

6.3.1 Method One: Use all 34 nodes in karate club

btwnCent = nx.betweenness_centrality(G,normalized = True, endpoints = False)

import operator

sorted(btwnCent.items(), key = operator.itemgetter(1), reverse = True)[0:5]

[(1, 0.43763528138528146),

 (34, 0.30407497594997596),

 (33, 0.145247113997114),

 (3, 0.14365680615680618),

 (32, 0.13827561327561325)]

6.3.2 Method Two: Approximate using a random sample of 10 nodes

btwnCent_approx = nx.betweenness_centrality(G,normalized = True, endpoints = False, k = 10)

sorted(btwnCent_approx.items(), key = operator.itemgetter(1), reverse = True)[0:5]

[(1, 0.3674031986531986),

 (34, 0.3048388648388649),

 (32, 0.17290028258778256),

 (3, 0.13572044853294854),

 (33, 0.130249518999519)]
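
Because the 10 source nodes are sampled at random, the approximate values change from run to run. A sketch of making the approximation repeatable with the `seed` argument of `nx.betweenness_centrality`; the seed value 0 is arbitrary.

```python
import operator
import networkx as nx

G = nx.karate_club_graph()
G = nx.convert_node_labels_to_integers(G, first_label=1)

# Same seed, same sample of 10 nodes, same result.
run_1 = nx.betweenness_centrality(G, normalized=True, endpoints=False, k=10, seed=0)
run_2 = nx.betweenness_centrality(G, normalized=True, endpoints=False, k=10, seed=0)

top5 = sorted(run_1.items(), key=operator.itemgetter(1), reverse=True)[0:5]
print(top5)
```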

6.3.3 Method Three: Specify subsets

btwnCent_subset = nx.betweenness_centrality_subset(G,

                                                  [34, 33, 21, 30, 16, 27, 15, 23, 10],

                                                  [1, 4, 13, 11, 6, 12, 17, 7],

                                                  normalized = True)

sorted(btwnCent_subset.items(), key = operator.itemgetter(1), reverse = True)[0:5]

[(1, 0.04899515993265994),

 (34, 0.028807419432419434),

 (3, 0.018368205868205867),

 (33, 0.01664712602212602),

 (9, 0.014519450456950456)]

6.3.4 Method Four: Edges

btwnCent_edge = nx.edge_betweenness_centrality(G, normalized = True)

sorted(btwnCent_edge.items(), key = operator.itemgetter(1), reverse = True)[0:5]

# node 1 is the instructor of the club

[((1, 32), 0.1272599949070537),

 ((1, 7), 0.07813428401663695),

 ((1, 6), 0.07813428401663694),

 ((1, 3), 0.0777876807288572),

 ((1, 9), 0.07423959482783014)]

btwnCent_edge_subset = nx.edge_betweenness_centrality_subset(G,

                                                            [34, 33, 21, 30, 16, 27, 15, 23, 10],

                                                            [1, 4, 13, 11, 6, 12, 17, 7],

                                                            normalized = True)

sorted(btwnCent_edge_subset.items(), key = operator.itemgetter(1), reverse = True)[0:5]

[((1, 9), 0.01366536513595337),

 ((1, 32), 0.01366536513595337),

 ((14, 34), 0.012207509266332794),

 ((1, 3), 0.01211343123107829),

 ((1, 6), 0.012032085561497326)]
