COMP2521: Assignment
COMP2521: Assignment 2
Social Network Analysis
A notice on the class web page will be posted after each major revision. Please check the class notice
board and this assignment page frequently (for Change Log). The specification may change.
FAQ:
You should check Ass2 FAQ, it may offer answers to your queries!
Change log:
No entries as yet!
Objectives
to implement graph based data analysis functions (ADTs) to mine a given social network.
to give you further practice with C and data structures (Graph ADT)
Admin
Marks: 20 marks (scaled to 14 marks towards total course mark)
Individual Assignment: This assignment is an individual assignment.
Due: 08:00pm Friday 22 November 2019
Late Penalty: 2 marks per day off the ceiling. Last day to submit this assignment is 8pm Monday 25 November 2019, of course with late penalty.
Submit: TBA
Aim
In this assignment, your task is to implement graph based data analysis functions (ADTs) to mine a given
social network, for example to detect "influencers", "followers", "communities", etc. in a given social
network. You should start by reading the Wikipedia entries on these topics. Later I will also discuss these
topics in the lecture.
Social network analysis
Centrality
The main focus of this assignment is to calculate measures that could identify "influencers",
"followers", etc., and also discover possible "communities" in a given social network.
Dos and Don'ts !
Please note that,
For this assignment you can use source code that is available as part of the course material (lectures,
exercises, tutes and labs). However, you must properly acknowledge it in your solution.
All the required code for each part must be in the respective *.c file.
You may implement additional helper functions in your files, please declare them as "static"
functions.
After implementing Dijkstra.h, you can use this ADT for other tasks in the assignment. However,
please note that for our testing, we will use/supply our implementation of Dijkstra.h. So your
programs MUST NOT use any implementation related information that is not available in the
respective header files (*.h files). In other words, you can only use information available in the
corresponding *.h files.
Your program must not have any "main" function in any of the submitted files.
Do not submit any other files. For example, you do not need to submit your modified test files or
*.h files.
If you have not implemented any part, you must still submit an empty file with the corresponding file
name.
Provided Files
We are providing implementations of Graph.h and PQ.h . You can use them to implement all three parts.
However, your programs MUST NOT use any implementation related information that is not available in
the respective header files (*.h files). In other words, you can only use information available in the
corresponding *.h files.
Also note:
all edge weights will be greater than zero.
we will not be testing reflexive and/or self-loop edges.
we will not be testing the case where the same edge is inserted twice.
Download files:
Ass2_files.zip
Ass2_Testing.zip
Part-1: Dijkstra's algorithm
In order to discover "influencers", we need to repeatedly find shortest paths between all pairs of
nodes. In this section, you need to implement Dijkstra's algorithm to discover shortest paths from a given
source to all other nodes in the graph. The function offers one important additional feature: it
keeps track of multiple predecessors for a node on shortest paths from the source, if they exist. In the
following example, while discovering shortest paths from source node '0', we discovered that there are
two possible shortest paths from node '0' to node '1' (0->1 OR 0->2->1), so node '1' has two possible
predecessors (node '0' or node '2') on possible shortest paths, as shown below.
We will discuss this point in detail in a lecture. The basic idea is, the array of lists ("pred") keeps one
linked list per node, and stores multiple predecessors (if they exist) for that node on shortest paths from a
given source. In other words, for a given source, each linked list in "pred" offers possible predecessors for
the corresponding node.
Node 0
Distance
0 : X
1 : 2
2 : 1
Preds
0 : NULL
1 : [0]->[2]->NULL
2 : [0]->NULL
Node 1
Distance
0 : 2
1 : X
2 : 3
Preds
0 : [1]->NULL
1 : NULL
2 : [0]->NULL
Node 2
Distance
0 : 3
1 : 1
2 : X
Preds
0 : [1]->NULL
1 : [2]->NULL
2 : NULL
The function returns a 'ShortestPaths' structure with the required information (i.e. the 'distance' array,
the 'predecessor' arrays, the source and no_of_nodes in the graph).
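The idea of keeping every predecessor that lies on some shortest path can be sketched as follows. This is only an illustration under stated assumptions: it uses a plain adjacency matrix (0 meaning "no edge") instead of the provided Graph.h and PQ.h, and the names `MAX_V`, `PredNode`, `clearPreds`, `appendPred` and `dijkstraAllPreds` are hypothetical, not taken from Dijkstra.h.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

#define MAX_V 16  /* hypothetical cap on vertices for this sketch */

/* Hypothetical predecessor list node; the real type is declared in Dijkstra.h. */
typedef struct PredNode {
    int v;
    struct PredNode *next;
} PredNode;

/* Free every node in a predecessor list and leave it empty. */
static void clearPreds(PredNode **list) {
    while (*list != NULL) {
        PredNode *tmp = *list;
        *list = tmp->next;
        free(tmp);
    }
}

/* Append v to the end of a predecessor list. */
static void appendPred(PredNode **list, int v) {
    PredNode *n = malloc(sizeof *n);
    if (n == NULL) return;  /* sketch only: allocation failure is ignored */
    n->v = v;
    n->next = NULL;
    while (*list != NULL) list = &(*list)->next;
    *list = n;
}

/*
 * O(V^2) Dijkstra over an adjacency matrix w (0 means "no edge").
 * On return, dist[v] is the shortest distance from src (INT_MAX if
 * unreachable) and pred[v] lists every predecessor of v that lies on
 * some shortest path from src.
 */
void dijkstraAllPreds(int nV, int w[MAX_V][MAX_V], int src,
                      int dist[], PredNode *pred[]) {
    bool done[MAX_V] = { false };
    for (int v = 0; v < nV; v++) {
        dist[v] = INT_MAX;
        pred[v] = NULL;
    }
    dist[src] = 0;

    for (int iter = 0; iter < nV; iter++) {
        /* Pick the closest reachable vertex that is not finalised yet. */
        int u = -1;
        for (int v = 0; v < nV; v++) {
            if (!done[v] && dist[v] != INT_MAX &&
                (u == -1 || dist[v] < dist[u])) u = v;
        }
        if (u == -1) break;  /* everything left is unreachable */
        done[u] = true;

        for (int v = 0; v < nV; v++) {
            if (w[u][v] == 0 || done[v]) continue;
            int alt = dist[u] + w[u][v];
            if (alt < dist[v]) {
                /* Strictly shorter path: old predecessors are obsolete. */
                dist[v] = alt;
                clearPreds(&pred[v]);
                appendPred(&pred[v], u);
            } else if (alt == dist[v]) {
                /* Equally short path: keep u as an extra predecessor. */
                appendPred(&pred[v], u);
            }
        }
    }
}
```

Run from source 0 on a graph with edges 0->1 (weight 2), 0->2 (weight 1), 2->1 (weight 1) and 1->0 (weight 2), this sketch gives node 1 the predecessor list [0]->[2], consistent with the "Node 0" block of the example above.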
Your task: In this section, you need to implement the following file:
Dijkstra.c that implements all the functions defined in Dijkstra.h.
Part-2: Centrality Measures for Social Network Analysis
Centrality measures play a very important role in analysing a social network. For example, nodes with
a higher "betweenness" measure often correspond to "influencers" in the given social network. In this part
you will implement two well-known centrality measures for a given directed weighted graph.
Descriptions of some of the following items are from Wikipedia at Centrality, adapted for this assignment.
Closeness Centrality
Closeness centrality (or closeness) of a node is calculated from the sum of the lengths of the shortest paths
between the node ($x$) and all other nodes ($y$) in the graph. Generally closeness is defined as below,
$$C(x) = \frac{1}{\sum_{y} d(y, x)}$$
where $d(y, x)$ is the shortest distance between vertices $x$ and $y$.
However, considering most likely we will have isolated nodes, for this assignment you need to use the
Wasserman and Faust formula to calculate closeness of a node in a directed graph as described below:
$$C_{WF}(u) = \frac{n - 1}{N - 1} \cdot \frac{n - 1}{\sum_{v \text{ reachable from } u} d(u, v)}$$
where $d(u, v)$ is the shortest-path distance in a directed graph from vertex $u$ to $v$, $n$ is the number of nodes that $u$ can
reach, and $N$ denotes the number of nodes in the graph.
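A minimal sketch of the Wasserman and Faust calculation for one node, given the shortest distances from that node, might look like this. It assumes that unreachable nodes are marked with `INT_MAX` and that n counts the source node itself (so that n - 1 is the number of other reachable nodes); both are assumptions of this sketch, so check them against the "Explanations for Part-2" document. The function name `closenessWF` is hypothetical.

```c
#include <limits.h>

/*
 * Wasserman and Faust closeness of the source vertex of dist[]:
 *   ((n - 1) / (N - 1)) * ((n - 1) / sum of d(src, v) over reachable v)
 * dist[v] is the shortest distance from src to v, with INT_MAX marking
 * unreachable vertices (an assumption of this sketch).
 * n counts the vertices src can reach, including src itself.
 */
double closenessWF(const int dist[], int nV, int src) {
    int n = 0;
    double sum = 0.0;
    for (int v = 0; v < nV; v++) {
        if (v == src) { n++; continue; }      /* src always reaches itself */
        if (dist[v] == INT_MAX) continue;     /* unreachable: not counted */
        n++;
        sum += dist[v];
    }
    /* An isolated node (or a one-node graph) gets closeness 0. */
    if (nV <= 1 || sum == 0.0) return 0.0;
    return ((double)(n - 1) / (nV - 1)) * ((double)(n - 1) / sum);
}
```

For distances {0, 2, 1} from source 0 in a three-node graph, n = 3 and the sum is 3, so the result is (2/2) * (2/3), roughly 0.667. A node that reaches nothing returns 0 instead of dividing by zero.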
For further explanations, please read the following document, it may answer many of your questions!
Explanations for Part-2
Based on the above, the more central a node is, the closer it is to all other nodes. For more information, see the
Wikipedia entry on Closeness centrality.
Betweenness Centrality
The betweenness centrality of a node $v$ is given by the expression:
$$g(v) = \sum_{s \neq v \neq t} \frac{\sigma_{st}(v)}{\sigma_{st}}$$
where $\sigma_{st}$ is the total number of shortest paths from node $s$ to node $t$ and $\sigma_{st}(v)$ is the number of those paths that pass
through $v$.
For this assignment, use the following approach to calculate normalised betweenness centrality. It is
easier, and it also avoids a zero denominator (for $n > 2$).
$$\text{normal}(g(v)) = \frac{g(v)}{(n - 1)(n - 2)}$$
where $n$ represents the number of nodes in the graph.
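One possible way to compute this, sketched under stated assumptions, is to count shortest paths from every source and then use the fact that the number of shortest s-to-t paths through v equals (paths s to v) times (paths v to t) whenever d(s,v) + d(v,t) = d(s,t). The sketch uses an adjacency matrix (0 meaning "no edge") rather than the provided Graph.h or the pred lists from Part-1, and `MAX_V`, `countPaths` and `normalisedBetweenness` are hypothetical names. It divides g(v) by (n-1)(n-2) and returns 0 when n <= 2.

```c
#include <limits.h>
#include <stdbool.h>

#define MAX_V 16  /* hypothetical cap on vertices for this sketch */

/*
 * For one source, fill dist[] with shortest distances (INT_MAX if
 * unreachable) and sigma[] with the number of distinct shortest paths.
 * w is an adjacency matrix where 0 means "no edge"; weights are positive.
 */
static void countPaths(int nV, int w[MAX_V][MAX_V], int src,
                       int dist[], double sigma[]) {
    bool done[MAX_V] = { false };
    for (int v = 0; v < nV; v++) { dist[v] = INT_MAX; sigma[v] = 0.0; }
    dist[src] = 0;
    sigma[src] = 1.0;
    for (int iter = 0; iter < nV; iter++) {
        int u = -1;
        for (int v = 0; v < nV; v++) {
            if (!done[v] && dist[v] != INT_MAX &&
                (u == -1 || dist[v] < dist[u])) u = v;
        }
        if (u == -1) break;
        done[u] = true;
        for (int v = 0; v < nV; v++) {
            if (w[u][v] == 0 || done[v]) continue;
            int alt = dist[u] + w[u][v];
            if (alt < dist[v]) {           /* new best: inherit path count */
                dist[v] = alt;
                sigma[v] = sigma[u];
            } else if (alt == dist[v]) {   /* tie: add the extra paths */
                sigma[v] += sigma[u];
            }
        }
    }
}

/* Normalised betweenness of v: g(v) / ((n-1)(n-2)), or 0 when n <= 2. */
double normalisedBetweenness(int nV, int w[MAX_V][MAX_V], int v) {
    static int dist[MAX_V][MAX_V];
    static double sigma[MAX_V][MAX_V];
    for (int s = 0; s < nV; s++) countPaths(nV, w, s, dist[s], sigma[s]);

    double g = 0.0;
    for (int s = 0; s < nV; s++) {
        for (int t = 0; t < nV; t++) {
            if (s == v || t == v || s == t) continue;
            if (dist[s][t] == INT_MAX || dist[s][v] == INT_MAX ||
                dist[v][t] == INT_MAX) continue;
            /* v lies on a shortest s-t path only if the distances add up. */
            if (dist[s][v] + dist[v][t] != dist[s][t]) continue;
            g += sigma[s][v] * sigma[v][t] / sigma[s][t];
        }
    }
    if (nV <= 2) return 0.0;
    return g / ((double)(nV - 1) * (nV - 2));
}
```

On the three-node graph with edges 0->1 (weight 2), 0->2 (weight 1), 2->1 (weight 1) and 1->0 (weight 2), node 2 lies on one of the two shortest paths from 0 to 1, so g(2) = 0.5 and the normalised value is 0.25.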
For further explanations, please read the following document, it may answer many of your questions!
Explanations for Part-2
Your task: In this section, you need to implement the following file:
CentralityMeasures.c that implements all the functions defined in CentralityMeasures.h.
For more information, see Wikipedia entry on Betweenness centrality
Part-3: Discovering Community
In this part you need to implement the Hierarchical Agglomerative Clustering (HAC) algorithm to
discover communities in a given graph. In particular, you need to implement Lance-Williams algorithm,
as described below. In the lecture we will discuss how this algorithm works, and what you need to do to
implement it. You may find the following document/video useful for this part:
Hierarchical Clustering (Wikipedia), for this assignment we are interested in only "agglomerative"
approach.
Brief overview of algorithms for hierarchical clustering, including Lance-Williams approach (pdf
file).
Three videos by Victor Lavrenko, watch in sequence!
Agglomerative Clustering: how it works
Hierarchical Clustering 3: single-link vs. complete-link
Hierarchical Clustering 4: the Lance-Williams algorithm
Distance measure: For this assignment, we calculate distance between a pair of vertices as follows: Let
$wt$ represent the maximum edge weight of all available weighted edges between a pair of vertices $v$ and $w$.
Distance $d$ between vertices $v$ and $w$ is defined as $d = 1 / wt$. If $v$ and $w$ are not connected, $d$ is infinite.
For example, if there is one directed link between $v$ and $w$ with weight $wt$, the distance between them is $1 / wt$. If
there are two links between $v$ and $w$, we take the maximum of the two weights and the distance between them
is $1 / \max(wt_{vw}, wt_{wv})$. Please note that one can also consider alternative approaches, like taking the average, min, etc. However,
we need to pick one approach for this assignment and we will use the above distance measure.
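As a small illustration of this distance measure, the sketch below takes the weights of the (up to two) directed edges between a pair of vertices and returns 1 / (maximum weight). Using 0 to mean "no edge" is an assumption of the sketch (safe here because all edge weights are greater than zero), and `vertexDistance` is a hypothetical name.

```c
#include <math.h>  /* INFINITY */

/*
 * Distance between vertices v and w from the weights of the directed
 * edges v->w and w->v. A weight of 0 means "no edge" in this sketch.
 * Returns 1 / max(weight), or INFINITY if the vertices are not connected.
 */
double vertexDistance(int wtVW, int wtWV) {
    int maxWt = (wtVW > wtWV) ? wtVW : wtWV;
    if (maxWt <= 0) return INFINITY;  /* not connected in either direction */
    return 1.0 / maxWt;
}
```

For example, a single link of weight 4 gives a distance of 0.25, and two links of weights 2 and 5 give 1/5 = 0.2.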
You need to use the following (adapted) Lance-Williams HAC Algorithm to derive a dendrogram:
Calculate distances between each pair of vertices as described above.
Create clusters for every vertex $i$, say $c_i$.
Let $Dist(c_i, c_j)$ represent the distance between cluster $c_i$ and $c_j$; initially it represents the distance between vertex $i$ and
$j$.
For k = 1 to N-1
    Find two closest clusters, say $c_i$ and $c_j$. If there are multiple alternatives, you can select any one
    of the pairs of closest clusters.
    Remove clusters $c_i$ and $c_j$ from the collection of clusters and add a new cluster $c_{ij}$ (with all vertices
    in $c_i$ and $c_j$) to the collection of clusters.
    Update dendrogram.
    Update distances, say $Dist(c_{ij}, c_k)$, between the newly added cluster $c_{ij}$ and the rest of the clusters ($c_k$) in the
    collection using Lance-Williams formula using the selected method ('Single linkage' or
    'Complete linkage' - see below).
End For
Return dendrogram
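The merge loop above can be sketched on a plain distance matrix as follows. This is a rough illustration, not the required implementation: it records each merge in a hypothetical `Merge` array instead of building the dendrogram type from LanceWilliamsHAC.h, it reuses the slot of one merged cluster for the new cluster, and it applies the single and complete linkage updates in their min and max forms. The names `MAX_V`, `Merge` and `hacMerges`, and the `method` encoding, are assumptions.

```c
#include <math.h>     /* INFINITY, used by callers for "not connected" */
#include <stdbool.h>

#define MAX_V 16  /* hypothetical cap on vertices for this sketch */

/* Hypothetical record of one merge step: slots a and b merged at dist. */
typedef struct {
    int a, b;
    double dist;
} Merge;

/*
 * Runs N-1 merge steps on dist (symmetric; INFINITY = not connected).
 * method: 1 = single linkage (min), 2 = complete linkage (max).
 * The merged cluster reuses slot a; slot b is retired.
 * Returns the number of merges written to out (nV - 1 for nV >= 1).
 */
int hacMerges(int nV, double dist[MAX_V][MAX_V], int method, Merge out[]) {
    bool active[MAX_V];
    for (int i = 0; i < nV; i++) active[i] = true;

    int count = 0;
    for (int k = 1; k < nV; k++) {
        /* Find the closest pair of active clusters (first one on ties). */
        int a = -1, b = -1;
        for (int i = 0; i < nV; i++) {
            if (!active[i]) continue;
            for (int j = i + 1; j < nV; j++) {
                if (!active[j]) continue;
                if (a == -1 || dist[i][j] < dist[a][b]) { a = i; b = j; }
            }
        }
        out[count].a = a;
        out[count].b = b;
        out[count].dist = dist[a][b];
        count++;

        /* Update distances from the merged cluster to every other cluster. */
        for (int c = 0; c < nV; c++) {
            if (!active[c] || c == a || c == b) continue;
            double dac = dist[a][c], dbc = dist[b][c];
            double d = (method == 1) ? (dac < dbc ? dac : dbc)
                                     : (dac > dbc ? dac : dbc);
            dist[a][c] = d;
            dist[c][a] = d;
        }
        active[b] = false;  /* slot b no longer represents a cluster */
    }
    return count;
}
```

A real solution would build dendrogram nodes at the point where this sketch writes to `out`; the pair search and the distance update are the parts that carry over.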
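Lance-Williams formula:
$$Dist(c_{ij}, c_k) = \alpha_i \, Dist(c_i, c_k) + \alpha_j \, Dist(c_j, c_k) + \beta \, Dist(c_i, c_j) + \gamma \, |Dist(c_i, c_k) - Dist(c_j, c_k)|$$
where $\alpha_i$, $\alpha_j$, $\beta$, and $\gamma$ define the agglomerative criterion.
For the Single link method, these values are: $\alpha_i = 1/2$, $\alpha_j = 1/2$, $\beta = 0$, and $\gamma = -1/2$. Using these values, the formula for Single link
method is:
$$Dist(c_{ij}, c_k) = \tfrac{1}{2} Dist(c_i, c_k) + \tfrac{1}{2} Dist(c_j, c_k) - \tfrac{1}{2} |Dist(c_i, c_k) - Dist(c_j, c_k)|$$
We can simplify the above and re-write the formula for Single link method as below
$$Dist(c_{ij}, c_k) = \min(Dist(c_i, c_k), Dist(c_j, c_k))$$
For the Complete link method, the values are: $\alpha_i = 1/2$, $\alpha_j = 1/2$, $\beta = 0$, and $\gamma = 1/2$. Using these values, the formula for Complete link
method is:
$$Dist(c_{ij}, c_k) = \tfrac{1}{2} Dist(c_i, c_k) + \tfrac{1}{2} Dist(c_j, c_k) + \tfrac{1}{2} |Dist(c_i, c_k) - Dist(c_j, c_k)|$$
We can simplify the above and re-write the formula for Complete link method as below
$$Dist(c_{ij}, c_k) = \max(Dist(c_i, c_k), Dist(c_j, c_k))$$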
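To see that the simplified forms agree with the general formula, here is a short sketch of both. It assumes the standard Lance-Williams coefficients for these two methods (both alphas 1/2, beta 0, gamma -1/2 for single link and +1/2 for complete link); the function names are hypothetical.

```c
/*
 * General Lance-Williams update for Dist(c_ij, c_k).
 * dik = Dist(c_i, c_k), djk = Dist(c_j, c_k), dij = Dist(c_i, c_j);
 * ai, aj, beta and gamma are the coefficients of the chosen method.
 */
double lanceWilliams(double dik, double djk, double dij,
                     double ai, double aj, double beta, double gamma) {
    double diff = dik - djk;
    if (diff < 0) diff = -diff;  /* |dik - djk| without needing libm */
    return ai * dik + aj * djk + beta * dij + gamma * diff;
}

/* Simplified single-link update: the smaller of the two distances. */
double singleLink(double dik, double djk) {
    return dik < djk ? dik : djk;
}

/* Simplified complete-link update: the larger of the two distances. */
double completeLink(double dik, double djk) {
    return dik > djk ? dik : djk;
}
```

With dik = 0.5 and djk = 0.25, the general formula gives 0.25 for single link and 0.5 for complete link, matching the min and max. The simplified forms are also the safer choice when distances can be infinite: with IEEE 754 floating point, evaluating the general formula on two infinite distances involves infinity minus infinity, which is NaN, whereas min and max simply return infinity.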
Please see the following simple example, it may answer many of your questions!
Part-3 Simple Example (MS Excel file)
Your task: In this section, you need to implement the following file:
LanceWilliamsHAC.c that implements all the functions defined in LanceWilliamsHAC.h.
Assessment Criteria
Part-1: Dijkstra's algorithm (20% marks)
Part-2:
Closeness Centrality (22% marks),
Betweenness Centrality (23% marks)
Part-3: Discovering Community (15% marks)
Style, Comments and Complexity: 20%
Testing
Please note that testing an API implementation is a crucial part of designing and
implementing an API. We offer the following testing interfaces (for all the APIs you need to implement)
for you to get started; however, note that they only test basic cases. Importantly,
you need to add more advanced test cases and properly test your API implementations,
the auto-marking program will use more advanced test cases that are not included in the test cases
provided to you.
Instructions on how to test your API implementations are available on the following page:
Testing your API Implementations
Submission
You need to submit the following three files:
Dijkstra.c
CentralityMeasures.c
LanceWilliamsHAC.c
Submission instructions on how to submit the above three files will be available later.
Plagiarism
This is an individual assignment. Each student will have to develop their own solution without help from
other people. You are not permitted to exchange code or pseudocode. If you have questions about the
assignment, ask your tutor. All work submitted for assessment must be entirely your own work. We regard
unacknowledged copying of material, in whole or part, as an extremely serious offence. For further
information, read the Course Outline.