Random Graph Models of works:社交网络讲义的随机图模型
- 格式:ppt
- 大小:955.00 KB
- 文档页数:21
sbm模型松弛度量方法-回复sbm模型是社交网络分析中常用的一种模型,用于将网络中的节点分成若干个社区。
而松弛度量方法则是一种通过衡量社区结构之间的相似度来评估分区质量的方法。
本文将会详细介绍sbm模型的原理和应用,并以松弛度量方法作为评估指标,一步一步回答该主题。
第一部分:sbm模型的原理和应用1.1 sbm模型的原理sbm模型全称为Stochastic Block Model,是一种随机图模型,在社交网络中被广泛应用于社区发现。
它基于一个假设,即网络中的节点可以被划分为若干个社区,并且同一社区的节点具有相似的连接概率,不同社区之间的连接概率较低。
1.2 sbm模型的应用sbm模型在社交网络分析中有许多应用,例如:- 社区发现:通过将网络中的节点划分为不同的社区,sbm模型可以帮助我们了解网络中的群组结构和关联关系,为后续的行为预测、目标推荐等任务提供基础。
- 信息传播:sbm模型可以帮助我们预测信息在网络中的传播路径和速度,进而优化信息推送策略和社交广告投放。
- 社交影响力分析:通过sbm模型,我们可以分析社交网络中不同节点的影响力大小,了解网络中的意见领袖和关键节点,从而更好地设计营销策略和舆情监测。
第二部分:松弛度量方法的介绍2.1 松弛度量方法的定义松弛度量方法是一种通过计算不同社区结构之间的相似度来评估分区质量的方法。
它基于一种假设,即对于一个理想的社区划分,同一社区的节点应该更有可能彼此相连,不同社区的节点联系较少。
2.2 松弛度量方法的计算松弛度量方法通常基于社区结构之间的连接概率计算,常用的指标包括:- 模块度(Modularity):衡量实际社区划分与随机模型之间的差异程度,数值越大表示社区结构越好。
- 规模比(Size Ratio):衡量实际社区划分中不同社区规模大小的平衡程度。
- 新颖性(Novelty):衡量实际社区划分中新出现的社区数量,用于评估算法的发现能力。
第三部分:sbm模型与松弛度量方法的应用示例为了更好地理解sbm模型和松弛度量方法的应用,我们以一个社交网络为例进行说明。
Computational Modeling of Spatio-temporal SocialNetworks: A Time-Aggregated Graph ApproachS HASHI S HEKHAR AND D EV O LIVERDepartment of Computer Science & EngineeringUniversity of Minnesota, MinneapolisEmail: shekhar@; Web: /~shekharIntroductionSocial computing is transforming on-line spaces with popular applications such as social-networking (e.g., Facebook), collaborative authoring (e.g., Wikipedia), social bargain-hunting (e.g., Groupon), etc. Spatio-temporal constraints are becoming a critical issue in social computing with the emergence of location-based social-networking, Volunteered Geographic Information (Goo 07, Elw 08), Participative Planning (Elw 08, Fis 01), etc. Location-based social networks (e.g., and the “Places Check-in” feature on Facebook) facilitate socialization with nearby friends at restaurants, bars, museums, and concerts. Volunteered Geographic Information (e.g., Wikimapia, OpenStreetMap, Google MyMaps) allows Internet users to participate in generation of geographic information. Traditional computational models for social networks are based on graphs [Fre 06, Was 94, Nrc 03, Cro 09], where nodes represent individual actors (e.g., persons, organizations) and edges represent relationship ties (e.g., communication, financial aid, contracts) between actors. Such graph models are used to assess centrality and the influence of actors (e.g., measures such as degree, reach, “between-ness,” bridge), as well as community structure (e.g., measures such as cohesion, clustering, etc.). Statistical properties such as skewed degree distribution are modeled by random graphs [New 02, Nrc 03], where each node-pair has a connecting edge with independent probability p, which may depend on factors such as geographic distance [Won 05].However, traditional graph and random graph models are limited in addressing spatio-temporal questions such as change (e.g., how is trust or leadership changing over time? who are the emerging leaders in a group? what are the recurring changes in a group?), trends (e.g., what are the long-term and short-term trends in network size or structure? what are the exceptions to the long-term trend?), duration (e.g., how long is the tenure of a leader in a group? how long does it take to elevate the level of trust such as a relationship changing from visitor to friend?), migration, mobility and travel (e.g., interplay between travel behavior and size/structure of social networks [Tim 06]). This position paper explores time-aggregated graph models to support computational tools to address such questions.Background“A social network is a social structure made up of individuals (or organizations) called “nodes,” which are tied (connected) by one or more specific types of interdependency, such as friendship, kinship, common interest, financial exchange, dislike, sexual relationships, or relationships of beliefs, knowledge or prestige” [Wik 10]. Network formation theories bringup the principle of homophily [McP 01] (i.e. birds of a feather flock together) and differentiate between two types, namely, baseline and in-breeding. Baseline homophily [Fel 81] refers to the limited social pool available to individuals for tie formation due to foci of activities, demographics, etc. In-breeding homophily models additional constraints due to gender, religion, social class, education, personality, etc.Spatio-temporal constraints (e.g., geographic space, travel, schedules and diurnal cycles) play a major role in determining baseline homophily due to reasons like opportunity and minimization of cost and effort [Deb 69]. Survey research on student housing communities [Ath 73, Mok 04] and Torontonian personal communities [Wel 88] have provided evidence on the role of geographic spaces. For example, [Wel 88] noted that about 42% of “frequent contact” ties lived within a mile of a typical person. Computational simulation based on agent-based models [Cro 09], game-theory and cost-benefit analysis [Joh 00] as well as spatially embedded random graphs and distance-decay based edge probabilities [Won 05] have reproduced many properties of social networks including small-world properties (e.g., graph diameter), short average geographic distances, community structures, low tie density, etc.Traditionally, social network research has had a relatively small amount of data from infrequent longitudinal surveys and computer simulations. However, the recent popularity of social computing on the Internet is providing large and frequently-sampled spatio-temporal social interaction datasets. These may facilitate better understanding of social network formation and the role that spatio-temporal constraints play, in the face of opportunities to interact with distant actors via Internet based social networking applications. This development also highlights a central role for computation and computational models, not only to scale up to the large and growing data volumes, but also to address new spatio-temporal social questions related to change, trends, duration, mobility, and travel.Time-Aggregated GraphsGiven a spatio-temporal social network dataset and analysis questions, a computational model provides a representation to not only specify the dataset and questions, but also design data-structures and algorithms for addressing the questions. The model should be able to scale as large datasets are becoming available from Internet based social computing services with a large number of actors and a large number of time-points.There are several challenges associated with modeling such a network. The model should be able to accommodate changes and compute results consistent with the existing conditions both accurately and simply. Furthermore, for quickly answering frequent queries such as tracking recurring changes in a group, fast algorithms are required for computing the query results. Sufficient support for the design of correct and efficient algorithms should therefore be provided by the model. The need for computational efficiency conflicts with the requirement for expressive power of the model and balancing these two conflicting goals is challenging.Figure 1: A Time-series of Snapshots for a trust Network for time instants 1 through 10Dynamic networks are often modeled using a time-series of snapshots [Tan 07, Lah 08] of actors and their relationships. For example, Figure 1 shows a time-series of snapshots of a trust network for time instants, t = 1 to t = 10. Three levels of trust are exhibited in Figure 1, where an absent edge indicates that there is no trust relationship, an edge with V indicates a visitor trust relationship and an edge with F indicates friendship, which is a stronger relationship than visitor. For example, (N1, N3) have no trust relationship at t = 1, 2, and 4, a visitor relationship at t = 3, 5, 6, and 7 and a friend relationship at t = 8, 9, and 10. Due to duplication of information about nodes and edges across snapshots, the scalability of snapshot time-series model is limited in answering longitudinal questions such as how long it takes a relationship to evolve from visitor to friend. Storage and thus computational cost increases linearly with the number of time-points.An alternative is the time aggregated graph (TAG) model [Geo 06], which has provided scalable algorithms for temporal questions in transportation networks [Geo 08]. Figure 2 shows a time aggregated graph that is based on the trust network of Figure 1. Each edge has a time series (enclosed in square brackets). For example, the trust relationship for the edge (N1, N3) for all instants within the time interval under consideration are aggregated into a time series [-,-,V,-,V,V,V,F,F,F]; the entry ’-’ indicates that the edge is absent at the time instants t = 1, t = 2 and t = 4.Figure 2: A Time Aggregated Graph for a Trust Network for time instants 1 through 10With the time aggregated graph model, the longitudinal behavior is captured for each edge (or node). Time aggregated graphs do not replicate information about nodes and edges across time-points. This reduces storage cost as well as computational costs. It also consolidates time-series information to make it easier to answer cross snapshot questions such as how long it takes for a relationship to evolve from visitor to friend. For example, in Figure 3, it takes 2 units for edge (N1, N2), 3 units for edge (N2, N4), 4 units for edge (N1,N3) and 1 unit for edge (N3, N4) with an average of (2 + 3 + 4 + 1) / 4 = 2 units, for a relationship to evolve from visitor to friend. Answering this duration question in snapshot model takes more effort due to the need to repeatedly go from one snapshot to the other.DiscussionTime-Aggregated Graphs has the potential to be a general representation of temporal evolution of relationships, whose snapshots were traditionally modeled as graphs. Thus, it may help address questions about individual relationships over longer time-frames by providing a representation for relevant datasets. For example, consider social network datasets representing friend-of relationships among people. Currently, graph representations are used to model a static (e.g., time-snapshot) view of social relationships to explore questions like centrality (e.g., leaders), and cohesiveness of communities. In contrast, TAG may support a direct representation of a “friend-of” relationship over long time periods to address questions related to changes in and evolution of centrality (e.g., emerging leaders) and group cohesiveness (e.g., increasing, diminishing). We welcome collaboration towards identifying datasets and use-cases to evaluate the potential of TAG to address spatio-temporal questions about social networks.References:[And 08] C. F. d’Andrea, Wikipedia as social interaction rooms and collaborative-writing processes, WikiSym, ACM 978-1-60558-128, 2008.[Ath 73] R. Athanasiou, G. A. Yoshioka, The spatial characteristics of friendship formation, Environ.Behav, 5, 1973, 43–66.[Cro 09] Crossman, J., Bechtel, R., Parunak, H., and Brueckner, S., Integrating dynamic social networks and spatio-temporal models for risk assessment, wargaming and planning. In The Network Science Workshop. 2009, West Point, NY.[Deb 69] G. Debreu, Neighboring economic agents, La Decision, 171: 85–90, 1969.[Elw 08] S. Elwood, Volunteered Geographic Information: Future Research Directions Motivated by Critical, Participatory, and Feminist GIS. GeoJournal 72(3 & 4): 173–183, 2008.[Fel 81] S. L. Feld, The focused organization of social ties, Am. J. Soc., 86, 1981, 1015–1035.[Fis 01] F. Fisher, Building Bridges through Participatory Planning, UN-Habitat, isbn 92-1-131623-5, 2001. /decision/BuildingBridges.pdf.[Fre 06] L. Freeman, The Development of Social Network Analysis. Vancouver: Empirical Pres, 2006. [Geo 08] B. George, S. Shekhar, and S. Kim. Spatio-temporal network databases and routing algorithms. University of Minnesota. CSE Technical Report No. 08-039, 2008. (A summary of results in Proc. Symposium on Spatial and Temporal Databases, Springer LNCS 4605, 2007).[Geo 06] B. George, S. Shekhar, Time-Aggregated Graphs for Modeling Spatio-temporal Networks, Journal of Data Semantics, 11, 2006 (Springer LNCS 5383, 2008). (Special issue on best papers from ER 2006 Workshop on Conceptual Modeling for GIS).[Goo 07] M. F. Goodchild, Citizens as sensors: The world of volunteered geography. GeoJournal, 69(4): 211–221, 2007.[Joh 00] C. Johnson and R. P. Giles, Spatial Social Networks, Review of Economic Design, 5, 273–299, Springer Verlag, 2000.[Lah 08] M. Lahiri, T. Y. Berger-Wolf, Mining Periodic Behavior in Dynamic Social Networks, Proc.ICDM, IEEE, 2008.[McP 01] M. McPherson, L. Smith-Lovin, J. M. Cook, Birds of a feather: Homophily in social networks, Annu. Rev. Sociol., 27, 2001, 415–444.[Mok 04] D. Mok, B. Wellman, R. Basu, Does distance matter for relationships? SUNBELT International Social Network Conference, Portoroz, Slovenia, 2004.[New 02] M. E. Newman, D. J. Watts, S. Strogatz, Random graph models of social networks, Proceedings of the National Academy of Science, 99, 2566–2572, Feb. 19th, 2002.[Nrc 03] National Research Council, Dynamic Social Network Modeling and Analysis: Workshop Summary and Papers, isbn 0-309-08952-2, National Academies Press, 2003.[Tan 07] C. Tantipathananandh, T. Berger-Wolf, D. Kempe, A framework for community identification in dynamic social networks, Proc. SIGKDD, ACM, 2007.[Tap 06] D. Tapscott, A. Willians. Wikiconomics: How mass collaboration changes everything. New York: Portifolio, 2006.[Tim 06] O. Timo, Mapping Social Networks in Time and Space, Arbeitsberichte Verkehr und Raumplanung, Working Paper 341, IVT, ETH Zurich, 2006.[Was 94] S. Wasserman, and K. Faust, Social Network Analysis: Methods and Applications, Cambridge University Press, Cambridge, 1994.[Wel 88] B. Wellman, P. Carrington, A. Hall, Networks as personal communities, in B. Wellman, S. D.Berkowitz (Eds.), Social Structures: A Network Approach, Cambridge University Press, Cambridge, 1988.[Wik 10] Social Network, Wikipedia, /wiki/Social_network, retrieved Nov. 29, 2010.[Won 06] L. H. Wong, P. Pattison, G. Robbins, A spatial model for social networks, Physica A, 360, 2006, 99–120, Elsevier.。
nx.random_geometric_graph的用法一、概述nx.random_geometric_graph 是 NetworkX 的一种生成随机图的函数,主要基于平面上点的几何距离来连接节点,也称为基于几何距离的随机图生成器。
其生成的随机图常被用于测试网络算法和进行实验。
二、函数定义```python nx.random_geometric_graph(n, radius, dim=2, pos=None, p=2, periodic=False, seed=None)```参数:- n :图中节点的数量 - radius :表示点之间的最大距离。
- dim :表示空间的大小,例如dim=2表示空间为二维平面。
- pos :表示节点的位置,默认随机生成。
- p :好像是一个距离指数,决定边的形成情况,当p → ∞ 时,该随机图趋近于完全图,反之则趋近于星形图。
- periodic :表示是否使用周期性边界。
- seed :为随机数生成器设置一个种子。
返回值:生成一个随机图。
三、基本用法1.用法一:简单使用只需要指定节点的数量、半径以及维数即可得到这个随机图。
```python import networkx as nxn = 50 radius = 0.2 G =nx.random_geometric_graph(n, radius) ```2.用法二:搭配pos参数使用pos 表示节点的位置,可以指定节点在二维平面上分布的位置。
可以利用 numpy 库生成一些二维坐标作为参数传递,这种情况下我们不需要再指定维数和半径。
下面是一个例子:```python import numpy as np import matplotlib.pyplot as pltpos = np.random.rand(n, 2) # 在二维平面中随机生成n个坐标 G = nx.random_geometric_graph(n,pos=pos) nx.draw(G, pos, node_size=20,node_color='blue') plt.show() ```我们可以发现,节点在正方形的边界内随机分布,没有直接相连的节点。
第二章一.名词解释1.全局耦合网络:任意两个点之间都有边直接相连,完全连接。
2.最近邻耦合网络:每一个节点只和它周围的邻居节点相连。
3.星形耦合网络:有一个中心点,其余N-1个点都只与这个中心点连接4.均匀网络:当k >> <k>时,度为k的节点几乎不存在。
因此这类网络也成为均匀网络或指数网络5.无标(尺)度网络:由于这类网络的节点连接度没有明显的特征长度6.随机网络:节点度的分布将遵循钟形曲线分布。
按照这种分布,大多数节点拥有的连接的数目都相差不多7.鲁棒性:如果移走少量节点后,网络中的绝大部分节点仍是连通的,那么称该网络的连通性对节点故障具有鲁棒性或者稳健性。
8.脆弱性:蓄意去除少量度最高的节点就可破坏无标度网络的连通性9.设计网络:随机网络中节点总数N是预先给定的,所以它们是静态的、固定的、平衡的网络,也有称为设计网络10.演化网络:若网络模型的节点总数不是预先给定的,而是逐步增减的,则它们是动态的、增长的、非平衡的网络,或者称为演化网络(evolving network)11.马太效应:新的节点更倾向于与那些具有较高连接度的“大”节点相连接,这种现象也称为“富者更富(rich get richer)”或“马太效应(Matthew effect)”。
12.分形几何:普通几何研究的对象一般都具有整数的维数,比如,零维的点、一维的线、二维的面、三维的立体、乃至四维的时空。
分形几何(fractal geometry)是研究具有不一定是整数的维,而存在一个分数维数的空间。
13.适应度:在许多实际网络中,节点的度及其增长速度并非只与该节点的年龄有关,有时是与节点的内在性质有关的,Bianconi和Barabasi把这一性质称为节点的适应度(fitness)14.模块:模块(model)是指一组物理上或功能上连接在一起的、共同完成一个相对独立功能的节点。
15.模体:具有高聚类性的网络在局部可能包含各种由高度连接的节点组构成的子图(subgraph),如三角形,正方形和五角形,其中一些子图所占的比例明显高于同一网络的完全随机化形式中这些子图所占的比例,这些子图就称为模体。
一文读懂社会网络分析(SNA)理论、指标与应用开新坑!社交网络分析(又称复杂网络、社会网络,Social Network Analysis)是诞生于数学图论、计算机科学、物理学的交叉碰撞中的一门有趣的学科。
缘起:我研究SNA已经有近2年的时光,一路坎坷走来有很多收获、踩过一些坑,也在线上给很多学生讲过SNA的入门知识,最近感觉有必要将心得和基础框架分享出来,抛砖引玉,让各位对SNA感兴趣的同学们一起学习进步。
我的能力有限,如果有不足之处大家一起交流,由于我的专业的影响,本文的SNA知识可能会带有情报学色彩。
面向人群:优先人文社科类的无代码学习,Python、R的SNA 包好用是好用,但是对我们这这些社科的同学来说门槛太高,枯燥的代码首先就会让我们丧失学习兴趣。
特征:类综述文章,主要目的是以通俗的语言和精炼的框架带领各位快速对SNA领域建立起一个全面的认知,每个个关键概念会附上链接供感兴趣的同学深入学习。
开胃菜:SNA经典著作分享《网络科学引论》纽曼 (访问密码 : v9d9g3)2 概述篇:什么是网络?我们从哪些角度研究它?1) 认识网络SNA中所说的网络是由节点(node,图论中称顶点vertex)和边(edge)构成,如下图。
每个节点代表一个实体,可以是人、动物、关键词、神经元;连接各节点的边代表一个关系,如朋友关系、敌对关系、合作关系、互斥关系等。
最小的网络是由两个节点与一条边构成的二元组。
Les Miserables人际关系网络2) 构建网络就是建模马克思说过,“人的本质在其现实性上,它是一切社会关系的总和。
” 事实上,当我们想快速了解一个领域,无论该领域是由人、知识、神经元乃至其他实体集合构成,利用SNA的方法将实体及其相互关系进行抽象和网络构建,我们就完成了对某一领域的“建模”,这个模型就是网络图,拿科学网络计量学家陈超美的观点来说,借助网络图,“一图胜千言,一览无余”。
3) 社会网络类型此处展示常见且常用的网络类型名词,想要具体了解可以点击链接仔细查看!•网络中节点的来源集合异同o一模网络 one-modeo二模网络 two-mode•视角:•边权重o加权网络 weight networko无权网络 unweight networko符号网络 Signed network•关系是否有方向o有向网络 Directed networko无向网络 Undirected network4) 网络分析的5大中心问题SNA可以帮助我们快速了解该网络中的分布格局和竞争态势,“孰强孰弱,孰亲孰远,孰新孰老,孰胜孰衰”,这16字箴言是我学习SNA总结的精华所在,初中级甚至高级的社会网络分析学习几乎完全就是围绕着这四个方面开展,后面将要讲到的理论与方法皆为此服务,希望同学们可以重点关注。
摘要近年来,网络信息科技迅速发展,使Facebook、Twitter 以及新浪微博等在线社交网络平台兴起,从而使人们更易获取各种信息。
如何准确揭示社交网络中信息传播的规律,以及如何有效地影响和控制信息的传播,已成为复杂网络科学领域研究的热点,所以复杂网络的研究引起了广大科研工作者的兴趣。
复杂网络已涉及到数理学科、网络拓扑理论、系统科学以及传播动力学等众多不同领域,基于此,从网络拓扑结构和传播动力学两个角度,考虑到无标度网络能更加形象地反映现实世界中的社交网络,本文基于无标度网络构建社交信息传播的相关模型,并对建立的模型进行了详细地研究和分析。
本文工作及创新点主要包括如下几个方面:1. 考虑到社交网络的异质性以及信息的发布和转发机制对社交信息传播的影响,基于无标度网络提出了一类新型的SIFT(Susceptible-Issuer-Forwarder-Stifler) 社交信息传播模型。
通过平均场理论,计算出了SIFT模型的基本再生数R和系统的平衡点(包括信息消失平衡点和信息流行平衡点),然后详细地研究了信息消失平衡点的全局渐近稳定性、信息传播的持久性以及信息流行平衡点的全局吸引性。
2. 考虑到人们的心理因素(如犹豫,遗忘等)变化对谣言传播的影响,基于无标度网络建立了一类新的SHPRS(Susceptible-Hesitating-Propagating-Resisted- Susceptible)谣言传播动力学模型,并对其进行了详细地研究。
通过平均场理论,计算得到了所建立模型的基本再生数R和平衡点,然后证明了无谣平衡点的全局渐近稳定性和谣言流行平衡点的全局吸引性。
研究表明人们心理的变化确实对谣言传播有影响,然后进一步地给出了一些控制谣言传播的有效方法。
3. 考虑到网络异质性、广告吸引度以及时间延迟等因素的影响,基于无标度网络建立了一类新的具有时滞的INSRI(Ignorant-Insider-Spreader-Resister-Ignorant)广告信息传播动力学模型,并详细地研究了广告吸引度与时延对于广告传播的影R和平衡点,然后对系统稳定性进行了详响。
一文读懂社会网络分析(SNA)理论、指标与应用开新坑!社交网络分析(又称复杂网络、社会网络,Social Network Analysis)是诞生于数学图论、计算机科学、物理学的交叉碰撞中的一门有趣的学科。
缘起:我研究SNA已经有近2年的时光,一路坎坷走来有很多收获、踩过一些坑,也在线上给很多学生讲过SNA的入门知识,最近感觉有必要将心得和基础框架分享出来,抛砖引玉,让各位对SNA感兴趣的同学们一起学习进步。
我的能力有限,如果有不足之处大家一起交流,由于我的专业的影响,本文的SNA知识可能会带有情报学色彩。
面向人群:优先人文社科类的无代码学习,Python、R的SNA 包好用是好用,但是对我们这这些社科的同学来说门槛太高,枯燥的代码首先就会让我们丧失学习兴趣。
特征:类综述文章,主要目的是以通俗的语言和精炼的框架带领各位快速对SNA领域建立起一个全面的认知,每个个关键概念会附上链接供感兴趣的同学深入学习。
开胃菜:SNA经典著作分享《网络科学引论》纽曼 (访问密码 : v9d9g3)2概述:什么是网络?我们从哪些角度研究它?1) 认识网络SNA中所说的网络是由节点(node,图论中称顶点vertex)和边(edge)构成,如下图。
每个节点代表一个实体,可以是人、动物、关键词、神经元;连接各节点的边代表一个关系,如朋友关系、敌对关系、合作关系、互斥关系等。
最小的网络是由两个节点与一条边构成的二元组。
Les Miserables人际关系网络2) 构建网络就是建模马克思说过,“人的本质在其现实性上,它是一切社会关系的总和。
” 事实上,当我们想快速了解一个领域,无论该领域是由人、知识、神经元乃至其他实体集合构成,利用SNA的方法将实体及其相互关系进行抽象和网络构建,我们就完成了对某一领域的“建模”,这个模型就是网络图,拿科学网络计量学家陈超美的观点来说,借助网络图,“一图胜千言,一览无余”。
3) 社会网络类型这里展示了常见和常用的网络类型名词。
随机游走在图论和统计学中的应用随机游走是指从图或网络的一个节点开始,在不断移动中随机选择下一个节点或边,并不断更新当前节点的出现频率。
在图论和统计学中,随机游走有广泛的应用,可以用来解决一些重要的问题,比如在社交网络中找到社区结构、评估网页的价值、推荐系统中的推荐等等。
这篇文章将介绍一些随机游走在图论和统计学中的应用。
随机游走在图论中的应用1. PageRank算法PageRank算法是谷歌搜索引擎的基础,它会根据网页的链接结构来评估每个网页的重要性。
这个算法通过对网页的链接关系进行随机游走来确定网页的rank值,其中随机游走者从当前页面的链接中以等概率随机选择下一条链接,如果跳转到的页面没有链接,则以等概率随机选择其他页面。
2. 社区检测社交网络的社区结构是人类社会中的一种普遍现象,因此研究社交网络的社区结构一直是社交网络分析领域中的热门话题之一。
随机游走可以通过在网络中随机游走,统计节点的访问频率来确定节点的重要性和归属社区,进而发现网络中的社区结构。
3. 随机游走降维随机游走可以减少大型网络中的节点和边,将网络降维,同时保留网络中的重要信息。
随机游走降维可以用于网络可视化、网络分类、网络聚类等。
随机游走在统计学中的应用1. 马尔可夫链马尔可夫链是一个随机过程,每个状态的概率分布都与上一个状态有关。
随机游走可以产生一个马尔可夫链,其中节点就是状态,转移矩阵是节点之间的概率分布。
马尔可夫链的应用包括图像识别、自然语言处理、金融统计学等。
2. 蒙特卡罗模拟蒙特卡罗模拟是一种使用随机游走来模拟不确定性和复杂性的统计学方法。
这个方法可以在没有明确解析解的情况下模拟和估计模型参数的分布和期望值。
蒙特卡罗模拟的应用包括金融风险评估,气候变化预测,粒子物理学等。
总结随机游走在图论和统计学领域中都有着广泛的应用,它可以被用来解决各种重要的问题。
在图论中,随机游走可以被用来确定网页的重要性、检测社区结构、进行降维等;而在统计学中,随机游走可以被用来产生马尔可夫链、进行蒙特卡罗模拟等。
复杂网络的基本模型及其应用随着信息技术的飞速发展,我们生活中的各个领域都已经形成了庞大的网络系统。
而这些网络系统不仅在数量上迅速增长,同时也在复杂度上逐渐提高。
这就为我们研究网络系统带来了新的挑战,同时也为我们提供了丰富的研究机会。
复杂网络正是这样的一门热门研究领域,本文将介绍复杂网络的基本模型以及它们的应用。
一、复杂网络的基本模型1. 随机网络模型随机网络是复杂网络研究的基础模型,也是最简单的网络模型之一。
在随机网络中,节点和连接是随机连接的,也就是说,连接的生成没有规律或者是基于概率分布。
随着网络规模的增大,随机网络的度分布逐渐趋向于高斯分布。
而高斯分布的一个重要特征就是其均值和方差都非常重要,并且许多实际系统的度分布都具有高斯分布特征。
随机网络的主要局限性是其缺乏社区结构,也就是说,在随机网络中,不存在形态或功能的相似节点的聚簇现象。
2. 小世界模型小世界模型是在维持较高的局部聚集程度的前提下具有较短平均距离的网络模型。
与随机网络模型不同的是,小世界模型中,节点的连接是随机化的,但是节点之间距离却非常接近。
小世界模型的典型特征就是“六度分隔理论”,也就是在小世界网络中,从任何一个节点出发,找到其他节点的平均距离都不会超过6个。
小世界模型是现实世界网络的典型模型,例如社交网络和蛋白质相互作用网络等。
它的局限性主要在于缺乏完整的社区结构,也就是节点之间的聚集程度仍然不够高。
3. 无标度网络模型无标度网络是目前复杂网络研究中最流行的网络模型之一。
在这个模型中,网络的度分布不是均匀的,而是具有“幂律分布”特征。
也就是说,只有极少数节点拥有极高的度数,而大多数节点的度数都很低。
这种模型通常被用来描述物理网络和大规模互联网。
无标度网络模型与其他两个基础模型的最大不同之处就在于其在网络中加入了“富者愈富”这一原则,即在网络中度数较高的节点往往更容易与其它节点建立新的连接。
这种现象导致了网络的非线性增长,以及一些非常重要的复杂网络现象,例如小世界现象、无标度现象等。
59近年来,社会网络成为计算机领域的热门研究话题。作为一个研究分支,社会网络在社会学中的发展已超过了半个世纪,形成了一套比较有效的概念体系和研究方法,对当前计算机领域内的社会网络研究应该有可借鉴之处。本文简要介绍社会学中的社会网络研究(social network analysis,SNA),希望对计算机专业研究人员有所启发。
社会网络研究介绍传统的定量社会科学把个人的一些“标签”式的属性,如性别、收入、社会地位、阶级等,作为基本的分析单位,得到诸如性别比、人口统计、平均收入等指标,并研究其相互关系。以研究社会中的不平等现象为例,其标准的过程是:根据收入、职业等指标,对个人的社会地位进行量化,进而对量化结果进行统计分析,计算诸如均值、方差等参数,并试图建立其与性别、地域、受教育程度等因素的函数关系,再试图通过经济、文化、历史、社会心理来理解这些关系(现象)的成因。不过,这种方法忽视了个人之间的社会交往对这些属性的影响。如统计平均收入,其实假定了个体的独立性。而正如俗语所言,“人以群分”,个人收入与朋友收入往往呈现正相关,并且个人往往会有意识地利用社会关系,来改善自己的社会地位。因此,属性化的分析多是一种“后观”式的描述,无法为解释社会现象提供系统的方法。社会网络研究则是把关系放在中心的地位。在
这套理论中,个人被抽象为节点,个人之间的社会关系作为节点之间的边,共同形成一个网络。社会学家希望网络的结构属性可以为社会现象提供系统性的解释。相关研究内容包括:个人的权力和声望 通过在网络中定义节
点的度数,介数(betweenness)和接近度(close-ness)等概念,可以分别揭示个人在社会中声望某个方面的状况。如节点度数代表与一个人有关系的人数的多少;介数反映个人在网络中是否占据中间地位,隐含着沟通不同群体的能力;接近度则反映一个人与其他所有人的平均距离。在社会学意义下,这些概念蕴含着个人的权力或社会声望,反映一个人的社会资本。而社会网络中节点度数的分布则反映社会的分层情况[1]。社会中的横向结构(社会群体) 社会中
Package‘ergm.rank’June1,2022Version4.1.0Date2022-06-01Title Fit,Simulate and Diagnose Exponential-Family Models for Rank-Order Relational Data Depends R(>=4.0),ergm(>=4.2.2),network(>=1.15)Imports mon(>=4.2.0),utilsLinkingTo ergmSuggests covr,knitr,rmarkdownDescription A set of extensions for the'ergm'package to fit weighted net-works whose edge weights are ranks.See Krivit-sky and Butts(2017)<doi:10.1177/0081175017692623>and Krivitsky,Hunter,Mor-ris,and Klumb(2021)<arXiv:2106.04997>.License GPL-3+file LICENSEURL https://BugReports https:///statnet/ergm.rank/issuesVignetteBuilder rmarkdown,knitrRoxygenNote7.2.0Roxygen list(markdown=TRUE)Encoding UTF-8R topics documented:ergm.rank-package (2)AlterSwap-ergmProposal (3)CompleteOrder-ergmReference (3)newcomb (4)12ergm.rank-packagerank.deference-ergmTerm (5)rank.edgecov-ergmTerm (5)rank.inconsistency-ergmTerm (6)rank.nodeicov-ergmTerm (6)rank.nonconformity-ergmTerm (7)Index8ergm.rank-package Fit,Simulate and Diagnose Exponential-Family Models for Rank-Order Relational DataDescriptionergm.rank is a set of extensions to package ergm to fit and simulate from exponential-familyrandom graph models for networks whose edge weights are ranks.For a list of functions typehelp(package='ergm')and help(package='ergm.rank')DetailsMainly,it implements the CompleteOrder reference measure for valued ERGMs(documentedhere),and provides some rank-order change statistics(documented here).For a complete list of the functions,use library(help="ergm")and library(help="ergm.rank")or read the rest of the manual.When publishing results obtained using this package,please cite the original authors as describedin citation(package="ergm.rank").All programs derived from this package must cite it.This package contains functions specific to using ergm to model networks whose dyad values areranks.Examples include preferences,valued ties reduced to ranks,etc..These terms have a specialized interpretation,and are therefore generally prefixed by"rank.",though they should take any valued data.For detailed information on how to download and install the software,go to the Statnet projectwebsite:https://.A tutorial,support newsgroup,references and links to furtherresources are provided there.Author(s)Pavel N.Krivitsky<*****************>ReferencesKrivitsky PN(2012).Exponential-Family Random Graph Models for Valued Networks.ElectronicJournal of Statistics,2012,6,1100-1128.c("\Sexpr[results=rd,stage=build]tools:::Rd_expr_doi(\"#1\")","doi:10.1214/12-EJS696")\ifelse{text}{doi:10.1214/12-EJS696<https:///10.1214/12-EJS696>}{\ifelse{latex}{\href{ EJS696}{doi:10.1214\out{\slash{}}12\out{\-}EJS696}}{\href{https:///10.1214/12-EJS696}{doi:10.1214/12-EJS696}}}Krivitsky PN and Butts CT(2017).Exponential-Family Random Graph Models for Rank-OrderRelational Data.Sociological Methodology,2017,47,68-112.doi:10.1177/0081175017692623AlterSwap-ergmProposal3 See Alsoergm-terms,ergm-referencesAlterSwap-ergmProposalA proposal that swaps values of two alters incident on an egoDescriptionThis proposal randomly selects two dyads(i,j)and(i,j′)with a common sender and proposes to swap their values if distinct.DetailsThis proposal is not referenced in the lookup table.CompleteOrder-ergmReferenceA uniform distribution over the possible complete orderings of the al-ters by each egoDescriptionA uniform distribution over the possible complete orderings of the alters by each egoUsage#CompleteOrderSee AlsoergmReference for index of reference distributions currently visible to the package.4newcomb newcomb Newcomb’s Fraternity NetworksDescriptionThese14networks record weekly sociometric preference rankings from17men attending the Uni-versity of Michigan in the fall of1956;Data were collected longitudinally over15weeks,although data from week9are missing.FormatA list of15networks.DetailsThe men were recruited to live in off-campus(fraternity)housing,rented for them as part of the Michigan Group Study Project supervised by Theodore Newcomb from1953to1956.All were incoming transfer students with no prior acquaintance of one another.The data set,derived from one in the unreleased netdata package,contains a network list newcomb with14networks.Each network is complete and contains two edge attributes:list("rank")the preference of the i th man for the j th man from1through16,with1being the highest preference.list("descrank")the same,but1indicates lowest preference.Licenses and CitationIf the source of the data set does not specified otherwise,this data set is protected by the Creative Commons License https:///licenses/by-nc-nd/2.5/.When publishing results obtained using this data set the original authors should be cited.In addition this should be cited as:Vladimir Batagelj and Andrej Mrvar(2006):Pajek datasetshttp://vlado.fmf.uni-lj.si/pub/networks/data/Sourcehttp://vlado.fmf.uni-lj.si/pub/networks/data/ucinet/ucidata.htm#newfrat ReferencesSee the link above.Newcomb T.(1961).The acquaintance process.New York:Holt,Reinhard and Winston.Nordlie P.(1958).A longitudinal study of interpersonal attraction in a natural group setting.Un-published doctoral dissertation,University of Michigan.White H.,Boorman S.and Breiger R.(1977).Social structure from multiple networks,I.Block-models of roles and positions.American Journal of Sociology,81,730-780.rank.deference-ergmTerm5 Examples#Note:This takes a long time.data(newcomb)#Fit a model for the transition between initial(time0)ranking and#ranking after one week(time1).Note that MCMC interval has been#decreased to save time.newcomb.1.2.fit<-ergm(newcomb[[2]]~rank.inconsistency(newcomb[[1]],"descrank")+rank.deference+rank.nonconformity("all")+rank.nonconformity("localAND"),response="descrank",reference=~CompleteOrder,control=control.ergm(MCMC.interval=10))#Check MCMC diagnostics(barely adequate).mcmc.diagnostics(newcomb.1.2.fit)summary(newcomb.1.2.fit)rank.deference-ergmTermDeference(aversion)DescriptionMeasures the amount of"deference"in the network:configurations where an ego i ranks an alter j over another alter k,but j,in turn,ranks k over i.A lower-than-chance value of this statistic and/ora negative coefficient implies a form of mutuality in the network.Usage#valued:rank.deferencerank.edgecov-ergmTerm Dyadic covariatesDescriptionModels the effect of a dyadic covariate on the propensity of an ego i to rank alter j highly. Usage#valued:rank.edgecov(x,attrname)6rank.nodeicov-ergmTermArgumentsx,attrname either a square matrix of covariates,one for each possible edge in the network, the name of a network attribute of covariates,or a network;if the latter,optionalargument attrname provides the name of the quantitative edge attribute to usefor covariate values(in this case,missing edges in x are assigned a covariatevalue of zero).rank.inconsistency-ergmTerm(Weighted)InconsistencyDescriptionMeasures the amount of disagreement between rankings of the focus network and a fixed covariate network x,by couting the number of pairwise comparisons for which the two networks disagree.Usage#valued:rank.inconsistency(x,attrname,weights,wtname,wtcenter) Argumentsx,attrname x can be a network with an edge attribute attrname containing the ranks ora matrix of appropriate dimension containing the ranks.If x is not given,itdefaults to the LHS network,and if attrname is not given,it defaults to theresponse edge attribute.weights optional parameter to weigh the counts.Can be either a3D n×n×n-array whose(i,j,k)th element gives the weight for the comparison by i of j and k ora function taking three arguments,i,j,and k,and returning the weight of thiscomparison.wtname,wtcenterIf wtcenter=TRUE,the calculated weights will be centered around their mean.wtname can be used to label this term.rank.nodeicov-ergmTermAttractiveness/Popularity covariatesDescriptionModels the effect of one or more nodal covariates on the propensity of an actor to be ranked highly by the others.rank.nonconformity-ergmTerm7Usage#valued:rank.nodeicov(attr)Argumentsattr quantitative attribute(see Specifying Vertex attributes and Levels(?nodal_attributes) for details.)rank.nonconformity-ergmTermNonconformityDescriptionMeasures the amount of"nonconformity"in the network:configurations where an ego i ranks analter j over another alter k,but ego l ranks k over j.Usage#valued:rank.nonconformity(to,par)Argumentsto which controls to whom an ego may conform:•"all"(the default):Nonconformity to all egos is counted.A lower-than-chance value of this statistic and/or a negative coefficient implies a degreeof consensus in the network.•"localAND":Nonconformity of i to ego l regarding the relative rankingof j and k is only counted if i ranks l over both j and k.A lower-than-chance value of this statistic and/or a negative coefficient implies a form ofhierarchical transitivity in the network.This is the recommended form oflocal nonconformity(over"local1"and"local2").•"local1":Nonconformity of i to ego l regarding the relative ranking of jand k is only counted if i ranks l over j.•"local2":Nonconformity of i to ego l regarding the relative ranking of jand k is only counted if i ranks l over k.•"thresholds":Nonconformity of i to ego l regarding the relative rankingof j and k is only counted if i ranks l above par,where par can be a vectorwith multiple thresholds.•"geometric":Nonconformity of i to ego l regarding the relative ranking ofj and k is weighted by par taken to the power of the rank of l by i,wherepar is a scalar.par additional parameters for some types of nonconformity.Index∗datasetsnewcomb,4∗directedrank.deference-ergmTerm,5rank.edgecov-ergmTerm,5rank.inconsistency-ergmTerm,6rank.nodeicov-ergmTerm,6rank.nonconformity-ergmTerm,7∗ordinalAlterSwap-ergmProposal,3CompleteOrder-ergmReference,3rank.deference-ergmTerm,5rank.edgecov-ergmTerm,5rank.inconsistency-ergmTerm,6rank.nodeicov-ergmTerm,6rank.nonconformity-ergmTerm,7∗packageergm.rank-package,2∗quantitative dyadic attributerank.edgecov-ergmTerm,5∗quantitative nodal attributerank.nodeicov-ergmTerm,6∗quantitative triadic attributerank.inconsistency-ergmTerm,6∗triad-relatedrank.deference-ergmTerm,5rank.nonconformity-ergmTerm,7∗valuedAlterSwap-ergmProposal,3CompleteOrder-ergmReference,3rank.deference-ergmTerm,5rank.edgecov-ergmTerm,5rank.inconsistency-ergmTerm,6rank.nodeicov-ergmTerm,6rank.nonconformity-ergmTerm,7 AlterSwap-ergmProposal,3 CompleteOrder-ergmReference,3documented here,2ergm,2ergm.rank,2ergm.rank-package,2ergmReference,3pleteOrder(CompleteOrder-ergmReference),3InitWtErgmProposal.AlterSwap(AlterSwap-ergmProposal),3 InitWtErgmTerm.rank.deference(rank.deference-ergmTerm),5 InitWtErgmTerm.rank.edgecov(rank.edgecov-ergmTerm),5 InitWtErgmTerm.rank.inconsistency(rank.inconsistency-ergmTerm),6InitWtErgmTerm.rank.nodeicov(rank.nodeicov-ergmTerm),6 InitWtErgmTerm.rank.nonconformity(rank.nonconformity-ergmTerm),7network,6newcomb,4rank.deference-ergmTerm,5rank.edgecov-ergmTerm,5rank.inconsistency-ergmTerm,6rank.nodeicov-ergmTerm,6rank.nonconformity-ergmTerm,78。