summaryrefslogtreecommitdiff
path: root/reference/introduction.ipynb
blob: 37ee5ea1f5dd249c7495028e6fc296cfc87b079b (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "b5a8eb6b",
   "metadata": {},
   "source": [
    "## Introduction\n",
    "\n",
    "The structure of NetworkX can be seen by the organization of its source code.\n",
    "The package provides classes for graph objects, generators to create standard\n",
    "graphs, IO routines for reading in existing datasets, algorithms to analyze\n",
    "the resulting networks and some basic drawing tools.\n",
    "\n",
    "Most of the NetworkX API is provided by functions which take a graph object\n",
    "as an argument.  Methods of the graph object are limited to basic manipulation\n",
    "and reporting.  This provides modularity of code and documentation.\n",
    "It also makes it easier for newcomers to learn about the package in stages.\n",
    "The source code for each module is meant to be easy to read and reading\n",
    "this Python code is actually a good way to learn more about network algorithms,\n",
    "but we have put a lot of effort into making the documentation sufficient and friendly.\n",
    "If you have suggestions or questions please contact us by joining the\n",
    "[NetworkX Google group](http://groups.google.com/group/networkx-discuss).\n",
    "\n",
    "Classes are named using `CamelCase` (capital letters at the start of each word).\n",
    "functions, methods and variable names are `lower_case_underscore` (lowercase with\n",
    "an underscore representing a space between words).\n",
    "\n",
    "### NetworkX Basics\n",
    "\n",
    "After starting Python, import the networkx module with (the recommended way)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5695930c",
   "metadata": {},
   "outputs": [],
   "source": [
    "import networkx as nx"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "692c2305",
   "metadata": {},
   "source": [
    "To save repetition, in the documentation we assume that\n",
    "NetworkX has been imported this way.\n",
    "\n",
    "If importing networkx fails, it means that Python cannot find the installed\n",
    "module. Check your installation and your `PYTHONPATH`.\n",
    "\n",
    "The following basic graph types are provided as Python classes:\n",
    "\n",
    "`Graph`\n",
    "\n",
    ":   This class implements an undirected graph. It ignores\n",
    "    multiple edges between two nodes.  It does allow self-loop\n",
    "    edges between a node and itself.\n",
    "\n",
    "`DiGraph`\n",
    "\n",
    ":   Directed graphs, that is, graphs with directed edges.\n",
    "    Provides operations common to directed graphs,\n",
    "    (a subclass of Graph).\n",
    "\n",
    "`MultiGraph`\n",
    "\n",
    ":   A flexible graph class that allows multiple undirected edges between\n",
    "    pairs of nodes.  The additional flexibility leads to some degradation\n",
    "    in performance, though usually not significant.\n",
    "\n",
    "`MultiDiGraph`\n",
    "\n",
    ":   A directed version of a MultiGraph.\n",
    "\n",
    "Empty graph-like objects are created with"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "cb802a24",
   "metadata": {},
   "outputs": [],
   "source": [
    "G = nx.Graph()\n",
    "G = nx.DiGraph()\n",
    "G = nx.MultiGraph()\n",
    "G = nx.MultiDiGraph()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "bf48d000",
   "metadata": {},
   "source": [
    "All graph classes allow any [hashable](https://docs.python.org/3/glossary.html#term-hashable) object as a node.\n",
    "Hashable objects include strings, tuples, integers, and more.\n",
    "Arbitrary edge attributes such as weights and labels\n",
    "can be associated with an edge.\n",
    "\n",
    "The graph internal data structures are based on an\n",
    "adjacency list representation and implemented using\n",
    "Python dictionary datastructures.\n",
    "The graph adjacency structure is\n",
    "implemented as a Python dictionary of\n",
    "dictionaries; the outer dictionary is keyed by nodes to values that are\n",
    "themselves dictionaries keyed by neighboring node to the\n",
    "edge attributes associated with that edge.  This “dict-of-dicts” structure\n",
    "allows fast addition, deletion, and lookup of nodes and neighbors in\n",
    "large graphs.  The underlying datastructure is accessed directly\n",
    "by methods (the programming interface “API”) in the class definitions.\n",
    "All functions, on the other hand, manipulate graph-like objects\n",
    "solely via those API methods and not by acting directly on the datastructure.\n",
    "This design allows for possible replacement of the ‘dicts-of-dicts’-based\n",
    "datastructure with an alternative datastructure that implements the\n",
    "same methods.\n",
    "\n",
    "# Graphs\n",
    "\n",
    "The first choice to be made when using NetworkX is what type of graph\n",
    "object to use.  A graph (network) is a collection of nodes together\n",
    "with a collection of edges that are pairs of nodes.  Attributes are\n",
    "often associated with nodes and/or edges.  NetworkX graph objects come in\n",
    "different flavors depending on two main properties of the network:\n",
    "\n",
    "> * Directed: Are the edges **directed**?  Does the order of the edge\n",
    ">   pairs $(u, v)$ matter?  A directed graph is specified by the “Di”\n",
    ">   prefix in the class name, e.g. `DiGraph()`.  We make this distinction\n",
    ">   because many classical graph properties are defined differently for\n",
    ">   directed graphs.\n",
    "\n",
    "> * Multi-edges: Are multiple edges allowed between each pair of nodes?\n",
    ">   As you might imagine, multiple edges requires a different data\n",
    ">   structure, though clever users could design edge data attributes to\n",
    ">   support this functionality.  We provide a standard data structure\n",
    ">   and interface for this type of graph using the prefix “Multi”,\n",
    ">   e.g., `MultiGraph()`.\n",
    "\n",
    "The basic graph classes are named:\n",
    "Graph,\n",
    "DiGraph,\n",
    "MultiGraph, and\n",
    "MultiDiGraph\n",
    "\n",
    "## Nodes and Edges\n",
    "\n",
    "The next choice you have to make when specifying a graph is what kinds\n",
    "of nodes and edges to use.\n",
    "\n",
    "If the topology of the network is all you\n",
    "care about then using integers or strings as the nodes makes sense and\n",
    "you need not worry about edge data.  If you have a data structure\n",
    "already in place to describe nodes you can simply use that structure\n",
    "as your nodes provided it is [hashable](https://docs.python.org/3/glossary.html#term-hashable).  If it is not hashable you can\n",
    "use a unique identifier to represent the node and assign the data\n",
    "as a node attribute.\n",
    "\n",
    "Edges often have data associated with them.  Arbitrary data\n",
    "can be associated with edges as an edge attribute.\n",
    "If the data is numeric and the intent is to represent\n",
    "a *weighted* graph then use the ‘weight’ keyword for the attribute.\n",
    "Some of the graph algorithms, such as\n",
    "Dijkstra’s shortest path algorithm, use this attribute\n",
    "name by default to get the weight for each edge.\n",
    "\n",
    "Attributes can be assigned to an edge by using keyword/value\n",
    "pairs when adding edges.  You can use any keyword\n",
    "to name your attribute and can then query the edge\n",
    "data using that attribute keyword.\n",
    "\n",
    "Once you’ve decided how to encode the nodes and edges, and whether you have\n",
    "an undirected/directed graph with or without multiedges you are ready to build\n",
    "your network.\n",
    "\n",
    "# Graph Creation\n",
    "\n",
    "NetworkX graph objects can be created in one of three ways:\n",
    "\n",
    "* Graph generators—standard algorithms to create network topologies.\n",
    "\n",
    "* Importing data from pre-existing (usually file) sources.\n",
    "\n",
    "* Adding edges and nodes explicitly.\n",
    "\n",
    "Explicit addition and removal of nodes/edges is the easiest to describe.\n",
    "Each graph object supplies methods to manipulate the graph.  For example,"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b57bd24b",
   "metadata": {},
   "outputs": [],
   "source": [
    "import networkx as nx\n",
    "G = nx.Graph()\n",
    "G.add_edge(1, 2)  # default edge data=1\n",
    "G.add_edge(2, 3, weight=0.9)  # specify edge data"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "af944dba",
   "metadata": {},
   "source": [
    "Edge attributes can be anything:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e2400ba2",
   "metadata": {},
   "outputs": [],
   "source": [
    "import math\n",
    "G.add_edge('y', 'x', function=math.cos)\n",
    "G.add_node(math.cos)  # any hashable can be a node"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "30ce4b31",
   "metadata": {},
   "source": [
    "You can add many edges at one time:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "6f3061ba",
   "metadata": {},
   "outputs": [],
   "source": [
    "elist = [(1, 2), (2, 3), (1, 4), (4, 2)]\n",
    "G.add_edges_from(elist)\n",
    "elist = [('a', 'b', 5.0), ('b', 'c', 3.0), ('a', 'c', 1.0), ('c', 'd', 7.3)]\n",
    "G.add_weighted_edges_from(elist)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "cf4c1a29",
   "metadata": {},
   "source": [
    "See the Tutorial for more examples.\n",
    "\n",
    "Some basic graph operations such as union and intersection\n",
    "are described in the operators module documentation.\n",
    "\n",
    "Graph generators such as `binomial_graph()`\n",
    "and `erdos_renyi_graph()` are\n",
    "provided in the graph generators subpackage.\n",
    "\n",
    "For importing network data from formats such as GML, GraphML, edge list text files\n",
    "see the reading and writing graphs subpackage.\n",
    "\n",
    "# Graph Reporting\n",
    "\n",
    "Class views provide basic reporting of nodes, neighbors, edges and degree.\n",
    "These views provide iteration over the properties as well as membership\n",
    "queries and data attribute lookup. The views refer to the graph data structure\n",
    "so changes to the graph are reflected in the views. This is analogous to\n",
    "dictionary views in Python 3. If you want to change the graph while iterating\n",
    "you will need to use e.g. `for e in list(G.edges):`. The views provide\n",
    "set-like operations, e.g. union and intersection, as well as dict-like\n",
    "lookup and iteration of the data attributes using `G.edges[u, v]['color']`\n",
    "and `for e, datadict in G.edges.items():`. Methods `G.edges.items()` and\n",
    "`G.edges.values()` are familiar from python dicts. In addition `G.edges.data()`\n",
    "provides specific attribute iteration e.g. `for e, e_color in G.edges.data('color'):`.\n",
    "\n",
    "The basic graph relationship of an edge can be obtained in two ways.\n",
    "One can look for neighbors of a node or one can look for edges.\n",
    "We jokingly refer to people who focus on nodes/neighbors as node-centric\n",
    "and people who focus on edges as edge-centric.  The designers of NetworkX\n",
    "tend to be node-centric and view edges as a relationship between nodes.\n",
    "You can see this by our choice of lookup notation like `G[u]` providing neighbors\n",
    "(adjacency) while edge lookup is `G.edges[u, v]`.\n",
    "Most data structures for sparse graphs are essentially adjacency lists and so\n",
    "fit this perspective. In the end, of course, it doesn’t really matter which way\n",
    "you examine the graph. `G.edges` removes duplicate representations of undirected\n",
    "edges while neighbor reporting across all nodes will naturally report both directions.\n",
    "\n",
    "Any properties that are more complicated than edges, neighbors and degree are\n",
    "provided by functions.  For example `nx.triangles(G, n)` gives the number of triangles\n",
    "which include node n as a vertex.  These functions are grouped in the code and\n",
    "documentation under the term algorithms.\n",
    "\n",
    "# Algorithms\n",
    "\n",
    "A number of graph algorithms are provided with NetworkX.\n",
    "These include shortest path, and breadth first search\n",
    "(see traversal),\n",
    "clustering and isomorphism algorithms and others.  There are\n",
    "many that we have not developed yet too.  If you implement a\n",
    "graph algorithm that might be useful for others please let\n",
    "us know through the\n",
    "[NetworkX Google group](http://groups.google.com/group/networkx-discuss)\n",
    "or the Github [Developer Zone](https://github.com/networkx/networkx).\n",
    "\n",
    "As an example here is code to use Dijkstra’s algorithm to\n",
    "find the shortest weighted path:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "33060076",
   "metadata": {},
   "outputs": [],
   "source": [
    "G = nx.Graph()\n",
    "e = [('a', 'b', 0.3), ('b', 'c', 0.9), ('a', 'c', 0.5), ('c', 'd', 1.2)]\n",
    "G.add_weighted_edges_from(e)\n",
    "print(nx.dijkstra_path(G, 'a', 'd'))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "43a66e19",
   "metadata": {},
   "source": [
    "# Drawing\n",
    "\n",
    "While NetworkX is not designed as a network drawing tool, we provide\n",
    "a simple interface to drawing packages and some simple layout algorithms.\n",
    "We interface to the excellent Graphviz layout tools like dot and neato\n",
    "with the (suggested) pygraphviz package or the pydot interface.\n",
    "Drawing can be done using external programs or the Matplotlib Python\n",
    "package.  Interactive GUI interfaces are possible, though not provided.\n",
    "The drawing tools are provided in the module drawing.\n",
    "\n",
    "The basic drawing functions essentially place the nodes on a scatterplot\n",
    "using the positions you provide via a dictionary or the positions are\n",
    "computed with a layout function. The edges are lines between those dots."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "50d5b2a3",
   "metadata": {},
   "outputs": [],
   "source": [
    "import matplotlib.pyplot as plt\n",
    "G = nx.cubical_graph()\n",
    "subax1 = plt.subplot(121)\n",
    "nx.draw(G)   # default spring_layout\n",
    "subax2 = plt.subplot(122)\n",
    "nx.draw(G, pos=nx.circular_layout(G), node_color='r', edge_color='b')"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "3a564e32",
   "metadata": {},
   "source": [
    "See the examples for more ideas.\n",
    "\n",
    "# Data Structure\n",
    "\n",
    "NetworkX uses a “dictionary of dictionaries of dictionaries” as the\n",
    "basic network data structure.  This allows fast lookup with reasonable\n",
    "storage for large sparse networks.  The keys are nodes so `G[u]` returns\n",
    "an adjacency dictionary keyed by neighbor to the edge attribute\n",
    "dictionary. A view of the adjacency data structure is provided\n",
    "by the dict-like object `G.adj` as e.g. `for node, nbrsdict in G.adj.items():`.\n",
    "The expression `G[u][v]` returns the edge attribute dictionary itself.\n",
    "A dictionary of lists would have also been possible, but not allow\n",
    "fast edge detection nor convenient storage of edge data.\n",
    "\n",
    "Advantages of dict-of-dicts-of-dicts data structure:\n",
    "\n",
    "> * Find edges and remove edges with two dictionary look-ups.\n",
    "\n",
    "> * Prefer to “lists” because of fast lookup with sparse storage.\n",
    "\n",
    "> * Prefer to “sets” since data can be attached to edge.\n",
    "\n",
    "> * `G[u][v]` returns the edge attribute dictionary.\n",
    "\n",
    "> * `n in G` tests if node `n` is in graph `G`.\n",
    "\n",
    "> * `for n in G:` iterates through the graph.\n",
    "\n",
    "> * `for nbr in G[n]:` iterates through neighbors.\n",
    "\n",
    "As an example, here is a representation of an undirected graph with the\n",
    "edges $(A, B)$ and $(B, C)$."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "8867e559",
   "metadata": {},
   "outputs": [],
   "source": [
    "G = nx.Graph()\n",
    "G.add_edge('A', 'B')\n",
    "G.add_edge('B', 'C')\n",
    "print(G.adj)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "475498e4",
   "metadata": {},
   "source": [
    "The data structure gets morphed slightly for each base graph class.\n",
    "For DiGraph two dict-of-dicts-of-dicts structures are provided, one\n",
    "for successors (`G.succ`) and one for predecessors (`G.pred`).\n",
    "For MultiGraph/MultiDiGraph we use a dict-of-dicts-of-dicts-of-dicts \n",
    "where the third dictionary is keyed by an edge key identifier to the fourth\n",
    "dictionary which contains the edge attributes for that edge between\n",
    "the two nodes.\n",
    "\n",
    "Graphs provide two interfaces to the edge data attributes: adjacency\n",
    "and edges. So `G[u][v]['width']` is the same as `G.edges[u, v]['width']`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "843dfcd2",
   "metadata": {},
   "outputs": [],
   "source": [
    "G = nx.Graph()\n",
    "G.add_edge(1, 2, color='red', weight=0.84, size=300)\n",
    "print(G[1][2]['size'])\n",
    "print(G.edges[1, 2]['color'])"
   ]
  }
 ],
 "metadata": {},
 "nbformat": 4,
 "nbformat_minor": 5
}