Load nodes with attributes and edges from a DataFrame in NetworkX

I am using Python to work with graphs, via NetworkX. Until now I have used Gephi, where the standard steps (though not the only possible ones) are:

  • Load the node information from a table; one of the columns should be an ID, and the rest are metadata about the nodes (the nodes are people, so gender, groups, ... are typically used for coloring). For example:

    id;NormalizedName;Gender
    per1;Jesús;male
    per2;Abraham;male
    per3;Isaac;male
    per4;Jacob;male
    per5;Judá;male
    per6;Tamar;female
    ...

  • Then load the edges from another table, using the same node names as in the ID column of the node table; it usually has four columns (Target, Source, Weight and Type):

    Target;Source;Weight;Type
    per1;per2;3;Undirected
    per3;per4;2;Undirected
    ...

These are the two DataFrames that I have and want to load into Python. From reading about NetworkX, it seems that it is not possible to load two tables (one for nodes, one for edges) into the same graph, and I am not sure what the best approach would be:

  • Should I create a graph with only the node information from one DataFrame, and then add the edges from the other DataFrame? If so, since nx.from_pandas_dataframe() expects edge information, I suppose I should not use it to create the nodes... Should I just pass the information as lists?

  • Should I create a graph with only the edge information from one DataFrame, and then add the information from the other DataFrame as attributes on each node? Is there a better way to do this than iterating over the DataFrame and the nodes?

3 answers

Create a weighted graph from the edge table using nx.from_pandas_dataframe:

    import networkx as nx
    import pandas as pd

    edges = pd.DataFrame({'source': [0, 1],
                          'target': [1, 2],
                          'weight': [100, 50]})

    nodes = pd.DataFrame({'node': [0, 1, 2],
                          'name': ['Foo', 'Bar', 'Baz'],
                          'gender': ['M', 'F', 'M']})

    G = nx.from_pandas_dataframe(edges, 'source', 'target', 'weight')

Then add the node attributes from the dictionaries using set_node_attributes :

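    # Key each attribute by the node column, so this also works when the
    # node IDs differ from the DataFrame's default 0..n-1 index.
    nx.set_node_attributes(G, 'name', nodes.set_index('node')['name'].to_dict())
    nx.set_node_attributes(G, 'gender', nodes.set_index('node')['gender'].to_dict())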

Or iterate over the graph to add node attributes:

    # G.node is the NetworkX 1.x accessor; in 2.x write G.nodes[i] instead.
    for i in sorted(G.nodes()):
        G.node[i]['name'] = nodes.name[i]
        G.node[i]['gender'] = nodes.gender[i]

Update:

As of NetworkX 2.0, the order of the arguments to nx.set_node_attributes has changed: (G, values, name=None)

Using the example above:

    nx.set_node_attributes(G, nodes.set_index('node')['gender'].to_dict(), 'gender')
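
For reference, here is a minimal sketch of the same two steps applied to the semicolon-separated tables from the question, assuming NetworkX 2.x. The inline strings and the variable names `nodes_csv` and `edges_csv` are illustrative stand-ins for the real files, and only a few sample rows are included.

```python
import io

import networkx as nx
import pandas as pd

# Illustrative sample in the question's layout (first rows only).
nodes_csv = """id;NormalizedName;Gender
per1;Jesús;male
per2;Abraham;male
per3;Isaac;male
per4;Jacob;male
"""
edges_csv = """Target;Source;Weight;Type
per1;per2;3;Undirected
per3;per4;2;Undirected
"""

nodes = pd.read_csv(io.StringIO(nodes_csv), sep=";")
edges = pd.read_csv(io.StringIO(edges_csv), sep=";")

# Step 1: build the graph from the edge table; Weight becomes an edge attribute.
G = nx.from_pandas_edgelist(edges, source="Source", target="Target",
                            edge_attr="Weight")

# Step 2: attach one attribute at a time, as a dict keyed by node id
# (NetworkX 2.x argument order: graph, values, name).
gender = nodes.set_index("id")["Gender"].to_dict()
nx.set_node_attributes(G, gender, "Gender")
```

With real files, the `io.StringIO(...)` arguments would be replaced by the file paths.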

A little note:

from_pandas_dataframe does not work in nx 2. Regarding this line:

 G = nx.from_pandas_dataframe(edges, 'source', 'target', 'weight') 

I think in nx 2.0 it is written like this:

    G = nx.from_pandas_edgelist(edges, source='source', target='target', edge_attr='weight')
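
If the same script has to run on both old and new installations, one possible approach is to check which loader exists. This is only a sketch: `graph_from_edges` is a hypothetical helper name, not part of NetworkX.

```python
import networkx as nx
import pandas as pd

edges = pd.DataFrame({'source': [0, 1],
                      'target': [1, 2],
                      'weight': [100, 50]})


def graph_from_edges(df, source, target, edge_attr=None):
    """Hypothetical helper: call whichever pandas loader this NetworkX has."""
    if hasattr(nx, 'from_pandas_edgelist'):
        # NetworkX 2.0 and later
        return nx.from_pandas_edgelist(df, source, target, edge_attr)
    # NetworkX 1.x
    return nx.from_pandas_dataframe(df, source, target, edge_attr)


G = graph_from_edges(edges, 'source', 'target', 'weight')
```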

Here is basically the same answer, updated and with some details filled in. We start with basically the same setup, but the nodes have no numeric index, only names, to address @LancelotHolmes's comment and to make it more general:

    import networkx as nx
    import pandas as pd

    linkData = pd.DataFrame({'source': ['Amy', 'Bob'],
                             'target': ['Bob', 'Cindy'],
                             'weight': [100, 50]})

    nodeData = pd.DataFrame({'name': ['Amy', 'Bob', 'Cindy'],
                             'type': ['Foo', 'Bar', 'Baz'],
                             'gender': ['M', 'F', 'M']})

    G = nx.from_pandas_edgelist(linkData, 'source', 'target', True, nx.DiGraph())

Here, the True argument (edge_attr) tells NetworkX to keep all the other columns of linkData as edge attributes. In this case I made the graph a DiGraph, but if you do not need that, you can pass a different graph type as the last argument (create_using), for example nx.Graph().
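
As a quick check of both points, the sketch below builds the same edge table once as a DiGraph and once as a Graph. The variable names `directed` and `undirected` are only illustrative.

```python
import networkx as nx
import pandas as pd

linkData = pd.DataFrame({'source': ['Amy', 'Bob'],
                         'target': ['Bob', 'Cindy'],
                         'weight': [100, 50]})

# edge_attr=True copies every column other than source/target onto the edges.
directed = nx.from_pandas_edgelist(linkData, 'source', 'target',
                                   edge_attr=True, create_using=nx.DiGraph())
undirected = nx.from_pandas_edgelist(linkData, 'source', 'target',
                                     edge_attr=True, create_using=nx.Graph())

print(directed['Amy']['Bob'])             # attribute dict of the Amy -> Bob edge
print(directed.has_edge('Bob', 'Amy'))    # the reverse edge is absent in a DiGraph
print(undirected.has_edge('Bob', 'Amy'))  # a Graph ignores direction
```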

Now, since nodeData has to be matched to the nodes that were created from linkData by name, set the name column as the index of nodeData before converting it to a dictionary, so that NetworkX 2.x can load it as node attributes.

 nx.set_node_attributes(G, nodeData.set_index('name').to_dict('index')) 

This converts the entire nodeData DataFrame into a dictionary in which each key is a node name and each value is a dictionary of that row's remaining columns as key: value pairs (i.e., the ordinary attributes of the node, where the node is identified by its name).
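
One caveat worth sketching: set_node_attributes only touches nodes that are already in the graph, so a person who appears in the node table but in no edge would be left out. A possible alternative, shown here under the assumption of NetworkX 2.x, is to pass the same dictionary to add_nodes_from, which updates existing nodes and adds missing ones. The extra row for 'Dave' is made up for the illustration.

```python
import networkx as nx
import pandas as pd

linkData = pd.DataFrame({'source': ['Amy', 'Bob'],
                         'target': ['Bob', 'Cindy'],
                         'weight': [100, 50]})

# 'Dave' is a hypothetical extra person with no edges.
nodeData = pd.DataFrame({'name': ['Amy', 'Bob', 'Cindy', 'Dave'],
                         'type': ['Foo', 'Bar', 'Baz', 'Qux'],
                         'gender': ['M', 'F', 'M', 'M']})

G = nx.from_pandas_edgelist(linkData, 'source', 'target',
                            edge_attr=True, create_using=nx.DiGraph())

# Maps each name to its row, e.g. attrs['Amy'] == {'type': 'Foo', 'gender': 'M'}
attrs = nodeData.set_index('name').to_dict('index')

# add_nodes_from accepts (node, attribute-dict) pairs: nodes already in G get
# their attributes updated, and nodes missing from the edge table are added
# as isolated nodes.
G.add_nodes_from(attrs.items())
```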


Source: https://habr.com/ru/post/1015407/

