Creating a Family Tree with Neo4J

Question

I have a dataset for a family tree in Neo4J, and I'm trying to build a Cypher query that produces a JSON dataset similar to the following:

{Name: "Bob", parents: [ {Name: "Roger", parents: [ {Name: "Robert"}, {Name: "Jessica"} ]}, {Name: "Susan", parents: [ {Name: "George"}, {Name: "Susan"} ]} ]}

My graph has a PARENT relationship between MEMBER nodes (i.e. MATCH (p:Member)-[:PARENT]->(c:Member)). I found the questions "Nested has_many relationships in cypher" and "neo4j cypher nested collect", but those approaches end up grouping all the parents together under the main child node I'm searching for.

Adding some clarity based on feedback:

Each member has a unique identifier. Unions are currently all connected with the PARENT relationship. Everything is indexed, so performance should not suffer. When I run a query that just returns the node graph, I get the results I expect. What I need is output that I can use for visualization with D3. Ideally this would be done with a Cypher query, since I'm using the API to reach Neo4j from the front end being built.

Adding an example query:

 MATCH (p:Person)-[:PARENT*1..5]->(c:Person) WHERE c.FirstName = 'Bob' RETURN p.FirstName, c.FirstName

This query returns a list of every ancestor for five generations, but instead of showing the hierarchy, it lists "Bob" as the child of each of them. Is there a Cypher query that would at least show each individual relationship in the data? I can format it as I need from there ...

+5

json neo4j

OpenDataAlex Jan 7 '15 at 20:01

4 answers

I would suggest writing a method to flatten your data into an array. If the objects don't have UUIDs, you will probably want to assign them IDs as you flatten, and then record the parent key(s) on each record.

Then you can run it as a set of Cypher queries (either as multiple requests to the REST API, or using the batch REST API), or alternatively dump the data to CSV and use Cypher's LOAD CSV command to load the objects.

An example Cypher command with parameters:

 CREATE (:Member {uuid: {uuid}, name: {name}})

Then go through the list again with the parent and child IDs:

 MATCH (m1:Member {uuid: {uuid1}}), (m2:Member {uuid: {uuid2}}) CREATE (m1)<-[:PARENT]-(m2)

Make sure there is an index on the ID for members!
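The flattening step described above could look roughly like the following Python sketch. It assumes the source data is a nested dict shaped like the JSON in the question (`Name` plus an optional `parents` list); the function name `flatten` and the use of random UUIDs are illustrative choices, not something the answer prescribes. The output keys are named to line up with the `{uuid}`, `{name}`, `{uuid1}` and `{uuid2}` parameters of the two Cypher commands.

```python
import uuid


def flatten(person, members=None, links=None, child_id=None):
    """Walk a nested {Name, parents} dict and collect flat records.

    Returns (members, links): one member row per person, and one link row
    per parent relationship, where uuid1 is the child and uuid2 the parent.
    """
    members = [] if members is None else members
    links = [] if links is None else links

    member_id = str(uuid.uuid4())  # assumption: no ID exists yet, so make one
    members.append({"uuid": member_id, "name": person["Name"]})
    if child_id is not None:
        links.append({"uuid1": child_id, "uuid2": member_id})

    for parent in person.get("parents", []):
        flatten(parent, members, links, member_id)
    return members, links


nested = {"Name": "Bob", "parents": [
    {"Name": "Roger", "parents": [{"Name": "Robert"}, {"Name": "Jessica"}]},
    {"Name": "Susan", "parents": [{"Name": "George"}, {"Name": "Susan"}]},
]}
members, links = flatten(nested)
```

Each row in `members` can then be sent as the parameters of the CREATE command, and each row in `links` as the parameters of the MATCH ... CREATE command. Because every person gets an ID of their own, the two different Susans stay separate nodes.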

+2

Brian Underwood Jan 7 '15 at 23:45

Genealogy data can follow the GEDCOM standard and include two types of nodes: Person and Union. A Person node has its identifier and the usual demographic facts. A Union node has a union_id and the facts about the union. In GEDCOM, the family is a third element that brings the two together, but in Neo4j I found it useful to also include the union_id in the Person nodes. I used five relationships: father, mother, husband, wife and child. A family is then two parents with an inward vector and each child with an outward vector. The image illustrates this.

This is very convenient for visualizing connections and forming hypotheses. For example, consider the attached picture of my ancestor Edward G. Campbell, a product of union 1917, where three marriages involved three sisters from union 8944 and two involved Gaiter sisters from union 2945. In addition, in the upper left corner, note how Mahal Campbell married her step-brother John Greer Armstrong. Next to Mahal is Elizabeth Campbell, who is married to another Campbell but is probably directly related to them. In the same way, you can form hypotheses about Rachel Jacobs in the upper right corner and how she might relate to the other Jacobs.

I use bulk inserts, which can populate ~30,000 Person nodes and ~100,000 relationships in just a minute. I have a small .NET function that returns JSON from a dataview; this generic solution works with any dataview, so it scales. I am now working on adding other data, such as locations (lat/long), documentation (in particular documents that link people, such as a census), etc.

+2

David A Stumpf Jan 08 '15 at 6:59

The only way I have found so far to get the data I'm looking for is to actually return the relationship information, for example:

 MATCH ft = (person {firstName: 'Bob'})<-[:PARENT*]-(p:Person) RETURN EXTRACT(n in nodes(ft) | {firstName: n.firstName}) as parentage ORDER BY length(ft);

which will return a dataset that I can then convert:

 ["Bob", "Roger"]
 ["Bob", "Susan"]
 ["Bob", "Roger", "Robert"]
 ["Bob", "Susan", "George"]
 ["Bob", "Roger", "Jessica"]
 ["Bob", "Susan", "Susan"]
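One possible way to do that conversion on the client side is sketched below in Python. It assumes each row has already been reduced to a list of first names starting at the searched-for person, as in the rows above; the function name `paths_to_tree` is hypothetical. Ancestors are matched by their position in the path rather than by name alone, so Susan's mother (also Susan) ends up as a separate node. It does assume that one person's parents have distinct names; with real data a unique member ID would be a safer key.

```python
import json


def paths_to_tree(paths):
    """Fold root-first ancestor paths into a nested {Name, parents} dict."""
    if not paths:
        return None
    root = {"Name": paths[0][0], "parents": []}
    for path in paths:
        if path[0] != root["Name"]:
            raise ValueError("all paths must start at the same person")
        node = root
        for name in path[1:]:
            # Reuse the branch if this ancestor was already added under `node`.
            match = next((p for p in node["parents"] if p["Name"] == name), None)
            if match is None:
                match = {"Name": name, "parents": []}
                node["parents"].append(match)
            node = match
    return root


rows = [
    ["Bob", "Roger"],
    ["Bob", "Susan"],
    ["Bob", "Roger", "Robert"],
    ["Bob", "Susan", "George"],
    ["Bob", "Roger", "Jessica"],
    ["Bob", "Susan", "Susan"],
]
tree = paths_to_tree(rows)
print(json.dumps(tree, indent=2))
```

The result has the same nesting as the JSON in the question, except that people with no known parents carry an empty `parents` list.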

0

OpenDataAlex Jan 9 '15 at 18:12

Michael Hunger · Accepted Answer · 2015-01-09T18:12:29+0000

You may also want to look at Rik van Bruggen's blog post about his family data.

Regarding your query

You already have a path pattern here: (p:Person)-[:PARENT*1..5]->(c:Person). You can assign it to a variable such as tree and then work with that variable, e.g. returning the tree itself, or nodes(tree), or rels(tree), or operating on those collections in other ways:

 MATCH tree = (p:Person)-[:PARENT*1..5]->(c:Person) WHERE c.FirstName = 'Bob' RETURN nodes(tree), rels(tree), tree, length(tree), [n in nodes(tree) | n.FirstName] as names
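Once the relationships are available on the client, they can be folded into the nested shape the question asks for. The Python sketch below assumes the caller has already pulled two things out of the query result: a mapping from a unique member ID to a first name, and a list of (parent ID, child ID) pairs, one per PARENT relationship in rels(tree). How those are extracted depends on the driver or REST client in use and is left out here; the function name `build_tree` and the sample IDs are hypothetical.

```python
def build_tree(root_id, names, edges):
    """Build a nested {Name, parents} dict from (parent_id, child_id) pairs."""
    parents_of = {}
    # Overlapping paths repeat the same relationship, so de-duplicate first.
    for parent_id, child_id in sorted(set(edges)):
        parents_of.setdefault(child_id, []).append(parent_id)

    def expand(node_id, seen):
        if node_id in seen:  # guard against cycles in bad data
            raise ValueError("cycle detected at member %r" % (node_id,))
        return {
            "Name": names[node_id],
            "parents": [expand(p, seen | {node_id})
                        for p in parents_of.get(node_id, [])],
        }

    return expand(root_id, frozenset())


# Hypothetical IDs; the two Susans are distinct members.
names = {1: "Bob", 2: "Roger", 3: "Susan", 4: "Robert",
         5: "Jessica", 6: "George", 7: "Susan"}
edges = [(2, 1), (3, 1), (4, 2), (2, 1), (5, 2), (3, 1), (6, 3), (7, 3)]
family = build_tree(1, names, edges)
```

Keying on the member ID rather than the first name keeps people who share a name apart, which is why the unique identifier mentioned in the question matters here.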

See also the Cypher reference card: http://neo4j.com/docs/stable/cypher-refcard and the online training: http://neo4j.com/online-training to learn more about Cypher.

Don't forget to create an index:

 create index on :Person(FirstName);
