Chapter 4 Retrieve data from the API
The API can be called either from the connection object, or with the call_neo4j() function.
The call_neo4j() function takes several arguments:
- query: the Cypher query
- con: the connection object
- type: “rows” or “graph”: whether to return the results as a list of tibbles, or as a graph object (with $nodes and $relationships)
- output: the output format (R or json)
- include_stats: whether or not to include the stats about the call
- meta: whether or not to include the meta arguments of the nodes when calling with “rows”
4.1 “rows” format
The user chooses whether or not to return a list of tibbles when calling the API. You get as many objects as are specified in the RETURN statement of the Cypher query.
library(magrittr)
'MATCH (tom {name: "Tom Hanks"}) RETURN tom;' %>%
call_neo4j(con)
## $tom
## # A tibble: 1 x 2
## born name
## <int> <chr>
## 1 1956 Tom Hanks
##
## attr(,"class")
## [1] "neo" "list"
By default, results are returned as an R list of tibbles. We think this is the most “truthful” way to represent the output of Neo4j calls.
For example, when you return two node types, you get two results in the form of two tibbles (like the one shown just above): the result is a two-element list, with each element named as specified in the Cypher query.
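To make the shape of such a result concrete, here is a minimal sketch that builds a two-element list by hand. It does not call the database: the query in the comment, the element name `movie`, and the use of base R data frames in place of tibbles are assumptions made for illustration, with values copied from the outputs in this chapter.

```r
# Hand-built stand-in for what a query such as
#   MATCH (tom:Person)-[:ACTED_IN]->(movie) RETURN tom, movie
# could return with type = "rows". Nothing is fetched from Neo4j here.
res <- list(
  tom = data.frame(born = 1956L, name = "Tom Hanks", stringsAsFactors = FALSE),
  movie = data.frame(
    tagline = "Everything is connected",
    title = "Cloud Atlas",
    released = 2012L,
    stringsAsFactors = FALSE
  )
)

# One element per variable named in the RETURN clause
names(res)

# Each element is reached through the name used in the Cypher query
res$tom$name
res[["movie"]]$title
```

Because the elements are named after the RETURN variables, renaming a variable with AS in the query also renames the corresponding list element.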
Results can also be returned in JSON:
'MATCH (cloudAtlas {title: "Cloud Atlas"}) RETURN cloudAtlas;' %>%
call_neo4j(con, output = "json")
## [
## [
## {
## "row": [
## {
## "tagline": ["Everything is connected"],
## "title": ["Cloud Atlas"],
## "released": [2012]
## }
## ],
## "meta": [
## {
## "id": [105],
## "type": ["node"],
## "deleted": [false]
## }
## ]
## }
## ]
## ]
This is useful, for example, for writing the results to a file:
tmp <- tempfile(fileext = ".json")
'MATCH (people:Person) RETURN people.name LIMIT 1' %>%
call_neo4j(con, output = "json") %>%
write(tmp)
jsonlite::read_json(tmp)
## [[1]]
## [[1]][[1]]
## [[1]][[1]]$row
## [[1]][[1]]$row[[1]]
## [[1]][[1]]$row[[1]][[1]]
## [1] "Keanu Reeves"
##
##
##
## [[1]][[1]]$meta
## [[1]][[1]]$meta[[1]]
## named list()
If you set the type argument to "graph", you’ll get a graph result:
'MATCH (tom:Person {name: "Tom Hanks"})-[act:ACTED_IN]->(tomHanksMovies) RETURN act,tom,tomHanksMovies' %>%
call_neo4j(con, type = "graph")
## $nodes
## # A tibble: 13 x 3
## id label properties
## <chr> <list> <list>
## 1 144 <chr [1]> <list [3]>
## 2 71 <chr [1]> <list [2]>
## 3 67 <chr [1]> <list [3]>
## 4 162 <chr [1]> <list [3]>
## 5 78 <chr [1]> <list [3]>
## 6 85 <chr [1]> <list [3]>
## 7 111 <chr [1]> <list [3]>
## 8 105 <chr [1]> <list [3]>
## 9 150 <chr [1]> <list [3]>
## 10 130 <chr [1]> <list [3]>
## 11 73 <chr [1]> <list [3]>
## 12 161 <chr [1]> <list [3]>
## 13 159 <chr [1]> <list [3]>
##
## $relationships
## # A tibble: 12 x 5
## id type startNode endNode properties
## <chr> <chr> <chr> <chr> <list>
## 1 202 ACTED_IN 71 144 <list [1]>
## 2 84 ACTED_IN 71 67 <list [1]>
## 3 234 ACTED_IN 71 162 <list [1]>
## 4 98 ACTED_IN 71 78 <list [1]>
## 5 110 ACTED_IN 71 85 <list [1]>
## 6 146 ACTED_IN 71 111 <list [1]>
## 7 137 ACTED_IN 71 105 <list [1]>
## 8 213 ACTED_IN 71 150 <list [1]>
## 9 182 ACTED_IN 71 130 <list [1]>
## 10 91 ACTED_IN 71 73 <list [1]>
## 11 232 ACTED_IN 71 161 <list [1]>
## 12 228 ACTED_IN 71 159 <list [1]>
##
## attr(,"class")
## [1] "neo" "list"
The result is returned with one node or one relationship per row.
Due to the specific data format of Neo4j, a node or a relationship can have more than one label and more than one property. That’s why the results are returned, by design, as a data frame with list-columns.
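The sketch below shows, in base R only, what a data frame with list-columns looks like and why it suits nodes that carry a varying number of labels and properties. The node with id "999" and its second label are invented for the illustration, and a plain data.frame stands in for the tibble that the package returns.

```r
# Hypothetical nodes table built by hand; no database call is made.
# Node "71" has one label, the invented node "999" has two.
nodes <- data.frame(id = c("71", "999"), stringsAsFactors = FALSE)
nodes$label <- list("Person", c("Person", "Director"))
nodes$properties <- list(
  list(born = 1956L, name = "Tom Hanks"),
  list(name = "Made-up Person")
)

# Number of labels and of properties held by each node
lengths(nodes$label)
lengths(nodes$properties)

# Pulling a single property out of the nested column;
# nodes that lack the property get NA
vapply(
  nodes$properties,
  function(p) if (is.null(p$born)) NA_integer_ else p$born,
  integer(1)
)
```

A flat column could hold only one value per node, which is why the nested form is kept until you explicitly unnest it.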
4.2 Parsing results
We have designed several functions to unnest the output:
unnest_nodes(), which unnests a node data frame:
res <- 'MATCH (tom:Person {name:"Tom Hanks"})-[a:ACTED_IN]->(m)<-[:ACTED_IN]-(coActors) RETURN m AS acted,coActors.name' %>%
call_neo4j(con, type = "graph")
unnest_nodes(res$nodes)
## # A tibble: 11 x 5
## id value tagline title released
## <chr> <chr> <chr> <chr> <int>
## 1 144 Movie Houston, we have a problem. Apollo 13 1995
## 2 67 Movie At odds in life... in love on-line. Youve Got M… 1998
## 3 162 Movie Once in a lifetime you get a chance t… A League of… 1992
## 4 78 Movie A story of love, lava and burning des… Joe Versus … 1990
## 5 85 Movie In every life there comes a time when… That Thing … 1996
## 6 111 Movie Break The Codes The Da Vinc… 2006
## 7 105 Movie Everything is connected Cloud Atlas 2012
## 8 150 Movie At the edge of the world, his journey… Cast Away 2000
## 9 130 Movie Walk a mile youll never forget. The Green M… 1999
## 10 73 Movie What if someone you never met, someon… Sleepless i… 1993
## 11 159 Movie A stiff drink. A little mascara. A lo… Charlie Wil… 2007
Note that this function will return NA for the properties that aren’t in a node. For example, here we have no ‘formed’ information for the record nodes.
In the long run, and this is not specific to {neo4r} but related to Neo4j, a good practice is to have a “name” property on each node, so that this column is always filled.
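The NA-filling behaviour described above can be sketched in base R as follows. This is not the neo4r implementation, only an illustration of the idea under the assumption that each node's properties arrive as a named list; the variable names `props`, `all_keys` and `flat` are invented for the example.

```r
# Two property lists with different keys, copied from the outputs in this
# chapter: one Person node and one Movie node.
props <- list(
  list(born = 1956L, name = "Tom Hanks"),
  list(title = "Cloud Atlas", released = 2012L)
)

# Every property name that appears on at least one node
all_keys <- unique(unlist(lapply(props, names)))

# One column per property name; a node that lacks the property gets NA
flat <- as.data.frame(
  lapply(setNames(all_keys, all_keys), function(k) {
    sapply(props, function(p) if (is.null(p[[k]])) NA else p[[k]])
  }),
  stringsAsFactors = FALSE
)
flat
```

The more heterogeneous the node types returned by a query, the more NA cells the flattened table contains, which is one reason a shared property such as “name” is convenient.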
It is also possible to unnest only the properties or only the labels:
res %>%
extract_nodes() %>%
unnest_nodes(what = "properties")
## # A tibble: 11 x 5
## id label tagline title released
## <chr> <list> <chr> <chr> <int>
## 1 144 <chr [… Houston, we have a problem. Apollo 13 1995
## 2 67 <chr [… At odds in life... in love on-line. Youve Got M… 1998
## 3 162 <chr [… Once in a lifetime you get a chance… A League of… 1992
## 4 78 <chr [… A story of love, lava and burning d… Joe Versus … 1990
## 5 85 <chr [… In every life there comes a time wh… That Thing … 1996
## 6 111 <chr [… Break The Codes The Da Vinc… 2006
## 7 105 <chr [… Everything is connected Cloud Atlas 2012
## 8 150 <chr [… At the edge of the world, his journ… Cast Away 2000
## 9 130 <chr [… Walk a mile youll never forget. The Green M… 1999
## 10 73 <chr [… What if someone you never met, some… Sleepless i… 1993
## 11 159 <chr [… A stiff drink. A little mascara. A … Charlie Wil… 2007
res %>%
extract_nodes() %>%
unnest_nodes(what = "label")
## # A tibble: 11 x 3
## id properties value
## <chr> <list> <chr>
## 1 144 <list [3]> Movie
## 2 67 <list [3]> Movie
## 3 162 <list [3]> Movie
## 4 78 <list [3]> Movie
## 5 85 <list [3]> Movie
## 6 111 <list [3]> Movie
## 7 105 <list [3]> Movie
## 8 150 <list [3]> Movie
## 9 130 <list [3]> Movie
## 10 73 <list [3]> Movie
## 11 159 <list [3]> Movie
unnest_relationships()
There is only one nested column in the relationship table, so this function is quite straightforward:
'MATCH (people:Person)-[relatedTo]-(:Movie {title: "Cloud Atlas"}) RETURN people.name, Type(relatedTo), relatedTo' %>%
call_neo4j(con, type = "graph") %>%
extract_relationships() %>%
unnest_relationships()
## # A tibble: 23 x 8
## id type startNode endNode roles value summary rating
## <chr> <chr> <chr> <chr> <list> <lgl> <chr> <int>
## 1 137 ACTED_IN 71 105 <chr [1]> NA <NA> NA
## 2 137 ACTED_IN 71 105 <chr [1]> NA <NA> NA
## 3 137 ACTED_IN 71 105 <chr [1]> NA <NA> NA
## 4 137 ACTED_IN 71 105 <chr [1]> NA <NA> NA
## 5 140 ACTED_IN 107 105 <chr [1]> NA <NA> NA
## 6 140 ACTED_IN 107 105 <chr [1]> NA <NA> NA
## 7 140 ACTED_IN 107 105 <chr [1]> NA <NA> NA
## 8 144 WROTE 109 105 <NULL> NA <NA> NA
## 9 141 DIRECTED 108 105 <NULL> NA <NA> NA
## 10 143 DIRECTED 6 105 <NULL> NA <NA> NA
## # … with 13 more rows
unnest_graph()
This function takes a graph result and applies both unnest_nodes() and unnest_relationships() to it.
'MATCH (people:Person)-[relatedTo]-(:Movie {title: "Cloud Atlas"}) RETURN people.name, Type(relatedTo), relatedTo' %>%
call_neo4j(con, type = "graph") %>%
unnest_graph()
## $nodes
## # A tibble: 11 x 7
## id value born name tagline title released
## <chr> <chr> <int> <chr> <chr> <chr> <int>
## 1 71 Person 1956 Tom Hanks <NA> <NA> NA
## 2 105 Movie NA <NA> Everything is conn… Cloud At… 2012
## 3 107 Person 1949 Jim Broadbent <NA> <NA> NA
## 4 109 Person 1969 David Mitchell <NA> <NA> NA
## 5 108 Person 1965 Tom Tykwer <NA> <NA> NA
## 6 6 Person 1965 Lana Wachowski <NA> <NA> NA
## 7 110 Person 1961 Stefan Arndt <NA> <NA> NA
## 8 169 Person NA Jessica Thomp… <NA> <NA> NA
## 9 106 Person 1966 Halle Berry <NA> <NA> NA
## 10 4 Person 1960 Hugo Weaving <NA> <NA> NA
## 11 5 Person 1967 Lilly Wachows… <NA> <NA> NA
##
## $relationships
## # A tibble: 23 x 8
## id type startNode endNode roles value summary rating
## <chr> <chr> <chr> <chr> <list> <lgl> <chr> <int>
## 1 137 ACTED_IN 71 105 <chr [1]> NA <NA> NA
## 2 137 ACTED_IN 71 105 <chr [1]> NA <NA> NA
## 3 137 ACTED_IN 71 105 <chr [1]> NA <NA> NA
## 4 137 ACTED_IN 71 105 <chr [1]> NA <NA> NA
## 5 140 ACTED_IN 107 105 <chr [1]> NA <NA> NA
## 6 140 ACTED_IN 107 105 <chr [1]> NA <NA> NA
## 7 140 ACTED_IN 107 105 <chr [1]> NA <NA> NA
## 8 144 WROTE 109 105 <NULL> NA <NA> NA
## 9 141 DIRECTED 108 105 <NULL> NA <NA> NA
## 10 143 DIRECTED 6 105 <NULL> NA <NA> NA
## # … with 13 more rows
##
## attr(,"class")
## [1] "neo" "list"
4.2.1 Extraction
There are two convenient functions to extract nodes and relationships:
'MATCH (bacon:Person {name:"Kevin Bacon"})-[*1..4]-(hollywood) RETURN DISTINCT hollywood' %>%
call_neo4j(con, type = "graph") %>%
extract_nodes()
## # A tibble: 135 x 3
## id label properties
## <chr> <list> <list>
## 1 72 <chr [1]> <list [2]>
## 2 68 <chr [1]> <list [2]>
## 3 54 <chr [1]> <list [2]>
## 4 34 <chr [1]> <list [2]>
## 5 70 <chr [1]> <list [2]>
## 6 69 <chr [1]> <list [2]>
## 7 67 <chr [1]> <list [3]>
## 8 163 <chr [1]> <list [2]>
## 9 166 <chr [1]> <list [2]>
## 10 77 <chr [1]> <list [2]>
## # … with 125 more rows
'MATCH p=shortestPath(
(bacon:Person {name:"Kevin Bacon"})-[*]-(meg:Person {name:"Meg Ryan"})
)
RETURN p' %>%
call_neo4j(con, type = "graph") %>%
extract_relationships()
## # A tibble: 4 x 5
## id type startNode endNode properties
## <chr> <chr> <chr> <chr> <list>
## 1 202 ACTED_IN 71 144 <list [1]>
## 2 203 ACTED_IN 19 144 <list [1]>
## 3 91 ACTED_IN 71 73 <list [1]>
## 4 92 ACTED_IN 34 73 <list [1]>
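Once nodes and relationships are extracted, the startNode and endNode columns can be matched against the id column of the nodes table. The sketch below does this in base R on a small hand-copied subset of the Cloud Atlas outputs shown earlier; the columns `from` and `to` are names chosen for the illustration, not something the package creates.

```r
# Hand-copied subset of the unnested nodes and relationships shown in this
# chapter; plain data frames stand in for tibbles.
nodes <- data.frame(
  id = c("71", "105", "107"),
  name = c("Tom Hanks", NA, "Jim Broadbent"),
  title = c(NA, "Cloud Atlas", NA),
  stringsAsFactors = FALSE
)
relationships <- data.frame(
  id = c("137", "140"),
  type = c("ACTED_IN", "ACTED_IN"),
  startNode = c("71", "107"),
  endNode = c("105", "105"),
  stringsAsFactors = FALSE
)

# Look up each endpoint id in the nodes table
relationships$from <- nodes$name[match(relationships$startNode, nodes$id)]
relationships$to <- nodes$title[match(relationships$endNode, nodes$id)]

relationships[, c("from", "type", "to")]
```

Note that the ids are character strings in both tables, so they can be compared directly without conversion.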