Chapter 4 Retrieve data from the API

The API can be called either from the connection object or with the call_neo4j() function.

The call_neo4j() function takes several arguments:

  • query: the Cypher query
  • con: the connection object
  • type: “rows” or “graph”: whether to return the results as a list of tibbles, or as a graph object (with $nodes and $relationships)
  • output: the output format (R or json)
  • include_stats: whether or not to include the stats about the call
  • meta: whether or not to include the meta arguments of the nodes when calling with “rows”

4.1 “rows” format

When calling the API, the user chooses whether or not to get the results back as a list of tibbles. The list holds as many objects as are specified in the Cypher RETURN statement.

library(magrittr)

'MATCH (tom {name: "Tom Hanks"}) RETURN tom;' %>%
  call_neo4j(con)
## $tom
## # A tibble: 1 x 2
##    born name     
##   <int> <chr>    
## 1  1956 Tom Hanks
## 
## attr(,"class")
## [1] "neo"  "list"

By default, results are returned as an R list of tibbles. We think this is the most “truthful” way to represent the output of a Neo4J call.

For example, when you return two node types, you get two results in the form of two tibbles (like the one shown just above): the result is a two-element list, with each element named as specified in the Cypher query.
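
To make the shape of such a result concrete, here is a minimal sketch that builds a stand-in for a two-element result by hand, without any database. The values are copied from the outputs in this chapter. Plain data frames are used instead of tibbles so that the snippet runs in base R, and the query quoted in the comment is only an illustration.

```r
# Hand-built stand-in for the result of a query such as
#   MATCH (tom {name: "Tom Hanks"}), (cloudAtlas {title: "Cloud Atlas"})
#   RETURN tom, cloudAtlas
# (assumption: the real call would return tibbles, not data frames)
res <- structure(
  list(
    tom = data.frame(born = 1956L, name = "Tom Hanks",
                     stringsAsFactors = FALSE),
    cloudAtlas = data.frame(tagline = "Everything is connected",
                            title = "Cloud Atlas", released = 2012L,
                            stringsAsFactors = FALSE)
  ),
  class = c("neo", "list")
)

# One element per name used in the RETURN clause
names(res)

# Each element is accessed like any other list element
res$tom$name
res[["cloudAtlas"]]$released
```

Because the elements are named after the variables in the RETURN clause, renaming a variable in the query (for example with AS) changes the name you use to access it in R.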

Results can also be returned in JSON:

'MATCH (cloudAtlas {title: "Cloud Atlas"}) RETURN cloudAtlas;' %>%
  call_neo4j(con, output = "json")
## [
##   [
##     {
##       "row": [
##         {
##           "tagline": ["Everything is connected"],
##           "title": ["Cloud Atlas"],
##           "released": [2012]
##         }
##       ],
##       "meta": [
##         {
##           "id": [105],
##           "type": ["node"],
##           "deleted": [false]
##         }
##       ]
##     }
##   ]
## ]

This is useful, for example, for writing the results to a file:

tmp <- tempfile(fileext = ".json")
'MATCH (people:Person) RETURN people.name LIMIT 1' %>%
  call_neo4j(con, output = "json") %>%
  write(tmp)
jsonlite::read_json(tmp)
## [[1]]
## [[1]][[1]]
## [[1]][[1]]$row
## [[1]][[1]]$row[[1]]
## [[1]][[1]]$row[[1]][[1]]
## [1] "Keanu Reeves"
## 
## 
## 
## [[1]][[1]]$meta
## [[1]][[1]]$meta[[1]]
## named list()
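
The JSON output can also be navigated directly in R once it has been parsed. The sketch below passes the Cloud Atlas JSON shown earlier to jsonlite::fromJSON() as a literal string, so no database is needed. It assumes {jsonlite} is installed, and it sets simplifyVector = FALSE to keep the same nested-list shape that read_json() produces above.

```r
library(jsonlite)

# The JSON returned for the Cloud Atlas query, written as a literal string
json_txt <- '[[{
  "row":  [{"tagline": ["Everything is connected"],
            "title": ["Cloud Atlas"],
            "released": [2012]}],
  "meta": [{"id": [105], "type": ["node"], "deleted": [false]}]
}]]'

parsed <- fromJSON(json_txt, simplifyVector = FALSE)

# parsed[[1]][[1]] is the single record; its row element holds the properties
movie <- parsed[[1]][[1]]$row[[1]]
movie$title[[1]]

# The meta element carries the node id and type
meta <- parsed[[1]][[1]]$meta[[1]]
meta$type[[1]]
```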

If you set the type argument to "graph", you’ll get a graph result:

'MATCH (tom:Person {name: "Tom Hanks"})-[act:ACTED_IN]->(tomHanksMovies) RETURN act,tom,tomHanksMovies' %>%
  call_neo4j(con, type = "graph")
## $nodes
## # A tibble: 13 x 3
##    id    label     properties
##    <chr> <list>    <list>    
##  1 144   <chr [1]> <list [3]>
##  2 71    <chr [1]> <list [2]>
##  3 67    <chr [1]> <list [3]>
##  4 162   <chr [1]> <list [3]>
##  5 78    <chr [1]> <list [3]>
##  6 85    <chr [1]> <list [3]>
##  7 111   <chr [1]> <list [3]>
##  8 105   <chr [1]> <list [3]>
##  9 150   <chr [1]> <list [3]>
## 10 130   <chr [1]> <list [3]>
## 11 73    <chr [1]> <list [3]>
## 12 161   <chr [1]> <list [3]>
## 13 159   <chr [1]> <list [3]>
## 
## $relationships
## # A tibble: 12 x 5
##    id    type     startNode endNode properties
##    <chr> <chr>    <chr>     <chr>   <list>    
##  1 202   ACTED_IN 71        144     <list [1]>
##  2 84    ACTED_IN 71        67      <list [1]>
##  3 234   ACTED_IN 71        162     <list [1]>
##  4 98    ACTED_IN 71        78      <list [1]>
##  5 110   ACTED_IN 71        85      <list [1]>
##  6 146   ACTED_IN 71        111     <list [1]>
##  7 137   ACTED_IN 71        105     <list [1]>
##  8 213   ACTED_IN 71        150     <list [1]>
##  9 182   ACTED_IN 71        130     <list [1]>
## 10 91    ACTED_IN 71        73      <list [1]>
## 11 232   ACTED_IN 71        161     <list [1]>
## 12 228   ACTED_IN 71        159     <list [1]>
## 
## attr(,"class")
## [1] "neo"  "list"

Each row of the result holds one node or one relationship.

Due to the specific data format of Neo4J, there can be more than one label and more than one property per node and per relationship. That’s why the results are returned, by design, as a dataframe with list-columns.
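
To see why list-columns are needed, the following base R sketch builds a small node table by hand. The second label on the movie node ("Favourite") is a made-up value for illustration, and a plain data frame stands in for the tibble that call_neo4j() returns.

```r
nodes <- data.frame(id = c("71", "105"), stringsAsFactors = FALSE)

# A node can carry several labels, so each cell is a character vector
nodes$label <- list(
  "Person",
  c("Movie", "Favourite")  # "Favourite" is a hypothetical extra label
)

# Nodes of different kinds carry different properties, so each cell is a list
nodes$properties <- list(
  list(born = 1956L, name = "Tom Hanks"),
  list(tagline = "Everything is connected", title = "Cloud Atlas",
       released = 2012L)
)

# Number of labels and of properties per node
lengths(nodes$label)
lengths(nodes$properties)
```

A flat table with one column per property would not fit this data without either dropping information or padding with missing values, which is what the unnesting functions of the next section deal with.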

4.2 Parsing results

We have designed several functions to unnest the output:

  • unnest_nodes(), which unnests a node dataframe:
res <- 'MATCH (tom:Person {name:"Tom Hanks"})-[a:ACTED_IN]->(m)<-[:ACTED_IN]-(coActors) RETURN m AS acted,coActors.name' %>%
  call_neo4j(con, type = "graph")
unnest_nodes(res$nodes)
## # A tibble: 11 x 5
##    id    value tagline                                title        released
##    <chr> <chr> <chr>                                  <chr>           <int>
##  1 144   Movie Houston, we have a problem.            Apollo 13        1995
##  2 67    Movie At odds in life... in love on-line.    Youve Got M…     1998
##  3 162   Movie Once in a lifetime you get a chance t… A League of…     1992
##  4 78    Movie A story of love, lava and burning des… Joe Versus …     1990
##  5 85    Movie In every life there comes a time when… That Thing …     1996
##  6 111   Movie Break The Codes                        The Da Vinc…     2006
##  7 105   Movie Everything is connected                Cloud Atlas      2012
##  8 150   Movie At the edge of the world, his journey… Cast Away        2000
##  9 130   Movie Walk a mile youll never forget.        The Green M…     1999
## 10 73    Movie What if someone you never met, someon… Sleepless i…     1993
## 11 159   Movie A stiff drink. A little mascara. A lo… Charlie Wil…     2007

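Note that this function returns NA for the properties that a node doesn’t have. For example, in the unnest_graph() output later in this section, the Movie node has no ‘born’ or ‘name’ value, and the Person nodes have no ‘tagline’, ‘title’ or ‘released’ value.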

In the long run, and this is not {neo4r} specific but Neo4J related, a good practice is to have a “name” property on each node, so that this column is filled for every node.
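
The NA filling described above can be pictured with a simplified base R sketch. This is not the implementation of unnest_nodes(), only an illustration of the idea on two hand-written property lists: every property name found on any node becomes a column, and nodes lacking that property get NA.

```r
# Property lists of two nodes of different kinds
props <- list(
  list(born = 1956L, name = "Tom Hanks"),
  list(tagline = "Everything is connected", title = "Cloud Atlas",
       released = 2012L)
)

# Union of all property names across nodes
all_keys <- unique(unlist(lapply(props, names)))

# One column per property; NA where a node lacks that property
flat <- as.data.frame(
  lapply(setNames(all_keys, all_keys), function(key) {
    sapply(props, function(p) if (is.null(p[[key]])) NA else p[[key]])
  }),
  stringsAsFactors = FALSE
)

flat
```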

It is also possible to unnest only the properties or only the labels:

res %>%
  extract_nodes() %>%
  unnest_nodes(what = "properties")
## # A tibble: 11 x 5
##    id    label   tagline                              title        released
##    <chr> <list>  <chr>                                <chr>           <int>
##  1 144   <chr [… Houston, we have a problem.          Apollo 13        1995
##  2 67    <chr [… At odds in life... in love on-line.  Youve Got M…     1998
##  3 162   <chr [… Once in a lifetime you get a chance… A League of…     1992
##  4 78    <chr [… A story of love, lava and burning d… Joe Versus …     1990
##  5 85    <chr [… In every life there comes a time wh… That Thing …     1996
##  6 111   <chr [… Break The Codes                      The Da Vinc…     2006
##  7 105   <chr [… Everything is connected              Cloud Atlas      2012
##  8 150   <chr [… At the edge of the world, his journ… Cast Away        2000
##  9 130   <chr [… Walk a mile youll never forget.      The Green M…     1999
## 10 73    <chr [… What if someone you never met, some… Sleepless i…     1993
## 11 159   <chr [… A stiff drink. A little mascara. A … Charlie Wil…     2007
res %>%
  extract_nodes() %>%
  unnest_nodes(what = "label")
## # A tibble: 11 x 3
##    id    properties value
##    <chr> <list>     <chr>
##  1 144   <list [3]> Movie
##  2 67    <list [3]> Movie
##  3 162   <list [3]> Movie
##  4 78    <list [3]> Movie
##  5 85    <list [3]> Movie
##  6 111   <list [3]> Movie
##  7 105   <list [3]> Movie
##  8 150   <list [3]> Movie
##  9 130   <list [3]> Movie
## 10 73    <list [3]> Movie
## 11 159   <list [3]> Movie
  • unnest_relationships()

There is only one nested column in the relationship table, so the function is quite straightforward:

'MATCH (people:Person)-[relatedTo]-(:Movie {title: "Cloud Atlas"}) RETURN people.name, Type(relatedTo), relatedTo' %>%
  call_neo4j(con, type = "graph") %>%
  extract_relationships() %>%
  unnest_relationships()
## # A tibble: 23 x 8
##    id    type     startNode endNode roles     value summary rating
##    <chr> <chr>    <chr>     <chr>   <list>    <lgl> <chr>    <int>
##  1 137   ACTED_IN 71        105     <chr [1]> NA    <NA>        NA
##  2 137   ACTED_IN 71        105     <chr [1]> NA    <NA>        NA
##  3 137   ACTED_IN 71        105     <chr [1]> NA    <NA>        NA
##  4 137   ACTED_IN 71        105     <chr [1]> NA    <NA>        NA
##  5 140   ACTED_IN 107       105     <chr [1]> NA    <NA>        NA
##  6 140   ACTED_IN 107       105     <chr [1]> NA    <NA>        NA
##  7 140   ACTED_IN 107       105     <chr [1]> NA    <NA>        NA
##  8 144   WROTE    109       105     <NULL>    NA    <NA>        NA
##  9 141   DIRECTED 108       105     <NULL>    NA    <NA>        NA
## 10 143   DIRECTED 6         105     <NULL>    NA    <NA>        NA
## # … with 13 more rows
  • unnest_graph()

This function takes a graph result and applies both unnest_nodes() and unnest_relationships() to it.

'MATCH (people:Person)-[relatedTo]-(:Movie {title: "Cloud Atlas"}) RETURN people.name, Type(relatedTo), relatedTo' %>%
  call_neo4j(con, type = "graph") %>%
  unnest_graph()
## $nodes
## # A tibble: 11 x 7
##    id    value   born name           tagline             title     released
##    <chr> <chr>  <int> <chr>          <chr>               <chr>        <int>
##  1 71    Person  1956 Tom Hanks      <NA>                <NA>            NA
##  2 105   Movie     NA <NA>           Everything is conn… Cloud At…     2012
##  3 107   Person  1949 Jim Broadbent  <NA>                <NA>            NA
##  4 109   Person  1969 David Mitchell <NA>                <NA>            NA
##  5 108   Person  1965 Tom Tykwer     <NA>                <NA>            NA
##  6 6     Person  1965 Lana Wachowski <NA>                <NA>            NA
##  7 110   Person  1961 Stefan Arndt   <NA>                <NA>            NA
##  8 169   Person    NA Jessica Thomp… <NA>                <NA>            NA
##  9 106   Person  1966 Halle Berry    <NA>                <NA>            NA
## 10 4     Person  1960 Hugo Weaving   <NA>                <NA>            NA
## 11 5     Person  1967 Lilly Wachows… <NA>                <NA>            NA
## 
## $relationships
## # A tibble: 23 x 8
##    id    type     startNode endNode roles     value summary rating
##    <chr> <chr>    <chr>     <chr>   <list>    <lgl> <chr>    <int>
##  1 137   ACTED_IN 71        105     <chr [1]> NA    <NA>        NA
##  2 137   ACTED_IN 71        105     <chr [1]> NA    <NA>        NA
##  3 137   ACTED_IN 71        105     <chr [1]> NA    <NA>        NA
##  4 137   ACTED_IN 71        105     <chr [1]> NA    <NA>        NA
##  5 140   ACTED_IN 107       105     <chr [1]> NA    <NA>        NA
##  6 140   ACTED_IN 107       105     <chr [1]> NA    <NA>        NA
##  7 140   ACTED_IN 107       105     <chr [1]> NA    <NA>        NA
##  8 144   WROTE    109       105     <NULL>    NA    <NA>        NA
##  9 141   DIRECTED 108       105     <NULL>    NA    <NA>        NA
## 10 143   DIRECTED 6         105     <NULL>    NA    <NA>        NA
## # … with 13 more rows
## 
## attr(,"class")
## [1] "neo"  "list"
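
In the output above, startNode and endNode in the relationships table hold ids from the nodes table, so the two tables can be combined. The sketch below does this in base R on a few rows copied by hand from the outputs of this chapter. The from and to column names, and the display vector, are chosen here for illustration only.

```r
# A few unnested nodes and relationships, copied by hand
nodes <- data.frame(
  id = c("71", "105", "107"),
  value = c("Person", "Movie", "Person"),
  name = c("Tom Hanks", NA, "Jim Broadbent"),
  title = c(NA, "Cloud Atlas", NA),
  stringsAsFactors = FALSE
)
rels <- data.frame(
  id = c("137", "140"),
  type = c("ACTED_IN", "ACTED_IN"),
  startNode = c("71", "107"),
  endNode = c("105", "105"),
  stringsAsFactors = FALSE
)

# Use the name when present, otherwise the title, as a display string
display <- ifelse(is.na(nodes$name), nodes$title, nodes$name)

# Look up each relationship endpoint in the nodes table by id
rels$from <- display[match(rels$startNode, nodes$id)]
rels$to   <- display[match(rels$endNode, nodes$id)]

rels[, c("from", "type", "to")]
```

The need for a fallback from name to title in this sketch is a small instance of the earlier remark about giving every node a “name” property.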

4.2.1 Extraction

There are two convenience functions to extract nodes and relationships, extract_nodes() and extract_relationships():

'MATCH (bacon:Person {name:"Kevin Bacon"})-[*1..4]-(hollywood) RETURN DISTINCT hollywood' %>%
  call_neo4j(con, type = "graph") %>% 
  extract_nodes()
## # A tibble: 135 x 3
##    id    label     properties
##    <chr> <list>    <list>    
##  1 72    <chr [1]> <list [2]>
##  2 68    <chr [1]> <list [2]>
##  3 54    <chr [1]> <list [2]>
##  4 34    <chr [1]> <list [2]>
##  5 70    <chr [1]> <list [2]>
##  6 69    <chr [1]> <list [2]>
##  7 67    <chr [1]> <list [3]>
##  8 163   <chr [1]> <list [2]>
##  9 166   <chr [1]> <list [2]>
## 10 77    <chr [1]> <list [2]>
## # … with 125 more rows
'MATCH p=shortestPath(
  (bacon:Person {name:"Kevin Bacon"})-[*]-(meg:Person {name:"Meg Ryan"})
)
RETURN p' %>%
  call_neo4j(con, type = "graph") %>% 
  extract_relationships()
## # A tibble: 4 x 5
##   id    type     startNode endNode properties
##   <chr> <chr>    <chr>     <chr>   <list>    
## 1 202   ACTED_IN 71        144     <list [1]>
## 2 203   ACTED_IN 19        144     <list [1]>
## 3 91    ACTED_IN 71        73      <list [1]>
## 4 92    ACTED_IN 34        73      <list [1]>