Monthly Archives: December 2012

Some notes about the new features of neo4j-rest-client

This is the first time I have written in technical terms about one of my own developments. But first, a bit of history. Back in December 2009, I came across the Neo4j database. Neo4j was one of the first graph databases with serious uses in the real world. I was amazed at how easy it was to create nodes and edges (they call them relationships, an even more intuitive name). But, like everything else in the graph world, it was written in Java with no bindings for other languages, except a very basic Python binding, neo4j.py. A couple of months later they released the standalone REST server and, because getting neo4j.py to work was really hard for pure Python coders, I decided to write a client-side library. That’s how neo4j-rest-client was born. It was a really basic tool, but it started to grow and grow, and the following year the first packaged version was released on the Python Package Index.

Since then everything has improved, both the Neo4j REST API and the Python community around it. The Neo4j guys finally deprecated neo4j.py and released a new python-embedded client, also based on the Java runtime, while other alternatives appeared on the scene: bulbflow, neo4django, or the newest, py2neo, for example. However, neo4j-rest-client was always as low level as possible: it didn’t manage caching, lazy loading, or delayed requests, to name just a few. But when the Cypher plugin, the preferred method for querying the graph in Neo4j, became part of the core, I decided to implement some cool features based on it.

The first thing was to have better iterables for objects, as well as laziness in loads and requests. I implemented a way to query the graph database using Cypher while taking advantage of the existing neo4j-rest-client objects such as Node or Relationship. So, every time you make a query, you can get the objects exactly as returned by the server, which is called the RAW response, by using the `constants.RAW` option in the `returns` parameter of the `query` method of `GraphDatabase` objects.


from neo4jrestclient.client import GraphDatabase
from neo4jrestclient.constants import RAW

gdb = GraphDatabase("http://localhost:7474/db/data/")

q = "START n=node(*) RETURN ID(n), n.name"
params = {}
gdb.query(q, params=params, returns=RAW)

Or you can use `params` to pass parameters to your query safely.


q = "START n=node({nodes}) RETURN ID(n), n.name"
params = {"nodes": [1, 2, 3, 4, 5]}

gdb.query(q, params=params, returns=RAW)

Regardless of how you define your query, the last line can omit `returns` when the value is `RAW`, but the real power of this parameter is that you can pass casting functions to format the results.

from neo4jrestclient.client import Node

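q = "START n=node({nodes}) RETURN ID(n), n.name!, n"
params = {"nodes": [1, 2, 3, 4, 5]}  # the query references {nodes}
returns = (int, unicode, Node)  # one casting function per returned column
gdb.query(q, params=params, returns=returns)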

Or you can even create your own casting function, which is really useful when using nullable properties, referenced in Cypher as `?` and `!`.


from neo4jrestclient.client import Node

def my_custom_casting(val):
    try:
        return unicode(val)
    except Exception:  # Never use a bare except: it also catches KeyboardInterrupt
        return val

q = "START n=node({nodes}) RETURN ID(n), n.name!, n"
params = {"nodes": [1, 2, 3, 4, 5]}  # the query references {nodes}
returns = (int, my_custom_casting, Node)
gdb.query(q, params=params, returns=returns)

Now I can be sure that if the name of a node is not present, a proper RAW value will be returned. But what happens if the number of columns doesn’t match the number of casting functions passed? Nothing: the remaining elements will be returned as RAW, as usual. Nice graceful degradation 😀
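To make that fallback behaviour concrete, here is a minimal sketch in plain Python of per-column casting where any column without a casting function stays raw. It does not touch the library: `cast_row` and `text_or_raw` are hypothetical names of my own, and the sample row is made up, so take it as an illustration of the idea rather than the actual implementation.

```python
def cast_row(row, casters=()):
    """Apply one casting function per column; extra columns stay raw."""
    result = []
    for position, value in enumerate(row):
        if position < len(casters):
            result.append(casters[position](value))
        else:
            # No casting function for this column: keep the raw value.
            result.append(value)
    return result


def text_or_raw(value):
    """Cast to text, but leave a missing (None) property untouched."""
    return value if value is None else str(value)


# Hypothetical row: an id, a missing name, and a raw node representation.
row = ["7", None, {"self": "http://localhost:7474/db/data/node/7"}]
print(cast_row(row, (int, text_or_raw)))
```

Only two casting functions are given for three columns, so the third column comes back exactly as it went in.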

On the other hand, building on the new query feature, I implemented some filtering helpers that could eventually replace the Lucene query method used so far. The star here is the `Q` object.


from neo4jrestclient.query import Q

The syntax, borrowed from Django and inspired by lucene-querybuilder, is as follows:


Q(property_name, lookup=value_to_match, [nullable])

The `nullable` option can be `True` (the default), `False` or `None`, and sets the behaviour of Cypher when an element doesn’t have the queried property. In a real example, it looks like this:


lookup = Q("name", istartswith="william")
williams = gdb.nodes.filter(lookup)

The complete list of lookup options is in the documentation. And lookups can be as complicated as you want.


lookups = (
    Q("name", exact="James")
    & (Q("surname", startswith="Smith")
       | ~Q("surname", endswith="e"))
)
nodes = gdb.nodes.filter(lookups)
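If you are curious about how `&`, `|` and `~` can be made to compose lookups, the following toy class shows the general operator-overloading technique. It is only a sketch: `Lookup` is a hypothetical stand-in and not the real `Q`, and the text it builds is a readable pseudo-expression, not the Cypher that neo4j-rest-client actually generates.

```python
class Lookup:
    """Toy stand-in for a Q-like object; NOT the neo4jrestclient class."""

    def __init__(self, prop, **lookup):
        if len(lookup) != 1:
            raise ValueError("expected exactly one lookup=value pair")
        (kind, value), = lookup.items()
        # Readable pseudo-expression, e.g. exact(name, 'James').
        self.text = "{0}({1}, {2!r})".format(kind, prop, value)

    @classmethod
    def _combined(cls, text):
        # Build a composite lookup without going through __init__.
        node = cls.__new__(cls)
        node.text = text
        return node

    def __and__(self, other):
        return self._combined("({0} AND {1})".format(self.text, other.text))

    def __or__(self, other):
        return self._combined("({0} OR {1})".format(self.text, other.text))

    def __invert__(self):
        return self._combined("NOT {0}".format(self.text))


lookups = (
    Lookup("name", exact="James")
    & (Lookup("surname", startswith="Smith")
       | ~Lookup("surname", endswith="e"))
)
print(lookups.text)
```

Because every operator returns a new object of the same kind, the expressions nest to any depth, and Python's own precedence rules (`~` before `&` before `|`) plus parentheses decide the grouping.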

The `filter` method, added to nodes and relationships, can take an extra argument, `start`, to set the `START` clause instead of using all the nodes or relationships (`node(*)`). The `start` parameter can be a mixed list of integers and Node objects, a mixed list of integers and Relationship objects, or an Index object.


n1 = gdb.nodes.create()
start = [1, 2, 3, n1]
lookup = Q("name", istartswith="william")
nodes = gdb.nodes.filter(lookup, start=start)

index = gdb.nodes.indexes.create(name="williams")
index["name"]["w"] = n1
nodes = gdb.nodes.filter(lookup, start=index)
nodes = gdb.nodes.filter(lookup, start=index["name"])

Or using just the index:


nodes = index.filter(lookup, key="name", value="w")

Also, all filtering functions support lazy loading when slicing, so you can safely take slices of huge graph databases, because internally the `skip` and `limit` Cypher options are applied before the query is run.
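The idea behind lazy slicing can be sketched in a few lines: nothing is sent until a slice is requested, and the slice bounds are turned into `SKIP` and `LIMIT` so the server only returns that window. `LazyResults` and `fake_run` below are hypothetical names for illustration, and the sketch assumes the base query has no `SKIP` or `LIMIT` of its own; the library's internals may well differ.

```python
class LazyResults:
    """Toy lazy result set: a slice becomes SKIP/LIMIT on the query."""

    def __init__(self, run_query, base_query):
        self._run_query = run_query    # callable that sends the query
        self._base_query = base_query  # nothing is sent at this point

    def __getitem__(self, key):
        if not isinstance(key, slice):
            raise TypeError("this sketch only supports slices")
        if key.step not in (None, 1):
            raise ValueError("steps are not supported")
        start = key.start or 0
        if start < 0 or (key.stop is not None and key.stop < 0):
            raise ValueError("negative indices are not supported")
        query = self._base_query
        if start:
            query += " SKIP {0}".format(start)
        if key.stop is not None:
            query += " LIMIT {0}".format(max(key.stop - start, 0))
        # Only now is the (already windowed) query actually sent.
        return self._run_query(query)


sent = []


def fake_run(query):
    """Stand-in for a real request: record the query, return no rows."""
    sent.append(query)
    return []


results = LazyResults(fake_run, "START n=node(*) RETURN n")
results[20:30]
print(sent[-1])
```

The point of the design is that the database does the windowing, so slicing a result set with millions of nodes transfers only the requested rows.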

Finally, a word about the ordering method, which lets you order ascending (the default) or descending just by chaining calls.


from neo4jrestclient.constants import DESC

nodes = gdb.nodes.filter(lookup)[:100]
nodes.order_by("name", DESC).order_by("age")
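Chaining works because each call can return an object that remembers the ordering terms seen so far. Here is a rough sketch of that pattern; `Ordering`, its `clause` method and the `ASC`/`DESC` strings are my own hypothetical stand-ins, not the library's classes or constants.

```python
ASC, DESC = "ASC", "DESC"  # hypothetical stand-ins for the library constants


class Ordering:
    """Toy accumulator of ordering terms that supports chained calls."""

    def __init__(self, terms=()):
        self.terms = tuple(terms)

    def order_by(self, prop, direction=ASC):
        # Return a new object so earlier results are never mutated.
        return Ordering(self.terms + ((prop, direction),))

    def clause(self, identifier="n"):
        if not self.terms:
            return ""
        parts = ["{0}.{1} {2}".format(identifier, prop, direction)
                 for prop, direction in self.terms]
        return "ORDER BY " + ", ".join(parts)


print(Ordering().order_by("name", DESC).order_by("age").clause())
```

The order of the calls is the order of the sort keys, so the first `order_by` is the primary key and later ones break ties.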

And that’s all. Let’s see what the future has prepared for Neo4j and the Python neo4j-rest-client!
