Datalog in miniKanren (deosjr.github.io)
82 points by deosjr 11 hours ago | 8 comments
deosjr 11 hours ago [-]
Seems like interest in Datalog is high this week, so I thought I'd share a write-up of a minimal Datalog implementation I did a while ago.

Runs in the browser using Hoot (https://spritely.institute/hoot/) which compiles Guile Scheme to WebAssembly.

upghost 4 hours ago [-]
Datalog is a syntactic subset of Prolog[1], which this is... not.

I think the most misunderstood thing about Prolog (and Datalog, the functor-free subset of pure Prolog) is that the syntax is really, really important.

It's like, the whole gimmick of the language. It is designed to efficiently and elegantly query and transform itself. If you lose the syntax you lose all of intermediate and advanced Prolog (and Datalog).

[1]: https://en.m.wikipedia.org/wiki/Datalog

j-pb 1 hour ago [-]
Most database literature simply uses Datalog to mean the query language fragment of conjunctive queries + recursion/fixpoint-iteration and potentially stratified negation.

Yes, it started out as a Prolog subset, but defining it by the fragments it supports has become much more prevalent, mainly to contrast it with non-recursive fragments with arbitrary negation (e.g. SQL).

This usage dates back to the database literature of the 80s by Ullman et al.
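
To make "conjunctive queries + recursion/fixpoint-iteration" concrete, here is a minimal sketch of naive bottom-up evaluation in Python. It is not taken from the linked implementation. The edge facts are made up, and the base rule `reachable(X, Y) :- edge(X, Y)` is an assumption (the usual one for transitive closure), since the thread only quotes the recursive rule.

```python
# Naive bottom-up evaluation of two rules:
#   reachable(X, Y) :- edge(X, Y).                    (assumed base rule)
#   reachable(X, Y) :- edge(X, Z), reachable(Z, Y).   (rule quoted in the thread)

def reachable_fixpoint(edges):
    """Apply both rules repeatedly until no new fact is derived."""
    reachable = set(edges)  # base rule: every edge is a reachable fact
    while True:
        # Recursive rule: join edge(X, Z) with reachable(Z, Y) on Z.
        derived = {
            (x, y)
            for (x, z) in edges
            for (z2, y) in reachable
            if z == z2
        }
        if derived <= reachable:
            return reachable  # fixpoint: this round added nothing new
        reachable |= derived


# Made-up illustration data: a cycle a -> b -> c -> a plus c -> d.
edges = {("a", "b"), ("b", "c"), ("c", "a"), ("c", "d")}
facts = reachable_fixpoint(edges)

# The conjunctive query reachable(Id, Id): nodes that reach themselves.
self_reachable = sorted(x for (x, y) in facts if x == y)
```

Real engines typically use semi-naive evaluation, which only joins against facts that were new in the previous round; the naive loop above recomputes everything each time but reaches the same fixpoint.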

kragen 3 hours ago [-]
Semantics are more important than syntax. Prolog's flexible syntax is a nice-to-have rather than essential when you're in Lisp. And Datalog is purely first-order, so the advanced Prolog you're talking about doesn't exist in it.

However, syntax does matter, and this is not acceptable

    (dl-find 
     (fresh-vars 1 
      (lambda (?id) 
       (dl-findo dl
        ((,?id reachable ,?id))))))
as a way to ask

    reachable(Id, Id).
I think you could, however, write a bit more Scheme and be able to ask

    (?id reachable ?id)
which would be acceptable.

However, the ordering betrays a deeper semantic difference with orthodox Datalog, which is about distinct N-ary relations, like a relational database, not binary relations. This implementation seems to be specific to binary relations, so it's not really Datalog for reasons that go beyond mere syntax.
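
A rough sketch of that difference, in Python rather than Scheme, with a made-up `employee` relation (none of this comes from the linked implementation). With N-ary relations one fact is one tuple; with binary relations or triples, each row needs an invented identifier and the query turns into a self-join on it.

```python
# N-ary style: relation name -> set of tuples, any width.
# employee(Name, Dept, Salary); the data is invented for illustration.
nary = {"employee": {("alice", "eng", 120), ("bob", "ops", 95)}}

# Triple style: (subject, predicate, object). Each row needs a made-up
# identifier ("e1", "e2") to tie its attributes together.
triples = {
    ("e1", "name", "alice"), ("e1", "dept", "eng"), ("e1", "salary", 120),
    ("e2", "name", "bob"), ("e2", "dept", "ops"), ("e2", "salary", 95),
}


def eng_salaries_nary(db):
    # Roughly: answer(Name, Pay) :- employee(Name, eng, Pay).
    return {(name, pay) for (name, dept, pay) in db["employee"] if dept == "eng"}


def eng_salaries_triples(ts):
    # The same question needs a three-way self-join on the row identifier.
    return {
        (name, pay)
        for (row, attr1, name) in ts if attr1 == "name"
        for (row2, attr2, dept) in ts
        if attr2 == "dept" and row2 == row and dept == "eng"
        for (row3, attr3, pay) in ts
        if attr3 == "salary" and row3 == row
    }
```

Both functions return the same answer for this data; the point is only that the triple encoding pushes the row structure into extra joins.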

On the other hand, this (from the initial goal) would be perfectly fine:

    (dl-rule! dl (reachable ,?x ,?y) :- 
                     (edge ,?x ,?z) (reachable ,?z ,?y))
The orthodox Datalog syntax is:

    reachable(X, Y) :- edge(X, Z), reachable(Z, Y).

jitl 2 hours ago [-]
Shouldn’t Lisp macros make it easy to present such a nice syntax? Perhaps the author could easily implement that bit, if not the wide rows. Or is that the point you’re making?

There is a dl-rule here: https://github.com/deosjr/deosjr.github.io/blob/15b5f7e02153...

kragen 2 hours ago [-]
I don't think you need Lisp macros for it; you could use just a regular Lisp function. I don't think the standard R5RS macros are powerful enough to grovel over the query expression to make a list of the free variables, but then, standard Scheme also doesn't have records. I think Guile has a procedural macro system that you could use, but I don't think it would be a good idea.
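
As a sketch of the "regular function" approach, here it is in Python standing in for Scheme: the query is plain data, and an ordinary function walks it to collect the `?`-prefixed variables, so the caller never declares them. The names `is_var`, `free_vars`, `match` and `find` are hypothetical and are not the linked implementation's API.

```python
def is_var(term):
    """A variable is any string starting with '?' (a convention assumed here)."""
    return isinstance(term, str) and term.startswith("?")


def free_vars(expr):
    """Collect variables in first-occurrence order from nested tuples/lists."""
    seen = []

    def walk(node):
        if isinstance(node, (tuple, list)):
            for child in node:
                walk(child)
        elif is_var(node) and node not in seen:
            seen.append(node)

    walk(expr)
    return seen


def match(pattern, fact, env):
    """Return env extended so that pattern equals fact, or None on mismatch."""
    if len(pattern) != len(fact):
        return None
    env = dict(env)
    for term, value in zip(pattern, fact):
        if is_var(term):
            # Bind on first sight; a repeated variable must agree.
            if env.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return env


def find(facts, *patterns):
    """Answer a conjunctive query; result columns follow free_vars order."""
    envs = [{}]
    for pattern in patterns:
        next_envs = []
        for env in envs:
            for fact in facts:
                extended = match(pattern, fact, env)
                if extended is not None:
                    next_envs.append(extended)
        envs = next_envs
    variables = free_vars(patterns)
    return {tuple(env[v] for v in variables) for env in envs}


# Made-up facts in the (subject predicate object) order used in the thread.
facts = {("a", "reachable", "a"), ("a", "reachable", "b"), ("b", "reachable", "b")}
loops = find(facts, ("?id", "reachable", "?id"))
```

Here `find(facts, ("?id", "reachable", "?id"))` plays the role of asking `(?id reachable ?id)` directly. In Scheme the same walk would run over a quoted list of symbols, which is why no macro is needed for this part.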

Yes, I think the semantic divergence is more fundamental. Triple stores and graph databases and binary relations are awesome, but they aren't what Datalog is.

fithisux 9 hours ago [-]
What Scheme is this?

deosjr 9 hours ago [-]
Guile Scheme. See https://github.com/deosjr/deosjr.github.io/blob/master/dynam... for more.