Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added the nested, updating, and chained subqueries CIP #100

Open
wants to merge 27 commits into
base: master
Choose a base branch
from

Conversation

petraselmer
Copy link

@petraselmer petraselmer commented Jun 22, 2016

This is part of the redesign of Cypher for adding support for working with multiple graphs that targets Cypher 10.

Rendered:

https://github.com/petraselmer/openCypher/blob/CIP-nested-subqueries/cip/1.accepted/CIP2016-06-22-nested-updating-and-chained-subqueries.adoc

@aseemk
Copy link

aseemk commented Jun 23, 2016

I'm not informed enough to offer meaningful feedback here, but as a user of Neo4j, this looks great! Thank you for putting this together. =)

@aseemk
Copy link

aseemk commented Jun 23, 2016

The only thing that made me go "hmm, interesting" was that subqueries' variables (variable bindings) persist into the outer query. How does that work if you have multiple independent subqueries that happen to (coincidentally) share a variable name? (E.g. something generic like user or tweet).

It would feel a bit more intuitive to me if only the RETURN variables persisted. Then, you could naturally rename those with WITH before running another subquery, etc.

But again, I'm not informed enough to know if this feedback is reasonable or flawed.

@cleishm
Copy link
Contributor

cleishm commented Jun 23, 2016

@aseemk Based on the examples, I'm guessing that it is only the RETURN values that are made available to the outer query. It's just that the language isn't very clear when it says "...Any new variable bindings produced by evaluating the subquery will augment the variable bindings of the initial record...".

@boggle
Copy link
Contributor

boggle commented Jun 24, 2016

Even more so, the intent is to not allow any shadowing, i.e. you cannot rebind variable from the outer query in the RETURN of the nested subquery. This is in line with CALL ... YIELD... and UNWIND ... AS ... semantics which also do not allow to do that. I think it really helps reduce confusion and hopefully nudges the users towards picking more meaningful (distinguishing) names.

@zazi
Copy link

zazi commented Jun 27, 2016

+1
(but I guess in SPARQL I can do the same (i.e. I doubt a bit the SPARQL comparison in your proposal ;) ))

@thobe
Copy link
Contributor

thobe commented Jul 12, 2016

I like the simplicity of this proposal, but I wonder how well it scales to other things we would/might want to do with/as subqueries in the future.

I was actually hoping we would be able to replace union with subqueries, rather than just allowing post-union processing by using union in a subquery, because I find union-queries hard to read with the constituent parts not being strongly delineated.

I've also held some hope that we would be able to use sub-queries as parameters to procedures, and there might be other use cases we would want to use sub-queries for that I can't think of right now.

@petraselmer
Copy link
Author

I have added the 'NOT READY FOR REVIEW' label only because we still need to consider path bindings for subqueries. We'll add this in a fortnight or so, when @boggle and I are back from leave.

In the meantime, much content regarding the new DO syntax has been added, so comments welcome.

@petraselmer
Copy link
Author

Hi @zazi

Thanks for your comment. I was wondering if you could provide an example (or link) in which correlated subqueries are supported in SPARQL? According to the W3C link in the CIP, it appears to be the case that 'vanilla' SPARQL cannot support correlated subqueries, but it may very well be the case that a vendor-specific implementation does. If this is the case, I would love to know, and, of course, this shall be added to the CIP.

@zazi
Copy link

zazi commented Jul 24, 2016

@petraselmer I'm not 100% an expert in this field. Nevertheless, (from my understanding) I'm supporting the answer to this question http://answers.semanticweb.com/questions/24508/does-sparql-11-support-correlated-subqueries ;)


Owing to the bottom-up nature of SPARQL query evaluation, the subqueries are evaluated logically first, and the results are projected up to the outer query.

Only variables projected out of the subquery will be visible, or in scope, to the outer query.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would tend to say that this is not a disadvantage, but rather then a (good) feature.

@zazi
Copy link

zazi commented Jul 24, 2016

For readability issues, I would recommend to add all queries in plain, natural language sentences as well.

MERGE (c:Child {id: x})
MERGE (r)-[:PARENT]->(c)
}
----
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be nice to explain the difference between the preceding example and what it would be without the DO and just having the MERGE calls following UNWIND. I assume it's simply keeping the same cardinality (i.e. the DO itself has a cardinality of 1) and not introducing any new identifiers, but it might be helpful to be explicit about that and why it's helpful.

@zazi
Copy link

zazi commented Jul 26, 2016

Furthermore, some example data that illustrate the queries might be helpful (i.e. a small data set and the result sets of the different queries).

@Mats-SX
Copy link
Member

Mats-SX commented Jul 28, 2016

@zazi

Furthermore, some example data that illustrate the queries might be helpful (i.e. a small data set and the result sets of the different queries).

This would be solved by the addition of TCK scenarios for this feature (which is planned). Do you agree?

@boggle
Copy link
Contributor

boggle commented Sep 26, 2016

@thobe passing subqueries as arguments to procedures is out of scope for this CIP.

@thobe How do you envision replacing UNION with subqueries? I'm not seeing that. Besides having UNION is very common for people with a SQL background and opens up a natural trajectory for adding other set operations (INTERSECT SET DIFFERENCE)

@Tin-Nguyen
Copy link

+1, I'm waiting for this

@ric81
Copy link

ric81 commented Dec 13, 2016

In which Neo4j release is/will this change be included ?

@Mats-SX
Copy link
Member

Mats-SX commented Dec 14, 2016

@ric81 That is currently not known. Do note that this repository manages the specification and standardisation of Cypher as a language, which is separate from Neo4j the graph database (although this decoupling is fairly new and not yet fully mature).


The inner mandatory query is any inner match query.

Moreover, any inner match query from which the trailing final `RETURN` clause has been omitted may also be used as an inner mandatory query.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this needs examples to demonstrate the use of these. As well as the difference between having a RETURN clause and leaving it out.

In my understanding MANDATORY without RETURN would be like EXISTS, but as a clause with the capability of failing the query rather than a boolean expression. Interestingly that use of MANDATORY would not affect the cardinality of the query, whereas MANDATORY with RETURN could potentially increase the cardinality of the query (although never decrease it, since that would mean that no matches were found).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're actually thinking about making RETURN mandatory on MANDATORY match based on the reason that this will be the 95% use case and having to name in the remaining 5% isn't a big deal.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The benefit would be less special casing. Then again, maybe someone might find it very natural to leave off RETURN.

@Mats-SX
Copy link
Member

Mats-SX commented Apr 24, 2017

This CIP doesn't contain any formal syntax description in EBNF format.

@boggle boggle force-pushed the CIP-nested-subqueries branch 2 times, most recently from 746478d to 061b2fd Compare October 16, 2017 15:30
@boggle boggle force-pushed the CIP-nested-subqueries branch from 061b2fd to 3ed1ca9 Compare October 16, 2017 15:31
@boggle boggle changed the title Added the nested subqueries CIP Added the Nested, updating, and chained subqueries CIP Oct 16, 2017
@boggle boggle changed the title Added the Nested, updating, and chained subqueries CIP Added the nested, updating, and chained subqueries CIP Oct 17, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants