Persisted Documents: encourage URL approach #305

Draft
wants to merge 1 commit into
base: persisted-documents
148 changes: 135 additions & 13 deletions spec/Appendix A -- Persisted Documents.md
@@ -32,6 +32,51 @@ thus should generate different document identifiers.
A _document identifier_ must either be a _prefixed document identifier_ or a
_custom document identifier_.

A _document identifier_ must only contain colons (`:`) and characters that are
defined as
[`unreserved` in RFC3986](https://datatracker.ietf.org/doc/html/rfc3986#section-2.3)
(alphanumeric characters (`A-Z`, `a-z`, `0-9`), dashes (`-`), periods (`.`),
underscores (`_`), and tildes (`~`)).

DocumentIdentifier ::

- PrefixedDocumentIdentifier
- CustomDocumentIdentifier

PrefixedDocumentIdentifier ::

- UnreservedCharacter+ PrefixedDocumentIdentifierContinue+

PrefixedDocumentIdentifierContinue ::

- Colon UnreservedCharacter\*

CustomDocumentIdentifier ::

- UnreservedCharacter+ [lookahead != Colon]

UnreservedCharacter ::

- Letter
- Digit
- `-`
- `.`
- `_`
- `~`

Letter :: one of

- `A` `B` `C` `D` `E` `F` `G` `H` `I` `J` `K` `L` `M`
- `N` `O` `P` `Q` `R` `S` `T` `U` `V` `W` `X` `Y` `Z`
- `a` `b` `c` `d` `e` `f` `g` `h` `i` `j` `k` `l` `m`
- `n` `o` `p` `q` `r` `s` `t` `u` `v` `w` `x` `y` `z`

Digit :: one of

- `0` `1` `2` `3` `4` `5` `6` `7` `8` `9`

Colon :: `:`
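
As an illustration of the grammar above, the following is a minimal sketch of a classifier written with regular expressions. The function name `classify_document_identifier` and the pattern names are hypothetical and are not part of this specification.

```python
import re
from typing import Optional

# UnreservedCharacter: Letter, Digit, "-", ".", "_" or "~" (ASCII only).
_UNRESERVED = r"[A-Za-z0-9\-._~]"

# PrefixedDocumentIdentifier: one or more unreserved characters, followed by
# one or more (Colon UnreservedCharacter*) continuations.
_PREFIXED = re.compile(rf"{_UNRESERVED}+(?::{_UNRESERVED}*)+")

# CustomDocumentIdentifier: one or more unreserved characters and no colon.
_CUSTOM = re.compile(rf"{_UNRESERVED}+")


def classify_document_identifier(value: str) -> Optional[str]:
    """Return "prefixed", "custom", or None if the value matches neither rule."""
    # fullmatch is used so that the whole string must match the grammar rule.
    if _PREFIXED.fullmatch(value):
        return "prefixed"
    if _CUSTOM.fullmatch(value):
        return "custom"
    return None
```

Under this sketch, an identifier such as `sha256:abc` is classified as prefixed, `abc` as custom, and a value that starts with a colon or contains a space is rejected.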

### Prefixed Document Identifier

:: A _prefixed document identifier_ is a document identifier that contains at
@@ -86,14 +131,46 @@ Would have this different _SHA256 hex document identifier_:
sha256:71f7dc5758652baac68e4a10c50be732b741c892ade2883a99358f52b555286b
```

Persisted Documents may contain multiple (named) operations and fragments too;
so the following GraphQL document (with no trailing newline):

```graphql example
query UserName($id: ID!) {
user(id: $id) {
name
}
}

query FriendsNames($id: ID!) {
user(id: $id) {
...User
friends {
...User
}
}
}

fragment User on User {
id
name
}
```

Would have the following _SHA256 hex document identifier_:

```example
sha256:517c56d2ba0779653b7698881207f749509f331bdaccbe951a82c378bc869556
```
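
A minimal sketch of how a client build step might derive such an identifier is shown below. It assumes the document text is encoded as UTF-8 before hashing, and the function name is hypothetical. Because the hash covers the exact bytes of the document, including indentation and any trailing newline, no expected value is given here.

```python
import hashlib


def sha256_hex_document_identifier(document: str) -> str:
    """Build a prefixed identifier from the SHA256 hash of the document text."""
    # Assumption: the document is hashed as UTF-8 encoded bytes.
    digest = hashlib.sha256(document.encode("utf-8")).hexdigest()
    # hexdigest() returns 64 lower-case hexadecimal characters.
    return f"sha256:{digest}"
```

Adding or removing a single character, such as a trailing newline, produces a different identifier.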

### Custom Document Identifier

:: A _custom document identifier_ is a document identifier that contains no
colon symbols (`:`). The meaning of a custom document identifier is
implementation specific.

Note: A 32 character hexadecimal _custom document identifier_ is likely to be an
MD5 hash of the GraphQL document, as traditionally used by Relay.
Note: A 32 character hexadecimal lower-case _custom document identifier_ is
likely to be an MD5 hash of the GraphQL document, as traditionally used by
Relay.

## Persisting a Document

@@ -105,12 +182,11 @@ specific.

Note: When used as an operation allow-list, persisted documents are typically
stored into a trusted shared key-value store at client build time (either
directly, or indirectly via an authenticated request to the server) such that
the server may retrieve them given the identifier at request time. This must be
done in a secure manner (preventing untrusted third parties from adding their
own persisted document) such that the server will be able to retrieve the
identified document within a _persisted document request_ and know that it is
trusted.
directly, or indirectly via authenticated requests to the server) such that the
server may retrieve them given the identifier at request time. This must be done
in a secure manner (preventing untrusted third parties from adding their own
persisted document) such that the server will be able to retrieve the identified
document within a _persisted document request_ and know that it is trusted.

Note: When used solely as a bandwidth optimization, as in the technique known
colloquially as "automatic persisted queries (APQ)," an error-based mechanism
@@ -130,12 +206,16 @@ deployed client.

## Persisted Document Request

A server MAY accept a _persisted document request_ via `GET` or `POST`.
A server MAY accept a _persisted document request_ via an HTTP `GET` or `POST`
request to a _GraphQL endpoint_ or subpath thereof.

### Persisted Document Request Parameters

:: A _persisted document request_ is an HTTP request that encodes the following
parameters in one of the manners described in this specification:
:: A _persisted document request_ is an HTTP request that encodes the _persisted
document request parameters_ in one of the manners described in this
specification.

:: The _persisted document request parameters_ are as follows:

- {documentId} - (_Required_, string): The string identifier for the Document.
- {operationName} - (_Optional_, string): The name of the Operation in the
@@ -145,20 +225,62 @@ parameters in one of the manners described in this specification:
- {extensions} - (_Optional_, map): This entry is reserved for implementors to
extend the protocol however they see fit.

### Persisted Document Request URL

To enable non-GraphQL HTTP tooling to better integrate with a Persisted Document
Request, it is recommended that the URL to which a Persisted Document Request is
sent is a subpath of the _GraphQL endpoint_ containing the {documentId} and the
{operationName} (if any) separated by a forward slash character (`/`). It is
recommended that this practice is only followed when {documentId} is a prefixed
document identifier, since the prefix helps avoid clashes with other subpaths of
the _GraphQL endpoint_.

Note: By following this practice, traditional HTTP tooling can exercise concerns
such as caching, rate limiting, audit logging, access-pattern analysis, error
detection, monitoring and more without needing to fully parse the incoming
GraphQL request.

For example, if the _GraphQL endpoint_ is `https://example.com/graphql` then a
persisted document request may be made to an endpoint such as
`https://example.com/graphql/sha256:517c56d2ba0779653b7698881207f749509f331bdaccbe951a82c378bc869556/FriendsNames`.
For documents containing a single anonymous operation the final segment must be
omitted, e.g.
`https://example.com/graphql/sha256:71f7dc5758652baac68e4a10c50be732b741c892ade2883a99358f52b555286b`.
Comment on lines +243 to +248
Contributor

Should we maybe namespace persisted documents under /graphql/<prefix>/<sha>?

e.g.

  • /graphql/persisted/<sha>/<operation_name>
  • /graphql/documents/<sha>/<operation_name>
  • /graphql/persisted-documents/<sha>/<operation_name>

Main reason for this is that graphql-sse already encourages /graphql/stream, which in theory can conflict with a "custom" hash. Changing this would also leave the options open to add other URL based features later on and not conflict with the persisted documents.

Member

We do namespace it in HotChocolate for less collision potential. However, Twitter, for instance, puts it directly on the GraphQL route, as outlined in this document.

Member Author

Indeed, this is one of the things I was considering changing before moving this out of draft. I think I also want to change it to /graphql/persisted/<operation_name>/<hash> so that the op-name comes first for easier debugging. If there's no op-name, we can just use - for that part of the URL, e.g. /graphql/persisted/-/<hash>.

Contributor

I'm not a big fan of having the operation name first. 🤔

From a server perspective, we first look up the hash for the document, and then the operation name within that document. Having the operation name first seems wrong to me.

From a debugging perspective, it does not matter too much whether the operation name comes before the hash.

Contributor

Also, if we are looking at Chrome DevTools, the last URL part is displayed. So swapping the operation name and hash might even be counterproductive.

Member Author

Good point, re Chrome DevTools. Okay, let's stick with the hierarchical approach; it makes more sense anyway 👍


Legacy persisted document implementations often issue requests to the _GraphQL
endpoint_ directly (i.e. without a subpath), so it's recommended to support this
pattern too.

:: The term _remaining parameters_ refers to the _persisted document request
parameters_ that are not encoded via the URL subpath; i.e. when the GraphQL
endpoint is queried directly the remaining request parameters are {documentId},
{operationName}, {variables} and {extensions}, whereas when the subpath
technique described above is used the remaining parameters are {variables} and
{extensions}.
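
One way a server might recover {documentId} and {operationName} from the URL subpath is sketched below. This is an illustration only: the function name is hypothetical, the path is assumed to be already percent-decoded, and the check for a colon reflects the recommendation above that only prefixed document identifiers are used in subpaths.

```python
from typing import Optional, Tuple


def parse_persisted_subpath(
    endpoint_path: str, request_path: str
) -> Optional[Tuple[str, Optional[str]]]:
    """Return (documentId, operationName) if the path is a persisted document subpath."""
    prefix = endpoint_path.rstrip("/") + "/"
    if not request_path.startswith(prefix):
        # Either a direct request to the endpoint or an unrelated path; in the
        # direct case all four parameters are "remaining parameters".
        return None
    segments = request_path[len(prefix):].split("/")
    document_id = segments[0]
    # Only treat the subpath as a persisted document request when the first
    # segment looks like a prefixed document identifier.
    if len(segments) > 2 or ":" not in document_id or document_id.startswith(":"):
        return None
    operation_name = segments[1] if len(segments) == 2 and segments[1] else None
    return document_id, operation_name
```

When this function returns a result, only {variables} and {extensions} remain to be read from the query string or request body. A path such as `/graphql/stream` returns `None` under this sketch because its first segment contains no colon.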

### GET

For a _persisted document request_ using HTTP GET, parameters SHOULD be provided
in the query component of the request URL, encoded in the
For a _persisted document request_ using HTTP GET, the _remaining parameters_
SHOULD be provided in the query component of the request URL, encoded in the
`application/x-www-form-urlencoded` format as specified by the
[WhatWG URLSearchParams class](https://url.spec.whatwg.org/#interface-urlsearchparams).

Note: This is only a SHOULD recommendation to allow for variables which are too
long for the query component to be encoded in an alternative way, for example
via headers.

The {documentId} parameter must be a string _document identifier_.

The {operationName} parameter, if present, must be a string.

Each of the {variables} and {extensions} parameters, if used, MUST be encoded as
a JSON string.

Note: JSON encoding is used here to enable reliable encoding of custom scalars
and composite/list inputs; traditional HTTP query strings do not encode enough
detail to tell the difference between a boolean `true` and the string `"true"`,
for example.

Setting the value of the {operationName} parameter to the empty string is
equivalent to omitting the {operationName} parameter.
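
The encoding rules above can be sketched as follows for a request sent directly to the _GraphQL endpoint_ (without a subpath). The helper names `build_get_url` and `decode_get_params` are hypothetical; the sketch uses only the Python standard library and is not a definitive implementation.

```python
import json
from typing import Any, Dict, Optional
from urllib.parse import parse_qs, urlencode, urlsplit


def build_get_url(
    endpoint: str,
    document_id: str,
    operation_name: Optional[str] = None,
    variables: Optional[Dict[str, Any]] = None,
    extensions: Optional[Dict[str, Any]] = None,
) -> str:
    """Client side: encode the parameters in the query component of the URL."""
    params = {"documentId": document_id}
    if operation_name:  # An empty string is equivalent to omitting it.
        params["operationName"] = operation_name
    if variables is not None:
        params["variables"] = json.dumps(variables, separators=(",", ":"))
    if extensions is not None:
        params["extensions"] = json.dumps(extensions, separators=(",", ":"))
    # urlencode produces application/x-www-form-urlencoded output.
    return f"{endpoint}?{urlencode(params)}"


def decode_get_params(url: str) -> Dict[str, Any]:
    """Server side: decode the query component back into request parameters."""
    query = parse_qs(urlsplit(url).query, keep_blank_values=True)
    decoded: Dict[str, Any] = {name: values[0] for name, values in query.items()}
    for name in ("variables", "extensions"):
        if name in decoded:
            # JSON decoding preserves the difference between true and "true".
            decoded[name] = json.loads(decoded[name])
    if decoded.get("operationName") == "":
        del decoded["operationName"]
    return decoded
```

Round-tripping `{"flag": True, "text": "true"}` through these two helpers keeps the boolean and the string distinct, which is the reason JSON encoding is used for {variables} and {extensions}.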

11 changes: 6 additions & 5 deletions spec/GraphQLOverHTTP.md
@@ -124,7 +124,8 @@ time.

A _server_ MUST enable GraphQL requests to one or more GraphQL schemas.

Each GraphQL schema a _server_ provides MUST be served via one or more URLs.
Each GraphQL schema a _server_ provides MUST be served via one or more URLs;
each such URL is called a _GraphQL endpoint_.

A _server_ MUST NOT require the _client_ to use different URLs for different
GraphQL query and mutation requests to the same GraphQL schema.
@@ -152,15 +153,15 @@ It is RECOMMENDED to end the path component of the URL with `/graphql`, for
example:

```url example
http://example.com/graphql
https://example.com/graphql
```

```url example
http://product.example.com/graphql
https://product.example.com/graphql
```

```url example
http://example.com/product/graphql
https://example.com/product/graphql
```

# Serialization Format
@@ -321,7 +322,7 @@ With the following query variables:
This request could be sent via an HTTP GET as follows:

```url example
http://example.com/graphql?query=query(%24id%3AID!)%7Buser(id%3A%24id)%7Bname%7D%7D&variables=%7B%22id%22%3A%22QVBJcy5ndXJ1%22%7D
https://example.com/graphql?query=query(%24id%3AID!)%7Buser(id%3A%24id)%7Bname%7D%7D&variables=%7B%22id%22%3A%22QVBJcy5ndXJ1%22%7D
```

GET requests MUST NOT be used for executing mutation operations. If the values