GQL maps roughly to SQL: you can think of GQL kind
as a SQL table, a GQL
entity
as a SQL row, and a GQL property
as a SQL column. However, a SQL
row-column lookup is limited to a single value, whereas in GQL a
property can be a multiple value property.
Grammar
The GQL grammar is summarized as follows:
<query> :=
SELECT ( * | [ DISTINCT ] <aggregated-property>+, )
[ FROM <kind> ]
[ WHERE <compound-condition> ]
[ GROUP BY <property-name>+, ]
[ ORDER BY ( <property-name> [ ASC | DESC ] )+, ]
[ LIMIT ( <result-position> |
FIRST “(” <result-position> ,
<result-position> “)” ) ]
[ OFFSET <result-position> [ “+” <result-position> ] ]
<aggregated-property> :=
<property-name>
| aggregator “(” <property-name> “)”
<aggregator> := FIRST
<compound-condition> := <condition>+AND
<condition> :=
<property-name> IS NULL
| <property-name> <comparator> <value>
| <value> <comparator> <property-name>
<comparator> :=
=
| <
| <=
| >
| >=
| IN
| CONTAINS
| HAS ANCESTOR
| HAS DESCENDANT
<result-position> := <binding-site> | <integer-literal>
<value> :=
<binding-site>
| <synthetic-literal>
| <string-literal>
| <integer-literal>
| <double-literal>
| <boolean-literal>
| <null-literal>
<synthetic-literal> :=
KEY “(”
[ “DATASET” “(“ <string-literal> “)” “,” ]
[ “NAMESPACE” “(“ <string-literal> “)” “,” ]
<key-path-element>+, “)”
| BLOB “(” <string-literal> “)”
| BLOBKEY “(” <string-literal> “)”
| DATETIME “(” <string-literal> “)”
<key-path-element> :=
<kind> "," ( <integer-literal> | <string-literal> )
<kind> := <name>
<property-name> := <name>
In the above grammar list, note that:
- The symbol
+,
after an expression indicates that it can be repeated, with repeated expressions separated by a comma. - The use of
+AND
above means anAND
-separated set of conditions.
The following are some examples that return entire entities:
SELECT * FROM myKind WHERE myProp >= 100 AND myProp < 200
SELECT * FROM myKind LIMIT 50 OFFSET @startCursor
SELECT * FROM myKind ORDER BY myProp DESC
Every GQL query string always begins with SELECT <something>
, where
<something>
is one of the following:
*
<property-list>
, a comma delimited list of properties to be returned from the query.__key__
, which returns keys only.
Similar to SQL, GQL keywords are case insensitive. Kind, property, and binding site names are case sensitive.
A GQL query returns zero or more entity results of the requested kind, with each result consisting of an entire entity or some subset of the properties of the entity, or even just the entity key. For example,
SELECT * FROM myKind
SELECT __key__ FROM myKind
SELECT title, year FROM Song WHERE composer = 'Lennon, John'
A result list produced by SELECT *
or SELECT __key__
never contains
duplicates. A result list produced by SELECT <property-list>
may contain
multiple results from one entity, typically when any of those properties are multiple value properties.
Tip: SELECT __key__
or SELECT <property-list>
queries are faster than
SELECT *
queries, and are cheaper, because they are charged at the less
expensive small operations rates, rather than at
the read rates. See also projection queries.
Clauses
The following optional SELECT
clauses are recognized:
Clause | Description |
---|---|
DISTINCT |
Specifies that only completely unique results will be returned. Normally used only with projection queries because non-projection queries return unique results. If you use DISTINCT in a projection query where more than one entity has the same values in the properties being projected, only the first entity is returned. Note that a query string can use either DISTINCT or GROUP BY , but not both. |
GROUP BY |
Groups data from a selected set of entities into a set of summary rows by the value of the specfied property or properties. |
FROM |
Limits the result set to those entities of the given kind. A query string without a FROM clause is called a kindless query and the only filtering allowed in the WHERE clause is filtering by __key__ . |
WHERE |
Limits the result set to those entities that meet one or more conditions. Each condition compares a property of the entity with a value using a comparison operator. If multiple conditions are combined with the AND keyword, then an entity must meet all of the conditions to be returned by the query. GQL does not have an OR operator. |
ORDER BY |
Causes results to be sorted by the specified properties. The ORDER BY clause can specify multiple sort orders as a comma-delimited list, evaluated from left to right. Specify ASC for ascending or DESC for descending order. Note that the order is applied to each property. If the direction is not specified, it defaults to ASC . If no ORDER BY clause is specified, the order of the results is undefined and may change over time. |
LIMIT |
Limits query results to a count, to results preceding a cursor, or both. Often used to page through results of a query. If LIMIT has two <result-position> s, one must be a cursor and the other must be an integer. (Note that OFFSET and LIMIT are independent.) |
OFFSET |
Specifies offsets into the result set: either a cursor, or a count, or both. If OFFSET has two <result-position> s, the left one must be a cursor and the right one must be an integer. Note that an OFFSET with an integer starts at the beginning or at the cursor, then discards the specified number of entities, and so you still incur the cost of reading those entities. (Note also that OFFSET and LIMIT are independent.) |
How to form entity and property names
Kind, property, and binding site names are formed as follows:
-
With any sequence of letters, digits, underscores, dollar signs, or unicode characters in the range from
U+0080
toU+FFFF
(inclusive), so long as the name does not begin with a digit. For example,foo
,bar17
,x_y
,big$bux
,__qux__
. -
You can also use a non-empty backquoted string:
`fig-bash`
or`x.y`
. A backquote character can be represented in a backquoted name by doubling it, for example,`silly``putty`
. A backquoted name can contain escaped characters. -
An unquoted name can match a predefined name, but must not match a keyword. A backquoted name can contain any character except a newline. (It can contain a newline via
\n
, but not as a raw newline.) -
Names are case-sensitive.
How to form literals
You can use the following literals in a comparison:
Literal Type | Description |
---|---|
string |
Formed following these rules:
|
integer |
A sequence of decimal digits with the following options or characteristics:
|
double |
A sequence of decimal digits with the following options or characteristics:
|
boolean |
Can be the values TRUE or FALSE , case-insensitive. |
null |
Represents NULL . Case insensitive. |
Synthetic literals
A synthetic literal is a value that is constructed by a function. The following table lists the supported synthetic literals:
Literal Name | Description |
---|---|
KEY | KEY([DATASET(<dataset>),] [NAMESPACE(<namespace>),] <key-path-element>*,) represents a key.If <dataset> and <namespace> are not supplied, defaults from the current query context are used. Entities are partitioned into various subsets, each used by different datasets and different namespaces within a dataset and so forth, with an ID (partition ID) assigned to each entity subset. The <Key-path-element> is an entity path, which is an even-length comma-separated list of kinds alternating with either integer ids or string names. The integers must be greater than 0 and the strings must not be empty. |
BLOB | BLOB(<string>) represents a blob encoded as <string> via base-64 encoding with character set [A-Za-z0-9-_] and no padding. |
BLOBKEY | BLOBKEY(<string>) represents a blobkey with value <string> , which must not be empty. |
DATETIME | DATETIME(<string>) represents a timestamp. <string> must be in the time format specified in RFC 3339 section 5.6. (However, the second precision is limited to microseconds and leap seconds are omitted.) This standard format is: YYYY-MM-DDThh:mm:ss.SSSSSS+zz:ZZ where:
|
How to escape characters
You can escape certain characters in string literals and backquoted names. The
escaped characters are case sensitive: for example \r
is valid while \R
is
not.
The following is a list of all the characters that can be escaped in GQL:
Character | Escaped |
---|---|
backslash character | \\ |
null character | \0 |
backspace character | \b |
newline character | \n |
return character | \r |
tab character | \t |
the character with decimal code 26 | \Z |
single quotation mark | \’ |
double quotation mark | \” |
backquote character | \` |
\% (2 characters, retaining the backslash, per MySQL) |
\% |
\_ (2 characters, retaining the backslash, per MySQL) |
\_ |
Operators and comparisons
Comparators are either equivalence comparators: =
, IN
, CONTAINS
,
= NULL
, HAS ANCESTOR
, and HAS DESCENDANT
, or inequality comparators:
<
, <=
, >
, and >=
.
Notice that the operator =
is another name for the IN
and CONTAINS
operators. For
example, <value> = <property-name>
is the same as <value> IN <property-name>
,
and <property-name> = <value>
is the same as <property-name> CONTAINS <value>
.
Also <property-name> IS NULL
is the same as <property-name> = NULL
.
A condition can also test whether one entity has another entity as an
ancestor, using the HAS ANCESTOR
or HAS DESCENDANT
operators. These
operators test ancestor relationships between keys. They can operate on
__key__
, but they can also operate on a key-valued property. For
HAS ANCESTOR
, the right operand cannot be a property. For HAS DESCENDANT
,
the left operand cannot be a property. For more information on ancestor relationships, see
ancestor paths.
Only one property may be compared with inequality operators. When a query with
an ORDER BY
clause applies an inequality operator to a property, that
property must be the first property in the ORDER BY
clause.
A typical property name consists of alphanumeric characters optionally mixed with
underscore (_
) and dollar sign ($
). In other words, they match the regular expression
[a-zA-Z0-9_$]+(.[0-9_$]+)*
. Property names
containing other printable characters must be quoted with backquotes,
for example: `first-name`
.
Restrictions
Comparisons must be between a property name and a literal, but these can
be on either side of the operator. For example, A < 7
or 7 > A
. Note,
however, that there is no inverse operator for IS NULL, so while you can have
<property-name> IS NULL
, you cannot have NULL IS <property-name>
.
There is no way to determine whether an entity lacks a value for a
property (that is, whether the property has no value). If you use a condition of
the form property = NULL
, what will occur is a check whether a null value is
explicitly stored for that property. Datastore queries that refer to a property
will never return entities that don't have a value for that property.
Examples
To find all of the entities of kind Person
whose ages are
between 18 and 35, use this query string:
SELECT * FROM Person WHERE age >= 18 AND age <= 35
To find the three entities of kind Person
whose ages are the
greatest, use this query string:
SELECT * FROM Person ORDER BY age DESC LIMIT 3
To return only the name
property for each Person
, use
this query string:
SELECT name FROM Person
To return only the name
property for each Person
,
ordered by age
, use this query string:
SELECT name FROM Person ORDER BY age
To find the keys of the entities of kind Person
that have an
age of NULL
, use this query string:
SELECT __key__ FROM Person WHERE age = NULL
To find all the entities, regardless of kind, that are in Amy's entity group (i.e. Amy and Fred), use this query:
SELECT * WHERE __key__ HAS ANCESTOR KEY(Person, 'Amy')
Argument binding
GQL supports argument binding. For information about these, see Using GQL.
Unsupported features and behavior differences from MySQL/Python GQL
If you are familiar with MySQL or the Python GQL provided by Google App Engine, you might want to take a look at the following list, which highlights the main differences between those products and the Datastore GQL behavior.
MySQL Differences
- MySQL binding sites are represented by
?
. Datastore GQL binding sites are represented by@<name>
or@<number>
. - MySQL supports only
LIMIT
/OFFSET
counts. Datastore GQL also supportsLIMIT
/OFFSET
cursors. - Datastore GQL supports an
OFFSET
without aLIMIT
, MySQL does not. - MySQL supports an offset count via keyword
LIMIT
, Datastore GQL does not. - A MySQL literal string can contain a raw newline. A Datastore GQL literal string cannot.
- A MySQL literal string can
\
-escape any character. A Datastore GQL literal string can only\
-escape a specified list of characters. - A MySQL name can begin with a digit. A Datastore GQL name cannot.
- A Datastore GQL name can contain null characters. A MySQL name cannot.
- A quoted Datastore GQL name can contain
\
-escaped characters. A quoted MySQL name interprets a\
as an ordinary character. - MySQL has different operators, keywords, and predefined names than Datastore GQL.
Python GQL for App Engine Differences
- Python GQL binding sites begin with
:
. Datastore GQL binding sites begin with@
. - Python GQL binding argument names may match the regular expression
__.*__
. Datastore GQL reserves those names for future predefined binding arguments. Note that although binding sites can use those names, argument bindings cannot. - Python GQL supports operators
!=
andOR
. Those operators are not yet supported by Datastore GQL. - Datastore GQL supports the
IN
operator differently. - Python GQL supports functions
datetime
,date
,time
,key
,user
, andgeopt
. Datastore GQL supports functiondatetime
andkey
, but with different arguments. Datastore GQL supports functionsblob
andblobkey
. Datastore GQL support for functiongeopoint
(note spelling) is expected to be added in some future release. - Python GQL expression
ANCESTOR IS <entity-or-key-value>
is represented in Datastore GQL as the more general expression__key__ HAS ANCESTOR <key-value>
. - Datastore GQL supports the expression
<property-name> IS NULL
. Python GQL does not. - Python GQL literal strings are quoted with
’
. Datastore GQL literal strings are quoted with either’
or”
. - Python GQL names are quoted with
”
. Datastore GQL names are quoted with`
. - Datastore GQL literal strings and quoted names can contain spaces, return characters, backslashed characters, and the enclosing quote character (doubled). Python GQL literal strings and quoted names cannot have these.
- Python GQL names may contain
.
without quoting. Datastore GQL reserves.
for future use. - Datastore GQL names may contain
$
andU+0080
toU+FFFF
without quoting. Python GQL names may not. - Python GQL has different keywords and operators than Datastore GQL.
Keywords and predefined names
Keywords and predefined names are case-insensitive.
The following keywords are recognized, although not all of these are currently used in GQL. The unused keywords are marked with an asterisk.
Keyword | ||
---|---|---|
ALL* | DIV* | NULL |
ANCESTOR | EXISTS* | OFFSET |
AND | FALSE | OR* |
ANY* | FROM | ORDER |
AS* | GROUP | REGEXP* |
ASC | HAS | RLIKE* |
BETWEEN* | HAVING* | SELECT |
BINARY* | IN. | SUBSET* |
BY | IS | SUPERSET |
CONTAINS | JOIN* | TRUE |
CURSOR* | LIKE* | WHERE |
DESC | LIMIT | XOR* |
DESCENDANT | MOD* | |
DISTINCT | NOT* |
The following predefined names are recognized:
Predefined Name |
---|
BLOB |
BLOBKEY |
DATETIME |
FIRST |
KEY |