About the BI Engine SQL interface
BigQuery BI Engine is a fast, in-memory analysis service that lets users analyze data stored in BigQuery with sub-second query response times and with high concurrency.
The BI Engine SQL interface expands BI Engine to integrate with other business intelligence (BI) tools such as Looker, Tableau, Power BI, and custom applications to accelerate data exploration and analysis. This page provides an overview of the BI Engine SQL interface, and the expanded capabilities that it brings to this preview version of BI Engine.
Requesting access to the preview
Access to the BI Engine SQL interface in this preview phase is provided through an enrollment process. For access, submit the BI Engine preview enrollment form with your project details. You will be notified by email once your project is enrolled.
Increasing the reservation size during preview
If you need additional memory reservation beyond the default size of 100 GB, you can request an increase. Reservation increases are evaluated on a case-by-case basis, and are only available in some regions.
BI Engine architecture
The BI Engine SQL interface builds on the existing BI Engine for Google Data Studio architecture. The following diagram shows the updated architecture for BI Engine:

This preview version of BI Engine includes the following:
- BigQuery API: BI Engine directly integrates with the BigQuery API. Any BI solution or custom application that works with the BigQuery API through standard mechanisms such as REST JDBC and ODBC drivers can use BI Engine without any changes.
- Vectorized runtime: With the BI Engine SQL interface, BI Engine introduces a more modern technique called vectorized processing. Using vectorized processing in an execution engine makes more efficient use of modern CPU architecture, by operating on batches of data at a time. BI Engine also uses advanced data encodings, specifically, dictionary run-length encoding, to further compress the data that's stored in the in-memory layer.
- Metadata: The metadata stores the table and views definitions, the schema definition, and the fine-grained permissions.
- Reservations: BI Engine reservations manage the memory allocation at the project billing level. BI Engine caches only columns and partitions that are queried or scanned. It does not cache the whole table.
- Distributed in-memory engine: BI Engine is a distributed in-memory execution engine, letting customers allocate significantly larger memory reservations.
Query optimization and acceleration
BigQuery, and by extension BI Engine, breaks down the query plan that's produced for a SQL query into subqueries. A subquery contains a number of operations, such as scanning, filtering, or aggregating data, and is often the unit of execution on a shard.
While all of BigQuery's supported SQL queries are correctly executed by the BI Engine SQL interface, the preview version of the BI Engine SQL interface only optimizes certain subqueries. In particular, it is most optimized for leaf-level subqueries that scan the data from storage, and perform operations such as filter, compute, aggregation, order-by, and certain types of joins. Other subqueries that are not yet fully accelerated by BI Engine revert back to BigQuery for execution.
Because of this selective optimization, simpler BI or dashboard-type queries (resulting in fewer subqueries) benefit the most from BI Engine because the majority of the execution time is spent on leaf-level subqueries that process raw data.
Limitations
The following list explains how and when a query is accelerated when you use this preview version of BI Engine:
- Within the leaf-level subqueries, the acceleration is restricted to standard SQL functions and operators.
- During this preview, users are limited to a maximum of 100 GB of BI Engine capacity for each Google Cloud project. You can request an increase in reservation capacity.
- BigQuery streaming ingestion is not supported during preview.
- The following limitations apply to the size and number of rows:
- Simple queries that reference only one table (such as
GROUP-BYorFILTERqueries) are primarily limited by the reservation size. - For joins, the acceleration is restricted to leaf-level subqueries with
INNERandLEFT OUTERjoins, where a large (fact) table is joined with up to four small and unpartitioned (dimension) tables, each with up to 5 million rows or 5 GB in size for the referenced columns.
- Simple queries that reference only one table (such as
- Queries referencing wildcard tables are not supported.
- This preview version of BI Engine is not available in all regions. For more information, see supported regions.
Monitoring and diagnostics
This section explains how to find statistics about BI Engine, and how BI Engine intregrates with Cloud Monitoring.
Acceleration statistics
Detailed statistics on BI Engine are available through the job
statistics API. You can use the bq command-line tool to fetch statistics associated with
BI Engine accelerated queries.
With BI Engine acceleration enabled, you can run a query in any of these three modes:
DISABLED |
BI Engine disabled the acceleration.
biEngineReasons specifies a more detailed reason. The query was
run using the BigQuery execution engine. |
PARTIAL |
Part of the query was accelerated using BI Engine. As
described in Query optimization
and acceleration, a query plan is generally broken down into multiple
subqueries. This [preview](/products/#product-launch-stages) of
BI Engine supports the common types
of subquery patterns that are typically used in dashboarding. If the query
consists of multiple subqueries, only a few of which fall under the
supported use cases, then BI Engine executes the remaining
subqueries using the normal BigQuery engine. Those remaining
subqueries don't receive the BI Engine acceleration.
In this situation, BI Engine returns a PARTIAL
acceleration code, and uses biEngineReasons to populate the reason
for not accelerating other subqueries. |
FULL |
All of the query was accelerated using BI Engine. |
To fetch the statistics associated with BI Engine accelerated
queries, run the following bq command-line tool command:
bq show --format=prettyjson -j job_id
If the project is enabled for BI Engine acceleration, then the
output produces a new field, biEngineStatistics. Here is a sample job
report:
"statistics": {
"creationTime": "1602175128902",
"endTime": "1602175130700",
"query": {
"biEngineStatistics": {
"biEngineMode": "DISABLED",
"biEngineReasons": [
{
"code": "UNSUPPORTED_SQL_TEXT",
"message": "Detected unsupported join type"
}
]
},
For more information about the BiEngineStatistics field, see the
Job reference.
Cloud Monitoring
BI Engine integrates with Cloud Monitoring to surface key metrics for monitoring and alerting purposes. These are the metrics that are monitored:
| Resource type | Metric name | Description |
|---|---|---|
| Project | Reservation Total Bytes | Total capacity allocated within one Cloud project. |
| Project | Reservation Used Bytes | Total capacity used within one Cloud project. |
Feedback and updates
Once you're enrolled in the preview, you can post questions and comments in the discussion group. We also use this group to provide updates on ongoing improvements and features during the preview phase.
Pricing
There is no cost to using the BI Engine SQL interface during the first 10 weeks of the preview phase.
Once the preview ends, BI Engine will follow the pricing model that's described on the BI Engine pricing page.
Supported regions
Like BigQuery, BI Engine is a regional and a multi-regional resource. BI Engine processes your data in the same region where your data is located inside BigQuery.
The preview phase of BI Engine SQL interface supports the following locations.
Regional locations
| Region description | Region name | |
|---|---|---|
| Americas | ||
| Northern Virginia | us-east4 |
|
| Oregon | us-west1 |
|
| Asia Pacific | ||
| Tokyo | asia-northeast1 |
|
Multi-regional locations
| Multi-region description | Multi-region name |
|---|---|
| Data centers within member states of the European Union1 | EU |
| Data centers in the United States | US |
1 Data located in the EU multi-region is not
stored in the europe-west2 (London) or europe-west6
(Zürich) data centers.