Summary: this tutorial introduces you to PostgreSQL materialized views that allow you to store result of a query physically and update the data periodically.
In PostgreSQL view tutorial, you have learned that views are virtual tables which represent data of the underlying tables. Simple views can be also updatable. PosgreSQL extends the view concept to a next level that allows views to store data physically, and we call those views are materialized views. A materialized view caches the result of a complex expensive query and then allow you to refresh this result periodically.
The materialized views are useful in many cases that require fast data access therefore they are often used in data warehouses or business intelligent applications.
Creating materialized views
To create a materialized view, you use the CREATE MATERIALIZED VIEW
statement as follows:
1 2 3 4 | CREATE MATERIALIZED VIEW view_name AS query WITH [NO] DATA; |
First, specify the the view_name
after the CREATE MATERIALIZED VIEW
clause
Second, add the query that gets data from the underlying tables after the AS
keyword.
Third, if you want to load data into the materialized view at the creation time, you put WITH DATA
option, otherwise you put WITH NO DATA
. In case you use WITH NO DATA
, the view is flagged as unreadable. It means that you cannot query data from the view until you load data into it.
Refreshing data for materialized views
To load data into a materialized view, you use the REFRESH MATERIALIZED VIEW
statement as shown below:
1 | REFRESH MATERIALIZED VIEW view_name; |
When you refresh data for a materialized view, PosgreSQL locks the entire table therefore you cannot query data against it. To avoid this, you can use the CONCURRENTLY
option.
1 | REFRESH MATERIALIZED VIEW CONCURRENTLY view_name; |
With CONCURRENTLY
option, PostgreSQL creates a temporary updated version of the materialized view, compares two versions, and performs INSERT and UPDATE only the differences. You can query against the materialized view while it is being updated. One requirement for using CONCURRENTLY
option is that the materialized view must have a UNIQUE
index. Notice that CONCURRENTLY
option is only available from PosgreSQL 9.4.
Removing materialized views
Removing a materialized view is pretty straightforward as we have done for tables or views. This is done using the following statement:
1 | DROP MATERIALIZED VIEW view_name; |
PostgreSQL materialized views example
The following statement creates a materialized view named rental_by_category
:
1 2 3 4 5 6 7 8 9 10 11 12 13 | CREATE MATERIALIZED VIEW rental_by_category AS SELECT c.name AS category, sum(p.amount) AS total_sales FROM (((((payment p JOIN rental r ON ((p.rental_id = r.rental_id))) JOIN inventory i ON ((r.inventory_id = i.inventory_id))) JOIN film f ON ((i.film_id = f.film_id))) JOIN film_category fc ON ((f.film_id = fc.film_id))) JOIN category c ON ((fc.category_id = c.category_id))) GROUP BY c.name ORDER BY sum(p.amount) DESC WITH NO DATA; |
Because we used the WITH NO DATA
option, we cannot query data from the view. If we try to do so, we will get an error message as follows:
1 2 3 4 | SELECT * FROM rental_by_category; |
1 2 | [Err] ERROR: materialized view "rental_by_category" has not been populated HINT: Use the REFRESH MATERIALIZED VIEW command. |
PostgreSQL is very nice to give us a hint to ask for loading data into the view. Let’s do it by executing the following statement:
1 | REFRESH MATERIALIZED VIEW rental_by_category; |
Now, if we query data again, we will get the result as expected.
From now on, we can refresh the data in the rental_by_category
view using the REFRESH MATERIALIZED VIEW
statement. However, to refresh it with CONCURRENTLY
option, we need to create a UNIQUE index for the view first.
1 | CREATE UNIQUE INDEX rental_category ON rental_by_category (category); |
Let’s refresh data concurrently for the rental_by_category
view.
1 | REFRESH MATERIALIZED VIEW CONCURRENTLY rental_by_category; |
In this tutorial, we have shown you how to work with PostgreSQL materialized views, which come in handy for analytical applications that require quick data retrieval.