We all know that in PostgreSQL we cannot drop an object if there are view dependencies on it:
CREATE TABLE t (id integer PRIMARY KEY); CREATE VIEW v AS SELECT * FROM t; DROP TABLE t; ERROR: cannot drop table t because other objects depend on it DETAIL: view v depends on table t HINT: Use DROP ... CASCADE to drop the dependent objects too. ALTER TABLE t DROP id; ERROR: cannot drop column id of table t because other objects depend on it DETAIL: view v depends on column id of table t HINT: Use DROP ... CASCADE to drop the dependent objects too.
Some people like it because it keeps the database consistent; some people hate it because it makes schema modifications more difficult. But that’s the way it is.
In this article I want to explore the mechanics behind view dependencies and show you how to track what views depend on a certain PostgreSQL object.
Why would I need that?
Imagine you want to modify a table, e.g. change a column’s data type from
bigint because you realize you will need to store bigger numbers.
However, you cannot do that if there are views that use the column. You first have to drop those views, then change the column and then run all the
CREATE VIEW statements to create the views again.
As the example shows, editing tables can be quite a challenge if there is a deep hierarchy of views, because you have to create the views in the correct order. You cannot create a view unless all the objects it requires are present.
Best practices with views
Before I show you how to untangle the mess, I’d like to tell you what mistakes to avoid when you are using views in your database design (excuse my switching to teacher mode; I guess holding courses has that effect on you).
Views are good for two things:
- They allow you to have a recurring SQL query or expression in one place for easy reuse.
- They can be used as an interface to abstract from the actual table definitions, so that you can reorganize the tables without having to modify the interface.
Neither of these applications require you to “stack” views, that is, define views over views.
There are two patterns of using views that tend to be problematic, and they both stem from the mistaken idea that a view works exactly like a table, just because it looks like one:
- Defining many layers of views so that your final queries look deceptively simple.
However, when you try to unravel the views, for example by looking at the execution plan, the query turns out to be so complicated that it is almost impossible to understand what is really going on and how to improve it.
- Defining a denormalized “world view” which is just a join of all your database tables and using that for all of your queries.
People who do that tend to be surprised when certain
WHEREconditions work well, but others take impossibly long.
Never forget that a view is just a “crystallized” SQL statement and gets replaced by its definition when the query is executed.
How are views stored in PostgreSQL?
A view in PostgreSQL is not that different from a table: it is a “relation”, that is “something with columns”.
All such objects are stored in the catalog table
As the documentation states, a view is almost the same as a table, with a few exceptions:
- it has no data file (because it holds no data)
v” rather than “
- it has an
ON SELECTrule called “
This “query rewrite rule” contains the definition of the view and is stored in the
ev_action column of the
pg_rewrite catalog table.
Note that the view definition is not stored as a string, but in the form of a “query parse tree”. Views are parsed when they are created, which has several consequences:
- Object names are resolved during
CREATE VIEW, so the current setting of
- Objects are referred to by their internal immutable “object ID” rather than by their name. Consequently, it is no problem to rename an object or column used in a view definition.
- PostgreSQL knows exactly which objects are used in the view definition, so it can add dependencies on them.
Note that the way PostgreSQl handles views quite different from the way PostgreSQL handles functions: function bodies are stored as strings and not parsed when they are created. Consequently, PostgreSQL cannot know on which objects a given function depends.
How are the dependencies stored?
All dependencies (except those on “shared objects”) are stored in the catalog table
classidstores the object ID of the catalog table containing the dependent object
objidstores the ID of the dependent object
objsubidstores the column number if the dependency is for a column
objsubidare like the three columns above, but describe the object referenced by the dependency
deptypedescribes the kind of dependency
It is important to notice that there is no direct dependency of a view on the objects it uses: the dependent object is actually the view’s rewrite rule. That adds another layer of indirection.
A simple example
In the following, I’ll use this schema to test my queries:
CREATE TABLE t1 ( id integer PRIMARY KEY, val text NOT NULL ); INSERT INTO t1 VALUES (1, 'one'), (2, 'two'), (3, 'three'); CREATE FUNCTION f() RETURNS text LANGUAGE sql AS 'SELECT ''suffix'''; CREATE VIEW v1 AS SELECT max(id) AS id FROM t1; CREATE VIEW v2 AS SELECT t1.val FROM t1 JOIN v1 USING (id); CREATE VIEW v3 AS SELECT val || f() FROM t1;
I have thrown in a function, just to show that a view can depend on objects other than tables.
In the following I will concentrate on tables and columns, but the queries will work for functions too, if you replace the catalog
pg_class that contains tables with the catalog
pg_proc that contains functions.
Finding direct view dependencies on a table
To find out which views directly depend on table
t1, you would query like this:
SELECT v.oid::regclass AS view FROM pg_depend AS d -- objects that depend on the table JOIN pg_rewrite AS r -- rules depending on the table ON r.oid = d.objid JOIN pg_class AS v -- views for the rules ON v.oid = r.ev_class WHERE v.relkind = 'v' -- only interested in views -- dependency must be a rule depending on a relation AND d.classid = 'pg_rewrite'::regclass AND d.refclassid = 'pg_class'::regclass AND d.deptype = 'n' -- normal dependency AND d.refobjid = 't1'::regclass; view ------ v2 v1 v3 v2 (4 rows)
To find views with direct dependencies on the function
f, simply replace “
d.refclassid = 'pg_class'::regclass” with “
d.refclassid = 'pg_proc'::regclass” and “
refobjid = 't1'::regclass” with “
refobjid = 'f'::regproc”.
Actually, the views will usually not depend on the table itself, but on the columns of the table (the exception is if a so-called “whole-row reference” is used in the view). That is why the view
v2 shows up twice in the above list. You can remove those duplicates using
Finding direct dependencies on a table column
We can modify the above query slightly to find those views that depend on a certain table column, which can be useful if you are planning to drop a column (adding a column to the base table is never a problem).
The following query finds the views that depend on the column
val of table
SELECT v.oid::regclass AS view FROM pg_attribute AS a -- columns for the table JOIN pg_depend AS d -- objects that depend on the column ON d.refobjsubid = a.attnum AND d.refobjid = a.attrelid JOIN pg_rewrite AS r -- rules depending on the column ON r.oid = d.objid JOIN pg_class AS v -- views for the rules ON v.oid = r.ev_class WHERE v.relkind = 'v' -- only interested in views -- dependency must be a rule depending on a relation AND d.classid = 'pg_rewrite'::regclass AND d.refclassid = 'pg_class'::regclass AND d.deptype = 'n' -- normal dependency AND a.attrelid = 't1'::regclass AND a.attname = 'val'; view ------ v3 v2 (2 rows)
Recursively finding all dependent views
Now if you haven’t heeded the advice I gave above and you went ahead and defined a complicated hierarchy of views, it doesn’t stop with direct dependencies.
Rather, you need to recursively go through the whole hierarchy.
For example, let’s assume that you want to
DROP and re-create the table
t1 from our example and you need the
CREATE VIEW statements to re-create the views once you are done (dropping them won’t be a problem if you use
DROP TABLE t1 CASCADE).
Then you need to use the above queries in a recursive “common table expression” (CTE). The CTE is for tracking recursive view dependencies and can be reused for all such requirements; the only difference will be in the main query.
WITH RECURSIVE views AS ( -- get the directly depending views SELECT v.oid::regclass AS view, 1 AS level FROM pg_depend AS d JOIN pg_rewrite AS r ON r.oid = d.objid JOIN pg_class AS v ON v.oid = r.ev_class WHERE v.relkind = 'v' AND d.classid = 'pg_rewrite'::regclass AND d.refclassid = 'pg_class'::regclass AND d.deptype = 'n' AND d.refobjid = 't1'::regclass UNION ALL -- add the views that depend on these SELECT v.oid::regclass, views.level + 1 FROM views JOIN pg_depend AS d ON d.refobjid = views.view JOIN pg_rewrite AS r ON r.oid = d.objid JOIN pg_class AS v ON v.oid = r.ev_class WHERE v.relkind = 'v' AND d.classid = 'pg_rewrite'::regclass AND d.refclassid = 'pg_class'::regclass AND d.deptype = 'n' AND v.oid <> views.view -- avoid loop ) SELECT format('CREATE VIEW %s AS%s', view, pg_get_viewdef(view)) FROM views GROUP BY view ORDER BY max(level); format ------------------------------------------- CREATE VIEW v3 AS SELECT (t1.val || f()) + FROM t1; CREATE VIEW v1 AS SELECT max(t1.id) AS id+ FROM t1; CREATE VIEW v2 AS SELECT t1.val + FROM (t1 + JOIN v1 USING (id)); (3 rows)
We need the
GROUP BY because a view may depend on an object in more than one ways: in our example,
v2 depends on
t1 twice: once directly, and once indirectly via
Have questions? Need PostgreSQL support? You can reach us here.