Procedural, Semi-Procedural and Declarative Programming Part II

02 March 2010

SQL Server accommodates a whole range of programming styles and will even allow you to create code that is wholly procedural. Is a declarative approach inevitably better? Can it be difficult to maintain? Can you avoid the performance problems of procedural code by using triggers? Joe adds some thoughts.

In the last article, I tried to show you how those programmers who are learning SQL tend to carry on trying to solve database problems in a procedural way. There is a big leap from a procedural mindset to a declarative one for most programmers. Most of them don't quite make that leap all at once, but make a gradual step-wise transition from procedural to semi-procedural programming styles.

Procedural code can appear in stored procedures, but is the code in a procedure necessarily procedural? A procedure can be no more than a BEGIN-END block with a sequence of SQL statements, without any IF-THEN-ELSE or WHILE-DO loop logic in it. Is such a block procedural or declarative when all it has is one declarative statement in it? I would say it was declarative. Is a block procedural or declarative when it has IF-THEN-ELSE or WHILE-DO loop control logic? I would say 'procedural'.

You can get rid of a lot of IF-THEN-ELSE control logic with CASE expressions. Before the CASE expression, there were unexpected pitfalls in trying to apply procedural logic to SQL. The classic example is an UPDATE statement that was part of Sybase/SQL Server classes for decades. You have a bookstore and want to change the prices of the books. Any book priced at $25 or more will be discounted by 15% (we will advertise that) and books under $25 will be increased by 10% (we will not advertise that). The immediate solution is to write this:

BEGIN
UPDATE Books
   SET price = price * 1.10
 WHERE price < 25.00;

UPDATE Books
   SET price = price * 0.85
 WHERE price >= 25.00;
END;

But it does not work! Look at a book that currently sells for $24.95. The first UPDATE raises its price to $27.45. That new price now qualifies it for the second UPDATE, which drops it to $23.33. That is not what we meant to do. Flipping the updates does not help: a $25.00 book would first be discounted to $21.25 and then marked up to $23.38.

This was the classic argument for cursors. Hang in a loop and use an IF-THEN-ELSE statement to do the update of the current row in the cursor, just like a magnetic tape file. But this is not required today. We have the CASE expression, which is declarative.

UPDATE Books
   SET price
       = CASE
         WHEN price < 25.00
         THEN price * 1.10
         ELSE price * 0.85
         END;
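The difference between the two approaches can be checked with a small harness. This is only a sketch: it uses Python's built-in sqlite3 module as a stand-in for SQL Server, the Books table has a made-up book_id column, and the function name reprice is hypothetical. The UPDATE statements themselves are the ones shown above.

```python
import sqlite3

def reprice(prices, use_case):
    """Load prices into a scratch Books table, apply the price change, return the new prices."""
    conn = sqlite3.connect(":memory:")
    # Assumed table layout: the article only names the price column.
    conn.execute("CREATE TABLE Books (book_id INTEGER PRIMARY KEY, price REAL NOT NULL)")
    conn.executemany("INSERT INTO Books (price) VALUES (?)", [(p,) for p in prices])
    if use_case:
        # One declarative statement: every row is tested against its ORIGINAL price.
        conn.execute(
            """UPDATE Books
                  SET price = CASE
                              WHEN price < 25.00
                              THEN price * 1.10
                              ELSE price * 0.85
                              END"""
        )
    else:
        # Two sequential statements: the second one sees the results of the first.
        conn.execute("UPDATE Books SET price = price * 1.10 WHERE price < 25.00")
        conn.execute("UPDATE Books SET price = price * 0.85 WHERE price >= 25.00")
    new_prices = [row[0] for row in conn.execute("SELECT price FROM Books ORDER BY book_id")]
    conn.close()
    return new_prices

if __name__ == "__main__":
    print("two UPDATEs:", reprice([24.95, 30.00], use_case=False))
    print("CASE:       ", reprice([24.95, 30.00], use_case=True))
```

With the two-statement version the $24.95 book ends up at roughly $23.33, because it is touched twice. With the CASE version it is touched once and ends up at roughly $27.45.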

Loops can be replaced with various constructs, most of which apply set-oriented operations to the table involved instead of doing RBAR (pronounced “re-bar”, like the steel rods used in concrete construction; RBAR is an acronym for 'Row By Agonizing Row', coined by Jeff Moden). Another common change is to use ROW_NUMBER() and other ordinal functions to replace a counting loop in procedural code.
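As a rough illustration of ROW_NUMBER() standing in for a counting loop, the sketch below numbers books from most to least expensive in a single query. It assumes SQLite 3.25 or later (the first version with window functions) running through Python's sqlite3 module rather than SQL Server, and the Books columns and the function name rank_books are invented for the example.

```python
import sqlite3

def rank_books(titles_and_prices):
    """Return (price_rank, title) pairs, most expensive first, without a counting loop."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical table layout for the example.
    conn.execute("CREATE TABLE Books (title TEXT PRIMARY KEY, price REAL NOT NULL)")
    conn.executemany("INSERT INTO Books (title, price) VALUES (?, ?)", titles_and_prices)
    rows = conn.execute(
        """SELECT ROW_NUMBER() OVER (ORDER BY price DESC, title) AS price_rank,
                  title
             FROM Books
            ORDER BY price_rank"""
    ).fetchall()
    conn.close()
    return rows

if __name__ == "__main__":
    for price_rank, title in rank_books([("A", 10.0), ("B", 30.0), ("C", 20.0)]):
        print(price_rank, title)
```

The procedural equivalent would open a cursor ordered by price, keep a counter variable and write the counter back one row at a time.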

Procedural, Semi-Procedural and Declarative solutions: an example

If you look at Part I of this series of articles, you will see recursion being used to create a table of sequential numbers. It was replaced with a declarative look-up table. But let's take a similar problem, the calculation of the Fibonacci series, and look at the Procedural, Semi-Procedural and Declarative approaches to it. I'm not trying to suggest that this is a practical problem: if you really needed this data, then you'd download it into a table (from The Fibonacci numbers). The problem is to build a table of Fibonacci numbers with n and the n-th Fibonacci number. I do not want to talk about the Fibonacci series. No, that is a lie. I would love to write a whole book on it, but a lot of other people beat me to it (see the references). Darn! The usual definition for the series is recursive:

FUNCTION fib(n)
IF n = 0
THEN RETURN 0;
ELSE IF n = 1
THEN RETURN 1;
ELSE IF n > 1
THEN RETURN (fib(n-2) + fib(n-1));
END IF;

Using Computation to Replace a Look-up

The most extreme example I can remember of using computation to replace a look-up was decades ago at Georgia Tech, when we had a CDC Cyber series computer. The machine had the best floating point hardware and speed of its day. To give you an idea of what I mean, CDC had a COBOL compiler that converted COBOL picture data to floating point numbers, did the math and converted back to COBOL picture formats. It out-performed the IBM machines on campus.

One of our sales team members had just had a course on Chebyshev polynomials and fitting data to curves (Wikipedia: Chebyshev polynomials) and had a software package to create these polynomials. So rather than do a table look-up in FORTRAN, he fitted a polynomial to the data and computed the values. The formula was impossible for a human being to understand. But it was faster than reading a disk, and the data had a pattern that worked well with polynomials. Do not do this on anything but a supercomputer. I will now avoid the issue of performance versus maintenance.

There is a little debate about whether to start at (n = 0) or at (n = 1), but the idea is that fib(n) = (fib(n-2) + fib(n-1)), so it is 0, 1, 1, 2, 3, 5, 8, etc. Here is a completely procedural loop to compute a table of Fibonacci numbers:

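BEGIN
DECLARE @n INTEGER; SET @n = 1;
DECLARE @f0 BIGINT; SET @f0 = 0; -- current value; INTEGER would overflow
DECLARE @f1 BIGINT; SET @f1 = 1; -- previous value
DECLARE @prev BIGINT;

INSERT INTO Fibonacci (n, fib) VALUES (@n, @f0); -- first row
WHILE (@n < 93)
  BEGIN
  SET @prev = @f0; -- save the current value before overwriting it
  SET @f0 = @f0 + @f1;
  SET @f1 = @prev;
  SET @n = @n + 1;
  INSERT INTO Fibonacci (n, fib) VALUES (@n, @f0);
  END;
END;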

Since the Fibonacci series has a recursive definition, it makes the semi-procedural programmer feel good to use it in a recursive CTE that can be used in an INSERT INTO statement.

WITH Fibonacci (n, f0, f1)
AS
(SELECT n, f0, f1 -- fixpoint step
   FROM (VALUES (CAST(1 AS BIGINT), CAST(0 AS BIGINT), CAST(1 AS BIGINT)))
        AS Seed (n, f0, f1)
 UNION ALL
 SELECT n + 1, f0 + f1, f0 -- recursion step
   FROM Fibonacci
  WHERE n < 93) -- upper limit for BIGINT
SELECT n, f0 -- results
  FROM Fibonacci;
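The recursive CTE can be tried out without a SQL Server instance. The sketch below runs the same seed row and recursion step through Python's sqlite3 module, so treat it as an approximation: SQLite spells the clause WITH RECURSIVE and does not need the CASTs, and the function name fibonacci_rows is hypothetical.

```python
import sqlite3

def fibonacci_rows(limit=93):
    """Run the recursive CTE in SQLite and return (n, f0) pairs for n = 1..limit."""
    conn = sqlite3.connect(":memory:")
    rows = conn.execute(
        """
        WITH RECURSIVE Fibonacci (n, f0, f1) AS
        (VALUES (1, 0, 1)              -- seed row: n, current value, previous value
         UNION ALL
         SELECT n + 1, f0 + f1, f0     -- recursion step
           FROM Fibonacci
          WHERE n < ?)                 -- 93 is the last row that fits a 64-bit integer
        SELECT n, f0
          FROM Fibonacci
         ORDER BY n
        """,
        (limit,),
    ).fetchall()
    conn.close()
    return rows

if __name__ == "__main__":
    for n, fib in fibonacci_rows(10):
        print(n, fib)
```

Note the numbering this CTE produces: row n = 1 holds 0 and row n = 2 holds 1, so each value sits one position later than under the fib(0) = 0 convention.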

But recursion is actually a procedural technique. It is also expensive, since it is really a cursor under the covers. Another way to do this is with a mathematical trick called a closed form. Iteration and recursion are both avoided by a computation in a simple expression. In the case of the Fibonacci series, you can use the Golden Ratio (Wikipedia: The Golden Ratio), or phi, in a formula.

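INSERT INTO Fibonacci (n, fib)
SELECT seq,
       ROUND (((POWER ((1.0 + SQRT (5.0)) / 2.0, seq) -- phi, about 1.6180339887
              - POWER ((1.0 - SQRT (5.0)) / 2.0, seq)) -- 1 - phi
              / SQRT (5.0)), 0)
  FROM Series
 WHERE seq <= 70; -- FLOAT precision runs out at about this point, well before the BIGINT limit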
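The closed form depends on floating point arithmetic, so it is worth checking how far it can be trusted. The sketch below is plain Python rather than SQL; it assumes IEEE double precision, which is what a SQL FLOAT normally is, and the function names are made up for the example. It compares the closed form against exact integer arithmetic and reports the first n at which they disagree.

```python
import math

PHI = (1.0 + math.sqrt(5.0)) / 2.0  # the Golden Ratio, about 1.6180339887

def fib_closed_form(n):
    """Closed-form value of fib(n), computed in floating point and rounded."""
    return round((PHI ** n - (1.0 - PHI) ** n) / math.sqrt(5.0))

def fib_exact(n):
    """Exact fib(n) by iteration, with fib(0) = 0 and fib(1) = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

def first_mismatch(limit=93):
    """Smallest n below limit where the closed form is wrong, or None if there is none."""
    for n in range(limit):
        if fib_closed_form(n) != fib_exact(n):
            return n
    return None

if __name__ == "__main__":
    print("fib(10) by closed form:", fib_closed_form(10))
    print("first n where the closed form is wrong:", first_mismatch())
```

A double carries about 15 to 16 significant digits and the largest Fibonacci number that fits in a BIGINT has 19, so the closed form goes wrong before the integer type overflows. The exact cut-off may depend on the platform's math library, which is why the sketch computes it rather than asserting it.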

Avoiding Integrity Triggers

Procedural code can appear in triggers and here is where you will find problems. Triggers are the real semi-procedural code in SQL. They are procedural code out of the user's immediate control, attached to a table for the purpose of data integrity, computing special columns and external actions like audits. Data integrity used to be done with triggers because that is all we had.

Today, most of those integrity triggers can be replaced by declarative referential integrity (DRI) actions. Each one performs a simple action on the referencing table when a referenced row is deleted or updated. The actions are optional clauses on the FOREIGN KEY constraint in the DDL. The syntax is:

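[ON DELETE {NO ACTION | CASCADE | SET NULL | SET DEFAULT}]
[ON UPDATE {NO ACTION | CASCADE | SET NULL | SET DEFAULT}]

NO ACTION: An error message tells the user that the action is not allowed because it would leave orphaned referencing rows, and we get a ROLLBACK of the statement. This is the default.
CASCADE: Propagates the change to the referencing rows. ON DELETE removes all rows that reference the deleted key; ON UPDATE gives them the new key value.
SET NULL: Sets the referencing columns to NULL. All foreign key columns for the table must be able to accept NULLs.
SET DEFAULT: Sets the referencing columns to the default value defined for each column. All foreign key columns for the table must have defaults defined for them.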
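To see three of these actions at work, here is a small sketch run through Python's sqlite3 module. It rests on several assumptions: SQLite stands in for SQL Server, it only enforces foreign keys after PRAGMA foreign_keys = ON, and the Departments and Personnel layouts and the function name personnel_after_delete are invented for the example.

```python
import sqlite3

def personnel_after_delete(action):
    """Delete department 10 under the given ON DELETE action and report what happened."""
    if action not in ("NO ACTION", "CASCADE", "SET NULL"):
        raise ValueError("unsupported action: " + action)
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific switch
    conn.execute("CREATE TABLE Departments (dept_nbr INTEGER PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE Personnel ("
        " emp_name TEXT PRIMARY KEY,"
        " dept_nbr INTEGER REFERENCES Departments (dept_nbr) ON DELETE " + action + ")"
    )
    conn.executemany("INSERT INTO Departments VALUES (?)", [(10,), (20,)])
    conn.executemany("INSERT INTO Personnel VALUES (?, ?)", [("Alice", 10), ("Bob", 20)])
    try:
        conn.execute("DELETE FROM Departments WHERE dept_nbr = 10")
        outcome = "deleted"
    except sqlite3.IntegrityError:
        outcome = "rejected"  # NO ACTION: the DELETE itself fails
    rows = conn.execute(
        "SELECT emp_name, dept_nbr FROM Personnel ORDER BY emp_name"
    ).fetchall()
    conn.close()
    return outcome, rows

if __name__ == "__main__":
    for action in ("NO ACTION", "CASCADE", "SET NULL"):
        print(action, personnel_after_delete(action))
```

No trigger code is involved in any of the three cases; the behavior comes entirely from the clause on the foreign key.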

INSTEAD OF Triggers: The Good, the Bad and the Ugly.

But some triggers cannot be converted this way. Oops! Data integrity can be more complicated. The INSTEAD OF trigger was invented to solve a problem that has no other solution. In SQL, a VIEW can be updatable if it meets certain conditions. These conditions are pretty limited. The VIEW has to be built on one base table or on views that resolve to one base table. The VIEW has to include a key for the base. The columns that are in the base table, but not exposed in the VIEW, have to have DEFAULT values. None of the columns in the VIEW can be computed.

Are more general VIEWs updatable? Yes, in theory. Can we determine if an arbitrary general VIEW is updatable? No, in theory. Standard SQL and most products played it safe; we look for the easiest case. Look at these two VIEWs, assuming that employees are assigned to one and only one department:

CREATE VIEW Personnel_In (last_name, first_name)
AS
SELECT last_name, first_name
  FROM Personnel
 WHERE dept_nbr IN (10, 20);

and

CREATE VIEW Personnel_Unioned (last_name, first_name)
AS
SELECT last_name, first_name
  FROM Personnel
 WHERE dept_nbr = 10
UNION ALL
SELECT last_name, first_name
  FROM Personnel
 WHERE dept_nbr = 20;

The Personnel_In VIEW is updatable, but the Personnel_Unioned VIEW is not. Never mind that they are logically identical. Here is another situation:

CREATE VIEW Foobar (a, b, ab)
AS
SELECT a, b, (a * b) FROM Barfoo;

It is easy to see that this makes sense and would work:

UPDATE Foobar
   SET a = 6, b = 2;

The computation would then give us 12 for ab. But this statement will fail without a computed column construct or constraint.

UPDATE Foobar
   SET ab = 12;

We have no rules for splitting up the computed “ab” value into columns a and b.

If you like reading a bit of theory, look at the references at the end of this article.

The solution was the INSTEAD OF trigger. Instead of doing the insert, update or delete action on the VIEW (any of which might be ambiguous), the database runs the trigger's procedural code against the underlying base tables or updatable VIEWs that make up the VIEW. The INSTEAD OF trigger is a BEFORE trigger, in spite of the fact that the database event that it precedes never happens.
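A minimal sketch of the idea, using the unioned Personnel view from earlier, is shown below. It is run through Python's sqlite3 module and makes several assumptions that are not in the article: an emp_id key column has been added so the trigger can find the base row, the trigger name is invented, and SQLite triggers fire once per row with OLD and NEW, whereas SQL Server fires once per statement and exposes the inserted and deleted tables.

```python
import sqlite3

def rename_through_view():
    """Update a non-updatable UNION ALL view by way of an INSTEAD OF trigger."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE Personnel
        (emp_id INTEGER PRIMARY KEY,      -- hypothetical key column
         last_name TEXT NOT NULL,
         first_name TEXT NOT NULL,
         dept_nbr INTEGER NOT NULL);

        INSERT INTO Personnel VALUES (1, 'Smith', 'Ann', 10), (2, 'Jones', 'Bob', 20);

        CREATE VIEW Personnel_Unioned (emp_id, last_name, first_name)
        AS
        SELECT emp_id, last_name, first_name FROM Personnel WHERE dept_nbr = 10
        UNION ALL
        SELECT emp_id, last_name, first_name FROM Personnel WHERE dept_nbr = 20;

        -- The trigger spells out what an UPDATE on the view means for the base table.
        CREATE TRIGGER Personnel_Unioned_Update
        INSTEAD OF UPDATE ON Personnel_Unioned
        BEGIN
            UPDATE Personnel
               SET last_name = NEW.last_name,
                   first_name = NEW.first_name
             WHERE emp_id = OLD.emp_id;
        END;
        """
    )
    # The UPDATE is aimed at the view; the trigger redirects it to the base table.
    conn.execute("UPDATE Personnel_Unioned SET last_name = 'Brown' WHERE emp_id = 2")
    rows = conn.execute(
        "SELECT emp_id, last_name, first_name FROM Personnel ORDER BY emp_id"
    ).fetchall()
    conn.close()
    return rows

if __name__ == "__main__":
    print(rename_through_view())
```

The UPDATE against the view never happens as such. The only change to the data is the one the trigger body makes to Personnel.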

I have a theory that you should not write more than five triggers in your career. But writing a trigger does not mean you should forget about table constraints. If you have ever worked with sales commissions, particularly in the insurance industry, you know that they can be elaborate. We had an INSTEAD OF trigger on a complicated multi-table view that did a lot of computations and enforced a lot of business rules. The rules were driven by considerations like the salesman's level, the type of policy sold, how his sales team was doing and other things. You could often compute a commission several ways and we tried to optimize it or standardize it in the VIEW.

The users did not know what was happening under the covers, and we did not want them to know. The answer would appear in the VIEW by magic. If a rule changed before a payday, the VIEW would reflect the new rules for everyone. The previous system depended on the front end programmers doing the changed computations, and coordinating their programs was a real pain and slow. We were so proud of this trigger.

However, we spent so much time and effort on the VIEW that we forgot to look at the tables on which it was built. The lack of a simple check on the upper limit of one variable let an oversized value get into a base table. That led to commissions that were greater than the price of the policy: nice for the salesmen but not for the company. The front end users could not do anything since they did not have access to the base tables.

The algebra in the VIEW was correct, so we knew we had to go through the base tables for bad data. This was harder than it sounds. We would flush out the bad data and feel good. But without a CHECK(), it would creep back in. As Graeme Simsion says, “mop the floor, then fix the leak”, and fixing the leak is what we were failing to do.
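The kind of constraint that was missing can be sketched in a few lines. The table and column names below are hypothetical, since the article does not give the real schema, and the example runs through Python's sqlite3 module rather than SQL Server. The point is only that the base table itself refuses a commission larger than the policy price.

```python
import sqlite3

def try_commission(policy_price, commission_amt):
    """Return True if the row is accepted by the base table, False if a CHECK() rejects it."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical base table; the upper limit is enforced where the data lives.
    conn.execute(
        """CREATE TABLE Commissions
           (policy_nbr INTEGER PRIMARY KEY,
            policy_price REAL NOT NULL CHECK (policy_price > 0),
            commission_amt REAL NOT NULL,
            CHECK (commission_amt BETWEEN 0 AND policy_price))"""
    )
    try:
        conn.execute(
            "INSERT INTO Commissions (policy_price, commission_amt) VALUES (?, ?)",
            (policy_price, commission_amt),
        )
        return True
    except sqlite3.IntegrityError:
        return False  # the constraint stopped the bad row at the door
    finally:
        conn.close()

if __name__ == "__main__":
    print(try_commission(1000.00, 150.00))   # a plausible commission
    print(try_commission(1000.00, 1500.00))  # a commission larger than the policy price
```

With the constraint in place, the bad data cannot creep back in after a clean-up, whatever the VIEW or the trigger on top of it does.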

References:

http://msdn.microsoft.com/en-us/library/ms175521.asp
http://goldennumber.net/
"The Golden Ratio: The Story of PHI, the World's Most Astonishing Number" by Mario Livio.
"The Golden Section: Nature's Greatest Secret" by Scott Olsen.
"The Divine Proportion" by Herbert Edwin Huntley.
"The Fabulous Fibonacci Numbers" by Alfred S. Posamentier.
"A Mathematical History of the Golden Number" by Roger Herz-Fischler.
"The Golden Section (Spectrum)" by Hans Walser.
