Click here to monitor SSC
  • Av rating:
  • Total votes: 5
  • Total comments: 0
Joe Celko

Procedural, Semi-Procedural and Declarative Programing Part II

02 March 2010

SQL Server accommodates a whole range of programming styles and will even allow you to create code that is wholly procedural. Is a declarative approach inevitably better? Can it be difficult to maintain? Can you avoid the performance problems of procedural code by using triggers? Joe adds some thoughts.

In the last article, I tried to show you how those programmers who are learning SQL tend to carry on trying to solve database problems in a procedural way. There is a big leap from a procedural mindset to a declarative one for most programmers. Most of them don't quite make that leap all at once, but  make a gradual step-wise transition from procedural to semi-procedural programming styles.

Procedural code can appear in procedures, but is this necessarily always true? Procedures can be no more than a BEGIN-END block with a sequence of SQL statements without any IF-THEN-ELSE or WHILE-DO loop logic in it.  Is such a block procedural or declarative when all it has is one declarative statement in it? I would say it was declarative. Is a block procedural or declarative when it has IF-THEN-ELSE or WHILE-DO loop control logic? I would say 'procedural'.

You can get rid of a lot of IF-THEN-ELSE control logic with CASE expressions. Before the CASE expression, there were unexpected pitfalls in  trying to apply procedural logic to SQL. The classic example is an UPDATE statement that was part of Sybase/SQL Server classes for decades. You have a bookstore and want to change the prices of the books. Any book over $25 will be discounted by 10% (we will advertise that) and books under $25 will be increased by 15% (we will not advertise that). The immediate solution  is to write this:

BEGIN

UPDATE Books

   SET price = price * 1.10

 WHERE price < 25.00;

UPDATE Books

   SET price = price * 0.85

 WHERE price >= 25.00;

END;

But it does not work! Look at a book that sells for $24.95 currently. Its price jumps to $27.45 when the first UPDATE is done. But when we do the second UPDATE, the price goes down to $23.33 finally. That is not what we meant to do. Flipping the updates does not help.

This was the classic argument for cursors. Hang in a loop and use an IF-THEN-ELSE statement to do the update of the current row in the cursor, just like a magnetic tape file. But this is not required today. We have the CASE expression, which is declarative.

UPDATE Books

   SET price

       = CASE

         WHEN price < 25.00

         THEN price * 1.10

         ELSE price * 0.85

         END;

Loops can be replaced with various constructs most of which apply set-oriented operations to the table involved, instead of doing RBAR (pronounced “re-bar”, like the steel rods used in concrete construction; RBAR, is an acronym for 'Row By Agonizing Row' coined by Jeff Moden). But another common change is to use the ROW_NUMBER() and other ordinal functions to replace a counting loop in procedural code.

Procedural, Semi-Procedural and Declarative solutions: an example

If you look at Part I of this series of articles, you will see recursion being used to create a table of sequential numbers. It was replaced with a declarative look-up table. But let's take a similar problem, the calculation of the Fibonacci series, and look at the Procedural, Semi-Procedural and Declarative approaches to it. I'm not trying to suggest that this is a practical problem: If you really needed this data, then you'd download it into a table (from The Fibonacci numbers). The problem is to build a table of Fibonacci numbers with n and the n-th Fibonacci number. I do not want to talk about the Fibonacci series. No, that is a lie. I would love to write a whole book on it, but a lot of other people beat me to it (see the references). Darn! The usual definition for the series is recursive:

FUNCTION fib(n)IF n = 0

THEN RETURN 0;

ELSE IF n = 1

     THEN RETURN 1;

     ELSE IF n > 1

          THEN RETURN (fib(n-2) + fib(n-1));

          END IF;

     END IF;

END IF;

Using Computation to Replace a Look-up

The most extreme example I can remember of using computation to replace a  look up was decades ago at Georgia Tech when we had CDC Cyber series computer. The hardware had the best floating point hardware and speed in its days. To give you an idea of what I mean, CDC had a COBOL compiler that converted COBOL picture data to floating point numbers, did the math and converted back to COBOL picture formats. It out-performed IBM machines on campus.

Rather than do table look-up in FORTRAN, one of our sales team members had just had a course on Chebyshev polynomials and fitting data to curves (Wikipedia: Chebyshev polynomials) and had a software package to create these polynomials. The formula that was impossible for a human being to understand. But it was faster than reading a disk and the data had a pattern that worked well with polynomials. Do not do this on anything but a supercomputer. I will now avoid the issue of performance versus maintenance.

There is a little debate about whether to start at (n = 0) or at (n =1), but the idea is that fib(n) = (fib(n-2) + fib(n-1)), so it is 0,1,1,2,3,5,8, etc. Here is a completely procedural loop to compute a table of Fibonacci numbers:

BEGIN

DECLARE @n INTEGER; SET @n = 1;

DECLARE @f0 INTEGER; SET @f0 = 0;

DECLARE @f1 INTEGER; SET @f1 = 1;

 

WHILE (@n < 93)

BEGIN

SET @f1 = @f0 + @f1

SET @f0 = @f1;

SET @n = @n + 1;

INSERT INTO Fibonacci (@n, @f0);

END;

Since the Fibonacci series has a recursive definition, it makes the semi-procedural programmer feel good to use it in a recursive CTE than can be used in an INSERT INTO statement.

WITH Fibonacci(n, f0, f1)

AS

(SELECT n, f, f1 --fixpoint step

   FROM (VALUES (CAST(1 AS BIGINT), CAST(0 AS BIGINT), CAST(1 AS BIGINT)))

 UNION ALL

 SELECT n + 1, f0 + f1, f0 -- recursion step

   FROM Fibonacci

  WHERE n < 93) -- upper limit for BIGINT

SELECT n, f0 -- results

 FROM Fibonacci;

But recursion is actually a procedural technique. It is also expensive since it is really a cursor under the covers. Another way to do this is with a mathematical trick called a closed form. Iteration and recursion are both avoided by a computation in a simple expression. In the case of the Fibonacci, you can use the Golden Ratio (Wikipedia: The Golden Ratio) or phi, in a formula.

INSERT INTO Fibonacci(n, fib)

SELECT seq,

       ROUND (((POWER (1.6190339887, @n)- POWER (1.0 - 1.6190339887, @n))

            /SQRT (5.0)), 0)

  FROM Series

 WHERE seq < 93;

Avoiding Integrity Triggers

Procedural code can appear in triggers and here is where you will find problems. Triggers are the real semi-procedural code in SQL. They are procedural code out of the users immediate control, attached to a table for the purpose of data integrity, computing special columns and external actions like audits. Data integrity used to be done with triggers because that is all we had.

Today, most of those integrity triggers can be replaced by declarative DRI Actions. They perform a simple action for DELETE and UPDATE action. The actions are option clauses on the DDL. The syntax is:

[ON UPDATE | ON DELETE][NO ACTION | CASCADE | SET NULL | SET DEFAULT]

  • NO ACTION:        An error message tells the user that the action is not allowed and we get a ROLLBACK.
  • CASCADE:          Deletes all rows containing data involved in the foreign key relationship.
  • SET NULL:          Sets the value to NULL if all foreign key columns for the table can accept NULLs.
  • SET DEFAULT:  Sets the value to the default value defined for the column if all foreign key columns for the table have defaults defined for them.

INSTEAD OF Triggers: The Good, the Bad and the Ugly.

But some triggers cannot be converted this way. Oops! Data integrity can be more complicated. The INSTEAD OF trigger was invented to solve a problem that has no other solution. In SQL, a VIEW can be updatable if it meets certain conditions. These conditions are pretty limited. The VIEW has to be built on one base table or on views that resolve to one base table. The VIEW has to include a key for the base. The columns that are in the base table, but not exposed in the VIEW, have to have DEFAULT values. None of the columns in the VIEW can be computed.

Are more general VIEWs updatable? Yes, in theory. Can we determine if an arbitrary general VIEW is updatable? No, in theory. Standard SQL and most products played it safe; we look for the easiest case. Look at these two VIEWs, assuming that employees are assigned to one and only one department:

CREATE VIEW Personnel_In (last_name, first_name)

AS

SELECT last_name, first_name

FROM Personnel

WHERE dept_nbr IN (10, 20);

and

CREATE VIEW Personnel_Unioned (last_name, first_name)

AS

SELECT last_name, first_name

FROM Personnel

WHERE dept_nbr = 10

UNION ALL

SELECT last_name, first_name

  FROM Personnel

 WHERE dept_nbr = 20;

The Personnel_In VIEW is updatable, but the Personnel_Union VIEW is not. Never mind that they are logically identical. Here is another situation:

CREATE VIEW Foobar (a, b, ab)

AS

SELECT a, b, (a+b) FROM Barfoo;

It is easy to see that this makes sense and would work:

UPDATE Foobar

   SET a = 6, b = 2;

The computation would then give us 12 for ab. But this statement will fail without a computed column construct or constraint.

UPDATE Foobar

   SET ab = 12;

We have no rules for splitting up the computed “ab” value into columns a and b.

If you like reading a bit of theory, look at the references at the end of this article.

The solution was the INSTEAD OF trigger. Instead of doing an insert, update or delete action (all of which might be ambiguous), we perform the procedural code in the underlying base tables or updatable VIEWS that make up the VIEW with the INSTEAD OF trigger. The INSTEAD OF trigger is a BEFORE trigger, in spite of the fact that the database event that it precedes never happens.

I have a theory that you should not write more than five triggers in your career. But writing a trigger does not mean you should forget about table constraints. If you have ever worked with sales commissions, particularly in the Insurance industry, you know that they can be elaborate. We had an INSTEAD OF trigger on a complicated multi-table view that did a lot of computations and enforced a lot of business rules. The rules were driven by considerations like the salesman's level, the type of policy sold, how his sales team was doing and other things. You could often compute a commission several ways and we tried to optimize it or standardize it in the VIEW.

The users did not know what was happening under the covers, and we did not want them to know. The answer would appear in the VIEW by magic. If a rule changed before a payday, the VIEW would reflect the new rules for everyone. The previous system depended on the front end programmers doing the changed computations, and coordinating their programs was a real pain and slow. We were so proud of this trigger.

However, we spent so much time and effort on the VIEW that we forgot to look at the tables which build it. The lack of a simple check on the upper limit of one variable let an oversized value get into a base table. That lead to commissions that were greater than the price of the policy – nice for the salesmen but not for the company. The front end users could not do anything since they did not have access to the base tables.

The algebra in the VIEW was correct, so we knew we had to go through the base tables for bad data. This was harder than it sounds. We would flush out the bad data and feel good. But without a CHECK(), it would creep back in. As Graeme Simsion says, “mop the floor, then fix the leak” and is what we were failing to do.

References:

  • http://msdn.microsoft.com/en-us/library/ms175521.asp
  • http://goldennumber.net/
  • "The Golden Ratio: The Story of PHI, the World's Most Astonishing Number" by Mario Livio.
  • "The Golden Section: Nature's Greatest Secret" by Scott Olsen.
  • "The Divine Proportion" by Herbert Edwin Huntley.
  • "The Fabulous Fibonacci Numbers" by Alfred S. Posamentier.
  • "A Mathematical History of the Golden Number" by Roger Herz-Fischler.
  • "The Golden Section (Spectrum)" by Hans Walser.
Joe Celko

Author profile:

Joe Celko is one of the most widely read of all writers about SQL, and was the winner of the DBMS Magazine Reader's Choice Award four consecutive years. He is an independent consultant living in Austin, TX. He has taught SQL in the US, UK, the Nordic countries, South America and Africa.
He served 10 years on ANSI/ISO SQL Standards Committee and contributed to the SQL-89 and SQL-92 Standards.
He has written over 800 columns in the computer trade and academic press, mostly dealing with data and databases. He is the author of eight books on SQL for Morgan-Kaufmann, including the best selling SQL FOR SMARTIES.
Joe is a well-known figure on Newsgroups and Forums, and he is famous for his his dry wit. He is also interested in Science Fiction.

Search for other articles by Joe Celko

Rate this article:   Avg rating: from a total of 5 votes.


Poor

OK

Good

Great

Must read
Have Your Say
Do you have an opinion on this article? Then add your comment below:
You must be logged in to post to this forum

Click here to log in.
 
Simple-Talk Database Delivery

DLM
Patterns & Practices Library

Visit our patterns and practices library to learn more about database lifecycle management.

Find out how to automate the process of building, testing and deploying your database changes to reduce risk and make rapid releases possible.

Get started

Phil Factor
A Start with Automating Database Configuration Management

For a number of reasons, it pays to have the up-to-date source of all the databases and servers that you're... Read more...

 View the blog

Top Rated

Identifying and Solving Index Scan Problems
 When you're developing database applications, it pays to check for index scans in the SQL Server query... Read more...

A Start with Automating Database Configuration Management
 For a number of reasons, it pays to have the up-to-date source of all the databases and servers that... Read more...

Archiving Hierarchical, Deleted Transactions Using XML
 When you delete a business transaction from the database, there are times when you might want to keep a... Read more...

The Mindset of the Enterprise DBA: Harnessing the Power of Automation
 After you have done the necessary groundwork of standardizing and centralizing your database... Read more...

Rollback and Recovery Troubleshooting; Challenges and Strategies
 What happens if your database deployment goes awry? Do you restore from a backup or snapshot and lose... Read more...

Most Viewed

Beginning SQL Server 2005 Reporting Services Part 1
 Steve Joubert begins an in-depth tour of SQL Server 2005 Reporting Services with a step-by-step guide... Read more...

Ten Common Database Design Mistakes
 If database design is done right, then the development, deployment and subsequent performance in... Read more...

SQL Server Index Basics
 Given the fundamental importance of indexes in databases, it always comes as a surprise how often the... Read more...

Concatenating Row Values in Transact-SQL
 It is an interesting problem in Transact SQL, for which there are a number of solutions and... Read more...

Reading and Writing Files in SQL Server using T-SQL
 SQL Server provides several "standard" techniques by which to read and write to files but, just... Read more...

Why Join

Over 400,000 Microsoft professionals subscribe to the Simple-Talk technical journal. Join today, it's fast, simple, free and secure.