by Boris Knizhnik,
BIK Information Services, Inc. |
To my DBAs:
Susan Zweighaft, Tracy Chase, Sue Carbo and Jon Feld |
Everyone knows SQL is a great language and it really pays to learn it in-depth. If only we had time to do so! We, full time GUI programmers, spend time learning intricacies of Powerbuilder, or VB or Delphi, or whatever else is out there, because we are hired to do just that. We never have enough time for serious SQL learning, because we are supposed to have a DBA to help when we need it. Yeah, right!
Well, first of all I am yet to see a place where DBAs are plenty and work for them is scarce so they really have time to work with us, mortals. Have you ever seen a DBA who could give you his undivided attention for more than 5 minutes without his beeper going off? Second, how are we supposed to know we need help? I personally can achieve whatever I want by running 8 SQL select statements and 3 cursors. How am I supposed to know I could do just that by using SQL more creatively? Where have you been before? I am certainly not going to change it now, it’s working.
I would like to present a couple of ideas I developed in attempts to learn some tricks to use SQL to achieve the results that usually are achieved in a way, that is half SQL- and half GUI-based.
I can see a lot of critic from DBA community and from SQL gurus. Hey, guys, you may know better, but I am writing for my community – GUI oriented guys.
Actually, I need an SQL function called DECODE in Oracle. There is a similar feature in MS SQL server and Sybase called CASE clause with similar capabilities but a lot more awkward syntax. Nevertheless, it can be used if you don’t have Oracle.
A couple of words about DECODE function:
DECODE(expr, comp_value1, subst_value1, comp_value2, subst_value2, … ,subst_value_if_nomatch) |
This function will compare expr with the list of predefine values.
If
expr=comp_value1 then it returns subst_value1, if expr=comp_value2 then
it returns subst_value2. If expr isn’t equal to any of the values, the
function will return subst_value_if_nomatch. For example:
SELECT DECODE(sex, 'F', 'FEMALE', 'M', 'MALE', NULL, 'Not Reported', 'Invalid Value') FROM personnel; |
Table emp1:
empno | ename | job | sal | comm | deptno |
7369 | SMITH | 25 | 800 | 20 | |
7499 | ALLEN | 23 | 1600 | 300 | 30 |
7521 | WARD | 23 | 1250 | 500 | 30 |
7566 | JONES | 13 | 2975 | 20 | |
7654 | MARTIN | 23 | 1250 | 1400 | 30 |
7698 | BLAKE | 13 | 2850 | 30 | |
7782 | CLARK | 13 | 2450 | 10 | |
7788 | SCOTT | 20 | 3000 | 20 | |
7839 | KING | 12 | 5000 | 10 | |
7844 | TURNER | 23 | 1500 | 30 | |
7876 | ADAMS | 25 | 1100 | 20 | |
7900 | JAMES | 25 | 950 | 30 | |
7902 | FORD | 20 | 3000 | 20 | |
7934 | MILLER | 25 | 1300 | 10 |
SELECT empno |
you can always comment out pieces of it without caring too much about leftover commas. It just makes you life so much easier.
SELECT empno |
Well, what about columns, which don’t seem to make that much sense:
SELECT empno |
Here are the results:
empno | Ename | job | comm | deptno | clerk | salesman | manager | analyst | president |
7369 | SMITH | 25 | 20 | 1 | 0 | 0 | 0 | 0 | |
7499 | ALLEN | 23 | 300 | 30 | 0 | 1 | 0 | 0 | 0 |
7521 | WARD | 23 | 500 | 30 | 0 | 1 | 0 | 0 | 0 |
7566 | JONES | 13 | 20 | 0 | 0 | 1 | 0 | 0 | |
7654 | MARTIN | 23 | 1400 | 30 | 0 | 1 | 0 | 0 | 0 |
7698 | BLAKE | 13 | 30 | 0 | 0 | 1 | 0 | 0 | |
7782 | CLARK | 13 | 10 | 0 | 0 | 1 | 0 | 0 | |
7788 | SCOTT | 20 | 20 | 0 | 0 | 0 | 1 | 0 | |
7839 | KING | 12 | 10 | 0 | 0 | 0 | 0 | 1 | |
7844 | TURNER | 23 | 30 | 0 | 1 | 0 | 0 | 0 | |
7876 | ADAMS | 25 | 20 | 1 | 0 | 0 | 0 | 0 | |
7900 | JAMES | 25 | 30 | 1 | 0 | 0 | 0 | 0 | |
7902 | FORD | 20 | 20 | 0 | 0 | 0 | 1 | 0 | |
7934 | MILLER | 25 | 10 | 1 | 0 | 0 | 0 | 0 |
First, what does a column like "decode(job,'25',1,0) clerk" represent ? It will contain 1 if the person is a clerk and 0 if not. Similarly, column ‘salesman’ will be equal to 1 if a person is a salesman or 0 if not.
Essentially, what we just did, we created self-explanatory column
names
that could be used for documentation purposes and/or conditions:
If you are lucky and your DBA allows you to create a view like that,
let’s call it emp_jobs_vw you can select all managers by select
statement :
SELECT * FROM emp_jobs_vw WHERE manager = 1; |
If you can’t create views you can still write your like this:
SELECT * FROM |
Well, it may not look such a big deal since you can do the same thing by
SELECT * FROM emp1 WHERE job='13'; |
Obviously we are loosing something here: If you had an index on column ‘job’, your optimizer could have used it for WHERE job=’13’ and wouldn’t use it for ‘manager = 1’. This is true but I encourage you to try it first and see whether performance goes down significantly and also consider some of the other suggestions in this article before you discard these tricks as inefficient. Consider this: Your total response time consists of the time it takes server to process your requests and the time it takes your GUI to prepare results for displaying. From this prospective you always have to weigh superefficient SQL that returns results in inconvenient format and your having to massage it in your GUI program versus maybe less efficient SQL that returns results in a ready-to-use format. I am not even mentioning maintenance problems, etc.
Also, don’t get me wrong – I am not lobbying for usage of words for all possible columns that contain code instead of English word. Certain applications or databases seem to lean heavily on a few concepts for which it may be convenient to have English-like words.
Let’s assume we have an application where we want to give a bird’s view of a store.
Let’s consider this SQL:
SELECT empno |
Here are the results:
empno | ename | Job | sal | Low in a food chain | Underpaid manager | Overpaid salesman | mid level clerk |
7369 | SMITH | 25 | 800 | 1 | 0 | 0 | 0 |
7499 | ALLEN | 23 | 1600 | 1 | 0 | 1 | 0 |
7521 | WARD | 23 | 1250 | 1 | 0 | 0 | 0 |
7566 | JONES | 13 | 2975 | 0 | 0 | 0 | 0 |
7654 | MARTIN | 23 | 1250 | 1 | 0 | 0 | 0 |
7698 | BLAKE | 13 | 2850 | 0 | 1 | 0 | 0 |
7782 | CLARK | 13 | 2450 | 0 | 1 | 0 | 0 |
7788 | SCOTT | 20 | 3000 | 0 | 0 | 0 | 0 |
7839 | KING | 12 | 5000 | 0 | 0 | 0 | 0 |
7844 | TURNER | 23 | 1500 | 1 | 0 | 1 | 0 |
7876 | ADAMS | 25 | 1100 | 1 | 0 | 0 | 1 |
7900 | JAMES | 25 | 950 | 1 | 0 | 0 | 1 |
7902 | FORD | 20 | 3000 | 0 | 0 | 0 | 0 |
7934 | MILLER | 25 | 1300 | 1 | 0 | 0 | 0 |
In the inner SELECT we added columns like decode(deptno,10,1,0) dept10 – the same approach – we will have this column equal 1 if this employee works for department 10, and 0 if not, etc.
The outer SELECT assigns names to certain conditions. For instance,
manager * decode(sign(sal - 2900),-1,1,0) underpaid_manager |
What we did here is we assigned names to relatively complex conditions. If a programmer needs to select underpaid managers he/she may code WHERE underpaid_manager =1. Those of you who worked in COBOL environment would probably recognize the 88-level names.
If we could create a view like above mentioned, how much easier the maintenance would be, because the intricacies of who is underpaid manager and who is overpaid salesman would be hidden in the view rather than buried in guts of a GUI program!
With this select statement your GUI program can use columns ‘underpaid_manager’ or ‘overpaid salesman’ as auxiliary columns to select graphical presentation, such as different background colors or bitmaps. It needn’t worry anymore about the complexity of the condition, causing this difference, no complex logic to determine who is who in your program is required. Isn’t it wonderful?
SELECT SUM(dept10) employees_in_dept10 |
Employees in dept10 | Employees in dept20 | Employees in dept30 | N of employees | N of clerks | N of sales men | N of mana gers | N of ana lysts | N of presi dents |
3 | 5 | 6 | 14 | 4 | 4 | 3 | 2 | 1 |
This code would give us counts of people by all three departments
and
also by job. Now, honest, how many of us if application requires these
kind of counts would with very little hesitation code eight separate
queries
for that, or if we are doing it after drinking our morning coffee we
will
end up with a union such as
SELECT 'D' record_type |
And after a little coding efforts in your GUI program to tell what kind of record (‘D’ or ‘J’) you are looking at and decoding the values we will get the same result.
Again, obviously the latter solution is in a way universal, but if a number of items you are interested in is fixed I think the former solution is easier.
SELECT SUM(clerk) n_of_clerks |
N of clerks | Clerk salary total | N of clerks in dept20 | dept20 clerk salary total |
4 | 4150 | 2 | 1900 |
This select does the same thing as these two:
SELECT COUNT(empno) n_of_clerks_in_dept20 |
SELECT COUNT(empno) n_of_clerks |
If you are really proficient with SQL you probably know that you can put a complex expression as an argument for a ‘SUM’ or ‘COUNT’ function and therefore get all the results in one select statement. In such case these expressions will essentially be of the same level of complexity as mine are, and it will just prove my point that this may be the way to go.
Look at this example:
SELECT sum(low_in_a_food_chain) low_in_a_food_chain |
Low in a food chain | Underpaid managers | Overpaid salesmen | mid level clerks |
8 | 2 | 2 | 2 |
If you think it looks threatening - it does. I just want to remind you, I am inserting what could be a view as in-line SELECT statement, and a list of columns is given as one on a line. In any case, what matters here is not so much the time you spend typing this SELECT as the time it takes to run it. Try to use this approach if you have a lot of similar queries and you will see it saves you a lot on a total time.
CREATE TABLE emp1 ( |
If you have any comments about this article please send them to Boris Knizhnik borisk@bikinfo.com