This is the fourth of several blog articles addressing Db2 performance. All the content of these articles will be based upon Db2 for z/OS. However, much of it can also be applied to Db2 for LUW. The articles will address mostly application performance regarding coding efficient SQL statements as well as efficient Db2 application coding techniques and settings. There is a saying in the industry, “it depends”, which is just another waying of saying “I don’t know”. While these articles will avoid using either of these terms, please keep in mind that your result can and will be different. With that in mind any design decision should be thoroughly tested prior to implementation. For testing references please see
Call Only When Necessary
Yes, indeed the most efficient SQL statement is the one that is never executed. There are two things to remember about Db2. First, it’s a black box of sorts. That is, under the covers it is managing data in a vast number of physical datasets, maintaining the associated metadata that obscures the actual physical location and organization of the data. Thus, relieving the application developer of any knowledge of the physicality of the data they are accessing. All the developer needs to know is the tables and columns they need to access. That comes with an associated cost such that a database call will never be as cheap as a native file system call. Second, any call to Db2 from an application program (all calls to Db2 are from a program) is an external call from that application program. For example, if the call is in a CICS program or COBOL batch program to Db2 for z/OS located in the same z/OS LPAR it is considered a cross-memory call from one address space to another. If the call is coming from a remote application on another server, then that call is not only external from the Db2 distributed address space, but it involves traversing a network as well.
So, how can calls to the Db2 be reduced? There are a lot of things that can be done in this regard, and this is the single most important thing that an application developer can do to improve performance. You see, when a DBA sees an application with excessive “in Db2” times, they can investigate individual SQL statements that the application executes. If they see a few SQL statements using a lot of resources it can be an easy fix without application changes, such as adding an index or running statistics. However, if what they see is a whole lot of inexpensive statements executing then there is little a DBA can do to fix the issue. In this case it’s all on the application developer to reduce the number of calls.
Please see article 1 for call avoidance recommendations.
Please see article 2 for using SQL joins to avoid calls to the data server.
Please see article 3 for using subqueries to avoid calls to the data server as well as performance choices for subqueries and joins.
In this article I’m going to present using stored procedures to avoid excessive calls to the data server!
What is a Stored Procedure?
A stored procedure is a compiled program that can execute SQL statements and is stored at a local Db2 server and can be called from a local or remote Db2 client. You can invoke a stored procedure from an application program or from the command line processor. A single call to a stored procedure from a client application can access the database at the server several times. In Db2 the are several types of stored procedures, but I’ll only be addressing SQLPL stored procedures, otherwise known as native stored procedures. So, a native stored procedure is written entirely in SQL and is stored inside the database as source and executable code. A native stored procedure, when it is deployed, is statically bound into a database package. This makes the execution of a stored procedure exceptionally efficient when compared to a remote application program issuing several dynamic SQL statements as part of a transaction. Whenever I mention stored procedures in this article, I am talking about native SQLPL stored procedures only.
Here is a short example that is by no means an example of a stored procedure that improves performance. The following procedure accepts an input parameter of a department number (using the Db2 sample database as a basis for the examples) and returns a result set containing the employees’ number and last name. A stored procedure can accept and return many parameters as well as cursor result sets.
CREATE PROCEDURE DANL.PROC1 ( IN P_DEPTNO CHAR(3) CCSID EBCDIC FOR SBCS DATA ) VERSION V1 ISOLATION LEVEL CS RESULT SETS 1 LANGUAGE SQL QUALIFIER DSN81210 P1: BEGIN -- Declare cursor DECLARE cursor1 CURSOR WITH RETURN FOR SELECT EMPNO, LASTNAME FROM DSN81210.EMP WHERE EMPNO = P_EMPNO; -- Cursor left open for client application OPEN cursor1; END P1
There can also be multiple SQL statements embedded within the stored procedure and logic including such commands as:
- FOR loop
- SIGNAL error
- WHILE loop
- And more…
You can do almost everything within SQL and SQLPL. I have even personally rewritten assembler programs doing bit manipulation as SQL and SQLPL!
Stored Procedures for Performance
There are several reasons to use stored procedures and one of them is performance. I’m only going to talk about performance, so I’ll only address that aspect of stored procedures. A stored procedure is a performance advantage when it incorporates several SQL statements and associated business logic as an alternative to multiple SQL statements issued remotely across a network. This illustration shows a remote transaction that could benefit from being implemented as a stored procedure.
In the above example a remote application is issuing multiple dynamic SQL statements across a network while executing a business transaction. The following example shows the same transaction encapsulated in one stored procedure call.
So, in what was formerly three dynamic SQL calls across a network to the Db2 data server is now one remote call and three local static calls. This could potentially represent a significant elapsed and CPU savings. Perhaps not noticeable to the end user, but to network administrators, DBAs, and system programs who monitor performance, and need to justify upgrades, it can be significant.
Let’s visualize a potential savings by utilizing a stored procedure in a fictional transaction where we are closing the Manufacturing System Department in our sample database business, Spiffy Computer Service. We want to consolidate all the employees and projects from the Manufacturing System Department into the Administration Systems Department. The transaction would involve moving the employees and the project associated with the Manufacturing System Department into the Administration Department, and then deleting the Manufacturing System Department (probably more involved than that, but just trying to make a point). A transaction from a remote application would have to issue the following three statements over a network to accomplish the task.
UPDATE DSN81210.EMP SET WORKDEPT = 'D21' WHERE WORKDEPRT = 'D11'; UPDATE DSN81210.PROJ SET DEPTNO = 'D21' WHERE DEPTNO = 'D11'; DELETE FROM DSN81210.DEPT WHERE DEPTNO = 'D11';
A stored procedure could be constructed to cut the network overhead by two thirds.
CREATE PROCEDURE DanL.transferkillDEPT ( IN P_transDEPTNO CHAR(3) CCSID EBCDIC FOR SBCS DATA, IN P_killDEPTNO CHAR(3) CCSID EBCDIC FOR SBCS DATA ) VERSION V1 ISOLATION LEVEL CS RESULT SETS 1 LANGUAGE SQL QUALIFIER DSN81210 P1: BEGIN UPDATE EMP SET WORKDEPT = P_transDEPTNO WHERE WORKDEPT = P_killDEPTNO; UPDATE PROJ SET DEPTNO = P_transDEPTNO WHERE DEPTNO = P_killDEPTNO; DELETE FROM DEPT WHERE DEPTNO = P_killDEPTNO; END P1
So, then the application would only have to issue the following statement, cutting the network traffic by two thirds.
CALL DanL.transferkillDEPT('D21', 'D11');
I am not saying that this is a perfect solution, but for a high-volume remote application with a high network latency this very well could be the ticket to better performance. One, and one more thing, think about remote batch processes and the ability to send an array into a stored procedure! I am currently exploring this as an option to improve remote batch performance where instead of processing one piece of information at a time, and array of several thousand keys are sent to a stored procedure to batch up processing!
Coming in the Next Blog
In the next blog article, I’ll continue the discussion on application performance with some advanced SQL features that will basically process multiple pieces of information in single SQL calls, keeping the focus on reducing calls to the database.