Database

Databases

Articles

DELETE Vs TRUNCATE

As per my experience in MySQL as of today (11March2009 Wed 10 PM IST), the things go as below.

  • Truncate does behave exactly as that DELETE with no where clause
DELETE FROM <tableName> == TRUNCATE TABLE <tableName>
  • However, the Truncate resets the auto_increment keys if any in the table which the delete does NOT take care of. There are many other such differences with respect to "CASCADE DELETE" etc., Look at the MySQL Documentation for more and clear info!

PK Vs Unique

Difference between Primary Key Vs Unique constraint

  • Both PRIMARY KEY constraint and UNIQUE constraint uses to enforce Entity integrity (defines a row as a unique entity for a particular table), but primary keys do not allow null values. [Reference: ]
  • One difference is that a unique key constraint can be created over columns with null values, while a primary key constraint does not allow null values. The other difference is that there can only be one primary key constraint defined on a table, but there can be multiple unique key constraints defined on a table.
  • afdasasf

DB2 Specific

Oracle Specific

  • Oracle specific stuff are discussed here

SQL Best Practices

Some of the SQL Best practices — sent by Sreejith Menon (in LMRA Bahrain), through email.

Pretest all embedded SQL: Before embedding SQL in an application program, you should test it using SPUFI. This reduces the amount of program testing by ensuring that all SQL code is syntactically correct and efficient before it is placed in an application program.

Never use SELECT *: As a general rule, a query should never ask DB2 for anything more than is required to satisfy the desired task. For maximum flexibility and efficiency, each query should access only the columns needed for the function that will be performed.

Singleton SELECT versus the cursor: To return a single row, an application program can use a cursor or a singleton SELECT. A cursor requires an OPEN, FETCH, and CLOSE to retrieve one row, whereas a singleton SELECT requires only SELECT… INTO. Usually, the singleton SELECT outperforms the cursor.

When the selected row must be updated after it is retrieved, however, using a cursor with the FOR UPDATE OF clause is recommended over a singleton SELECT. The FOR UPDATE OF clause ensures the integrity of the data in the row because it causes DB2 to hold an exclusive lock on the page containing the row to be updated. The singleton select provides no such locking.

Use FOR FETCH ONLY: When a SELECT statement will be used only for retrieval, use the FOR FETCH ONLY clause.

Avoid using DISTINCT: The DISTINCT verb removes duplicate rows form an answer set. If duplicates will not cause a problem, do not code distinct, because it adds overhead by invoking a sort to remove the duplicates.

Limit the data selected: Return the minimum number of columns and rows needed by your application program by making efficient use of the WHERE (SQL predicate) clause. It is almost always more efficient to allow DB2 to use the WHERE clause to limit the data returned.

Code predicates on indexed columns: DB2 usually performs more efficiently when it can satisfy a request using an existing index rather than no index. Design all SQL statements to take advantage of indexes.

Multicolumn indexes: If a table has only multicolumn indexes, try to specify the high-level column in the WHERE clause of your query. This results in an index scan with at least one matching column.

Use ORDER BY only when sequence is important: Code the ORDER BY clause when the sequence of rows being returned is important. Order only those columns that are absolutely necessary in order to improve efficiency.

Use equivalent data types: Use the same data types and lengths when comparing column values to host variables or literals. This eliminates the need for data conversion and allows for the use of an index. For example, comparing a column defined as CHAR(6) to a field which is CHAR(4) or CHAR(7) will cause data conversion and should be avoided at all costs. The easiest way to ensure datatype consistency is to use the DCLGEN fields whenever possible.

Use BETWEEN rather than <= and >= : BETWEEN allows the optimizer to select a more efficient access path.

Use IN instead of LIKE: If you know that only a certain number of occurrences exist, using IN with the specific list is more efficient than using LIKE. The functionality of LIKE can be imitated using a range of values. For example, if you want to retrieve all employees with a last name starting with "M," use BETWEEN 'maaaaaaaaaaaaaaa' and 'mzzzzzzzzzzzzzzz' instead of LIKE 'm%'

Avoid using NOT (except with EXISTS): Not should only be used as an alternative to very complex predicates.

Code the most restrictive predicate first: When you code predicates in your select statement, place the predicate that will eliminate the greatest number of rows first.

Do not use arithmetic expressions in a predicate: DB2 will not use an index for a column when the column is in a predicate that includes arithmetic. Perform calculations before the SQL statement, then use the result in the query.

Use Date and Time arithmetic with care: DB2 provides the capability to add and subtract DATE, TIME, and TIMESTAMP columns. Keep in mind the following rules:

-both operands must be of the same data type (ie, both dates or both times)

Example: CURRENT DATE - 1 MONTH or CURRENT DATE + 1 DAY

-if one operand is a timestamp, the other operand can be a time or a date, but not a timestamp.

Example: TIMESTAMP + 1 DAY or TIMESTAMP - 24 HOURS

-date durations are expressed as DECIMAL(8,0). The valid date durations are:

DAY, MONTH, and YEAR and their plural equivalents.

-time durations are expressed as DECIMAL(6,0). The valid time durations are:

HOUR, MINUTE, SECOND, and MICROSECOND and their plural equivalents.

Specify the number of rows to be returned: When you are coding a cursor to fetch a predictable number of rows, consider specifying the number of rows to be retrieved in the OPTIMIZE FOR n ROWS clause of the CURSOR. This gives DB2 the opportunity to select the optimal access path for the statement based on actual use.

Try to sort only on indexed columns: When using ORDER BY, GROUP BY, DISTINCT, and UNION, it is best to use only indexed columns.

Joins instead of subqueries: Joins will give the DB2 optimizer more options for data access than a subquery.

Minimize number of tables used in a join: As a general rule of thumb, avoid using more than 3 tables in a join.

Explicitly specify all column names in any SQL statement.

Avoid using the substring function in an SQL predicate.

II. Application Development Standards & Guidelines

Code modular DB2 programs: You should design DB2 programs to be modular. One program should accomplish a single, well-defined task. If multiple tasks need to be executed, structure the programs so that tasks can be strung together by having the programs call one another. This is preferable to a single, large program that accomplishes many tasks for two reasons. One, single tasks in separate programs make the programs easier to understand and maintain. Two, if each task can be executed either alone or with other tasks, isolating the tasks in a program enables easier execution of any single task or list of tasks.

Avoid host structures: Avoid SELECTing or FETCHing INTO a group-level host variable structure.

Explicitly code literals: When possible, code literals explicitly in the SQL statement rather than moving the literals to host variables and then processing the SQL statement using the host variables.

Do not use dynamic SQL in application programs.

DDL and DCL in application programs: Do not use data definition (DDL) and data control (DCL) SQL statements in application programs.

CICS/DB2 programs coded in Cobol II should use command-level CICS rather than macro.

Use of DCLGEN: Create the table layout ('include' member) by using the DCLGEN facility (option 2) in DB2I. The DCLGEN includes a DECLARE statement and a cobol layout for the table.

Edit the DCLGEN. Remove the qualifier from the table name (eg. testdba or proddba). The qualifier will be specified when the plan is bound.

Optionally, the cobol field names can be prefixed with the DCLGEN name. This might assist the application programmer when dealing with the same field from different tables. (EX: account-nbr becomes caacctt-account-nbr in the caacctt DCLGEN member and glglent-account-nbr in the glglent DCLGEN member)

Move the member from the default library to DS.TEST.PANVALET and use the standard '++INCLUDE' to bring the definition into your program. (Do not use the 'EXEC SQL INCLUDE' statement as seen in some sample programs.)

Do not use qualifiers in application programmers: After executing the DCLGEN for a table or view, modify the table/view names by removing the qualifier (usually 'TESTDBA'). Use only unqualified table and view names in an application program. This will facilitate an easier migration of programs from test to production since the bind will provide the appropriate qualifier (see section E.2).

Use SQL built-in functions: It is more efficient to use the SQL built-in functions (AVG, COUNT, MAX, MIN, SUM) than the application program perform these functions (before using these, be sure to zero the variable you select into and check for SQLCODE 0, -305, or other).

Avoid SQL scalar functions: Avoid the use of the SQL Scalar functions used for data type conversions, character string manipulation, and date/time conversions (i.e. INTEGER, DECIMAL, HEX, SUBST, etc.).

Use CS instead of RR isolation level: Do not specify repeatable read {isolation (RR)} RR will force DB2 to lock every page accessed and to hold the lock until a COMMIT is issued. Specify cursor stability {isolation (CS)} where possible.

Use of COMP and COMP-3 fields: When updating or inserting COMP or COMP-3 defined columns, use the DCLGEN fields for the values. Using your own defined working storage variables is unpredictable.

DB2 date, time, and timestamp: Use current date, time, or timestamp whenever possible instead of reformatting an operating system date into DB2 format.

III. Batch Programming Standards & Guidelines

Use LOCK TABLE with caution: As a general rule, use the LOCK TABLE only after discussing its implications with your DBA staff. LOCK TABLE can significantly decrease an applications processing time by eliminating page locks, but will also make the tablespace unavailable to any other access. It locks ALL tables in a Tablespace. The locks are held until a COMMIT or DEALLOCATE, and all users are locked out of the tablespace.

Periodically COMMIT WORK in batch update programs: Any batch program that issues more than 500 updates is a candidate for COMMIT processing. COMMIT does not flush data from the DB2 bufferpool and physically apply the data to the table. It will, however, ensure that all modifications have been physically applied to the DB2 log, thereby ensuring data integrity and recoverability.

COMMIT should be issued at the end of each logical unit of work (Commit early and commit often). This will free all tablespace and page locks caused by the program, but will also lose cursor positioning unless the cursor has been declared using the WITH HOLD option.

Make programs restartable: In time-critical applications, DB2 batch programs that modify table data should be restartable in case there is a system error.

Hold cursor rather than reposition: If COMMITs are coded in a program that updates data using cursors, use the WITH HOLD option with the cursor.

About Concurrency: Typically, batch programs are executed in the "batch window" which is a given time frame where online applications will not be accessing the tables being updated. The LOCK TABLE statement would frequently be used in the batch window since it provides the least amount of DB2 locking overhead. Care must be taken as indicated above that online programs will not be attempting to utilize this table space, or that concurrent batch activity is not being processed.

DB2 provides the ability to execute batch table updates in concurrence with CICS or other batch jobs. However, there is a performance cost for such an activity. Either the job is quickly run or frequent COMMIT commands would be needed (see following section).

NOTE: This concurrency feature is quite attractive, and makes batch a better alternative to background CICS processing in most cases.

About COMMITting: It is important to issue COMMIT commands in batch programs for two reasons; it will improve recovery time by reducing the amount of work that has to backed out and redone in the case of an abnormal termination, and it will free locks used by the batch job allowing more concurrency.

A checkpoint/restart routine must be established for batch programs which will store all relevant information needed to restart the program in a separate DB2 table followed by a COMMIT call. When the program runs or reruns, an initialization routine will read the DB2 restart table to obtain a starting point. It will be your responsibility to recover any non-DB2 resources that are updated by the program.

The frequency at which checkpoints should be taken will vary from application to application. There are a number of considerations for selecting the commit frequency. If the batch program is accessing tables that are also being accessed by online transactions, you should checkpoint very frequently. If the data accessed by the batch program is not used concurrently by online transactions, the checkpoint interval should be such that the restart time is acceptable.

Checkpoint/Restart: Once a COMMIT is issued in the batch program, the program must be written to support a restart at the point of the COMMIT. This would include repositioning the cursor, repositioning in an input transaction file, and handling the output files, including reports, so that there will not be any lost data.

It is generally advisable not to write to a non-DB2 file at the same time you are updating a DB2 database. If an output file must be written DB2 table(s) could be used to contain pertinent information for a later process. If the output file is too large for DB2 then the program must insure that all data information is written to the file at the time of the COMMIT (externalize the output records). This can be accomplished through the MVS checkpoint/restart facility, or by closing and then reopening the files (see Section H for details and examples of checkpoint/restart).

Synonyms

Synonyms in Oracle

Synonyms is nothing but an alias name for the tables when used across schemas.

When 'TableA' of 'SchemaA' is being referred/accessed in 'SchemaB', it is natural that it has to be referred
with the prefix of the appropriate schema name as 'SchemaA.TableA'. As it is a kinda overhead and we want to
avoid that, we use synonyms in such a way that we can straight away use, "select * from SynonymOfTableA".

Norms of Synonyms

1. First of all, you should grant the rights (on SELECT, INSERT, DELETE) etc from Source Schema to Target Schema
Here, you have to issue a SELECT grant (for example, just to assign rights on fetching/retrieving) as
"GRANT SELECT on TableA to SchemaB" [Should be done on the SchemaA]

2. Create a Synonym in the Target Schema where you want to use/refer/access the table.
Here, you have to create a schema in SchemaB as
"CREATE SYNONYM D_TableA FOR SchemaA.TableA" (Have to specify the source schema 'SchemaA').

[Convention: Can be of anything. Just to indicate that it is a synonym, people have a specific prefix.
Here, "D_" is being used in JPMC MMPortal Service Center].
We do use MMPORTAL and SHARK as two schemas.

3. You can verify the new schema being created in SchemaB with the following command,
"SELECT * FROM TAB where TABTYPE='SYNONYM'" — displays all the snyonyms of SchemaB

4. Now, proceed working with TableA in SchemaB by using the follwing command as if it TableA is defined in SchemaB
"SELECT * FROM D_TableA"

Alternatively, you can just verify the traditional approach as follows:
"SELECT * FROM SchemaA.TableA"

— M Raghavan alias Saravanan
18 August 2009 Tuesday 8 42 PM IST

Unless otherwise stated, the content of this page is licensed under Creative Commons Attribution-Share Alike 2.5 License.