PL/pgSQL (en)

Z PostgreSQL
Skočit na navigaci Skočit na vyhledávání
Verze k tisku již není podporovaná a může obsahovat chyby s vykreslováním. Aktualizujte si prosím záložky ve svém prohlížeči a použijte prosím zabudovanou funkci prohlížeče pro tisknutí.

First three parts translated by Martin Řehák, parts 4-6 translated by Karolína Lehotská, parts 7-10 translated by Lucie Prunarová, parts 11-14 translated by Zdeněk Růžička

The PostgreSQL RDBMS, like any true RDBMS, allows users to create and use stored procedures. A stored procedure is database logic code, saved on the SQL server, and executed later. In PostgreSQL, stored procedures can be created using any of these languages: SQL, Perl, Python, TCL, and PL/pgSQL. PL/pgSQL is the most frequently used language for writing stored procedures. PL/pgSQL is a simple ADA-based programming language with SQL support. This language was significantly inspired by Oracle's PL/SQL language. It is simple, strong, but it cannot be used for creating custom PostgreSQL's data types.

This document is designed for PostgreSQL 8.1 or higher

Google AdSense

When PL/pgSQL is not applicable

There are some limitations on when stored procedures can be used. Code in stored procedures must be absolutely non-interactive. Since most applications need some level of interaction, user experiences suggest it is not a good design to try to write an application entirely through stored procedures. The SQL server cannot completely replace the application server; so it is up to the designer to compare the advantages and disadvantages of coding in the database layer versus the application layer. Some limits are included in PL/pgSQL too. This language is strictly static, this language isn't designed for heavy mathematic, string or array operations (the best using of PL/pgSQL is like glue for SQL statements).

PL/pgSQL is not useful in initializations of bigger arrays, because any array update can be slow. If there’s no other way, it’s much faster to create fields of a array by conversion table (by SRF function) into the array. For example the fastest way to populate a field with incrementing values is SELECT ARRAY(SELECT * FROM generate_series(1,100));. The function generate_series is an SRF function generating an incrementing set of values over the entered interval. Same principes are true for strings too. When you’re not satisfied with the speed of a function, try to use the following trick by converting a SRF function – replace the part of string concation by RETURN NEXT command. The final concation is situated out of body, with use of array_to_string command.

CREATE OR REPLACE FUNCTION generate_string(integer) RETURNS SETOF varchar AS $$
BEGIN
  FOR _i IN 1 .. $1 LOOP 
    RETURN NEXT '<item>'||_i||'</item>';
  END LOOP;
  RETURN;
END; $$ LANGUAGE plpgsql;

SELECT array_to_string(
  ARRAY(SELECT * FROM generate_string(1000)), '');

There‘s no support for I/O operations in present and it’s not expected in the near future - PL/pgSQL exists in trusted version only. The solution is to use an untrusted language, like pl/perl or pl/python. PL/pgSQL is not useful to create general triggers. In PL/pgSQL it’s not possible to convert values of type RECORD into fields [a RECORD item is accessible only through a static identifier, known at compilation time; a field could be accessed due to a dynamic index.] So there is no choice of dynamic iteration through all RECORD type items, or of dynamic access to RECORD type. Again, a solution is to use another language; this time a trusted variant could also work.

Inappropriate use of the PL/pgSQL language

PL/pgSQL is a very simple language and mainly very simple interpreted commands. It hasn't own "arithmetic" unit. All commands are translated to a series of SQL statements. Usually "simple SQL" expressions are used - PostgreSQL can evaluate it much faster than full SQL statement, but still slower than truly compiled integrated functionality. So - importantly - think first and use integrated functionality everywhere when it is possible. The following code is an example of bad programming style (I found this code on net):

create or replace function check_phone_number (text)
returns boolean as $$
declare
  _str alias for $1;
  _char text;
  i    int4;
begin
  i=1;
  while true loop
    if length (_str) != 16 then exit; end if;
    _char := substring(_str from i for 1);
    if _char != '+' then exit; end if;
    i:=i+1; _char := substring(_str from i for 1);
    if _char < '0' or _char >'9' then exit; end if;
    i:=i+1; _char := substring(_str from i for 1);
    if _char < '0' or _char >'9' then exit; end if;
    i:=i+1; _char := substring(_str from i for 1);
    if _char < '0' or _char >'9' then exit; end if;
    i:=i+1; _char := substring(_str from i for 1);
    if _char != ' ' then exit; end if;
    i:=i+1; _char := substring(_str from i for 1);
    if _char < '0' or _char >'9' then exit; end if;
    i:=i+1; _char := substring(_str from i for 1);
    if _char < '0' or _char >'9' then exit; end if;
    i:=i+1; _char := substring(_str from i for 1);
    if _char < '0' or _char >'9' then exit; end if;
    i:=i+1; _char := substring(_str from i for 1);
    if _char != ' ' then exit; end if;
    i:=i+1; _char := substring(_str from i for 1);
    if _char < '0' or _char >'9' then exit; end if;
    i:=i+1; _char := substring(_str from i for 1);
    if _char < '0' or _char >'9' then exit; end if;
    i:=i+1; _char := substring(_str from i for 1);
    if _char < '0' or _char >'9' then exit; end if;
    i:=i+1; _char := substring(_str from i for 1);
    if _char != ' ' then exit; end if;
    i:=i+1; _char := substring(_str from i for 1);
    if _char < '0' or _char >'9' then exit; end if;
    i:=i+1; _char := substring(_str from i for 1);
    if _char < '0' or _char >'9' then exit; end if;
    i:=i+1; _char := substring(_str from i for 1);
    if _char < '0' or _char >'9' then exit; end if;
    return true;
  end loop;
  raise exception e'wrong char \'%\' on position %. Use format +XXX XXX XXX XXX', _char, i;
  return false; -- << death code
end;
$$ language plpgsql; 

This code is one big mistake. Long, with repeated code - this code will be really slow in PL/pgSQL. When we need similar functionality, we have to use a regular expressions:

CREATE OR REPLACE FUNCTION check_phone_number(text)
RETURNS boolean AS $$
BEGIN
  IF NOT $1 ~  e'^\\+\\d{3}\\ \\d{3} \\d{3} \\d{3}$' THEN
    RAISE EXCEPTION 'Wrong formated string "%". Expected format is +999 999 999 999';
  END IF;
  RETURN true; 
END;
$$ LANGUAGE plpgsql STRICT IMMUTABLE;

Still this code isn't good. It returns true or raises exceptions. I prefere little bit different design (less restrictive with availability to normalise string:

CREATE OR REPLACE FUNCTION normalise_phone_number(text)
RETURNS text AS $$
DECLARE aux text := translate($1, ' ','');
BEGIN
  IF aux ~ e'^\\+\\d{12}$' THEN
    RETURN substring(aux FROM 2);
  ELSEIF aux ~ e'^\\d{9}$' THEN
    RETURN aux;
  ELSE
    RAISE EXCEPTION 'Unexpected format of phone number "%".', $1;
  END IF;
END;
$$ LANGUAGE plpgsql STRICT IMMUTABLE;

Using:

postgres=# select normalise_phone_number('+420 222 222 222');
 normalise_phone_number 
------------------------
 420222222222
(1 row)

postgres=# select normalise_phone_number('+420 222 222222');
 normalise_phone_number 
------------------------
 420222222222
(1 row)

postgres=# select normalise_phone_number('+420 22 222222');
ERROR:  Unexpected format of phone number "+420 22 222222".

postgres=# select normalise_phone_number('724 191 000');
 normalise_phone_number 
------------------------
 724191000
(1 row)

Don't use PL/pgSQL for deep recursion calls

Due internal design principles PL/pgSQL should not be well optimized for recursion calls. PL/pgSQL supports recursion, and for not too deeps calls can provide enough to satisfy sb performance, but it is very slow for deep recursion. Nice example is taken from presentation http://plv8-talk.herokuapp.com. It is perfect example how don't use PL/pgSQL ever.

-- non recursion form
CREATE OR REPLACE FUNCTION public.psqlfibnr(n integer)
 RETURNS integer
 LANGUAGE plpgsql
 IMMUTABLE STRICT
AS $function$
DECLARE 
  prev1 int = 0;
  prev2 int = 1;
  result int = 0;
BEGIN
  FOR i IN 1..n
  LOOP
    result := prev1 + prev2;
    prev2 := prev1;
    prev1 := result;
  END LOOP;
  RETURN result;
END;
$function$

-- recursion form
CREATE OR REPLACE FUNCTION public.psqlfibr(n integer)
 RETURNS integer
 LANGUAGE plpgsql
 IMMUTABLE STRICT
AS $function$
BEGIN
  IF n < 2 THEN
    RETURN n;
  END IF;
  RETURN psqlfib(n-1) + psqlfib(n-2);
END;
$function$

-- non recursive calls
postgres=# select n, psqlfibnr(n)
              from generate_series(0,35,5) as n;
 n  | psqlfibnr 
----+-----------
  0 |         0
  5 |         5
 10 |        55
 15 |       610
 20 |      6765
 25 |     75025
 30 |    832040
 35 |   9227465
(8 rows)

Time: 1.178 ms

-- recursive calls
postgres=# select n, psqlfib(n)                                                                                                                                                                                                               
              from generate_series(0,35,5) as n;                                                                                                                                                                                                 
 n  | psqlfib                                                                                                                                                                                                                                 
----+---------                                                                                                                                                                                                                                
  0 |       0                                                                                                                                                                                                                                 
  5 |       5                                                                                                                                                                                                                                 
 10 |      55                                                                                                                                                                                                                                 
 15 |     610                                                                                                                                                                                                                                 
 20 |    6765                                                                                                                                                                                                                                 
 25 |   75025                                                                                                                                                                                                                                 
 30 |  832040                                                                                                                                                                                                                                 
 35 | 9227465                                                                                                                                                                                                                                 
(8 rows)                                                                                                                                                                                                                                      
                                                                                                                                                                                                                                              
Time: 282992.820 ms

Speed of recursion calls in PL/pgSQL is not comparatable with recursion optimized languages like Javascript.

When SQL procedures will be faster than PL/pgSQL procedures?

Answer is simple. SQL procedures will be faster everytime when we can use a integrated functionality. I use a bubble sort as example - but it can be quick sort too. You can implement it in PL/pgSQL without any problem, but your code will be always slower (sometimes significantly slower) than simple SQL function that uses integrated quicksort:

CREATE OR REPLACE FUNCTION sort(anyarray)
RETURNS anyarray AS $$
SELECT array(SELECT * FROM unnest($1) ORDER BY 1);
$$ LANGUAGE sql;

Introduction into PL/pgSQL language

PL/pgSQL does not create new types or its own functions. It shares both with the database system generally. PL/pgSQL functions can contains almost all of SQL statements. It has a FOR statement for iteration over SELECT's result. We can use a dynamic SQL - statement EXECUTE. Authors were surely inspired by the PL/SQL language, which is the native programming language for Oracle RDBMS, so it’s not difficult to convert stored procedures from Oracle to PostgreSQL or back. It is possible to use Pl/pgSQL in PostgreSQL for implementation of own aggregates or normal functions, operators, and for the implementation trigger handlers. Each new version of PostgreSQL improves the possibilities of stored procedures, so CREATE FUNCTION statement are enhanced in any PostgreSQL version too. CREATE FUNCTION statement is used to define new custom functions:

CREATE [OR REPLACE] FUNCTION name ([IN|OUT|INOUT] name type [,...]) 
  RETURNS suggest_type AS $$
    {body of function}
  $$ LANGUAGE plpgsql
  [IMMUTABLE | STABLE | VOLATILE]
  [CALLED ON NULL INPUT | RETURNS NULL ON NULL INPUT | STRICT]
  [EXTERNAL SECURITY INVOKER | EXTERNAL SECURITY DEFINER]

I recommend writing the function code in your favorite plain-text text editor. Save the code into a file; then, in psql, use

\i name_of_file

to run the command(s) saved in the file. I definitely advise against writing functions directly in psql or pgAdmin. To use PL/pgSQL it is necessary to enable the language for the database. The command

createlang –L your_db

should create a table, including a line reading

plpgsql | t

If the table doesn’t include this line, PL/pgSQL must be enabled for this database using the following command (which requires Postgres super-user rights):

createlang plpgsql your_db

When PL/pgSQL is enabled, test it by writing a simple function:

CREATE OR REPLACE FUNCTION Sum(a int, b int) 
RETURNS int AS $$
BEGIN
  /*
   *  My first trivial PL/pgSQL function.
   */
  RETURN a + b;
END;
$$ LANGUAGE plpgsql IMMUTABLE STRICT;

If you rewrite the function without bugs, you can test it with SELECT Sum(10,10). In the column sum (the column name is the same as the function's name) the value 20 should appear. In Pl/pgSQL, comments are written the same as in SQL: single-line comments begin with a double dash; multi-line comments have C notation, like /* comment */. Execution of any PL/pgSQL function must be finished by the RETURN statement.

The function has two parts: variable declaration (before the variable's first use) and body. The body is a list (block) of commands separated by semicolons; this block lies between the key words BEGIN and END. The RETURN command ends execution of the function and returns the result. Older versions of PostgreSQL don’t support giving labels to the function's arguments--only positional notation was supported. As a result, you may see this as a $-sign the number of the parameter in sequence (so the first parameter would be $1, the second parameter would be $2, and so on). The default state of function parameters is IN. All of parameters are constants within the function. Blocks are not used for range limitation of control structures, such as IF, WHILE, FOR, LOOP (in contrast to Pascal); they serve only to limit the existence of local variables. It is possible to overload the function; we can have several definitions of a function, which vary in the number and type of arguments. All identifiers, like SQL key words, are case-insensitive. In case of complication, all strings (except those quoted within double apostrophes) are translated to lower case. The assignment operator is := in PL/pgSQL (as in Pascal). It is also possible to assign the result of a SQL query using the SELECT INTO construction:

SELECT INTO target aspect FROM ...;

The result may be a variable of type RECORD, a row, or a list of variables. During administration, check if the result corresponds to what is expected (in number and types). If the result contains multiple lines, the first line's values are used. Variables of type RECORD are never set to NULL. Beyond the test value NULL, which mustn’t be unique always, we can test the value of the built-in variable FOUND, which will be TRUE if the query returns at least one line. It is also possible to save the number of processed lines into a variable by using the GET command after each SQL command.

GET DIAGNOSTICS prom = ROW_COUNT;

Testing FOUND only works after a SELECT INTO. The next two example functions return the number of lines in the "name" table. The PERFORM instruction in the first example is used to execute a SQL command or function when no further processing of the data is done.

CREATE OR REPLACE FUNCTION rows1() 
RETURNS int AS $$
DECLARE r int;
BEGIN
  PERFORM * FROM names;
  GET DIAGNOSTICS r = ROW_COUNT;
  RETURN r;
END;
$$ LANGUAGE plpgsql;

CREATE OR REPLACE FUNCTION rows2() 
RETURNS int AS $$
DECLARE r int;
BEGIN
  SELECT INTO r count(*) FROM names;
  RETURN r;
END;
$$ LANGUAGE plpgsql;

After the DECLARE command, we can declare a list of several variables and their types, separated by semicolon. A variable can be set to an initial value (which is re-calculated during each entry into block), or to NULL, or also indicated as a constant. If these rules are broken, the PL/pgSQL environment will stop execution of the procedure. All variables defined as NOT NULL must have set up a DEFAULT value.

DECLARE name[CONSTANT] type [NOT NULL][DEFAULT|(:=)constant|parameter of function]

Aside from ordinary types, we can also use the RECORD type, which can include a line of random tables or so-called derived types. It is based upon either a line of concrete tables (name _tab%ROWTYPE), columns name_tab.nm_column%TYPE) or previously declared variables (variable%TYPE). You can access single elements within a RECORD or a row using dot notation.

PL/pgSQL code can contain any SQL commands. For instance, a function returning the name of the month on basic his index. PL/pgSQL developers should beware of giving PL/pgSQL variables the same names as database objects – the effects of this namespace collision may be hard-to-find errors. One solution is to use prefixes on PL/pgSQL variables. For instance, PL/pgSQL variables could be prefixed with the underscore symbol "_". There is an additional trap waiting for users of PL/SQL--while Oracle resolves collisions between column names and local variables in favor of the column, PostgreSQL does the opposite. So, in Oracle, you can resolve namespace collisions by specifying the definition of the local variable. In PostgreSQL, thanks to the higher priority of local variables, this will not resolve the conflict.

CREATE OR REPLACE FUNCTION MonthName1 (_month INTEGER) 
RETURNS VARCHAR(10) AS $$
DECLARE result varchar;
BEGIN
  IF _month <1 OR _month > 12 THEN
     RAISE EXCEPTION E'Parameter is outside acceptable limits!\n';
  END IF;
  SELECT INTO result 
        CASE _month
             WHEN  1 THEN 'January'
             WHEN  2 THEN 'February'
             WHEN  3 THEN 'March'
             WHEN  4 THEN 'April'
             WHEN  5 THEN 'May'
             WHEN  6 THEN 'June'
             WHEN  7 THEN 'July'
             WHEN  8 THEN 'August'
             WHEN  9 THEN 'September'
             WHEN 10 THEN 'October'
             WHEN 11 THEN 'November'
             WHEN 12 THEN 'December' END;
  RETURN result;
END;
$$ LANGUAGE plpgsql IMMUTABLE STRICT;

You can see used SELECT INTO SQL statement. You can see IF THEN statement. The construction IF...THEN...END IF is clear. The first argument of the RAISE command is a level of exception. We have three possibilities at our disposal: DEBUG – signs on block, NOTICE – announces the error to the user, EXCEPTION – stops the function's execution. The second argument to RAISE is the text of the error report (we write this in double apostrophes). If the symbol % appears in the text, this symbol is replaced by the value of a variable, which comes as the third, fourth, etc. argument to RAISE. RAISE NOTICE is usually used to display tuning reports or the values of variables. Basically it is only one tuning instrument which at our disposal.

Note that the CASE command is parameterised: it includes a variable. It is possible to use parameters in all SQL commands, everywhere, where some value is expected. It is not possible to parameterise names of columns and tables. It is possible to work around this limitation (at the cost of decreased performance and increased effort) using dynamic queries (see the below example with EXECUTE).

Interpret PL/pgSQL uses cached plans for SQL statements. PostgreSQL 8.2 and older doesn't allow refresh these cached plans, when these plans are invalid (by example - after drop temporary table). There was two solutions - a) using a dynamic sql (without execution plan cache), b) newer to drop temporary tables.

In the following example we will create a table and fill it with values of the sine function in a set range. We use the WHILE construction and the parameterised SQL command INSERT. The table tabsin is created only if it doesn‘t yet exist.

CREATE OR REPLACE FUNCTION tabsin(_p1 float, _p2 float, _p3 float) 
RETURNS void AS $$
DECLARE _i float = _p1; _do float = _p2; _step float = _p3;
BEGIN
  IF NOT EXISTS(SELECT relname FROM pg_class
    WHERE relname = 'tabsin' AND relkind = 'r' AND pg_table_is_visible(oid)) THEN
    RAISE NOTICE 'Create table tabsin';
    CREATE TEMP TABLE tabsin (x NUMERIC(5,4) PRIMARY KEY, fx NUMERIC(5,4));
  ELSE 
    RAISE NOTICE 'Deleting all records from table tabsin';
    TRUNCATE TABLE tabsin;
  END IF;
  WHILE _i < _do LOOP
    INSERT INTO tabsin VALUES(CAST(_i AS NUMERIC(5,4)), SIN(_i));
    _i := _i + _step;
  END LOOP;
END;
$$ LANGUAGE plpgsql;

The command tabsin creates table for an interval from -1 to 1 with step 0.0001. The execution time is very quick. Inserting twenty thousand rows takes two seconds on my 1.6 G notebook. We can check the values with a SELECT command:

SELECT * FROM tabsin WHERE x=0.1234;

You can leave a LOOP ..END LOOP construction using the EXIT command, with a user-defined label (which identifies the loop which should be stopped) and a user-defined condition. The label should be placed at the end of the loop.

<<main>>
        LOOP
          EXIT main WHEN c > 10;
          RAISE NOTICE ''%'', c;
          c := c + 1;
        END LOOP main;

Custom PL/pgSQL functions concept

The most important thing on PL/pgSQL functions (let us say by all UDF functions) is a fact, that we can use in queries – the whole proces of execution dates is simplificated by it. We can processed dates right on the server.

One of the functions missing in PostgreSQL is establishing Easter. It is used for establishing bank holidays in planning functions. I used Carter's algorithm for mark out the Easter Sunday.

CREATE OR REPLACE FUNCTION eastrsunday(_year INTEGER) 
RETURNS DATE AS $$
DECLARE B INTEGER; D INTEGER; E INTEGER; Q INTEGER;
  DD INTEGER; MM INTEGER;
BEGIN
  IF _year < 1900 OR _year > 2099 THEN
    RAISE EXCEPTION 'Out of range';
  END IF;
  B := 255 - 11 * ($1 % 19); D := ((B - 21) % 30) + 21;
  IF D > 38 THEN D := D - 1; END IF;
  E := ($1 + $1/4 + D + 1) % 7; Q := D + 7 - E;
  IF Q < 32 THEN DD:=Q; MM := 3; ELSE DD := Q - 31; MM := 4; END IF;
  RETURN to_date(to_char(DD, '00') 
            || to_char(MM, '00') || to_char(_rok,'0000'), 'DD MM YYYY');
END;
$$ LANGUAGE plpgsql IMMUTABLE STRICT; 

The next event, in which is the UDF function necessary, is the usage of cleaning functions for pairing notations in the table. These functions represent last rescue, when we do selections beyond historical databases, where no one take care of domains integrity. I work with column of values in the example, where data can contains strings like "records","rec","rec.", "rc", etc. My task is replace these variant strings by one string:

CREATE OR REPLACE FUNCTION clean(VARCHAR) 
RETURNS VARCHAR AS $$
DECLARE pom varchar DEFAULT $1;
BEGIN pom := to_ascii(lower(trim(both FROM pom)));
  pom = replace(pom,'rec' ,'records');
  pom = replace(pom,'rec.' ,'records');
  pom = replace(pom,'rc','records');
  /* replace multiple spaces by single space */
  WHILE position('  ' IN pom) <> 0 
  LOOP
    pom := replace(pom, '  ',' ');
  END LOOP;
  RETURN trim(both FROM pom);
END;
$$ LANGUAGE plpgsql IMMUTABLE STRICT;

Usage of PL/pgSQL for administration of database

Suitable designed saved procedures make administration of database measurably easier. I use following function for clearing RDBMS from databases that students create during a term. If the function is started without parameters or if the amount of arguments does not agree as well, it shows help.

CREATE OR REPLACE FUNCTION drop_students_databases() 
RETURNS INTEGER AS $$
DECLARE
  helpstr VARCHAR(300);
BEGIN
  helpstr  := E'Function for dropping databases of users whose\n' ||
    E'name (the name of user) suits the mask of order LIKE.\n\n' ||
    E'e.g. SELECT drop_students_databases(' || quote_literal('group%') || E')\n\n' ||
    E'note you have to be loged in as user postgres.\n' ||
    E'author: Pavel Stěhule, stehule@kix.fsv.cvut.cz 19.1.2002\n';
  RAISE NOTICE '%', helpstr;
  RETURN 0;
END;
$$ LANGUAGE plpgsql;

CREATE OR REPLACE FUNCTION drop_students_databases(_mask varchar(100)) 
RETURNS INTEGER AS $$
DECLARE 
  db RECORD;
  deleted INTEGER := 0;
  helpstr VARCHAR(300);
BEGIN 
  IF length(_mask) = 0 OR _mask ISNULL OR current_user() <> 'postgres' THEN
    helpstr  := E'Function for dropping databases of users whose\n' ||
      E'name (the name of user) suits the mask of order LIKE.\n\n' ||
      E'e.g. SELECT drop_students_databases(' || quote_literal('group%') || ')\n\n' ||
      E'note you have to be loged in as user postgres.\n' ||
      E'author: Pavel Stěhule, stehule@kix.fsv.cvut.cz 19.1.2002\n';
    IF current_user <> 'postgres' THEN
      RAISE EXCEPTION E'Function might be called only by user postgres\n\n%', helpstr;
    ELSE
      RAISE EXCEPTION E'Argument of the function must be a non-empty string\n\n%', helpstr;    
    END IF;
  END IF;
  FOR db IN 
    SELECT datname, usename 
       FROM pg_database, pg_user 
      WHERE datdba=usesysid AND usename LIKE _mask
  LOOP
    deleted := deleted + 1;
    RAISE NOTICE 'Drop database %,%', db.datname, db.usename;
    EXECUTE 'drop database ' || quote_ident(db.datname);
  END LOOP;
  RETURN deleted;
END;
$$ LANGUAGE plpgsql;

Notice the construction of FOR IN LOOP END LOOP which iterates through the result of SQL command (in this case list of databases whose owners suit the mask LIKE). Since names may contain gaps (or some special chars), it is needed to type the database name in quotation marks (by quote_ident). We have to use EXECUTE when we would to use a variable on column or table position.

Using PL/pgSQL language for custom functions

PostgreSQL doesn't support function next_day (coming from Oracle). You can use a PL/pgSQL language for implementation functions that simplify porting of application from someone database to PostgreSQL:

CREATE OR REPLACE FUNCTION next_day(IN d date, IN day varchar) 
RETURNS date AS $$
DECLARE 
  id integer
  dow integer;
BEGIN 
  dow := EXTRACT(dow FROM d);
  id := CASE lower(substring(day FROM 1 FOR 3))
            WHEN 'sun' THEN 0
            WHEN 'mon' THEN 1
            WHEN 'tue' THEN 2
            WHEN 'wed' THEN 3
            WHEN 'thu' THEN 4
            WHEN 'fri' THEN 5
            WHEN 'sat' THEN 6 END;
  IF id IS NULL THEN
    RAISE EXCEPTION 'Wrong identifier for day ' || quote_literal(day);
  END IF;
  RETURN CASE id <= dow
             WHEN true  THEN d + (id - dow + 7)
             WHEN false THEN d + (id - dow) END;
END;
$$ LANGUAGE plpgsql IMMUTABLE STRICT;

Usage of PL/pgSQL in CHECK conditions

We can use functions in PL/pgSQL as a control in CHECK conditions. These functions have to have only one argument of corresponding type and if the value of the argument is NULL, then it must return NULL. This behaviour may be treated programmatic or we may use attribute isstrict which ensure that whenever at least one of the arguments is NULL, then we use the value NULL like the result of function without executing the function. The following example tests the validity of ISBN code (although supplement isbn_issn exists, it will not be installed in most cases). Notice the usage of parameter weight of array type.

CREATE OR REPLACE FUNCTION check_ISBN(ISBN CHAR(12)) 
RETURNS boolean AS $$
DECLARE 
  pos INTEGER; 
  asc INTEGER; 
  sum INTEGER DEFAULT 0;
  weight INTEGER[] DEFAULT '{10,9,8,7,6,5,4,3,2,1}';  -- for ISSN {8,7,6,5,4,3,2,1}
  digits INTEGER DEFAULT 1;
BEGIN 
  FOR pos IN 1..length(ISBN) 
  LOOP
    asc := ascii(substr(ISBN,pos,1));
    IF asc IN (88, 120) THEN -- ISDN might contain control number X
      sum := sum + 10;
      digits := digits + 1;
    ELSIF asc >= 48 AND asc <= 57 THEN
      sum := sum + (asc - 48)*weight[digits];
      digits := digits + 1;
    END IF;
  END LOOP;
  IF digits <> 11 THEN -- for ISSN <> 9
    RETURN 'f';
  ELSE
    RETURN (sum % 11) = 0;
  END IF;
END;
$$ LANGUAGE plpgsql STRICT IMMUTABLE;

We can use the defined function for CHECK condition, e.g. (column isbn):

CREATE TABLE ZCHU_ReservationBooks (
  id INTEGER PRIMARY KEY DEFAULT nextval('doc_seq'),
  idoo INTEGER NOT NULL,  -- subject of protection id
  taboo CHAR(1) NOT NULL CHECK(taboo IN ('m','v','s')),
  isign varchar(50) NOT NULL UNIQUE CHECK (isign <> ''),
  typ CHAR(2) NOT NULL REFERENCES cv_TypDocRK(kod),
  author_surname VARCHAR(30) NOT NULL CHECK (author_surname <> ''), 
  author_name VARCHAR(30) NULL CHECK (author_name <> ''),
  title VARCHAR(100) NOT NULL CHECK (nazev <> ''),
  note VARCHAR(300) NULL,
  year_of_publication INTEGER NULL 
     CHECK (year_of_publication > 1918 AND year_of_publication < EXTRACT(year FROM current_date)),
  established DATE DEFAULT current_date NOT NULL,
  disabled DATE NULL CHECK (disabled > established),
  isbn CHAR(12) NULL CHECK (check_ISBN(isbn)),
  url VARCHAR(100) NULL CHECK (url ~* '^(file|http|ftp)://.+')
);

Usage of PL/pgSQL functions for designing own operators

PostgreSQL does not support operation of dividing interval by interval. However, it is not problem to overcome this defect and define own operator / for this combination of operand.

omega=# select '1hour'::interval / '10min'::interval;
ERROR:  operator does not exist: interval / interval

CREATE OR REPLACE FUNCTION div_op(a interval, b interval) 
RETURNS double precision AS $$
BEGIN
  -- note - better to use SQL language 
  RETURN EXTRACT(EPOCH FROM a) / EXTRACT(EPOCH FROM b);
END;
$$ LANGUAGE plpgsql IMMUTABLE RETURNS NULL ON NULL INPUT;

CREATE OPERATOR / (procedure = div_op, leftarg = interval, rightarg = interval);

omega=# select '1hour'::interval / '10min'::interval;
 ?column?
----------
        6

In CHECK expressions we can use binary operators OR and AND. In standard distribution the operators XOR and implication are missing. Nevertheless it is not a problem this missing operators fill in the system.

CREATE OR REPLACE FUNCTION op_xor (a boolean, b boolean) 
RETURNS boolean AS $$
BEGIN
  -- better use SQL language!
  RETURN ((NOT a) AND b) OR (a AND (NOT b));
END;
$$ LANGUAGE plpgsql IMMUTABLE STRICT;

CREATE OR REPLACE FUNCTION op_imp_lr (a boolean, b boolean) 
RETURNS boolean AS $$
BEGIN
  RETURN (NOT a OR b);
END;
$$ LANGUAGE plpgsql IMMUTABLE STRICT;

CREATE OR REPLACE FUNCTION op_imp_rl (a boolean, b boolean) 
RETURNS boolean AS $$
BEGIN
  RETURN (NOT b OR a);
END;
$$ LANGUAGE plpgsql IMMUTABLE STRICT;

/*
 * PL/pgSQL is black box for optimalizer, so much better is using 
 * a SQL language 
 */
CREATE OR REPLACE FUNCTION op_imp(a boolean,b boolean)
RETURNS boolean AS $$
SELECT ($1 OR NOT $2)
$$ LANGUAGE sql; 

Both functions have to be registered and we have to push ashore them some symbol. It performs the order CREATE OPERATOR. We create the name of the operator by combining following characters +-*/<>=~!@#%^&|’?$.

DROP OPERATOR # (boolean, boolean);

CREATE OPERATOR # (
  procedure = op_xor,
  leftarg = boolean,
  rightarg = boolean,
  commutator = #
);

DROP OPERATOR >>> (boolean, boolean);

CREATE OPERATOR >>> (
  procedure = op_imp_lr,
  leftarg = boolean,
  rightarg = boolean,
  commutator = <<<
);

CREATE OPERATOR <<< (
  procedure = op_imp_rl,
  leftarg = boolean,
  rightarg = boolean,
  commutator = >>>
);

We can try both operators on the table:

DROP TABLE logtab;

CREATE TABLE logtab (l boolean, p boolean);

INSERT INTO logtab VALUES(FALSE,FALSE);
INSERT INTO logtab VALUES(TRUE,FALSE);
INSERT INTO logtab VALUES(FALSE,TRUE);
INSERT INTO logtab VALUES(TRUE,TRUE);

SELECT l, p, l # p AS XOR FROM logtab;
SELECT l, p, l >>> p AS IMPL FROM logtab;

Using of this two operators make the CHECK conditions significantly easier. In the table "bug_reports" I require entering user’s id or user’s name. Further I require filling in column err_message if the type of return is error.

DROP TABLE bug_reports;
DROP SEQUENCE bug_reports_id_seq;

CREATE TABLE bug_reports (
  id SERIAL PRIMARY KEY,
  inserted DATE DEFAULT current_date NOT NULL,
  inserted_by_not_registrated VARCHAR(60) CHECK (inserted_by_not_registrated <> ''),
  inserted_by INTEGER REFERENCES users(id) CHECK (
    inserted_by_not_registrated IS NOT NULL # inserted_by IS NOT NULL),
  class_err CHAR(2) NOT NULL REFERENCES class_err_codes(code),
  error_message TEXT CHECK (
    (class_err IN ('ch','zc') >>> error_message IS NOT NULL) AND
    (error_message <> '')),
  long_description TEXT NULL CHECK (long_description <> '')
);

INSERT INTO bug_reports (inserted_by, tclass_err, long_description)
  VALUES (1,'po', 'I installed a new version.'); -- ok

INSERT INTO bug_reports (inserted_by, inserted_by_not_registrated, class_err, long_description)
  VALUES (1, 'Pavel Stěhule','po', 'I installed a new version.'); -- exception

INSERT INTO bug_reports (inserted_by_not_registrated, class_err, long_description)
  VALUES ('Pavel Stěhule','po', 'I installed a new version.'); -- ok

INSERT INTO bug_reports (inserted_by_not_registrated, class_err, long_description)
  VALUES ('Pavel Stěhule','ch', 'I installed a new version.'); -- exception 

Absence of operators XOR can be solved easily (without necessity of writing PL/pgSQL function).

inserted_by INTEGER REFERENCES Users(id) CHECK (
    (inserted_by_not_registrated IS NOT NULL) <> (inserted_by IS NOT NULL))

Usage of PL/pgSQL functions in the design of user aggregation functions.

If we compare the number of bulit-in aggregation functions in PostgreSQL with other RDBMS, we find that many aggregation functions are missing in PostgreSQL. Fortunately it is possible to develop user aggregation functions in PostgreSQL.

Aggregation functions are created by using two functions in PostgreSQL. The first, which is ran for each value, giving a result which is in turn used by the second function which gives the final result. The first example doesn’t need a second function – it creates a list of items, which are separated by commas. By aggregation functions is important to ignore NULL.

CREATE OR REPLACE FUNCTION comma_aggreg (state text, p text) RETURNS text AS $$
  BEGIN
    IF p IS NULL THEN
      RETURN state;
    END IF
    IF length(state) > 0 THEN
      RETURN state || ', ' || p;
    ELSE
      RETURN p;
    END IF;
  END;
$$ LANGUAGE plpgsql IMMUTABLE;

DROP AGGREGATE comma(text);

CREATE AGGREGATE comma (
  basetype = text,
  sfunc = comma_aggreg,
  stype = text,
  initcond = ''
);

SELECT comma(name) FROM names;

With a little modification of the aggregation function sum we obtain a "hungry" function – meaning that if the text is already aggregated, it will be ignored. Using this function isn't as fast as other methods, but it is the simplest way to gain the desired result.

CREATE OR REPLACE FUNCTION comma_aggreg(state text,p text) RETURNS text AS $$
BEGIN
  IF p IS NULL THEN
    RETURN state;
  END IF;
  IF length(state) > 0 THEN
    IF position(p in state) = 0 THEN
      RETURN state || ', ' || p;
    ELSE
      RETURN state;
    END IF;
  ELSE
    RETURN p;
  END IF;
END; 
$$ LANGUAGE plpgsql IMMUTABLE;

The table includes the areas of activity in districts and regions. If I wouldn't have used hungry counting of string type variables, the regions would be listed twice (note - or use a DISTINCT aggregate, what is better and faster).

aopk=> select * from activity;
    area       |    district   | region
---------------+---------------+------
 Benešov       | Benešov       | SČ
 Benešov       | Kutná Hora    | SČ
 Č. Budějovice | Č. Budějovice | JČ
(3 line)

aopk=> select area, sum(district), comma(DISTINCT region) from activity group by area;
     area      |         sum         | sum
---------------+---------------------+-----
 Č. Budějovice | Č. Budějovice       | JČ
 Benešov       | Benešov, Kutná Hora | SČ
(2 line)

Next is a little bit more complicated aggregation function returning a list of items in HTML format.

CREATE OR REPLACE FUNCTION html_li_aggr (state text,p text) RETURNS text AS $$
  BEGIN
    IF p IS NULL THEN
      RETURN state;
    ELSE
      RETURN state || '<li>' || p || E'</li>\n';
    END IF;
  END;
$$ LANGUAGE plpgsql IMMUTABLE;

CREATE OR REPLACE FUNCTION html_ul_final (state text) RETURNS text AS $$
  BEGIN
    IF $1 <> '' THEN
      RETURN E'<ul>\n' || state || '</ul>';
    ELSE
      RETURN '';
    END IF;
  END;
$$ LANGUAGE plpgsql IMMUTABLE;

DROP AGGREGATE html_ul (text);

CREATE AGGREGATE html_ul (
  basetype = text,
  sfunc = html_li_aggr,
  stype = text,
  initcond = '',
  finalfunc = html_ul_final
);

The result of the query SELECT html_ul(jmeno) FROM jmena is one value (even though more lines of text) which can be inserted into an HTML page.

note - PostgreSQL 8.4 has very fast buildin aggregate function array_agg. This function in combination with function array_to_string can do same work, but sugnificantly faster on bigger dataset.

Design of triggers in PL/pgSQL

Trigger is a procedure, that RDBMS activates before or after executing the orders INSERT, UPDATE a DELETE in some table, to which we assign a trigger before. An arbitary PL/pgSQL procedure without parametres returning a value of the type TRIGGER can be used as a trigger. The data type trigger, is in fact a RECORD data type compatible with the table for which it is used. Two types of triggers exist: BEFORE and AFTER. BEFORE triggers are called before carrying out an operation, AFTER triggers after carrying out an operation.

Let us imagine the situation, when we want to have data divided into two tables, but we only want to work with a unification of these tables. If we use view, we give up the posibility to use referential integrity. Another possibility is to use a third table, which will include values from the first two tables.

CREATE TABLE source1 (
  code CHAR(2) PRIMARY KEY,
  description VARCHAR(100) CHECK (description <> '')
);

CREATE TABLE source2 (
  code CHAR(2) PRIMARY KEY,
  description VARCHAR(100) CHECK (description <> '')
);

CREATE TABLE object (
  code CHAR(2) PRIMARY KEY,
  description VARCHAR(100) CHECK (description <> ''),
  source CHAR(1) NOT NULL CHECK (source = '1' OR source = '2')
);

I must say that any PL/pgSQL function runs under an implicit transaction, that means if some command in a procedure fails, all changes in the data of executed functions will be annulled. The behaviour of trigger functions is based on this. In the beginning we try to execute the requested function in the final table, if the command fails, it won’t be executed in the source table.

CREATE OR REPLACE FUNCTION trig_build_object_F() RETURNS OPAQUE AS $$
BEGIN
  IF TG_OP = 'DELETE' THEN
    DELETE FROM object WHERE code = OLD.code;
    RETURN OLD;
  ELSE
    IF TG_OP = 'UPDATE' THEN
      UPDATE object SET code = NEW.code, description = NEW.description WHERE code = OLD.code;
      RETURN NEW;
    ELSE 
      INSERT INTO object VALUES(NEW.code, NEW.description,
        CASE TG_RELNAME WHEN 'source1' THEN 1 WHEN 'source2' THEN 2 END);
      RETURN NEW;
    END IF;
  END IF;
END;
$$ LANGUAGE plpgsql SECURITY DEFINER;

When executing a PL/pgSQL function as a trigger, we can use some bulit-in variables. TG_OP describes the command which triggers the trigger, TG_RELNAME holds the name of the table (in lower-case letters), in which the trigger was ran, NEW includes a line with a new version of the values (only for INSERT and UPDATE), OLD includes a line with an original version of the values (only for DELETE and UPDATE). If a procedure returns NULL, no data will be changed. Trigger only solves the transfer of data from source tables to the final table. If we change the final table directly, it won’t correspond to the source tables. We can set user rights so that only the owner of the table has the right to change the content of the table. Other users can of course view the final table.

Trigger is always ran with the rights of the user who performed the action which triggered it. If we would strip users of the right to modificate the final table, INSERT would fail. Fortunately we have the posibility to change this behaviour with the attribute SECURITY DEFINER. Then the commands in the body of a function will be ran with the function owner’s rights, which in our case is necessary.

We have to assign a function to a trigger and a table with the command CREATE TRIGGER. We can create two basic types: those which are executed before executing the command, which invokes the trigger and those, which are ran after executing the command.

CREATE TRIGGER t_source1 
  BEFORE INSERT OR UPDATE OR DELETE ON source1
  FOR EACH ROW EXECUTE PROCEDURE trig_build_object_F();

CREATE TRIGGER t_zdroj2
  BEFORE INSERT OR UPDATE OR DELETE ON source2
  FOR EACH ROW EXECUTE PROCEDURE trig_build_object_F();

In some cases we want to prevent UPDATE. The updating of some column in a table (mostly columns of seriál type, where UPDATE has no reason). Again, there exists a supplement NO UPDATE, which solves this problem, but which isn’t installed by default. If there aren't many tries to UPDATE, we can use a trigger and a short PL/pgSQL procedure:

DROP SEQUENCE codebook_id_seq;
DROP TABLE codebook;

CREATE TABLE codebook (
  id SERIAL PRIMARY KEY,
  description VARCHAR(100) CHECK (description <> '')
);

CREATE OR REPLACE FUNCTION trig_static_id() RETURNS TRIGGER AS $$
BEGIN
  IF TG_OP = 'UPDATE' AND NEW.id <> OLD.id THEN
    RAISE EXCEPTION 'You can not update PRIMARY KEY column on table %', TG_RELNAME;
  ELSE
    RETURN NEW;  
  END IF;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER static_id_codebook
  BEFORE UPDATE ON codebook
  FOR EACH ROW EXECUTE PROCEDURE trig_static_id();

Triggers of the BEFORE type are ran before the evaluation of CHECK conditions of columns and tables. Domain types are evaluated before the activation of BEFORE triggers. Thanks to that, it is possible to fill in some missing values in the functions of triggers and not exceed restrictions. On the other side, in a trigger we can detect values that don’t satisfy CHECK conditions. The solution of this situation isn’t to invoke a exception, but to end the execution of the trigger. RDBMS will find an unsatisfactory value and will invoke an exception. If more triggers are defined over one table, then the triggers are ran in alphabetical order (unlike ANSI SQL, where the triggers are ran in the order in they were defined).

Triggers in PostgreSQL can also have static parameters specified. This enables us to use one procedure in different contexts. Static parameters aren’t accessable through parameters of of the function including the trigger, but are accessible in an external variable TG_ARGV[].

Let us have the requirement to log entries into tables, when the value, which will be written to the log can’t be simply derived from name of table. We need complicated branching.

CREATE TABLE foo_a(i integer);
CREATE TABLE foo_b(i integer);

CREATE OR REPLACE FUNCTION trig() RETURNS trigger AS $$
BEGIN
  RAISE NOTICE 'Trigger: %,Insert to table: %,Parameter: %', 
    TG_NAME, TG_RELNAME, TG_ARGV[0];
  RETURN NULL;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER foo_a_trg AFTER INSERT ON foo_a
  FOR EACH ROW EXECUTE PROCEDURE trig('a');
CREATE TRIGGER foo_b_trg AFTER INSERT ON foo_b
  FOR EACH ROW EXECUTE PROCEDURE trig('b');

INSERT INTO foo_a VALUES(1);
INSERT INTO foo_b VALUES(1);
<pre>
The result is:
<pre>
psql:testtr.sql:16: NOTICE:  Trigger: foo_a_trg,Insert to table: foo_a,Parameter: a
INSERT 336812 1
psql:testtr.sql:17: NOTICE:  Trigger: foo_b_trg,Insert to table: foo_b,Parameter: b
INSERT 336813 1

Usage PL/pgSQL function with parametres of type table

In some cases we need to run a function for a whole row of a table, for example when we need to replace the repeated term CASE in the command SELECT. Let us have tables of dealers and of sales. The table of sales includes the columns c1, c2, c3, t2, ot, oec1, oec2, oec3, t1, b1, b2, c050, c150, c300, t1n and t2n, which contain the amount of products sold in some tariff and in a given sale. The tariffs are set directively and do not change.

CREATE TABLE dealers (
  id SERIAL PRIMARY KEY,
  PNo VARCHAR(10) NOT NULL UNIQUE CHECK (PNo ~ '^[0-9]{9,10}$'),
  name VARCHAR(20) NOT NULL CHECK (name <> ''),
  surname VARCHAR(20) NOT NULL CHECK (surname <> '')
);

CREATE TABLE sale (
  id SERIAL PRIMARY KEY,
  dealer int REFERENCES dealers(id),
  closing_date DATE NOT NULL DEFAULT current_date::DATE,
  c1   integer NULL CHECK(c1 > 0),
  c2   integer NULL CHECK(c2 > 0),
  c3   integer NULL CHECK(c3 > 0),
  t2   integer NULL CHECK(t2 > 0),
  ot   integer NULL CHECK(ot > 0),
  oec1 integer NULL CHECK(oec1 > 0),
  oec2 integer NULL CHECK(oec2 > 0),
  oec3 integer NULL CHECK(oec3 > 0),
  t1   integer NULL CHECK(t1 > 0),
  b1   integer NULL CHECK(b1 > 0),
  b2   integer NULL CHECK(b2 > 0),
  c050 integer NULL CHECK(c050 > 0),
  c150 integer NULL CHECK(c150 > 0),
  c300 integer NULL CHECK(c300 > 0),
  t1n  integer NULL CHECK(t1n > 0),
  t2n  integer NULL CHECK(t2n > 0)
);

The customer will require the list of products sold, but not by tariffs, but by the final amount of products divided into classes by volume of each sale. The classes are:

0 - 4, 5 - 9, 10 - 19, 20 - 49, over 50

Then the customer can determine the size of a sale by each dealer and also his orientation for a certain segment of the market. The list can be created either by a selection using the CASE command (below) or by a selection with a PL/pgSQL function, which has one line of the table as an argument.

CREATE OR REPLACE FUNCTION c_sale_seg (rec sale, l int,h int) RETURNS integer AS $$
  DECLARE s INTEGER;
  BEGIN 
    s := coalesce(rec.c1,0) + coalesce(rec.c2,0) + coalesce(rec.c3,0) + coalesce(rec.t2,0)
      + coalesce(rec.oec1,0) + coalesce(rec.oec2,0) + coalesce(rec.oec3,0) + coalesce(rec.t1,0)
      + coalesce(rec.b1,0) + coalesce(rec.b2,0) + coalesce(rec.c150,0) + coalesce(rec.c300,0)
      + coalesce(rec.t1n,0) + coalesce(rec.t2n,0) + coalesce(rec.ot,0);
    IF l IS NOT NULL THEN
      IF s < l THEN RETURN 0; END IF;
    END IF;
    IF h IS NOT NULL THEN
      IF s > h THEN RETURN 0; END IF;
    END IF;
    RETURN s;
  END;
$$ LANGUAGE plpgsql; 

The usage of these functions drastically simplifies the design. The function c_prodej_seg sums the sale in all tariffs. If the volume of a sale is in an interval determined by the second and third arguments, it returns this value, in any other case it returns null. Notice that the first argument of the function is of the type table. In the function itself, we work with it as if it would be tabulka%ROW.

INSERT INTO dealers (PNo, name, surname) 
  VALUES ('7307150000','Pavel', 'Stěhule');
INSERT INTO dealers (PNo, name, surname) 
  VALUES ('7807150000','Zdeněk', 'Stěhule');

INSERT INTO sale (dealer,c1,c2) VALUES (1,18,11);
INSERT INTO sale (dealer,c1,c2) VALUES (1,18,6);
INSERT INTO sale (dealer,c3,c2) VALUES (1,50,24);
INSERT INTO sale (dealer,t1,c3) VALUES (1,1,1);
INSERT INTO sale (dealer,c300)  VALUES (2,10);
INSERT INTO sale (dealer,c1,c2) VALUES (2,11,2);
INSERT INTO sale (dealer,c1,c2) VALUES (2,1,6);
INSERT INTO sale (dealer,c3,c2) VALUES (2,5,12);
INSERT INTO sale (dealer,t1,ot) VALUES (2,1,1);
INSERT INTO sale (dealer,c300)  VALUES (2,252);

It returns this query:

SELECT 
  SUM(c_sale_seg(sale,NULL, 4)) AS lt5,
  SUM(c_sale_seg(sale,5, 9)) AS be5a9,
  SUM(c_sale_seg(sale,10, 19)) AS be10a19,
  SUM(c_sale_seg(sale,20, 49)) AS be20a49,
  SUM(c_sale_seg(sale,50, NULL)) AS ge50,
  name || ' ' || surname as dealer  
from 
  sale join dealers on dealer = dealers.id 
group 
  by name || ' ' || surname ;

table:

 lt5 | be5a9 | be10a19 | be20a49 | ge50 |    dealer
-----+-------+---------+---------+------+----------------
   2 |     0 |       0 |      53 |   74 | Pavel Stěhule
   2 |     7 |      40 |       0 |  252 | Zdeněk Stěhule

Functions that return tables

Individual lines are created by repeatedly calling RETURN NEXT value (this version of RETURN doesn’t end the function) and calling RETURN without parameters, with that we end the execution of a function. As a come-back type we can use scalar types, table types or composite types (these are more or less equivalent to the table type (we create them using the command CREATE TYPE)). If we don’t want to define a new type, we can use OUT parametres (RETURNS SETOF RECORD).

CREATE TYPE tf AS (f1 varchar(10), f2 varchar(10));

CREATE OR REPLACE FUNCTION makesettf(mx integer) 
RETURNS SETOF tf AS $$
DECLARE f tf%ROWTYPE;
BEGIN
  FOR i IN 1..mx 
  LOOP
    f.f1 := CAST(i AS varchar(10));
    f.f2 := 'bbbbb '||CAST(i AS varchar(10));
    RAISE NOTICE '%', f.f1;
    RETURN NEXT f;
  END LOOP;
  RETURN;
END;
$$ LANGUAGE plpgsql;

SELECT a.*, b.* 
   FROM makesettf(10) a 
        JOIN 
        makesettf(5) b 
        ON a.f1 = b.f1;

SELECT * FROM makesettf(3) 
UNION ALL 
SELECT * FROM makesettf(8);

A more practical example, inpired by the UDF of Firebird (which returned a table with the dates of german beer festivals), is to use SRF in the design of functions for the determination of a day by adding n workdays to a given day.

SRF functions are good for generating smaller dynamic (parametrical) tables, because it isn’t possible to index tables which are created by them. This restriction is easily forgiven during testing. Custom functions AddWDays and AddWDays2 were tested by copmaring the values (each function is implemented in a different way, but they must give the same results) over a virtual table created by a union of tables produced by SRF functions.

The function DaysInYear returns a table of days in the given year, the function Seq returns a sequence of integers restricted by a parameter. The function AddWDays and AddWDays2 has two parameters – the first day, to which work days are added, the number of which is determined by the second parameter). The function DiffWDays returns the numer of work days in an interval defined by parameters. If the functions are correctly designed, then the result of test queries must be an empty table.

CREATE OR REPLACE FUNCTION DaysInYear(y integer) 
RETURNS SETOF date AS $$
DECLARE st date;
BEGIN st := to_date('0101'||y,'MMDDYYYY');
  WHILE EXTRACT(year FROM st) < y + 1 LOOP
    RETURN NEXT st;
    st := st + 1;
  END LOOP;
  RETURN;
END;
$$ LANGUAGE plpgsql;

-- testy
SELECT DaysInYear, generate_series FROM DaysInYear(2003) CROSS JOIN generate_series(1,40) 
  WHERE AddWDays(DaysInYear, generate_series) <> AddWDays2(DaysInYear, generate_series);
SELECT DaysInYear, generate_series FROM DaysInYear(2003) CROSS JOIN generate_series(1,40) 
  WHERE DiffWDays(DaysInYear, AddWDays(DaysInYear, generate_series)) <> generate_series;

We have few national holidays, but they are on different dates each year (because of easter) , so it’s good to use SRF for generating the table. If preformance problems would occur, we could materialize the table, that means to save into a normal table with the command SELECT INTO. The function StátníSvátky returns a table with national holidays, including the name for the range given by the first and second parameters. The function PočetStatníchSvátků counts the number of national holidays and discards the days given as the third parameter (mostly Saturday and Sunday)). Intristing is perhaps only the way of entering the comparison value with the array contents = ANY().

CREATE OR REPLACE FUNCTION NationalHolidays(start date, finish date, OUT time date, OUT name varchar(50)) 
  RETURNS SETOF RECORD  AS $$
DECLARE 
  holiday VARCHAR [][] DEFAULT ARRAY
   [['New year', '0101'], 
    ['Easter sunday', 'xx'],
    ['Easter monday','x1'], 
    ['Holiday of work','0501''],
    ['Holiday of freedom',''0508''], 
    ['The day of Cyrila and Metoděj','0705'],
    ['The day of burning master Jan Hus','0706'], 
    ['The day of czech statehood','0928'],
    ['The day of beginning of independent Czechoslovak state', '1028'],
    ['The day of fight to freedom and democracy','1117'],
    ['Christmas eve','1224'],
    ['1. holiday of Christmas','1225'],
    ['2. holiday of Christmas','1226']];
  year integer; stepd date; d varchar;
BEGIN stepd := start;
  WHILE stepd <= finish LOOP
    year := CAST(EXTRACT(year FROM stepd) AS INTEGER);
    FOR i IN array_lower(holiday,1)..array_upper(holiday,1) LOOP
      d := holiday[i][2];
      IF d = 'xx' THEN
        time := Easter_sunday(year);
      ELSIF d = 'x1' THEN
        time := Easter_sunday(year) + 1;
      ELSE
        time := to_date(d||year,'MMDDYYYY');
      END IF;
      IF time BETWEEN stepd AND finish THEN
        name := holiday[i][1]; RETURN NEXT;
      END IF;
    END LOOP;
    stepd := date_trunc('year', stepd) + interval '1 year';
  END LOOP;
  RETURN;
END;
$$ LANGUAGE plpgsql IMMUTABLE STRICT;

CREATE OR REPLACE FUNCTION NumberOfNationalHolidays(start date, finish date, nwd INTEGER[]) 
RETURNS INTEGER AS $$
DECLARE 
  holiday VARCHAR [] DEFAULT ARRAY
   ['0101', 'xx', 'x1', '0501', '0508', '0705', '0706', 
    '0928', '1028', '1117', '1224', '1225', '1226'];
  year INTEGER; stepd date; d varchar; hol date; pss INTEGER DEFAULT 0;  OM INTEGER;
BEGIN stepd := start;
  WHILE stepd <= finish LOOP
    year := CAST(EXTRACT(year FROM stepd) AS INTEGER);
    FOR i IN array_lower(holiday,1)..array_upper(holiday,1) LOOP
      d := holiday[i];
      IF d = 'xx' THEN
        hol := Easter_sunday(year);
      ELSIF d = 'x1' THEN
        hol := Easter_sunday(year) + 1;
      ELSE
        hol := to_date(d||year,'MMDDYYYY');
      END IF;
      IF NOT EXTRACT(dow FROM hol) = ANY (nwd) THEN
        IF hol BETWEEN stepd AND finish THEN
          pss := pss + 1;
        END IF;
      END IF;
    END LOOP;
    stepd := date_trunc('year', stepd) + interval '1 year';
  END LOOP;
  RETURN pss;
END;
$$ LANGUAGE plpgsql IMMUTABLE STRICT;

The function AddWDays2 is simpler and for a smaller number of days it is probably also faster. I save an array of holidays for one year on into the variable svatky, to which I add work days. The function StatniSvatky returns a table and so it is necessary to transfer it to a constructions array ARRAY(SELECT FROM). Then I decrease the number of work days for each work day and I go on.

CREATE OR REPLACE FUNCTION AddWDaysSlow2(start date, c integer) RETURNS date AS $$
DECLARE st date = start; ed date; holidays date[]; d integer = c; k integer;
BEGIN ed := st + 365;
  SELECT INTO holidays ARRAY(SELECT time FROM NationalHolidays(st, ed));
  LOOP k := CAST(EXTRACT(DOW FROM st) AS INTEGER);
    WHILE k = 6 OR k = 0 OR st = ANY(holidays) LOOP
      st := st + 1; k := CAST(EXTRACT(DOW FROM st) AS INTEGER);
      IF st > ed THEN ed = st + 365; 
        SELECT INTO holidays ARRAY(SELECT time FROM NationalHolidays(st, ed));
      END IF;
    END LOOP;
    EXIT WHEN d = 0 OR d = 1;
    st := st + 1; d := d - 1;
  END LOOP;
  RETURN st;
END; 
$$ LANGUAGE plpgsql;

The second (the first) function AddWDays is more complicated and faster for longer intervals. It uses the function AddWDay, which adds x days for weekends to the number of work days. Then it repeatedly checks for the need of corections for national holidays. If a parameter of AddWDay is Saturday or Sunday, it is moved to Monday. After that I go back to Sunday and increase the number of work days for the amount of days I've gone back. Then I can simply transfer work days to real days.

CREATE OR REPLACE FUNCTION AddWDays(start date, c integer) RETURNS date AS $$
DECLARE st date = start; ed DATE; corection INTEGER; hol INTEGER; d INTEGER;
BEGIN ed := AddWDay(st,c);
  corection = NumberOfNationalHolidays(st, ed, ARRAY[6,0]); 
  WHILE corection > 0 LOOP
    st := ed + 1; ed := AddWDay(st, corection); 
    corection =  NumberOfNationalHolidays(st, ed, ARRAY[6,0]);
  END LOOP;
  RETURN FirstWDay(ed);
END;
$$ LANGUAGE plpgsql IMMUTABLE STRICT;

CREATE OR REPLACE FUNCTION AddWDay(d date, c integer) RETURNS date AS $$
DECLARE st DATE; wd INTEGER; k INTEGER;
BEGIN k := CAST(EXTRACT(DOW FROM d) AS INTEGER);
  IF k = 0 THEN st := d + 1; k := 1;
  ELSIF k = 6 THEN st := d + 2; k := 1;
  ELSE st := d; END IF;
  st := st - k; wd := c + k - 2;
  st := st + ((wd) / 5) * 7  + ((wd) % 5) + 1;
  RETURN st;
END;
$$ LANGUAGE plpgsql IMMUTABLE STRICT;

If the first parameter of the function DiffWDays is Saturday or Sunday I move the beginning to Monday, if the second parameter is Saturday or Sunday I transfer the end to Friday. If the beginning is after the end I have to increase number of weeks. In the given interval I subtract the number of national holidays. The function ISODOW returns an ISO number of the day (Saturday 6, Sunday 7).

CREATE OR REPLACE FUNCTION ISODOW(date) 
RETURNS INTEGER AS $$
  SELECT CAST(EXTRACT(DOW FROM $1 - 1) + 1 AS INTEGER);
$$ LANGUAGE sql STRICT;

CREATE OR REPLACE FUNCTION DiffWDays(d1 date, d2 date) 
RETURNS integer AS $$
DECLARE wc INTEGER; st DATE; ED DATE; p INTEGER;
BEGIN 
  p := ISODOW(d1); IF p > 5 THEN st := d1 + (8 - p); ELSE st := d1; END IF;
  p := ISODOW(d2); IF p > 5 THEN ed := d2 - (p - 5); ELSE ed := d2; END IF;
  wc := (d2 - d1) / 7;
  IF EXTRACT(DOW FROM st - 1) + 1 > EXTRACT(DOW FROM ed - 1) + 1 THEN
    wc := wc + 1;
  END IF;
  RETURN (d2 - d1 - (wc * 2) - NumberOfNationalHolidays(st,ed, ARRAY[6,7]) + 1);
END;
$$ LANGUAGE plpgsql IMMUTABLE STRICT;

Recursive call of SRF functions

Every SRF functions call has its own so-called context where is set of current returned values stored. This feature will appear only when recursive calling of SRF function, when having incorrectly designed function we think we are loosing the results. The example of incorrectly designed function is:

CREATE OR REPLACE FUNCTION foo (pid integer) 
RETURNS SETOF foo AS $$
  DECLARE rec RECORD;
BEGIN
  FOR rec IN 
     SELECT * FROM foo WHERE foo.pid=pid 
  LOOP
    RETURN NEXT rec;
    RAISE NOTICE 'uid=% pid=%',rec.uid,rec.pid;
    SELECT INTO rec * FROM foo (rec.uid);
  END LOOP;
  RETURN;
END; 
$$ LANGUAGE plpgsql;

False supposition is that transfer of values of nested function is caused by select into rec * from foo() call. The correct form of function is this:

CREATE OR REPLACE FUNCTION foor (pid integer) 
RETURNS SETOF foo AS $$
DECLARE rec RECORD;
BEGIN
  FOR rec IN 
     SELECT * FROM foo WHERE foo.pid=pid 
  LOOP
    RETURN NEXT rec;
    FOR rec IN SELECT * FROM foor (rec.uid) 
    LOOP
      RETURN NEXT rec;
    END LOOP;
  END LOOP;
  RETURN;
END;
$$ LANGUAGE plpgsql;

/*
 * variant with RETURN QUERY
 */
CREATE OR REPLACE FUNCTION foor(_pid integer) 
RETURNS SETOF foo AS $$
DECLARE rec RECORD;
BEGIN
  FOR rec IN 
     SELECT * FROM foo WHERE foo.pid=_pid 
  LOOP
    RETURN NEXT rec;
    RETURN QUERY SELECT * FROM foor(rec.uid)
  END LOOP;
  RETURN;
END;
$$ LANGUAGE plpgsql; 

PostgreSQL 8.4 support Common Table Expression - recursive SELECTs. We can use it without recursive SRF functions:

WITH RECURSIVE x AS (SELECT * 
                        FROM foo 
                       WHERE uid = 3 -- find a root
                     UNION ALL
                     SELECT foo.* 
                        FROM foo 
                             JOIN 
                             x
                             ON x.uid = foo.pid) -- append rows where pid is same as some searched uid  
   SELECT * FROM x; -- show content of stacked table x without changes

The proposal of polymorphic functions, polymorphic functions usage

Polymorphic functions use some of next polymorphic types: anyarray, anyelement during the definition of returned type or argument.

If the function returns the polymorphic type then at least one argument from arguments of function must be of polymorphic type. The polymorphic type is dynamically determined according to the current type of the argument. If the function contains more parameters of anyelement type the real parameters must be of the same type. The same requirement holds for array.

CREATE OR REPLACE FUNCTION polyfoo1(a anyelement) 
RETURNS anyelement AS $$
BEGIN
  RETURN  a + 1;
END;
$$ LANGUAGE plpgsql;

testdb011=> select polyfoo1(date '1.1.2004'), polyfoo1(10);
  polyfoo1  | polyfoo1
------------+----------
 2004-01-02 |       11
(1 row)

CREATE OR REPLACE FUNCTION polyfoo2(a anyelement, b integer) 
RETURNS anyelement AS $$
BEGIN
  RETURN  a - b;
END;
$$ LANGUAGE plpgsql;

pavel@postgres:5432=# select polyfoo2(date '1.1.2004',100);
  polyfoo2  
------------
 2003-09-23
(1 row)

pavel@postgres:5432=# select polyfoo2(200,100);
 polyfoo2 
----------
      100
(1 row)

The body of the aggregation function whose result is array of values of subset is example of using of polymorphic function.

CREATE OR REPLACE FUNCTION aggregate_array(state ANYARRAY, a ANYELEMENT) 
RETURNS ANYARRAY AS $$
   SELECT CASE
     WHEN state IS NULL THEN ARRAY[a]
     WHEN a IS NULL THEN state
     ELSE array_append(state,a)
 END;
$$ LANGUAGE sql;

testdb011=> SELECT count(*), aggarray(surname) FROM people GROUP BY surname ~ '.*á';

 count |                  aggarray
-------+---------------------------------------------
     1 | {Gregorová}
     6 | {Kolík,Čurda,Hojka,Chytlík,Stěhule,Novotný}
(2 rows)

Sometimes polymorphic functions are referred to as generic functions. Internally, PostgreSQL hasn't got any date type ANYELEMENT. Every time the activation of any function is required PostgreSQL always investigates whether a ready code for required function exists (it tests harmony with number and type of arguments). If the function hasn't been started yet PostgreSQL searches for its source code and prepares it for its first start. If PostgreSQL don't find harmony with polymorphic function then it replaces unknown types by required types and prepares function with parameters, which are already all determined – standard mechanism is used.

If we need to identify variant of polymorphic function we can use IS OF operator (or function pg_typeof).

  IF a IS OF (INTEGER) THEN
    RAISE NOTICE 'Parameter a is integer';
  ELSIF a IS OF (varchar) THEN
    RAISE NOTICE 'Parameter a is string';
  END IF;

PL/pgSQL function with (IN) OUT parameters

PL/pgSQL is specific because of its support only to the proposals of function, not procedures. The method of transfer of the final value is the main different between procedure and function. Classically, procedure returns result by variables, which are transfer by reference. The problem is that PostgreSQL don´t support variables, except PL/pgSQL. In older versions of PostgreSQL we had to use complex type, when we needed to obtain more then one value from the code. The current versions of PostgreSQL support OUT variables so we needn't declare a special out type.

CREATE OR REPLACE FUNCTION foo(IN a int, IN b int, OUT aa int, OUT bb int) AS $$
BEGIN
  aa := a*a;
  bb := b*b;
END
$$

SELECT foo(10,10);

News in PostgreSQL 8.3

In this version we can simply return complete result of SQL statement by using a RETURN QUERY statement. It is replacement of repeated call RETURN NEXT statement. RETURN QUERY like RETURN NEXT doesn't ending function execution.

CREATE OR REPLACE FUNCTION dirty_series(m integer, n integer) 
RETURNS SETOF integer AS $$
BEGIN
  RETURN QUERY SELECT * 
                  FROM generate_series(1,m) g(i)
                 WHERE i % n = 0;
  RETURN;
END; $$ LANGUAGE plpgsq;

In this version you can assigng to function a specific values of system variables. This behave is similar to T-SQL or MySQL, where current settings (in function registration time) is stored together with procedure. Typical use case is change of search_path variable for SECURITY DEFINER. Syntax is clean from example:

CREATE FUNCTION report_guc(text) RETURNS TEXT AS
  $$ SELECT current_setting($1) $$ LANGUAGE sql
  SET regex_flavor = basic;

ALTER FUNCTION report_guc(text) 
  RESET search_path 
  SET regex_flavor = extended;

News in PostgreSQL 8.4

Till this time any dynamic SQL was some risk of SQL injection. Readability of code with dynamic SQL wasn't well too. This problems can be removed with using a USING clause for EXECUTE statement. We are able to parametrize dynamic SQL via this clause similary like PREPARED statement does.

unsecure dynamic SQL:

EXECUTE 'SELECT * FROM ' || tabname || ' WHERE value = \'' || value || '\'';

secure dynamic SQL:

EXECUTE 'SELECT * FROM ' || tabname::regclass || ' WHERE value = $1' USING value;

Dynamic SQL can be more secure and more readable now - with less work. But still - use it carefully and as few as possible (for example - when we need generate execution plan with knowledge actual parameters).

CASE statement

The syntax of this statement coming from SQL/PSM language - it is standard language for stored procedures (used in DB2 or MySQL):

/* simple case */
CASE x
    WHEN 1, 2 THEN
         msg := 'one or two';
    ELSE
         msg := 'other value than one or two';
END CASE;

/* search case */
CASE
    WHEN x BETWEEN 0 AND 10 THEN
         msg := 'value is between zero and ten';
    WHEN x BETWEEN 11 AND 20 THEN
         msg := 'value is between eleven and twenty';
END CASE;

Customized exceptions

PostgreSQL allows to pass on lot of informations in any exception. This functionality wasn't accessible from PL/pgSQL RAISE statement. RAISE statement allowed to enter only message text. All other fields of exception was default. This limit is history now. Syntax of RAISE statement was enhanced - so developer can specify SQLSTATE, ERRCODE, MESSAGE, DETAIL, HINT:

RAISE division_by_zero;
RAISE SQLSTATE '22012';
RAISE EXCEPTION 'Nonexistent ID --> %', user_id
   USING HINT = 'Please check your user id';
RAISE 'Duplicate user ID: %', user_id
   USING ERRCODE = 'unique_violation';
RAISE 'Duplicate user ID: %', user_id
   USING ERRCODE = '23505';

We can catch custom exception now. It wasn't possible in older versions:

BEGIN
   ...
EXCEPTION
  WHEN SQLSTATE '22012' THEN ...

Variadic functions

Variadic functions are functions with mutable number of parameters. PostgreSQL contains some of buildin variadic functions - least,greatest,coalesce. Internal implementation isn't too difficult, but it requires changes inside PostgreSQL parser. The workaround on user level was using a function overloading - but it isn't nice solution too - it can duplicate code. Final solution is a variadic function support. All used parameters are stored to array and this array is passed as variadic parameter to function:

CREATE FUNCTION myleast(VARIADIC a numeric[])
 RETURNS NUMERIC AS $$
   SELECT min($1[i])
      FROM generate_subscripts($1,1) g(i)
 $$ LANGUAGE SQL IMMUTABLE;
CREATE FUNCTION
Time: 5,274 ms

postgres=# SELECT myleast(1,2,3,4);
 myleast
---------
        1
(1 row)

Sometimes we have to call variadic function like normal function. Then we can use a VARIADIC modificator:

postgres=# SELECT myleast(VARIADIC ARRAY[1,3,4,-5,6,8]);
 myleast
---------
      -5

Table functions

ANSI SQL knows functions that returns table. PostgreSQL support this kind of functions too - PostgreSQL use little bit different syntax and different name - Set Returned Functions. PostgreSQL 8.4 knows ANSI SQL syntax now:

--ANSI SQL 
create or replace function tab(i int)
returns table(a int, b int) as $$
  select i, i+1 
     from generate_series(1, $1) g(i);
$$ language sql immutable strict;

postgres=# select * from tab(2);
 a | b 
---+---
 1 | 2
 2 | 3
(2 rows)

-- PostgreSQL syntax
create or replace function tab(in int, out a int, out b int)
returns setof record as $$
  select i, i+1 
     from generate_series(1, $1) g(i);
$$ language sql immutable strict;

Default values for function parameters

Possibility to define defaults for function parameters increase developer's komfort and decrease code duplication. Implementation is similar to Firebird - missing parameters are fill in from right by default values.

postgres=# CREATE OR REPLACE FUNCTION instr(varchar, varchar, int = 1) 
           RETURNS int AS $$
              SELECT position($2 IN substring($1 FROM $3)); 
           $$ LANGUAGE sql;
CREATE FUNCTION
Time: 2,738 ms
postgres=# SELECT instr('abcabc','a');
 instr 
-------
     1
(1 row)

Time: 1,907 ms
postgres=# SELECT instr('abcabc','a',2);
 instr 
-------
     3
(1 row)

Time: 1,381 ms

RETURN QUERY EXECUTE

I was pleasantly surprised by success of RETURN QUERY. I wasn't expect so this enhancing can be commited, and implementation was really simple - a few lines in parser. 8.4 RETURN QUERY has support for dynamic SQL - RETURN QUERY EXECUTE:

FOR r IN EXECUTE '....' LOOP
  RETURN NEXT r;
END LOOP

can be replaced by:

RETURN QUERY EXECUTE '...';

News in 9.0

The biggest change of PL/pgSQL in 9.0 is parser refactoring. Now PL/pgSQL uses true SQL parser for integrated SQL. Other changes are plus/minus small.

SQL and PL/pgSQL identifier collision modes

Very significant minus of PL/pgSQL syntactic check was impossibility to find collision between SQL and PL/pgSQL identifiers. Code with ambiguous identifiers doesn't work well usually, but there wasn't any syntactic or runtime error. Now, PL/pgSQL parser can check uniqueness of identifiers inside block:

CREATE OR REPLACE FUNCTION foo() 
RETURNS void AS $$
DECLARE a integer;                                              
BEGIN              
  FOR a IN
     SELECT a FROM omega 
  LOOP                         
    RAISE NOTICE '%', a;
  END LOOP; 
END; 
$$ LANGUAGE plpgsql;
CREATE FUNCTION
Time: 3,501 ms
postgres=# select foo();
ERROR:  column reference "a" is ambiguous
LINE 1: select a from omega
               ^
DETAIL:  It could refer to either a PL/pgSQL variable or a table column.
QUERY:  select a from omega
CONTEXT:  PL/pgSQL function "foo" line 3 at FOR over SELECT rows

Position and mixed notation for passing function parameters

Usually values are passed to parameter variables per position. There are two next manner - named and mixed notation. PostgreSQL use following syntax for named notation - parameter_name := value (note - this syntax isn't final - probably will be defined little bit different (ANSI SQL conform) parameter_name => value (this syntax cannot be implemented now, because module hstore uses operator =>):

CREATE OR REPLACE FUNCTION foo(a varchar, b varchar)
RETURNS void AS $$
BEGIN
  RAISE NOTICE 'A:% B:%', a, b;
  RETURN;
END;
$$ LANGUAGE plpgsql;

SELECT foo('Hello','World');
SELECT foo('Hello', b := 'World'); -- mixed notation
SELECT foo(b := 'World', a := 'Hello');  -- named notation

Other changes

  • IN function parameters are writable now
  • Language PL/pgSQL is available immediately after installation.

New in 9.1

Cycle FOREACH

FOREACH cycle helps to us with iteration over array. SLICE clause can be used for iteration over multidimensional arrays. A iteration over very large arrays can be significantly faste with FOREACH statement than with FOR statement.

CREATE OR REPLACE FUNCTION public.ft()
 RETURNS void
 LANGUAGE plpgsql
AS $function$
DECLARE
  _a int;
  _b int;
  _xx foo[];
BEGIN
  _xx := (SELECT ARRAY(SELECT (a,b) FROM foo));
  FOREACH _a, _b IN ARRAY _xx
  LOOP
    RAISE NOTICE 'a: %,	b: %', _a, _b;
  END LOOP;
END;
$function$;

postgres=# SELECT ft();
NOTICE:  a: 10,b: 20
NOTICE:  a: 30,b: 40
 ft 
────
 
(1 row)

CREATE OR REPLACE FUNCTION public.ft()
 RETURNS void
 LANGUAGE plpgsql
AS $function$
DECLARE
  _a int[];
  _xx int[] := ARRAY[[1,2,3], [4,5,6]];
BEGIN
  FOREACH _a SLICE 1 IN ARRAY _xx
  LOOP
    RAISE NOTICE '%', _a;             
  END LOOP;
END;
$function$;

postgres=# SELECT ft();
NOTICE:  {1,2,3}
NOTICE:  {4,5,6}
 ft 
────
 
(1 row)

New in 9.2

GET STACKED DIAGNOSTICS

When I worked on some project, I needed catch exception, write it to log, and forward it. It was impossible in previous versions, because there was a acccess to SQLERRM and SQLCODE variables only. Now we have access to almost complete data from exception (stack of calls are not accessible) due using GET STACKED DIAGNOSTICS.

create or replace function stacked_diagnostics_test() returns void as $$
declare _detail text;
        _hint text;
        _message text;
begin
  perform raise_test();
exception when others then
  get stacked diagnostics
        _message = message_text,
        _detail = pg_exception_detail,
        _hint = pg_exception_hint;
  raise notice 'message: %, detail: %, hint: %', _message, _detail, _hint;
end;
$$ language plpgsql;
select stacked_diagnostics_test();

New in 9.3

a expression with result of composite type can be used in RETURN and RETURN NEXT statements.

Access to exception info additional fields

Now, GET STACKED DIAGNOSTICS statement can be used for access to fields:: COLUMN_NAME, TABLE_NAME, SCHEMA_NAME, CONSTRAINT_NAME and PG_DATATYPE_NAME. A new related fields can be used in RAISE statement too: COLUMN, TABLE, SCHEMA, CONSTRAINT a DATATYPE.

CREATE OR REPLACE FUNCTION stacked_diagnostics_test() 
RETURNS void AS $$
DECLARE _column_name text;
        _constraint_name text;
        _datatype_name text;
        _table_name text;
        _schema_name text;
BEGIN
  RAISE EXCEPTION USING
    column = '>>some column name<<',
    constraint = '>>some constraint name<<',
    datatype = '>>some datatype name<<',
    table = '>>some table name<<',
    schema = '>>some schema name<<';
EXCEPTION WHEN OTHERS THEN
  GET STACKED DIAGNOSTICS
        _column_name = column_name,
        _constraint_name = constraint_name,
        _datatype_name = pg_datatype_name,
        _table_name = table_name,
        _schema_name = schema_name;
  RAISE NOTICE 'column %, constraint %, type %, table %, schema %',
    _column_name, _constraint_name, _datatype_name, _table_name, _schema_name;
END;
$$ LANGUAGE plpgsql;

New in 9.4

Checking call stack everywhere

In previous version you can check call stack in exception handler with GET STACKED DIAGNOSTICS var = PG_EXCEPTION_CONTEXT statement. In 9.4 you can get call stack everywhere with GET DIAGNOSTICS statement:

CREATE OR REPLACE FUNCTION public.inner_func() RETURNS integer AS  $$
DECLARE
  stack text;
BEGIN
  GET DIAGNOSTICS stack = PG_CONTEXT;
  RAISE NOTICE e'--- Call Stack ---\n%', stack;
  RETURN 1;
END;
$$ LANGUAGE plpgsql;

New in 9.5

Significant internal change is reimplementation of type casting.

ASSERT statement

ASSERT statements ensures function's preconditions. It raises unhandled exception if first expression is not the true. Second optional argument can to specify message attached to raised exception.

postgres=# DO $$BEGIN ASSERT false; END;$$; 
ERROR:  assertion failed
CONTEXT:  PL/pgSQL function inline_code_block line 1 at ASSERT

postgres=# DO $$BEGIN ASSERT false, 'some is wrong'; END;$$; 
ERROR:  some is wrong
CONTEXT:  PL/pgSQL function inline_code_block line 1 at ASSERT

Recommendation for design of stored procedures in PL/pqSQL language

You can find a number of recommendation for writing of the stored procedures in PL/SQL by Oracle Company on the Internet. Implementation of PL/pgSQL is different and therefore not every recommendation is relevant to PostgreSQL. You can write unreadable and not easy-to-maintain, and eventually ineffective, code in PL as well as in other programming languages. If you will keep following recommendations you can decrease the risk of unsuitable code:

  • Use a programmer editor and code save to file. Using a files is better for code comments, better for code organization
  • Be careful of conflict of local variables and database objects:
  1. use prefix of variables (e.g. for local variables use “_” symbol)
  2. use qualified attributes (table. column) at all SQL command in procedures
  • Declare variables by derived type - %TYPE and %ROWTYPE
  • At PL use native SQL everywhere that is possible and wise. It’s possible that PL code may be quicker than complicated request. Try to not use dynamic SQL. Think that you can replace cycle in function by classic SELECT containing construction CASE.
-- bad code
FOR _c1, _c2 IN SELECT c1,c2 FROM tab1 LOOP
  IF _c1 > 20 THEN
    INSERT INTO tab2 VALUES(20,_c2);
  ELSE
    INSERT INTO tab3 VALUES(_c1,_c2);
  END IF;
END LOOP;

-- well code
INSERT INTO tab2 
  SELECT 20,c2 FROM tab1 WHERE c1 > 20;
INSERT INTO tab3
  SELECT c1,c2 FROM tab1 WHERE c1 <= 20;
  • The function should contain only one RETURN command – one way in, one way out.
  • Duplicate code shouldn’t appaer in aplication – don’t write redundant code.
  • Use ASSERT function, eventually its modification:
CREATE OR REPLACE FUNCTION Assert(bool, varchar) RETURNS void AS $$
BEGIN 
 IF NOT $1 OR $1 IS NULL THEN
   IF $2 IS NOT NULL THEN  
     RAISE EXCEPTION 'Assert failure: %', $2; 
   END IF;
   RAISE NOTICE 'Assert. Message is null';
 END IF;   
END;$$ LANGUAGE plpgsql;

CREATE OR REPLACE FUNCTION Assert_IsNotNull(anyelement, varchar) RETURNS void AS $$
BEGIN
  PERFORM Assert($1 IS NOT NULL, $2);
END; $$ LANGUAGE plpgsql;
  • Make uderstood on list of errors defined by user beforehand. Follow the put-up notation for text of exceptions.
  • Don’t test NULL variable that is declared as NOT NULL. It’s on PL to find out that this variable won’t contain NULL. Don’t test content of variables by forced command for location of fruitfulness of command, but test content of logical system variable FOUND.
  • Use label <<label>> for loops and blocks, use Boolean type so that your code will be the most readable.
  • Don’t rely on automatic converting of date and timestamp type, tht depand on configuration. Use function to_char and to_date.
  • Don't use IF to fill logic variable:
is_ok := _age > 18;
  • Use every variable just in one meaning. Don't reuse variables.
  • Starting category of function is VOLATILE. If you don't use tables and functions like random() or currval(), use IMMUTABLE or STABLE categories. Process time of function may be half.
  • Don't enclose SQL commands to simply functions uselessly.
  • Try to reduce cursors and temporary tables.
  • Prefer standardized function against your own function.
  • In triggers don't correct data “secretly”, consider whether you can use CHECK.
  • Each procedure should contain up to 50 or 60 lines.
  • Use unit tests (for checking error identification, for checking propriety of corrections).
  • Enclose the call of exception to own procedure.
  • Use only one of two possible method (exception or return error code) of error indication in our code.
  • Minimalise repeated assign statement to variable (use a COALESCE function).
--bad
DECLARE v varchar;
BEGIN
  v := 'a';
  v := v || 'b';
  v := v || 'c';
  RETURN v;
END;

--well
BEGIN
  RETURN 'a' || 'b' || 'c';
END;

-- bad
DECLARE s varchar := '';
BEGIN
  IF x1 IS NULL THEN
    s := s || 'NULL,'
  ELSE
    s := s || x1;
  END IF;
  
  IF x2 IS NULL THEN
    s := s || 'NULL, '
  ELSE
    s := s || x2;
  END IF;
  ...

-- well
DECLARE s varchar;
BEGIN
  s := COALESCE(x1 || ',', 'NULL,')
       || COALESCE(x2 || ',', 'NULL,')
  ...
  • Use a SQL language for simple functions instead PL/pgSQL:
--bad
CREATE OR REPLACE FUNCTION last_day(IN d date) 
RETURNS date AS $$
BEGIN
  RETURN CAST(date_trunc('month',current_date + interval '1month') AS date) - 1;
END;
$$ LANGUAGE plpgsql IMMUTABLE STRICT;

--well
CREATE OR REPLACE FUNCTION last_day(IN d date) 
RETURNS date AS $$
  SELECT CAST(date_trunc('month', $1 + interval '1month') AS date) - 1;
$$ LANGUAGE sql;

Google AdSense