Question: Using variables in a PL/pgSQL function

Question

Using variables in a PL/pgSQL function

Answers 2
Added at 2016-12-28 19:12
Tags
Question

Postgres PL/pgSQL docs say:

For any SQL command that does not return rows, for example INSERT without a RETURNING clause, you can execute the command within a PL/pgSQL function just by writing the command.

Any PL/pgSQL variable name appearing in the command text is treated as a parameter, and then the current value of the variable is provided as the parameter value at run time.

But when I use variable names in my queries I get an error:

ERROR:  syntax error at or near "email"
LINE 16: ...d,email,password) values(identity_id,current_ts,''email'',''...

This is my function:

CREATE OR REPLACE FUNCTION app.create_identity(email varchar,passwd varchar)
RETURNS integer as $$
DECLARE
    current_ts          integer;
    new_identity_id     integer;
    int_max             integer;
    int_min             integer;
BEGIN
    SELECT extract(epoch FROM now())::integer INTO current_ts;
    int_min:=-2147483648;
    int_max:= 2147483647;
    LOOP
        BEGIN
            SELECT floor(int_min + (int_max - int_min + 1) * random()) INTO new_identity_id;
            IF new_identity_id != 0 THEN
                INSERT into app.identity(identity_id,date_inserted,email,password) values(identity_id,current_ts,''email'',''passwd'');
                RETURN new_identity_id;
            END IF;
        EXCEPTION
            WHEN unique_violation THEN
        END;
    END LOOP;
END;
$$ LANGUAGE plpgsql;

Why when I use variables in the query, Postgres throws an error. How is this supposed to be written?

Answers
nr: #1 dodano: 2016-12-28 19:12

You can not put the parameter names in single quotes (''email'' and you can't use the parameter email "as is" because it has the same name as a column in the table. This name clash is one of the reasons it is highly recommended to not use variables or parameters that have the same name as a column in one of the tables. You have three choices to deal with this:

  1. rename the variable. A common naming convention is to prefix parameters with p_ e.g. p_email, then use the un-ambigous names in the insert

    INSERT into app.identity(identity_id,date_inserted,email,password) 
    values(identity_id,current_ts,p_email,p_password);
    
  2. use the $1 for the first parameter and $2 for the second:

    INSERT into app.identity(identity_id,date_inserted,email,password) 
    values(identity_id,current_ts,$1,$2);
    
  3. prefix the parameter name with the function name:

    INSERT into app.identity(identity_id,date_inserted,email,password) 
    values(identity_id,current_ts,create_identity.email,create_identity.password);
    

I would highly recommend to go with option 1


Unrelated, but: you don't need SELECT statements to assign variable values if you don't retrieve those values from a table.

SELECT extract(epoch FROM now())::integer INTO current_ts;

can be simplified to:

current_ts := extract(epoch FROM now())::integer;

and

SELECT floor(int_min + (int_max - int_min + 1) * random()) INTO new_identity_id;

to

new_identity_id := floor(int_min + (int_max - int_min + 1) * random());
nr: #2 dodano: 2016-12-29 01:12

@a_horse answers your actual question and clarifies quoting issues and naming conflicts.

About quoting:

About naming conflicts (behavior of plpgsql changed slightly over time):

Better solution

I suggest a completely different approach to begin with:

CREATE OR REPLACE FUNCTION app.create_identity(_email text, _passwd text
                                             , OUT new_identity_id int) AS
$func$
DECLARE
   _current_ts int := extract(epoch FROM now());
BEGIN
   LOOP
      --+ Generate compeltely random int4 numbers +-----------------------------
      -- integer (= int4) in Postgres is a signed integer occupying 4 bytes   --
      -- int4 ranges from -2147483648 to +2147483647, i.e. -2^31 to 2^31 - 1  --
      -- Multiply bigint 4294967296 (= 2^32) with random() (0.0 <= x < 1.0)   --
      --   trunc() the resulting (positive!) float8 - cheaper than floor()    -- 
      --   add result to -2147483648 and cast the next result back to int4    --
      -- The result fits the int4 range *exactly*                             --
      --------------------------------------------------------------------------
      INSERT INTO app.identity
            (identity_id, date_inserted,  email ,  password)
      SELECT _random_int, _current_ts  , _email , _passwd
      FROM  (SELECT (bigint '-2147483648'       -- could be int, but sum is bigint anyway
                   + bigint '4294967296' * random())::int) AS t(_random_int)  -- random int
      WHERE  _random_int <> 0                   -- exclude 0 (no insert)
      ON     CONFLICT (identity_id) DO NOTHING  -- no exception raised!
      RETURNING identity_id                     -- return *actually* inserted identity_id
      INTO   new_identity_id;                   -- OUT parameter, returned at end

      EXIT WHEN FOUND;                          -- exit after success
      -- maybe add counter and raise exception when exceeding n (100?) iterations
   END LOOP;
END
$func$  LANGUAGE plpgsql;

Major points

  • Your random integer calculation would result in an integer out of range error, because the intermediate term int_max - int_min + 1 operates with integer, but the result won't fit. I suggest the cheaper, correct algorithm above.

  • Entering a block with exception clause is considerably more expensive than without. Luckily, you do not actually need to raise an exception to begin with. Use an UPSERT (INSERT ... ON CONFLICT ... DO NOTHING), to solve this cheaply and elegantly (Postgres 9.5+).
    The manual:

    Tip: A block containing an EXCEPTION clause is significantly more expensive to enter and exit than a block without one. Therefore, don't use EXCEPTION without need.

  • You don't need the extra IF construct either. Use SELECT with WHERE.

  • Make new_identity_id an OUT parameter to simplify.

  • Use the RETURNING clause and insert the resulting identity_id into the OUT parameter directly. Simpler code and faster execution aside, there is an additional, subtle benefit: you get the value that was actually inserted. If there are triggers or rules on the table, this might differ from what you sent with the INSERT.

  • Assignments are relatively expensive in PL/pgSQL. Reduce those to a minimum for efficient code.
    You might remove the last remaining variable _current_ts as well, and do the calculation in the subquery, then you don't need a DECLARE at all. I left that one, since it might make sense to calculate it once, should the function loop multiple times ...

  • All that remains is one SQL command, wrapped into a LOOP to retry until success.

  • If there is a chance that your table might overflow (using all or most int4 numbers) - and strictly speaking, there is always the chance - I would add a counter and raise an exception after maybe 100 iterations to avoid endless loops.

Source Show
◀ Wstecz