CYBERTEC PostgreSQL Logo

Embedded SQL in C for PostgreSQL with ecpg

03.2020 / Category: / Tags: |
embedded SQL taken literally
© Laurenz Albe 2020

 

Embedded SQL is by no means a new feature — in fact it is so old-fashioned that many people may not know about it at all. Still, it has lots of advantages for client code written in C. So I'd like to give a brief introduction and talk about its benefits and problems.

What is embedded SQL?

Typically you use a database from application code by calling the API functions or methods of a library. With embedded SQL, you put plain SQL code (decorated with “EXEC SQL”) smack in the middle of your program source. To turn that into correct syntax, you have to run a pre-processor on the code that converts the SQL statements into API function calls. Only then you can compile and run the program.

Embedded SQL is mostly used with old-fashioned compiled languages like C, Fortran, Pascal, PL/I or COBOL, but with SQLJ there is also a Java implementation. One reason for its wide adoption (at least in the past) is that it is specified by the SQL standard ISO/IEC 9075-2 (SQL/Foundation). This enables you to write fairly portable applications.

To be able to discuss the features in some more detail, I'll introduce a sample C program using embedded SQL.

Sample embedded SQL program

The sample program operates on a database table defined like this:

The program is in a file sample.pgc and looks like this:

Embedded SQL syntax explained

Each SQL statement or declaration starts with the magic words EXEC SQL and ends with a semicolon.

Most statements are translated by the preprocessor right where they are.

Connection and disconnection

There are several ways to specify a connect string to the database, and of course the value does not have to be hard-coded. You can also use CONNECT TO DEFAULT and use libpq environment variables and a password file to connect.

It is possible to open several database connections at once; then you have to name the connections. Connections are like global variables: they are available everywhere once opened, and you don't have to pass them to functions.

Host variables

Special “host variables” exchange data between the program and the SQL statements.

They have to be declared in the declare section:

This is just like any other C variable declaration, but only these variables can be used in SQL statements. They have to be prepended with a colon (:) in SQL statements:

The last variable, v_val_ind, is an indicator variable: it is set to a negative number to indicate a NULL value.

One advantage of embedded SQL is that the conversion between PostgreSQL data types and C data types is done automatically. With libpq, you can either get strings or the internal PostgreSQL representation. There is the additional (non-standard) libpgtypes library that provides convenient support for data types like timestamp or numeric.

Note that v_val was declared as char[81] so that it can contain any 20 UTF-8 characters. If you cannot predict the size of the result, you can use descriptor areas to exchange data.

Exception handling

If SQL statements cause an error or warning or return no data, you define the program behavior with WHENEVER declarations:

These slightly diverge from the SQL standard, which only has CONTINUE and GO TO label actions. Also, the standard uses SQLEXCEPTION instead of SQLERROR. The DO BREAK action inserts a break; statement to break out of the containing loop.

Different from other embedded SQL statements, these directives apply to all embedded SQL statements below them in the source file. They define how the pre-processor translates SQL statements and are independent of the control flow in the program.

To avoid recursion, it is best to set the action to CONTINUE (ignore) in the exception handler.

Note that this syntax allows to write code without adding error handling to every database call, which is convenient and makes the code concise and readable.

Interplay of embedded SQL with libpq

Some advanced or PostgreSQL-specific features of libpq are not available in embedded SQL, for example the Large Object API or, more importantly, COPY.

To use these, you can call the function ECPGget_PGconn(const char *connection_name), which returns the underlying libpq connection object.

Building the executable

The pre-processor is called ecpg and part of the PostgreSQL core distribution. By default, ecpg assumes that source files have the extension .pgc.

To compile the resulting C program, you can use any C compiler, and you have to link with the libecpg library.

Here is a Makefile that can be used to build the sample program with PostgreSQL v12 on RedHat-like systems:

Conclusion

We have only scratched the surface, but I hope I could demonstrate that embedded SQL is a convenient way of writing database client code in C.

Here is a list of advantages and disadvantages of embedded SQL compared to directly using the C API of libpq:

Disadvantages:

  • requires preprocessing
  • unknown to most present-day developers
  • less powerful than libpq (e.g., no direct COPY support)

Advantages:

  • standardized API for good portability between databases
  • more concise and readable code
  • simple but elegant exception handling
  • translation between C and SQL data types

Leave a Reply

Your email address will not be published. Required fields are marked *

CYBERTEC Logo white
Get the newest PostgreSQL Info & Tools


    This site is protected by reCAPTCHA and the Google Privacy Policy & Terms of Service apply.

    ©
    2024
    CYBERTEC PostgreSQL International GmbH
    phone-handsetmagnifiercrosscross-circle
    linkedin facebook pinterest youtube rss twitter instagram facebook-blank rss-blank linkedin-blank pinterest youtube twitter instagram