The SQL DDL (data definition language) statements could look like this:

    CREATE TABLE product (
      product_id serial PRIMARY KEY  -- implicit primary key constraint
    , product    text NOT NULL
    , price      numeric NOT NULL DEFAULT 0
    );

    CREATE TABLE bill (
      bill_id  serial PRIMARY KEY
    , bill     text NOT NULL
    , billdate date NOT NULL DEFAULT CURRENT_DATE
    );

    CREATE TABLE bill_product (
      bill_id    int REFERENCES bill (bill_id) ON UPDATE CASCADE ON DELETE CASCADE
    , product_id int REFERENCES product (product_id) ON UPDATE CASCADE
    , amount     numeric NOT NULL DEFAULT 1
    , CONSTRAINT bill_product_pkey PRIMARY KEY (bill_id, product_id)  -- explicit pk
    );

I made a few adjustments:
The n:m relationship is normally implemented by a separate table, `bill_product` in this case.
I added `serial` columns as surrogate primary keys. In Postgres 10 or later, consider an `IDENTITY` column instead. See:
- Safely rename tables using serial primary key columns
- Auto increment table column
I highly recommend that, because the name of a product is hardly unique (not a good “natural key”). Also, enforcing uniqueness and referencing the column in foreign keys is typically cheaper with a 4-byte `integer` (or even an 8-byte `bigint`) than with a string.
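
As a rough sketch of the `IDENTITY` alternative mentioned above, assuming Postgres 10 or later, the `product` table could be declared like this. `GENERATED ALWAYS` is only one of the two variants; `GENERATED BY DEFAULT` accepts manually supplied values without an override, which is closer to how `serial` behaves.

```sql
-- Variant of the product table above with an identity column instead of serial.
-- Assumes Postgres 10 or later.
CREATE TABLE product (
  product_id int GENERATED ALWAYS AS IDENTITY PRIMARY KEY
, product    text NOT NULL
, price      numeric NOT NULL DEFAULT 0
);
```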
Don’t use names of basic data types like `date` as identifiers. While this is possible, it is bad style and leads to confusing errors and error messages. Use legal, lower-case, unquoted identifiers. Never use reserved words, and avoid double-quoted mixed-case identifiers if you can.
“name” is not a good name. I renamed the `name` column of the table `product` to `product` (or `product_name` or similar). That is a better naming convention. Otherwise, when you join a couple of tables in a query, which you do a lot in a relational database, you end up with multiple columns named “name” and have to use column aliases to sort out the mess. That’s not helpful. Another widespread anti-pattern is using just “id” as a column name.
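
To illustrate the naming point, here is a sketch of a typical join over the three tables defined above. Because every column has a descriptive name, the result needs no column aliases. The table aliases `b`, `bp` and `p` are arbitrary choices.

```sql
-- Descriptive column names stay unambiguous in the result of a join.
SELECT b.bill_id, b.bill, b.billdate, p.product, p.price, bp.amount
FROM   bill         b
JOIN   bill_product bp USING (bill_id)
JOIN   product      p  USING (product_id)
ORDER  BY b.bill_id, p.product;
```

Matching column names like `bill_id` and `product_id` on both sides of a foreign key also allow the short `USING` join syntax.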
I am not sure what the name of a bill would be. `bill_id` will probably suffice in this case.
`price` is of data type `numeric` to store fractional numbers precisely as entered (arbitrary precision type instead of floating point type). If you deal with whole numbers exclusively, make that `integer`. For example, you could save prices as cents.
The `amount` (“Products” in your question) goes into the linking table `bill_product` and is of type `numeric` as well. Again, use `integer` if you deal with whole numbers exclusively.
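
A small query shows why `numeric` is the safer choice for prices and amounts. It compares the same sum in floating point (`float8`) and in `numeric`.

```sql
-- Binary floating point cannot represent 0.1, 0.2 or 0.3 exactly;
-- numeric stores the decimal digits as entered.
SELECT 0.1::float8  + 0.2::float8  = 0.3::float8  AS float_equal    -- false
     , 0.1::numeric + 0.2::numeric = 0.3::numeric AS numeric_equal; -- true
```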
You see the foreign keys in `bill_product`? I created both to cascade changes: `ON UPDATE CASCADE`. If a `product_id` or `bill_id` should change, the change is cascaded to all depending entries in `bill_product` and nothing breaks. Those are just references without significance of their own.
I also used `ON DELETE CASCADE` for `bill_id`: if a bill gets deleted, its details die with it.
Not so for products: You don’t want to delete a product that’s used in a bill. Postgres will throw an error if you attempt this. You would add another column to `product` to mark obsolete rows (“soft-delete”) instead.
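
The foreign key behavior described above can be tried out with a few statements, run one at a time against freshly created tables. The sample values, the assumption that generated IDs start at 1, and the column name `obsolete` are illustrative only.

```sql
-- Sample rows; IDs are assumed to start at 1 in freshly created tables.
INSERT INTO product (product, price) VALUES ('apple', 0.50);
INSERT INTO bill (bill) VALUES ('first bill');
INSERT INTO bill_product (bill_id, product_id, amount) VALUES (1, 1, 3);

-- Fails with a foreign key violation: the product is still referenced,
-- and the FK on product_id has no ON DELETE CASCADE.
DELETE FROM product WHERE product_id = 1;

-- Succeeds and also removes the matching rows in bill_product.
DELETE FROM bill WHERE bill_id = 1;

-- Hypothetical soft-delete flag; the column name "obsolete" is an assumption.
ALTER TABLE product ADD COLUMN obsolete boolean NOT NULL DEFAULT false;
UPDATE product SET obsolete = true WHERE product_id = 1;
```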
All columns in this basic example end up being `NOT NULL`, so `NULL` values are not allowed. (Yes, all columns: primary key columns are defined `UNIQUE NOT NULL` automatically.) That’s because `NULL` values wouldn’t make sense in any of the columns. It makes a beginner’s life easier. But you won’t get away so easily; you need to understand `NULL` handling anyway. Additional columns might allow `NULL` values, functions and joins can introduce `NULL` values in queries, etc.
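
Two short queries sketch how `NULL` shows up even with this schema. The first shows that comparing with `NULL` yields `NULL`. The second shows an outer join producing `NULL` for columns that are themselves declared `NOT NULL`.

```sql
-- A comparison with NULL yields NULL, not true or false. Use IS NULL to test.
SELECT NULL = NULL  AS equals_null   -- NULL
     , NULL IS NULL AS is_null;      -- true

-- Products that appear on no bill get NULL for the bill_product columns.
-- COALESCE replaces the missing amount with 0.
SELECT p.product, bp.bill_id, COALESCE(bp.amount, 0) AS amount
FROM   product p
LEFT   JOIN bill_product bp USING (product_id);
```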
Read the chapter on `CREATE TABLE` in the manual.
Primary keys are implemented with a unique index on the key columns, which makes queries with conditions on the PK column(s) fast. However, the sequence of key columns is relevant in multicolumn keys. Since the PK on `bill_product` is on `(bill_id, product_id)` in my example, you may want to add another index on just `product_id` or on `(product_id, bill_id)` if you have queries looking for a given `product_id` and no `bill_id`. See:
- PostgreSQL composite primary key
- Is a composite index also good for queries on the first field?
- Working of indexes in PostgreSQL
Read the chapter on indexes in the manual.
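
A minimal sketch of such an additional index on `bill_product`, with the key columns in reversed order. The index name is an arbitrary choice, and whether the second column is worth including depends on your queries.

```sql
-- Supports queries that filter on product_id without a bill_id.
CREATE INDEX bill_product_product_id_idx ON bill_product (product_id, bill_id);

-- Example of a query that can use this index:
SELECT bill_id, amount
FROM   bill_product
WHERE  product_id = 1;
```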