Lecture 05 -- DDL, DML & Constraints¶

In [1]:
# Run this cell to set up imports
import numpy as np
import pandas as pd
In [2]:
%reload_ext sql
In [3]:
!createdb lecture5 -h localhost
createdb: error: database creation failed: ERROR:  database "lecture5" already exists
In [4]:
%sql postgresql://127.0.0.1:5432/lecture5
Connecting to 'postgresql://127.0.0.1:5432/lecture5'

Manager demo¶

Creating the manager relation

In [5]:
%%sql

DROP TABLE IF EXISTS managers;
CREATE TABLE managers (
    manager_name VARCHAR(20),
    age INTEGER,
    address TEXT,
    manager_id INTEGER
);
SELECT * FROM managers;
Running query in 'postgresql://127.0.0.1:5432/lecture5'
Out[5]:
manager_name age address manager_id



Add a tuple

In [6]:
%%sql
INSERT INTO managers
    (manager_id, age, manager_name)
VALUES
    (1, 23, 'Aditya'),
    (2, 23, 'Lisa')
;

/***********/
SELECT * FROM managers;
Running query in 'postgresql://127.0.0.1:5432/lecture5'
2 rows affected.
2 rows affected.
Out[6]:
manager_name age address manager_id
Aditya 23 None 1
Lisa 23 None 2

What if we add a new tuple?

In [7]:
%%sql
INSERT INTO managers
    (manager_id, manager_name, age)
VALUES
    (1, 'Michael', 123);

/***********/
SELECT * FROM managers;
Running query in 'postgresql://127.0.0.1:5432/lecture5'
1 rows affected.
3 rows affected.
Out[7]:
manager_name age address manager_id
Aditya 23 None 1
Lisa 23 None 2
Michael 123 None 1

Whoops! We can have two tuples with the same ID!

Alternate Managers Schema

One way to to fix is to let the system set the ID. This replaces the manager_id type with a SERIAL instead of an INTEGER.

In [8]:
%%sql
DROP TABLE IF EXISTS managers;
CREATE TABLE managers(
    manager_name VARCHAR(20),
    age INTEGER,
    address TEXT,
    manager_id SERIAL
);

/***********/
SELECT * FROM managers;
Running query in 'postgresql://127.0.0.1:5432/lecture5'
Out[8]:
manager_name age address manager_id
In [9]:
%%sql
INSERT INTO managers
    (age, manager_name)
VALUES
    (23, 'Aditya'),
    (23, 'Lisa')
;

SELECT * FROM managers;
Running query in 'postgresql://127.0.0.1:5432/lecture5'
2 rows affected.
2 rows affected.
Out[9]:
manager_name age address manager_id
Aditya 23 None 1
Lisa 23 None 2



And we can add another tuple...

In [10]:
%%sql
INSERT INTO managers
    (manager_name, age)
VALUES
    ('Michael', 123);

/***********/
SELECT * FROM managers;
Running query in 'postgresql://127.0.0.1:5432/lecture5'
1 rows affected.
3 rows affected.
Out[10]:
manager_name age address manager_id
Aditya 23 None 1
Lisa 23 None 2
Michael 123 None 3

Adding in Constraints¶

In [11]:
%%sql

DROP TABLE IF EXISTS managers;
CREATE TABLE managers (
    manager_name VARCHAR(20) UNIQUE,
    age INTEGER DEFAULT 50,
    address TEXT,
    manager_id SERIAL PRIMARY KEY
);
SELECT * FROM managers;
Running query in 'postgresql://127.0.0.1:5432/lecture5'
Out[11]:
manager_name age address manager_id
In [12]:
%%sql
INSERT INTO managers
    (age, manager_name)
VALUES
    (23, 'Aditya'),
    (23, 'Lisa')
;

SELECT * FROM managers;
Running query in 'postgresql://127.0.0.1:5432/lecture5'
2 rows affected.
2 rows affected.
Out[12]:
manager_name age address manager_id
Aditya 23 None 1
Lisa 23 None 2
In [13]:
%%sql
INSERT INTO managers
    (manager_name)
VALUES
    ('Michael');

SELECT * FROM managers;
Running query in 'postgresql://127.0.0.1:5432/lecture5'
1 rows affected.
3 rows affected.
Out[13]:
manager_name age address manager_id
Aditya 23 None 1
Lisa 23 None 2
Michael 50 None 3

Can't insert another Michael due to the UNIQUE constraint!

In [14]:
%%sql
INSERT INTO managers
    (manager_name)
VALUES
    ('Michael');

SELECT * FROM managers;
Running query in 'postgresql://127.0.0.1:5432/lecture5'
RuntimeError: (psycopg2.errors.UniqueViolation) duplicate key value violates unique constraint "managers_manager_name_key"
DETAIL:  Key (manager_name)=(Michael) already exists.

[SQL: INSERT INTO managers
    (manager_name)
VALUES
    ('Michael');]
(Background on this error at: https://sqlalche.me/e/20/gkpj)
If you need help solving this issue, send us a message: https://ploomber.io/community

OK what if we update Michael's ID?

In [15]:
%%sql
UPDATE managers
SET manager_id = manager_id + 1 
WHERE manager_name = 'Michael';

SELECT * FROM managers;
Running query in 'postgresql://127.0.0.1:5432/lecture5'
1 rows affected.
3 rows affected.
Out[15]:
manager_name age address manager_id
Aditya 23 None 1
Lisa 23 None 2
Michael 50 None 4

This works, but will this work?

In [16]:
%%sql
UPDATE managers
SET manager_id = manager_id - 1 
WHERE manager_name = 'Lisa';

SELECT * FROM managers;
Running query in 'postgresql://127.0.0.1:5432/lecture5'
RuntimeError: (psycopg2.errors.UniqueViolation) duplicate key value violates unique constraint "managers_pkey"
DETAIL:  Key (manager_id)=(1) already exists.

[SQL: UPDATE managers
SET manager_id = manager_id - 1
WHERE manager_name = 'Lisa';]
(Background on this error at: https://sqlalche.me/e/20/gkpj)
If you need help solving this issue, send us a message: https://ploomber.io/community

ALTER TABLES¶

In [17]:
%%sql
ALTER TABLE managers
    DROP address,
    DROP age,
    ADD income REAL DEFAULT 1000000.0;

/***********/
SELECT * FROM managers;
Running query in 'postgresql://127.0.0.1:5432/lecture5'
3 rows affected.
Out[17]:
manager_name manager_id income
Aditya 1 1000000.0
Lisa 2 1000000.0
Michael 4 1000000.0

Beware of UNIQUE and NULL values!!

In [18]:
%%sql

DROP TABLE IF EXISTS managers;
CREATE TABLE managers (
    manager_name VARCHAR(20) UNIQUE,
    age INTEGER DEFAULT 50,
    address TEXT,
    manager_id SERIAL PRIMARY KEY
);
SELECT * FROM managers;
Running query in 'postgresql://127.0.0.1:5432/lecture5'
Out[18]:
manager_name age address manager_id
In [19]:
%%sql
INSERT INTO managers(age)
VALUES
    (100),
    (NULL);
Running query in 'postgresql://127.0.0.1:5432/lecture5'
2 rows affected.
Out[19]:
In [20]:
%%sql
SELECT * FROM managers;
Running query in 'postgresql://127.0.0.1:5432/lecture5'
2 rows affected.
Out[20]:
manager_name age address manager_id
None 100 None 1
None None None 2

Notice Even though we can't have duplicate names, NULL isn't considered a duplicate.




Delete the relation entirely

In [21]:
%%sql
DROP TABLE managers;
Running query in 'postgresql://127.0.0.1:5432/lecture5'
Out[21]:

IMDB Database¶

In [22]:
%reload_ext sql
In [23]:
%sql postgresql://127.0.0.1:5432/lecture5
In [24]:
%%sql
DROP TABLE IF EXISTS actors CASCADE;
DROP TABLE IF EXISTS movies CASCADE;
DROP TABLE IF EXISTS cast_info;

CREATE TABLE actors (
  id INTEGER,
  name TEXT,
  PRIMARY KEY(id)
);

CREATE TABLE movies (
  id INTEGER,
  title TEXT,
  PRIMARY KEY(id)
);

CREATE TABLE cast_info (
  person_id INTEGER,
  movie_id INTEGER,
  FOREIGN KEY (person_id)
    REFERENCES actors (id)
    ON DELETE SET NULL
    ON UPDATE CASCADE,
  FOREIGN KEY (movie_id)
    REFERENCES movies (id)
    ON DELETE SET NULL);

INSERT INTO actors VALUES
    (1, 'Tom Hanks'),
    (3, 'Michelle Yeoh')
;

INSERT INTO movies VALUES
    (23, 'Forrest Gump'),
    (45, 'Tomorrow Never Dies')
;

INSERT INTO cast_info VALUES
    (1, 23),
    (3, 45)
;
Running query in 'postgresql://127.0.0.1:5432/lecture5'
2 rows affected.
2 rows affected.
2 rows affected.
Out[24]:
In [25]:
%%sql
SELECT * FROM cast_info;
Running query in 'postgresql://127.0.0.1:5432/lecture5'
2 rows affected.
Out[25]:
person_id movie_id
1 23
3 45

A. What if we… Delete a tuple from Actor corresponding to id = 1?

In [ ]:
%%sql
DELETE FROM actors
WHERE id = 1;

SELECT * FROM cast_info;

(note: reset table layout for each choice)

B. Change a tuple in Actor from id = 1 to id = 2?

In [ ]:
%%sql
UPDATE actors
SET id = 2
WHERE id = 1;

SELECT * FROM actors;
In [ ]:
%%sql
SELECT * FROM cast_info;

C. Delete a tuple from Movie corresponding to id = 23?

In [ ]:
%%sql
DELETE FROM movies
WHERE id = 23;

SELECT * FROM cast_info;

D. Change a tuple in Movie from id = 23 to id = 24?

In [ ]:
%%sql
UPDATE movies
SET id = 24
WHERE id = 23;

E. Insert a tuple into cast_info that adds a new person_id not found in Actor?

In [ ]:
%%sql
INSERT INTO cast_info VALUES
    (467, 23)
;
In [ ]: