This is a variation of the IMDB database with keys defined. Note that this is a pretty big database! So if you run the below lines, please also remember to delete the imdb_perf_lecture
afterwards to save space on your limited postgreSQL server.
If you didn't load it in with a previous lecture, load in the imdb_perf_lecture
database.
!unzip -u ../lec07/data/imdb_perf_lecture.zip -d ../lec07/data/
Archive: ../lec07/data/imdb_perf_lecture.zip
!psql -h localhost -c 'DROP DATABASE IF EXISTS imdb_perf_lecture'
!psql -h localhost -c 'CREATE DATABASE imdb_perf_lecture'
!psql -h localhost -d imdb_perf_lecture -f ../lec07/data/imdb_perf_lecture.sql
DROP DATABASE CREATE DATABASE SET SET SET SET SET set_config ------------ (1 row) SET SET SET SET SET SET CREATE TABLE psql:../lec07/data/imdb_perf_lecture.sql:33: ERROR: role "yanlisa" does not exist CREATE TABLE psql:../lec07/data/imdb_perf_lecture.sql:45: ERROR: role "yanlisa" does not exist CREATE TABLE psql:../lec07/data/imdb_perf_lecture.sql:59: ERROR: role "yanlisa" does not exist COPY 845888 COPY 2211936 COPY 656453 ALTER TABLE ALTER TABLE ALTER TABLE ALTER TABLE
%reload_ext sql
%sql postgresql://127.0.0.1:5432/imdb_perf_lecture
Example 1
%sql EXPLAIN ANALYZE SELECT * FROM Actor;
QUERY PLAN |
---|
Seq Scan on actor (cost=0.00..13684.88 rows=845888 width=18) (actual time=0.047..56.311 rows=845888 loops=1) |
Planning Time: 0.275 ms |
Execution Time: 80.765 ms |
Notice ANALYZE
:
Notice EXPLAIN
:
Example 2: just planning (no execution)
%sql EXPLAIN SELECT * FROM Actor;
QUERY PLAN |
---|
Seq Scan on actor (cost=0.00..13684.88 rows=845888 width=18) |
Example 3
%sql EXPLAIN ANALYZE SELECT id FROM Actor;
QUERY PLAN |
---|
Seq Scan on actor (cost=0.00..13684.88 rows=845888 width=4) (actual time=0.063..79.801 rows=845888 loops=1) |
Planning Time: 0.085 ms |
Execution Time: 106.623 ms |
Notice:
Example 4:
%sql EXPLAIN ANALYZE SELECT id FROM Actor WHERE id > 4000000;
QUERY PLAN |
---|
Index Only Scan using actor_pkey on actor (cost=0.42..12681.33 rows=445080 width=4) (actual time=0.104..52.051 rows=444781 loops=1) |
Index Cond: (id > 4000000) |
Heap Fetches: 0 |
Planning Time: 0.077 ms |
Execution Time: 64.745 ms |
Notice:
Example 5:
%sql EXPLAIN ANALYZE SELECT id, name FROM Actor WHERE id > 4000000;
QUERY PLAN |
---|
Seq Scan on actor (cost=0.00..15799.60 rows=445080 width=18) (actual time=0.274..71.002 rows=444781 loops=1) |
Filter: (id > 4000000) |
Rows Removed by Filter: 401107 |
Planning Time: 0.075 ms |
Execution Time: 84.819 ms |
Notice:
%%sql
EXPLAIN ANALYZE -- 1
SELECT id
FROM Actor
WHERE id > 4000000 AND name='Tom Hanks';
QUERY PLAN |
---|
Gather (cost=1000.00..11512.90 rows=1 width=4) (actual time=25.958..28.263 rows=0 loops=1) |
Workers Planned: 2 |
Workers Launched: 2 |
-> Parallel Seq Scan on actor (cost=0.00..10512.80 rows=1 width=4) (actual time=23.245..23.246 rows=0 loops=3) |
Filter: ((id > 4000000) AND (name = 'Tom Hanks'::text)) |
Rows Removed by Filter: 281963 |
Planning Time: 0.098 ms |
Execution Time: 28.282 ms |
%%sql
EXPLAIN ANALYZE -- 2
SELECT id
FROM Actor
WHERE id < 4000000 AND name='Tom Hanks';
UsageError: Unrecognized argument(s): -- If you need help solving this issue, send us a message: https://ploomber.io/community
%%sql
EXPLAIN ANALYZE -- 3
SELECT id
FROM Actor;
QUERY PLAN |
---|
Seq Scan on actor (cost=0.00..13684.88 rows=845888 width=4) (actual time=0.073..80.670 rows=845888 loops=1) |
Planning Time: 0.059 ms |
Execution Time: 107.513 ms |
%%sql
EXPLAIN ANALYZE -- 4
SELECT id
FROM Actor
LIMIT 10;
QUERY PLAN |
---|
Limit (cost=0.00..0.16 rows=10 width=4) (actual time=0.017..0.019 rows=10 loops=1) |
-> Seq Scan on actor (cost=0.00..13684.88 rows=845888 width=4) (actual time=0.016..0.017 rows=10 loops=1) |
Planning Time: 0.051 ms |
Execution Time: 0.029 ms |
%config SqlMagic.displaylimit = None
%%sql
EXPLAIN ANALYZE -- 5
SELECT id
FROM Actor
ORDER BY name
LIMIT 10;
QUERY PLAN |
---|
Limit (cost=17366.94..17368.11 rows=10 width=18) (actual time=116.900..119.783 rows=10 loops=1) |
-> Gather Merge (cost=17366.94..99611.72 rows=704906 width=18) (actual time=116.897..119.779 rows=10 loops=1) |
Workers Planned: 2 |
Workers Launched: 2 |
-> Sort (cost=16366.92..17248.05 rows=352453 width=18) (actual time=114.470..114.471 rows=7 loops=3) |
Sort Key: name |
Sort Method: top-N heapsort Memory: 26kB |
Worker 0: Sort Method: top-N heapsort Memory: 26kB |
Worker 1: Sort Method: top-N heapsort Memory: 25kB |
-> Parallel Seq Scan on actor (cost=0.00..8750.53 rows=352453 width=18) (actual time=0.020..58.439 rows=281963 loops=3) |
Planning Time: 0.062 ms |
Execution Time: 119.808 ms |