Data 101 (Info 258): Data Engineering
UC Berkeley, Spring 2025
Announcements
Schedule
Week 1
- Tue 1/21
- Thu 1/23
-
- Lecture 2 SQL Review
- Course Notes
- Fri 1/24
-
- Project 0 SQL Review
- Due Fri 2/7, 5pm
Week 2
- Tue 1/28
- Thu 1/30
-
- Discussion 1 SQL Review
- Code, Solution
- Fri 1/31
-
- Homework 1
- Due Wed 2/12, 5pm
Week 3
- Tue 2/4
- Lecture 5 DML, DDL, Referential Integrity, Constraints
- Thu 2/6
- Lecture 6 Bits/memory model, Performance Tuning, Index Selection
-
- Discussion 2 TBA
- Solution
- Fri 2/7
- Project 0 Due, 5pm
-
- Project 1 SQL
- Due Wed 2/19, 5pm
Week 4
- Tue 2/11
- Lecture 7 Optimizing for Performance I
- Wed 2/12
- Homework 1 Due, 5pm
- Thu 2/13
- Lecture 8 Optimizing for Performance II
-
- Discussion 3 TBA
- Solution
- Fri 2/14
-
- Homework 2
- Due Wed 2/26, 5pm
Week 5
- Tue 2/18
- Lecture 9 Data Modeling I: Relations, Tensors, Dataframes
- Wed 2/19
- Project 1 Due, 5pm
- Thu 2/20
- Lecture 10 Data Preparation I: Structural
-
- Discussion 4 TBA
- Solution
- Fri 2/21
-
- Project 2 Query Performance
- Due Wed 3/5, 5pm
Week 6
- Tue 2/25
- Lecture 11 Data Preparation II: Numerical, Granularity, Window Functions
- Wed 2/26
- Homework 2 Due, 5pm
- Thu 2/27
- Lecture 12 Data Preparation III: Outliers
-
- Discussion 5 TBA
- Solution
- Fri 2/28
-
- Homework 3
- Due Wed 3/19, 5pm
Week 7
- Tue 3/4
- Lecture 13 Data Preparation IV: Imputation, Entity Resolution
- Wed 3/5
- Project 2 Due, 5pm
- Thu 3/6
- Lecture 14 Data Modeling II: Normalization + ER
-
- Discussion 6 TBA
- Solution
Week 8
- Tue 3/11
- Lecture 15 Backup Lecture/Review
- Wed 3/12
- Midterm 6-8 pm
- Thu 3/13
- Lecture 16 No Class
- Fri 3/14
- Mid-semester Survey
Week 9
- Tue 3/18
- Lecture 17 Semistructured Data: NoSQL, JSON, XML
- Wed 3/19
- Homework 3 Due, 5pm
- Thu 3/20
- Lecture 18 MongoDB I
-
- Discussion 7 TBA
- Solution
- Fri 3/21
-
- Project 3 Data Transformation
- Due Fri 4/4, 5pm
Week 10
- All Week
- Spring Break
Week 11
- Mon 3/31
-
- Homework 4
- Due Wed 4/9, 5pm
- Tue 4/1
- Lecture 19 MongoDB II
- Thu 4/3
- Lecture 20 Data Ops and Pipelines
-
- Discussion 8 TBA
- Solution
- Fri 4/4
- Project 3 Due, 5pm
-
- Project 4 Mongo
- Due Wed 4/16, 5pm
-
- Project 5 Optional* Final Project
- Checkpoint due Mon 4/21, 5pm
- Final Report due Fri 5/2, 5pm
Week 12
- Tue 4/8
- Lecture 21 MapReduce, Sampling
- Wed 4/9
- Homework 4 Due, 5pm
- Thu 4/10
- Lecture 22 Transactions and TCL
-
- Discussion 9 TBA
- Solution
Week 13
- Tue 4/15
- Lecture 23 Transactions, BI, OLAP
- Wed 4/16
- Project 4 Due, 5pm
- Thu 4/17
- Lecture 24 Spreadsheets
- Fri 4/18
-
- Homework 5
- Due Wed 4/30, 5pm
Week 14
- Mon 4/21
- Project 5 Checkpoint due, 5pm
- Tue 4/22
- Lecture 25 Parallel and Distributed Computing, CAP Theorem, VMs
- Thu 4/24
- Lecture 26 Graph Databases and Knowledge Bases
-
- Discussion 10 TBA
- Solution
Week 15
- Tue 4/29
- Lecture 27 TBA
- Wed 4/30
- Homework 5 Due, 5pm
- Thu 5/1
- Lecture 28 Guest Lecture, Closing Thoughts
- Fri 5/2
- Project 5 Final Report due, 5pm
RRR Week
- All Week
- RRR Week
Finals Week
- Wed 5/14
- Final Exam 11:30am - 2:30pm