Data 101 (Info 258): Data Engineering
UC Berkeley, Spring 2025

Announcements
Schedule
Week 1
- Tue 1/21
- Thu 1/23
- Lecture 2 SQL Review
- Course Notes
- Fri 1/24
- Project 0 SQL Review
- Due Fri 2/7, 5pm
Week 2
- Tue 1/28
- Thu 1/30
- Discussion 1 SQL Review
- Code, Solution
- Fri 1/31
- Homework 1
- Due Wed 2/12, 5pm
Week 3
- Tue 2/4
- Thu 2/6
- Discussion 2 Relational Algebra, Subqueries, CTEs, Joins
- Solution
- Fri 2/7
- Project 0 Due, 5pm
- Project 1 SQL
- Due Wed 2/19, 5pm
Week 4
- Tue 2/11
- Lecture 7 Optimizing for Performance I
- Course Notes
- Wed 2/12
- Homework 1 Due, 5pm
- Thu 2/13
- Lecture 8 Optimizing for Performance II
- Course Notes
- Discussion 3 DML/DDL, Bits Conversion, Query Performance
- Solution, Code
- Fri 2/14
- Homework 2
- Due Wed 2/26, 5pm
Week 5
- Tue 2/18
- Lecture 9 Optimizing for Performance III
- Course Notes
- Wed 2/19
- Project 1 Due, 5pm
- Thu 2/20
- Lecture 10 Data Modeling I
- Course Notes
- Discussion 4 Query Performance
- Solution, Code
- Fri 2/21
- Project 2 Query Performance
- Due Wed 3/5, 5pm
Week 6
- Tue 2/25
- Lecture 11 Data Preparation I: Structural
- Course Notes
- Wed 2/26
- Homework 2 Due, 5pm
- Thu 2/27
- Discussion 5 Data Models, Data Preparation
- Solution, Code
- Fri 2/28
- Homework 3
- Due Wed 3/19, 5pm
Week 7
- Tue 3/4
- Lecture 13 Data Preparation III: Outliers
- Course Notes
- Wed 3/5
- Project 2 Due, 5pm
- Thu 3/6
- Discussion 6 Window Functions, Data Granularity
- Solution, Code
Week 8
- Tue 3/11
- Wed 3/12
- Midterm Exam (6-8pm)
- Thu 3/13
- No Lecture
- No Discussion
Week 9
- Tue 3/18
- No Lecture
- Wed 3/19
- Homework 3 Due, 5pm
- Thu 3/20
- Lecture 16 Data Modeling II: Normalization + ER
- Discussion 7 Entity Resolution, ER Diagram, Hampel X84
- Solution, Code
- Fri 3/21
- Project 3 Data Transformation
- Due Fri 4/4, 5pm
Week 10
- All Week
- Spring Break
Week 11
- Mon 3/31
- Homework 4
- Due Wed 4/9, 5pm
- Tue 4/1
- Lecture 17 Semistructured Data: NoSQL, JSON, XML
- Thu 4/3
- Lecture 18 MongoDB I
- Discussion 8 TBA
- Solution
- Fri 4/4
- Project 3 Due, 5pm
- Project 4 Mongo
- Due Wed 4/16, 5pm
- Project 5 Optional* Final Project
- Checkpoint due Mon 4/21, 5pm
- Final Report due Fri 5/2, 5pm
Week 12
- Tue 4/8
- Lecture 19 MongoDB II
- Wed 4/9
- Homework 4 Due, 5pm
- Thu 4/10
- Lecture 20 Data Ops and Pipelines
- Discussion 9 TBA
- Solution
Week 13
- Tue 4/15
- Lecture 21 MapReduce, Sampling
- Wed 4/16
- Project 4 Due, 5pm
- Thu 4/17
- Lecture 22 Transactions, BI, OLAP
- Fri 4/18
- Homework 5
- Due Wed 4/30, 5pm
Week 14
- Mon 4/21
- Project 5 Checkpoint due, 5pm
- Tue 4/22
- Lecture 23 Transactions, BI, OLAP
- Thu 4/24
- Lecture 24 Spreadsheets
- Discussion 10 TBA
- Solution
Week 15
- Tue 4/29
- Lecture 25 Parallel and Distributed Computing, CAP Theorem, VMs
- End-of-Semester Form
- Wed 4/30
- Homework 5 Due, 5pm
- Thu 5/1
- Lecture 26 Graph Databases and Knowledge Bases
- Fri 5/2
- Project 5 Final Report due, 5pm
RRR Week
- All Week
- RRR Week
Finals Week
- Wed 5/14
- Final Exam (11:30am-2:30pm)