6. Semistructured Data

In previous lectures, we worked with relational data, where the schema is well defined and every row has the same fixed set of attributes (columns). Each attribute has a simple type (e.g., int), and there is no nesting. But not all data is so well-formed. In recent years, more unstructured data has been stored than structured data, because storage costs have fallen and logging unstructured data is easy. How do we work with this kind of data?
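
To make the contrast concrete, here is a minimal sketch in Python. The records and field names (`id`, `name`, `age`, `address`, `phones`) are invented for illustration and do not come from the lecture. The point is only that a relational row has a fixed, flat set of simply typed columns, while a less well-formed record may nest values and may add or omit fields.

```python
import json

# A relational-style row: fixed columns, simple types, no nesting.
# (Column names are hypothetical, chosen only for illustration.)
columns = ("id", "name", "age")
row = (1, "Ada", 36)

# Records that are not so well-formed: fields may nest, repeat, or be missing.
records_json = """
[
  {"id": 1, "name": "Ada",
   "address": {"city": "London"},
   "phones": ["555-0100", "555-0101"]},
  {"id": 2, "name": "Grace", "age": 45}
]
"""
records = json.loads(records_json)

# Every relational row has exactly one value per column...
assert len(row) == len(columns)

# ...but each of these records carries its own set of fields.
field_sets = [sorted(r.keys()) for r in records]

# Reading a nested, optional field needs a default for records that lack it.
cities = [r.get("address", {}).get("city") for r in records]

print(field_sets)
print(cities)
```

Note that the second record has no `address` at all, so code that reads it has to decide what a missing value means. A relational schema would have forced that decision when the table was defined.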