A step above its traditional predecessor is semi-structured data, which arrived in response to the rigidity of table-based formats. Semi-structured data retains some organizational elements of structured data but removes the traditional tabular constraints. This type of data drove the growth and popularity of NoSQL databases such as Cassandra, MongoDB and Redis, which were designed to manage more flexible data structures.
This brings us to unstructured data, which has overwhelmingly become the most common type of data. As its name signifies, unstructured data can come in any form or format, varies widely in size, and creates complex semantic relationships. Thus, unstructured data requires a much different approach to processing and management.
Taking a deeper look at semantic complexity, consider three different photos of the same object. Even though the raw data behind each of these photos could vary widely — file size, number of pixels, resolution, and so on — their semantic meaning is the same. Therein lies the challenge with modern data management. What’s the best way to store, search, and analyze content not based on their technical characteristics but on their meaning?