Practical Big Data Analytics
上QQ阅读APP看书,第一时间看更新

Structured

Structured data, as the name implies, indicates datasets that have a defined organizational structure such as Microsoft Excel or CSV files. In pure database terms, the data should be representable using a schema. As an example, the following table representing the top five happiest countries in the world published by the United Nations in its 2017 World Happiness Index ranking would be an atypical representation of structured data.

We can clearly define the data types of the columns--Rank, Score, GDP per capita, Social support, Healthy life expectancy, Trust, Generosity, and Dystopia are numerical columns, whereas Country is represented using letters, or more specifically, strings.

Refer to the following table for a little more clarity:

World Happiness Report, 2017 [Source: https://en.wikipedia.org/wiki/World_Happiness_Report#cite_note-4]

Commercial databases such as Teradata, Greenplum as well as Redis, Cassandra, and Hive in the open source domain are examples of technologies that provide the ability to manage and query structured data.