A sneak peek into the world of data visualisation
‘A picture is worth a thousand words’ is a common saying that holds especially true when we are trying to understand data, find relationships in it and interpret it.
Organisations that work with big data use data visualisation extensively to make sense of complex data. It supports activities such as ascertaining the factors that influence buying patterns, ranking sales of a commodity, comparing the expenses and sales of different departments, improving customer relationships and locating patterns of fraud. It also helps to identify areas that need improvement, and thus to improve the business. As data journalist and information designer David McCandless said in his TED talk, “By visualising information, we turn it into a landscape that you can explore with your eyes, a sort of information map. And when you’re lost in information, an information map is useful.”
Data visualisation can be considered ‘visual communication’. It is one of the steps in data science or data analysis and is concerned with communicating data clearly and efficiently. The human brain processes information efficiently when it is presented pictorially, as charts and graphs. Visual perception is handled by the visual cortex and is much faster than cognition, which is handled by the cerebral cortex. As in many fields, there are several challenges in this arena, such as finding relevant data, analysing it quickly, finding trends and, most importantly, displaying them in an appropriate manner. The main objectives of this exercise are data analysis and communication.
While rows and columns were used to depict data in the early days, it was the French philosopher and mathematician René Descartes who developed the two-dimensional coordinate system on which graphs for displaying values are based. William Playfair is credited with pioneering many of the graphs and diagrams, such as bar charts and pie charts, that are used extensively today. Edward Tufte, the author of The Visual Display of Quantitative Information, and William Cleveland extended and refined data visualisation techniques for statisticians. Traditional data visualisation tools include different types of bar charts, pie charts, histograms, historigrams and cartograms, among others.
Two sets of data
Stephen Few, a data visualisation expert, defines two types of data: categorical and quantitative. These can be used either on their own or in combination to create visual displays. Categorical data is used only to distinguish one item from another, such as a company ID number or a student’s registration number. Quantitative data refers to numerical values that can be measured and compared, such as a company’s annual expenditure or the marks obtained by a student.
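The distinction can be illustrated with a short Python sketch. The student records and field names below are hypothetical, made up purely for illustration: the registration number is treated as categorical (it can only be counted or used to tell students apart), while the marks are quantitative (they can be averaged and ranked).

```python
from collections import Counter
from statistics import mean

# Hypothetical records: "reg_no" is categorical, "marks" is quantitative.
students = [
    {"reg_no": "R101", "marks": 72},
    {"reg_no": "R102", "marks": 85},
    {"reg_no": "R103", "marks": 64},
]

# Categorical values only tell items apart, so counting is the natural operation.
reg_counts = Counter(s["reg_no"] for s in students)

# Quantitative values can be measured and compared, so averages and rankings make sense.
average_marks = mean(s["marks"] for s in students)
top_student = max(students, key=lambda s: s["marks"])["reg_no"]

print("Distinct registration numbers:", len(reg_counts))
print("Average marks:", round(average_marks, 2))
print("Highest marks:", top_student)
```

Averaging the registration numbers would be meaningless even if they were written as digits, which is the practical reason for keeping the two types apart.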
The main types of tools here are tables and graphs. A table presents quantitative data arranged in rows and columns, with categorical data serving as the labels. For example, a table may list the name of each company (categorical) against its annual sales (quantitative). A graph, also called a chart, encodes data values as visual objects such as lines or bars in order to show the relationships between them. Big data analytics relies on techniques that reduce the size and complexity of big data, and data visualisation is an important approach that has to be integrated with it.
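As a minimal sketch of how a table becomes a graph, the Python snippet below takes a small table of company names and annual sales and draws each value as a bar of proportional length. The company names, sales figures and the `text_bar_chart` helper are all hypothetical, chosen only for illustration; a real project would normally use a charting library instead.

```python
# Hypothetical table: company name (categorical label) and
# annual sales (quantitative value, in arbitrary units).
sales_table = [
    ("Alpha Ltd", 120),
    ("Beta Corp", 75),
    ("Gamma Inc", 30),
]

def text_bar_chart(rows, width=40):
    """Return one line per row, with a bar whose length is proportional to the value."""
    largest = max(value for _, value in rows)
    label_width = max(len(label) for label, _ in rows)
    lines = []
    for label, value in rows:
        # Encode the quantitative value as a visual object: a run of '#' characters.
        bar = "#" * round(value / largest * width)
        lines.append(f"{label.ljust(label_width)} | {bar} {value}")
    return lines

for line in text_bar_chart(sales_table):
    print(line)
```

The categorical labels stay as text on the left, while only the quantitative values are turned into bar lengths, which mirrors the role each type of data plays in a table and in a graph.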
Visualisation can be static or interactive. Data with a lot of diversity and heterogeneity is a major challenge in this process. A key point to note, however, is that one must be thoroughly conversant with the various tools and their usage. Not everyone can distinguish between categorical and quantitative data, and many end up applying the wrong tool. The data should also be ‘cleaned’ and analysed before these tools are applied.
There are several career options in this field, including data scientist, data manager, statistician and data analyst. Several universities offer, or are contemplating, a degree in Data Science. Students who wish to embark on a career in data analysis can also choose from a number of online courses, which are generally offered as part of data analysis and interpretation specialisations.
(The author is associate professor, Christ University, Bengaluru)