Big Data: the new leading frontier

What is Big Data? Well, if you have ever tweeted, posted a comment on a blog or created an online review — congratulations, you have helped create Big Data. There is no single accepted definition of Big Data yet, but many academics and Big Data practitioners agree that Big Data is characterised by the widely used 3 ‘V’s approach suggested by Gartner Research. The first ‘V’ is volume, which is where the ‘Big’ in Big Data comes from.

This is data in volumes that are difficult to imagine and difficult to handle in conventional business systems. Try to visualise, for example, the volume of data created by social media. The second ‘V’ is velocity, the (high) speed at which data is created, processed and analysed. It is no coincidence that Big Data is associated with real-time analytics. Think about what happens when a topic starts to trend on Twitter. The third ‘V’ is variety.

Beyond social media

We are all familiar with standard business data: think of the kind of data that you enter when you create an order, register as a customer or sign up as a student. This kind of data is known as structured data because it can be put into a structure which can easily be stored in relational databases, the kind of databases most widely used for traditional business data. Now think of the data created through social media, in blogs, Facebook posts or Snapchat for example, and think also of all the other kinds of data used in organisations.

These types of data have different structures and formats and are more difficult, sometimes impracticable, to store in a traditional business database. The data in Big Data comes in all shapes and formats, including structured data, and working with Big Data means handling a variety of data formats and structures. Social media is the eye-catching, headline-grabbing part of Big Data, but Big Data is about a lot more than social media.
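The difference between structured data and the more varied data described above can be illustrated with a small Python sketch. It is only an illustration under stated assumptions: the customer table, the two social media posts and their field names are all invented for the example and do not come from any real system.

```python
import json
import sqlite3

# Structured data: every customer record has the same fixed columns,
# so it fits naturally into a relational table (here an in-memory SQLite one).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")
conn.execute(
    "INSERT INTO customer (name, email) VALUES (?, ?)",
    ("Ada", "ada@example.com"),
)
rows = conn.execute("SELECT id, name, email FROM customer").fetchall()

# Varied data: two hypothetical social media posts whose fields differ
# from one record to the next, so no single fixed set of columns fits both.
posts = [
    json.loads('{"user": "ada", "text": "Loving the new phone", "hashtags": ["tech"]}'),
    json.loads('{"user": "bob", "image_url": "http://example.com/cat.jpg", "likes": 12}'),
]


def field_names(records):
    """Return the set of every field name seen across the records."""
    names = set()
    for record in records:
        names.update(record.keys())
    return names


all_fields = field_names(posts)                              # every field seen in any post
shared_fields = set.intersection(*(set(p) for p in posts))   # fields present in every post

print(rows)
print(sorted(all_fields))
print(sorted(shared_fields))
```

The customer rows all share the same three columns, while the two posts have only one field in common. Forcing the posts into a single relational table would mean a column for every field that might ever appear, most of them empty for any given post.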

Big Data can be data created from sensors which track the movement of objects or changes in the environment, such as temperature fluctuations, or astronomy data. In the world of the Internet of Things, where devices are connected and wearables create data, Big Data approaches are used to manage and analyse data. Big Data includes data from a whole range of fields such as flight data, population data, financial and health data, and this brings us to another ‘V’, value, which has been proposed by a number of researchers.

What are the uses of Big Data? Most of us are probably familiar with the idea that social media is analysed by advertisers and used to promote products and events but Big Data has many other uses. It can also be used to assess risk in the insurance industry and to track reactions to products in real time. Big Data is also used to monitor things as diverse as wave movements, traffic data, financial transactions, health and crime. The challenge of Big Data is how to use it to create something that is of value to the user. How can we gather Big Data, how can we store it, process it and analyse it to turn the raw data into information to support decision making?

Many opportunities

Some of the challenges of Big Data are technical ones because the 3 ‘V’s of Big Data (the volume, velocity and variety of the data) mean that different storage and processing techniques are needed. This is a growing market and it is no accident that Oracle and Microsoft, two of the biggest and most successful relational database vendors, now also provide Big Data offerings. Big Data is supported by a range of technologies such as Hadoop. Never heard of Hadoop? A quick check on IT job sites shows adverts for Hadoop developers, specialists and technical architects, and for skills with related technologies such as Hive and Apache Spark.
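For readers curious about what a Hadoop developer actually works with, the processing model most closely associated with Hadoop is MapReduce: a job is split into a "map" step that runs over each piece of data independently and a "reduce" step that combines the results. The sketch below imitates that idea in plain Python on a single machine. It does not use Hadoop itself, and the function names and sample sentences are assumptions made purely for illustration.

```python
from collections import defaultdict


def map_phase(document):
    """Emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group the emitted values by word, as a framework would do between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Add up the counts collected for each word."""
    return {word: sum(counts) for word, counts in grouped.items()}


# Two tiny "documents" stand in for what would be very large files in practice.
documents = ["big data is big", "data is everywhere"]
pairs = [pair for doc in documents for pair in map_phase(doc)]
word_counts = reduce_phase(shuffle(pairs))

print(word_counts)
```

Because each document is mapped independently of the others, the map step can in principle be spread across many machines, which is what makes this style of processing suited to very large volumes of data.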

Traditional relational database skills are still in high demand but, increasingly, so are the skills needed to work with the newer generation of non-relational databases, known as NoSQL. These NoSQL databases, which are often open source, are built to handle the processing of large volumes of data and use different design strategies, architectures and query languages.
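One design strategy found in many NoSQL systems is to spread data across several machines by hashing each record's key, so that no single server has to hold everything. The Python sketch below shows the idea with a toy key-value store. The class name, the use of in-memory dictionaries as stand-ins for servers and the sample keys are all hypothetical, and real products differ considerably in how they partition and replicate data.

```python
import hashlib


class ShardedStore:
    """Toy key-value store that spreads keys across several in-memory 'nodes'."""

    def __init__(self, node_count):
        # Each dictionary stands in for a separate server.
        self.nodes = [{} for _ in range(node_count)]

    def _node_for(self, key):
        # Hash the key so that the same key always lands on the same node.
        digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
        return self.nodes[int(digest, 16) % len(self.nodes)]

    def put(self, key, value):
        self._node_for(key)[key] = value

    def get(self, key, default=None):
        return self._node_for(key).get(key, default)


store = ShardedStore(node_count=3)
# Values are free-form: the two records do not need to share the same fields.
store.put("user:1", {"name": "Ada", "followers": 120})
store.put("user:2", {"name": "Bob"})

print(store.get("user:1"))
print(store.get("user:99"))
```

Data is looked up by key rather than through a SQL query, which hints at why working with these databases calls for different skills from those used with relational systems.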

One of the biggest challenges of Big Data is Big Data analytics, where analysts examine and interpret Big Data, often working in a real-time environment. The job sites show adverts for analysts who can work with Big Data. The new challenges bring new opportunities. From the early days of Big Data, there have been complaints about shortages of properly skilled and experienced Big Data engineers.

In 2011, a report from the McKinsey Global Institute described Big Data as the next frontier and talked of a ‘shortage of talent’ to support the growth of work with Big Data. Five years on, the forecasts are still that more Big Data experts are needed. A 2016 UK government report on digital skills looked at the need to plan for areas of strategic importance, listing Big Data in second place, just behind cyber security, and predicted an expansion in Big Data jobs.

Bridging the gap

To meet the demand for Big Data skills, new training courses and academic qualifications are being developed. A number of vendors and industry providers offer technical certification in Hadoop and related skills and in NoSQL data technologies. The growth in the demand for Big Data expertise is also driving changes at university level. Most university undergraduate computing courses include relational database skills, but Big Data skills are increasingly being added to the curriculum, particularly in specialist Data Science BSc degrees.

However, it is at postgraduate level that one can see Big Data more prominently as a specialised subject area. Masters courses in Big Data and related topics are being offered in places such as the UK, Finland, the US, Australia and Canada. Big Data centres are starting to appear on various university campuses and research into Big Data is an active field. There are also many PhD students researching areas like Big Data quality, the visualisation of Big Data, the performance and management of Big Data and, of course, Big Data analytics.

As industry investment in Big Data expands, so does the study of Big Data. The more we learn, the more the field grows and develops: Big Data is here to stay.

(The author is senior lecturer, School of Computing, Staffordshire University, UK)