Big Data FAQs – First Series by Shaku Atre – Part 1 of 10
Q1: Which two words best capture the features that most distinguish Big Data from the data we have handled so far?
A1: Variety and Velocity.
- Not only has the standard for “how much data” changed; the standard for “how soon” has changed dramatically as well.
Q2: What are we supposed to do with the Big Volumes of data?
A2: Big volumes of data beg for analysis in order to glean correlations and inferences and to prove or disprove hypotheses.
- These methods point straight to Data Science.
- In the past, Data Science was practiced only in the academic world. Now, in order to be competitive in the marketplace, every business is expected to possess these academic skills. There is one big difference: in academia, results typically did not need to be obtained very quickly when the problems and the data were very complex. Academics could take their time, something businesses cannot afford to do; Time to Results (TTR) is of paramount importance for businesses to succeed.
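As a rough illustration of what “gleaning correlations” and “proving or disproving hypotheses” in A2 can look like in practice, here is a minimal Python sketch. The data (ad spend versus weekly sales) is made up, and the function names `pearson_r` and `permutation_p_value` are hypothetical helpers written for this example. The permutation test is one simple way, among many, to check whether an observed correlation could plausibly be chance.

```python
import random
import statistics


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def permutation_p_value(xs, ys, trials=2000, seed=0):
    """Estimate how often a random re-pairing of ys with xs yields a
    correlation at least as strong (in absolute value) as the observed one."""
    rng = random.Random(seed)
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(trials):
        rng.shuffle(shuffled)
        if abs(pearson_r(xs, shuffled)) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (trials + 1)


# Made-up example data: ad spend vs. weekly sales (assumption, not real figures).
ad_spend = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
sales = [2.1, 3.9, 6.2, 8.1, 9.8, 12.2, 13.9, 16.1]

r = pearson_r(ad_spend, sales)
p = permutation_p_value(ad_spend, sales)
print(f"correlation r = {r:.4f}, permutation p-value = {p:.4f}")
```

A small p-value suggests that the null hypothesis “spend and sales are unrelated” is hard to sustain for this sample; it does not by itself establish cause and effect.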
Q3: Where are the “mine fields” for Big Data to be used successfully?
A3: Data goes through four main phases; the major “mine fields” with Big Data occur in Phases 2, 3, and 4:
- Phase 1: Data is generated by transactions (e.g., billing and reservations), interactions (e.g., shopping online), and observations (e.g., measuring carbon monoxide levels in different sections of an airplane).
- Phase 2: Data is received by various recipients – Are the receiving systems fast enough to handle the output of the data-generating systems? Is it like multiple lanes of cars trying to get into one tunnel?
- Phase 3: Data is stored and processed – Is the storage capacity big enough and is the processing fast enough? (How many tunnels and/or how many lanes in each tunnel should there be? The number of cars on the road trying to enter a tunnel is increasing at a dizzying speed.)
- Phase 4: Insights are created – this has to be done fast enough to benefit the businesses’ bottom line. (Can the cars be rerouted instantaneously to avoid congestion or, even worse, a deadlock, also known as a “deadly embrace”?)
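The tunnel analogy in Phases 2 and 3 comes down to simple arithmetic: if data arrives faster than the receiving side can absorb it, a backlog builds up. The Python sketch below works through that arithmetic under the simplifying assumption of constant rates. The figures (50,000 events per second arriving, 12,000 events per second per lane) and the function names `lanes_needed` and `backlog_after` are hypothetical, chosen only for illustration.

```python
import math


def lanes_needed(arrival_rate, lane_capacity):
    """Smallest number of lanes (receivers) whose combined capacity
    covers the arrival rate."""
    return math.ceil(arrival_rate / lane_capacity)


def backlog_after(arrival_rate, lane_capacity, lanes, seconds):
    """Events still waiting after `seconds`, assuming constant rates.
    If capacity exceeds arrivals, no backlog accumulates."""
    surplus_per_second = arrival_rate - lane_capacity * lanes
    return max(0, surplus_per_second) * seconds


# Hypothetical figures, for illustration only.
arrival = 50_000   # events generated per second (Phase 1 output)
per_lane = 12_000  # events one receiver can absorb per second

print("lanes needed:", lanes_needed(arrival, per_lane))
print("backlog after 60 s with 4 lanes:",
      backlog_after(arrival, per_lane, lanes=4, seconds=60))
print("backlog after 60 s with 5 lanes:",
      backlog_after(arrival, per_lane, lanes=5, seconds=60))
```

Real systems see bursty rather than constant traffic, so capacity planning usually needs headroom beyond what this calculation suggests.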
Q4: What are the main building blocks of Data Science?
A4: Mathematics and Statistical Analysis are Data Science’s main building blocks. Unfortunately, exactly these two skills are mostly lacking in today’s Data Analysts who aspire to be Data Scientists.
Q5: What should be the Business Strengths of Big Data Analysts?
A5: Big Data Analysts should bring the following strengths:
- Understanding and use of analytical modeling techniques http://searchbusinessanalytics.techtarget.com/opinion/Analytical-modeling-is-both-science-and-art
- Outstanding familiarity with the business processes of the business whose big data is to be analyzed https://en.wikipedia.org/wiki/Business_process_management
- Familiarity with newer statistical programming languages such as R https://en.wikipedia.org/wiki/R
- Risk-taking mentality to experiment with data – an outstanding data scientist has to be willing to swim against the stream when everyone else is swimming with it because that is easier. It is always a good idea to back up the data before experimenting, in case it disappears in front of your eyes because you were trying something unusual with it; and unusual is exactly what you are supposed to do to retrieve the nuggets of unusual insights. https://www2.deloitte.com/content/dam/Deloitte/global/Documents/Governance-Risk-Compliance/gx_grc_Deloitte%20Risk%20Angles-Applying%20analytics%20to%20risk%20management.pdf