GROWING UP IN ANALYTICS
Effective analytics depends on a clear understanding of not only the tools but also the expertise and strategy needed in reporting and data science.
To provide a tactical bridge between the strategy of becoming more analytically mature and the day-to-day activities of data users, let’s take a look at the tools and skills required to increase analytics capability and maturity in a modern business, with an emphasis on Big Data technology.
A word of caution before getting into the list. You’ll want to avoid the “tool trap”—a common pitfall for organizations attempting to grow their analytics. This happens when companies try to solve their analytics problems by simply buying some type of analytics software.
The software on its own isn’t the answer. Good analytics software requires two things: analytics talent and quality data. The amount of experience analysts have using Tableau or SAS isn’t what makes them good analysts; it’s the amount of experience they have deriving insights from data, understanding a business area, and communicating results. Any good analyst—reporting, data science, or otherwise—is able to figure out any platform put in front of them.
As for quality data, there’s a saying in the data profession: “Garbage in, garbage out” (or GIGO). If the data going in is garbage, then any insights drawn from it will also be rotten. An effective analytics maturity plan will include an evaluation of data sources and steps to increase the quality and quantity of available data.
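As a minimal sketch of what an evaluation of a data source might look like, the Python below counts a few common quality problems in a small extract. The sample data, the field names, and the `profile_rows` helper are all hypothetical and chosen only for illustration; a real evaluation would cover many more checks.

```python
import csv
import io

# Hypothetical sample extract; in practice this would come from a real source.
SAMPLE_CSV = """customer_id,region,revenue
1001,East,250.00
1002,,310.50
1003,West,not_available
1001,East,250.00
"""

def profile_rows(rows, numeric_field):
    """Count blank cells, unparseable numbers, and exact duplicate rows."""
    report = {"rows": 0, "blank_cells": 0, "non_numeric": 0, "duplicates": 0}
    seen = set()
    for row in rows:
        report["rows"] += 1
        # A blank cell is any empty or whitespace-only value.
        report["blank_cells"] += sum(
            1 for value in row.values() if not (value or "").strip()
        )
        # Check that the numeric field actually parses as a number.
        try:
            float(row[numeric_field])
        except (TypeError, ValueError):
            report["non_numeric"] += 1
        # An exact repeat of an earlier row counts as a duplicate.
        key = tuple(sorted(row.items()))
        if key in seen:
            report["duplicates"] += 1
        seen.add(key)
    return report

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
quality_report = profile_rows(rows, "revenue")
print(quality_report)
```

Counts like these give a rough baseline that can be tracked over time as data quality work progresses.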
Now let’s look at the tools, skills required, and other important information and concepts for effective analytics.
Examples of data science tools: SAS, RapidMiner, KNIME, SPSS, Rattle and R Commander (for R), Orange (for Python), Weka (for Java), Jupyter, and Zeppelin.
What’s important: probability and statistics; computer science; data architecture; and languages such as R, Scala, and Python, as well as Spark Context for each.
Because data science covers all business processes and can encompass many different techniques and solutions, the key characteristics for those involved are an eagerness to keep learning and a desire to solve new problems using math and statistics.
The question is how to mature your existing business processes and data capabilities to the point that the organization cultivates its own data scientists and has an environment capable of deriving value from data science activities.
Examples of reporting tools: Excel, Power BI, Jaspersoft, SSRS, Crystal Reports, OBIEE, Talend, and SSIS.
Reporting is the foundation of analytics. A strong foundation is created by having consistent business rules around data. In most organizations, a separate data environment is created to maintain data in a consistent and accessible form. This separate environment is called a “reporting database” if it’s a replication of the production database; an “enterprise data warehouse” (EDW) if the data is shaped into multiple dimensional tables (“cubes”) for KPI reporting over time; or a “data lake” if raw data is placed into a Big Data environment along with more and more layers of aggregation and structure via Pig, Hive, HBase, and/or Spark.
Graph data tools such as Neo4j and GraphFrames are becoming more prevalent as well. They provide an easier way to explore complex, highly connected data such as social networks or customer interactions. Graph data models also offer easier mechanisms for exploring nested and/or NoSQL data, such as MongoDB, Cassandra, JSON, and XML (XBRL).
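To illustrate why connected data suits a graph representation, here is a minimal sketch in plain Python that finds everyone within a given number of hops of one person. It does not use the Neo4j or GraphFrames APIs; the `INTERACTIONS` data and the `within_hops` helper are hypothetical stand-ins for the kind of traversal those tools provide.

```python
from collections import deque

# Hypothetical "who interacts with whom" data stored as an adjacency list.
INTERACTIONS = {
    "ana": ["ben", "cal"],
    "ben": ["ana", "dee"],
    "cal": ["ana", "dee"],
    "dee": ["ben", "cal", "eli"],
    "eli": ["dee"],
}

def within_hops(graph, start, max_hops):
    """Breadth-first search returning each reachable node and its hop count."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distance[node] == max_hops:
            continue  # do not expand past the hop limit
        for neighbor in graph.get(node, []):
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)
    distance.pop(start)  # the starting node is not its own connection
    return distance

print(within_hops(INTERACTIONS, "ana", 2))
```

The same question asked of relational tables would typically need one self-join per hop, which is part of why graph tools feel more natural for this kind of exploration.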
Examples of business intelligence tools: Spotfire, Tableau, Power BI, QlikView, MicroStrategy, Birst, and Domo.
What’s important: ability to synthesize new information; communication skills (written, verbal, and presentation); macro- and microeconomics; basic statistics; knowledge of how to connect to data sources (for example, RESTful APIs, FTP, SFTP, ODBC, and JDBC); knowledge of data formats (for example, JSON, XML, and delimited flat files); knowledge of data stores (for example, RDBMSs [SQL], HBase, and MongoDB).
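As a small illustration of what knowledge of data formats means in practice, the sketch below reads the same record from JSON, XML, and a pipe-delimited flat file using only the Python standard library. The order record, its field names, and the three helper functions are hypothetical examples.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

# The same hypothetical order record in three formats.
JSON_TEXT = '{"order_id": "A17", "amount": "42.50"}'
XML_TEXT = "<order><order_id>A17</order_id><amount>42.50</amount></order>"
CSV_TEXT = "order_id|amount\nA17|42.50\n"

def from_json(text):
    """Parse a JSON object into a dictionary."""
    return json.loads(text)

def from_xml(text):
    """Turn each child element of the root into one key/value pair."""
    root = ET.fromstring(text)
    return {child.tag: child.text for child in root}

def from_delimited(text, delimiter="|"):
    """Read the first data row of a delimited file with a header line."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    return dict(next(reader))

records = [from_json(JSON_TEXT), from_xml(XML_TEXT), from_delimited(CSV_TEXT)]
print(records)
```

Once each source is reduced to the same dictionary shape, downstream analysis no longer needs to care which format the data arrived in.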
“Business intelligence” is one of the most overused terms in modern business, which can lead to confusion about what it actually is. By the standard definition, business intelligence is the use of internal and external data to provide a competitive advantage to an organization.
Unfortunately, most organizations think connecting Tableau to a database and then creating a pie chart is business intelligence. It isn’t. For one thing, if anyone had asked, the people maintaining the database probably could have made that chart for you. More to the point, that isn’t business intelligence; that’s reporting.
Remember, business intelligence is about diagnostic, not descriptive, analytics. If the pie chart is compared to general financial and economic indicators over time and used to justify a hypothesis that the business needs to diversify its product portfolio into more prestige-priced goods, then it would be business intelligence.
Reporting, on the other hand, is about providing the business with fast and consistent answers to known questions, such as “What was total revenue last quarter?” or “What was the ROI of our latest project?”
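A known question such as total revenue for a quarter maps naturally onto a fixed, repeatable query. The sketch below uses Python's built-in `sqlite3` module with a hypothetical in-memory `sales` table; the table layout, the sample rows, and the `total_revenue` helper are assumptions made for illustration.

```python
import sqlite3

# Hypothetical sales table held in an in-memory database for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (sale_date TEXT, revenue REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [
        ("2023-01-15", 1200.0),
        ("2023-02-20", 800.0),
        ("2023-04-05", 500.0),
    ],
)

def total_revenue(conn, start_date, end_date):
    """Sum revenue for sales dated from start_date up to, but not including, end_date."""
    row = conn.execute(
        "SELECT COALESCE(SUM(revenue), 0) FROM sales "
        "WHERE sale_date >= ? AND sale_date < ?",
        (start_date, end_date),
    ).fetchone()
    return row[0]

# First quarter of the sample year: January through March.
q1_revenue = total_revenue(conn, "2023-01-01", "2023-04-01")
print(q1_revenue)
```

Because the business rule (what counts as revenue, where a quarter starts and ends) lives in one place, every user who asks the question gets the same answer.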
Strategically, this distinction is important when making hiring decisions and motivating employees. We’ve seen far too many talented analysts burn out because they get trapped maintaining reports for business users rather than providing value-added guidance and new analysis to those users. Good leaders of business intelligence and reporting groups understand the skill sets of their respective departments. They allow the reporting team to provide consistent and stable information to users while the business intelligence team finds new insights and data sources rather than drifting into a reporting pipeline.