What Is a Data Science Model?
To discover what data science in finance is all about, start with modeling basics.
Data science, modeling, and scenario planning are more common in finance now. There is no official definition of a data scientist, but a good candidate is advanced by the analytics firm SAS: “Data scientists are a new breed of analytical data expert who have the technical skills to solve complex problems—and the curiosity to explore what problems need to be solved. They’re part mathematician, part computer scientist and part trend-spotter.”
In the Strategic Finance article “How to Master Digital Age Competencies” (bit.ly/2pUa1Fy), Raef Lawson and Daniel Smith covered the new role that finance plays in data science, the skills required, and the importance of crafting a plan based on these skills.
Among those skills a good data scientist should have are:
- Problem structuring—Understand the business logic of how to use data for business problems. Model business rules and processes, create a workflow of how data works, and optimize it.
- Verifying data quality—Validate data quality, and use tools like natural language processing (NLP) to get the probability of error. Identify patterns, trends, and anomalies.
- Identifying new data sources—Know the value of data and how to utilize it. Build a business model using data rules to optimize targets.
WHAT IS MODELING?
The definition of a model, according to Merriam-Webster, is a “system of postulates, data, and inferences presented as a mathematical description of an entity or state of affairs.” That said, models aren’t exclusive to the math genius and computer whiz. You probably make and use models every day.
For example, while driving during your morning commute, do you try to find the fastest lane? Maybe you consider moving to the lane furthest from an on-ramp to avoid congestion, or you move to the lane closest to an upcoming off-ramp because there will be fewer cars in that lane.
In driving the same road over and over again, you’ve learned more about the system—ramp location, traffic congestion, driver behavior—and as you learned, you’ve made modifications in your behavior to optimize your drive time.
A data scientist’s model does the same thing. The data is your experience driving, a computer is your brain trying different driving patterns to learn what works best, and the model is an equation of data inputs affecting a target value. In this case, the target value is how long it takes to get to work.
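As a minimal sketch of what "an equation of data inputs affecting a target value" can look like, the Python snippet below writes the commute example as a simple linear equation. The input names (`ramp_cars`, `raining`) and all coefficient values are hypothetical, chosen only for illustration. In practice a data scientist would estimate the coefficients from recorded drive times rather than assume them.

```python
from dataclasses import dataclass


@dataclass
class CommuteModel:
    """Hypothetical linear model: target = base + sum(weight * input)."""
    base_minutes: float          # drive time with no congestion (assumed)
    minutes_per_ramp_car: float  # delay added per car merging from on-ramps (assumed)
    rain_penalty_minutes: float  # delay added when it rains (assumed)

    def predict(self, ramp_cars: int, raining: bool) -> float:
        """Return the predicted commute time in minutes (the target value)."""
        return (self.base_minutes
                + self.minutes_per_ramp_car * ramp_cars
                + self.rain_penalty_minutes * (1 if raining else 0))


# Illustrative coefficients only; they are not derived from real traffic data.
model = CommuteModel(base_minutes=20.0,
                     minutes_per_ramp_car=0.25,
                     rain_penalty_minutes=5.0)

dry = model.predict(ramp_cars=40, raining=False)
wet = model.predict(ramp_cars=40, raining=True)
print(f"Dry commute: {dry} min, wet commute: {wet} min")
```

The data inputs are the function arguments, the target value is the return value, and "learning" amounts to adjusting the three coefficients until predictions match experience.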
BUILDING A DATA SCIENCE MODEL
Problem structuring is a very important skill for a data scientist. A great data scientist not only understands the business problem (for example, new customer acquisition, product design, or desk placement to reduce distraction) but can also create a diagram of the data and business activity related to the key performance indicator (KPI) associated with that problem.
In management accounting terms, we might call such activities “KPI determination” or simply “flowcharting.” These core management accounting competencies are considered very valuable in data science.
If we can create a flowchart of the business processes that we seek to optimize (our business case model) and determine the data representing the business processes in our business model, then calculating a mathematical representation of the business processes (our data science model) is often easy.
In our experience, a four-step process gets data science modeling started:
- Define the business problem and the KPIs associated with the business problem.
- Build the business model as a flowchart of the internal business processes and external factors that can influence the business problem KPIs.
- Identify data created by (or representative of) elements in the business model.
- Use analytics techniques to quantify the impact of elements in the business model on the KPIs.
Following the first three steps makes the final step of analytics much easier. Sometimes analytics involves a statistical model to represent the business model; other times it may simply be a dashboard or visualization.
Simple models that use few data inputs are a good place to start. For greater objectivity and more robust analysis, add external data sources that contribute explanatory value. Start small, get that working, and then increase the predictive power of the model by including variables (such as weather or economic metrics) expected to influence outcomes.
A key to success in finding the right analytical techniques is to start with basic descriptive statistics and then move on to predictive ones such as regression analysis. Another is to begin with a familiar tool and familiar data, such as Excel’s regression functions, to learn how dependent and independent variables work. Do the independent (causal or correlative) data inputs fed to the regression functions produce the expected results within a reasonable confidence level? If so, extend the regression analysis to forecast values of the dependent variable and start shaping the future. If not, try a different independent variable, or check that the regression model is configured correctly.
CONSIDERING THE APPLICATIONS
There are many examples of data science in finance projects, such as:
- A market exit/entry optimization model including factors such as seasonality, grants, and regulation.
- A forecast model including factors such as volume seasonality, geography, and demographic patterns. In a forecast model, you take into account drivers for different financial accounts.
- A very advanced data science model that can bring together the supply chain with different facilities around the world, taking into account logistics costs and customer demands.
- Inventory optimization including factors like dead stock, turnaround, etc.
- Pricing optimization and linking that to demand elasticity, market preferences, and discounts.
Consider how you would create a flowchart of business and data processes and the associated KPIs. A business model flowchart and data source should enable analysts and data scientists to easily apply their tools to solve your problems.
In fact, with the KPIs and data identified, it’s easy to do simple analytics and predictive modeling, and you may even want to try building some data science models yourself. You might be surprised how much of a data scientist you are already.