In Defense of Small Data
There has been enough buzz about Big Data over the past year or two to fill a very big data file. It seems that if you do not have a big data strategy, mining ecommerce and social media sites for new insights into your customers and using new data management tools like Hadoop, you are at a serious disadvantage to competitors who are jumping on the big data bandwagon.
And - there is merit to this line of thought for some companies. The volume, variety and velocity of data flows have grown exponentially with the rise in online activity. More types of data are being generated at an ever-increasing pace, and companies need to understand how best to manage and leverage that information for the benefit of their customers, prospects and their own bottom line. IBM includes data veracity as the fourth of the four V's of Big Data, referring to the lack of trust management has in much of that data.
All that said - let’s not forget about small data. Former McKinsey consultant Allen Bonde has noted that big data is about machines and small data is about people.
For many companies, email marketing reports, Google Analytics and other website analytics can readily be used alongside internally generated transactional reports. This data tends to be activity-oriented, locally sourced and easily accessible, and it can be used to deliver immediate results.
Companies can mine small data far more easily to generate insights that drive marketing initiatives, including cross-sell and upsell opportunities, marketing spend analysis and the like. However, as with any data analysis project, it is important to:
define the business problem being addressed
identify the data sources to be utilized
ensure the integrity of the data
Given the current hype around analytics, there is a risk that business users will jump right in expecting their software tools to automatically identify the right questions and the answers. In fact, some cloud-based services, such as IBM's Watson Analytics and SAS, do a good job of this, as do more visualization-oriented tools such as Tableau. However, companies will see the greatest ROI by taking a rigorous approach to defining what they want to achieve from their data analysis project.
Once this is done, the business will be in a better position to identify the best data sources for achieving its objective. Companies may find that internally generated spreadsheets, transactional reports from CRM systems such as Salesforce or from their accounting systems, customer surveys and aggregated third-party data can quickly and cheaply improve the quality of the data being analyzed.
The heavy lifting of data cleansing then needs to be completed to ensure that missing, incomplete, inaccurate or duplicate records are identified and treated appropriately.
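For a small-data project, this cleansing pass is often simple enough to script directly. The sketch below, using pandas on a hypothetical customer extract (the column names and values are illustrative assumptions, not from any real system), shows the three treatments mentioned above: removing duplicates, handling missing key fields, and flagging clearly inaccurate values.

```python
import pandas as pd

# Hypothetical small-data extract: a customer table with typical quality problems
records = pd.DataFrame({
    "email":  ["a@example.com", "a@example.com", None, "b@example.com"],
    "region": ["East", "East", "West", None],
    "spend":  [120.0, 120.0, 60.0, -5.0],
})

# 1. Remove exact duplicate rows (the second a@example.com record)
clean = records.drop_duplicates()

# 2. Drop records missing the key field needed for marketing outreach
clean = clean.dropna(subset=["email"])

# 3. Treat clearly inaccurate values (negative spend) as missing
#    rather than letting them distort averages downstream
clean.loc[clean["spend"] < 0, "spend"] = float("nan")

print(clean)
```

Even a short script like this makes the cleansing rules explicit and repeatable, so the same treatment can be re-run every time a fresh extract arrives.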
Once done, the company will be in a position to start generating insights specific to the business problem being addressed and quickly implement processes to drive an improved ROI. And, who knows, maybe then they will have the experience, confidence and budget to move on to Big Data!