About the Author

K Ramkumar, known to his friends as Ram, has an enduring passion for triggering a discussion and joining in with gusto on a range of themes. He believes that every person has an inalienable right to express his/her view no matter how different it is to anyone else. In his book no view is unworthy or big or small. Every view from everyone deserves a consideration without getting caught up with the tyranny of agreement or disagreement.

Research and Analytics: Better accept with a pinch of salt

Nostradamus should be the patron saint of the Big Data and Analytics cult. I say this because we have hardly paused and reflected on what analytics can or cannot do. Despite the warnings from Taleb, Kahneman and Dobelli about the human proclivity for cognitive biases, we have chosen not to take note.

We often overstate statistical coincidences. We confuse predictive analysis with post-facto analysis, which is meant for diagnosis or mere comprehension and tells us what happened, not what caused it. We have become foolhardy enough to assert that we can predict human behaviour without framing it in a context. An elementary understanding of behavioural science should tell us that, let alone predicting it, even comprehending human behaviour without framing it in a context is impossible.
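The point about statistical coincidences can be illustrated with a small simulation. The sketch below is only an illustration under assumed numbers: the 200 candidate variables, the 12 observations and the function names are all hypothetical and not drawn from any real study. It generates one random "outcome" and many unrelated random "drivers", then reports the strongest correlation found among them.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def strongest_chance_correlation(n_variables=200, n_observations=12, seed=0):
    """Correlate one random outcome with many unrelated random variables.

    Every series is pure noise, so any correlation found is a coincidence.
    All sizes here are assumptions chosen for illustration.
    """
    rng = random.Random(seed)
    outcome = [rng.gauss(0, 1) for _ in range(n_observations)]
    best = 0.0
    for _ in range(n_variables):
        candidate = [rng.gauss(0, 1) for _ in range(n_observations)]
        r = pearson(outcome, candidate)
        if abs(r) > abs(best):
            best = r
    return best


best_r = strongest_chance_correlation()
print(f"strongest correlation found among pure noise: {best_r:+.2f}")
```

With a short series and enough candidate variables, at least one of them will usually look like a strong "driver" of the outcome even though none of them is related to it. That is the kind of coincidence that gets overstated when no hypothesis is framed beforehand.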

We also do not set up a hypothesis and test it before we declare causal connections. Worse, the quality of our data, sampling and data collection is often appalling. I say this of the best analytical firms and teams with whom I have worked, not of amateurs.

In my 32 years of working, with the occasional exceptions, I have rarely come across analysts and researchers who understand the difference between the 3 types of research:

  • Explorative
  • Descriptive Diagnostic
  • Experimental

The rigour required for each of these is very different in terms of the research design and construct. The statistical methods applied can still be the same for the three, but the objective and limitations are where the differences lie. For the purpose of this article I will be using the term research synonymously with analysis, because that is what analysts do.

Explorative research can at best yield hypothetical insights into a problem, extrapolate trends hypothetically and/or isolate a few possible (not probable) causal factors behind a phenomenon, which then require deeper and more focused research. Almost all survey-type research based on self-reported data falls into this type.

These are surveys where the respondents offer an opinion, preference, confirm satisfaction or dis-satisfaction, confirm a like or dislike, endorse a person, position or product, articulate a potential buy or sell orientation, promise advocacy, rank a set of subjects or variables on the basis of certain attributes, etc.

In this research, the researcher has no means to determine the authenticity of the response, to verify it against behaviour (especially future behaviour or intention to act), to gather secondary evidence that corroborates the self-reporting or, above all, to gauge the application of mind and seriousness behind the response to every surveyed item.

For any business to use exploratory research to make decisions on committing resources, altering the strategy, launching initiatives or products and taking calls on customer/employee behaviour or preferences will be a huge leap of faith. The output from this kind of research is only one step better than broad generalisation.

90% of all the research/analysis which business organisations rely on falls into the exploratory research category. 90% of the business managers who thump the table about how they rely only on research for decision making have no clue about the serious limitations of the research design they are trusting.

In survey-based research that relies on self-reporting, the other serious limitation is the quality of data collection, which depends on the commitment, involvement, intelligence, grasp, discipline and integrity of the data collector. Incentives offered for the number of respondents surveyed further muddy the water.

Where it is an on-line survey, the limitation becomes the motivation, commitment, involvement, patience, focus, discipline and application of mind of the on-line respondent. What makes the process of face to face and on-line data collection unreliable is the unpredictability of the rigour with which the response is reported or recorded. This inherent limitation often plays spoilsport with the quality of data.

Where a team is using secondary data, the quality of the data is seriously degraded when the team has no knowledge of the source of the data, the authenticity of the source, any interference with or modification of the data (popularly called massaging the data), its uniformity and comparability with respect to time of collection, the context in which it was collected, the construct and inherent biases of the forms or instruments used to collect it, the nature of the questions which elicited it, and so on. It is this kind of data which is often mined and used by the analytics teams of most organisations.

Let us examine a few examples. One of the main reasons some banks got their retail unsecured lending wrong during the 2004 to 2008 phase was the unreal faith they had reposed in their analytics to come up with what was popularly called predictive credit scoring. The same was true of credit derivatives.

The analysts in this case placed extraordinary reliance on surrogate data to predict the future credit behaviour of the customers, with no reference to future contexts. They believed that white collar employees, or those who had a savings account with a certain level of balance in their banks, would default the least. The irony here is that assumptions were passed off as posits. There was no empirical basis for believing that either of the 2 assumptions would hold true in reality. There was no comprehension that future contexts may not resemble the present context, especially given the cyclic nature inherent in any economy.

In fact, business leaders were so flushed with pride about this analytical miracle tool, they made case studies on this and bandied it everywhere. As with everything that is claimed as research, numerous gullible admirers jumped on to this “Credit Scoring Marvel”. The fact that banks are no credit bureaus and have no access to behavioural data beyond their existing customers in their banks, was not grasped by people who otherwise are very perceptive. This kind of analytics was destined for disaster.

Of late we have become even wilder with predictive analysis. We today believe that by sitting in corporate offices and playing computer games with captive bought-out data, mixed with proprietary data, we can direct our sales force to identify customers in the physical market place. The irony here is that the only thing we know about these people, who currently do not do business with us, is based on dubious, non-standard and randomly assembled bought-out secondary data. This kind of potpourri (kichadi) data is riddled with all the limitations that secondary data suffers from, which I have detailed in an earlier para.

We should take note that the predictive validity of psephology based on exit polls and weather predictions based on pure observed data is barely moderate. If this does not humble us about playing wild with behavioural data, nothing will.

The same is true, in my experience, of what are now called “Employee Engagement Surveys”. The weights that an employee assigns to various factors that have a bearing on him are different and are also transient. The transient nature makes assignment of the weights difficult. The weights are also life stage and context dependent.

When the stock markets are in the middle of a bull run and an organisation’s stock is doing well, most employees will endorse or articulate preference for an ESOP loaded salary mix. Unfortunately these are not year to year reversible decisions. The next year if the bear chill catches the market, the same employee will dis-endorse an ESOP loaded salary mix, which he had endorsed or preferred a mere 12 months before.

The same is true with subsidised loans or a company car. When interest rates are high the study will say loans promote engagement, and when interest rates fall, a cash-out will appear in the research as an engagement-enhancing move.

We also miss that different factors have different thresholds. Let us take, for example, pride as an engagement driver. The threshold for high endorsement of Pride in one’s organisation is very high. Any endorsement level below 85% for Pride is rare. However, the endorsement levels for Compensation, Equity, Fairness or Transparency are always moderate. The highest endorsement level for these will rarely exceed 65%. How would we normalise this rating idiosyncrasy? How will an engagement study help us determine which independent variables drive these dependent variables? Worse, how could a simple frequency distribution establish any causal relationship?
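To make the normalisation question concrete, here is a minimal sketch of one possible approach: scoring each factor against its own typical range rather than comparing raw percentages. The benchmark figures, the survey figures and the function name below are assumptions invented for illustration, not results from any actual engagement study.

```python
# Hypothetical benchmarks: the typical endorsement level (%) and typical
# spread (%) for each factor. These figures are assumed for illustration.
BENCHMARKS = {
    "Pride": {"typical": 90.0, "spread": 4.0},
    "Compensation": {"typical": 55.0, "spread": 6.0},
}


def relative_score(factor, endorsement_pct):
    """Express an endorsement level relative to the factor's own usual range."""
    benchmark = BENCHMARKS[factor]
    return (endorsement_pct - benchmark["typical"]) / benchmark["spread"]


# Hypothetical survey results for one organisation.
survey = {"Pride": 86.0, "Compensation": 64.0}

for factor, pct in survey.items():
    score = relative_score(factor, pct)
    print(f"{factor}: raw {pct:.0f}%, relative to its own norm {score:+.1f}")
```

On these assumed numbers, 86% for Pride sits below its usual level while 64% for Compensation sits well above its usual level, the reverse of what the raw percentages suggest. Even so, this only makes the ratings comparable. It says nothing about what drives them, so the causality questions remain open.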

That we do not ask these questions is shocking to me. Would we accept a pathology report if it had these limitations or fallacies?

The same is true with consumer research. When you ask an ICICI Bank customer, who has not experienced other banks for similar products and at the same level of transaction intensity or frequency, to rank ICICI Bank in comparison to, say, 4 other banks, what insight can this research throw up? In a self-reporting survey, how would the person collecting the data verify whether the respondent’s reporting of experience with other banks (assuming she has that) is authentic? Research and analysis surely cannot be based on trust only!

In my experience, when the factor or survey item which has the highest importance to the customer is scored high or low, the rest of the factors or items in the survey are impacted to various degrees by this. In survey-type explorative research, there is no way to isolate the causality. This is like a lab test which confirms a state of illness but cannot give the doctor any insight into what may be causing it. It becomes a guessing game. We do not see it as a guessing game because we gaze at numbers and tables and convince ourselves that the conclusions flow from them and not from our guesses.
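The spill-over from the most important item onto the rest can be sketched as a simulation. The sketch rests on an assumed mechanism: each item score is the respondent's independent experience plus some weight of their feeling about the dominant factor. The weight, the sample size and the function names are hypothetical.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def simulate_item_correlation(halo_weight, n_respondents=2000, seed=1):
    """Correlation between two survey items whose true experiences are unrelated.

    halo_weight is the assumed degree to which the respondent's feeling about
    the most important factor leaks into every other item score.
    """
    rng = random.Random(seed)
    item_a, item_b = [], []
    for _ in range(n_respondents):
        dominant = rng.gauss(0, 1)  # feeling about the most important factor
        item_a.append(rng.gauss(0, 1) + halo_weight * dominant)
        item_b.append(rng.gauss(0, 1) + halo_weight * dominant)
    return pearson(item_a, item_b)


no_halo = simulate_item_correlation(halo_weight=0.0)
with_halo = simulate_item_correlation(halo_weight=1.0)
print(f"correlation between unrelated items, no spill-over:   {no_halo:+.2f}")
print(f"correlation between unrelated items, with spill-over: {with_halo:+.2f}")
```

Under this assumption the two items look strongly related once the spill-over is switched on, although the underlying experiences are independent. The survey table alone cannot tell the two situations apart.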

The problem with behavioural survey-based research is that even confirming the presence or absence of anything is at best generalised, with no acceptable levels of reliability or validity. Very few researchers in these studies care to report these, because the same research design and structure is not used in any two successive years, making comparability impossible. Hardly anyone questions this gross indiscipline in research.

Is it not shocking that business leaders and board rooms gobble up this kind of research as gospel truth or unassailable fact? Donald R. Keough, the former President of Coca-Cola, narrates in his book “The Ten Commandments for Business Failure” how, when a customer survey revealed that customers wanted Coke to be sweeter, Coke, like most analytics-devoted businesses, went on to change the Cola formulation and made it sweeter. According to Don, it took Coke a full year to recover customer loyalty and get back on the rails. Half-baked analytics had turned Coke into Pepsi!

In the year 1994, we at Brooke Bond Lipton India were enlightened by market research which said that the Indian market was waiting to explode threefold on premium ice cream consumption. Paying heed to this, the organisation acquired every litre of ice cream manufacturing and marketing capacity in India and, not satisfied with that, set up a global scale ice cream manufacturing factory at Nasik. This is the story of Kwality Walls. 20 years on, I understand the market is still waiting to explode and the capacities acquired are not fully utilised.

The thrust of the article is not that research and analytics are a waste. It is rather to caution about what kind of research is needed for decision making. Descriptive diagnostic research is a must whenever we are seeking to establish causality. Experimental research is a must whenever we are seeking to predict performance or behaviour. This is seriously expensive. Exploratory research can only help you frame hypotheses. Sadly, what most global consulting firms purvey is exploratory research. Most consumer and employee research, too, is exploratory research.

Hence we should be aware that what we get from consulting firms and in-house analytics is hypothesis and not empirical insight. Where research is not verifiable when repeated multiple times, has poor causal proof, or shows performance change that cannot be ascertained or verified, it cannot be the sole basis for strategy change or resource commitment.

Almost no innovation in the world is a product of any analytics or exploratory research. Let me close with a few tongue-in-cheek comments. Check out what Steve Jobs thinks about consumer research and its usefulness for innovation! It took the CIA 10 years to find Osama, analytics notwithstanding. Research and analytics in the behavioural area are useful, but let us take them with a pinch of salt when someone tells us that they can predict future behaviour with reliability.

11 comments on “Research and Analytics: Better accept with a pinch of salt”
  1. Roopchand Yadav says:

    Dear Ram,

    There are a few facts about the post-2004 banking scenario on which I differ from you. A few players came out of this phase quite safely and completely immune.

    However, sir, I am delighted to note the following few lines, with which I totally agree:
    The thrust of the article is not that research and analytics are waste. It is more to caution what kind of research is needed for decision making. Descriptive diagnostic research is a must, whenever we are seeking to establish causality. Experimental research is a must, whenever we are seeking to predict performance or behaviour. This is seriously expensive. Exploratory research can only help you frame hypothesis.

    Through this medium I seek your inputs on the best possible way of knowing customer preferences particularly in the corporate world, given the introduction of the technology through the smart-city wave in our country.

    Thank you sir.

    Roopchand Yadav

    • K.Ramkumar says:

      We have to be careful when we use survey based research. As I have written, see customer preferences as hypothetical and not empirical. Agree that certain aspects cannot be empirically ascertained because it simply is not possible. Do not thump the table claiming that every piece of research is some kind of unassailable fact or truth. Use intellect, judgment and qualitative data to understand quantitative data. We have a compulsion to gobble up any arrangement of data and numbers which passes off as research. We do not challenge research outputs and put them to serious critical examination. We even dismiss and scorn people who challenge and critically examine research or analytical outputs. When healthy scepticism, critical examination and judgment are combined with exploratory research or analysis, we will use exploratory research well. However, if we believe that such research is a revelation or a sure-bet truth to be acted on, then we will suffer as Coke, BBLIL or certain banks suffered. The limitations of behavioural research and analysis should be borne in mind to make the most of behavioural research. This is like a good doctor who reads inconclusive or directional data from a path report and uses his clinical diagnostic ability to corroborate it and arrive at conclusions.

  2. Roopchand Yadav says:

    Good morning Ram and thank you for your time.

    I was indeed eagerly waiting for your response, and the last line of your message gives a complete and conclusive thought on this subject.

    Thank you once again, and have a wonderful day ahead.


    Roopchand Yadav

  3. RM Renganathan says:

    By experience, we used to have human samples. It is the better way to study them. GOOD EFFORT

  4. Nathan says:


    Very insightful and readable article. I have a couple of thoughts to offer.

    1. Research in itself would not lie if it is done with due care and caution. If all the methodologies are in place and it is as scientific as it can be, that would be a good piece of work. What we don’t do well is to question the hypothesis.

    2. Even if there is a conclusion that is staring us in the face, there’s experience on the ground that is seldom validated. I still remember we had some serious maintenance issues at a chemical plant where I worked, and the research proved almost conclusively that we had to invest some serious money to put this right. Until a wise old mechanic told the supervisor that the fault was owed more to a process operation that could be nixed easily.

    Point is – do we question the assumptions? And do we listen to the wise, although they have no epaulettes?

  5. Agree with you.

    I am currently reading a beautiful book – Nudge – and it sums up beautifully the points you have made. A particular example in the book stuck with me.

    Consider the following 2 questions asked to youngsters :

    a) How happy are you?
    b) How often are you dating?

    When asked in the above sequence – a, and then b – the correlation (in answers) between the two questions is quite low – 11%.

    But when asked in the reverse order – b, and then a – the correlation jumps to 62%

    If the mere sequence of questions can have such a profound impact, one can only imagine the effect of wording, phrasing etc.

    BUT having said the above, I look at the bigger picture.

    I look at ICICI Bank, Reliance, ITC – and hundreds of other publicly listed companies – all coming out with new products / growing their sales year-on-year over decades.

    It seems to suggest that the way the research is done – and decisions are being taken – in competitive, professionally-run firms is good and continues to improve.

    True it can be more fine-tuned – and we will have occasional 2008-like situations – but I would like to take it as a positive – something to be learned from emerging knowledge – and plugging the gaps.


    Anil Karamchandani
    Author – 21 Office Situations & How to Deal from Them


    I think you should post your article in LinkedIn too. I see Prabir Jha (Cipla) / Abhijit Bhaduri (Wipro) articles there too.

  6. Vyom Upadhyay says:

    Dear Ram

    The blog very nicely puts across the various questions the analytics community needs to answer if we are to get closer to the vision of true data driven decision making.

    The world of model led decision making can be roughly classified into two categories. Type 1 decisions are strategic in nature, have long term implications and typically have significant senior management bandwidth devoted to them. Examples of these decisions are whether to launch a new product, change pricing strategies etc. Type 2 decisions are more tactical in nature. Any large organization makes thousands if not millions of them daily, and since we make so many of these decisions so often, we would like these to be automated. Examples of these decisions are which individual to offer a product to, how much to charge an individual etc.

    A number of strategic decisions are primarily aided by data. I agree without robust experimentation, the data that aids these decisions can be inadequate, misleading even. But the key point is that data analysis and forecasting models are used to assist this decision making, and not drive these.

    In the tactical decision making world, analytics is indispensable. Despite all the limitations mentioned above, ability to run a standardized process with pre defined logic and intelligence beats any other option instantly. Not having an analytics driven framework on pricing or underwriting decisions would result in thousands of uncontrolled experiments running throughout the organization.

    Obviously, the real world does not have discrete categories called strategic and tactical but a continuum- but the argument still holds.

    As you mentioned, I have seen a number of places where the data contains a lot of biases. Mature areas like credit scoring have established techniques to handle these biases. But any new problem being solved can be a minefield from which only the careful analysts come out looking good.

    I think it is an indication of all the points that you have mentioned that the decision sciences community still thinks of ’Big Data’ as the big thing. The next frontier of ’Big Decisions’ will remain a faraway dream until we tackle these issues conclusively.


  7. Malcom Lazar says:

    you may have an ideal weblog here! would you prefer to make some invite posts on my weblog?

  8. Mandar M Tambe says:

    Dear Ram,
    Research & Analytics…. is like an 8 in 1 multi function swiss knife. The user must know why to use, when to use and how to use. You have given an apt example of the doomed “New Coke” campaign of May 1985. That time the Coke’s Team conducting R&A was absolutely sure and confident about the campaign’s success. But even then the New Coke failed for the reason – in the words of Mr Donald Keough ” it was a deep psychological issue and not a taste issue or real marketing issue”.

  9. Arun says:

    Big data and long data, analytics with context, to me seems a better way to go.

    The philosophical debate of predicting the path of the kicked ball and the kicked dog goes on..

  10. Puneet says:

    In cricket we often hear a group of experts say – “A captain is as good as his team”. And another group of experts say – “XYZ forged the team and it’s his leadership qualities that make the team tick”. Both statements can be proven true by research, as the researcher would seek confirmation for his/her hypothesis, resulting in a confirmation bias.
    All research is based on defining a hypothesis and then proving or disproving it using statistical measures. Numbers in themselves are just numbers; it’s the interpretation of numbers which is the key. Decision sciences can help in fine tuning decisions by proving or disproving hypotheses. In no way can decision science make decisions for a manager, at least till now. No one can predict the future. All that one can do is have an expectation of the future, and analytics provides the expectation of various scenarios to a manager.

    In 2004 to 2008, it was not the credit models that went wrong, but the interpretation of the models (at least in the Indian context). Conceptually, a credit model gives an output which is the probability of default of a counterparty. In no way does the model predict any future state of the world, a misconception prevalent in many organizations and across levels. Banks which understood the power and limitations of their models built their strategy accordingly. And mostly, it’s human beliefs and biases which triumph over objective information. No one expects bad things to happen to themselves; it’s always the domain of others. Similarly, the credit crisis in the US was a result of the belief that mortgages cannot go bad, in spite of data showing year on year that the defaults were increasing. It was not the data or the model but the human interpretation that was the cause of the bubble burst. In fact, those who heeded numbers and understood numbers and human biases actually made money (The Big Short provides some examples).

    Big Data is more than predictive modelling. It’s a fancy name given to the use of computer data to improve services and processes, and to harness data that is being created by humans on machines to make society more productive. A simple example of big data would be the computerization of a city’s public transport system, so that users have at their fingertips the departure, arrival and delay timings of their train, bus or metro, and the expected time of travel from point A to point B based on the prevailing traffic pattern. This involves processing gigabytes of data and presenting it to the end user for their consumption. This almost real time processing of data is also big data analytics.

    Often people say that analytics is about making a future decision based on past data. But that’s the case with every promotion that an employee gets in an organization. Any employee who is promoted to the next level gets promoted due to his past performance and expected performance in the future. There is no future data available when making that judgement call. As the number of promotions that take place in a year is limited, it’s easy to rely on human information gathering (and the quality of the information in some cases is as suspect as a database acquired from a third party) and take a futuristic decision.

    At the end of the day, understanding the context of the problem, the data collection and the framing of the hypothesis are all key ingredients in research and analytics, and I would be very wary of managers who don’t understand any of the ingredients, or only partially understand them (at least this lot will be more circumspect). I would be having sleepless nights if these managers are decision makers, because the organization would be getting infected with the “unknown unknown”, and the burst would be a matter of time.
