Business of Data Warehousing Foundations
mySupermarket is a grocery shopping and comparison website which aims to provide customers with the best price for their shopping. This report examines how data warehousing provided mySupermarket with the foundation on which to build a successful enterprise, and allowed a subsequent expansion into the ‘business intelligence’ sector. The research draws attention to the problems and limitations that mySupermarket encountered, including coping with diverse data streams, customer loyalty issues, achieving real-time data, data integrity and generating a sustainable revenue stream. These problems were tackled by building their own data warehouse, adopting a CRM strategy underpinned by that warehouse, adopting Microsoft’s SQL software, ‘crawling’ the supermarket websites, offering ‘targeted’ advertising space and realising that the granularity of detail they offered would allow them to expand into the ‘business intelligence’ sector.
The report appreciates the importance of storing data, but concludes that data itself is the prerequisite to success, and that good management is needed to convert this data into meaningful information. It is therefore a combination of data warehousing and good management that has enabled mySupermarket to become a successful venture.
“On the 31st August 2006, entrepreneur Johnny Stern received a seven-figure sum from investors to transform the way consumers shop for their groceries. From this, the price comparison site mySupermarket.co.uk was born and the company has utilised data warehousing to give consumers access to cheaper grocery shopping. The venture has not been without its problems, however four years on the company has withstood Adam Smith’s ‘Invisible Hand’ and grown into a c.£10m company…”
mySupermarket is a grocery shopping and comparison site that allows customers to compare and shop from four main UK supermarkets in one central place. Their mission statement is “to get the best possible price for your supermarket trolley while enjoying an easier and more consumer-friendly shopping experience”. Through the use of SQL and data warehousing, mySupermarket is able to collect product pricing, promotion and availability data directly from retailers’ websites. It then uses its proprietary technology to match identical Stock Keeping Units (SKUs) across retailers. After initial investment from Greylock Partners and Pitango Venture Capital (investors in Facebook & LinkedIn), mySupermarket have faced the same difficulties as other price comparison sites in generating a sustainable revenue stream after Stern declared that “the portal would remain free in principle for shoppers”.
The customer proposition for mySupermarket is straightforward. Customers first log into their account, then choose which supermarket to shop at from Asda, Ocado/Waitrose, Sainsbury or Tesco, select a delivery time and date, and start to shop. mySupermarket is updated on a daily basis so that the prices shown are current. Customers shop using the tabbed choices along the top of the page, which are divided into “virtual aisles” such as Fruit and Veg, Meat, Fish & Poultry, or Drinks. Once shopping has been completed, the site shows the basket price across the four supermarkets and gives the customer an opportunity to switch supermarkets.
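The basket comparison described above can be sketched in a few lines of Python. This is a minimal illustration only: the product names, the prices (held in pence) and the function names are hypothetical, and nothing here reflects mySupermarket's actual implementation.

```python
from typing import Dict

# Hypothetical prices in pence per retailer; illustrative figures, not real data.
PRICES_PENCE: Dict[str, Dict[str, int]] = {
    "Asda": {"baked beans": 50, "milk": 125, "bread": 100},
    "Ocado/Waitrose": {"baked beans": 65, "milk": 130, "bread": 120},
    "Sainsbury": {"baked beans": 55, "milk": 125, "bread": 110},
    "Tesco": {"baked beans": 52, "milk": 120, "bread": 105},
}

# Hypothetical basket: item -> quantity.
BASKET: Dict[str, int] = {"baked beans": 4, "milk": 2, "bread": 1}


def basket_totals(basket: Dict[str, int],
                  prices: Dict[str, Dict[str, int]]) -> Dict[str, int]:
    """Total basket cost in pence for every retailer that stocks all items."""
    totals = {}
    for retailer, catalogue in prices.items():
        if all(item in catalogue for item in basket):
            totals[retailer] = sum(catalogue[item] * qty
                                   for item, qty in basket.items())
    return totals


def cheapest(totals: Dict[str, int]) -> str:
    """Name of the retailer with the lowest basket total."""
    return min(totals, key=totals.get)


totals = basket_totals(BASKET, PRICES_PENCE)
print(totals, "cheapest:", cheapest(totals))
```

Holding prices as integer pence avoids rounding errors when totals are compared.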
This report will critically discuss how data warehousing has enabled mySupermarket to build a successful business model including the benefits and problems that have arisen from the use of this technology. The report will finally analyse the extent to which data warehousing has contributed to mySupermarket’s success.
According to Bill Inmon (1993), a data warehouse can be defined as “a subject-orientated, integrated, time variant and non-volatile, collection of data in support of the management decision making process”. It is, in essence, a large data storage facility which enables an enterprise to gain a competitive advantage through analytics and business intelligence. Providing integrated access to multiple, distributed, heterogeneous databases and other information sources has become one of the leading issues in database research and industry (IEEE Computer, 1991), as can be seen in the success of First American Corporation (FAC) (Cooper et al., 2000) and Tesco/Dunnhumby (Perry, 2009).
Data mining is the process of ‘digging-out’ patterns from data, usually through Clustering, Classification, Regression and Association rule learning. Data mining technology can generate new business opportunities by providing:
• Automated prediction of trends and behaviours.
• Automated discovery of previously unknown or hidden patterns – D. Champion and C. Coombs (2010)
This process is carried out by sophisticated software from vendors such as Oracle and IBM, and by SQL-based tools. This alleviates the (potentially) very time-consuming task of manually inputting and analysing the data.
Within data warehousing, a high importance is placed on the quality of data, as without it meaningful analysis is impossible. Data should therefore be collected at a high level of detail and against solid definitions, so as to avoid subjectivity.
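As a small illustration of association rule learning, one of the techniques named above, the sketch below computes the support and confidence of a rule such as "bread implies butter" over a handful of baskets. The baskets and function names are invented for the example; real data mining packages work on far larger data sets and use more efficient algorithms.

```python
from typing import List, Set

# Hypothetical shopping baskets, invented for illustration.
BASKETS: List[Set[str]] = [
    {"bread", "butter"},
    {"bread", "butter", "jam"},
    {"bread", "milk"},
    {"milk", "eggs"},
    {"bread", "butter", "milk"},
]


def support(items: Set[str], baskets: List[Set[str]]) -> float:
    """Fraction of baskets that contain every item in `items`."""
    return sum(1 for b in baskets if items <= b) / len(baskets)


def confidence(antecedent: Set[str], consequent: Set[str],
               baskets: List[Set[str]]) -> float:
    """Of the baskets containing the antecedent, the fraction that also contain the consequent."""
    return support(antecedent | consequent, baskets) / support(antecedent, baskets)


print("support(bread, butter):", support({"bread", "butter"}, BASKETS))
print("confidence(bread -> butter):", confidence({"bread"}, {"butter"}, BASKETS))
```

A rule with high support and high confidence is a candidate pattern worth acting on, for example when deciding which products to promote together.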
The purpose of a data warehouse is to support creative strategic decision making through a greater granularity of information with a consistent view of what’s happening.
Customer Relationship Management (CRM) emerged in the 1990s at a time when customers were becoming better informed and less brand loyal. CRM is an integration of technologies and business processes used to satisfy the needs of a customer during any interaction (Bose, 2002, p. 89) and is underpinned by data warehousing. As with VISION in the FAC case (2000), the benefit of CRM is that firms are able to exploit the ‘80:20 principle’, which states that some customers are more important or profitable than others. The information needed to identify those customers can only come through data warehousing and data mining.
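The ‘80:20 principle’ can be checked against warehouse data with a simple calculation: rank customers by profit and measure the share contributed by the top fifth. The sketch below assumes a hypothetical mapping of customer IDs to profit; the figures are invented for illustration.

```python
from typing import Dict

# Hypothetical profit per customer (in pounds), invented for illustration.
PROFIT: Dict[str, int] = {
    "c01": 400, "c02": 300, "c03": 60, "c04": 50, "c05": 50,
    "c06": 40, "c07": 40, "c08": 30, "c09": 20, "c10": 10,
}


def top_customer_share(profit_by_customer: Dict[str, int],
                       top_percent: int = 20) -> float:
    """Share of total profit contributed by the most profitable `top_percent` of customers."""
    ranked = sorted(profit_by_customer.values(), reverse=True)
    top_n = max(1, len(ranked) * top_percent // 100)  # at least one customer
    return sum(ranked[:top_n]) / sum(ranked)


print(f"Top 20% of customers contribute {top_customer_share(PROFIT):.0%} of profit")
```

In this invented data set the top two of ten customers contribute 70% of profit, which is the kind of concentration a CRM strategy seeks to identify.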
mySupermarket – The Beginning
The inspiration for mySupermarket came from Stern’s bargain-obsessed elderly relative, who would scour the aisles of Tesco to find his favourite tin of baked beans, jot down the price and travel to competitor stores to try to find a better deal. Stern identified the growing interest in online grocery shopping and felt that it was an area that could be exploited (Fig. 1).
Figure 1: Mintel Intelligence – Online Grocery Data
Stern spent 18 months before the launch developing the software and tweaking the concept (Fig. 2).
Figure 2: Adaption of Martin et al., 2005: 193
The data warehouse was developed by ‘crawling’ the four supermarket websites and adding product pricing, promotion and availability data to the warehouse. Once this data was loaded into the warehouse, proprietary technology and SQL software allowed mySupermarket to match identical SKUs across retailers. The data was also used in developing its CRM strategy: ‘cookies’ stored on the customer’s computer through their browser record whether the computer has visited the site before and which SKUs were purchased. “This enables us to operate an efficient service and to track the patterns of behaviour of visitors to the website.” – mysupermarket.co.uk. mySupermarket uses this information to create functions such as a ‘Regular Shop’ button, saving customers time on their shopping.
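The matching of identical SKUs across retailers is done with proprietary technology that the source does not describe. Purely as an illustration of the general idea, the sketch below normalises product descriptions into a comparable key and groups listings that share it. The listings, prices and function names are hypothetical, and real matching would need to handle many more variations.

```python
import re
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical crawled listings: (retailer, description, price in pence).
LISTINGS: List[Tuple[str, str, int]] = [
    ("Tesco", "Heinz Baked Beans 415g", 52),
    ("Asda", "Heinz Beans, Baked (415 g)", 50),
    ("Sainsbury", "HEINZ baked beans 415G", 55),
    ("Tesco", "Activia Strawberry Yoghurt 4x125g", 150),
]


def normalise(description: str) -> str:
    """Reduce a product description to an order-independent key."""
    text = description.lower()
    text = re.sub(r"[^a-z0-9 ]", " ", text)                  # drop punctuation
    text = re.sub(r"(\d+)\s*(kg|g|ml|l)\b", r"\1\2", text)   # "415 g" -> "415g"
    return " ".join(sorted(set(text.split())))


def match_skus(listings: List[Tuple[str, str, int]]) -> Dict[str, Dict[str, int]]:
    """Group listings by normalised key: key -> {retailer: price in pence}."""
    matched: Dict[str, Dict[str, int]] = defaultdict(dict)
    for retailer, description, price in listings:
        matched[normalise(description)][retailer] = price
    return dict(matched)


for key, prices in match_skus(LISTINGS).items():
    print(key, prices)
```

Once listings share a key, comparing the price of the same product across retailers becomes a simple lookup.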
mySupermarket – Problems
Many problems can arise through the use of data warehousing, both technically and commercially. According to Mintel Intelligence (2009), “Consumer loyalty is fairly low in the [price comparison] market – with more than 14 million people (c.58% of market) having used three or more different price comparison sites”. To succeed in such a market, mySupermarket recognised that it must know its customers exceptionally well and leverage that knowledge in website design, service and interaction with its clients. mySupermarket would therefore have to find a strategy to retain a ‘loyal customer base’ in a notoriously disloyal sector.
Kimball & Ross (2002) state that a common pitfall of data warehousing is to “presume that the business, its requirements, analytics, underlying data and supporting technology are static”. An early problem mySupermarket encountered in this respect was the variation in regional pricing and a growing demand for ‘real-time’ data.
Another problem with data warehousing is ensuring the integrity of data. Data entry is typically a human procedure and so subject to human error. Even the most sophisticated data mining systems cannot produce good analysis from poor data. A good illustration of this comes from Blastland and Dilnot’s ‘The Tiger That Isn’t’, where a hospital survey found that an alarming number of patients had apparently been born on the 11th November 1911. Further investigation showed that nurses often would not fill in patient files properly and, to save time when asked to enter a patient’s date of birth, would type 11/11/11 into the database. No matter how intelligent a computer system is, if you put ‘garbage-in’ you will get ‘garbage-out’.
Beynon-Davies (2004) states that data warehousing projects are large-scale development projects typically taking up to three years to complete. The challenges of such projects include selecting, installing and integrating the different hardware and software. In addition, the diverse sources of data feeding a data warehouse introduce design problems in terms of creating a homogeneous data store.
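A placeholder value of this kind can often be caught with a simple frequency check before the data is loaded into the warehouse. The sketch below flags any value that accounts for an implausibly large share of a column; the records, the threshold and the function name are all hypothetical.

```python
from collections import Counter
from datetime import date
from typing import Dict, Hashable, List


def suspicious_values(values: List[Hashable],
                      threshold: float = 0.1) -> Dict[Hashable, int]:
    """Return values whose share of all records exceeds `threshold`, with their counts."""
    counts = Counter(values)
    total = len(values)
    return {value: n for value, n in counts.items() if n / total > threshold}


# Hypothetical dates of birth: 15 distinct plausible dates plus 5 placeholder entries.
dobs = [date(1950 + i, 1 + i % 12, 1 + i) for i in range(15)]
dobs += [date(1911, 11, 11)] * 5

print(suspicious_values(dobs))
```

A flagged value is not necessarily wrong, but it tells the data owner where to look before any analysis is trusted.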
Finally, as with all comparison sites, the major obstacle facing mySupermarket was generating a sustainable revenue stream from the database they had accumulated. mySupermarket.co.uk had not generated any revenue 5 months after the website went live. Originally, mySupermarket didn’t operate a ‘search advertisement’ scheme (a central platform for companies such as Google, e.g. BP paying for advertising of their oil spill cleanup when people typed in “BP Oil Spill” – G. Cheeseman, 2010). There were also no revenue-sharing agreements in place with the four stores whose prices it monitors, in an effort to remain independent. The lack of revenue may, in part, have stemmed from mySupermarket’s limited market, consisting of ‘a comparison of groceries’. mySupermarket recognised that they would have to expand their focus if they were to generate a large enough turnover to operate a successful business.
mySupermarket – Technological Impact
The first problem mySupermarket addressed was the industry’s poor ‘customer loyalty’. They decided to attack this through the implementation of a CRM strategy. After the initial launch, mySupermarket was receiving feedback from customers regarding such things as healthy options, promotions on offer, printable shopping lists and regular shops. mySupermarket realised that the information stored in their data warehouse could be exploited to meet these demands and increase customer utility. Subsequently, a Health Checker feature was launched based on the Food Standards Agency’s approved ‘traffic light’ system. In November 2008, the mySupermarket ‘Quick Shop’ function was added, allowing users to type their shopping list on a virtual notepad and find their required items in one go.
“As delivery slots started running out towards Christmas we also introduced a new ‘print your shopping list’ feature, which was popular,” said Stern. “A lot of our shoppers are using the website as a quick way to find the best deals and are then going to the supermarket to make their purchases”. Recent analysis of visits shows mysupermarket.co.uk has a loyal repeat following, with Stern claiming visitors are spending an average of 20 minutes on the site.
“Until recently, there were few viable tools to provide real-time data warehousing nor an absolutely current picture of an organization’s business and customer” J. Vandermay (2001). To combat the problem of achieving real-time and regional data, mySupermarket used Microsoft’s SQL software. Most data integration solutions focus on moving data only between homogeneous systems and database software. However, SQL integration is capable of moving data among a wide range of databases and systems. It also offers transformational data integration tools to consolidate and synchronize heterogeneous data into a warehouse. This allows consumers to view whether a certain item is in stock in their local store, or view delivery slots for their specific region. This real-time data saves the mySupermarket team having to continually update the warehouse manually.
Fortunately for mySupermarket, their website ‘crawling’ technique allows them to take the SKU data directly from the supermarkets themselves. The data will therefore only be wrong if the supermarket itself has made the mistake (in which case it would have to sell the item at that price), and so mySupermarket would not be liable.
Although Stern took half the time recommended by Beynon-Davies, the warehouse has had to be continuously tweaked since its launch. After its launch mySupermarket noticed a data stream that wasn’t being filtered into the data warehouse – calories. After the realisation, mySupermarket were able to add a ‘calorie counter’ function on to the website.
For any business to survive, it needs to generate a revenue stream to achieve a sustainable cash flow. mySupermarket were able to negotiate with supermarkets a commission of £5 for every ‘first-time buyer’ that shops through their site and £1 every time thereafter. Other sources of revenue came from advertising, which could be split into two different segments: on-site and search-related advertising. Marks & Spencer (Fig. 3) is one company that has chosen to advertise with mySupermarket.co.uk, as the content is relevant and it is independent from the four supermarkets being compared. Advertisers will typically pay $1.00 – $1.50 per 1,000 run-of-site impressions for the advertising placement. However, advertisers may pay even more for targeted sidebar advertisements. Search advertisements are targeted to match key search terms entered on the search engine, so that the advertised products appear first in the search results. Danone (Fig. 3) has paid for advertisement when the search term ‘yoghurt’ is entered, and so its brands (e.g. Activia) show at the top of the list, increasing their probability of being bought.
Figure 3: mySupermarket.co.uk – advertising example
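The commission and advertising rates quoted above lend themselves to a short worked calculation. The rates (£5 per first-time buyer, £1 per subsequent order, $1.00 to $1.50 per 1,000 impressions) come from the text; the volumes used below are purely hypothetical, and the two revenue streams are kept in their original currencies rather than converted.

```python
FIRST_ORDER_COMMISSION_GBP = 5   # per first-time buyer (from the text)
REPEAT_ORDER_COMMISSION_GBP = 1  # per subsequent order (from the text)


def commission_gbp(first_time_buyers: int, repeat_orders: int) -> int:
    """Commission in pounds earned from supermarkets."""
    return (FIRST_ORDER_COMMISSION_GBP * first_time_buyers
            + REPEAT_ORDER_COMMISSION_GBP * repeat_orders)


def ad_revenue_usd(impressions: int, rate_per_thousand: float) -> float:
    """Run-of-site advertising revenue in dollars at a given rate per 1,000 impressions."""
    return impressions / 1000 * rate_per_thousand


# Hypothetical monthly volumes, chosen only to illustrate the arithmetic.
print("Commission (GBP):", commission_gbp(first_time_buyers=1_000, repeat_orders=4_000))
print("Ad revenue (USD):", ad_revenue_usd(2_000_000, 1.00), "to", ad_revenue_usd(2_000_000, 1.50))
```

With these invented volumes the commission stream is worth £9,000 and run-of-site advertising between $2,000 and $3,000, which shows why targeted advertising at a higher rate is attractive.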
Due to the amount of data available to mySupermarket for mining, an opportunity was identified for expansion, called ‘mySupermarket insights’. It acts as a real-time B2B data service for the ‘Fast Moving Consumer Goods’ (FMCG) sector. As mySupermarket has access to SKU-by-SKU trends, it is able to offer extremely detailed, intelligent data. The services it offers include New Product Development (NPD) alert reports, online auditing reports, price comparison reports, product substitution reports and customer profiling reports (allowing for further use of CRM through ‘cluster analysis’). This sort of information is of high value to companies, and a subscription to the service can range from £5,000 to £20,000 p.a. (current clients include Kellogg’s, Innocent Smoothies, Nielsen and Ella’s Kitchen).
Finally, mySupermarket is often contracted by media companies, such as ‘the Independent’ to analyse trends for news stories – J. Burchill (2010).
I feel that information is now widely recognised as being one of the key corporate resources, needing to be carefully managed so that it can be effectively utilised in the decision-making process. Timely, accurate and relevant information can only be generated, however, if corporate data is stored in a secure, accessible and flexible manner.
The following table provides a summary of the impact that data warehousing technology had for mySupermarket:
Figure 6: Technological Impact Summary
mySupermarket – Conclusion
To conclude, data warehousing has enabled mySupermarket to overcome issues such as customer retention, real-time data and generating revenue. It really does appear that “information is key”, whereby data is the prerequisite for information. J. Poole et al. (2003) state ‘… the underlying economic justification is ultimately based on the value a given technology provides to the customers of the computing systems and software products’ and so the determinant of mySupermarket’s success is essentially based on ‘whether people use the technology’ and ‘the value of the company’. Based on a monthly unique user level of 1 million, and 500k registered users turning over c.£10m, we can assume that at this point in time mySupermarket is justified economically.
On the other hand, you could argue that mySupermarket is a ‘recession business’ and not a sustainable enterprise, in which case the rapid growth in recent years could be due to the economic climate and not to any long-term demand.
Looking towards the future, “Our investors have international ambitions,” Stern said. “They see the potential of transporting the model to different markets.” mySupermarket are looking to expand the company’s development team to support its entry into Europe and the US. mySupermarket are currently looking for another round of funding to bridge G. Murray’s (1994) second equity gap. Technology firms often require ‘follow on development funding’, as cash is heavily plowed into ‘Prototype testing’ and ‘Research & Development’. In terms of an exit, mySupermarket would be very attractive to major FMCG companies such as P&G, Unilever and Kraft’s venture arms. I believe that mySupermarket will achieve their second round funding as they are now profitable and have a proven concept that has high growth prospects for the future.
Over the past few years there has been a huge growth in the use of ‘numbers’ and ‘analytics’. Businesses are recognising that it is not enough to work harder than the competition; they also have to work smarter. Davenport (2006) argues that it is “virtually impossible to differentiate yourself from competitors based on products alone” and so to pull ahead of the pack, businesses need to compete on analytics. If so, ‘mySupermarket insights’ is in a perfect position to capitalise on this new thirst for ‘business intelligence’, as companies feel that they will have to subscribe to the service to compete on an even playing field.
However, it is not enough just to store data; it has to be managed, analysed, implemented and utilised to convert raw data into real information. mySupermarket realised the benefits of data warehousing and were able to exploit them, expanding from a mere ‘price-comparison site’ to a ‘business intelligence provider’ to major FMCG companies. I believe that with the current shift towards analytics and business intelligence, mySupermarket has the potential to be a major force in the FMCG sector whilst offering a greater transparency for customers, all of which stems from good management and data warehousing.
Beynon-Davies, P (2004) – Database Systems, 3rd edition, Palgrave, Basingstoke, pp. 527-538 and 547-553
Bose, R (2002) – Customer Relationship Management: Key concepts for IT success, Vol. 102, No. 2, pp. 89-97
Blastland, M & Dilnot, A (2007) – The Tiger That Isn’t: Seeing a World Through Numbers
Burchill, J (Aug 2010) – The Independent: So the Prince of Green Hypocrites is going on tour. Thank God I’ll be abroad
Cooper et al. (2000) – Data Warehousing Supports Corporate Strategy at First American Corporation Vol. 24, No. 4
Champion, D & Coombs, C (2010) – Handout: BSC070 Enterprise Information Systems
Cheeseman, G (June 2010) – Triple Pundit: Is It Ethical For BP To Buy Oil-Spill-Related Google Search Terms?
Davenport, T. H (2006) – Competing on Analytics
IEEE Computer (Dec 1991) – Special Issue on Heterogeneous Distributed Database Systems, 24(12)
Inmon, W.H. and Kelley, C (1993) – Developing the Data Warehouse. QED Publishing Group, Boston, Massachusetts
Kimball, R & Ross, M (2002) – The Data Warehouse Toolkit: The Complete Guide to Dimensional Modeling, 2nd edition
Martin et al., (2005): 193 – Managing Information Technology 5th Edition, Pearson Education Inc, pp. 192-195
Mintel Intelligence (Oct 2009) – Web Aggregators, UK
Murray, G (1994) – The Second ‘Equity Gap’: Exit Problems for Seed and Early Stage Venture Capitalists
Perry, J (Nov 2009) – Dunnhumby: A lifetime of loyalty? RetailWeek
Poole, J et al. (2003) – Common Warehouse Metamodel: Introduction to the standard for data warehouse integration
Smith, A (1759) – ‘The Theory of Moral Sentiments’
Vandermay, J (2001) – Considerations for Building a Real-time Data Warehouse