This blog entry stems from a course assignment in INF 202 Introduction to Data and Databases. My topic is how companies, especially retail stores, use big data as a main driver for selling products. In INF 202, students studied big data and its impact on the world. Big data is a massive amount of data stored on servers all around the world; these enormous amounts of data can be analyzed to identify changes that could improve the daily lives of millions.
One example of big data is a social network like Twitter. When people tweet, their tweets are stored. Millions of tweets are created and stored each day, which requires a massive database to hold all the information. With this tweet information, Twitter can determine trending topics, politicians can see the key issues that concern citizens, and companies can see which product is being talked about. The information in this big data can be applied to many situations.
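To make the trending-topics idea concrete, here is a minimal sketch in Python. It is not how Twitter actually computes trends; it simply counts hashtags in a handful of made-up tweets, and the function name `trending_topics` and the sample data are assumptions for illustration.

```python
import re
from collections import Counter


def trending_topics(tweets, top_n=3):
    """Return the top_n most common hashtags as (hashtag, count) pairs."""
    counts = Counter()
    for text in tweets:
        # Count each hashtag at most once per tweet, ignoring case.
        tags = {tag.lower() for tag in re.findall(r"#(\w+)", text)}
        counts.update(tags)
    return counts.most_common(top_n)


# Made-up sample data standing in for a day's worth of tweets.
sample_tweets = [
    "Loving the new phone #tech #gadgets",
    "Debate tonight #election",
    "Who won the debate? #Election #politics",
    "Battery life is great #tech",
    "Polls open early #election",
]

print(trending_topics(sample_tweets, top_n=2))
# → [('election', 3), ('tech', 2)]
```

With millions of tweets per day the same counting would have to be spread across many machines, which is the kind of work a platform like Hadoop is built for.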
InformationWeek magazine posted an article called Why Sears Is Going All-In On Hadoop, in which author Doug Henschen explains how Sears Holdings is trying to attract more shoppers by incorporating big data into its business model. Phil Shelley, the vice president of the company, saw that Sears Holdings was in decline and decided to find a better way to connect the company with its customers. He did this by introducing Apache Hadoop, which Henschen describes as “the high-scale, open source data processing platform driving the big data trend,” into the company’s strategy. Using Hadoop allows Sears Holdings to store considerable amounts of data about customer purchases. With this data, the company can analyze customer needs and build business strategies around them. One example of this is personal coupons: by analyzing a customer’s buying trends, Sears can create coupons and sales specific to that customer. This way, each customer receives coupons and sales that are relevant to them, which creates more interest in shopping.
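The personal-coupon idea can be sketched in a few lines of Python. This is only an illustration of the general approach, not Sears’ actual system: the function name `pick_coupon_categories`, the customer names, and the purchase records are all hypothetical.

```python
from collections import Counter


def pick_coupon_categories(purchases, max_coupons=2):
    """Map each customer to the product categories they buy most often.

    purchases is a list of (customer_id, category) pairs.
    """
    per_customer = {}
    for customer_id, category in purchases:
        per_customer.setdefault(customer_id, Counter())[category] += 1
    return {
        customer_id: [cat for cat, _ in counts.most_common(max_coupons)]
        for customer_id, counts in per_customer.items()
    }


# Hypothetical purchase history.
purchases = [
    ("alice", "tools"), ("alice", "tools"), ("alice", "garden"),
    ("bob", "appliances"), ("alice", "tools"), ("bob", "appliances"),
    ("bob", "clothing"), ("alice", "garden"), ("alice", "clothing"),
]

print(pick_coupon_categories(purchases))
# → {'alice': ['tools', 'garden'], 'bob': ['appliances', 'clothing']}
```

Each customer ends up with coupons for the categories they actually shop in, rather than the same flyer everyone else receives.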
Big data is appealing because of its impressive features. A big data system like Hadoop can store and process data more quickly and efficiently than older systems. In the article Henschen writes, “Sears’ process for analyzing marketing campaigns for loyalty club members used to take six weeks on mainframe, Teradata, and SAS servers. The new process running on Hadoop can be completed weekly.” The quicker data is processed, the quicker it can be used to benefit the customer. Another appealing factor is Hadoop’s storage capacity and cost. Henschen further explains, “Hadoop systems at 200 terabytes cost about one-third of 200-TB relational platforms.” Compared to previous database systems, Hadoop offers a versatile and stable platform that delivers results quickly.
The article introduces the idea of “scaling out” as opposed to the common model of “scaling up.” Scaling up is currently the main practice in businesses around the world. The term is used for databases and means buying better, more expensive hardware to replace older models. With better equipment more data can be stored, but the main problem is cost: new technology costs a lot of money and can become outdated fast. In comparison, scaling out provides a cheaper and more efficient way to store data because the data is spread across many nodes grouped into clusters. Scaling out is also considered more reliable than scaling up. If the single machine in a scaled-up system crashes, a lot of data can be lost, but in a scaled-out system the nodes keep duplicate copies of the data, which prevents data loss.
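The reliability argument for scaling out can be shown with a toy model in Python. This is not Hadoop’s implementation; the `Cluster` class, the node names, and the placement rule are assumptions made for illustration. Each record is copied to more than one node, so a single crash does not lose it.

```python
class Cluster:
    """Toy scale-out store that keeps several copies of every record."""

    def __init__(self, node_names, replicas=2):
        self.names = list(node_names)
        self.storage = {name: {} for name in self.names}
        self.alive = set(self.names)
        self.replicas = min(replicas, len(self.names))

    def put(self, key, value):
        # Pick a starting node from the key, then copy to the next nodes.
        start = sum(key.encode("utf-8")) % len(self.names)
        for offset in range(self.replicas):
            name = self.names[(start + offset) % len(self.names)]
            self.storage[name][key] = value

    def fail(self, name):
        # A crashed node loses everything it held.
        self.alive.discard(name)
        self.storage[name] = {}

    def get(self, key):
        # Read from any surviving node that still has a copy.
        for name in self.names:
            if name in self.alive and key in self.storage[name]:
                return self.storage[name][key]
        return None


cluster = Cluster(["node-a", "node-b", "node-c"], replicas=2)
cluster.put("order-1001", "lawn mower")
holders = [n for n in cluster.names if "order-1001" in cluster.storage[n]]
cluster.fail(holders[0])          # one of the two copies is destroyed
print(cluster.get("order-1001"))  # the other copy is still readable
```

Setting `replicas=1` on a one-node cluster models the scaled-up case: once that machine fails, `get` returns `None` because no other copy exists.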
The use of Hadoop by Sears Holdings is important because I expect it to become the norm within a few years. For economic reasons, storing large amounts of customers’ data and having quick access to it is practical only with technologies such as Hadoop. In my opinion, stores using large amounts of customers’ data is beneficial but risky. For me to feel comfortable about large amounts of data being stored based on my purchases, there need to be laws and regulations that protect the privacy of the customer. There need to be preventive measures to ensure that customer information does not get into the wrong hands. If these terms and conditions can be met, then the use of big data in stores would provide a better experience for the customer, and a happy customer will create more business for the store. Balancing privacy with use will allow both sides to gain the benefits of big data.
Henschen, Doug. “Why Sears Is Going All-In On Hadoop.” InformationWeek. 31 Oct. 2012. Web. 16 Dec. 2012. http://www.informationweek.com/global-cio/interviews/why-sears-is-going-all-in-on-hadoop/240009717#disqus_thread
Grzegorz Haranczyk, Spring 2012, Information Science