Wednesday, 20 August 2014

Flipkart vs. Amazon.in: Sentiment analysis using Twitter posts


Flipkart and Amazon India are emerging as the two biggest players in India's rapidly growing online retail industry. Although Amazon started its operations in India much later than Flipkart, it is giving Flipkart tough competition. Only time will tell which will surpass the other in the long run, but it is evident that how effectively each captures customers' needs, and how quickly it responds to them, will play a major role.

In this exercise I tried to capture customers' sentiments from their Twitter posts. I used R (http://www.r-project.org/) for this exercise. Below are the key steps of the analysis process.

1.     Search for the Twitter handles (@Flipkart for Flipkart tweets and @amazonIN for Amazon India tweets) and scrape the matching tweets. I used the twitteR package in R (http://cran.r-project.org/web/packages/twitteR/index.html) to fetch the tweets.

2.     Perform pre-processing, such as removing duplicate tweets.

3.     Apply a sentiment analysis algorithm to assign each tweet to one of two groups: positive sentiment or negative sentiment. I used a pretty simple algorithm that takes into account the occurrence of positive and negative sentiment words in each tweet. For the sentiment words I used publicly available dictionaries.
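
To make steps 2 and 3 concrete, here is a rough sketch of what duplicate removal and word-count scoring could look like. It is written in Python rather than the R used for this exercise, and it is not the actual code behind the results. The word lists, the sample tweets, and the function names are all made-up placeholders, and treating a tie as "neutral" is an assumption, since the steps above only describe positive and negative groups.

```python
import re

# Tiny stand-in word lists (assumption). The real analysis used publicly
# available sentiment dictionaries, which are not reproduced here.
POSITIVE_WORDS = {"good", "great", "love", "fast", "happy"}
NEGATIVE_WORDS = {"bad", "late", "poor", "worst", "broken"}


def deduplicate(tweets):
    """Drop exact duplicate tweets while keeping the original order."""
    return list(dict.fromkeys(tweets))


def sentiment_score(tweet):
    """Count of positive words minus count of negative words in one tweet."""
    words = re.findall(r"[a-z']+", tweet.lower())
    positives = sum(word in POSITIVE_WORDS for word in words)
    negatives = sum(word in NEGATIVE_WORDS for word in words)
    return positives - negatives


def classify(tweet):
    """Label a tweet by the sign of its score; ties become 'neutral' (assumption)."""
    score = sentiment_score(tweet)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# Made-up example tweets, for illustration only.
sample_tweets = [
    "@Flipkart great service, love the fast delivery",
    "@Flipkart great service, love the fast delivery",
    "@amazonIN worst packaging, item arrived broken",
]
labels = [classify(tweet) for tweet in deduplicate(sample_tweets)]
```

With real dictionaries in place of the placeholder sets, the same counting idea applies unchanged; only the word lists grow.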

So, as you can see, it’s quite simple and fast. The only caveat is that the Twitter web API restricts the number of tweets one can access. Nevertheless, one can access thousands of tweets, which is enough for a not-so-exhaustive analysis.

OK, now here is what we did all this for: the results. The conclusion is that both Flipkart and Amazon score impressively on customer sentiment, but Amazon performs slightly better. For Flipkart, around 64% of the tweets under analysis carry positive sentiment and 36% carry negative sentiment. For Amazon, these figures are 73% and 27% respectively. Below is the graph depicting these numbers.
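
For readers who want to see how such percentages come out of per-tweet labels, here is a small Python sketch (again, not the R code used for this exercise). The function name is a placeholder, and the label counts are hypothetical, chosen only so that the output reproduces a 64/36 split; the actual number of tweets analysed is not stated in this post.

```python
def sentiment_shares(labels):
    """Return the rounded percentage of positive and negative labels.

    Labels other than 'positive' and 'negative' are ignored.
    """
    positive = labels.count("positive")
    negative = labels.count("negative")
    total = positive + negative
    if total == 0:
        raise ValueError("no positive or negative tweets to summarise")
    return {
        "positive": round(100 * positive / total),
        "negative": round(100 * negative / total),
    }


# Hypothetical label counts, not the real data.
example_labels = ["positive"] * 64 + ["negative"] * 36
shares = sentiment_shares(example_labels)
```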



In the next post, I will try to show a word cloud supporting the above trends.

Thanks for now!
  
