Tweets Sentiment Analyzer

  • Date: Nov 2017
  • Category: Data Science
  • Key Tags: Spark, Kafka, Big Data

This project analyzes the sentiment of tweets on Twitter in real time.

Introduction

Sentiment Analysis is the process of determining whether a piece of writing is positive, negative or neutral. It's also known as opinion mining, deriving the opinion or attitude of a speaker. For example,

  • “President Donald Trump approaches his first big test this week from a position of unusual weakness.” - has positive sentiment.
  • “Trump has the lowest standing in public opinion of any new president in modern history.” - has neutral sentiment.
  • “Trump has displayed little interest in the policy itself, casting it as a thankless chore to be done before getting to tax-cut legislation he values more.” - has negative sentiment.
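The three-way split above can be sketched as a mapping from a numeric polarity score to a label. This is a minimal sketch, not the project's actual code: the function names and the `neutral_band` threshold are assumptions made for illustration. TextBlob's `sentiment.polarity` is used as one possible score source because it returns a value in [-1.0, 1.0].

```python
def label_from_polarity(polarity, neutral_band=0.05):
    """Map a polarity score in [-1.0, 1.0] to a sentiment label.

    neutral_band is an assumed cutoff: scores within +/- neutral_band
    of zero are treated as neutral.
    """
    if polarity > neutral_band:
        return "positive"
    if polarity < -neutral_band:
        return "negative"
    return "neutral"


def textblob_polarity(text):
    """Score text with TextBlob (requires `pip install textblob`)."""
    # Imported lazily so label_from_polarity has no third-party dependency.
    from textblob import TextBlob
    return TextBlob(text).sentiment.polarity
```

With TextBlob installed, a sentence could be labeled with `label_from_polarity(textblob_polarity(sentence))`. The label any given sentence receives depends on the analyzer and on the chosen threshold.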

The above examples are taken from CNBC news: http://www.cnbc.com/2017/03/22/trumps-first-big-test-comes-as-hes-in-an-unusual-position-of-weakness.html

You can use any third-party sentiment analyzer, such as Stanford CoreNLP (Java/Scala) or NLTK (Python). For example, you can add Stanford CoreNLP as an external library to a Scala/Java project using SBT or Maven. In Python, install NLTK with pip and then import it.

Sentiment analysis using Spark Streaming: In Spark Streaming, create a Kafka consumer (for Python, shown in the class for streaming) and periodically collect filtered tweets (required for both Scala and Python) from the scraper. For each hashtag, perform sentiment analysis using one of the sentiment analysis tools discussed above. Then, for each hashtag, save the output together with the tweet itself.
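The per-hashtag step described above can be sketched in plain Python, independent of Spark. This is an illustrative sketch under stated assumptions: `label_by_hashtag`, `tweets_from_kafka`, the topic name, and the broker address are hypothetical names, and the classifier is passed in as a function so any analyzer can be plugged in. The Kafka helper uses the `KafkaConsumer` class from the kafka-python package and needs a running broker.

```python
import re
from collections import defaultdict

HASHTAG_RE = re.compile(r"#(\w+)")


def label_by_hashtag(tweets, classify):
    """Group (tweet, label) pairs under every hashtag the tweet contains.

    classify is any function mapping tweet text to a sentiment label.
    Tweets without hashtags are skipped.
    """
    results = defaultdict(list)
    for tweet in tweets:
        label = classify(tweet)  # classify each tweet once
        # Lower-case and de-duplicate tags so #FBI and #fbi share a key.
        for tag in {t.lower() for t in HASHTAG_RE.findall(tweet)}:
            results[tag].append((tweet, label))
    return dict(results)


def tweets_from_kafka(topic, servers="localhost:9092"):
    """Yield tweet text from a Kafka topic (requires kafka-python and a broker).

    The topic name and server address are assumptions for illustration.
    """
    from kafka import KafkaConsumer  # lazy import: only needed for live data
    consumer = KafkaConsumer(topic, bootstrap_servers=servers)
    for message in consumer:
        yield message.value.decode("utf-8")
```

In a streaming job, the same grouping would run on each micro-batch of collected tweets. Here a plain list or the generator above can stand in for the stream, for example `label_by_hashtag(batch, my_classifier)`.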

See project code: Source Code

Functions

  • Coded a web scraper that searches for tweets with given hashtags using tweepy.
  • Cleaned and pre-processed the collected tweets using Spark Streaming and Spark.
  • Solved the producer-consumer problem in Kafka using kafka-python.
  • Implemented sentiment analysis of tweets using TextBlob, reaching a 90% accuracy rate.
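The cleaning step listed above might look like the sketch below. The source does not say which cleaning rules the project applied, so every rule here (dropping the retweet prefix, URLs, and mentions, keeping the hashtag word, and collapsing whitespace) is an assumption, and `clean_tweet` is a hypothetical name.

```python
import re

URL_RE = re.compile(r"https?://\S+")
RT_PREFIX_RE = re.compile(r"^RT\s+@\w+:\s*")
MENTION_RE = re.compile(r"@\w+")
WHITESPACE_RE = re.compile(r"\s+")


def clean_tweet(text):
    """Strip retweet prefix, URLs, and mentions, and keep hashtag words."""
    text = RT_PREFIX_RE.sub("", text)   # "RT @user: " at the start
    text = URL_RE.sub("", text)         # links such as https://t.co/...
    text = MENTION_RE.sub("", text)     # remaining @mentions
    text = text.replace("#", "")        # keep the hashtag word, drop the '#'
    # Collapse newlines and repeated spaces so the tweet is one line.
    return WHITESPACE_RE.sub(" ", text).strip()
```

Collapsing newlines means a multi-line tweet is passed to the analyzer as a single piece of text instead of being scored line by line.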

Example

RT @USAloveGOD: #Holder, #Comey fight Trump's #FBI slam: 'Not letting this go'
We the #American people feel the @FBI have cheated u…
negative
RT @SusanNow3: Trump just endorsed Roy Moore,  a man who thinks:
1.
negative
Gay people should be put in jail.
positive
2.
negative
Women shouldn't hold pu…
negative
We can only hope It is fake news?
negative
https://t.co/rYWwoh0v83
negative
RT @TheRickyDavila: Boom.
negative
🔥👏🔥👏🔥 https://t.co/kuv1GqfeLe
negative