Analyze Sentiment in Text
This example shows how to use the Valence Aware Dictionary and sEntiment Reasoner (VADER) algorithm for sentiment analysis.
The VADER algorithm uses a list of annotated words (the sentiment lexicon), where each word has a corresponding sentiment score. The VADER algorithm also utilizes word lists that modify the scores of proceeding words in the text:
Boosters – words or n-grams that boost the sentiment of proceeding tokens. For example, words like "absolutely" and "amazingly".
Dampeners – words or n-grams that dampen the sentiment of proceeding tokens. For example, words like "hardly" and "somewhat".
Negations – words that negate the sentiment of proceeding tokens. For example, words like "not" and "isn't".
To evaluate sentiment in text, use the vaderSentimentScores
function.
Load Data
Extract the text data in the file weekendUpdates.xlsx
using readtable
. The file weekendUpdates.xlsx
contains status updates containing the hashtags "#weekend"
and "#vacation"
.
filename = "weekendUpdates.xlsx"; tbl = readtable(filename,'TextType','string'); head(tbl)
ID TextData __ _________________________________________________________________________________ 1 "Happy anniversary! ❤ Next stop: Paris! ✈ #vacation" 2 "Haha, BBQ on the beach, engage smug mode! 😍 😎 ❤ 🎉 #vacation" 3 "getting ready for Saturday night 🍕 #yum #weekend 😎" 4 "Say it with me - I NEED A #VACATION!!! ☹" 5 "😎 Chilling 😎 at home for the first time in ages…This is the life! 👍 #weekend" 6 "My last #weekend before the exam 😢 👎." 7 "can’t believe my #vacation is over 😢 so unfair" 8 "Can’t wait for tennis this #weekend 🎾🍓🥂 😀"
Create an array of tokenized documents from the text data and view the first few documents.
str = tbl.TextData; documents = tokenizedDocument(str); documents(1:5)
ans = 5x1 tokenizedDocument: 11 tokens: Happy anniversary ! ❤ Next stop : Paris ! ✈ #vacation 16 tokens: Haha , BBQ on the beach , engage smug mode ! 😍 😎 ❤ 🎉 #vacation 9 tokens: getting ready for Saturday night 🍕 #yum #weekend 😎 13 tokens: Say it with me - I NEED A #VACATION ! ! ! ☹ 19 tokens: 😎 Chilling 😎 at home for the first time in ages … This is the life ! 👍 #weekend
Evaluate Sentiment
Evaluate the sentiment of the tokenized documents using the vaderSentimentLexicon
function. Scores close to 1 indicate positive sentiment, scores close to -1 indicate negative sentiment, and scores close to 0 indicate neutral sentiment.
compoundScores = vaderSentimentScores(documents);
View the scores of the first few documents.
compoundScores(1:5)
ans = 5×1
0.4738
0.9348
0.6705
-0.5067
0.7345
Visualize the text with positive and negative sentiment in word clouds.
idx = compoundScores > 0; strPositive = str(idx); strNegative = str(~idx); figure subplot(1,2,1) wordcloud(strPositive); title("Positive Sentiment") subplot(1,2,2) wordcloud(strNegative); title("Negative Sentiment")
See Also
vaderSentimentScores
| ratioSentimentScores
| tokenizedDocument