Collocations are phrases or expressions made up of multiple words that are highly likely to co-occur. Examples include ‘social media’, ‘school holiday’, ‘machine learning’, and ‘Universal Studios Singapore’.
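To make "highly likely to co-occur" concrete, here is a minimal sketch that ranks adjacent word pairs by pointwise mutual information (PMI), one common collocation score. It uses plain Python rather than NLTK so that it runs without extra installs. The function name `pmi_bigrams`, the `min_count` threshold, and the toy text are all assumptions made for illustration, not code from the post.

```python
import math
from collections import Counter


def pmi_bigrams(tokens, min_count=2):
    """Rank adjacent word pairs by pointwise mutual information (PMI).

    Hypothetical helper for illustration; not part of NLTK.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n = len(tokens)
    scores = {}
    for (w1, w2), count in bigrams.items():
        # Skip rare pairs: PMI is unreliable for very low counts.
        if count < min_count:
            continue
        # PMI = log2( P(w1, w2) / (P(w1) * P(w2)) ), with probabilities
        # estimated as counts divided by the number of tokens.
        scores[(w1, w2)] = math.log2(count * n / (unigrams[w1] * unigrams[w2]))
    # Highest-scoring pairs first.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy text, invented for this example.
text = (
    "machine learning is fun and social media is loud "
    "social media loves machine learning and "
    "machine learning is everywhere on social media"
)
tokens = text.split()
ranked = pmi_bigrams(tokens)

for pair, score in ranked:
    print(" ".join(pair), round(score, 2))
```

On this toy text, ‘machine learning’ and ‘social media’ tie for the top score, while ‘learning is’ ranks lower because ‘is’ also appears in other contexts. For real corpora, NLTK's `BigramCollocationFinder` offers the same kind of ranking along with frequency filters.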
This post continues my NLP blog series. You might want to check out my previous blog, in which I discussed data pre-processing in R. In this blog, I will determine the emotions in TED Talks. At the end, I will plot a heat map of emotions against talks to help visualize the results.
So, without further ado, let’s dive in!
I have recently gotten my hands dirty with Natural Language Processing (NLP). I know, I am a little late to the party, but at least I am at the party!
To start with a general overview, I implemented quite a few NLP tasks, including Text Classification, Document Similarity, Part-of-Speech (POS) Tagging, and Emotion Recognition. These tasks build on two preparatory steps: text pre-processing (noise removal, stemming) and converting text to features (TF-IDF, N-Grams, Topic Modeling, etc.). I implemented these in both R and Python, and I will try to jot down my experiences in both environments. I will write this as a blog series, with each blog discussing only one particular thing implemented in one particular environment.
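As a rough illustration of what "text to features" means, the sketch below builds N-grams and TF-IDF weights in plain Python. The helper names `ngrams` and `tf_idf`, the whitespace tokenization, and the three toy documents are assumptions made for this example. The unsmoothed formula used here is only one of several TF-IDF variants, and libraries such as scikit-learn default to smoothed versions.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-token tuples (hypothetical helper)."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def tf_idf(docs):
    """Compute unsmoothed TF-IDF weights, one dict per document.

    tf  = term count / document length
    idf = ln(number of documents / number of documents containing the term)
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Toy documents, invented for this example.
docs = [
    "the talk was inspiring",
    "the talk was long",
    "the demo was inspiring",
]
weights = tf_idf(docs)
bigrams = ngrams(docs[0].split(), 2)

print(bigrams)
print(weights[1])
```

Words that appear in every document, such as ‘the’, get a weight of zero, while a word unique to one document, such as ‘long’, gets the highest weight in that document.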