Notes on sentiment analysis libraries for Python
---------------------
Robbie Fordyce
14 October 2025
---------------------
On Python sentiment analysis
---------------------
- Libraries
I’ve been exploring a few sentiment analysis libraries for Python, particularly for sentiment analysis of Simplified Chinese, and have come across the following systems. Piping text to LLMs will probably produce better accuracy, but at a cost, whereas the tools below are free to run locally on a machine. LLMs also carry a low but non-zero risk of low-precision responses, whereas these tools should still give reasonable, consistent outputs.
For the purposes of comparison, I have two strings under analysis, and will reuse these in the sample code below.
text1 = "Social context remains important for understanding textual data."
text2 = """Across 2012 and 2013, three separate men from three separate states in Australia had three different problems with games that they had bought using Steam, a digital storefront for buying games and managing personal game libraries. For the most part, these games are games that held little sway in the culture or industry of video games and their communities: X-Rebirth, Legends of the Dawn, Thirty Flights of Loving, Nyx Quest: Kindred Spirits, and others. These titles likely provoke half-memories of games, if not genres, in some readers but remain at least twelve years old at a minimum, some older. Dear Esther and Plants vs Zombies have held more sway. Each of these games failed in some substantive way for each of our Australian men, and each of them attempted through terse support tickets to have these failures rectified through refunds with Steam."""
VADER
[ pypi | homepage ]
Good for quick analyses of social media text, although it hasn’t been updated since 2020. Reasonably simple implementation.
>> from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer
>> v_analyser = SentimentIntensityAnalyzer()
>> v_analyser.polarity_scores(text1)
# {'neg': 0.0, 'neu': 0.795, 'pos': 0.205, 'compound': 0.2023}
>> v_analyser.polarity_scores(text2)
# {'neg': 0.078, 'neu': 0.862, 'pos': 0.06, 'compound': -0.4118}
Produces a dictionary of scores, including a compound score. Probably the easiest to implement.
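If you want a single label rather than the raw dictionary, the VADER README suggests thresholding the compound score at ±0.05; the threshold comes from VADER’s documentation, not from these notes:
>> score = v_analyser.polarity_scores(text2)["compound"]
>> "positive" if score >= 0.05 else "negative" if score <= -0.05 else "neutral"
# 'negative'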
TextBlob
[ pypi | homepage ]
TextBlob does a bit more than VADER, offering sentiment analysis alongside other NLP tasks such as tokenisation and noun phrase extraction. It has one additional step beyond pip install, as you need to download its corpora after installing the library itself.
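For reference, the two steps are (standard TextBlob setup; worth checking against the current docs):
$ pip install textblob
$ python -m textblob.download_corpora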
>> from textblob import TextBlob
>> b_analyser = TextBlob(text1)
>> b_analyser.sentiment
# Sentiment(polarity=0.21666666666666667, subjectivity=0.5333333333333333)
For text1, TextBlob returns a Sentiment object that has a polarity and a subjectivity score. Note that the polarity is similar to VADER’s compound score, but slightly higher. Not surprising, but indicative of the variation between tools.
>> b_analyser = TextBlob(text2)
>> b_analyser.sentiment
# Sentiment(polarity=0.03422619047619047, subjectivity=0.42738095238095236)
For text2, a longer stretch of text, TextBlob scores the text somewhat higher than VADER does. text2 discusses some reasonably negative content at length, about refund processes and failures on this front. I would suggest that VADER is actually more accurate here.
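One way to probe the discrepancy, sketched here with TextBlob’s built-in sentence segmentation rather than anything from the analysis above: text2’s polarity is an aggregate over many sentences, so printing per-sentence polarity shows where positive and negative passages cancel out.
>> for sentence in TextBlob(text2).sentences:
>>     print(round(sentence.sentiment.polarity, 3), sentence.raw[:60])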
Asent
[ pypi | homepage ]
Generates sentiment analysis scores and can work in a variety of languages.
Note that the current distribution hasn’t been updated since December 2023, and as such the tool calls distutils, which was removed in Python 3.12. I’ve made a fork here that you can download and install, or you can install the official version via pip and then replace the first line of asent/visualize.py that reads <from distutils.log import warn> with <from logging import warn>, and the tool should work fine.
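Spelled out, the one-line patch to asent/visualize.py is just:
# before (imports distutils, removed in Python 3.12):
from distutils.log import warn
# after:
from logging import warn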
Asent is probably the most useful for the current project, as it allows for analysis of a wide range of languages, including Simplified Chinese. Obviously this requires downloading a language model, and there are a variety to choose from here. Chinese requires an ‘experimental’ branch of the tool, which, on testing, is not super reliable.
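Assuming the standard spaCy tooling, the Chinese model used below installs like any other spaCy pipeline (the experimental part is asent’s Chinese component, not the model itself):
$ pip install asent
$ python -m spacy download zh_core_web_sm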
For testing the sentiment analysis in Chinese I used the article body from this WeChat article. There’s a bit of whitespace in the work, so bear with me doing some pretty crude transformations here.
>> import spacy
>> import asent
>> from asent import lexicons
>> rated_words = lexicons.get("lexicon_zh_chen_skiena_2014_v1.txt")
>> nlp = spacy.load("zh_core_web_sm")
>> nlp.add_pipe("asent_zh_v1")
>> text_zh = """ # copy/pasted from article # """
>> text_zh = "".join(text_zh.split())
>> doc = nlp(text_zh)
>> doc._.polarity
# DocPolarityOutput(neg=0.015, neu=0.294, pos=0.014, compound=0.018, n_sentences=34)
While the article itself is broadly neutral, inspecting the actual text shows that many sentences that do carry some contextual sentiment are ultimately scored as 0.0 across the board. So this lexicon is only partly scored, which is honestly pretty good for a free tool, and still useful in some contexts, but not enough for the project I’m currently working on.
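That sentence-level inspection uses asent’s span extension: per the asent docs, each sentence span gets its own polarity, so a loop like this prints the per-sentence scores.
>> for sent in doc.sents:
>>     print(sent, sent._.polarity)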