Guide

Methodology

Notes on the concept, how the indicators are calculated, and some general advice on how to interpret them.

Concept

The purpose of the indices Sentiment A and Sentiment B is to objectively measure the emotional sentiment of Reddit posts that are filtered by keywords and sorted according to Reddit's user-defined criteria.

In broader terms, these indicators act as proxies for the sentiment of online publications and discussions towards particular topics within a set scope.


  • Reddit serves as a source for conducting sentiment analysis on text.
  • Text fragments for analysis undergo a two-step filtering and sorting process. First, a set of subreddits is selected based on a keyword and sorted using Reddit's criteria. Second, the text fragments from the selected subreddits are filtered by the presence of a specific keyword. The remaining fragments are then sorted again, and the top items are used for analysis. Depending on the indicator's configuration, comments by Reddit users are included or excluded.
  • The approach is user-driven in that each query for a particular sentiment analysis creates an entry in the database, which is shown in later queries for comparison over time.
  • The fact that an individual requests a particular sentiment value serves as information, as it signifies their interest in acquiring the value. In order to capture this, requests are also distinguished based on their source, whether they are likely from a human or a bot such as a search engine spider or LLM.
  • Based on the output from node-nlp, two indicators are calculated:
    • Sentiment A, which is the mean sentiment of the analyzed text fragments (Reddit posts).
    • Sentiment B, which additionally takes into account the ranking produced by the sorting criteria. The metric weights values by ranking order, giving higher-ranking posts in higher-ranking subreddits more influence than lower-ranked ones.
  • In addition to the two indicators, the mean, median and standard deviation are calculated. These statistical values are calculated from the sentiments of all analyzed subreddits, not from the deviation within a single subreddit.
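
The calculations described above can be illustrated with a minimal sketch. It is not the actual implementation. The function names, the sample scores and, in particular, the weight 1 / (subreddit rank × post rank) used for Sentiment B are assumptions made for illustration only, since the exact weighting is not specified here. The use of the population standard deviation, and the use of per-subreddit means as the base for the statistics, are likewise assumptions.

```python
from statistics import mean, median, pstdev

# Each tuple: (subreddit_rank, post_rank, sentiment score); ranks start at 1.
# The scores are made-up illustration values, not real node-nlp output.
fragments = [
    (1, 1, 0.5),
    (1, 2, -0.25),
    (2, 1, 0.25),
    (2, 2, 0.25),
]


def sentiment_a(items):
    """Sentiment A: plain mean of all fragment scores."""
    return mean(score for _, _, score in items)


def sentiment_b(items):
    """Sentiment B: rank-weighted mean of the fragment scores.

    The weight 1 / (subreddit_rank * post_rank) is an assumed scheme;
    it only shows that higher-ranked items get more influence.
    """
    weights = [1.0 / (s_rank * p_rank) for s_rank, p_rank, _ in items]
    weighted_sum = sum(w * score for w, (_, _, score) in zip(weights, items))
    return weighted_sum / sum(weights)


def subreddit_stats(items):
    """Mean, median and standard deviation across per-subreddit sentiments.

    Each subreddit is first reduced to the mean of its own fragments, so the
    spread is measured between subreddits and not within one subreddit.
    """
    by_subreddit = {}
    for s_rank, _, score in items:
        by_subreddit.setdefault(s_rank, []).append(score)
    per_subreddit = [mean(scores) for scores in by_subreddit.values()]
    return mean(per_subreddit), median(per_subreddit), pstdev(per_subreddit)


print(sentiment_a(fragments))      # 0.1875
print(sentiment_b(fragments))      # 0.25
print(subreddit_stats(fragments))  # (0.1875, 0.1875, 0.0625)
```

In this example Sentiment B is higher than Sentiment A because the most positive fragment also holds the top rank and therefore receives the largest weight.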