Subreddits & Reservoir Computing

  • Category: Artificial Intelligence
  • Purpose: Semester-Long Club Project
  • Project date: Feb 2021 - May 2021
  • GitHub URL: GitHub

Description

Reservoir Computing is a relatively new area of Machine Learning. It is the mothership of black boxes, yet it has shown promising results for predicting outcomes that depend on temporality, huge amounts of data, and a similarly large degree of chaos, that is, behavior that is extremely hard to predict. In this project, we identified a chaotic medium, Reddit, and set out to predict the popularity of community-related subreddits from the general language sentiment of Reddit as a whole.
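For readers who have not met the idea before, here is a minimal sketch of an echo state network, one common form of reservoir computing. It is not the project's code. The reservoir size, weight ranges, ridge penalty, and the sine-wave input are all assumptions picked for illustration. The key point it shows is that the recurrent "reservoir" weights are random and never trained; only a linear readout on top of the reservoir states is fitted.

```python
import math
import random


def make_reservoir(n_units, seed=0, radius_bound=0.9):
    """Random, fixed weights. Each row of W is scaled so its absolute sum is
    radius_bound, which keeps the spectral radius of W below 1."""
    rng = random.Random(seed)
    w_in = [rng.uniform(-0.5, 0.5) for _ in range(n_units)]
    w = []
    for _ in range(n_units):
        row = [rng.uniform(-1.0, 1.0) for _ in range(n_units)]
        norm = sum(abs(v) for v in row)
        w.append([radius_bound * v / norm for v in row])
    return w_in, w


def run_reservoir(inputs, w_in, w):
    """Drive the reservoir with a scalar series; return one state per step."""
    state = [0.0] * len(w_in)
    states = []
    for u in inputs:
        state = [
            math.tanh(w_in[i] * u + sum(wij * xj for wij, xj in zip(w[i], state)))
            for i in range(len(w_in))
        ]
        states.append(state)
    return states


def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def train_readout(states, targets, ridge=1e-4):
    """Fit the only trained part: ridge regression from [state, 1.0] to target."""
    feats = [s + [1.0] for s in states]
    n = len(feats[0])
    gram = [[sum(f[i] * f[j] for f in feats) for j in range(n)] for i in range(n)]
    for i in range(n):
        gram[i][i] += ridge
    rhs = [sum(f[i] * t for f, t in zip(feats, targets)) for i in range(n)]
    return solve(gram, rhs)


def predict(state, weights):
    """Linear readout of a single reservoir state (with a bias term)."""
    return sum(x * w for x, w in zip(state + [1.0], weights))


# Toy task (an assumption, not Reddit data): predict the next value of a sine wave.
series = [math.sin(0.2 * t) for t in range(400)]
w_in, w = make_reservoir(n_units=20)
states = run_reservoir(series[:-1], w_in, w)  # states[t] has seen series[t]
washout = 50  # drop early states that still depend on the zero initial state
train_states = states[washout:]
train_targets = series[washout + 1:]          # target for states[t] is series[t + 1]
weights = train_readout(train_states, train_targets)
mse = sum(
    (predict(s, weights) - t) ** 2 for s, t in zip(train_states, train_targets)
) / len(train_targets)
```

Here `mse` is the training error of the readout on the toy series. A real experiment would hold out later time steps for evaluation and would feed in a sentiment time series instead of a sine wave.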

We used an API to scrape ALL of Reddit's comments & posts over the past few years (yes, it took forever). We analyzed the sentiment of that data with a lexicon we created, then compared it to subscriber counts in subreddits we personally identified as community-related. We did end up finding a small signal that community-related subreddit popularity was related to Reddit sentiment overall!
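To make the sentiment step concrete, here is a rough sketch of lexicon-based scoring followed by a simple correlation check. It is only an illustration: the five-word `LEXICON` is a hypothetical stand-in for the lexicon we built, the function names are made up for this example, and Pearson correlation is just one assumed way to compare a sentiment series with a subscriber-count series.

```python
import math
import re

# Hypothetical toy lexicon (word -> sentiment weight); NOT the project's lexicon.
LEXICON = {"love": 1.0, "great": 0.8, "helpful": 0.6, "bad": -0.6, "hate": -1.0}


def sentiment_score(text, lexicon=LEXICON):
    """Mean lexicon weight over the words that appear in the lexicon.

    Returns 0.0 when no word in the text is in the lexicon.
    """
    words = re.findall(r"[a-z']+", text.lower())
    scores = [lexicon[w] for w in words if w in lexicon]
    return sum(scores) / len(scores) if scores else 0.0


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)
```

In use, each comment or post would be scored with `sentiment_score`, the scores averaged per time window (say, per week), and the resulting series passed to `pearson` together with the subscriber counts for the same windows.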

(I own no rights to the Reddit logo)