Skribbl.io is a great free game for quarantine / social-distancing hangouts where one person attempts to draw a word while everyone else guesses what they are drawing. When setting up a game, you can supply your own list of comma-separated words to use during the game.
The problem with writing that list by hand is that the person who wrote it will already know all the words.
For an upcoming Zoom party I created a Python command-line application that takes subreddit names and a few other parameters and, using the Praw library, retrieves the most commonly used words from the top comments of posts in each subreddit.
Since subreddits are generally devoted to a specific topic, you can easily create pseudo-themed word banks by picking a category of topics and pulling comments from subreddits under that banner.
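Under the hood the idea is simple: walk a subreddit's top posts, flatten each post's comment tree, and tally word frequencies. The sketch below shows roughly how that can be done with Praw; the function name and parameters are illustrative, not the actual code in collect.py.

import re
from collections import Counter


def top_words(reddit, subreddit_name, num_words=25, post_limit=50, min_word_length=3):
    '''Illustrative sketch: tally the most common words in a subreddit's top comments.

    `reddit` is an already-configured praw.Reddit instance (see the setup section below).
    '''
    counts = Counter()
    for submission in reddit.subreddit(subreddit_name).top(limit=post_limit):
        # Flatten the comment tree and skip "load more comments" placeholders.
        submission.comments.replace_more(limit=0)
        for comment in submission.comments.list():
            words = re.findall(r'[a-z]+', comment.body.lower())
            counts.update(word for word in words if len(word) >= min_word_length)
    return [word for word, _ in counts.most_common(num_words)]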
You can download the program from the GitHub page.
Usage
Install dependencies
The only dependency you need is Praw. Install it with the command below.
pip install praw
Setup API keys
If you want to run the program yourself you will need to get a client_id and client_secret to use the Reddit API through Praw. The tutorial below has all the info you need (you only need to watch the setup portion).
To be able to post this project on GitHub (relatively) safely, I used environment variables to store the values of my Reddit API credentials. You can do the same, or modify the create_reddit_instance function (shown below) in collect.py to use your credentials.
import os

import praw


def create_reddit_instance():
    '''Create a Reddit instance using the Praw library.

    Returns:
        Reddit: Reddit instance via Praw. Credentials are set using environment
            variables.
    '''
    return praw.Reddit(client_id=os.environ['PRAW_CLIENT_ID'],
                       client_secret=os.environ['PRAW_SECRET'],
                       user_agent=os.environ['PRAW_USER_AGENT'])
Set values for PRAW_CLIENT_ID, PRAW_SECRET, and PRAW_USER_AGENT, or modify the code directly with your credentials.
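For example, on Linux or macOS you could export the variables in your shell before running the program; the values below are placeholders for your own credentials.
export PRAW_CLIENT_ID="your_client_id"
export PRAW_SECRET="your_client_secret"
export PRAW_USER_AGENT="your_app_name by u/your_username"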
Run the program
Once you have that set up, you are ready to run the program by executing the run.py file. Running it with --help prints the menu below.
python run.py --help
usage: run.py [-h] [-r SUBREDDITS [SUBREDDITS ...]] [-n NUMBER_WORDS] [-mc COMMENT_LIMIT] [-o OUTPUT_DIR] [-f] [-p POST_LIMIT] [-l MIN_WORD_LENGTH]
Harvest frequently words from subreddit comments using Praw
optional arguments:
-h, --help show this help message and exit
-r SUBREDDITS [SUBREDDITS ...], --subreddits SUBREDDITS [SUBREDDITS ...]
List of subreddits to pull comments from
-n NUMBER_WORDS, --number_words NUMBER_WORDS
Number of words to return per subreddit. Defaults to 25.
-mc COMMENT_LIMIT, --comment_limit COMMENT_LIMIT
Max number of comments to harvest per subreddit. Defaults to 10,000
-o OUTPUT_DIR, --output_dir OUTPUT_DIR
Output directory to write results to.
-f, --include_occurrences
Include the number of times each word appears in the final output
-p POST_LIMIT, --post_limit POST_LIMIT
Max number of posts to harvest from. Defaults to 50.
-l MIN_WORD_LENGTH, --min_word_length MIN_WORD_LENGTH
Min word length. Defaults to 3 characters.
For example, if I wanted to get the 10 most frequently used words from 100 comments on r/DataHoarder, I would use the command
python run.py -r "DataHoarder" -n 10 -mc 100
You can also specify multiple subreddits. The top words for each subreddit will be written to a separate text file.
python run.py -r "DataHoarder" "Python" "arduino" -n 10 -mc 100
Or use pre-harvested words
If you do not want to set up the program on your own computer, I have already created lists of the 25 most used words with 5 or more characters from the top comments of the 100 most popular subreddits.
You can download those files from the GitHub page here. Words are listed on a single line separated by commas for easier input into Skribbl.io.
NOTE
I have not reviewed all the words in these files and do not endorse any of the content that may be found within; this is the internet, after all.