How To Monitor Keywords on Reddit with Golang?

Monitoring keywords mentioned in Reddit posts and comments is a useful way to track your brand, your products, and your competitors, or to stay informed about specific topics. Go is a great language for such social listening applications. In this article, we will see how to write a simple Go program that monitors keywords on Reddit.


Social Media Listening on Reddit

Social listening on Reddit is an incredibly powerful tool for both individuals and organizations across a range of objectives. Here are several reasons why it's particularly important:

How to Monitor Specific Keywords on Reddit?

Reddit exposes a couple of free API endpoints that let you retrieve the newest posts and comments made anywhere on the platform. These endpoints are not very well documented.

To get the 100 most recent Reddit posts, make an HTTP GET request to the following API endpoint: https://www.reddit.com/r/all/new/.json?limit=100

To get the 100 most recent Reddit comments, make an HTTP GET request to the following API endpoint: https://www.reddit.com/r/all/comments/.json?limit=100

The response of these API endpoints is a JSON object that contains a list of posts or comments.

Here is a (truncated) example of a response I am getting when requesting the posts endpoint:

curl "https://www.reddit.com/r/all/new/.json?limit=100"

{
"kind": "Listing",
"data": {
    "after": "t3_1asad4n",
    "dist": 100,
    "modhash": "ne8fi0fr55b56b8a75f8075df95fa2f03951cb5812b0f9660d",
    "geo_filter": "",
    "children": [
        {
            "kind": "t3",
            "data": {
                "approved_at_utc": null,
                "subreddit": "GunAccessoriesForSale",
                "selftext": "Morning gents. I\u2019m looking to snag up your forgotten factory yellow spring for the 509T. I need to source one for a buddy who lost his and I cannot find any available anywhere! \n\nIf one of you have the yellow spring laying around, looking to pay $50 shipped\u2026 \n\nTo my 509t owners, it\u2019s the \u201clight\u201d spring that comes in a plastic bag in the carrying case. \n\nThanks in advance  ",
                "author_fullname": "t2_2ezh71n6",
                "saved": false,
                "mod_reason_title": null,
                "gilded": 0,
                "clicked": false,
                "title": "[WTB] 509T yellow spring",
                "link_flair_richtext": [],
                "subreddit_name_prefixed": "r/GunAccessoriesForSale",
                [...]
                "contest_mode": false,
                "mod_reports": [],
                "author_patreon_flair": false,
                "author_flair_text_color": "dark",
                "permalink": "/r/GunAccessoriesForSale/comments/1asadbj/wtb_509t_yellow_spring/",
                "parent_whitelist_status": null,
                "stickied": false,
                "url": "https://www.reddit.com/r/GunAccessoriesForSale/comments/1asadbj/wtb_509t_yellow_spring/",
                "subreddit_subscribers": 182613,
                "created_utc": 1708094934.0,
                "num_crossposts": 0,
                "media": null,
                "is_video": false
                }
            },
        [...]
        ]
    }
}

As you can see, Reddit returns an array of objects. Each object is a post or a comment (depending on the endpoint you requested) with comprehensive details such as the content of the submission, its URL, and its creation date and time. Most of these fields won't be useful. For posts, "selftext" is undoubtedly the most important one, as it contains the text of the post; comments carry their text in a "body" field instead.

Results are sorted from newest to oldest. Please note that you cannot request more than 100 posts or comments at a time.
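To make the structure of the response concrete, here is a minimal sketch that decodes a hand-trimmed copy of the sample above into Go structs. The type names (item, listing) and the parseListing helper are hypothetical names chosen for this illustration, the "selftext" value is shortened, and the "body" field is included only because comments carry their text there; it does not appear in the posts sample.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// item keeps only the fields this sketch cares about; every other key in the
// response is ignored by encoding/json.
type item struct {
	Title      string  `json:"title"`
	Selftext   string  `json:"selftext"`    // text of a post
	Body       string  `json:"body"`        // text of a comment (absent from the posts sample)
	Permalink  string  `json:"permalink"`   // path relative to https://www.reddit.com
	CreatedUTC float64 `json:"created_utc"` // Unix seconds, sent as a float
}

// listing mirrors the "Listing" envelope: data.children[].data.
type listing struct {
	Data struct {
		Children []struct {
			Kind string `json:"kind"`
			Data item   `json:"data"`
		} `json:"children"`
	} `json:"data"`
}

// parseListing decodes a raw response body and returns its items in the
// order Reddit sent them.
func parseListing(raw []byte) ([]item, error) {
	var l listing
	if err := json.Unmarshal(raw, &l); err != nil {
		return nil, err
	}
	items := make([]item, 0, len(l.Data.Children))
	for _, c := range l.Data.Children {
		items = append(items, c.Data)
	}
	return items, nil
}

// sample is a hand-trimmed copy of the response shown above
// (the selftext is shortened for readability).
const sample = `{"kind":"Listing","data":{"children":[{"kind":"t3","data":{
"title":"[WTB] 509T yellow spring",
"selftext":"Morning gents.",
"permalink":"/r/GunAccessoriesForSale/comments/1asadbj/wtb_509t_yellow_spring/",
"created_utc":1708094934.0}}]}}`

func main() {
	items, err := parseListing([]byte(sample))
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	for _, it := range items {
		// created_utc is a Unix timestamp in seconds.
		created := time.Unix(int64(it.CreatedUTC), 0).UTC()
		fmt.Println(created.Format(time.RFC3339), it.Title, it.Permalink)
	}
}
```

Declaring only the fields you need keeps the structs small: the decoder silently skips the dozens of other keys Reddit returns.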

Why Use Go to Monitor Reddit in Real Time?

The Go programming language offers several powerful features that make it an excellent choice for retrieving Reddit posts and comments in real time. While other programming languages can also be used for this purpose, there are specific advantages to using Go, especially when dealing with real-time data processing. Here's why Go stands out:

First of all, Go's lightweight threads, known as goroutines, allow for efficient multitasking and concurrency. This is particularly useful when you're fetching multiple Reddit posts or comments simultaneously, as each fetch operation can be run in its own goroutine.
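As an illustration of that point, the sketch below runs one fetch per URL in its own goroutine and waits for all of them with a sync.WaitGroup. The fetchAll and httpGet helpers and the result type are hypothetical names, and the example talks to a local stand-in server (net/http/httptest) instead of Reddit so that it runs without network access.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"sync"
)

// result holds the outcome of one fetch.
type result struct {
	URL  string
	Body string
	Err  error
}

// fetchAll calls fetch once per URL, each call in its own goroutine, and
// returns when every call has finished. Results keep the order of urls.
func fetchAll(urls []string, fetch func(url string) (string, error)) []result {
	results := make([]result, len(urls))
	var wg sync.WaitGroup
	for i, u := range urls {
		wg.Add(1)
		go func(i int, u string) {
			defer wg.Done()
			body, err := fetch(u)
			// Each goroutine writes to its own slice index, so no mutex is needed.
			results[i] = result{URL: u, Body: body, Err: err}
		}(i, u)
	}
	wg.Wait()
	return results
}

// httpGet downloads a URL and returns the response body as a string.
func httpGet(url string) (string, error) {
	resp, err := http.Get(url)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	b, err := io.ReadAll(resp.Body)
	return string(b), err
}

func main() {
	// Local stand-in for Reddit (an assumption of this sketch).
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintf(w, "stub listing for %s", r.URL.Path)
	}))
	defer srv.Close()

	urls := []string{
		srv.URL + "/r/all/new/.json",
		srv.URL + "/r/all/comments/.json",
	}
	for _, r := range fetchAll(urls, httpGet) {
		fmt.Println(r.URL, r.Err, r.Body)
	}
}
```

Passing the fetch function as a parameter is a design choice of this sketch: it lets you swap in a real Reddit client later, or a stub while testing.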

Secondly, Go's standard library includes a comprehensive and efficient HTTP client that simplifies the process of making web requests. This is essential for interacting with Reddit's API to fetch posts and comments. Go's HTTP client supports contexts, allowing for request timeouts and cancellations. This is critical for real-time applications where you want to ensure your application remains responsive and is not hung up on delayed responses.
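Here is a minimal sketch of a request bounded by a context deadline, using http.NewRequestWithContext from the standard library. The getWithTimeout helper is a hypothetical name, the durations are arbitrary, and a deliberately slow local test server stands in for a delayed Reddit response.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// getWithTimeout issues a GET request that is abandoned once timeout elapses.
// It returns the HTTP status code of the response.
func getWithTimeout(url string, timeout time.Duration) (int, error) {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel() // release the timer as soon as we are done

	req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
	if err != nil {
		return 0, err
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	return resp.StatusCode, nil
}

func main() {
	// Local server that answers slowly (an assumption of this sketch).
	slow := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		time.Sleep(200 * time.Millisecond)
		w.WriteHeader(http.StatusOK)
	}))
	defer slow.Close()

	// The deadline is shorter than the server delay, so the call gives up early.
	_, err := getWithTimeout(slow.URL, 50*time.Millisecond)
	fmt.Println("deadline exceeded:", errors.Is(err, context.DeadlineExceeded))

	// A generous deadline lets the same request complete.
	status, err := getWithTimeout(slow.URL, time.Second)
	fmt.Println(status, err)
}
```

In a polling loop, a deadline like this keeps one stalled request from delaying every fetch that follows it.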

Finally, Go is a compiled language, which typically means that applications written in Go are fast and have a small footprint. This is beneficial for real-time applications that need to process large volumes of data quickly. Go's garbage collector is designed to be efficient and to keep latency low, which is crucial for maintaining performance in real-time data fetching scenarios.

A Simple Go Program That Watches a Keyword in Reddit Posts

Here is a step-by-step plan for writing a Go program that monitors the keyword "kwatch.io" in Reddit posts:

Here is the Go code:

package main

import (
    "encoding/json"
    "fmt"
    "net/http"
    "strings"
    "time"
)

type Post struct {
    Selftext  string `json:"selftext"`
    Title     string `json:"title"`
    Permalink string `json:"permalink"`
}

type Data struct {
    Children []struct {
        Data Post `json:"data"`
    } `json:"children"`
}

type Response struct {
    Data Data `json:"data"`
}

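func fetchPosts() {
    // Request the 100 most recent posts across all of Reddit.
    resp, err := http.Get("https://www.reddit.com/r/all/new/.json?limit=100")
    if err != nil {
        fmt.Println(err)
        return
    }
    defer resp.Body.Close()

    // Anything other than 200 OK is not a listing we can decode.
    if resp.StatusCode != http.StatusOK {
        fmt.Println("unexpected status:", resp.Status)
        return
    }

    // Decode only the fields declared in Response; all other keys are ignored.
    var r Response
    err = json.NewDecoder(resp.Body).Decode(&r)
    if err != nil {
        fmt.Println(err)
        return
    }

    // Print every post whose title or selftext contains the keyword (case-sensitive).
    for _, child := range r.Data.Children {
        if strings.Contains(child.Data.Title, "kwatch.io") || strings.Contains(child.Data.Selftext, "kwatch.io") {
            fmt.Println("Title:", child.Data.Title)
            fmt.Println("Selftext:", child.Data.Selftext)
            fmt.Println("Permalink:", child.Data.Permalink)
            fmt.Println()
        }
    }
}

func main() {
    // Poll once per second; the loop runs until the process is stopped.
    ticker := time.NewTicker(1 * time.Second)
    for range ticker.C {
        fetchPosts()
    }
}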

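This program fetches the 100 most recent posts from Reddit every second and prints the title, selftext, and permalink of every post whose title or selftext contains "kwatch.io". The fetches run one after another in the main goroutine, and the match is case-sensitive. You can do the same thing with Reddit comments by changing the API endpoint URL and reading the "body" field instead of "selftext".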
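Because consecutive polls can return overlapping results, the same match may be printed more than once. The sketch below shows one possible way to handle that, together with case-insensitive matching. The mentions and newMatches helpers are hypothetical names, the permalinks in main are made up, and deduplicating on the permalink is an assumption of this sketch rather than something the Reddit API prescribes.

```go
package main

import (
	"fmt"
	"strings"
)

// Post mirrors the three fields used by the program above.
type Post struct {
	Title     string
	Selftext  string
	Permalink string
}

// mentions reports whether the title or selftext contains keyword, ignoring case.
func mentions(p Post, keyword string) bool {
	kw := strings.ToLower(keyword)
	return strings.Contains(strings.ToLower(p.Title), kw) ||
		strings.Contains(strings.ToLower(p.Selftext), kw)
}

// newMatches returns the posts that mention keyword and whose permalink has
// not been returned before. It records every returned permalink in seen.
func newMatches(posts []Post, keyword string, seen map[string]bool) []Post {
	var out []Post
	for _, p := range posts {
		if !mentions(p, keyword) || seen[p.Permalink] {
			continue
		}
		seen[p.Permalink] = true
		out = append(out, p)
	}
	return out
}

func main() {
	seen := make(map[string]bool)
	// Hypothetical posts, for illustration only.
	batch := []Post{
		{Title: "Anyone tried KWatch.io?", Permalink: "/r/example/comments/1/"},
		{Title: "Unrelated post", Permalink: "/r/example/comments/2/"},
	}
	// Passing the same batch twice simulates two polls with overlapping results.
	fmt.Println("first poll:", len(newMatches(batch, "kwatch.io", seen)))
	fmt.Println("second poll:", len(newMatches(batch, "kwatch.io", seen)))
}
```

Note that the seen map grows without bound in this form; a long-running monitor would need to evict old entries.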

Here are a couple of ideas about how you could improve this program:

Conclusion

Monitoring specific keywords on Reddit is possible thanks to a fairly simple Go program.

Productionizing such a program can be challenging, though. First, Reddit is quick to block clients that make too many requests. Second, the API endpoints return a lot of posts and comments at once, so your Go program has to handle this volume smartly.
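One common way to reduce the risk of being blocked is to slow down when the server signals trouble. The sketch below doubles the polling delay after an HTTP 429 or 5xx response and resets it after a success. The nextDelay helper and the baseDelay and maxDelay values are assumptions chosen for illustration, not limits published by Reddit.

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

const (
	baseDelay = 2 * time.Second // assumed normal polling interval
	maxDelay  = 2 * time.Minute // assumed upper bound for the backoff
)

// nextDelay returns how long to wait before the next request, given the
// current delay and the status code of the last response.
func nextDelay(current time.Duration, status int) time.Duration {
	if status == http.StatusTooManyRequests || status >= 500 {
		// Back off exponentially, but never beyond maxDelay.
		next := current * 2
		if next > maxDelay {
			return maxDelay
		}
		return next
	}
	// Any other status resets the delay to the normal interval.
	return baseDelay
}

func main() {
	delay := baseDelay
	// Simulated sequence of response codes.
	for _, status := range []int{200, 429, 429, 503, 200} {
		delay = nextDelay(delay, status)
		fmt.Printf("status %d -> wait %s\n", status, delay)
	}
}
```

In a real poller, the returned duration would replace the fixed one-second ticker, for example with time.Sleep between fetches.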

If you do not want to build and maintain such a system by yourself, we recommend that you use our platform instead: register on KWatch.io here.

Arthur
Go Developer KWatch.io