Bypassing a Paywall

I was using a website (which I'll call Voldemort, to maintain its privacy) that had a paywall, and I didn't want to pay for the content. So I tried to see if I could get around it. Here's what I found.

Part 1: Figuring out the real call chain

I did some digging with Chrome DevTools and found that Voldemort was calling both its own API and a second API, which I will call death-eater. A quick Google search for the death-eater API showed that it essentially acts as an ingress controller for a CDN. I was able to reconstruct the call chain from the behavior I observed in the Voldemort app and the publicly available documentation for the death-eater API.

    sequenceDiagram

    participant frontend as voldemort frontend
    participant api as voldemort API
    participant death-eater as death-eater API
    participant cdn as voldemort CDN
    
    frontend ->> api: GET <user>/gallery/<number>
    api ->> frontend: <user>/gallery/<number> content
    frontend ->> death-eater: GET media with signed url
    death-eater ->> death-eater: validate url
    alt is valid
        death-eater ->> cdn: GET media
        cdn ->> death-eater: return media
        death-eater ->> death-eater: add watermark
        death-eater ->> frontend: watermarked media
    else is invalid
        death-eater ->> frontend: 403 error
    end
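The "validate url" step is the only access check in this chain. As a rough sketch of what such a check often looks like, here is an HMAC-signed URL with an expiry. This is an assumption for illustration, not death-eater's documented scheme: the secret, the query parameter names (`media`, `expires`, `sig`), and the function names are all hypothetical.

```python
import hashlib
import hmac
import time
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlsplit

# Assumption: a key shared by the site's API (which signs) and the proxy (which validates).
SECRET = b"hypothetical-shared-secret"


def sign_url(base_url: str, media_path: str, expires_at: int) -> str:
    """Build a signed URL whose query carries the media path, an expiry, and an HMAC."""
    payload = f"{media_path}|{expires_at}".encode()
    sig = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    query = urlencode({"media": media_path, "expires": expires_at, "sig": sig})
    return f"{base_url}?{query}"


def validate_url(signed_url: str, now: Optional[int] = None) -> bool:
    """Return True only if all fields are present, unexpired, and the HMAC matches."""
    params = parse_qs(urlsplit(signed_url).query)
    try:
        media_path = params["media"][0]
        expires_at = int(params["expires"][0])
        sig = params["sig"][0]
    except (KeyError, ValueError):
        return False  # missing or malformed field
    now = int(time.time()) if now is None else now
    if now > expires_at:
        return False  # link has expired
    payload = f"{media_path}|{expires_at}".encode()
    expected = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(sig.encode(), expected.encode())
```

Note that a check like this only protects requests that actually pass through the proxy. It says nothing about whether the origin behind it can be reached some other way.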

Part 2: Bypassing the Paywall Entirely

So, what if I just tried hitting the CDN URL (which can be extracted from the signed URL) directly? I sent a simple curl request straight to the CDN URL and... it worked. With this knowledge, I created a bot that utilised the following call chain:

    sequenceDiagram

    participant scraper as scraper
    participant api as voldemort API
    participant death-eater as death-eater API
    participant cdn as voldemort CDN
    
    scraper ->> api: GET <user>/gallery/<number>
    api ->> scraper: <user>/gallery/<number> content
    scraper ->> scraper: extract CDN media url from signed url
    scraper ->> cdn: GET media
    cdn ->> scraper: media

Summary

I was quite surprised that such a naive scrape could bypass a paywall. My guess is that the CDN's ingress rules were misconfigured to allow traffic from the public internet, rather than only from the death-eater API.
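If that guess is right, the missing piece is a source allowlist on the CDN origin. The sketch below shows the idea in Python, though in practice this would more likely live in firewall or CDN configuration than in application code. The address range and function names are hypothetical; `203.0.113.0/24` is a documentation-only block standing in for the proxy's egress range.

```python
import ipaddress

# Assumption: the only network allowed to reach the origin is the proxy's egress range.
ALLOWED_SOURCES = [ipaddress.ip_network("203.0.113.0/24")]


def is_allowed_source(client_ip: str) -> bool:
    """Return True if the caller's address falls inside an allowed network."""
    try:
        addr = ipaddress.ip_address(client_ip)
    except ValueError:
        return False  # unparseable addresses are rejected
    return any(addr in net for net in ALLOWED_SOURCES)


def origin_status(client_ip: str) -> int:
    """Status the origin would return: 200 for the proxy, 403 for everyone else."""
    return 200 if is_allowed_source(client_ip) else 403
```

With a rule like this in place, a direct request from the public internet would get the same 403 that an invalid signed URL gets from the proxy.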

I totally, definitely responsibly disclosed this vulnerability, and in no way did I slurp down ~3 TB of data to use for future AI training.