My Setup for the -arr Software Family
Disclaimer: I, in no way, advocate for piracy. It is only legal to obtain copies of media that you already own.
I wanted to set up the *-arr software stack (Sonarr, Radarr, etc.), but all of the resources I found online only covered running everything on a single computer.
Part 1: Gathering Requirements
Firstly, I had to consider storage cost. Cloud storage isn't cheap, and any files I downloaded would rarely be accessed. It made the most sense to rent a minimally powerful storage node, since its only job would be to read from and write to disk.
Secondly, I had to consider what effect transcoding would have on server performance. The server streaming the media needs to be powerful enough for real-time transcoding, or else buffering will make the watching experience unenjoyable.
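Jumping ahead a little to the eventual containerised deployment: one way to express this requirement is to pin the streaming service to a sufficiently powerful node and reserve CPU for it. The sketch below assumes a Jellyfin container, a hypothetical "transcode" node label, and placeholder numbers; it is not my actual stack file.

```yaml
# Rough sketch only: keep the streaming service on a node capable of
# real-time transcoding and reserve CPU for it. The "transcode" label and
# the numbers are placeholders.
version: "3.8"
services:
  jellyfin:
    image: jellyfin/jellyfin:latest
    deploy:
      placement:
        constraints:
          # label added beforehand with: docker node update --label-add transcode=true <node>
          - node.labels.transcode == true
      resources:
        reservations:
          cpus: "4.0"   # leave headroom for transcodes
          memory: 2G
```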
Finally, I had to consider overall resiliency. I would be using a lot of data if I was downloading and uploading a bunch of stuff. While not explicitly illegal, I am aware that peer-to-peer file sharing (torrenting, specifically) might violate a cloud provider's "fair use" bandwidth policy and result in the termination or non-renewal of the associated account. Rather than opt for (expensive) data redundancy, I decided that I would simply use Samba to mount a remote drive on the node responsible for torrenting.
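To give a feel for what that looks like, here is a minimal sketch of consuming the storage node's Samba share on the torrent node as a CIFS-backed Docker volume (a plain /etc/fstab mount on the host would work just as well). The hostname, share name, credentials, and the qBittorrent image are placeholders, not my real values.

```yaml
# Minimal sketch: the torrent node mounts the storage node's Samba share
# as a Docker volume via the local driver's CIFS support. All values below
# are placeholders.
version: "3.8"
volumes:
  media:
    driver: local
    driver_opts:
      type: cifs
      device: "//storage-node.internal/media"
      o: "addr=storage-node.internal,username=media,password=changeme,vers=3.0,uid=1000,gid=1000"
services:
  qbittorrent:
    image: lscr.io/linuxserver/qbittorrent:latest
    volumes:
      - media:/downloads   # downloads land directly on the remote share
```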
After sleeping on it for a few days, I drafted the final list of requirements. It ended up looking like this:
- everything needs to run in a container
- services should be addressable by path; subdomain-only addressing is not acceptable (see the sketch after this list)
- media files need to be stored on their own node
- torrent clients should be deployable without any manual intervention
- there should be zero buffering time when streaming
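The second requirement is worth a concrete illustration. Below is a sketch of path-based addressing using Traefik labels in swarm mode; Traefik is just an example reverse proxy here, and the domain, ports, and labels are placeholders (Sonarr itself would also need its URL Base set to match).

```yaml
# Sketch only: route https://<your-domain>/sonarr to the Sonarr container via
# Traefik labels, instead of giving the service its own subdomain. The Traefik
# service itself (and TLS) is omitted.
version: "3.8"
services:
  sonarr:
    image: lscr.io/linuxserver/sonarr:latest
    deploy:
      labels:
        - "traefik.enable=true"
        - "traefik.http.routers.sonarr.rule=PathPrefix(`/sonarr`)"
        - "traefik.http.services.sonarr.loadbalancer.server.port=8989"
    # Sonarr: Settings -> General -> URL Base must also be set to /sonarr
```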
Part 2: Setup
After some more deliberation, I determined it was best to separate the services by their concerns. I came up with the following groups:
- The frontend, where users and devices will make requests or view retrieved media.
- The media retrieval service, which receives requests and queries indexers to find matching media.
- The download service, which handles actually downloading the media.
- The database, which persists configuration information.
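For concreteness, here is a rough sketch of how those groups might map onto services in a single stack file. The component choices (Jellyfin, Prowlarr, qBittorrent, Postgres) are illustrative stand-ins; the authoritative picture is the diagram below.

```yaml
# Rough sketch of the grouping; image choices are examples, not a definitive
# bill of materials.
version: "3.8"
services:
  jellyfin:       # frontend: where users and devices view media
    image: jellyfin/jellyfin:latest
  sonarr:         # media retrieval: TV requests, queries indexers
    image: lscr.io/linuxserver/sonarr:latest
  radarr:         # media retrieval: movie requests, queries indexers
    image: lscr.io/linuxserver/radarr:latest
  prowlarr:       # media retrieval: indexer management
    image: lscr.io/linuxserver/prowlarr:latest
  qbittorrent:    # download service
    image: lscr.io/linuxserver/qbittorrent:latest
  postgres:       # database: persists configuration
    image: postgres:16
```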
Below is an ✨ interactive ✨ C4 diagram detailing the services and their components. Click around to see all the detail!
Part 3: Deployment
Satisfied with the setup I outlined, the only task remaining was to actually deploy the services. I searched for low-cost VPSes, including one with "unlimited" monthly throughput to serve as the node that would download torrents. To keep track of the nodes more easily, I gave them all names corresponding to members of the girl group Brown Eyed Girls.
Below are the associated deployment diagrams:
I already had an Ansible playbook for setting up nodes from my previous work with the Mamamoo cluster, so I deployed everything using Docker Swarm. Redundancy was only necessary for the drive being utilised by Samba; all other data was backed by configuration files. Loss of any node except miryo would not be catastrophic, as I could simply procure a new node and run the associated docker stack deploy commands. Any data loss would be data that is readily available on the internet, and I considered the time-cost of re-gathering it negligible, given my infrequent use of the service anyway.
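As an illustration of how small that recovery path is, here is a hypothetical Ansible sketch of replacing a lost worker node; the host names, token variables, and stack file paths are placeholders, not taken from my actual playbook.

```yaml
# Hypothetical recovery sketch (illustration only, not idempotent): join a
# freshly provisioned VPS to the swarm, then redeploy the affected stack from
# a manager. All names and paths are placeholders.
- hosts: replacement_node
  become: true
  tasks:
    - name: Join the existing swarm as a worker
      command: docker swarm join --token {{ swarm_worker_token }} {{ swarm_manager_addr }}:2377

- hosts: swarm_manager
  tasks:
    - name: Redeploy the download stack (the scheduler places tasks on the new node)
      command: docker stack deploy -c stacks/download.yml download
```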
Summary
The overall monthly cost for the deployment is around $70. The most expensive nodes were the ones requiring high disk capacity and high CPU power. At the start of this endeavor, I understood that newer CPUs were more powerful than older ones, but I didn't have a true understanding of the order of magnitude of the difference. Real-time transcoding of 4K media is very demanding for late-2010s server CPUs. I would like to set up remote transcoding, so that I could use load balancing and automatic scaling for transcoders, but Emby and Jellyfin do not yet support this. I will likely be revisiting this part of the architecture at a later date.