My Setup for the -arr Software Family
Disclaimer: I, in no way, advocate for piracy. It is only legal to obtain copies of media that you already own.
I wanted to set up the *-arr software stack (Sonarr, Radarr, etc.), but all of the resources I found online only covered running everything on a single computer.
Part 1: Gathering Requirements
Firstly, I had to consider storage cost. Cloud storage isn't cheap, and any files I downloaded would rarely be accessed. It made the most sense to rent a minimally powerful storage node, since its only job would be to read from and write to disk.
Secondly, I had to consider what effect transcoding would have on server performance. The server streaming the media needs to be powerful enough for real-time transcoding, or else buffering will make the watching experience unenjoyable.
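Jumping ahead a little to the eventual containerised deployment: one way to express this requirement is to pin the streaming service to a sufficiently powerful node and reserve CPU for it. The sketch below assumes a Jellyfin container, a hypothetical "transcode" node label, and placeholder numbers; it is not my actual stack file.

```yaml
# Rough sketch only: keep the streaming service on a node capable of
# real-time transcoding and reserve CPU for it. The "transcode" label and
# the numbers are placeholders.
version: "3.8"
services:
  jellyfin:
    image: jellyfin/jellyfin:latest
    deploy:
      placement:
        constraints:
          # label added beforehand with: docker node update --label-add transcode=true <node>
          - node.labels.transcode == true
      resources:
        reservations:
          cpus: "4.0"   # leave headroom for transcodes
          memory: 2G
```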
Finally, I had to consider overall resiliency. I would be using a lot of data if I was downloading and uploading a bunch of stuff. While not explicitly illegal, I am aware that peer-to-peer file sharing (torrenting, specifically) might violate a cloud provider's "fair use" bandwidth policy and result in the termination or non-renewal of the associated account. Rather than opt for (expensive) data redundancy, I decided that I would simply use Samba to mount a remote drive on the node responsible for torrenting.
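To give a feel for what that looks like, here is a minimal sketch of consuming the storage node's Samba share on the torrent node as a CIFS-backed Docker volume (a plain /etc/fstab mount on the host would work just as well). The hostname, share name, credentials, and the qBittorrent image are placeholders, not my real values.

```yaml
# Minimal sketch: the torrent node mounts the storage node's Samba share
# as a Docker volume via the local driver's CIFS support. All values below
# are placeholders.
version: "3.8"
volumes:
  media:
    driver: local
    driver_opts:
      type: cifs
      device: "//storage-node.internal/media"
      o: "addr=storage-node.internal,username=media,password=changeme,vers=3.0,uid=1000,gid=1000"
services:
  qbittorrent:
    image: lscr.io/linuxserver/qbittorrent:latest
    volumes:
      - media:/downloads   # downloads land directly on the remote share
```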
After sleeping on it for a few days, I drafted the final list of requirements. It ended up looking like this:
- everything needs to run in a container
- services should be addressable by path; subdomain-only addressing is not acceptable (see the sketch after this list)
- media files need to be stored on their own node
- torrent clients should be deployable without any manual intervention
- there should be zero buffering time when streaming
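The second requirement is worth a concrete illustration. Below is a sketch of path-based addressing using Traefik labels in swarm mode; Traefik is just an example reverse proxy here, and the domain, ports, and labels are placeholders (Sonarr itself would also need its URL Base set to match).

```yaml
# Sketch only: route https://<your-domain>/sonarr to the Sonarr container via
# Traefik labels, instead of giving the service its own subdomain. The Traefik
# service itself (and TLS) is omitted.
version: "3.8"
services:
  sonarr:
    image: lscr.io/linuxserver/sonarr:latest
    deploy:
      labels:
        - "traefik.enable=true"
        - "traefik.http.routers.sonarr.rule=PathPrefix(`/sonarr`)"
        - "traefik.http.services.sonarr.loadbalancer.server.port=8989"
    # Sonarr: Settings -> General -> URL Base must also be set to /sonarr
```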
Part 2: Setup
After some more deliberation, I determined it was best to separate the services by their concerns. I came up with the following groups:
- The frontend, where users and devices will make requests or view retrieved media.
- The media retrieval service, which receives requests and queries indexers to find matching media.
- The download service, which handles actually downloading the media.
- The database, which persists configuration information.
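For concreteness, here is a rough sketch of how those groups might map onto services in a single stack file. The component choices (Jellyfin, Prowlarr, qBittorrent, Postgres) are illustrative stand-ins; the authoritative picture is the diagram below.

```yaml
# Rough sketch of the grouping; image choices are examples, not a definitive
# bill of materials.
version: "3.8"
services:
  jellyfin:       # frontend: where users and devices view media
    image: jellyfin/jellyfin:latest
  sonarr:         # media retrieval: TV requests, queries indexers
    image: lscr.io/linuxserver/sonarr:latest
  radarr:         # media retrieval: movie requests, queries indexers
    image: lscr.io/linuxserver/radarr:latest
  prowlarr:       # media retrieval: indexer management
    image: lscr.io/linuxserver/prowlarr:latest
  qbittorrent:    # download service
    image: lscr.io/linuxserver/qbittorrent:latest
  postgres:       # database: persists configuration
    image: postgres:16
```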
Below is an ✨ interactive ✨ C4 diagram detailing the services and their components. Click around to see all the detail!
Part 3: Deployment
Satisfied with the setup I outlined, the only task remaining was to actually deploy the services. I searched for low-cost VPSes, including one with "unlimited" monthly throughput to serve as the node that would download torrents. To keep track of the nodes more easily, I gave them all names corresponding to members of the girl group Brown Eyed Girls.
Below are the associated deployment diagrams:
I already had an Ansible playbook for setting up nodes from my previous work with the Mamamoo cluster, so I deployed everything using Docker Swarm. Redundancy was only necessary for the drive being utilised by Samba; all other data was backed by configuration files. Loss of any node except miryo would not be catastrophic, as I could simply procure a new node and run the associated docker stack deploy commands. Any data loss would be data that is readily available on the internet, and I considered the time-cost of re-gathering it negligible, given my infrequent use of the service anyway.
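As an illustration of how small that recovery path is, here is a hypothetical Ansible sketch of replacing a lost worker node; the host names, token variables, and stack file paths are placeholders, not taken from my actual playbook.

```yaml
# Hypothetical recovery sketch (illustration only, not idempotent): join a
# freshly provisioned VPS to the swarm, then redeploy the affected stack from
# a manager. All names and paths are placeholders.
- hosts: replacement_node
  become: true
  tasks:
    - name: Join the existing swarm as a worker
      command: docker swarm join --token {{ swarm_worker_token }} {{ swarm_manager_addr }}:2377

- hosts: swarm_manager
  tasks:
    - name: Redeploy the download stack (the scheduler places tasks on the new node)
      command: docker stack deploy -c stacks/download.yml download
```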
Summary
The overall monthly cost for the deployment is around $70. The most expensive nodes were the ones requiring high disk capacity and high CPU power. At the start of this endeavor, I understood that newer CPUs were more powerful than older ones, but I didn't have a true understanding of the order of magnitude of the difference. Real-time transcoding of 4K media is very demanding for late-2010s server CPUs. I would like to set up remote transcoding, so that I could use load balancing and automatic scaling for transcoders, but Emby and Jellyfin do not yet support this. I will likely be revisiting this part of the architecture at a later date.