Configuring a self-hosted Solid POD server
Recently I've been working on a Solid task manager, but I haven't started to use it in production myself. One big reason is that I was using a POD from solid.community, but the software was updated without notifying its users and my application was not compatible with the server for a while. It seems to be working now, but this experience taught me how important it is to have control over my data POD.
So I've decided that I will start self-hosting my data POD using node-solid-server. This shouldn't be too difficult because I've already been working with it locally for development. However, there are a couple of things that I don't expect to be straightforward, like configuring SSL certificates and scheduling backups.
I've forked the node-solid-server repository in order to add some customizations in my deployment. Working on this I found out that NSS (node-solid-server) is likely to be replaced by IPS (inrupt-pod-server) on most official Solid servers, as you can read here. This also means that NSS will probably stop getting as much support, so it made me ponder if I should use IPS instead for my self-hosted POD. In the end, I decided to continue with NSS given that it's the one I've been working with and it's already working well. The whole idea of Solid is that applications should be server agnostic and work using the protocol, so let's see how it goes.
Most of the changes I made in my fork have been UI related. I've basically removed all the public UI and created my own simplified version. I also disabled some routes for creating accounts, password reminders and such since I won't be using them. Something I found misleading is that setting the multiuser configuration flag to false does not hide the registration form, so this was one of my motivations to disable those routes. To create the HTML pages, since they were static assets, I just started writing some inline CSS. But I soon realized how much I was missing TailwindCSS, so I created a tailwind sandbox that I'll be using from now on whenever I need some simple CSS. It's just easier to write it using Tailwind and copy the purged CSS into the head of the static HTML.
Configuring the SSL was also easier than I expected. I already knew I'd be using Let's Encrypt, but I thought I'd have issues using the certificates in the app. In the end, everything worked on the first try. Looking into some nginx configurations I also learned about two new security recommendations. Those were turning off server_tokens and adding HSTS headers.
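For reference, those two recommendations correspond to the nginx directive "server_tokens off;" and to sending a Strict-Transport-Security response header. Here's a minimal sketch, in Python, of how one could check that a server follows them. The function name check_security_headers and the sample header values are made up for illustration, and it works on a plain dict of response headers rather than doing a live request.

```python
import re


def check_security_headers(headers):
    """Return a list of problems found in a dict of HTTP response headers."""
    # HTTP header names are case-insensitive, so normalize them first.
    normalized = {name.lower(): value for name, value in headers.items()}
    problems = []

    # With server_tokens off, nginx sends "Server: nginx" without a version.
    server = normalized.get("server", "")
    if re.search(r"\d+\.\d+", server):
        problems.append("Server header exposes a version: " + server)

    # HSTS is announced through the Strict-Transport-Security header.
    hsts = normalized.get("strict-transport-security")
    if hsts is None:
        problems.append("Strict-Transport-Security header is missing")
    elif not re.search(r"max-age=\d+", hsts):
        problems.append("Strict-Transport-Security has no max-age")

    return problems


# Hypothetical header sets, before and after hardening.
before = {"Server": "nginx/1.18.0"}
after = {
    "Server": "nginx",
    "Strict-Transport-Security": "max-age=31536000; includeSubDomains",
}

print(check_security_headers(before))
print(check_security_headers(after))
```

The first call reports both problems and the second one returns an empty list.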
You can see all the changes I've made to the original repository in the live branch of my fork. But keep in mind that this fork is not intended as a general purpose replacement. It contains my customizations and they may not work in other environments; I'm just sharing them publicly for educational purposes. Also, this repository is the one I'm using in production, including all the configuration files. I thought about doing this in a private repository instead, but I reckon that'd just be security through obscurity. In the end, the important parts about security are the SSL keys and access to the server. And of course, you won't be able to find any of those in the repository.
After deploying the server last week, I've started working on a backup solution. I could search for a Node.js solution, or something specific to Solid. But some months ago I configured backups for this website using laravel-backup and it's been working great, so I wanted something similar for my data POD as well.
Actually, I reckon I'll probably need it for other things in the future. So I decided to write my own CLI application. My idea is to make it language agnostic, and have different drivers for different projects. That way, whenever I have something new that I want to back up I'll just need to implement a new driver.
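To make the driver idea more concrete, here's a rough sketch of what I mean. This is only an illustration and not necessarily how the tool will end up being implemented: the names Driver, FilesDriver, DRIVERS and create_driver, as well as the /var/www/solid path, are all hypothetical.

```python
from abc import ABC, abstractmethod


class Driver(ABC):
    """Hypothetical interface: one driver per kind of project."""

    @abstractmethod
    def paths_to_backup(self, project_root):
        """Return the paths inside project_root that must be saved."""


class FilesDriver(Driver):
    """Backs up a fixed list of folders, e.g. the data folder of a POD."""

    def __init__(self, folders):
        self.folders = list(folders)

    def paths_to_backup(self, project_root):
        root = project_root.rstrip("/")
        return [f"{root}/{folder}" for folder in self.folders]


# Registry mapping a driver name (as it could appear in a config file)
# to the class implementing it.
DRIVERS = {"files": FilesDriver}


def create_driver(name, **options):
    """Look up a driver by name and instantiate it with its options."""
    try:
        driver_class = DRIVERS[name]
    except KeyError:
        raise ValueError(f"Unknown driver: {name}") from None
    return driver_class(**options)


driver = create_driver("files", folders=["data"])
print(driver.paths_to_backup("/var/www/solid"))
```

With something like this, supporting a new kind of project only means writing another Driver subclass and registering it. The rest of the tool doesn't need to change.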
Some time ago I wrote a CLI application to manage projects using Docker during development, which I called metal. It's worked great for me, but some people had issues installing it. That's annoying, because it's just a simple wrapper and Docker is doing the heavy lifting. But it seems like pip has some problems with dependency management (or I don't know how to declare my package properly). So this time I decided to use bash instead (I've used it for nginx-agora, another CLI application I wrote, and it's working well). But I quickly realized it was a fool's errand. It was an uphill battle to write modular code, given that I couldn't even have data structures or return values. I thought it would be nice to use a language that compiles to bash, and I found Batsh, but it didn't convince me either. What actually ended up convincing me was this article: Replacing Bash Scripting with Python. So yeah, I ended up doing it with Python as well.
I'm using the click framework, like I did with metal. This time I hope to learn more about Python's dependency management and building CLI applications. I also set up CI with GitHub Actions, which at least runs the tests in an environment that isn't my own. One thing I've already learned is to avoid using JSON for config files, and use TOML instead.
You can find this work in progress at https://github.com/noeldemartin/rireki. I'm calling it rireki because it means "personal history" or "logs" in Japanese.
I have now completed the first version of rireki and it's deployed on my server, making daily backups of my data POD. There are many ideas I had for it, but I've kept YAGNI in mind and I just implemented the bare essentials (plus a bit more). I wrote some documentation, which I invite you to read if you want to learn about my approach to managing backups. I don't consider writing documentation YAGNI, given that it will also be useful for my future self.
I usually write down the ideas I have that I don't end up implementing, and this time I went a step further and added them as GitHub issues. I was hesitant to do this for a couple of reasons. First, because I have the habit of judging the "quality" of a repository by looking at the ratio of open to closed issues. And second, because I didn't want to depend on GitHub to keep this list. The code is not a concern because I have my local clone, but I would lose the issues if anything happened to the repository. But both reasons are actually unfounded. I need to stop judging repos like that, and this is a good way for people to see some ideas I have for the future. I have used the "enhancement" and "good first issue" labels to make explicit that these aren't bugs. And about trusting GitHub, I just stored a copy of the issues on the Wayback Machine, and that should take care of the problem.
Once the first version of rireki was completed, I proceeded to configure the backup on my server. I wasn't sure what files to include in the backup, and looking for existing issues I found I wasn't the first one to ask that. In the end, it seems like data folders are the only important ones. I haven't tested restoring backups, but I'd consider that YAGNI given how unlikely it is that I'll need to. Before you go thinking that I am being reckless, the backup is just a zip with Turtle files. So the data is not even tied to the implementation of node-solid-server, it's just semantic data in its purest form. That's part of the beauty of Solid.
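To give an idea of how little is involved in a backup like this, here's a minimal sketch that zips a data folder using only Python's standard library. It's an illustration and not the actual rireki code: the function name backup_folder, the archive naming scheme and the folder arguments are assumptions of mine.

```python
import zipfile
from datetime import datetime
from pathlib import Path


def backup_folder(data_dir, backups_dir, now=None):
    """Zip every file under data_dir into a timestamped archive."""
    data_dir = Path(data_dir)
    backups_dir = Path(backups_dir)
    backups_dir.mkdir(parents=True, exist_ok=True)

    # Timestamped name so daily backups do not overwrite each other.
    timestamp = (now or datetime.now()).strftime("%Y-%m-%d-%H%M%S")
    archive_path = backups_dir / f"backup-{timestamp}.zip"

    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(data_dir.rglob("*")):
            if path.is_file():
                # Store paths relative to data_dir so the archive is portable.
                archive.write(path, path.relative_to(data_dir).as_posix())

    return archive_path
```

The backups folder should live outside of the data folder, otherwise the archive would try to include itself. And even without a full restore test, opening the resulting archive with zipfile.ZipFile and looking at namelist() is a cheap way to confirm that the Turtle files are in there.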
To end this task, I want to reflect on how long it took. I started it about a month and a half ago. Considering that this is a side project, and I've had Christmas and starting a new job in between, it didn't take that long. I've made some rough estimates and I think I've dedicated about 60 hours. Included in this task is configuring my Solid data POD and implementing a backup CLI tool, with deploying and all the back and forth in between. 60 hours is about a week and a half of full-time work, and for such fundamental milestones in the path I want to walk, it's great. However, in real time it's been a month and a half, meaning that I would only be able to do about 8 of these in a year. This certainly gives it a new perspective, and I'm thinking a lot about the concept of appetite I learned reading Shape Up. For this task, I'd say my appetite would have been alright because these were important things to do, but it's also true that my initial estimates were off. That's ok, estimates normally suck, especially the way I am approaching side projects, which is with more focus on learning and enjoying than productivity. But again, only 8 a year of this magnitude (which I initially thought would be smaller). I will keep this in mind going forward, and I'm excited to start eating my own dog food with the autonomous data technology I have been building.