Implementing a Media Tracker using Solid
I've recently started using my Solid Task Manager in production. While I don't consider it finished, it's at a point where I can use it without adding more features. One of the things I want to do in my Path is to build the tools to have control in my digital life. I got tired of being at the mercy of others, so I don't want to use anything that doesn't embrace Autonomous Data. And the next step is to implement a Media Tracker.
At the moment, I am using TViso for tracking TV Shows and Movies, MyAnimeList for Manga and nothing for Books. I am not happy with either of those services, but given the vendor lock-in I can't easily switch to alternatives. This makes a great opportunity to continue expanding my toolbox.
My appetite for this task is 40 hours over 5 weeks in real time. Keeping this in mind, I'll scope this first version to tracking Movies only.
Activity
I've started working on this and you can find the code here. I tried to find a good name for this app but nothing came to mind, so for now I am calling it media-tracker. I'm not happy with it, so it'll probably change before I'm done with the task.
Other than naming, something I've come across that I'd like to improve is the scaffolding. I'm starting to duplicate a lot of code across projects, so I'll probably release some packages with utilities. One will contain plain JavaScript utilities: for example, I have a wrapper for localStorage, Array utilities, Object utilities, etc. Many of these are inspired by Laravel helpers. Another package will contain application scaffolding for Autonomous Data apps using Vue and Soukai.
But of course, not everything was useless. Every time that I create a project from scratch, it's a good opportunity to see if I can improve something in my setup. This time, I learned more about how I am using service workers in my PWAs. Vue CLI comes with Workbox installed out of the box, but I've been having some issues pushing updates to clients. I suspect that this has something to do with the default strategy. It seems that apps use precaching with the default configuration, which acts similarly to the CacheFirst strategy. The problem with this would be that new code won't reach the clients until the cache is invalidated. So in my new configuration I am using StaleWhileRevalidate for index.html, while all other assets can keep using CacheFirst because they are versioned. The truth is that I still have to learn more about this, but for now I'd say it's good enough (and it's not the main goal of this task).
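To illustrate the difference between both strategies, here's a minimal, framework-free sketch. It doesn't use Workbox at all: `cache` is any Map-like store and `fetcher` is any async function returning a fresh response, both stand-ins (assumptions for this example) for the Cache API and `fetch()`.

```javascript
// CacheFirst: serve from the cache when possible and only hit the network
// on a miss. New code never arrives unless the cache entry is invalidated.
async function cacheFirst(cache, fetcher, url) {
    if (cache.has(url)) {
        return cache.get(url);
    }

    const response = await fetcher(url);

    cache.set(url, response);

    return response;
}

// StaleWhileRevalidate: answer with the cached copy right away (if any) and
// refresh the cache in the background, so the next visit gets the new code.
async function staleWhileRevalidate(cache, fetcher, url) {
    const cached = cache.get(url);
    const revalidation = fetcher(url).then(response => {
        cache.set(url, response);

        return response;
    });

    if (cached !== undefined) {
        // If the network fails, keep serving the stale copy silently.
        revalidation.catch(() => {});

        return cached;
    }

    return revalidation;
}
```

With this behaviour, an updated index.html reaches clients on their second visit after a deploy, whereas a CacheFirst entry stays frozen until its url changes (which is what versioned asset names achieve).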
Finally, other than all the housekeeping, I've also started working on application specifics. When it comes to Solid, something that continues to be a problem for me is finding fitting RDF definitions. But this time, I was lucky because schema.org already supports Movies. It also supports watch actions, so it doesn't seem like modeling will be a problem this time. But there is something new in this app that I'll need to investigate. In solid-focus, I am using one document per resource. But for this app, I think it's a better approach to keep actions in the same document as the movie they are referencing. Actually my initial idea was to have a simple date attribute to indicate when a movie had been watched, but after looking into schema.org I decided to use their approach. This also makes me ponder if actions should be resources at all, which doesn't make a lot of sense to me. So I'll have to investigate more on this.
It's taken me a long time to post this update because I've been busy attending FOSDEM, preparing a talk on Solid and as I'll explain later, solving some technical challenges. On the flip side I've implemented an RSS feed for these updates, maybe that's where you are reading this :).
As I mentioned in my last update, I've been pondering how to model watch actions in this application. Given that I needed to solve a new use-case, I decided to revisit the Solid docs and I'm glad to say they've improved. I also noticed I am neglecting setting up the Type Registry in my applications; that's something I'll have to fix at some point. The conclusion I reached after learning how others are handling similar use-cases is that I should create a new relationship in soukai-solid. As I suspected, watch actions should not be LDP Resources, although they'll be RDF Resources (because all RDF entities are). I hope this doesn't get confusing: basically, LDP Resources are documents (they have their own url), and any other entities that don't map to a document will be plain RDF Resources. In modeling this I've called these non-document entities Embedded Resources, and the new relationship can be declared using embeds and isEmbeddedBy. Since they are within an existing document they can't have their own url. I've solved this by using the url fragment (which I think is a standard practice in Solid). For example, if I have a movie with uuid 12345 at https://my-pod.com/movies/12345, the identifier for a watch action with uuid abc on that movie would be https://my-pod.com/movies/12345#abc.
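The fragment convention can be sketched with the standard URL API. The helper names (`mintEmbeddedUrl`, `documentUrlOf`) are made up for this example and are not part of soukai-solid; the `WatchAction` type and its `object` property come from schema.org.

```javascript
// Mint the identifier of an embedded resource from the url of the document
// that contains it (hypothetical helper).
function mintEmbeddedUrl(documentUrl, uuid) {
    const url = new URL(documentUrl);

    url.hash = uuid;

    return url.href;
}

// Find out which document has to be fetched to read a given resource
// (hypothetical helper): everything before the fragment.
function documentUrlOf(resourceUrl) {
    const url = new URL(resourceUrl);

    url.hash = '';

    return url.href;
}

const movieUrl = 'https://my-pod.com/movies/12345';

// A watch action living inside the movie document, pointing back at the movie.
const watchAction = {
    '@context': 'https://schema.org',
    '@id': mintEmbeddedUrl(movieUrl, 'abc'),
    '@type': 'WatchAction',
    'object': { '@id': movieUrl },
};
```

Here `watchAction['@id']` is `https://my-pod.com/movies/12345#abc`, and `documentUrlOf` maps it back to the movie document.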
Other than the reasons I've mentioned at the beginning, this has also taken me a couple of weeks in real-time because library development and app development are completely different. When I'm developing library code (soukai and soukai-solid), I have to be careful not to break existing functionality. I also need to make sure that every new feature is congruent with the library. But I think it's a worthy effort because that will allow me to make applications faster (and I plan to make many :D). This may cause this task to go beyond my initial appetite, but I consider that this library work is outside the scope of the task anyways.
I have finally come up with a name that I like for the project: Media Kraken. It's likely to change at some point, because I've seen that it's already used by others. But it'll do as a codename for a while, at least until I start caring about marketing for this project (which may never happen). And yes, brace for the "release the kraken" jokes because they'll be numerous.
Something else I've done is start using the Type Registry like I mentioned in the last update. This has been particularly easy because I've been able to use the new non-document entities feature that I implemented last time. I have created a TypeRegistration model and that's it, so the extra work I did last week has already paid off.
And finally, I have also started integrating the application with 3rd parties. Given that it's an application that will allow browsing movies, the data catalog has to come from somewhere. I have to say that I am both surprised and not surprised at how hard it's been to solve this. What I was thinking at first is that it'd be easy to use some API from imdb (just looking at the name gives the impression that it must be queryable). But it's surprisingly closed: the only thing they provide is downloading some files that are updated every day. It is not very convenient, especially given that I am making an application that lives in the frontend with no backend. So yeah, I've been looking for alternatives and there isn't any real open database for this. I shouldn't be surprised because we are in the age of data and the ones who have it don't want to give it away. This is not the first time that I've faced a similar situation, so it's surprising because that's not how I think the world should work, but it's not surprising because it's consistent with how things have been so far.
What many do at this point is start scraping. Again, this is not feasible in this application for a variety of reasons. Let's put aside the ethical, moral and legal implications. Since the application lives in the browser, and CORS is a thing, it's not possible to do scraping without resorting to some sort of proxy. I could have gone down that path, but I'm not convinced that it is a good solution in the long run. So what I ended up doing is using the closest thing I could find to an open database, and that is tmdb.org. I don't consider this 100% open because it requires an API key. This wouldn't be so bad if I had a backend, but I don't. And I cannot ship the API key in the frontend because it would be exposed. So I ended up creating an AWS Lambda that proxies calls to the API, only in order to keep the API key away from the frontend. This is obviously not ideal, especially since it isn't possible to limit AWS usage. But I haven't been able to come up with a better solution; if anyone has one, I'm all ears.
What I plan to do at some point is allow users to configure their own proxies. But we all know nobody will do it, unless they have to. And that will happen when the AWS Lambda is close to incurring costs and I shut it down. I don't think that'll happen anytime soon, so for now I won't worry about this.
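For reference, a proxy of this kind can be very small. The following is only a sketch under some assumptions, not the code actually deployed: it assumes the API Gateway proxy event format, only covers TMDB's movie search endpoint, and `createHandler` with its options is a hypothetical name. The `fetchImpl` option exists so the handler can be exercised without network access.

```javascript
// Build a Lambda handler that forwards search requests to TMDB, adding the
// API key on the server side so that it never reaches the browser.
function createHandler({ apiKey, fetchImpl = fetch, allowedOrigin = '*' }) {
    return async function handler(event) {
        const url = new URL('https://api.themoviedb.org/3/search/movie');
        const params = event.queryStringParameters || {};

        // Forward the client's parameters, but never let it choose the key.
        for (const [name, value] of Object.entries(params)) {
            if (name !== 'api_key') {
                url.searchParams.set(name, value);
            }
        }

        url.searchParams.set('api_key', apiKey);

        const response = await fetchImpl(url.href);

        return {
            statusCode: response.status,
            headers: {
                'Content-Type': 'application/json',
                // Needed because the app calls the proxy from another origin.
                'Access-Control-Allow-Origin': allowedOrigin,
            },
            body: await response.text(),
        };
    };
}
```

In a real deployment the key would typically come from an environment variable, for example `exports.handler = createHandler({ apiKey: process.env.TMDB_API_KEY })` (the variable name is also an assumption), and `allowedOrigin` could be restricted to the application's domain.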
Now that I've integrated with a data provider, the app is starting to be functional. But not visually, it sucks, so after a couple more features I'll probably be done with the first version and finish this task by implementing a decent UI.
Today's update is not about Solid, because this week I've been working exclusively on the UI. And the funny thing is, I've only finished the header and the logo! But it's been a fun week.
I'm always looking forward to reaching Flow when I'm working. This is a psychological state that is reached when you're performing a task that you enjoy and that is neither too easy nor too difficult. And that's exactly what designing the header and the logo has been like. I reckon I've spent too much time on this, and I've blown the appetite budget at this point. This week I've also spent more time working on side-projects than usual (~20 hours, and I usually spend ~10). But in a way that's the point of Flow, that you lose track of time.
After thinking about this I've reached the same conclusion I did a while ago: this is a side-project and doing the work is not my only goal, I also want to learn and explore. But it's important to balance both, that's why I find these reflections useful.
So, what have I actually learned and explored this week? First, let me show you the results:
Mobile layout:
Desktop layout:
What you see in these two screens is the same html, styled using CSS responsive utilities. And yes, that includes the animations too! I've done similar things in the past but not with so many interactions. This time I've decided to use Tailwind CSS without any component framework, and I've also been exploring Tailwind UI that was released this week.
My opinion on Tailwind UI is not great so far, and it pains me to say this because I love Tailwind. It's been really useful for inspiration and to learn some things, but it hasn't been copy & paste as "advertised" (although they admit that you may need to adapt it to your project). One of the worst things has been AlpineJS. It isn't that I don't like it, in fact I didn't know it and it seems nice. But having to adapt it to Vue hasn't been straightforward. I also started with a fairly similar approach to their sample code but I ended up redoing almost everything. I suppose this just means that Tailwind UI is not for me.
If you haven't tried Tailwind, please don't be put off by what I said: Tailwind is awesome and I encourage you to try it. In the process of exploring Tailwind UI I've also upgraded to version 1.2.0 and started using the new transition utilities, which got me into the rabbit hole that ended with all these animations.
Some weeks ago I said that I wasn't completely happy with the approach I had taken to interact with the TMDB API. I recently found a forum discussion where Travis, TMDB's founder, gives the green light to exposing the API key in the frontend. I don't think that's a good approach, but if he says it's ok I guess it is. This is probably one of those situations where theory is one thing and practice is another. This is theoretically a security issue, but in practice nobody is exploiting it.
Other than this, the past 3 weeks I've continued to work mostly on UI. I took a detour to implement a TailwindCSS Colors Generator, but other than that I've implemented search and movies management.
Before getting into the details, here's how it looks at the moment:
This may look deceptively simple given the amount of time it's taken me, roughly 30 hours. But there are some nuances to keep in mind.
It cannot be overstated how different it is using a UI framework like Vuetify (as I did with Solid Focus) or using plain CSS. I am using TailwindCSS which isn't exactly plain CSS, but it is essentially the same. The fact is that doing it from scratch takes a lot more time. This is not only because it's more difficult, but also because you are missing building blocks that you'd take for granted, such as modals and snackbars. Any simple feature that you are developing can become cumbersome when you realize you need a modal or a snackbar.
On the other hand, it's also more rewarding and more fun. I'm also building reusable components for upcoming projects, so in a sense I may be creating my own UI framework. But the important aspect is the flexibility I have with this approach. Sure, I could have done any of the things I'll explain with other frameworks. But creating these interactions is not only a matter of implementation. They are the result of an exploration process, and using this approach allows me to explore without the constraints (and assumptions) that frameworks inherently have. What I'm doing here is not only implementing a spec, I'm constantly refactoring code and UI.
The first thing I want to highlight is the animation that takes place when a movie is marked as "watched" and is, literally, sent to your collection. You may not have noticed that, so I encourage you to look again. When a movie disappears from the grid, it shrinks and is sent towards the "My Collection" link (which is where you have to click if you want to find the movie again). I know it's a very small detail, and if most people didn't notice it's arguable how useful it is. But that's the kind of thing I appreciate, the little details. And it's also super fun to work on this kind of stuff. If you're wondering how I achieved this, it was using a combination of Vue list transitions and a custom JS script.
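As an illustration of what the custom JS part of such an animation could look like (this is a sketch under assumptions, not the actual script), the transform can be derived from the bounding boxes of the movie card and the link. `flyToTransform` and `sendToCollection` are made-up names; `getBoundingClientRect` and `Element.animate` are standard browser APIs.

```javascript
// Compute the CSS transform that moves an element from its current position
// to the center of a target while shrinking it, given both bounding boxes
// (as returned by getBoundingClientRect).
function flyToTransform(from, to, scale = 0.1) {
    const fromCenter = { x: from.left + from.width / 2, y: from.top + from.height / 2 };
    const toCenter = { x: to.left + to.width / 2, y: to.top + to.height / 2 };
    const dx = toCenter.x - fromCenter.x;
    const dy = toCenter.y - fromCenter.y;

    // With the default transform-origin (the center), translating by the
    // distance between centers lands the shrunk element on the target.
    return `translate(${dx}px, ${dy}px) scale(${scale})`;
}

// Browser-only: meant to be called from the `leave` hook of a Vue list
// transition, which receives the leaving element and a `done` callback.
function sendToCollection(card, link, done) {
    const transform = flyToTransform(card.getBoundingClientRect(), link.getBoundingClientRect());
    const animation = card.animate(
        [{ transform: 'none', opacity: 1 }, { transform, opacity: 0 }],
        { duration: 400, easing: 'ease-in' },
    );

    animation.onfinish = done;
}
```

For a 100x100 card at the top-left corner and a 20x20 link at (200, 50), `flyToTransform` returns `translate(160px, 10px) scale(0.1)`.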
Something else that was interesting to work on is the button that marks movies as watched. This cannot be appreciated in the video, but that element is actually a button when the movie is pending and it becomes a div once the movie is watched (so, after clicking it). With the magic of Vue and Tailwind combined, this is seamless and cannot be perceived visually. Which is the point. This was achieved using Vue's dynamic component and some advanced attribute bindings:
Vue:
    <component
        :is="movie.watched ? 'div' : 'button'"
        class="badge absolute top-0 right-0 -mt-1 w-10 h-10 flex items-center justify-center"
        style="margin-right:-.7rem"
        v-bind="movie.watched ? { class: 'watched' } : { type: 'button' }"
        @click="movie.pending && markWatched()"
    >
        <BaseIcon name="bookmark" class="background absolute inset-0 w-10 h-10" />
        <BaseIcon v-if="movie.pending" name="time" class="icon-pending text-blue-600 w-4 h-4 z-10" />
        <BaseIcon name="checkmark" class="icon-watched text-green-600 w-4 h-4 z-10" />
    </component>
TailwindCSS:
    .badge {
        .background { @apply text-blue-300; }
        .icon-watched { @apply hidden; }

        &:hover, &.watched {
            .background { @apply text-green-300; }
            .icon-watched { @apply block; }
            .icon-pending { @apply hidden; }
        }
    }
At this point, I feel like the UI is almost finished. I have gone way past the appetite budget, and I realize this happened because of my nitpicking with the graphic part. But I'm actually comfortable working like this, as I explored in a blog post called Order vs Chaos.
Today I want to write a short update on how things are going. I thought by now I'd be almost finished, but turns out I just found something important to improve and that'll probably delay the release even more. I still expect it to happen shortly though, in about 2-3 weeks.
The past two weeks I've been finishing the UI and the only thing that's missing now is the initial loading screen. I've been doing some tinkering with data fetching, and I believe I'll be able to make it really fast (compared with solid-focus which is kind of slow at the moment). I'll have 1500 movies in my account, and that's what I've been using for testing locally.
Something else I've done is deploying the app using GitHub Pages (don't use it yet because there'll be breaking changes for sure!). I had some issues with routing that should be solved now. The problem was that some of the application routes, for example /collection, led to a GitHub 404 page. The reason is that GitHub expects to have an html file for every route and the application is a Vue SPA using vue-router. I'm surprised that I didn't find many resources on how to solve this, but I ended up writing a simple script to handle that.
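One common way to handle this, shown here only as a sketch and not necessarily what the script mentioned above does, is to copy the built index.html into a folder per route after the build, plus a 404.html fallback. `publishRoutes`, the `dist` folder and the route list are assumptions for this example.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Copy the SPA entry point to every client-side route so that GitHub Pages
// finds an html file there, and add a 404.html fallback for the rest.
function publishRoutes(distDir, routes) {
    const index = path.join(distDir, 'index.html');

    for (const route of routes) {
        const routeDir = path.join(distDir, route);

        fs.mkdirSync(routeDir, { recursive: true });
        fs.copyFileSync(index, path.join(routeDir, 'index.html'));
    }

    // GitHub Pages serves 404.html for any path without a matching file.
    fs.copyFileSync(index, path.join(distDir, '404.html'));
}
```

It could run after the build with something like `publishRoutes('dist', ['/collection'])`. Known routes are then served with a regular 200 response, while anything else still boots the app through 404.html, although with a 404 status code.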
I've also set up a CI testing environment using GitHub Actions. I hadn't done this before because I was in exploration mode, and the app is now starting to become stable enough for a first release. I normally use a TDDish approach to development, but I do 0 tests when I'm exploring or tinkering with new concepts. The same applies to documentation.
Something else interesting I've been doing is a markdown component that allows me to simplify the generation of text-based app content. This may be a bit overkill, but I've enjoyed doing it and it allows me to do things like defining modals entirely with markdown and having some nice interactive import logs.
When I was almost done with this task, I went into a couple of new rabbit holes. None of them were essential for the release, I could have pushed through and released anyways. But at this point I'm embracing the "It'll be done when it's done" craftsman mindset. I am logging how much time I'm dedicating to each part and I'll post a summary of what I've spent my time doing when the task is done.
Still, I am not abandoning the lessons learned from the shape up methodology. Something I applied recently is the circuit breaker. I did not get into these rabbit holes without betting first. And the second one was one hour shy of getting cancelled.
Rabbithole #1: Lazy Elements Loading
The first rabbithole I went into was "paginating" the movie collection. Yeah, that's in quotes because that's what I thought I'd be doing. When I started testing the application with a dataset of 1000+ movies, I realized how slow it was. This was to be expected because I was rendering all the movies in a single page, images and all.
My first instinct was to paginate the results, and I implemented a version with that. But I was not happy with the result. I also experimented with infinite scroll, but I didn't like it either because it took ages to reach the bottom. After some more tinkering I recalled a blog post on how Google Photos implemented their image browser. I am ashamed to admit that I use Google Photos, although it's on my list of things to replace with autonomous data alternatives. But you can't argue against the quality of the product. Inspired by that post and Google Photos' UX, I implemented a solution where the full scroll height is rendered but elements aren't displayed until they appear on screen. This is possible thanks to the Intersection Observer API and chunking the results.
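The idea can be sketched in a few functions. This is a simplified illustration with made-up names (`chunk`, `placeholderHeight`, `observeChunks`), assuming a grid with a fixed number of columns and a fixed row height, which a real responsive layout wouldn't have.

```javascript
// Split the collection into fixed-size chunks; each chunk gets one
// placeholder element in the page.
function chunk(items, size) {
    const chunks = [];

    for (let i = 0; i < items.length; i += size) {
        chunks.push(items.slice(i, i + size));
    }

    return chunks;
}

// Height to reserve for a chunk that hasn't been rendered yet, so that the
// scrollbar reflects the full collection from the start.
function placeholderHeight(itemsCount, columns, rowHeight) {
    return Math.ceil(itemsCount / columns) * rowHeight;
}

// Browser-only: render a chunk the first time its placeholder gets close to
// the viewport. `renderChunk` is whatever fills the placeholder with movies.
function observeChunks(placeholders, renderChunk) {
    const observer = new IntersectionObserver(entries => {
        for (const entry of entries) {
            if (!entry.isIntersecting) continue;

            renderChunk(entry.target);
            observer.unobserve(entry.target);
        }
    }, { rootMargin: '200px' });

    placeholders.forEach(placeholder => observer.observe(placeholder));

    return observer;
}
```

With 1500 movies in chunks of 50, only 30 placeholders are observed instead of 1500 cards, and jumping to the bottom of the page only renders the chunks that end up on screen.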
Rabbithole #2: Web Workers & IndexedDB
The second rabbit hole came about looking at the responsiveness and speed of the initial loading. Once the app is loaded it works well, but the initial loading is excruciating. This is also a problem in Solid Focus, but it's accentuated in this application because the dataset is bigger.
There are multiple reasons for this. Two important ones are network requests and parsing semantic data. Looking into this, I reached the conclusion that it can be improved by caching more data in the browser. I was already doing something similar in offline mode, with a local storage engine. But exploring other improvements I found two browser APIs I hadn't been using: IndexedDB and Web Workers.
Those two play very well together, so I spent some time rewriting different parts of the stack to support web workers and I created a new IndexedDBEngine in Soukai. Although I'm not completely finished with this, and the reason is...
Rabbithole #3: JsonLD serialization
SolidModel currently serializes models to "friendly human-readable json". This is something that SolidEngine already knows, and it translates the attributes to a linked data format. The reason why I followed this approach is that other engines, such as LocalStorageEngine, don't know anything about Solid and they'll treat serialized models as normal objects.
My goal was that exported models would look understandable to humans, but in hindsight that was a mistake. JsonLD is a standard format and even though it isn't the most human-readable thing, it's close enough. The cost of not serializing to JsonLD is that semantic data will be lost. This hinders the ability to export and import data, which has become apparent with Media Kraken because I'm implementing these capabilities from the start.
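To make the trade-off concrete, here's a small sketch of the difference between both serializations. The field names, the `movieFields` map and the `toJsonLD` helper are invented for this example and don't reflect SolidModel's actual API; the properties come from schema.org.

```javascript
// Map from "friendly" attribute names to RDF properties.
const movieFields = {
    title: 'https://schema.org/name',
    releaseDate: 'https://schema.org/datePublished',
};

// Serialize friendly attributes to a JSON-LD object. Unlike the friendly
// version, the result says what every attribute means, so it can be
// exported and imported elsewhere without knowing about the model.
function toJsonLD(url, type, fields, attributes) {
    const jsonld = { '@id': url, '@type': type };

    for (const [field, value] of Object.entries(attributes)) {
        // Attributes without a known property have no meaning outside the app.
        if (!(field in fields)) continue;

        jsonld[fields[field]] = value;
    }

    return jsonld;
}

const movie = toJsonLD(
    'https://my-pod.com/movies/spirited-away',
    'https://schema.org/Movie',
    movieFields,
    { title: 'Spirited Away', releaseDate: '2001-07-20' },
);
```

The friendly object `{ title: 'Spirited Away' }` only makes sense to code that knows the model, whereas the JSON-LD version carries `https://schema.org/name` with it.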
I embarked on a crusade to rewrite SolidModel to serialize to JsonLD. And I say crusade because this involves refactoring multiple parts of the stack, and it'll definitely cause breaking changes. But I'm confident that it'll be better in the long run.
This one is so core that I don't really have a bet for it, it'll take as long as it needs to. I've already sorted out some of the core changes, and I've replaced the library I was using for interacting with semantic data with n3 and jsonld-streaming-parser. This should also reduce the bundle size and improve performance.
It's been a month since the last update, but at least I can say that I've finished the refactor! I am confident that the foundations are done and I'll close the task in the next update.
Before I go into the details of the refactor, here's a diff of the changes in soukai and soukai-solid. I don't expect anyone to understand them, but you can see the magnitude of what I've been working on. Definitely not trivial, more on this later.
In my previous update I mentioned the motivations to follow this path. I've continued learning about JSON-LD and RDF data structures, and I posted some of my doubts in the Solid community forums. Turns out I was missing more core concepts than I thought.
An important one is the fact that a Solid document doesn't necessarily need to declare a resource with the same url. For example, you could have a movie stored at https://your-pod.com/movies/spirited-away but the movie resource could have the id <https://your-pod.com/movies/spirited-away#it>, and it's actually not that uncommon. Something else I was assuming is that related resources would be stored in the same document, for example WatchActions. Yes, my application will store them in the same document, but this doesn't mean that the application should break if some actions are moved to different documents.
What this boils down to is that the proper way to store a Solid Document in JSON-LD format is a graph object. And this, unfortunately, complicates things a lot. This means that a simple model update has to be translated, at the engine layer, to updating an item within an array. In order to tackle this, I've been looking at how some NoSQL databases manage this kind of data. One that I've used previously and I like is MongoDB. So my current implementation is very much inspired by that.
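As a sketch of what "updating an item within an array" means here, consider a document stored as a JSON-LD graph. `updateGraphResource` is a hypothetical helper written for this example, not soukai's API, and the `#abc` action id is made up.

```javascript
// Update one resource inside a JSON-LD graph document, without mutating
// the original: the resource has to be found by its @id within @graph.
function updateGraphResource(graphDocument, id, changes) {
    return {
        ...graphDocument,
        '@graph': graphDocument['@graph'].map(
            resource => resource['@id'] === id ? { ...resource, ...changes } : resource,
        ),
    };
}

// A document holding a movie (identified with #it) and one watch action.
const graphDocument = {
    '@graph': [
        {
            '@id': 'https://your-pod.com/movies/spirited-away#it',
            '@type': 'https://schema.org/Movie',
            'https://schema.org/name': 'Spirited Away',
        },
        {
            '@id': 'https://your-pod.com/movies/spirited-away#abc',
            '@type': 'https://schema.org/WatchAction',
            'https://schema.org/object': { '@id': 'https://your-pod.com/movies/spirited-away#it' },
        },
    ],
};

// Renaming the movie is no longer "set an attribute in the document", but
// "find the right item within @graph and set an attribute there".
const updated = updateGraphResource(
    graphDocument,
    'https://your-pod.com/movies/spirited-away#it',
    { 'https://schema.org/name': 'Sen to Chihiro no Kamikakushi' },
);
```

The watch action is left untouched, and so is `graphDocument` itself, which makes it easy to compare both versions.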
Another consequence of this paradigm shift was refactoring relations. The implementation I had before was quite simple, and that's why it didn't handle all the use-cases. My initial approach had been to look at Laravel relationships, but this time I went to Rails' Active Record Associations. One of the main reasons why I did this is that Rails has a decent mongodb driver, whereas Laravel doesn't.
One last thing to mention about the refactor is how nice testing has been. I had already done TDDish development in previous versions, and this meant that I could be confident moving things around knowing that tests had my back. To my surprise, when I ran a coverage report for the first time, both soukai and soukai-solid had more than 90% coverage, which is nice. Of course, the real proof that tests have my back will come when I upgrade the dependencies in Solid Focus and see how many regressions I find.
I still have to update the documentation for both projects, so I will process all the information I vomited above into the docs.
Now that I'm done with that huge detour, it's time to get back into retrospective mode.
This time, I felt the weight of the project and its complexity. One of the things I need to improve is that I'm prone to overengineer. And I'm 100% sold on simplifying, but the thing is that simplicity is hard, possibly more than complexity. I perceive something as simple and before I know it I'm within a rabbithole of complexity. But even in hindsight I'm not sure that I could have done this any simpler, and I don't want to dedicate more time to it. The problem is inherently complex: after all, I'm trying to implement an Active Record library for Semantic Web technologies, and that's not trivial. Especially given that I'm not an expert in either of those domains. Maybe I shouldn't even be working on this problem at all.
In any case, I am fortunate to be able to dedicate my time to these things and potentially waste it. As I already knew, library development is a completely different beast to application development and I'll have to continue pondering the correct way to approach it.
For now, it's back to appetite, application and release mode.
6 months after starting this task, I can finally say it:
I've released the Kraken!
I'll make the official announcement in social networks next week, but I'm certainly closing this task now and I don't expect to change anything big.
The last couple of weeks I've been tying some loose ends and doing a review of this first release. I'm still doubtful about some implementation details in my approach to use Web Workers, but I think the design choice is correct. Something I'm not too happy about is caching and the way data is managed, but there are some limitations that I'm not sure how to overcome. I've opened a post in the Solid forum to start a conversation about this. And I found yet another unimplemented Solid feature.
I have to say that I'm becoming more disappointed by the Solid community each day. I'll continue supporting the project and making Solid apps because I still share the underlying values and vision. But the developer experience is very bad, and I struggle a lot in making things work that in other contexts would take me 5 minutes. I've been lurking in the Solid community for about two years now, and it seems like there is a lot of talking and theorizing but I still have to see a single Solid application that I really love (and yes, I include my apps in that bucket). It seems to me like one solution would be to implement my own Solid POD. But I don't really have the knowledge, time nor motivation at the moment. This is in stark contrast with the Laravel community, which continues to delight me and it's very hands-on.
Anyways, coming back to Media Kraken. I've updated the documentation and released new versions of the three projects: soukai, soukai-solid and media-kraken. I also started using the main branch name (instead of master) in Media Kraken, I'll eventually do that with all my projects. And I also started looking into better ways to build npm libraries, like rollup and api-extractor. But I caught myself getting into another rabbithole so I decided to leave this for another day.
Now, before closing this task let's look at time dedication. In total, I've dedicated 278 hours to this task. That's about 7 weeks of full-time work. However, given that this is a side-project and I've had stops in between, it's taken me 6 months. Here's the hours breakdown (the first weeks don't have a description because they were within appetite budget):
Time dedication hours breakdown
- Week 1 - 11 hours
- Week 2 - 0 hours
- Week 3 - 3 hours
- Week 4 - 7.5 hours
- Week 5 - 9.5 hours
- Week 6 - 9 hours
- Week 7 - 19.5 hours (Logo & Responsive/Animations)
- Week 8 - 15.5 hours (Search, Login, Welcome, Modals)
- Week 9 - 6 hours (Final UI)
- Week 10 - 12.5 hours (Dynamic badges, Send to collection animation, Snackbars, Menu)
- Week 11 - 18 hours (Movie UI Refactor, Deployment & UX Tweaks)
- Week 12 - 9.5 hours (Modeling refactor, Import UX, Testing)
- Week 13 - 20 hours (Testing, Pagination, Models Caching)
- Week 14 - 15.5 hours (Web Workers, IndexedDB)
- Week 15 - 12 hours (JSON-LD models serialization)
- Week 16 - 7.5 hours (JSON-LD models serialization, Relationships refactor)
- Week 17 - 17 hours (Relationships refactor, Embedded->Documents refactor)
- Week 18 - 13 hours (@graph Filters & updates, RDFDocument refactor, Engine Caching)
- Week 19 - 10 hours (Relationships refactor, Relations Caching, Kraken Caching)
- Week 20 - 16 hours (Kraken Caching, Import/Export JSON-LD, Testing)
- Week 21 - 2.5 hours (Testing)
- Week 22 - 4.5 hours (Tweaks, JSON-LD url minting)
- Week 23 - 9 hours (Tweaks, Caching)
- Week 24 - 12 hours (Solid modeling, Filters/sorting)
- Week 25 - 5 hours (Documentation)
- Week 26 - 11.5 hours (Documentation, Release)
It'd be an understatement to say that I've exceeded the initial appetite of 40 hours. It's been 7 times that. However, I have already talked about it previously. It has been mostly a conscious decision, and most of the overtime has gone to either UI tweaks (which I enjoy working on) or library development (which I believe will yield benefits in the long term). The actual core of the application was done in the 40 hours budget, but I cannot deny that I've mismanaged this. I suppose the proper way of doing this would've been closing the task at 40 hours, and opening new tasks (or "bets" in shape-up language). If this pattern keeps repeating, I'll have to do something about it. In general I don't mind dedicating more time to UI, but I wouldn't like having to dedicate more time to library work given that the whole idea is to make my life easier.
And now, I can start using Media Kraken in my daily life!
Task completed
Here's an afternote for this closed task.
Shortly after completing the task, I announced the release in a post in the Solid Forum. I've got some interesting feedback which I've used for a new release. Nothing major, but for those interested in the evolution of the app, the conversation's continued there :).
As for me, I've already replaced my previous app to track movies with Media Kraken and so far I'm happy with this v0.1. The only real problem is the initial loading time, which I've continued discussing in this other post. It doesn't look like I'll get any solutions for the short term, but at least the caching strategy I implemented makes it bearable.