Raising an Agent

(Ongoing)

There are many things going on with AI, but I don't love all of them. In this task, I'm creating my own AI toolkit. We'll see how far I take it!

Activity

Task started

Lately, it seems like AI is everything, everywhere, all at once (online). I don't like it either, but after resisting for a long time I think it's time to embrace it. However, I'll do it on my own terms!

Now, I don't know how far I'll take this, because the goal is obviously open-ended. But this time, I've decided to start the task a little differently: I already have the first version online đŸ¤¯. It's still very early, and nothing more than a prototype, but if you're keen to see how it's going, check it out: anima.noeldemartin.com.

So, what is Ànima?

Right now, Ànima is just an interface to chat with different AI models, using your Solid account. It'll store the conversations in your POD, and give the models access to your entire storage. Make sure not to expose it to models you don't trust! In fact, I don't recommend using it with a real POD yet... so maybe just point it to a development POD if you're curious.

At the same time, this is what Ànima doesn't do (yet):

  • Write to your POD (access is read-only).
  • Communicate outside of the browser or the AI model (it can't leak your information).

All in all, it should be relatively safe to play with. Ideally, something like this would be used mostly with local models. But there's nothing stopping you from connecting it to third-party providers.
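The read-only restriction above can be pictured as a guard sitting in front of every POD request. This is only an illustrative sketch (the names and types are my assumptions, not Ànima's actual code): reads pass through, anything that would write gets rejected.

```typescript
// Illustrative sketch of a read-only guard for POD requests.
// Types and names are hypothetical, not Ànima's real implementation.

type PodRequest = {
    method: 'GET' | 'HEAD' | 'PUT' | 'POST' | 'PATCH' | 'DELETE';
    url: string;
};

// Only safe, non-mutating HTTP methods are allowed through.
const READ_METHODS = new Set(['GET', 'HEAD']);

function assertReadOnly(request: PodRequest): void {
    if (!READ_METHODS.has(request.method)) {
        throw new Error(`Blocked ${request.method} ${request.url}: POD access is read-only`);
    }
}
```

With a choke point like this, the "give the model access to everything" part stays comparatively safe: the worst a misbehaving model can do is read.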

However, this is only the beginning :). I do intend to make it an agent eventually (meaning, it should be capable of performing long-running tasks), and communicate with the outside world. I'd also like to make it capable of creating apps (using Aerogel, of course). But other than that, I'm not sure where I'll take it.

Something cool I've been thinking about is that this could solve the biggest problem in the Solid ecosystem (in my opinion). The current version is a simple SPA, which means that everything is happening in the browser. But ideally, I'd like to install this on a server or a personal computer. The more I think about it, the more I'm convinced that this is the perfect use-case for a Solid POD. Imagine a Solid POD that is easy to install anywhere, and comes with an AI assistant built-in that is private by default. Yes, this sounds a lot like Charlie :).

Now, let's talk a bit about technology choices.

Initially, I was very excited to start using Laravel again, because this was going to be primarily a server-side thing. But then the Laravel AI SDK came out, and it didn't support local models :/. They have added them since, but the support is still limited and by that point I had already moved on. Instead, I started using Vercel's AI SDK. And so far, it's going great. I've been able to release the app as an SPA, which hadn't even crossed my mind. I'm pretty sure some upcoming features will need to live outside of the browser, but it's very cool that this was possible!

In order to manage the server-side stuff, I also started playing with Tauri. Yes, when I say "server-side", I also mean a personal computer. The idea is that you should be able to install Ànima on your own computer, without knowing anything about servers or CLIs. As I was working on this, Tauri soon became a bit overwhelming, with builds that take 30 minutes and 8GB+ of disk space (in build assets, not the final artifact). After some research, I came across Electrobun, a very new project that looks promising. I still haven't done much of the native stuff, but I don't think I'll need to, given that this app is basically a wrapper for a Node server.

Other than these two, I don't think there is anything else too different from the apps I've built before. I'm back to using Vue for the frontend, I'm using Aerogel and Soukai for most of the Solid stuff, and I'm playing with all of Vite's new tooling (oxlint, oxfmt, etc.). Maybe something worth mentioning is Bun and ElysiaJS. I hadn't used Bun much in the past, and I have to say it's pretty nice. It even helped me pinpoint a bunch of bugs in Inrupt's auth library. I was very excited about ElysiaJS at the beginning, but now that I've used it for a while I'm more lukewarm: the type safety is very nice, but it's a bit quirky to work with. In any case, I'm only using it as a transport layer between the frontend and the backend; most of the communication is pure TypeScript (that's how I managed to deploy it as a plain SPA).
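That "pure TypeScript with a swappable transport" idea can be sketched roughly like this. To be clear, this is my own minimal reconstruction of the pattern, not Ànima's actual code: the backend logic lives behind a plain TypeScript interface, with one implementation that calls it directly in-process (SPA mode) and another that forwards each call over HTTP (server mode, e.g. through ElysiaJS routes).

```typescript
// Sketch of a transport-agnostic backend interface. All names here
// are illustrative assumptions, not Ànima's real API.

interface ChatBackend {
    listConversations(): Promise<string[]>;
    sendMessage(conversation: string, text: string): Promise<string>;
}

// Direct implementation: everything runs in the same process,
// which is what makes the plain-SPA deployment possible.
class LocalBackend implements ChatBackend {
    private conversations = new Map<string, string[]>();

    async listConversations(): Promise<string[]> {
        return [...this.conversations.keys()];
    }

    async sendMessage(conversation: string, text: string): Promise<string> {
        const messages = this.conversations.get(conversation) ?? [];
        messages.push(text);
        this.conversations.set(conversation, messages);

        return `echo: ${text}`; // stand-in for a real model response
    }
}

// HTTP implementation: same interface, but each call becomes a request
// to a server (for example, routes defined with ElysiaJS). The endpoint
// paths are made up for illustration.
class RemoteBackend implements ChatBackend {
    constructor(private baseUrl: string) {}

    async listConversations(): Promise<string[]> {
        const response = await fetch(`${this.baseUrl}/conversations`);

        return response.json();
    }

    async sendMessage(conversation: string, text: string): Promise<string> {
        const response = await fetch(`${this.baseUrl}/conversations/${conversation}`, {
            method: 'POST',
            body: JSON.stringify({ text }),
        });

        return response.text();
    }
}
```

Because the rest of the app only ever sees `ChatBackend`, moving features out of the browser later just means swapping which implementation gets injected.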

So yeah, that's it, I'm building an Agent! If you have some thoughts about this, definitely let me know.

PS: In case you're wondering, yes, I do listen to Amp's Raising an Agent podcast. That's where I got the idea for naming this task :).

For the last couple of weeks, I've taken a detour from the main focus of this task (working on Ànima). But I thought I'd share an update anyway, because it's still related to my end goal (making my own AI toolkit). TL;DR: today's update is all about how I'm using AI for coding. Feel free to skip if you just care about Ànima updates :).

So, last month I watched Taylor's Laracon EU presentation. He talked through Laravel's AI philosophy, the tooling they've been working on, and the upcoming avalanche of non-developers into the Laravel ecosystem. As always, the presentation was awesome and it inspired me a lot. In particular, he did a demo at the end that orchestrated all the pieces in the Laravel ecosystem (Cloud, Nightwatch, Boost, etc.) to detect and fix a bug without human interaction (besides answering a phone call from his Openclaw :O). Now, how much of that demo was staged or over-hyped, I don't know. But the fact remains that I liked it, and it is a goal worth shooting for in my own work.

In order to learn more about the reality of AI in Laravel, I decided to start working on a new project. Initially, it was going to be a throw-away weekend idea. But it's going so well that I'm probably going to release it as a real app! However, it has nothing to do with Solid, and it certainly doesn't live up to my software ideals. But that doesn't mean it's not useful :). In short, it's a "podcasts enhancer" (name TBD). I listen to a lot of podcasts, but some of them ramble too much, like the proverbial book that should have been a blog post. So this application lets you augment RSS feeds: you give it a feed, and it sends it back with transcriptions, summaries, and chapter timestamps. I have already been using it myself in production, and it works! If this sounds like something you would like to use, let me know. The live version is going to be invite-only for now, given that it requires a real server side and cannot be hosted as a simple PWA.
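The feed-augmenting idea boils down to a transform over the original RSS XML. Here's a deliberately simplified sketch (in TypeScript rather than the app's actual Laravel stack, and with made-up tag names): it takes a feed and a list of pre-computed enhancements, and injects a summary and chapter list into each item. A real implementation would use a proper XML parser, match items by their `<guid>`, and generate the enhancements with transcription and LLM passes.

```typescript
// Hypothetical sketch of augmenting a podcast RSS feed with
// pre-computed summaries and chapters. Tag names are illustrative.

interface Enhancement {
    summary: string;
    chapters: { start: string; title: string }[];
}

function escapeXml(text: string): string {
    return text
        .replace(/&/g, '&amp;')
        .replace(/</g, '&lt;')
        .replace(/>/g, '&gt;');
}

// Inject each enhancement right before its item's closing tag.
// Enhancements are matched to items by position, for simplicity.
function augmentFeed(feedXml: string, enhancements: Enhancement[]): string {
    let index = 0;

    return feedXml.replace(/<\/item>/g, () => {
        const enhancement = enhancements[index++];

        if (!enhancement) return '</item>';

        const chapters = enhancement.chapters
            .map((chapter) => `<chapter start="${chapter.start}">${escapeXml(chapter.title)}</chapter>`)
            .join('');

        return `<summary>${escapeXml(enhancement.summary)}</summary>` +
            `<chapters>${chapters}</chapters></item>`;
    });
}
```

The nice property of this shape is that the output is still a valid feed: podcast players that don't understand the extra tags simply ignore them.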

Anyhow, that experience taught me a lot. Not only is Laravel delightful to use as a developer, it also makes AI a lot better. This reinforces my opinion that, if I'm going to succeed as a developer and in creating my own AI toolkit, the guardrails and the environment are probably the most important pieces (which is great news for all the other work I've been doing!). The frontend part isn't as nice, though. But for this particular app, that's the least important part and it's good enough.

Reflecting on how I've been working these days, I came up with four levels of software development:

  1. Caveman Programming: This is what I've been doing most of my life, which is coding without AI at all. But it's sadly going to disappear, and nowadays I only do this when I'm trying to learn something (such as my recent experiments learning React). I do miss the craftsmanship and flow that went into it, but at this point I can't justify doing this anymore. And the next level is often as satisfying anyways.

  2. AI-assisted Programming: This is what I'm spending most of my time doing. Basically, I use a code editor with inline suggestions (Cursor) and the built-in chat for quick tasks. I still "write" most of the code. Some people would also call this Caveman Programming, but personally I don't think we're at a point where we can stop writing code yet. Or at least, not in the projects I'm working on (more on that later). To be honest, I'm grateful that's the case, because I really like writing code. And the day I stop doing it will be a sad day. But it is definitely coming.

  3. Vibe-engineering: This is how I've been building the podcasts enhancer (and some other apps, with less success). I am aware of the codebase and architecture, and I do review every single line of code. But I don't write any of it. The reason I'm not doing this in all my projects is that I'm not happy with the code produced by LLMs. I know a lot of people will disagree with that, but from my own experience I have seen that AI produces wildly different results depending on the environment. With a fresh new Laravel project, and Laravel's opinionated patterns, they are pretty good. But in older projects, or in codebases that aren't well architected, AI still sucks. And I'm not saying this for lack of trying: I have tested this approach countless times over the last few months, but most of the time I still think I'm faster writing it myself. In some projects, though, I am slowly transitioning into this (such as the Soukai rewrite). I am convinced that with the right guardrails, skills, and environment, this will be the way to go. So I'm probably going to be doing a lot more of this in the future.

  4. Vibe-coding: And finally, what everybody is talking about, vibe coding. At this level, I don't even look at the code. But I haven't been as successful here. I built a couple of apps that were decent enough to be useful, like an IndexedDB inspector or a Manga Reader. But we're still very far from making this a viable option to build real software. Still, it is the goal I'll aim for if I ever give Ànima this capability.

There are also other considerations to keep in mind, such as which models to use. Though to be honest, I don't see a big difference between all the frontier models. Sadly, local models aren't there yet, but I'm sure some day they will be. Interestingly, what seems to have the greatest impact on the outcome is the harness. For example, I pay for a Google AI Pro subscription, and I've been testing Gemini 3.1 in different environments. Surprisingly, the worst one seems to be their official Gemini CLI đŸ˜…. It often takes ages to complete tasks, whereas with OpenCode results come much faster. Ă€nima is, at its core, an AI harness. So it'll be interesting to see how far I can take it.

Finally, I have experimented with some techniques such as spec-driven development, ralph loops, or cloud agents. But most of those didn't turn out great, or weren't significantly better than just telling OpenCode to build something small but significant. Although I do use plan mode sometimes.

And that's about it! I may be forgetting some things, but in general that's a pretty good summary of where I currently stand with AI. Is it going to change drastically in the next few months? Honestly, I don't think so. But it will definitely change in the upcoming years. Hopefully, by the time I'm done with this task, I'll be able to do my best work using my favourite stack, my own tooling, and even local models.