Middleware for Kagi’s Small Web RSS Feed
Published 2024-08-16

Update: in the most meta of meta circumstances, this post itself is on Kagi’s feed: https://reader.miniflux.app/share/2653120942386c74541e48cb788c632eaf0d9372
tl;dr I made a fragile proxy of Kagi’s small web RSS feed because by default it’s too big for my RSS reader.
Here’s the background on the why for all this: I wanted to subscribe to the RSS feed for Kagi’s Small Web, but by default it’s way too big for my RSS reader (Miniflux) to handle. It clocks in at about 25 MiB.
As a quick aside: if you’re unfamiliar with the “small web,” it typically refers to non-commercial websites created by individual humans who do it because they can and want to, not for any financial gain. It evokes nostalgia for the early Internet days, before advertising was widespread. Grassroots. Community. In particular, emphasizing the unity in community. Read more on Kagi or Ben Hoyt’s website.
Kagi lists this feed in their API docs, but unfortunately I couldn’t find an obvious way to request any sort of pagination. So, it becomes a quickie DIY project.
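If you want to see the heft for yourself, here’s a quick sketch that fetches the feed and reports its size (the endpoint is the one from Kagi’s docs; the byte math is approximate):

// Rough size check of the full Small Web feed; endpoint from Kagi's API docs.
const body = await (await fetch("https://kagi.com/api/v1/smallweb/feed/")).text();
// Encode to UTF-8 so we count bytes, not UTF-16 code units.
const bytes = new TextEncoder().encode(body).length;
console.log(`~${(bytes / 1024 / 1024).toFixed(1)} MiB`);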
Well, I’ve recently discovered Val Town so I thought, why not, let’s keep the party going. Here’s the val I created: https://www.val.town/v/pinjasaur/smallweb
And here’s the code in its entirety:
import { DOMParser, Element } from "jsr:@b-fuze/deno-dom";

const getText = async (url: string) => (await fetch(url)).text();

export default async function(req: Request): Promise<Response> {
  // Bail on self-requests (e.g. Val Town's own UI) so browsing the val
  // doesn't trigger a download of the feed.
  if (req.headers.get("referer")?.includes("val.")) {
    return Response.json({ error: "Cannot access from val.town" }, {
      status: 400,
    });
  }

  // Return a 404 for favicon requests so Miniflux doesn't show an empty icon.
  if (new URL(req.url).pathname === "/favicon.ico") {
    return new Response(null, { status: 404 });
  }

  const params = new URL(req.url).searchParams;

  try {
    const feed = await getText("https://kagi.com/api/v1/smallweb/feed/");
    // Parse the Atom feed as HTML; deno-dom is forgiving enough for this.
    const parser = new DOMParser();
    const atom = parser.parseFromString(feed, "text/html");
    const $feed = atom.querySelector("feed")!;
    const entries = $feed.querySelectorAll("entry");
    // Strip every <entry>, then re-append only the first `limit` of them.
    Array.from(entries).forEach($entry => {
      $feed.removeChild($entry);
    });
    Array.from(entries).slice(0, parseInt(params.get("limit") ?? "", 10) || 25).forEach(($entry: Element) => {
      $feed.appendChild($entry);
    });
    return new Response(
      // Keep the feed's original first line (the XML declaration), then the
      // trimmed feed, massaged back toward valid XML.
      feed.split("\n")[0]
        + $feed.outerHTML
          .replace(/\s{2,}/g, "\n")
          // Hacky regex: re-close HTML-serialized <link href="..."> tags as
          // XML-compliant <link href="..."/>.
          .replace(/<link href="(.*)" ?>/g, "<link href=\"$1\"/>")
          // The first argument is, as best I can tell, a non-breaking space
          // (U+00A0), which renders like a plain space; normalize it to one.
          .replaceAll("\u00A0", " "),
      {
        headers: {
          // ?raw=1 serves text/xml so browsers pretty-print it.
          "content-type": params.get("raw") ? "text/xml" : "application/atom+xml",
        },
      },
    );
  } catch (err) {
    return Response.json({ error: err.message }, {
      status: 400,
    });
  }
}
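If you want to poke at it, hitting the deployed val looks something like this sketch (the hostname below is an assumption based on Val Town’s usual <user>-<val>.web.val.run naming, not necessarily the real URL):

// Assumed host, per Val Town's <user>-<val>.web.val.run convention.
const res = await fetch("https://pinjasaur-smallweb.web.val.run/?limit=10");
console.log(res.headers.get("content-type")); // application/atom+xml
console.log((await res.text()).slice(0, 120)); // first bit of the trimmed feed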
It’s incredibly fragile, but it technically works.
Noteworthy details:
- It bails on self-requests, e.g. when browsing Val Town, to prevent it from automatically downloading the RSS feed in the browser.
- I ended up checking for a request to `/favicon.ico` and returning a 404 to prevent an empty icon in the Miniflux UI.
- I spent way too long passing the feed through RSS feed validators to get it to a “good enough” state.
- Using the `DOMParser` API to do some “DOM” manipulation. Also, some incredibly hacky regex to make a `<link>` into an XML-compliant `<link />` (there’s a tiny demo of this after the list).
- And finally, the peace-da-resistance as they say, I added a `?limit=n` query parameter to optionally return only the `n` most recent entries in the feed. I defaulted this to 25, pretty much exclusively because that’s the page size Lobsters uses. Also, I added an optional `?raw=1` so it renders prettier in the browser.
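Here’s that tiny standalone demo of the `<link>` fix-up, using the same regex as the val (the href is a made-up example):

// Same regex as the val; the href is a made-up example.
const serialized = `<link href="https://example.com/feed">`;
console.log(serialized.replace(/<link href="(.*)" ?>/g, `<link href="$1"/>`));
// => <link href="https://example.com/feed"/>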
Again, fragile, but it works. I’ve been dogfooding it myself for the past couple days, so it really can’t be that bad, right?
I opted to create an issue on the GitHub repository. We’ll see what the verdict is—I’d personally love to take a stab at implementing it myself. I’d much rather have a feature like this in the upstream API versus some hacky middleware that I scraped together.
Go forth and prosper by supporting the small web & their collective RSS feeds.
🖖
I love hearing from readers so please feel free to reach out.
#js #programming #syndication #web #hack