Rachel Kroll

Reader feedback: feed reader scores and "like" buttons

Well, it's been an interesting couple of days. I got all kinds of feedback from people who wanted to participate in the feed reader score service project. I spent quite a while answering every one of those and issuing keys to anyone who asked.

The whole time, the data has been coming in, and we've been able to start seeing some interesting patterns in the noise. There are a fair number of people who have absolutely perfect feed readers. They show up at reasonable intervals, send the right header(s), and just work with no fuss. It's fantastic.

For those not currently participating, here's how it works. I send you a link to a web page with the instructions and a unique key that's just a bunch of random hex digits. The web page tells you how to turn that into a unique URL that can be handed to a feed reader.
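
For the curious, the key-to-URL step is nothing fancy. Here's a rough Python sketch of the general idea; the host and path layout below are placeholders, not the real scheme the service uses.

    import secrets

    # Generate a key that's just a bunch of random hex digits, then glue it
    # into a per-subscriber feed URL.  The "/feed/<key>/atom.xml" shape is
    # purely illustrative; the actual layout is different.
    key = secrets.token_hex(16)        # 32 random hex digits
    feed_url = f"https://blog.example.org/feed/{key}/atom.xml"
    print(feed_url)

Point a feed reader at a URL like that (the real one, of course) and every fetch of it lands in the stats for that key.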

Then, there's another key-based URL which shows the report. I provided a text dump of what it looked like in Thursday's post, and that's all it was at the time. I've since put in the effort to make it actually generate a Real Web Page, complete with actual HTML and all of this stuff (imagine that). Now it looks more like this:

"HTML tables and lists of statistics"

I've removed the identifying marks (the key at the top, and the user-agent string at the bottom).

The program behind this particular report got a little frisky at startup and sent multiple identical unconditional requests for some reason. I don't know why, but it's not good behavior.
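
For contrast, here's roughly what a well-behaved fetch looks like: hang on to the validators from the last response and send them back next time, so the server can answer 304 and skip the body entirely. This is a generic sketch using Python's standard library, not any particular reader's code.

    import urllib.error
    import urllib.request

    # Conditional GET: send back the ETag and Last-Modified values from the
    # previous response.  A 304 reply means nothing changed, so there is no
    # feed body to download or re-parse.
    def fetch(url, etag=None, last_modified=None):
        req = urllib.request.Request(url)
        if etag:
            req.add_header("If-None-Match", etag)
        if last_modified:
            req.add_header("If-Modified-Since", last_modified)
        try:
            with urllib.request.urlopen(req) as resp:
                return (resp.status, resp.headers.get("ETag"),
                        resp.headers.get("Last-Modified"), resp.read())
        except urllib.error.HTTPError as err:
            if err.code == 304:
                return 304, etag, last_modified, None
            raise

Doing that once per polling interval is the "send the right header(s)" part mentioned above.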

The same program also made requests for other made-up URLs that weren't supplied by the user (who was talking to me at the time), and I don't mean favicon or those apple-touch-icon things. I mean it took the URL provided by the user and glued /other /stuff onto the end. That kind of thing doesn't show up in the report yet, but I could add it later.
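
Spotting that kind of thing in the logs wouldn't be hard, for what it's worth. Here's a rough sketch that assumes Apache/nginx combined-format access log lines and the made-up /feed/<key>/atom.xml shape from the earlier sketch; the real setup differs on both counts.

    import re
    import sys

    # Flag requests that take a subscriber's feed URL and glue extra path
    # segments onto the end of it.  The path pattern here matches the
    # made-up URL shape from the earlier sketch, not the real one.
    GLUED_ON = re.compile(r'"GET (/feed/[0-9a-f]{32}/atom\.xml)(/\S+) HTTP/')

    for line in sys.stdin:
        m = GLUED_ON.search(line)
        if m:
            print("extra junk after", m.group(1), ":", m.group(2))

Feed it an access log on stdin and it prints the offenders.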

It's strange. Bum-rushing a web server isn't a good first impression.

...

Separately, an anonymous reader asked if I would be willing to put up stats for the posts so they could see which ones were particularly popular. I actually don't really analyze that. I can tell on a basic level when things are popular because the "tail -f" of the logs really starts moving, and I can see an uptick on the bandwidth utilization.

But other than that, I have only my own random impressions of what's popular and what isn't based on remembering activity levels and the flavor of whatever feedback it might generate (or not). So, there's no data to provide, and I really don't want to build such a dataset anyway.

A post is a post. If it works out, that's great, but if not, eh. I have no particular reason to "tune" them. I don't have ads to serve, impressions to generate, or eyeballs to sell out to the highest bidder. It's just a whole mess of text.

Most of what I use my logging for is to handle abuse. Analytics? Feh.

The one exception is from 2020, when I noticed that a fair number of bug-tracking systems were leaking URLs (many internal) to specific posts about particular nerdly analyses of failure modes like "don't setenv in multi-threaded code" and "malloc(1213486160)" and all of this. The post I did about that links to the older posts, gives a little info on them, and shares a few scant details about the incoming links. Such referrer data is almost completely dead now: browsers default to sending just the origin for cross-site links, so I get bare hostnames without any paths. It's rare to see much more than that, so I don't expect a repeat of that post, well, ever.

They also asked for "like/dislike" and maybe more (FB style?) "reactions". I'm not likely to do that, either, for the same reason as why I don't have any kind of public comments: managing that kind of thing is serious work. If it's not managed and kept in the right groove, it will turn into a very bad place at the times of the week when people with lives are out living them, and THE ONE can run amok with nobody to tell them to shut the hell up. You know exactly which forums I'm talking about.

Also, I'm pretty sure that having me trying to operate the forum and yet also moderate it would quickly turn into yet another case of "don't do both" as I have few qualms about dropping the banhammer on bad behavior... but that tends to divide communities. (Been there, done that.)

Finally, I like the fact that all of this stuff is a forest of flat files. There are no connections to any sort of database in the serving path for the posts or the feed. That would be defeated if I added something to dip into a table and look to see how many "likes" it had before shoving the post out the door.

The entire /w/ path in the document root is about 151 MB at the moment. flicker (the current web server) is a monster that, besides being physically massive, also happens to have 128 GB of RAM in it. That means it can fit the entire thing into memory and basically keep it there, so there isn't even a concern about how fast the "disks" are, since it only touches that stuff once per item.

You might have noticed that things are usually pretty snappy. That's a big part of why: it's doing as little work as possible.

If there were dynamic stuff going on, that wouldn't work nearly as well. For me to cross that particular divide, it's going to have to be for a very good reason.