> Logically then, the time is ripe for a new search engine to disrupt but the problem - that of determining intrinsic worth - is really hard to solve, so as yet nobody has
I'm sort-of trying to solve this with a recommender system approach.[1] Search isn't my primary objective; I'm mainly focusing on the case where you want to be made aware of relevant information on an ongoing basis but you aren't necessarily looking for something specific.
But that involves building up a ratings database of lots of different URLs. If it becomes really popular, I'd like to experiment with search. It could start with a "normal" search algorithm but then adjust the results based on rating data. E.g. "the top result for this query is X, but people with similar rating data to you always down vote that link, so we'll move that result down".
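A rough sketch of what that adjustment could look like, assuming votes are stored as +1/-1 per URL and "similar" just means agreeing on URLs both users have rated. The names `similarity`, `rerank`, and `strength` are made up for illustration and are not how findka works:

```python
def similarity(a, b):
    """Fraction of co-rated URLs on which two users agree (0.0 if none shared)."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    return sum(a[url] == b[url] for url in shared) / len(shared)


def rerank(results, my_ratings, other_users, strength=1.0):
    """Re-order (url, base_score) pairs using votes from similar users.

    my_ratings and each entry of other_users map url -> +1 or -1.
    strength is a hypothetical knob for how much rating data may
    override the "normal" search score.
    """
    adjustment = {}
    for ratings in other_users:
        sim = similarity(my_ratings, ratings)
        for url, vote in ratings.items():
            # Votes from people who rate like me count more.
            adjustment[url] = adjustment.get(url, 0.0) + sim * vote
    return sorted(
        results,
        key=lambda r: r[1] + strength * adjustment.get(r[0], 0.0),
        reverse=True,
    )
```

With `strength=0` this falls back to the plain search ordering, so the personalization could be dialed in gradually.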
I've heard people complain about how Google gives you personalized search results sometimes, but I'm not sure if the approach is actually bad. Perhaps it could work if it was focused on from the start instead of bolted on.
Interesting. I'd thought about some sort of meta-search service running over Google, Bing, etc., and aggregating results, that was tied in with some way of rating and recommending URLs.
I've never got around to doing it and one of the reasons (excuses) is that I haven't yet figured out a way to stop bad actors exploiting it by, e.g., mass downvoting URLs of competitors. I did think about some sort of moving average, or bias toward more recent votes, which would at least somewhat mitigate transient actions of this kind, but it's harder to deal with a sustained attack.
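For the "bias toward more recent votes" part, one simple option is exponential decay. This is only a sketch: the `decayed_score` name, the +1/-1 vote encoding, and the 7-day half-life are all assumptions picked for illustration.

```python
def decayed_score(votes, now, half_life=7.0):
    """Sum of +1/-1 votes, each weighted down by its age.

    votes is an iterable of (timestamp_in_days, vote) pairs.
    A vote loses half its weight every half_life days.
    """
    total = 0.0
    for timestamp, vote in votes:
        age = now - timestamp
        total += vote * 0.5 ** (age / half_life)
    return total
```

A one-off burst of downvotes fades out after a few half-lives, but an attacker who keeps voting keeps the score down, which is the sustained case I don't have an answer for.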
I've thought about that a fair amount also. I think the key is personalization. If bad actors downvoted a bunch of items, it would only affect people who already shared a lot of similar votes with those actors.
That is still exploitable: if I know you like A, B, and C (perhaps those are popular items), I could also upvote those items and then downvote another item D. I have an idea for countering this: keep track of history. So instead of just measuring ratings overlap ("of items you've both rated, 75% of items were rated the same"), you keep track of the performance of recommendations made from that user ("of Alice's items that we recommended to Bob, Bob upvoted them 25% of the time").
I haven't thought about this super deeply yet, but I think to beat that, you'd have to actually provide valuable data to the system in the first place.
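To make the difference between the two measures concrete, here is a small sketch. `overlap` and `RecommendationHistory` are hypothetical names, and votes are assumed to be +1/-1; nothing here is an existing implementation.

```python
def overlap(a, b):
    """Share of co-rated items rated the same way, or None if none shared."""
    shared = set(a) & set(b)
    if not shared:
        return None
    return sum(a[item] == b[item] for item in shared) / len(shared)


class RecommendationHistory:
    """Tracks how recommendations sourced from one user fared with another."""

    def __init__(self):
        # (source, target) -> [upvotes, total recommendations shown]
        self._stats = {}

    def record(self, source, target, upvoted):
        stats = self._stats.setdefault((source, target), [0, 0])
        stats[0] += 1 if upvoted else 0
        stats[1] += 1

    def hit_rate(self, source, target):
        """Fraction of source's items that target upvoted, or None if untested."""
        upvotes, total = self._stats.get((source, target), (0, 0))
        return upvotes / total if total else None
```

An attacker can copy your votes on A, B, and C and get a perfect `overlap`, but their `hit_rate` only goes up if the items they push actually get upvoted by you.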
To generalize that a bit, I guess I'm talking about a reputation network. You could visualize it with a directed graph that has edges weighted from -1 to 1 (0 default). Recommendation + rating history is used in some way to assign those weights, and then future recommendations are taken primarily from other users who have high weights from your perspective.
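Something like the following is what I have in mind, very roughly. The update rule (nudging the weight toward +1 or -1 after each recommendation outcome) is just one guess at "used in some way", and `ReputationGraph`, `rate`, and `threshold` are placeholder names.

```python
class ReputationGraph:
    """Directed graph of trust weights in [-1, 1]; missing edges count as 0."""

    def __init__(self):
        self._weights = {}  # (viewer, source) -> weight

    def weight(self, viewer, source):
        return self._weights.get((viewer, source), 0.0)

    def update(self, viewer, source, upvoted, rate=0.2):
        """Move the viewer -> source weight toward +1 (upvote) or -1 (downvote).

        This is an exponential moving average, so the weight stays
        within [-1, 1] as long as 0 <= rate <= 1.
        """
        target = 1.0 if upvoted else -1.0
        current = self.weight(viewer, source)
        self._weights[(viewer, source)] = current + rate * (target - current)

    def trusted_sources(self, viewer, threshold=0.3):
        """Users whose weight from this viewer's perspective is high enough."""
        return sorted(
            source
            for (v, source), w in self._weights.items()
            if v == viewer and w >= threshold
        )
```

The weights are per viewer, so Bob trusting Alice says nothing about whether Alice trusts Bob, and a new account starts at 0 for everyone.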
Quick question about findka: is the idea to only upvote/downvote things you have actually experienced and liked or disliked, or do you also upvote/downvote things you don't know but like or dislike?
I would think the first case would result in way better data (which, I think, is why last.fm's recommendation system is so good), but the latter seems to be the natural thing to do; humans love to judge shit without actually engaging with it.
Mainly the former. Ideally you've experienced the things you've rated, but in many cases you can probably judge fairly accurately if you're interested in something based on a short description.
That brings up a key point which I'm not sure is clear to users or not: voting on items specifically means "this is/is not a good recommendation for me". It's not a judgment on the goodness of the item in general. So if findka showed me e.g. an article on pottery that looked high quality, I'd probably still downvote it since I'm not into pottery.
I think that is not really communicated in the current design. It reminds me of the ratings on Netflix, which are intended as "our estimate of this content's rating for you based on your viewing history", while most people I talked to about it just thought it was the rating of the thing itself.
Good to know, any suggestions for making it clearer? Maybe thumbs up/down is the wrong labeling; perhaps something like "show me more like this" and "not interested"... though I have no idea what icons I would use for that.
[1] https://findka.com