Forum OpenACS Q&A: Re: New Package: News Aggregator

Posted by Simon Carstensen on
This bug has been giving me a headache. Perhaps someone can help me out?

The na_check_link proc checks whether a link is internal (i.e. a permalink) or external (i.e. points to a source). Take for example Evhead's feed (http://www.evhead.com/rss.xml): its link nodes point to external URLs, because the source he's commenting on is put in the link node. That's why I came up with the na_check_link proc. Later I noticed some duplicates coming in from Doc Searls's and David Winer's blogs. It turns out that all of Winer's link nodes point to URLs like http://scriptingnews.userland.com/backissues/2003/02/04#When:2:49:44PM. Hence they are detected as external URLs, since their domain name differs from that of the feed_url, which is http://scripting.com/rss.xml. Doc Searls's link nodes point to http://doc.weblogs.com, which is the correct URL, but the feed itself is located at http://partners.userland.com/people/docSearls.xml. So his link nodes are detected as external links as well.
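To make the failure concrete, here is a minimal Python sketch of the kind of comparison described above. It is not the actual na_check_link proc (which is Tcl); the function name is_internal_link and the exact host comparison are assumptions made for illustration only.

```python
from urllib.parse import urlparse

def is_internal_link(link_url, base_url):
    """Hypothetical stand-in for na_check_link.

    Assumption: a link counts as internal when its host name
    is identical to the host name of the URL it is compared with.
    """
    return urlparse(link_url).hostname == urlparse(base_url).hostname

# Winer: the feed lives on scripting.com, the permalinks on userland.com.
winer_feed = "http://scripting.com/rss.xml"
winer_link = "http://scriptingnews.userland.com/backissues/2003/02/04#When:2:49:44PM"
print(is_internal_link(winer_link, winer_feed))  # False: a permalink is flagged as external

# Doc Searls: the permalinks are on his own site, but the feed is hosted elsewhere.
searls_feed = "http://partners.userland.com/people/docSearls.xml"
searls_link = "http://doc.weblogs.com"
print(is_internal_link(searls_link, searls_feed))  # False: again flagged as external

# Comparing against the website URL instead of the feed URL gives a match.
print(is_internal_link(searls_link, "http://doc.weblogs.com"))  # True
```

In both failing cases the link really is a permalink, so a plain host comparison against the feed URL is too strict.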

I'm not sure how to solve this problem. For Doc Searls, I worked around it by comparing against the URL of the website (http://doc.weblogs.com) instead of the feed URL.

I have to make sure the link node doesn't point to an external source, since it's potentially used to check whether the item has already been added. For example, if Evan writes about a piece by Joi Ito and links to it, and I've also subscribed to Joi Ito's feed, there's going to be a duplicate.
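A rough sketch of why an external link makes a poor key for that check, again in Python and under assumptions: the add_item helper, the set of seen links, and the example.org URL are all hypothetical and are not taken from the package.

```python
# Hypothetical URL, used only for illustration.
JOI_POST = "http://joi.example.org/2003/02/some-piece.html"

def add_item(seen_links, link):
    """Store an item unless its link is already known.

    Returns True if the item was stored, False if the link
    was seen before and the item is treated as already added.
    """
    if link in seen_links:
        return False
    seen_links.add(link)
    return True

seen = set()
stored_joi = add_item(seen, JOI_POST)   # item from Joi Ito's own feed
stored_evan = add_item(seen, JOI_POST)  # Evan's item, whose link node points at the same piece
print(stored_joi, stored_evan)  # True False
```

Two different items end up sharing one key, so the link can only serve as an identifier when it is known to be the item's own permalink.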

Any suggestions?

BTW, I'm not sure I get your bugfix, Mark. What was wrong with my code?

/Simon