Tuesday, February 21, 2012

RSS Search

Hi! I'm looking for ideas on the best approach to designing a
search system for RSS feeds. I will have some 50 RSS feeds (all RSS
2.0 compliant) stored locally on the web server, and I'm wondering
what the best method would be to allow searching of these RSS files.
Since the search will cater to multiple users, the search system has to
be robust and efficient. Some ideas that I have for the RSS search
system are:

1. Store all RSS files locally on the web server file system and
perform file system queries. But I guess this might get slow when a
number of users try to search at once. Moreover, the queries may not
be extensible (for example, to allow boolean operations).

2. Move the RSS data to the database and then perform the search
using LIKE (or the more advanced indexing service features).

3. Use a third-party full-text search engine like Lucene.

4. Use something like XQuery or XPath to query the RSS files directly,
but this again *might* (not sure, since I haven't worked with either)
get slow when a number of users try to search at once.
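
To make the trade-off between options 1 and 3 a bit more concrete, here is a rough sketch of the idea behind a full-text index: parse each feed once, map every word to the items that contain it, and answer boolean AND/OR queries from that map instead of rescanning the files. This is not Lucene's API, just plain Python with the standard library. The sample feed and the names tokenize, build_index and search are made up for illustration.

```python
import re
import xml.etree.ElementTree as ET
from collections import defaultdict

# Made-up RSS 2.0 sample; real feeds would be read from the files on disk.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <link>http://example.com/</link>
    <description>Made-up sample data</description>
    <item>
      <title>Searching RSS feeds</title>
      <link>http://example.com/1</link>
      <description>Notes on full-text search.</description>
    </item>
    <item>
      <title>Database indexing</title>
      <link>http://example.com/2</link>
      <description>Why LIKE queries get slow on large tables.</description>
    </item>
  </channel>
</rss>"""


def tokenize(text):
    """Lowercase the text and return the set of alphanumeric words in it."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def build_index(feed_xml_strings):
    """Parse each feed and return (items, index).

    items is a list of dicts; index maps a word to the set of positions
    in items whose title or description contains that word.
    """
    items = []
    index = defaultdict(set)
    for xml_text in feed_xml_strings:
        root = ET.fromstring(xml_text)
        for item in root.iter("item"):
            title = item.findtext("title", default="")
            link = item.findtext("link", default="")
            description = item.findtext("description", default="")
            item_id = len(items)
            items.append({"title": title, "link": link})
            for word in tokenize(title + " " + description):
                index[word].add(item_id)
    return items, index


def search(items, index, query, mode="and"):
    """Return items matching all query words (mode="and") or any (mode="or")."""
    words = tokenize(query)
    if not words:
        return []
    postings = [index.get(word, set()) for word in words]
    if mode == "and":
        ids = set.intersection(*postings)
    else:
        ids = set.union(*postings)
    return [items[i] for i in sorted(ids)]
```

With an hourly update cycle, the index could simply be rebuilt whenever the feeds are refreshed. A real engine adds stemming, ranking and on-disk storage on top of this, which is the main argument for option 3 over rolling your own.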

Also, the RSS files I have on the web server will be updated every
hour or so.

So, I have the ideas but I'm not quite sure which one would be the most
suitable and efficient. If anyone has ideas on implementing such a
search system for RSS feeds then please share your insight. Thank you
guys!

I'm not surprised to see nobody has responded to your questions.
I'm working on the same type of issues, and all I have learned so
far is that SQL Server 2000 would require 'shredding' the data
and putting it into the database, where the server could be used
to return results; all the other current options are not
performance-friendly.
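
As a rough illustration of what 'shredding' could look like, the sketch below pulls each item out of an RSS 2.0 feed and stores it as a row that the database can search with LIKE. It uses Python's built-in sqlite3 module purely as a stand-in, not SQL Server 2000. The table name rss_items, the sample feed and the function names are all assumptions made up for the example.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Made-up RSS 2.0 sample; real feeds would be read from the files on disk.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <link>http://example.com/</link>
    <description>Made-up sample data</description>
    <item>
      <title>Searching RSS feeds</title>
      <link>http://example.com/1</link>
      <description>Notes on full-text search.</description>
    </item>
    <item>
      <title>Database indexing</title>
      <link>http://example.com/2</link>
      <description>Why LIKE queries get slow on large tables.</description>
    </item>
  </channel>
</rss>"""


def open_store():
    """Create an in-memory database with one row per RSS item (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE rss_items ("
        "id INTEGER PRIMARY KEY, feed TEXT, title TEXT, link TEXT, description TEXT)"
    )
    return conn


def shred_feed(conn, feed_name, xml_text):
    """Replace the stored rows for one feed with the items found in xml_text."""
    root = ET.fromstring(xml_text)
    rows = [
        (
            feed_name,
            item.findtext("title", default=""),
            item.findtext("link", default=""),
            item.findtext("description", default=""),
        )
        for item in root.iter("item")
    ]
    with conn:  # one transaction: drop the old rows, insert the fresh ones
        conn.execute("DELETE FROM rss_items WHERE feed = ?", (feed_name,))
        conn.executemany(
            "INSERT INTO rss_items (feed, title, link, description) VALUES (?, ?, ?, ?)",
            rows,
        )
    return len(rows)


def search_items(conn, term):
    """Return (title, link) pairs whose title or description contains term."""
    pattern = "%" + term + "%"
    cursor = conn.execute(
        "SELECT title, link FROM rss_items "
        "WHERE title LIKE ? OR description LIKE ? ORDER BY id",
        (pattern, pattern),
    )
    return cursor.fetchall()
```

Calling shred_feed again for the same feed name replaces its rows, which would fit the hourly refresh described in the question. A leading-wildcard LIKE still scans every row, so for anything beyond a small data set the database's full-text indexing would be the thing to look at.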

--
<%= Clinton Gallagher, "Twice the Results -- Half the Cost"
Architectural & e-Business Consulting -- Software Development
NET csgallagher@.REMOVETHISTEXTmetromilwaukee.com
URL http://www.metromilwaukee.com/clintongallagher/
