[TriLUG] Search Engine question
Aaron S. Joyner
aaron at joyner.ws
Sun Oct 15 13:30:17 EDT 2006
Joseph Mack NA3T wrote:
> On Sun, 15 Oct 2006, WA Brown wrote:
>
>> I have no experience with search engines. I was wondering if there
>> was search engines that I could install on my linux server(Apache
>> 2.0) and others could use it?
>
>
> what are you searching? Local pages (htdig, doesn't keep records)? The
> internet?
>
> Joe
>
One sticky wicket to point out here: even though htdig isn't explicitly
keeping logs, be careful about what your httpd access logs are
recording. In modern versions of Apache, on most distributions, the
arguments to a CGI query are not kept, but you can turn that on if you
want to, you may have turned it on in the past for debugging and
forgotten about it, or your distribution may have done it for you. It's
just one more data point to be aware of if privacy is a concern.

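
If you want to check what your own access log is actually recording,
something along these lines may help. It is only a rough sketch: it
assumes the log uses the Common or Combined Log Format (request line in
double quotes), and the sample lines, the /cgi-bin/htsearch path and the
function name logged_queries are made up for illustration.

```python
import re

# Matches the quoted request line of a Common/Combined Log Format entry,
# e.g. "GET /cgi-bin/htsearch?words=foo HTTP/1.1" (assumed format).
REQUEST_RE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<target>\S+) (?P<proto>HTTP/[\d.]+)"'
)

def logged_queries(lines):
    """Yield (path, query) for each log line whose request has a query string."""
    for line in lines:
        m = REQUEST_RE.search(line)
        if not m:
            continue  # not a line in the assumed format
        path, sep, query = m.group("target").partition("?")
        if sep and query:
            yield path, query

# Hypothetical sample lines; replace with open("/path/to/access_log").
sample = [
    '127.0.0.1 - - [15/Oct/2006:13:30:17 -0400] '
    '"GET /cgi-bin/htsearch?words=foo HTTP/1.1" 200 512',
    '127.0.0.1 - - [15/Oct/2006:13:31:02 -0400] '
    '"GET /index.html HTTP/1.1" 200 1024',
]

found = list(logged_queries(sample))
for path, query in found:
    print(path, query)
```

If this prints anything for your real log, the search terms people type
are ending up on disk, whatever the search engine itself does.
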
Also, as a note, most locally installed search engines can't do a very
good job of "ranking" the pages you have accessible. They're generally
doing simple keyword matching, which is to say that if you enter the
word "foo", the resulting pages will be ranked by the number of times
"foo" appears on each page. This is far less accurate and sophisticated
than the matching a modern web search engine does (Google is just one
example). It may be quite sufficient for your purposes, but it's good
to be aware of the difference so you can evaluate the quality of the
search results for yourself.
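
To make the "count the keyword" idea concrete, here is a minimal sketch
of that kind of ranking. It is not how htdig or any particular engine is
implemented; the function name rank_by_term_count and the sample pages
are invented for the example.

```python
import re
from collections import Counter

def rank_by_term_count(pages, term):
    """Rank pages by how many times `term` occurs in their text.

    `pages` maps a page name to its plain text. Pages that never
    mention the term are left out of the results.
    """
    term = term.lower()
    scores = {}
    for name, text in pages.items():
        # Crude tokenizer: lowercase runs of letters and digits.
        words = re.findall(r"[a-z0-9]+", text.lower())
        count = Counter(words)[term]
        if count:
            scores[name] = count
    # Highest count first; ties broken by page name for stable output.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))

# Hypothetical pages, just to show the ordering.
pages = {
    "a.html": "foo bar foo baz foo",
    "b.html": "a page that mentions foo once",
    "c.html": "nothing relevant here",
}

ranking = rank_by_term_count(pages, "foo")
print(ranking)
```

Note that a page which just repeats "foo" many times wins, regardless of
whether it is actually the most useful page about foo, which is the
weakness described above.
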
Aaron S. Joyner