[TriLUG] Search Engine question

Aaron S. Joyner aaron at joyner.ws
Sun Oct 15 13:30:17 EDT 2006


Joseph Mack NA3T wrote:

> On Sun, 15 Oct 2006, WA Brown wrote:
>
>> I have no experience with search engines. I was wondering if there 
>> was search engines that I could install on my linux server(Apache 
>> 2.0) and others could use it?
>
>
> what are you searching? Local pages (htdig, doesn't keep records)? The 
> internet?
>
> Joe
>
Just to point out a sticky wicket here: even though htdig isn't
explicitly keeping logs, be careful what your httpd access logs are
grabbing. Apache's stock "common" and "combined" log formats record
the full request line, so the args to a cgi query sent via GET end up
in the access log unless you or your distribution changed the format.
POST bodies are not logged by default, but you can turn that on if you
want to, or you may have in the past for debugging and forgotten about
it. Just one more data point to be aware of where privacy is a concern.
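
One rough way to check what your own access log is capturing is to scan
it for request lines that carry a query string. The sketch below assumes
the log uses Apache's common or combined format. The sample line, the
htsearch path, and the helper name logged_query are illustrative only,
not taken from any real server.

```python
import re

# Matches the quoted request line of a common/combined-format log entry,
# e.g. "GET /cgi-bin/htsearch?words=foo HTTP/1.1"
REQUEST_RE = re.compile(r'"(?P<method>[A-Z]+) (?P<target>\S+)(?: HTTP/[\d.]+)?"')


def logged_query(line):
    """Return the query string recorded in one access log line, or None."""
    match = REQUEST_RE.search(line)
    if match is None:
        return None
    _path, sep, query = match.group("target").partition("?")
    return query if sep else None


# Made-up sample entry (hypothetical host, timestamp, and path).
sample = ('192.0.2.1 - - [15/Oct/2006:13:30:17 -0400] '
          '"GET /cgi-bin/htsearch?words=foo HTTP/1.1" 200 1234')
print(logged_query(sample))
```

If this prints search terms for lines from your real log, those terms
are being kept on disk regardless of what the search engine itself does.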

Also, as a note, most locally installed search engines can't do a very
good job of "ranking" the pages you have accessible. They're generally
doing simple keyword matching, which is to say if you enter the word
"foo", the resulting pages will be ranked by the number of times "foo"
appears on each page. This is far less accurate and sophisticated than
the matching a modern web search engine does (Google is just one
example). It may be quite sufficient for your purposes, but it's good
to be aware of the differences so you can evaluate the quality of the
results for yourself.
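
To make the keyword-count idea concrete, here is a minimal sketch of
that kind of ranking. It is not htdig's actual algorithm, just the naive
scheme described above. The function name rank_by_term_count and the
sample pages are made up for illustration.

```python
import re
from collections import Counter


def rank_by_term_count(pages, term):
    """Rank pages by how many times `term` appears, most occurrences first."""
    term = term.lower()
    scores = {}
    for name, text in pages.items():
        # Split the page text into lowercase words and count matches.
        words = re.findall(r"\w+", text.lower())
        count = Counter(words)[term]
        if count:
            scores[name] = count
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical page contents keyed by file name.
pages = {
    "a.html": "foo bar foo baz foo",
    "b.html": "nothing relevant here",
    "c.html": "Foo appears once",
}
print(rank_by_term_count(pages, "foo"))
# [('a.html', 3), ('c.html', 1)]
```

Note that a page which merely repeats "foo" many times outranks a page
that is actually about foo, which is the weakness being pointed out.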

Aaron S. Joyner


