Thanks for the comments so far; we’ll take them on board for future releases of the script. Here’s our considered response to some of the questions and criticisms so far:
“Why not just make the text a different colour rather than highlighting it?”
That’s up to you and your style sheets. The keyword highlighter may be better described as keyword tagger; it surrounds the words the visitor searched for with a span tag that has a special class. You can use this class to highlight the words anyway you want with CSS (without needing to touch the PHP code at all). Whether that is by changing the colour of the text or adding verbal stress to the word for screen readers is up to you. You may even want to change the use of the span element to that of strong or em for a more semantic value; it is your decision.
“Your regular expressions for parsing the HTML are naive”
Implementing a full SGML/XML parser was well beyond the scope of our project, although it may be a good idea for a future version. The main focus of the project was on usability and allowing users to find the information they want faster.
“One additional problem is a search utilising inclusive or exclusive syntax (’+’ or ‘-’)”
We hadn’t thought about this, so thanks for pointing it out. If a user searches for “food -dog”, the highlighting function would try to highlight the words “food” and “-dog”, rather than “dog”. We’ll put in the to-do list and fix it as soon as we can.
“Searching for <! breaks the highlighting function”
Good catch! We missed that one, but it’s been fixed in version 1.8.1 (available from http://suda.co.uk/projects/SEHL/). The problem was that while the special HTML characters were being properly escaped when being searched for, they weren’t when being displayed in the little advisory at the top of the page. So the highlighter was never broken, but the advisory note at the top was.
“Why not use Firefox’s find-in-page feature instead?”
You can, but you can only search for one word or phrase at a time. The highlighting function is available to any visitor without extra effort on their part, and it shows any number of distinct phrases on the page simultaneously.
“Wouldn’t doing it server side have odd results if a browser is using a proxy?”
If we understand you correctly the implication here is that the referring URL would be removed by the proxy; thus nothing will be highlighted, the user is none the wiser, and the system degrades gracefully. Can anyone think of any other problems with a client behind a proxy?
On a final note, we hope you see our code as a seed for future ideas, to allow you to think about how best to provide information to web users, to do more than just provide static information. Remember, our code is released under the GPL, so please feel free to take it, fork it, and make it better! If you have any more comments please let us know. Cheers!