We were commissioned by the marketing department of a global education website to enhance the Search plugin of their Joomla website. In essence, they wanted search results to also include matches for the synonyms of the search word. For example, if someone searches for the word “house”, the results should also cover the words “home”, “residence”, etc…
At first glance, this might seem like an ultra complex project. However, as (ahem!) seasoned PHP developers, we knew that the solution was not that complex, since we had worked on a similar project before. In short, we made the whole thing possible the following way:
- We downloaded a thesaurus database. We downloaded the Princeton WordNet database in MySQL format. The beauty of the Princeton thesaurus is that it is free for commercial use, and the client doesn’t have to pay monthly/yearly royalties to use it.
- We imported the WordNet MySQL database into the client’s Joomla database. This step simply consisted of creating all the WordNet MySQL tables (we prefixed all the tables with abc_wordnet_, where abc_ is the database prefix of the Joomla website, which is defined in the configuration.php file as $dbprefix) and then importing the data into these tables.
- We modified the core Joomla search plugin to include synonyms as well. We modified the onContentSearch function in the content.php file, which is located under the plugins/search/content folder, to query the WordNet database for the synonyms of the search keyword and then include them as an OR condition in the search.
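The lookup-then-OR logic described in the steps above can be sketched roughly as follows. This is a minimal illustration in Python with an in-memory SQLite database, not the actual changes we made to the Joomla plugin. The table layout (abc_wordnet_words, abc_wordnet_senses, and a simplified abc_content table) and all function names are assumptions made for the example; the real WordNet MySQL schema and Joomla's content table have more columns.

```python
import sqlite3


def find_synonyms(conn, keyword):
    """Return lemmas that share at least one synset with `keyword`."""
    rows = conn.execute(
        """
        SELECT DISTINCT w2.lemma
        FROM abc_wordnet_words AS w1
        JOIN abc_wordnet_senses AS s1 ON s1.wordid = w1.wordid
        JOIN abc_wordnet_senses AS s2 ON s2.synsetid = s1.synsetid
        JOIN abc_wordnet_words AS w2 ON w2.wordid = s2.wordid
        WHERE w1.lemma = ? AND w2.lemma <> ?
        ORDER BY w2.lemma
        """,
        (keyword, keyword),
    ).fetchall()
    return [row[0] for row in rows]


def build_search_query(terms):
    """Build a parameterized query with one OR'ed LIKE condition per term."""
    conditions = " OR ".join(
        "(title LIKE ? OR introtext LIKE ?)" for _ in terms
    )
    sql = f"SELECT id, title FROM abc_content WHERE {conditions} ORDER BY id"
    params = []
    for term in terms:
        pattern = f"%{term}%"
        params.extend([pattern, pattern])
    return sql, params


def search_with_synonyms(conn, keyword):
    """Search for the keyword and all of its synonyms in one query."""
    terms = [keyword] + find_synonyms(conn, keyword)
    sql, params = build_search_query(terms)
    return conn.execute(sql, params).fetchall()


def make_demo_db():
    """Create a tiny in-memory database with hypothetical sample data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE abc_wordnet_words (wordid INTEGER PRIMARY KEY, lemma TEXT);
        CREATE TABLE abc_wordnet_senses (wordid INTEGER, synsetid INTEGER);
        CREATE TABLE abc_content (id INTEGER PRIMARY KEY, title TEXT, introtext TEXT);

        INSERT INTO abc_wordnet_words VALUES
            (1, 'house'), (2, 'home'), (3, 'residence'), (4, 'car');
        INSERT INTO abc_wordnet_senses VALUES
            (1, 100), (2, 100), (3, 100), (4, 200);
        INSERT INTO abc_content VALUES
            (1, 'Buying a house', ''),
            (2, 'Home schooling tips', ''),
            (3, 'Car maintenance', ''),
            (4, 'Halls of residence', '');
        """
    )
    return conn


if __name__ == "__main__":
    demo = make_demo_db()
    for row in search_with_synonyms(demo, "house"):
        print(row)
```

Every extra synonym adds another pair of LIKE conditions that the database has to evaluate against each article, so a query built this way gets heavier as the synonym list grows.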
The above worked, albeit very slowly, so we couldn’t deploy it on the live server, since the website had a huge number of articles and was getting an insane amount of traffic. So, we recommended that the client switch the search on their Joomla website to Sphinx for blazing fast search speeds, and they approved our recommendation immediately. We then implemented Sphinx and deployed everything onto the live website, and guess what, search was much faster than before, even after including the synonyms.
Were there any limitations?
The main limitation was that the synonyms engine only worked when the search consisted of just one word. Making it work for multiple words was extremely difficult because of the number of possible combinations and the hardware limitations. For example, a search for “large house” becomes a search for “large home”, “large residence”, “big house”, “big home”, “big residence”, “vast house”, etc… This just wasn’t practical and required a lot more work to implement.
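To give a feel for how quickly the combinations pile up, here is a small Python sketch that expands a phrase into every synonym variant. The synonym lists are made up for the illustration (they are not pulled from WordNet), and expand_phrase is a hypothetical helper, not something from our plugin.

```python
from itertools import product


def expand_phrase(phrase, synonyms):
    """Return every variant of `phrase` with words swapped for synonyms."""
    # For each word, the options are the word itself plus its synonyms.
    options = [[word] + synonyms.get(word, []) for word in phrase.split()]
    # The Cartesian product yields one variant per combination of options.
    return [" ".join(combo) for combo in product(*options)]


# Hypothetical synonym lists, for illustration only.
SYNONYMS = {
    "large": ["big", "vast"],
    "house": ["home", "residence"],
}

if __name__ == "__main__":
    variants = expand_phrase("large house", SYNONYMS)
    print(len(variants))
    for variant in variants:
        print(variant)
```

The number of variants is the product of the number of options for each word: two words with two synonyms each already give 3 × 3 = 9 phrases, and a four-word search where every word has five synonyms would give 6 × 6 × 6 × 6 = 1,296.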
We hope you found our post fun, exciting, and informative. If you need help implementing the above, then please let us know. We are always excited to work on challenging and fun projects, our fees are reasonable, and our friendliness is legendary.