The Chris Pliakas presentation on Search Lucene in Drupal

Submitted by Bryan

While I was at DrupalCon last week, Chris Pliakas tweeted that he had used screenshots from CMS Report in his Apache Lucene presentation. I'm always flattered when this site gets noticed for something we're apparently doing right. In this particular case, we're using the contributed Drupal module Search Lucene API for our search engine as well as for faceted search and content recommendations (recommended links).

If you had talked to me a few years ago, I would have told you that the Search module that comes with the Drupal CMS is all a site like mine needs. After I became a beta tester for the Acquia Network and Acquia Search, their implementation of Apache Solr, my opinion quickly changed. I'm now convinced that an enterprise-quality search engine is truly something that can make or break your website. If you run a smaller Drupal site and feel that Solr or Acquia Search is overkill or outside your budget, Search Lucene API may be the answer you've been looking for all this time.

The actual name of Chris' DrupalCon presentation is "Build a Powerful Site Search with the User-Friendly, Easy-to-Install Search Lucene API Module Suite". The video of his presentation can be viewed at Archive.org and has been embedded above. Screenshots from CMSReport.com appear between the 19-minute and 21-minute marks.

Bitrix Introduces the D.I.G. Engine: Enterprise 2.0 Search Technology

Submitted by bitrix

ALEXANDRIA, VA. – Bitrix, Inc. (www.bitrixsoft.com), a technology trendsetter in business communications solutions, introduces D.I.G. technology – an advanced search engine developed specifically for enterprise intranets and websites that enables high-performance data search in texts, media content and documents with smart ranking, sorting and display. The engine is available in the company’s flagship products – Bitrix Intranet Portal and Bitrix Site Manager.

"Information is a gateway to new business opportunities, while information retrieval is a key to this gateway. We are proud to present our perfected search technology and provide customers the ultimate tool for fast and accurate locating of required data across an organization’s digital assets," said Yury Tushinsky, CTO of Bitrix, Inc.

D.I.G. is designed to meet five basic principles to achieve best value and easy user adoption: accuracy, performance, content coverage, security and flexibility. This ready-made search tool intelligently implements an idea that is both simple and brilliant – thorough digging, smart display.

Google PageRank

Submitted by Bryan

CMS Wire's Barb Mosher reported on a forum post by a Google employee explaining why PageRank has been dropped from Google Webmaster Tools. Barb writes:

Do you constantly watch the Google toolbar in your browser to see if your Google PageRank has changed? Do you worry constantly about why your rank is less than that of a competitor? Well, there may not be any reason to worry any longer.

Google has dropped PageRank data from Webmaster Tools.

Google has said for some time that PageRank is only one small factor among the many it considers when placing an indexed page on a search results page. Dropping PageRank from Webmaster Tools appears to be just one more step in moving PageRank away from everyone's attention. The Google employee's forum post explains:

We've been telling people for a long time that they shouldn't focus on PageRank so much; many site owners seem to think it's the most important metric for them to track, which is simply not true. We removed it because we felt it was silly to tell people not to think about it, but then to show them the data, implying that they should look at it. :-)

I have observed that PageRank indeed doesn't matter much for placement on Google's search pages. I've seen CMSReport.com's front page ranked from as low as "3" to as high as "7" over the years. Although the PageRank has varied over time, the placement of my web pages on the search pages has stayed about the same. Relevancy of the page to the search terms being used seems to have a much greater impact on how well your site ranks with the search engine. Additional details on why Google doesn't see PageRank as a good measurement for a site can be found in one of their Webmaster FAQs.

Keyword Research for Search Engine Optimization in Drupal 6

Submitted by Radha1587

In this PDF article by Ben Finklea, readers will explore:

  • What is a keyword and why it matters
  • Why keyword research is perhaps the most important thing you will do in an SEO campaign
  • Setting goals for your keywords
  • How to use your site to find great keywords including installing and configuring the Top Searches module
  • Several external keyword research tools to speed up the process of finding the best terms
  • A walk-through of the keyword research process

Read More: http://www.packtpub.com/files/8228-drupal-seo-sample-chapter-2-keyword-research.pdf

A book for implementing a Solr-based search engine

Submitted by swatii

Packt is pleased to announce Solr 1.4 Enterprise Search Server, a new book that helps developers optimize their website for high volume web traffic with full-text search capabilities. Written by David Smiley and Eric Pugh, Solr 1.4 Enterprise Search Server is a practical reference guide that provides complete guidance on how to use this incredibly powerful tool effectively.

Solr is an Open Source enterprise search server based on the Lucene Java search library. Solr can be integrated with a host of technologies such as Java, JavaScript, Drupal, Ruby, XSLT, PHP, and Python. It also provides users with an enhanced search experience through features such as highlighting search results, spell-corrections, auto-suggest, and phonetic matching.

Solr 1.4 Enterprise Search Server starts off by giving readers a quick overview of Solr. The book helps readers learn how to map relational schemas to Solr’s schema and explains each of the ways in which data can be imported into Solr. From there the book gradually takes them from basic features, such as the query syntax, to advanced features that help enhance their search. Developers can further enhance their search with faceted navigation, result highlighting, and fuzzy queries.
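For readers who have not seen a Solr request, here is a rough sketch (not taken from the book) of how faceting, highlighting, and a fuzzy query can be combined in a single search URL. The parameters `q`, `wt`, `facet`, `facet.field`, `hl`, and `hl.fl` and the trailing `~` fuzzy operator are standard Solr/Lucene query features; the host, the `/solr/select` path, the field names `type` and `body`, and the helper `build_solr_query` are assumptions made up for illustration.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_solr_query(base_url, term, facet_field, highlight_field, fuzzy=False):
    """Return a Solr select URL with faceting and highlighting switched on.

    Hypothetical helper: the field names passed in must exist in your schema.
    """
    # A trailing "~" asks the Lucene query parser for a fuzzy match on the term.
    query = f"{term}~" if fuzzy else term
    params = [
        ("q", query),                 # the search expression
        ("wt", "json"),               # response format
        ("facet", "true"),            # enable faceted navigation
        ("facet.field", facet_field), # count results per value of this field
        ("hl", "true"),               # enable result highlighting
        ("hl.fl", highlight_field),   # field to pull highlighted snippets from
    ]
    return f"{base_url}?{urlencode(params)}"


# Assumed local Solr instance and field names, for illustration only.
url = build_solr_query("http://localhost:8983/solr/select", "drupal", "type", "body", fuzzy=True)
print(url)
# Decode the query string again to show the parameters Solr would receive.
print(parse_qs(urlsplit(url).query))
```

Building the URL only shows the shape of the request; sending it requires a running Solr server whose schema defines the fields named here.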

Kentico CMS 4.1 for ASP.NET released

Submitted by Bryan

Kentico CMS 4.1 for ASP.NET was recently released. Kentico, the software vendor, announced that this version of the CMS contains a new enterprise-class search engine and several usability improvements.

The new search engine is based on the Lucene.NET search engine, which is already known to provide excellent performance and search results. The new Smart Engine can be used side by side with the existing SQL-based search engine. The main features of Kentico's search engine include:

  • displaying search results based on the relevancy (search result ranking)
  • displaying document preview in the search results
  • customizable search filters and custom search queries
  • customizable search scope - you can specify site section, document types and fields that should be searched

The additional usability improvements found in Kentico CMS 4.1 include:

  • New Dialogs for Inserting Images
  • New Concept of Document Attachments
  • Improvements in Multi-lingual Content Management
  • Improved Performance

Kentico CMS 4.1 can be downloaded from the vendor's official download page. You can also find additional information about this CMS at Kentico.com.

Google for the Next Generation

Submitted by Bryan

Yesterday afternoon Google announced on its Webmaster Central Blog that it is changing the architecture of its search engine. These changes are expected to improve the speed, accuracy, and completeness of the Google search engine. Better yet, the prototype for the enhanced search engine is available for public testing.

For the last several months, a large team of Googlers has been working on a secret project: a next-generation architecture for Google's web search. It's the first step in a process that will let us push the envelope on size, indexing speed, accuracy, comprehensiveness and other dimensions. The new infrastructure sits "under the hood" of Google's search engine, which means that most users won't notice a difference in search results. But web developers and power searchers might notice a few differences, so we're opening up a web developer preview to collect feedback.

Some parts of this system aren't completely finished yet, so we'd welcome feedback on any issues you see. We invite you to visit the web developer preview of Google's new infrastructure at http://www2.sandbox.google.com/ and try searches there.

When I first used Microsoft's new Bing search engine, one of the surprises for me was the speed with which the results were delivered. I suspect it's no coincidence that, as competition heats up, Google now sees a need to improve the infrastructure for delivering search results to its users. Whatever the reason, I'm happy to see that changes are coming.

I also have to admit that I get a secret pleasure in knowing that changes to Google's search engine will put those search engine optimization (SEO) folks on even shakier ground. These are the folks who claim that, for a price, they can put your website's pages at the top of Google's results pages. As you can tell from my tone, I'm not a big believer in SEO. I'm a big believer that writing good content on your site is the only search engine optimization you ever really need. Hopefully Google's new search engine will continue to prove my point.

mojoPortal 2.3.0.8 Released

Submitted by Bryan

A new version of mojoPortal is out, version 2.3.0.8. This new version of the CMS includes the following new features:

  • Search Engine Improvements
  • SEO (Search Engine Optimization) Improvements
  • Content Template Editor
  • Skin (Theme) Improvements

mojoPortal is available at the official download page. Additional details on this version of mojoPortal can be found at mojoPortal.com.

Google improving search for Flash sites

Submitted by Bryan

I'm not a huge fan of creating sites with Adobe's Flash. I personally find Flash sites difficult to navigate and bookmark, and it is hard to retrieve worthwhile information from them. However, I can understand why the more artistic Web designers and site owners out there prefer to use Flash when building a website. But in my mind, one of the biggest drawbacks with Flash is that Google and other search engines have a difficult time reading and indexing Flash sites. Let's face it, if Google can't index your site, then it is highly unlikely your customers will find it in the first few pages of Google results, no matter which keywords they use.

Luckily for Flash fans, Google has changed the rules by improving their search capability for Flash sites.

Google has been developing a new algorithm for indexing textual content in Flash files of all kinds, from Flash menus, buttons and banners, to self-contained Flash websites. Recently, we've improved the performance of this Flash indexing algorithm by integrating Adobe's Flash Player technology.

In the past, web designers faced challenges if they chose to develop a site in Flash because the content they included was not indexable by search engines. They needed to make extra effort to ensure that their content was also presented in another way that search engines could find.

So there you have it. I just lost my number one argument against building a site using Flash technology. Web designers and site owners will likely want to read Google's questions and answers about the Flash indexing improvements.
