Searching the LAN
Worse still, a LAN has no default index. Most servers have a Microsoft Indexer service now, but it isn’t clever or friendly enough to resolve this type of search problem. People like systems that both find and filter, that put some kind of relevance ranking into the list of things they find, and that keep track of what’s being asked for. What’s worse is that the Microsoft Indexer only looks at one server: I haven’t seen many businesses with only a single document server lately.
The recent explosion of search utility products has gained some piquancy with the entry of Google into this market. You can have Google Desktop Search; you can have a Google Mini (blue, smallish and up to 100,000 documents) or a Google Search Appliance (yellow, very big and heavy, £20,000 and effectively unlimited numbers of documents). The rest of the players in this business – some of which have been around for a good few years and have quite a variety of offerings – could find themselves suffering now that the current search icon has broken out of its shell and is delivering the swatch of search engines to brighten up equipment rooms across the planet.
So what do the Google boxes do that the software products don’t? Why would you choose a bundled-up hardware box anyway? Running a Google Search Appliance setup isn’t quite a plug-and-play operation. The Appliance isn’t able to ‘hunt the LAN’ without the initial setup. You have to prepare your servers to be searched, then you connect the box and tell it where to go looking. Some resources are presented via Internet Information Server (IIS), others come from databases. This is both good and bad news: good, in that the Google box understands ODBC and SQL data sources, so you can include your corporate SQL-based data in the index; bad, because to look through your file server shares you’ll need to publish those as virtual hosts within IIS.
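As a rough illustration of that preparation step, the sketch below turns a list of file-server shares into the HTTP start URLs you would hand to the crawler once each share has been published through IIS. The host name, the share names and the one-alias-per-share naming scheme are all assumptions invented for the example; nothing here is prescribed by Google or Microsoft, and it doesn’t configure IIS for you.

```python
from urllib.parse import quote


def share_to_start_url(unc_path, web_host):
    # Hypothetical convention: \\SERVER\Share is published as
    # http://<web_host>/server-share/ (one IIS alias per share).
    parts = [p for p in unc_path.strip("\\").split("\\") if p]
    if len(parts) < 2:
        raise ValueError(f"not a UNC share path: {unc_path!r}")
    server, share = parts[0], parts[1]
    # Lower-case the alias and percent-encode anything awkward, such as spaces.
    alias = quote(f"{server}-{share}".lower())
    return f"http://{web_host}/{alias}/"


def build_start_urls(unc_paths, web_host):
    # De-duplicate and sort so the list is stable when pasted into a crawler config.
    return sorted({share_to_start_url(p, web_host) for p in unc_paths})


if __name__ == "__main__":
    # Made-up server and share names, purely for illustration.
    shares = [r"\\FILES01\Projects", r"\\FILES02\Accounts"]
    for url in build_start_urls(shares, "intranet.example.local"):
        print(url)
```

Keeping the mapping in one small function means the same list can drive both the IIS publishing work and the crawler’s start points, so the two don’t drift apart.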
Network managers will be dividing into diametrically opposed camps even as they read this. One group will hiss and run away at the idea of turning on IIS anywhere inside its corporate firewall; the other would expect everyone to have this on as a matter of course.
It isn’t quite that simple. This is about smaller networks, and most of those I see don’t use IIS in a trivial fashion. Not because it’s technically undesirable, but rather because their ‘web stuff’ is handled by the ‘web guy’, who has almost nothing to do with their internal servers.
Having said that, going through the setup steps on the Appliance is definitely worthwhile. It may take a while for the default crawler robot settings to work through your data pile, but the immediate advantage is that your data is presented Google style.
That means you can build queries with Google syntax, and you can leave your staff merrily clicking on the search results they dig up. It even presents the usual Google ‘cached’ and ‘plain text’ options for looking at something that’s in a foreign format for your PC. This is especially valuable when what it returns is some bloated PowerPoint presentation.
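For anyone who hasn’t strung Google operators together before, here is a minimal sketch of assembling such a query. It uses the operators familiar from Google’s web search (quoted phrases, a leading minus to exclude a term, filetype: and site:); exactly which of these a given Appliance or Mini honours is something to check against its own documentation. The function name and the sample search terms are made up for the example.

```python
def build_query(terms=(), phrase=None, exclude=(), filetype=None, site=None):
    # Assemble a Google-style query string from its parts.
    parts = list(terms)
    if phrase:
        parts.append(f'"{phrase}"')               # exact-phrase match
    parts.extend(f"-{word}" for word in exclude)  # leading minus excludes a term
    if filetype:
        parts.append(f"filetype:{filetype}")      # restrict by file extension
    if site:
        parts.append(f"site:{site}")              # restrict to one host
    return " ".join(parts)


if __name__ == "__main__":
    # Hypothetical search: PowerPoint files about the budget, skipping drafts.
    print(build_query(["budget"], phrase="quarterly forecast",
                      exclude=["draft"], filetype="ppt"))
```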
However, there are several shortcomings to the Google methodology when it comes to the usual mish-mash of network resources found in smaller networks. One is ably highlighted by Paul Ockenden; another is the licensing price. More than £2,000 for an annual subscription to Google’s combined software licence, hardware maintenance and remote support is a lot to be paying for a smaller network. Because Google limits the number of documents included in the searchable index, it’s the Google Mini that is designed for networks of this size.
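Since the document limit is what separates the boxes, it’s worth knowing roughly how many documents you have before buying. The sketch below walks one or more folders and counts files by extension, then compares the total with the 100,000 figure quoted for the Mini earlier in this piece. The list of extensions is my own guess at what counts as a ‘document’, and how Google itself tallies documents against a licence may well differ, so treat the result as a ballpark.

```python
import os
import sys

MINI_DOCUMENT_LIMIT = 100_000  # figure quoted for the Google Mini in this article

# Assumption: the file types worth indexing; adjust to suit your own shares.
DOC_EXTENSIONS = {".doc", ".xls", ".ppt", ".pdf", ".txt", ".htm", ".html"}


def count_documents(roots, extensions=DOC_EXTENSIONS):
    # Walk each root folder and count files whose extension is in the set.
    total = 0
    for root in roots:
        for _dirpath, _dirnames, filenames in os.walk(root):
            total += sum(
                1 for name in filenames
                if os.path.splitext(name)[1].lower() in extensions
            )
    return total


def fits_limit(count, limit=MINI_DOCUMENT_LIMIT):
    # True when the count is at or under the document limit.
    return count <= limit


if __name__ == "__main__":
    # Usage: pass one or more folder or mapped-share paths on the command line.
    found = count_documents(sys.argv[1:])
    verdict = "within" if fits_limit(found) else "over"
    print(f"{found} documents: {verdict} the {MINI_DOCUMENT_LIMIT} limit")
```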