Leveraging Microsoft Indexing Service in the Enterprise
There are quite a few tutorials on the internet about Indexing Service, but I found none that showed me all of what I planned to accomplish in my environment. This prompted me to write this HowTo.
Microsoft describes it this way: "Indexing Service is a base service for Microsoft® Windows® 2000 or later that extracts content from files and constructs an indexed catalog to facilitate efficient and rapid searching."
My goal was to leverage this technology to help my users search for documents in our network shares. We have a number of them, and I've found that even if you attempt to promote good "house cleaning" rules, you invariably end up with a mass of documents that becomes difficult to navigate. This usually results in frequent phone calls like, "Have you seen this?" Nope, haven't seen it, but if you call Bill, I bet he knows...
While MS Indexing Service, or Desktop Search, works well when tuned properly for the end user's local documents, it doesn't do a darn thing for network files. Indexing has to be set up at the server, and then published in some manner so that the users can effectively search for documents.
Step One - Choose the Server
Choose the server to house the Index databases, and provide IIS for the ASP pages used for searching. This doesn't have to be the server that shares the files, as you'll soon see. You will, however, want to realize that if you plan to index a large number of documents, it will require some initial horsepower. My environment consists of:
- Six network indexes.
- Each index database ranges in size from 3 MB to 20 GB.
- The number of indexed documents in each network repository ranges from 500 to 200,000.
- My Indexing server is a Dell PowerEdge 2600 with a 2.8 GHz Xeon and 2 GB of RAM.
- Text Files
- HTML Files
- Microsoft Office documents
Set up an Index service account that has administrative privileges, but is not the Domain Administrator account. The last thing you need is to create issues when you need to reset the Domain Admin password, and have to track down all your services that use it...that's a drag!
Step Two - Set up your Indexes
First, right-click on the Indexing Service, and on the Generation tab, check "Index files with unknown extensions" and "Generate abstracts". Doing this will catch files that might get incorrectly named, and generate an abstract that will be displayed when our ASP search page finds a document. Also, unless you intend to index your web pages, remove any references to any other Catalogs by right-clicking and deleting them. You'll have to stop Indexing Services first.
Once your Catalog name and the Location of the Indexing Catalog (database) are set, you'll have to choose a location to Index, and an account to use for this process.
Once you've got these set up, give Indexing Services a restart, since the catalogs won't build until you do. Now that this is done, you'll want to set your Indexes thusly...
On the Tracking and Generation tabs, tick "Inherit above settings from Service", and set the WWW server to None, so as to avoid Indexing your web site. Click OK, and in the M$ style of service changes, restart the services to be sure your settings apply.
Step Three - Set Up ASP Search Page
This is the best part, IMHO. Now you expose these Catalogs for your users' searching pleasure. I found bits of this out on the web, and M$ also put together a couple of ASP examples that can be found in the IISSamples directory of your web server. Some documentation on their use can be found here: http://msdn2.microsoft.com/en-us/library/ms692879(VS.85).aspx.
Take this sample, change line 21 to substitute your server's NetBIOS name, and modify the SELECT and OPTION tags at lines 139 through 146 to substitute your Indexing locations. The JavaScript referenced in the script include on line 9 is a utility to check for a valid search term and location.
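The validation script itself isn't reproduced in this HowTo, so what follows is only my rough sketch of what a "valid search term and location" check could look like. The function name validateSearch and the catalog names in the example are made up for illustration; the only names taken from the sample page are the SearchString and Scope form fields.

```javascript
// Hypothetical helper, NOT the script that ships with the IIS samples.
// Returns an error message, or an empty string when the input looks usable.
function validateSearch(searchString, scope, allowedScopes) {
  // Trim leading/trailing whitespace so a string of blanks counts as empty.
  var term = (searchString || "").replace(/^\s+|\s+$/g, "");
  if (term === "") {
    return "Please enter a search term.";
  }
  if (!scope) {
    return "Please choose a location to search.";
  }
  // Only accept locations that are actually offered in the drop-down.
  if (allowedScopes.indexOf(scope) === -1) {
    return "Unknown search location: " + scope;
  }
  return "";
}

// Example call with made-up catalog names:
var message = validateSearch("budget 2007", "Accounting", ["Accounting", "Engineering"]);
// message is "" here, so the form would be allowed to submit.
```

On a real page you would call something like this from the form's onsubmit handler and cancel the submit when the message is not empty.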
My simple mod is here:
<%
' Customization variables
DebugFlag = FALSE          ' set TRUE for debugging
UseSessions = TRUE         ' set FALSE to disable use of session variables
RecordsPerPage = 20        ' number of results on a page
MaxResults = -1            ' total number of results returned

' Hard-code some parameters that could be taken from the form
' SortBy = "rank[d]"       ' sort order
IndexServer = "\\IBSFP\"   'Our indexing server

' Set initial conditions
NewQuery = FALSE
UseSavedQuery = FALSE
SearchString = ""
QueryForm = Request.ServerVariables("PATH_INFO")

if Request.ServerVariables("REQUEST_METHOD") = "POST" then
    SearchString = Request.Form("SearchString")
    DocAuthorRestriction = Request.Form("DocAuthorRestriction")
    FileTypeRestriction = Request.Form("FileTypeRestriction")
    FSRest = Request.Form("FSRest")
    FSRestVal = Request.Form("FSRestVal")
    FSRestOther = Request.Form("FSRestOther")
    FMMod = Request.Form("FMMod")
    FMModDate = Request.Form("FMModDate")
    SortBy = Request.Form("SortBy")
    Scope = IndexServer & Request.Form("Scope")
    Catalog = Request.Form("Scope")
    RankBase = Request.Form("RankBase")
    ' NOTE: this will be true only if the button is actually pushed.
    if Request.Form("Action") = "Search" then
        NewQuery = TRUE
        NextPageNumber = -1
    elseif Request.Form("pg") <> "" then
        NextPageNumber = Request.Form("pg")
        UseSavedQuery = UseSessions
        NewQuery = not UseSessions
    end if
end if
%>
You may have to play around with this a bit, but once your users start using it instead of their Desktop Search, they will quickly find this is much more efficient, and thank you.
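If the branching in the ASP block is hard to follow, here is the same decision logic restated as a small JavaScript function. This is purely an illustration for reading purposes and is not part of the search page; the function name resolveQueryState and the "Docs" share name are my own inventions, while the field names (Scope, Action, pg) come from the VBScript above.

```javascript
// Illustrative only: mirrors the VBScript branch logic, it does not replace it.
function resolveQueryState(form, indexServer, useSessions) {
  var state = {
    scope: indexServer + (form.Scope || ""), // server prefix plus the chosen location
    catalog: form.Scope || "",               // the catalog is named after the location
    newQuery: false,
    useSavedQuery: false,
    nextPageNumber: null
  };
  if (form.Action === "Search") {
    // The Search button was pushed: start a fresh query.
    state.newQuery = true;
    state.nextPageNumber = -1;
  } else if (form.pg !== undefined && form.pg !== "") {
    // A paging link was used: reuse the saved query if sessions are enabled,
    // otherwise the query has to be run again.
    state.nextPageNumber = form.pg;
    state.useSavedQuery = useSessions;
    state.newQuery = !useSessions;
  }
  return state;
}

// Example with a made-up share name:
var fresh = resolveQueryState({ Scope: "Docs", Action: "Search" }, "\\\\IBSFP\\", true);
var paged = resolveQueryState({ Scope: "Docs", pg: "3" }, "\\\\IBSFP\\", true);
```

The point to notice is that the same form value feeds both the scope (prefixed with the server name) and the catalog, which is why the OPTION values on the search form have to match your catalog names.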