August 2009 - Archive
- Upcoming Events
- Customer Quote of the Month
- Tech Tip: Directing your crawler with your content – using robots.txt and meta-robots
- Subscription/Unsubscription and Contacts
LATEST SEARCH APPLIANCES FEATURED IN KMWORLD
A front-page article by ArnoldIT.com's Stephen E. Arnold highlights the newest versions of the Thunderstone Search Appliance and the Google Search Appliance in the printed July/August edition of KMWorld, a publication of Information Today, Inc.
The author says Thunderstone's high-performance, flexible appliances give administrators and developers excellent control over the system – with strong document-level security, feature-rich tuning controls, and the ability to schedule, stop, pause or configure database crawls just as they can for file servers, Web servers, intranet servers, etc.
You can read the KMWorld article, entitled "Making room for appliances," online here.
THUNDERSTONE ADDS A RESELLER IN AUSTRALIA
We welcome the following organization to our growing Thunderstone Channel Partner Program:
(For Search Appliances and Webinator)
1300 737 078
Development work continues on 2009 Thunderstone Software releases of:
- TEXIS (Version 6)
- Webinator (Version 6)
- Thunderstone's Texis Catalog (eCommerce search engine for online catalogs)
"What we have in this particular case is a Native American user group thesaurus language. It's been developed, and it can be added to. The more that it's used – and you put that feedback loop back into this thesaurus – the smarter it becomes. And it starts to create, with this new millennium, a written mind that parallels the thesaurus user group's community. This is something that TEXIS is equipped to deal with that the other stuff out there is not equipped to deal with. It's part of its strength."
Chief Technology Officer
Mnemotrix Systems, Inc.
The robots exclusion convention is a way for a website to tell crawlers which parts of the site it would like them to stay out of. It is not specific to Thunderstone software; other well-behaved web crawlers honor it too. There are two ways of accomplishing this:
- With a "robots.txt" file at the root of the webserver, such as http://www.thunderstone.com/robots.txt. This can be used by site maintainers to tell crawlers to stay out of entire sections. For details of the syntax, please see http://www.robotstxt.org/.
- Within an individual HTML page, via a meta tag such as <meta name="robots" content="noindex,nofollow">. This allows page authors to control whether an individual page is indexed and whether its links are followed, without affecting the pages around it.
Note that these are guidelines – they do not create technical restrictions that prevent a crawler from descending into a directory or following links, and they should not be used for security purposes.
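As a concrete illustration, a minimal robots.txt that asks all crawlers to stay out of a hypothetical /private/ directory (the path is just an example, not one from thunderstone.com) would look like:

```
User-agent: *
Disallow: /private/
```

A crawler reading this file should skip any URL on the site whose path begins with /private/, while remaining free to fetch everything else.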
All Thunderstone products obey robots.txt and meta robots by default. Sometimes, though, you need to index content you don't control that has a robots exclusion on it. If you'd like to ignore the robots recommendations and index the content, there are Walk Settings that allow this:
- Robots.txt (Y or N) – whether or not to obey the /robots.txt on a site (if it exists). Defaults to Y.
- Meta Robots (Y or N) – whether or not to obey any meta robot headers found within an individual page. Defaults to Y.
- Robots Placeholder (Y or N) – When a page is excluded from the crawl via robots, this controls whether a "placeholder" record is kept in the crawl data. The placeholder keeps the page from being visited unnecessarily, but it can cause the page to appear in searches that match its URL (when the URL is included in Index Fields). Defaults to Y.
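To see how a crawler applies the robots.txt rules described above, here is a short sketch using Python's standard-library robots.txt parser. The crawler name "MyCrawler" and the /private/ rule are hypothetical examples, not Thunderstone settings:

```python
from urllib.robotparser import RobotFileParser

# Parse a small robots.txt; rp.read() would instead fetch
# the file from the URL set via rp.set_url().
rp = RobotFileParser()
rp.parse([
    "User-agent: *",
    "Disallow: /private/",
])

# The crawler checks each URL against the rules before fetching it.
print(rp.can_fetch("MyCrawler", "http://www.example.com/private/report.html"))  # False
print(rp.can_fetch("MyCrawler", "http://www.example.com/index.html"))           # True
```

Turning the Robots.txt walk setting off is the equivalent of skipping this check entirely and fetching every URL regardless of the rules.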
Feedback, suggestions and questions are welcome. Send your email to