Customer Success Story: Using Webinator To Search Online Collections Of Eurasian And East European Research

February 24, 2009

The Center for Russian & East European Studies, a sub-unit of the larger University Center for International Studies (UCIS) at the University of Pittsburgh, won a competition a number of years ago to create the Vladimir I. Toumanoff Virtual Library — a collection that includes searchable online documents from many top U.S. researchers and analysts who write about politics, history, sociology, economics and foreign policy related to the states of the former Soviet Union and Central and Eastern Europe. Thunderstone's Webinator indexing and retrieval software enabled the responsible Informatics team to accomplish this goal in an efficient and affordable manner.

The University Center for International Studies (UCIS) provides the organizational framework that supports the University of Pittsburgh's mission to integrate and reinforce all its strands of international scholarship in research, teaching and public service. In addition to many other highly acclaimed programs and component units, UCIS includes a Center for Russian & East European Studies, an Asian Studies Center, a Center for Latin American Studies, a European Studies Center, an International Business Center (jointly sponsored with the Katz School of Business) and a European Union Center of Excellence (funded by the European Union).

As a thin layer on top of the whole UCIS structure, Central Administration handles all business-related core functions and technology issues. When individuals in any of the sub-units need advice or consulting related to I.T. Services, Knowledge Management, database planning, upgrading of their websites or anything else that would fall into technology-mediated information, they call upon Mark J. Weixel, Director of Informatics at UCIS. 

Discovering Webinator and Getting Started With Using It as an Easily Customizable Development Tool

Weixel recalled, "Back in I guess it was '98, I found out about Webinator from a friend of mine who was at Princeton at the time. We had a particular niche here in International Studies, and we wanted to create mini search engines for web content that was specific to certain world regions. We were hoping to create search engines like AltaVista, since Google wasn't even around then, that would allow people to do full-text searching of those websites. But, because we were vetting the list of sites, we thought we could increase the probability that searchers would come across something really relevant to the part of the world we were focusing on.

"We used Webinator to index and search collections of websites that were in and dealt with Russia and Eastern Europe.

"So, that was my original introduction to Webinator. We bought the entry-level product to begin with, and we currently have the Enterprise version. What I really like about it, still, is the fact that it's relatively easy to configure. It's much easier to configure than it was back when we bought the original product, when everything was run through command lines. I like the notion of relevance in terms of returned hits. It seems to make a lot more sense to me than, for example, Google page ranking, which places a much higher priority on popularity than it does on the actual content of the pages where text matches.

"Another thing that has been nice is the fact there is support for synonym matching within the server. And I think Vortex as a scripting language is very powerful. Even though I haven't used it to its fullest ability, it's proven to be quite flexible when we've needed to make modifications."

Implementing a Sophisticated Indexing and Retrieval Package with an Attractive ROI Track Record

Did they look at any competing products? According to Weixel, no, they didn't — for a couple of reasons. One, they're a small shop and they have to ask, "How much is this going to cost?" And, he said, the ROI for a one-time investment in a perpetual Webinator license was always pretty clear. It was a known quantity to them. Plus, Weixel strongly believed, as the person in charge of actually setting up and administering it, Webinator provided an affordable and high-quality solution for his specific application requirements. The business manager trusted Weixel's judgment, and by all accounts Webinator has delivered excellent results.

As to future expansion beyond the Center for Russian & East European Studies, discussions have begun with several of the other sub-units within UCIS. The Center for Latin American Studies and the European Studies Center also seem interested in putting more and more of their materials online — newsletters, conference reports, etc.

Webinator offers UCIS sub-units the possibility of acquiring a well-proven search engine that they could customize as desired and manage on their own.

Digitizing, Capturing and Making Searchable the Publications that Comprise the Vladimir I. Toumanoff Virtual Library

Weixel said their Webinator-powered search implementation getting the heaviest use right now is a project that the University of Pittsburgh's Center for Russian and East European Studies (REES) has done in conjunction with The National Council for Eurasian & East European Research (NCEEER, frequently pronounced 'Nickser') — a federally funded organization charged with supporting research, typically in social sciences, focusing on the former Soviet Union and Eastern Europe.

REES won a competition a number of years ago to create the Vladimir I. Toumanoff Virtual Library, composed of research reports and working papers submitted to NCEEER by scholars under their grants over the last two decades. This collection includes searchable online documents from many top U.S. researchers and analysts who write about politics, history, sociology, economics and foreign policy related to the states of the former Soviet Union and Central and Eastern Europe. NCEEER continues adding to the collection as its funded researchers prepare new papers.

"We proposed scanning and digitizing more than 20 years' worth of reports and then taking it and essentially pointing Webinator at it and, using the documents plug-in, doing a full-text index of the entire corpus. And I think one of the reasons that we won the competition is because, once we had done the really hard work of creating PDFs out of all the printed documents, we were going to be able to put it in one place and, overnight, have a full-text search index. It's my understanding that that was not a component of the other proposals," said Weixel.

He continued, "We successfully contended for that particular project, got it, spent the better part of nine months digitizing the materials and, I kid you not, it took, I think, less than 24 hours, and we had a fully searchable index of the entire corpus of research products. And it worked out well. We have this nice, targeted archive of material. We've got it set to re-index on a regular schedule, so anytime NCEEER gets a new batch of project reports — they upload them, they get caught in the next cycle of indexing, and it makes us very happy.
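The scheduled re-indexing Weixel describes, where newly uploaded reports are picked up in the next cycle, can be illustrated with a minimal sketch. This is not Webinator's implementation or configuration. The function names and the path-to-modification-time dictionaries are hypothetical and exist only to show the idea of indexing what changed since the previous run.

```python
from typing import Dict, Set


def files_to_index(current: Dict[str, float], indexed: Dict[str, float]) -> Set[str]:
    """Return paths that are new or modified since the last indexing cycle.

    `current` maps each uploaded file path to its modification time.
    `indexed` maps each already-indexed path to the modification time
    recorded when it was last indexed. Both structures are hypothetical.
    """
    return {
        path
        for path, mtime in current.items()
        if path not in indexed or mtime > indexed[path]
    }


def run_cycle(current: Dict[str, float], indexed: Dict[str, float]) -> Set[str]:
    """Simulate one scheduled cycle: find what changed, then record it."""
    changed = files_to_index(current, indexed)
    for path in changed:
        # A real system would extract and index the document text here.
        indexed[path] = current[path]
    return changed
```

Running `run_cycle` twice in a row with no new uploads returns an empty set the second time, which is why a regular schedule stays cheap when nothing has changed.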

"The search interface for the archive materials of NCEEER is available through the Vladimir I. Toumanoff Virtual Library at the website of The National Council for Eurasian & East European Research. You kick off the search there, and then you're transported to Pittsburgh for the actual results set.

"Recently we put the server housing Webinator behind the firewall as part of our new increased security policy at the University of Pittsburgh. The fact that the folks at Thunderstone — John, in particular, in the Support Group — were able to work with me in coming up with a way to take a search query and pipe it through a back door into Webinator and then take the result set and present that to users in an accessible front-end, was just fantastic. It took me about two weeks once I had access to the beta version of the code, and that worked out really well. It was satisfying for me on a number of levels, not just because the product did what it was supposed to, but because I had support from people who could actually help me efficiently accomplish what I needed to do. That worked out very, very well."

Weixel added, "Our audience is interesting. Of course, we're housed within a major research university. So, we do have a number of our projects where we're trying to target our students and our faculty. But the area studies centers, these sub-units underneath the University Center for International Studies, most of them have federal funding that mandates what they call 'outreach' — trying to bring the message of international studies to a larger community, whether it's a local business community or whether it's local educators at the Kindergarten through high school level. Most of them probably have some kind of academic interest in one of the regions of focus. However you look at it, it's a pretty large and diverse audience.

"Being in an international studies environment, one thing that is important to us is foreign language support. I will admit to not having tried this yet with any of the CJK languages. But, in terms of the European and Cyrillic-based languages that we've indexed, Webinator has been a really good performer. And we've been quite happy with that."

For more information about UCIS or any of its area studies centers, you may contact UCIS by mail or email at:

University of Pittsburgh
University Center for International Studies
4400 Wesley W. Posvar Hall
Pittsburgh, PA 15260

Customer Success Story: Thunderstone Data Services For Searching A Web-based Resource To Help Students With Homework

September 26, 2008

Editors at Lincoln Library Press, Inc. created an online version of everything contained in the multi-volume hardback books their publishing company sells to school libraries. This new and popular web-based resource for students, called FactCite, would also offer much additional content not currently available within the printed editions. The creators of FactCite required powerful search technology that could handle the growing information access and retrieval needs of the website. After considering Google and a couple of other potential options, they started with a Thunderstone Search Appliance -- but they ultimately decided it made the most sense to have Thunderstone Data Services host their search solution.

FactCite: The Lincoln Library Online provides more than 20,000 pages of eye-catching, thought-provoking and intelligently presented factual information that includes many useful elements designed specifically to help students in grades 4 - 12 with their homework assignments.

The online articles, bios, photos, illustrations, etc., include 700+ reader-inspiring profiles of the world's greatest athletes (from all eight editions of the 14-volume set entitled The Lincoln Library of Sports Champions), the complete five-volume contents of The Lincoln Library of Greek & Roman Mythology, as well as the entire seven-volume set of The Lincoln Library Shapers of Society and The United States Encyclopedia of History -- an impressive (but, alas, no longer in print) work which The Lincoln Library Press bought, took over, updated, computerized and put online exclusively at FactCite.

Located in Cleveland, Ohio, The Lincoln Library Press markets its large, colorful books and the FactCite online resource directly to school librarians throughout the United States. Schools that own at least one of the printed and bound sets (Sports Champions, 8th Edition; Greek & Roman Mythology; or Shapers of Society) can obtain unlimited access to FactCite for an annual license fee of $179 per building. Schools that don't own any of the print editions may provide unlimited FactCite access to their students by paying an annual license fee of $495 per building.

The Lincoln Library Press is an imprint of Lincoln Library Press, Inc., which Timothy and Susan Gall have owned and operated since 1992. This husband-wife team already possessed more than 20 years of hands-on experience as reference work developers by the time that their company's editorial offices assumed stewardship of all The Lincoln Library publications in 1998. They first met each other while employed at Northeast Ohio-based ASM International, the Materials Information Society. Timothy, an attorney, worked there as an acquisition editor. He also developed a product called Metal Selector, an online database of alloys and materials, to assist engineers in selecting materials for design purposes. Susan, a teacher by training who additionally earned an MBA, worked at ASM International on the educational end of things -- developing technical courses and producing learning programs in video and print formats.

Putting All Their Best-Available Content on the Web, and Making it Searchable

When the forward-looking owners of all The Lincoln Library publications decided to update, expand and digitize their proprietary intellectual property, put it on the web and offer it to schools as FactCite, they required search technology that could handle the growing data access and retrieval needs of the website.

Tim Gall, publisher at The Lincoln Library Press, recalled, "We needed search capabilities, and there were a lot of different types of ways to do that. The Thunderstone box was the least expensive, most powerful search solution we found. We looked at Google and a couple of others.

"But there were also some challenges for a small company like us. If you got the Thunderstone Search Appliance box, you physically had to host it here -- which we didn't necessarily want to do, because that means you need a little bit more internal infrastructure. You have to manage your servers. You have to make sure they're up and running. You can try to send it out to other hosted server providers, but that's easier said than done. They say things like, 'You'd have to have your own cabinet. You'd have to hire your own administrator, and on and on.'

"We were working with a web hosting provider that cost us a lot of money because they didn't know how to interface with the search box. They told us we needed a new server. They said Thunderstone wasn't following some protocol. They started talking gobbledygook. And they were wrong. They just didn't spend the time to understand how to get things right. So, they had egg on their face.

"We bought the box. It was here in our office. It doesn't really need much configuration. When we got it, I hired a guy who was a little smarter than me. I think he had that search box up and running inside half an hour. We set it up and just forgot about it. It ran perfectly, right away.

"More recently we've been building our online subscription service to supplement the print products with a large amount of information that's not published in books, that's just available electronically. And we're, little by little, putting that on the web, because -- obviously -- it's all going to end up there anyway eventually. Then, somebody at Thunderstone said: 'I tell you what; we'll host it on a virtual server.' And for me, right now, I can't even tell the difference between whether it's on a box or on the hosted service. It's got the same control panel. I open it up, it works great, and I don't have to worry about it."

Tapping Thunderstone Data Services For Consulting And Hosted Search

The Thunderstone Search Appliance is a plug-and-play device that represents the best search solution for many businesses, NGOs, educational institutions and government agencies around the world. It comes with a one-time, perpetual license and powerful features usually found in much more expensive products.

Some organizations, however, prefer a remotely-hosted, outsourced search solution that requires no additional hardware/software on their premises. Or they may need custom programming not suitable for the built-out Appliance environment.

With Thunderstone Data Services, including monthly hosting and any applicable project consulting or programming work, customers can get precisely what they want -- while enjoying a surprisingly affordable total cost of ownership on their Thunderstone-hosted search solution.

For the sake of maintaining the proprietary nature of the FactCite online subscription service, it would prove critical to develop appropriate security measures. Thunderstone created a Texis-based search application with access control capabilities that fully accommodated the desired user login procedures specified by The Lincoln Library Press.
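As a rough illustration of this kind of access control, the sketch below assumes the arrangement Gall describes later in this story: on-campus computers are recognized by IP range, and a shared user name and password covers after-hours use from home. It is not Thunderstone's Texis application. The record layout, the example network and the credentials are all hypothetical.

```python
import hmac
import ipaddress
from typing import Dict, List, Optional

# Hypothetical subscriber records: one allowed IP range per school building
# plus one shared login for off-campus use. A production system would store
# password hashes rather than plain text.
SCHOOLS: List[Dict[str, str]] = [
    {
        "name": "Example Middle School",
        "network": "203.0.113.0/28",
        "username": "examplelib",
        "password": "change-me",
    },
]


def school_for_request(ip: str, username: Optional[str] = None,
                       password: Optional[str] = None) -> Optional[str]:
    """Return the subscribing school's name if access is allowed, else None."""
    address = ipaddress.ip_address(ip)
    for school in SCHOOLS:
        # On-campus computers are recognized by IP range, with no login step.
        if address in ipaddress.ip_network(school["network"]):
            return school["name"]
        # Off-campus users fall back to the shared credentials.
        if (username == school["username"] and password is not None
                and hmac.compare_digest(password.encode(),
                                        school["password"].encode())):
            return school["name"]
    return None
```

Checking the IP range first means a school with eight or ten computers needs no per-machine accounts, while the single shared login still works from any address.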

Gall related, "I was flabbergasted that it hadn't been canned by hardly anyone. You can't find this type of login software very easily out there. I mean, I looked all over. I wanted a simple program -- just a little database to keep track of users, give them a password, etc. I figured, how hard can that be? Someone's canned that. I had quotes from tens of thousands of dollars and up. Or, in many cases, they couldn't even do it at all.

"Thunderstone quickly and easily wrote that customized login capability for us. It's gone very smoothly, and we found it pleasant working with them. No complaints.

"Maybe a school will have eight or ten different computers, and they don't want to have to bother with a separate password and login for each one. It's too much to manage. So, we can use the same password, etc. for computers within a specified IP range. We also give the librarians a user name and password that the kids can use to log in after school hours -- when they're at home."

Combining the Advantages of Printed Books with Online Enhancements

Educational book reviewers have favorably noted that while The Lincoln Library's multi-volume sets are easy to read and grasp for younger students just beginning to learn about the people and topics covered, the well-researched articles also manage to offer much subject matter that eleventh or twelfth graders will find informative and descriptive enough to help them when they write their term papers.

FactCite: The Lincoln Library Online retains the attractive content formatting and the many homework-friendly characteristics that make The Lincoln Library editions popular with students. Users can access everything, complete with all full-color and black-and-white scanned elements, plus the actual source text as it appears in the printed books. Or they can enter a query in the search box -- and it will return relevant results.

FactCite provides both educators and pupils with:

  • Thousands of images
  • Advanced search features
  • Drill-down menus for easy searching
  • Multiple indexes
  • Regularly-updated content

Teachers encourage students to employ FactCite when working on their homework assignments, although they typically remind the kids to properly credit the website in their reference notes and bibliographies.

"We tried to design it from the kids' point of view. Often you go to a site, you type in the keyword, and you get 5,000 hits. And Wikipedia, for example, is not really ideal for kids in school. It's too difficult to read. There's too much information there, and it's not a narrative story. It's sometimes overwhelming.

"The articles at FactCite are written specifically to be useful to our target audience. There's material you can print out for inclusion in your reports. You can print out graphics and images, even outlined images that can be colored in. You won't find excessive links (they're mostly at the end of articles) and certainly none that take you away from the site.

"The challenge is, at certain grade levels, to get kids to read. It's very hard to teach a kid to read on the computer. There are too many distractions. There are too many hot links. You're moving all over, and you lose sight of what you're doing.

"We've brought into our books [and FactCite] large-text introductions that are written at a level that almost any kid could read. And then we have lots of pictures and text boxes to spur their interest, as well as to keep them engaged.

"In terms of using Thunderstone's search technology, we're certainly just scratching the surface currently. But, we know the things we want to do with it in the future." Gall explained.

Enriching with The Lincoln Library of Essential Information

FactCite continues to add fresh, new articles dealing with the fundamentals of Geography, World History, Science and other key areas that will cross-cut most of the curriculum for grades 4 through 12. This growing quantity of robust and enhanced content comes from a rigorously updated version of The Lincoln Library of Essential Information.

According to Gall, the book began in 1907 as The Standard Dictionary of Facts -- a one-volume compilation of all sorts of information, and by 1924 it had turned into The Lincoln Library of Essential Information. At that time there was a big push in America for self-education. People were trying to teach themselves all sorts of things. And this book was organized to serve that purpose. Instead of being organized as a standard encyclopedia, things were organized by subject categories. Therefore, students could go through and read about a subject. At the end of the articles there were test questions. Readers, in the privacy of their own homes, could test themselves on the knowledge they acquired by going through the articles.

"It evolved into a two-volume set, and by then it had kind of retreated into the library and out of the home. It still retained its autodidact features. If you wanted to learn a subject area, you could go into it and in six or seven pages you got a pretty good grasp of what it's all about. You'd see which sources to go to for further reading, books you should be looking for, what you really should be reading if you want to have some clue about what's going on. There were always incredibly good authors writing for it, people from the best universities.

"So, we took this set and we scanned it and computerized it. It had never been computerized. And we started sending out the articles for revision. Perhaps about 60 percent are back already. We're moving now to put that material online at FactCite.

"We're going to lead with it on the website first. Rather than making print the primary product and electronic the secondary product, we're now going to flip that -- with electronic as the primary and print as the secondary product," said Gall.

Gall estimated that comparable online products from marketplace competitors would probably cost schools $4,000 to $5,000. The Lincoln Library Press, even as it steadily expands its total knowledge repository in the months ahead, will deliver a much better-priced alternative. And, unlike others, FactCite won't offer 50,000 articles about the 'Internet.' Instead, it will have one or two highly useful articles that kids can and will actually read.

Thunderstone Search Appliances for Searching a Government Surplus-Property Auction Website

August 19, 2008

When the U.S. General Services Administration looked to upgrade the searching capabilities on its GSA Auctions® website, it could have written a host-based search engine to run on the mainframe system. Instead, GSA opted to use "off-the-shelf" Thunderstone Search Appliances — enabling the implementation of an affordable search solution that reduces system load while providing sophisticated search features expected by today's savvy users.

The GSA Auctions® website empowers people in the general public to bid electronically on excess and/or surplus Federal assets. It supports fully web-enabled auctions that permit registered participants to bid on a single item or multiple items (lots) within specified timeframes.

Auctioned items can include run-of-the-mill office equipment and furniture, as well as more exotic Federal assets such as scientific equipment, heavy machinery, airplanes, vessels, vehicles, etc. The website enables GSA to auction off and dispose of an inventory of products that is widely dispersed geographically. Participants can bid on and purchase available assets, without worrying about the actual physical location of any particular item or buyer.

Interested individuals may browse products offered on the auction site, or they can choose to search for items and place bids. With flexible and robust search capabilities powered by the Thunderstone Search Appliance, GSA Auctions® takes advantage of Thunderstone's proven technological expertise in the simultaneous searching of both structured and unstructured data.

Before GSA implemented Thunderstone's Appliance-based search solution, the website's data access and retrieval worked very differently. Back-ended by a COBOL application using a CGI interface to the mainframe's web server, the site originally supported a basic full-text search that required parsing the complete database to search each active item.
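The cost difference between parsing every record on each query and consulting a prebuilt index can be shown with a small generic sketch. It does not reproduce the COBOL application or the Appliance's internals. The sample items and function names are hypothetical.

```python
import re
from collections import defaultdict
from typing import Dict, List, Set

# Hypothetical auction listings keyed by item number.
ITEMS: Dict[int, str] = {
    1: "Steel office desk with drawers",
    2: "Office chair, swivel",
    3: "Forklift, heavy machinery",
}


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def scan_search(term: str, items: Dict[int, str]) -> Set[int]:
    """Full scan: re-read and re-tokenize every item for every query."""
    term = term.lower()
    return {item_id for item_id, text in items.items() if term in tokenize(text)}


def build_index(items: Dict[int, str]) -> Dict[str, Set[int]]:
    """Build an inverted index once: each word maps to the items containing it."""
    index: Dict[str, Set[int]] = defaultdict(set)
    for item_id, text in items.items():
        for word in tokenize(text):
            index[word].add(item_id)
    return index


def index_search(term: str, index: Dict[str, Set[int]]) -> Set[int]:
    """Indexed lookup: one dictionary access per query term."""
    return set(index.get(term.lower(), set()))
```

Both functions return the same item numbers for a given term. The difference is that `scan_search` does work proportional to the whole database on every request, while `index_search` pays that cost once in `build_index`.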

It seemed appropriate to consider deploying new technology, because the GSA Auctions® site deserved a more feature-rich and less resource-intensive search tool.

Driving TCO Lower by Integrating Mainframe Systems with "Off-The-Shelf" Products

Thomas Schaefer serves as Systems Architect and consultant to the General Services Administration. He helps the GSA identify innovative ways to derive optimal value from its Unisys ClearPath Mainframe investment and to maximize the productive use of all related I.T. resources.

According to Schaefer, GSA could have written a host-based search engine to run on the system. But, they figured, "Why reinvent the wheel?" Instead, GSA rapidly deployed several off-the-shelf Thunderstone Search Appliances to affordably implement a state-of-the-art search solution with reduced load on the ClearPath system for each request.

"By using Thunderstone Search Appliances, GSA Auctions® has gained the rich search features users have come to expect from sites like Google. The broader point is that, while there is a move to consider total cost of ownership and migrate applications to the ClearPath environment, not every problem requires developing custom mainframe software. Some problems are better solved by integrating existing components, mainframe or otherwise, into composite systems," Schaefer said.

Thunderstone's DataLoad API, which allows data to be pushed into the Appliance, gets used a lot by the people who administer the GSA Auctions® site. And they appreciate the "plug-and-play" reliability of their Thunderstone Search Appliances -- because every added server brings with it an additional production cost. Who wants to maintain yet another server? "With Thunderstone," Schaefer said, "I haven't logged on in eight months, and everything keeps running just fine."

GSA continues to work with Thunderstone to further enhance and refine its electronic auction offerings to the public, using the Thunderstone Search Appliance model as its standard. Schaefer explained, "We like the fact that we still have the sophistication of Texis, but in an appliance."

The current search solution for the site includes two load-balanced Thunderstone Search Appliances (Enterprise Edition) located in a Minnesota production environment. The two Appliances run in an active-active configuration. Two additional Thunderstone Search Appliances (one Enterprise Edition and one Small Business Edition) are installed at a GSA facility in Utah, for the purpose of ongoing new application development and testing.

Kevin J. Payne, Director of System Applications at GSA, recalled, "We considered several internal and external search alternatives. Tom [Schaefer] came up with the idea of using a search appliance, which basically meant using either Google or Thunderstone."

"Thunderstone demonstrated a real willingness to make changes that we wanted and to customize their standard appliances to meet our needs. The ability to customize was a big thing. Plus, we really like the way the search engine runs," Payne added.

Customization for the search enhancement project enabled GSA's Thunderstone Search Appliances to add as many as 50 additional data fields (well beyond the three definable fields that come with the standard Appliance) for auction-item searches based on bid amounts and different geographic locations.

Payne said the procurement process was fast and easy. And he expects other key GSA projects to also have their search capabilities powered by Thunderstone Search Appliances in the months ahead. As an example he cited another website that is part of the Federal Asset Sales Presidential e-Government (E-GOV) initiative; it will very likely implement its own Thunderstone Search Appliance by Fall 2008.

"These are systems that bring in billions of dollars a year to the U.S. government," noted Payne.

Providing Multiple Ways of Searching Structured/Unstructured Data to Find Precise Results

The GSA Auctions® site offers a variety of ways to search for desired items:

    • Global Search
      The upper right corner of every page on the GSA Auctions® website contains a global search box that can search the entire site for all assets available to bid on and purchase. Users may refine displayed global search results by entering another search term in the search box above their search results and then clicking the "Search Within Results" button.
    • Category Search
      The home page has a category search box for searching only user-selected categories of items. Directly above the category list, the category search box begins with "All Categories" as the default selection from the "category dropdown" list. This list changes to reflect items available for selection as users browse the different product categories.
    • State Search
      Choosing a state from the Browse States dropdown box on the site's main menu header will produce a list of items located in the chosen state and will allow a user to enter keywords for searching the full-text information associated with all the displayed items.
    • Advanced Search
      Accessible via a hyperlink under the global search box, advanced search allows users to delimit keyword search types (all words, any words, exact match), specify a current winning bid range, and search within a state for an item.
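The search options listed above combine free-text matching with filters on structured fields. The following minimal sketch shows one way such a combination could work. It is an illustration only, not the GSA Auctions® or Thunderstone implementation. The `Item` fields, the mode names and the function names are assumptions.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Item:
    title: str          # unstructured text to search
    state: str          # structured field: item location
    current_bid: float  # structured field: current winning bid


def tokenize(text: str) -> List[str]:
    return re.findall(r"[a-z0-9]+", text.lower())


def matches_keywords(text: str, query: str, mode: str) -> bool:
    """Apply one keyword mode: 'all' words, 'any' words, or 'exact' phrase."""
    words, terms = tokenize(text), tokenize(query)
    if mode == "all":
        return all(term in words for term in terms)
    if mode == "any":
        return any(term in words for term in terms)
    if mode == "exact":
        # The query words must appear next to each other, in order.
        n = len(terms)
        return any(words[i:i + n] == terms for i in range(len(words) - n + 1))
    raise ValueError(f"unknown mode: {mode}")


def advanced_search(items: List[Item], query: str, mode: str = "all",
                    min_bid: Optional[float] = None,
                    max_bid: Optional[float] = None,
                    state: Optional[str] = None) -> List[Item]:
    """Keep items that match the keywords and every filter that was supplied."""
    results = []
    for item in items:
        if not matches_keywords(item.title, query, mode):
            continue
        if min_bid is not None and item.current_bid < min_bid:
            continue
        if max_bid is not None and item.current_bid > max_bid:
            continue
        if state is not None and item.state != state:
            continue
        results.append(item)
    return results
```

Because `advanced_search` takes and returns a plain list of items, a "Search Within Results" step can be modeled by passing the previous result list back in with a new query.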


The professionals who work at GSA sometimes tend to write in government-spec language that typical users do not use themselves. For instance, what GSA refers to as "vehicles" on their auction site does not correspond to terms that most people use when searching for vehicles. Rather, people look for "cars," "sedans," "autos," "automobiles," "SUVs," etc.

With built-in Metamorph concept-based searching capabilities, the Thunderstone Search Appliances saved GSA hundreds of thousands of dollars in development costs, according to Payne.
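The vocabulary mismatch described above, where the site says "vehicles" and users type "cars" or "SUVs", is the kind of problem that query expansion addresses. The sketch below is a deliberately simple illustration using a hand-written concept group. The source does not describe how Metamorph works internally, so nothing here should be read as its actual mechanism. The concept list and function names are hypothetical.

```python
import re
from typing import List, Set

# Hypothetical concept group: any word in a group matches every other word in it.
CONCEPTS: List[Set[str]] = [
    {"vehicle", "vehicles", "car", "cars", "sedan", "sedans",
     "auto", "autos", "automobile", "automobiles", "suv", "suvs"},
]


def tokenize(text: str) -> List[str]:
    return re.findall(r"[a-z0-9]+", text.lower())


def expand(term: str) -> Set[str]:
    """Return the term together with every word that shares its concept group."""
    term = term.lower()
    for group in CONCEPTS:
        if term in group:
            return set(group)
    return {term}


def concept_search(term: str, listings: List[str]) -> List[str]:
    """Return listings that contain the term or any word from its concept group."""
    wanted = expand(term)
    return [text for text in listings if wanted & set(tokenize(text))]
```

With this approach a search for "cars" can return a listing that only ever uses the word "Vehicles", without anyone rewriting the listing text.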

How Ariba built an enterprise-wide Knowledge Management system based on Texis

October 24, 2007

A major challenge for today's organizations is providing search that is flexible enough to work across all of their repositories in the appropriate manner, which is what an enterprise-wide knowledge management system requires. In this article we look at how Ariba, Inc. created such a system using Thunderstone's Texis as its search solution.

Ariba offers the world's leading Spend Management software and services to a wide range of customers across all industries. It provides both CD-based and on-demand software, along with services related to sourcing products and commodities, negotiating contracts, buying against negotiated contracts and other key components of a complete, end-to-end Spend Management solution. Ariba helps organizations analyze, understand and manage their spending in order to rapidly achieve sustainable cost savings and to improve business process efficiency.

Why knowledge management?

Derek Matthews, Ariba's Lead Knowledge Architect, was initially focused on their Professional Services organization and their implementation methodology. He wanted to be able to drive consistent best practices into every implementation engagement. During every engagement they would learn things and want to capture the assets and templates that were the best practices for each phase of a project engagement with the customer.

According to Matthews, “A key piece of knowledge management is searching. Process-wise, the first thing you do when you've got an issue in front of you is that you search our knowledge base to discover if your issue has already been addressed. Is there already a best practice out there that addresses what you're trying to do, or is there a content item out there that is immediately leverageable for what you're wanting to do? If none of those are the case, then we do what we call research. Research means having to dig really deep into content items and collaboration, discussion forums and all this sort of thing. And eventually maybe you'll find your solution on page 75, paragraph 4 of some technical document in combination with some other user guide and whatever else. Then the idea is to build a solution, put that into our knowledge base and enable other users around the globe to find it when searching. Researching is more costly than searching. So, we want to push everything forward to that less-costly searching process.”

Native Power and Flexibility of Thunderstone's Texis

Ariba initially licensed Thunderstone Texis because its content management system vendor offered integration code for it; the vendor supported Texis because of its attribute storage and search capabilities. That integration code turned out to be inefficient, and it became sluggish for Ariba's end users when pushed beyond its design. Because Thunderstone provides the ability to completely customize everything from the Texis databases to Vortex scripts, Ariba's software engineers developed their own integration with the content management system. The result was greatly improved efficiency, with several customized interfaces that enabled users of the company's various portals to search for content items containing full text and file attachments plus all the associated metadata needed for efficient filtering, security and results sorting. Over time, Ariba built its current knowledge base. With every engagement the process became smoother, and everyone, globally, is now able to follow the same process. From the Professional Services implementation methodology it grew into Sales and Marketing material, and then into Customer Support -- the entire CRM environment, engineering, HR, and eventually every department in Ariba got on board with the Knowledge Management platform, and now they are all big users of it (Ariba has around 2,000 employees in North America, Europe, Asia/Pacific, Latin America and the Middle East).

As Matthews explains, “We started rolling out enhancement wave after enhancement wave and have continued that to this day. One of the latest enhancements that we've rolled out has been federated searching capabilities, what Thunderstone calls the meta-searching. We actually have portlets that we use on various portals across our extranet that allow people to pull together incongruent content based on metasearches that they want to perform. It is possible now, within the last year, for an Ariba employee to do a search and be able to pull a service request from our CRM system, a defect from our engineering quality system, a marketing presentation from our content management system, as well as searching any number of other internal web sites that have been indexed. You can search over all of the sources through one integrated portlet that we built on top of Thunderstone Texis.

“Texis allows Ariba to index widely varying content into a platform that can be securely accessed by users from various portals for our customers, partners, suppliers, prospects and employees. We are able to use the search engine to present dynamic, context-sensitive views of content for users to browse with the ability to refine through full-text searching.”
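The federated search Matthews describes, where one query pulls results from several systems into a single list, can be sketched in a few lines. This is an illustrative sketch only and does not use Texis or Vortex; the source names, the sample records, the `Hit` type and the simple word-count scoring are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    source: str
    title: str
    score: float  # assumed to be comparable across sources


def search_source(name, documents, query):
    """Score each title by how many query words occur in it (substring match)."""
    words = query.lower().split()
    hits = []
    for title in documents:
        lowered = title.lower()
        score = sum(1 for w in words if w in lowered)
        if score:
            hits.append(Hit(name, title, float(score)))
    return hits


def metasearch(sources, query, limit=10):
    """Query every source, then merge the hits into one ranked list."""
    merged = []
    for name, documents in sources.items():
        merged.extend(search_source(name, documents, query))
    # Best score first; ties are broken by source name, then title.
    merged.sort(key=lambda h: (-h.score, h.source, h.title))
    return merged[:limit]


# Hypothetical stand-ins for a CRM system, a defect tracker and a content store.
sources = {
    "crm": ["Service request: invoice export fails"],
    "defects": ["Defect: invoice export truncates totals", "Defect: login timeout"],
    "content": ["Marketing presentation: sourcing overview"],
}

for hit in metasearch(sources, "invoice export"):
    print(hit.source, "|", hit.title)
```

In practice each source would rank results differently, so a real system needs some way of normalizing scores before merging; the sketch sidesteps that by using one scoring function everywhere.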

Customized Search Engine for Knowledge Management

Thunderstone has always believed in providing tools to users, letting them add the intelligence and knowledge they have in their field to the core tools Thunderstone provides, enabling the creation of powerful applications that can give a competitive advantage. By providing a SQL-based database that is optimized for full-text searching in combination with traditional database attributes, Thunderstone enables customers to use their existing expertise in database application creation and efficiently extend it into the world of search. Another benefit Thunderstone provides is the Texis Web Script language, known as Vortex. This is designed to allow developers to quickly create applications using the Texis database by providing a simple but powerful syntax and library. Customers can easily take Thunderstone's applications and modify them to suit their needs, and Thunderstone itself can create new applications for customers, usually in a matter of days.

Matthews continued, “From a knowledge management perspective we were excited to get involved with Thunderstone, because what they brought to us was a completely customizable interface. A lot of companies, when you talk about knowledge management, tend to believe there's one out-of-the-box solution you can install that's going to solve all your problems with no customization. But, that's not reality. What Thunderstone allowed us to do was to completely look at any and every requirement that we have. With Thunderstone Texis we get a platform that we can customize down to the most technical level.

“With many search engine solutions on the market you basically install their software, point it at a URL and say 'go index that data.' And that's fine. For some situations that's no problem. But in our case we need to be able to customize, even down to the database level. We need to be able to modify how this Texis database works. We need to make modifications to the Vortex scripts that go out and search for content so that they can make certain metadata changes based on certain conditions that might be present. Those Vortex scripts need to be able to combine text from within attached files together with the metadata describing that document in addition to the full text that might be within those files. For example, there will be categorization that's tied to those files. There will be filtering tags that are applied, as well as security tags that get applied.

“The last piece is critical. We need to be able to fully customize the interface to our search engine in a secure manner. If you're a customer accessing our search engine, we want to be able to allow you to search all of the content that we provide to every customer as well as the content that's provided to you specifically as a customer. Maybe it's a benchmark report that we want you to be able to see. But, at the same time, we don't want to allow you to search against a benchmark report for another customer. Security is very important. We're allowing you to search the content that you, depending on whatever attributes you have as a company, are able to see. If you bought a certain product, we may want you to be able to search content for that product, but we don't want you to get bogged down searching content for another product that you don't even own. It's useful both from a filtering for ease of use perspective as well as a security perspective -- both equally important aspects.

“Thunderstone's ability to customize based on those very, very specific requirements that we had made it a great choice, because at each level we are able to make it do exactly what we want it to do. With Thunderstone Texis you have the option of using it right out of the box, one size fits all, just install it and run. However, and this is what I think so many people don't realize out in the industry, Texis also enables a level of customization that lets you go down to the 'nth degree' depending on whatever your requirements are.”
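The security and filtering requirement Matthews outlines, where a customer sees shared content plus its own content but only for products it owns, can be sketched as a visibility check applied before keyword matching. This is a simplified illustration, not Ariba's implementation; the `Doc` and `Customer` types, the field names and the sample data are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Doc:
    title: str
    product: str                 # product the document describes
    owner: Optional[str] = None  # None means shared with every customer


@dataclass(frozen=True)
class Customer:
    name: str
    products: frozenset          # products this customer has bought


def visible(doc, customer):
    """Apply the security tag first, then the product filter."""
    if doc.owner is not None and doc.owner != customer.name:
        return False
    return doc.product in customer.products


def secure_search(docs, customer, keyword):
    """Return only the visible documents whose title contains the keyword."""
    keyword = keyword.lower()
    return [d for d in docs if visible(d, customer) and keyword in d.title.lower()]


# Hypothetical sample data.
docs = [
    Doc("Benchmark report for Acme", "sourcing", owner="Acme"),
    Doc("Benchmark report for Globex", "sourcing", owner="Globex"),
    Doc("Sourcing benchmark methodology", "sourcing"),
    Doc("Contracts benchmark methodology", "contracts"),
]
acme = Customer("Acme", frozenset({"sourcing"}))

for doc in secure_search(docs, acme, "benchmark"):
    print(doc.title)
```

The same check serves both purposes named in the quote: the owner test is the security rule, and the product test keeps results uncluttered by content for products the customer does not own.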

“I am amazed at the whole design of the engine itself. It really is powerful. Thunderstone doesn't get enough credit, and I've never quite understood why. When it comes to customizing the database, customizing the indexing and how that works, and customizing the user interface -- those three things -- I have not seen any of Thunderstone's competitors demonstrate the ability to hit all three of those the way Thunderstone does and with the depth that Thunderstone does,” said Matthews.

Starting with a flexible and easily customized search solution from Thunderstone Software in one area of the business, Ariba has been able to deploy an enterprise-wide, search-based knowledge management system by extending it application by application, incorporating the specific knowledge from each part of the organization, all using the same core skill sets.

Webinator as an indexing and retrieval tool for creating vertical search portals on network hubs

September 21, 2007

Ecological Internet (EI) maintains up-to-date climate, forests and environment portals that serve more than 35,000 visitors a day. By implementing Thunderstone's Webinator, EI enables its website users to search the indexed content of five million URLs and quickly retrieve the desired information.

Why Ecological Internet?

Having earned his B.A. degree in Political Science at Marquette University, Glen R. Barry joined the Peace Corps and went to Papua New Guinea -- where he fell in love with the rainforest while witnessing the tragedy of its extensive destruction for the sake of making cardboard boxes and similar products.

According to him, “During my Peace Corps service in Papua New Guinea from about 1990 I became an early adopter of the Internet and began looking seriously at how networking technologies could be used to facilitate environmental conservation. In the early days of the Internet I was struck by the fact that communication between people anywhere in the world could be used to spread information that would lead to better resource management decisions and better conservation decisions.”

After returning from the Peace Corps he completed an M.S. degree in Conservation Biology and Sustainable Development, as well as a Ph.D. in Land Resources, both from the University of Wisconsin-Madison. His primary research revolved around the creation and maintenance of environmental web portals such as -- which became one of the first 10,000 web sites on the Internet. Dr. Barry's Ph.D. dissertation was entitled Global Forests and the Internet: Assessing the Reach and Usefulness of the Forest Conservation Portal.

In 1999 he decided to add search capabilities to, while also launching a climate site and an environmental sustainability site.

Customized Search Engine for Web Sites

Dr. Barry explained, “We wanted to be able to make our own customized search engine. We preferred an off-the-shelf solution that we could easily install to crawl, index, search and retrieve content from more than 4,000 reviewed scientific-content sites of interest to our target audience of conservation professionals. I remember searching on the Internet and finding a huge list of spidering and robot software that had about a hundred products on it. A lot of them were ‘open source,’ with little snippets of code. I was more concerned with having a fully implemented product that does what you need it to do. I wasn't interested in doing an open source sort of thing. Where do you go for technical support in those situations? Going through the list, most of them weren't fully implemented packages. Many of them were free, but the amount of time that a small organization would need to spend getting them operational would have offset any cost benefits. There were a few other options, but they were going to be much more expensive than Webinator.

“At that time our entire budget was like fifteen thousand dollars a year (even now it's only about seventy-five thousand dollars a year in mostly $25 - $100 donations.) So, we're a really small organization. We chose Webinator. I think our initial license with Thunderstone was eight thousand dollars, which was a major purchase for us. It was a big deal. We were trying to do something that hadn't been done before. We had a vision that we wanted to create a specialized search engine on forests content, on climate change content and on water conservation content. The whole purchasing and installation process was straightforward. And Webinator was very, very stable. It just ran. I'm using it on a Windows platform. My operating system is Windows.

“We wanted to walk about four thousand sites we were feeding, and then we also wanted to do off-site pages. Here's where I think customized search is so good. Not only are we getting the content of the reviewed four thousand sites that I as a scientist have identified, but also each of those sites has links to other sites that are included in our index. So, you have some synergy where you find unexpected things at other good sites. Webinator is a really well thought-out product that has a lot of different tools built into it. It's a full-functioning web indexing and retrieval package. You can even include or exclude specified external links. For instance, we don't want Greenpeace's online store and merchandise in our search engine.”
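The crawl policy Dr. Barry describes (index the reviewed sites, follow their links to off-site pages, and exclude specific areas such as an online store) can be sketched as a single decision function. This sketch does not show Webinator's actual settings, which the article does not list; the host names, the excluded prefix, the depth limit and the function name are all hypothetical.

```python
from urllib.parse import urlparse

# Hypothetical configuration; the real index covered about four thousand sites.
REVIEWED_HOSTS = {"forests.example.org"}
EXCLUDED_PREFIXES = ["https://ngo.example.org/store/"]
MAX_OFFSITE_DEPTH = 1  # how many links away from a reviewed site to follow


def should_index(url, offsite_depth):
    """Decide whether a crawler should index a URL.

    offsite_depth is the number of links followed since leaving a reviewed
    site (0 for pages on a reviewed site itself).
    """
    # Explicit exclusions win over everything else.
    if any(url.startswith(prefix) for prefix in EXCLUDED_PREFIXES):
        return False
    # Pages on reviewed sites are always indexed.
    if urlparse(url).hostname in REVIEWED_HOSTS:
        return True
    # Off-site pages are indexed only when close to a reviewed site.
    return offsite_depth <= MAX_OFFSITE_DEPTH


print(should_index("https://forests.example.org/report.html", 0))
print(should_index("https://ngo.example.org/news/logging.html", 1))
print(should_index("https://ngo.example.org/store/tshirt", 1))
```

Checking exclusions first means a store page is skipped even when a reviewed site links to it directly, which matches the example given in the quote.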

Network “Hubs” to Support Environmental Professionals

Ecological Internet (EI) does not directly focus on the general audience that's looking for fluffy pictures of panda bears. There are other web sites that do that very well. EI's target audience is primarily conservation professionals who need information retrieval tools and who seek useful data to factually support their own work. These people tend to be already highly motivated on the issues, and what they get from Ecological Internet are practical tools to do their work better.

Dr. Barry had been employed in the biology department at the University of Wisconsin as their ‘bioinformatics person’ until he left several years ago to run Denmark, Wisconsin-based Ecological Internet, Inc. on a full-time basis.

“There's a whole branch of science, network science, that over the last decade has studied how diseases spread or how the Internet's organized in a ‘hub’ design comprised of nodes with disproportionately high numbers of links to them. It's like the whole Kevin Bacon ‘six degrees of separation.’ We're all networked, and there are hubs. The Internet is a good demonstration of a lot of these networks. What we tried to do with Ecological Internet was to make a network hub on climate change, a network hub on forests, etc. where all of the best content is linked, indexed and made available in support of intelligent activities to protect the environment. Part of this is awareness, but it's awareness with a purpose to actually achieving something. There is reason to be hopeful. The forces of ignorance and corruption are ominous, but we have new tools - like Webinator - that we've never had before,” said Dr. Barry.

He continued, “I went up there to Thunderstone's headquarters in Cleveland, Ohio to participate in a Webinator training program two years ago. I had already been using the product for six years. During this whole time I think that the Thunderstone Software team has always been very responsive. I don't know of any other comparable product that brings full-text customized search to non-profits at a reasonable price. We wholeheartedly support Thunderstone and would recommend the Webinator search platform highly.”

Ecological Internet (EI) now maintains up-to-date climate, forests and environment portals that serve more than 35,000 visitors a day. By implementing Webinator, EI enables its website users to search the indexed content of five million URLs and quickly retrieve the desired information.

The nonprofit's conservation portals currently include:

EcoEarth.Info
My.EchoEarth.Info