Note: This documentation is for an old version of Webinator. The latest documentation is here.

XML Elements in Search Results

 

Search results can be sent as XML from Webinator to the host server. This section describes the XML elements. The elements are listed below in the approximate order that they are sent.

  • <?xml version="1.0"?> The version of this XML.

  • <ThunderstoneResults> Root element that encloses all results.

  • <Query> Main text search string that was submitted by user.

  • <TitleQuery> User's title query.

  • <UrlQuery> User's URL query.

  • <DepthQuery> Maximum depth specified by the user.

  • <CategoryQuery> User's category query.

  • <ModifiedDateLessThan> The mdlt query used, if any.

  • <ModifiedDateGreaterThan> The mdgt query used, if any.

  • <UrlRoot> The URL root of the search script.

  • <Profile> The profile used.

  • <dropXSL> If yes, removes XSL from results.

  • <AdvancedSearch> 1 if an advanced search form should be printed.

  • <Proximity> The proximity used for the search (line, sentence, paragraph, or page).

  • <Suffixes> The suffix processing applied in the search:

    • 0 - no suffix processing

    • 1 - plurals and possessives

    • 2 - all word forms

  • <Thesaurus> 1 indicates the thesaurus was used.

  • <Order> How the search was ordered.

    • r - by rank

    • dd - by date, descending

    • da - by date, ascending

  • <RankOrder> The ranking weight of word order, from 0-1000

  • <RankProximity> The ranking weight of query word proximity, from 0-1000

  • <rankDatabaseFrequency> The ranking weight of rarity of words in the database, from 0-1000

  • <RankDocumentFrequency> The ranking weight of frequency of words in the document, from 0-1000

  • <RankPosition> The ranking weight of position in the document, from 0-1000

  • <RankDepth> The ranking weight of depth of the document, from 0-1000

  • <mode> Set to admin if this is an admin search.

  • <opts> Internal use only.

  • <metasearchTarget> Indicates what backend metasearch targets are available, one element for each target. Currently selected targets will have a selected="selected" attribute.

  • <AdminUrl> The URL to the admin version of the search interface

  • <MakeLiveUrl> The URL to make this look and feel live

  • <authUser> The user that was authenticated via the Proxy Module.

  • <Category> Information about what categories are available. Occurs multiple times.

    • <CatVisible> Set to Y if the category should be selectable in the list of categories.

    • <CatSel> Set to Y if this category is currently selected.

    • <CatVal> The numeric ID associated with this category, used for the select box.

    • <CatName> The displayed name for this category.

  • <TopBestBets> Contains information on the BestBets that display above the results.

    • <BBTitle> The title for this section of BestBets

    • <BestBet> The list of BestBets in this group

      • <BBResultNum> The ordered numbering for this BestBet, starting at 1.

      • <BBPriority> The priority for this BestBet, as assigned in the admin interface. The BestBets are already sent in priority order.

      • <BBLink> The URL that this BestBet links to.

      • <BBLinkDisplay> The URL that displays for this BestBet. Long URLs are intelligently truncated for display.

      • <BBResult> The link text for this individual BestBet, as assigned in the admin interface.

      • <BBDescription> The description for this individual BestBet, as assigned in the admin interface.

      • <BBGroupname> The name of the BestBet group this BestBet belongs to.

      • <BBGroupid> The id of the BestBet group this BestBet belongs to.

      • <BBKeywords> The keywords that trigger this BestBet record to display. This lists all keywords for this individual record, not just the one that triggered it for the current query.

  • <ProfileInfo> Encloses some profile summary info. Child elements include:

    • <Profile> The profile to which the <ProfileInfo> element refers.

    • <ExitIsEarly> Y if the search was aborted early (<UserResultsNum> may be short), N if not.

    • <ExitReason> ok if the search finished, otherwise a token indicating the reason (see the table of reasons below).

    • <RedirectUrl> If present, the external URL to redirect the user to, e.g. the external Login URL for Results Authorization, to get third-party login cookies.

    • <LoginUrl> If present, local URL to login with rauser/rapass for Results Authorization.

  • <Summary> Encloses search results summary; only sent if a query was actually performed. Child elements include:

    • <Profile> The profile that the <Summary> element applies to.

    • <Start> First result item to list.

    • <End> Last result item to list.

    • <TotalNum> Total number of result items found, before ResAuth etc.

    • <TotalIsEstimate> Y if <TotalNum> is an estimate, N if not.

    • <TotalIsShort> Y if <TotalNum> is short (e.g. early exit), N if not.

    • <UserResultsNum> Total number of result items found, after ResAuth etc.

    • <UserResultsIsEstimate> Y if <UserResultsNum> is an estimate, N if not.

    • <UserResultsIsShort> Y if <UserResultsNum> is short (eg. early exit), N if not.

    • <ResultsAuthorization> Y if Results Authorization used for query, N if not.

    • <Total> Readable text for total number of results, after ResAuth etc.

    • <CurOrder> Text that describes the order by which results are listed.

    • <OrderLink> Link to the same results list in an alternative sort order.

    • <OrderType> Text that describes <OrderLink>.

    • <NewSkip> (Metasearch only) The skip value to use for any further request. Only needed with the SOAP API.

    • <PreviousLink> Link to the previous page of results (current page minus 1).

    • <FirstPage> 1 if this is the first page of results, 0 if later page.

    • <Pages> Tag that groups tags for a specific page of results. Child elements include:

      • <PageLink> Link to a certain page of the results.

      • <PageNumber> Page number of that page of results.

    • <NextLink> Link to the next page of results (current page plus 1).

    • <LastPage> 1 if this is the last page of results, 0 if earlier page.

    • <Credit> Text to introduce credit image.

    • <CreditImage> The URL of credit image.

  • <Result> Tag that contains all elements for a given result. Child elements include:

    • <Profile> Name of the profile for this <Result>. Note that results from meta-search back-ends are re-labeled to the front-end profile.

    • <Num> Number of this result item.

    • <Skip> Internal use: raw skip(s) for result. Valid for Meta Search back-ends.

    • <Id> Identifier for this result item.

    • <ResultTitle> Title of the page of this result item.

    • <Url> The URL of the page for this result item.

    • <ClickUrl> The URL for this result item that the user should click. If not present, the default is <Url>. Only sent if Query Logging is enabled, in which case it contains a redirect for logging the click-through.

    • <UrlPDFHi> The URL to highlight this PDF item in user's Acrobat Viewer.

    • <UrlDisplay> Displayed URL for this result item.

    • <RawRank> The raw relevance rank value for this result item (0-1000).

    • <ScaledRank> Raw rank scaled up for a more-like-this search (0-1000).

    • <PercentRank> ScaledRank as a percentage (0-100).

    • <DocSize> Size (bytes) of the page of this result item.

    • <Depth> Number of links walked from Base URL to this URL.

    • <UrlSimilar> The URL to search for pages similar to this result item.

    • <UrlInfo> The URL for context of answers within a matching document.

    • <UrlParents> The URL of pages that link to the page of this search result item.

    • <Modified> Date and time that the page of this result item was last modified.

    • <Visited> Date and time that the page of this result item was crawled.

    • <Abstract> Brief text surrounding the matched word or phrase.

    • <Charset> Character set of the formatted text of the page (typically Storage Charset unless conversion failure).
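As a rough illustration of how a host application might consume these elements, the following Python sketch parses a small hand-written sample document with the standard library's xml.etree.ElementTree module. The sample XML, its values and URLs, and the helper name parse_results are assumptions made for this example only; real Webinator output contains many more elements and may differ in detail.

```python
import xml.etree.ElementTree as ET

# Hand-written sample for illustration only; not captured from a real server.
SAMPLE = """<?xml version="1.0"?>
<ThunderstoneResults>
  <Query>widgets</Query>
  <Summary>
    <Start>1</Start>
    <End>2</End>
    <TotalNum>2</TotalNum>
    <TotalIsEstimate>N</TotalIsEstimate>
  </Summary>
  <Result>
    <Num>1</Num>
    <ResultTitle>Widget Overview</ResultTitle>
    <Url>http://www.example.com/widgets.html</Url>
    <ClickUrl>http://www.example.com/click?u=widgets.html</ClickUrl>
    <PercentRank>87</PercentRank>
  </Result>
  <Result>
    <Num>2</Num>
    <ResultTitle>Widget FAQ</ResultTitle>
    <Url>http://www.example.com/faq.html</Url>
    <PercentRank>54</PercentRank>
  </Result>
</ThunderstoneResults>
"""


def parse_results(xml_text):
    """Hypothetical helper: pull the query, total, and result items out of the XML."""
    root = ET.fromstring(xml_text)

    # <Summary> is only sent if a query was actually performed.
    summary = root.find("Summary")
    total = None
    if summary is not None:
        total = int(summary.findtext("TotalNum", default="0"))

    results = []
    for item in root.findall("Result"):
        url = item.findtext("Url")
        results.append({
            "num": int(item.findtext("Num")),
            "title": item.findtext("ResultTitle"),
            "url": url,
            # <ClickUrl> is optional; the documented default is <Url>.
            "click_url": item.findtext("ClickUrl", default=url),
        })

    return {"query": root.findtext("Query"), "total": total, "results": results}


parsed = parse_results(SAMPLE)
for hit in parsed["results"]:
    print(hit["num"], hit["title"], hit["click_url"])
```

The fallback from <ClickUrl> to <Url> follows the description of <ClickUrl> above; everything else about the data layout is a simplification.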

The following table lists the possible value tokens for the <ExitReason> element:

 

Token                          Description
ok                             Normal exit
ResAuth-ExternalLoginRequired  Need Login Cookies: redirect to <RedirectUrl>
ResAuth-CredentialsRequired    Need user/pass: send rauser/rapass to <LoginUrl>
ResAuth-LoginIncorrect         User/pass incorrect; re-send to <LoginUrl>
ResAuth-SuccessLimit           Successful Auth Result Limit reached
ResAuth-Timeout                Results Authorization timeout
ResAuth-MaxDocsCheck           Max Docs to Auth-Check exceeded
ResAuth-SmbError               SMB error
ResAuth-NoSmb                  SMB unavailable/could not be run

Table 6.7: XML <ExitReason> Tokens
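
One way a host application might act on these tokens is sketched below in Python. The function name next_action, the action labels it returns, the sample URL, and the choice to treat a missing <ExitReason> as ok are all assumptions made for illustration; only the token names and the <RedirectUrl>/<LoginUrl> elements come from this section.

```python
import xml.etree.ElementTree as ET

# Tokens from the table for which the user must (re-)send rauser/rapass to <LoginUrl>.
LOGIN_TOKENS = {"ResAuth-CredentialsRequired", "ResAuth-LoginIncorrect"}


def next_action(profile_info):
    """Hypothetical helper: map a <ProfileInfo> element to (action, detail)."""
    # Assumption: treat an absent <ExitReason> the same as "ok".
    reason = profile_info.findtext("ExitReason", default="ok")
    if reason == "ok":
        return ("show_results", None)
    if reason == "ResAuth-ExternalLoginRequired":
        return ("redirect", profile_info.findtext("RedirectUrl"))
    if reason in LOGIN_TOKENS:
        return ("prompt_login", profile_info.findtext("LoginUrl"))
    # Remaining tokens (limits, timeout, SMB problems): results may be short.
    return ("show_partial", reason)


# Hand-written sample; the URL is a placeholder, not a real Webinator path.
SAMPLE = """<ProfileInfo>
  <Profile>default</Profile>
  <ExitIsEarly>Y</ExitIsEarly>
  <ExitReason>ResAuth-CredentialsRequired</ExitReason>
  <LoginUrl>http://www.example.com/login</LoginUrl>
</ProfileInfo>"""

action = next_action(ET.fromstring(SAMPLE))
print(action)
```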


Copyright © Thunderstone Software     Last updated: Thu Dec 22 14:38:01 EST 2011