  
	
  
  
	
	  Recognized META information in HTML documents
	
	
	  ht://Dig Copyright © 1995-2004 The ht://Dig Group.
	  Please see the file COPYING for license information.
	
	
	
	  Introduction
	
	
	  Because the ht://Dig system will index all HTML pages on a
	  system, individual authors of pages may want to control some
	  aspects of the indexing operation. To this end, ht://Dig
	  recognizes some special <META> tag properties. The following
	  things can be controlled in this manner:
	
	
	  
		Do not index the document
	  
	  
		Notify a user that the document has expired
	  
	  
		Set keywords for the document
	  
	
	
	
	  General <META> tag use
	
	
	  In HTML, any number of <META> tags can be used between
	  the <HEAD> and </HEAD> tags of a document. There
	  are three possible attributes in this tag, two of which are
	  recognized by ht://Dig:
	
	
	  
		NAME
	  
	  
		Used to name a specific property.
	  
	  
		CONTENT
	  
	  
		Used to supply the value for a named property.
	  
	
	
	  A document could start with something like the following:
	
	
	  <HTML>
	  <HEAD>
	  <META NAME="htdig-keywords" CONTENT="phone telephone
	  online electronic directory">
	  <META NAME="htdig-email"
	  CONTENT="pat.user@nowhere.net">
	  <TITLE>Some document title</TITLE>
	  </HEAD>
	  <BODY>
	  
		
	  Body of document
	  </BODY>
	  </HTML>
	
	
	
	  Recognized properties
	
	
	  The following properties are recognized by ht://Dig:
	
	
	  
		htdig-keywords
	  
	  
		htdig-noindex
	  
	  
		htdig-email
	  
	  
		htdig-notification-date
	  
	  
		htdig-email-subject
	  
	  
		robots
	  
	  
		keywords
	  
	  
		description
	  
	  
		author
	  
	
	
	  Detailed information about the htdig-email,
	  htdig-notification-date, and htdig-email-subject properties
	  can be found in the Email notification service document
	  (notification.html).
	
	
	  Descriptions of the properties and their values:
	
	
	  
		
		htdig-keywords
	  
		The value of this property should be a blank-separated list
		of keywords, which will get a very high weight when
		searching. This can be used to get around some problems
		with common synonyms for words in the document. For
		example, if a document is a telephone directory, possible
		keywords could be "telephone phone directory book list".
		Now, regardless of what text is actually in the document,
		it can be found if these keywords are used in the search.
		The weight that words in the content string will have in
		search results is controlled by the keywords_factor
		attribute (attrs.html#keywords_factor) in your configuration.
	  
	  
		
		htdig-noindex
	  
		This property has no value associated with it. If it is
		used, the document will NOT be included in any searches.
		Example uses of this could be:
		
		  
			A document which is dynamic, i.e. its contents change
			continually.
		  
		  
			A temporary document that is not officially available
			yet.
		  
		  
			A document you just don't want to be found.
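
		A minimal sketch of how a page might carry this property.
		Since the text above says the property has no value, the tag
		is written here without a CONTENT attribute. The title is a
		placeholder chosen for illustration.

```html
<HEAD>
<!-- Exclude this page from the ht://Dig index; no CONTENT value is given -->
<META NAME="htdig-noindex">
<TITLE>Draft page, not yet published</TITLE>
</HEAD>
```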
		  
		
	  
	  
		
		htdig-email
	  
		The value is the email address a notification message
		should be sent to. Multiple email addresses can be given by
		separating them with commas. If no email address is given, no
		notification will be sent. (Please check the Email
		notification service documentation, notification.html, for
		more details on this.)
	  
	  
		
		htdig-notification-date
	  
		The value is the date on or after which the notification
		should be sent. The format is simply month/day/year, or,
		if the iso_8601 attribute (attrs.html#iso_8601) is set,
		year-month-day. Make sure that the year has the century
		with it as well. This means that you should use 1995
		instead of 95. If no date is given, no notification will be
		sent. (Please check the Email notification service
		documentation, notification.html, for more details on this.)
	  
	  
		
		htdig-email-subject
	  
		The value specifies the subject of the notification message.
		This is an optional property. (Please check the Email
		notification service documentation, notification.html, for
		more details on this.)
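
		As a rough illustration, the three notification properties
		might appear together as in the sketch below. The second
		email address, the date, and the subject text are
		hypothetical values chosen for the example. The date uses
		the month/day/year format with a four-digit year.

```html
<HEAD>
<!-- Two recipients, separated by a comma (second address is hypothetical) -->
<META NAME="htdig-email"
CONTENT="pat.user@nowhere.net, webmaster@nowhere.net">
<!-- month/day/year, with the century included in the year -->
<META NAME="htdig-notification-date" CONTENT="12/31/2004">
<!-- Optional subject line for the notification message -->
<META NAME="htdig-email-subject" CONTENT="Price list needs review">
<TITLE>Some document title</TITLE>
</HEAD>
```

		If the iso_8601 attribute is set in the configuration, the
		same date would presumably be written as 2004-12-31.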
	  
	  
	        
		robots
	  
	  The value specifies restrictions on robots (including ht://Dig)
	  for the current page. These restrictions can be "noindex" to
	  prevent indexing the document while still allowing the robot to
	  follow links from the page, "nofollow" to allow indexing but
	  prevent links from being followed, or "none" to prevent
	  both. Additionally, ht://Dig supports the values "index",
	  "follow", and "all", which are the opposites of the other
	  values and are the default behavior. For more information on
	  META robots tags, check out the HTML Author's Guide to the
	  Robots META tag (http://www.robotstxt.org/wc/meta-user.html).
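
	  The values described above could be used as in the following
	  sketch. Only one of these tags would normally appear in a
	  given page; they are shown together purely for comparison.

```html
<!-- Follow the links on this page, but do not index the page itself -->
<META NAME="robots" CONTENT="noindex">

<!-- Index this page, but do not follow its links -->
<META NAME="robots" CONTENT="nofollow">

<!-- Neither index this page nor follow its links -->
<META NAME="robots" CONTENT="none">
```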
	  
	  
		
		keywords
	  
		The value of this property should be a blank-separated list
		of keywords, just as for the htdig-keywords property.
		They are treated as equivalent by htdig. The reason for
		two different properties is that the keywords property
		is used by other search engines as well, while the
		htdig-keywords property can be used for words you want
		indexed only by htdig. You can get htdig to treat other
		property names as equivalent to htdig-keywords, or disable
		the htdig-keywords or keywords properties, by changing the
		keywords_meta_tag_names attribute
		(attrs.html#keywords_meta_tag_names) in your configuration.
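
		One way the two properties might be combined in a single
		page is sketched below. The keyword lists are illustrative
		only, and the sketch assumes keywords_meta_tag_names has not
		been changed to disable either property.

```html
<!-- Read by htdig, and possibly by other search engines as well -->
<META NAME="keywords" CONTENT="telephone phone directory">
<!-- Extra words meant to be indexed by htdig only -->
<META NAME="htdig-keywords" CONTENT="extension staff listing">
```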
	  
	  
		
		description
	  
	  The value allows you to specify an alternate excerpt
	  (description) of a page. If the config-file attribute
	  use_meta_description (attrs.html#use_meta_description) is
	  set, then any documents with descriptions will use them
	  instead of the automatically generated excerpts.
	  The weight that words in the content string will have in
	  search results is controlled by the meta_description_factor
	  attribute (attrs.html#meta_description_factor) in your
	  configuration.
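
	  For instance, a page could supply its own excerpt as in this
	  sketch. The description text is made up for the example, and
	  it would only replace the generated excerpt when
	  use_meta_description is set.

```html
<!-- Alternate excerpt for search results (text is illustrative) -->
<META NAME="description"
CONTENT="Staff telephone directory, listed by name and department.">
```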
	  
	  
		
		author
	  
	  The value specifies the name, email address and/or affiliation
	  of the creator or authoriser of a page.
	  The weight that words in the content string will have in
	  search results is controlled by the author_factor attribute
	  (attrs.html#author_factor) in your configuration.
	  A search for "author:name" will look only in these fields
	  for the word name.
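
	  A small sketch of an author tag follows. The name and address
	  are placeholders, reusing the address from the earlier
	  example in this document.

```html
<!-- Name and email address of the page's creator (placeholder values) -->
<META NAME="author" CONTENT="Pat User, pat.user@nowhere.net">
```

	  With a tag like this in place, a search for "author:Pat"
	  would be expected to look for the word Pat in this field only.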
	  
	
	
	Last modified: $Date: 2004/05/28 13:15:19 $
  
