  
	
  
  
	
	  htdump
	
	
	  ht://Dig Copyright © 1995-2004 The ht://Dig Group.
	  Please see the file COPYING for license information.
	
	
	
	  
		
		  Synopsis
		
	  
	  
		htdump [options]
	  
	
	
	  
		
		  Description
		
	  
	  
		Htdump writes out an ASCII-text version of the document and word 
		databases in the same form as the -t option of htdig.
	  
	
	
	  
		
		  Options
		
	  
	  
		
		  
			-a
		  
		  
			Use alternate work files. Tells htdump to append .work
			to database files, allowing it to operate on a second
			set of databases.
		  
		  
			-c configfile
		  
			Use the specified configfile instead of the default.
		  
		  
			-d
		  
		  
			Do not dump the document database.
		  
		  
			-v
		  
		  
			Verbose mode. This doesn't have much effect.
		  
		  
			-w
		  
		  
			Do not dump the word database.
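
The options above can be combined on one command line. The following is a minimal sketch, not part of ht://Dig, that assembles an htdump argument list from the documented flags. The function name, its parameters, and the configuration path in the usage line are hypothetical and only serve as illustration.

```python
from typing import List, Optional


def build_htdump_command(configfile: Optional[str] = None,
                         alternate: bool = False,
                         skip_documents: bool = False,
                         skip_words: bool = False,
                         verbose: bool = False) -> List[str]:
    """Assemble an htdump argument list from the documented options."""
    cmd = ["htdump"]
    if alternate:
        cmd.append("-a")             # operate on the .work databases
    if configfile is not None:
        cmd += ["-c", configfile]    # use this file instead of the default
    if skip_documents:
        cmd.append("-d")             # do not dump the document database
    if skip_words:
        cmd.append("-w")             # do not dump the word database
    if verbose:
        cmd.append("-v")             # verbose mode
    return cmd


# Hypothetical path; dumps only the document database.
print(build_htdump_command(configfile="/etc/htdig/htdig.conf", skip_words=True))
```

The resulting list could be passed to subprocess.run on a system where htdump is installed.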
		  
		
	  
	
	
	  
		
		  File Formats
		
	  
	  
	  
	       
Document Database
          
	  
		
		Each line in the file starts with the document id, followed by
		a tab-separated list of fieldname:value pairs. The fields
		always appear in the order listed below:
		
		
		
		fieldname  value
		u          URL
		t          Title
		a          State (0 = normal, 1 = not found, 2 = not indexed, 3 = obsolete)
		m          Last modification time as reported by the server
		s          Size in bytes
		H          Excerpt
		h          Meta description
		l          Time of last retrieval
		L          Count of the links in the document (outgoing links)
		b          Count of the links to the document (incoming links or backlinks)
		c          HopCount of this document
		g          Signature of the document used for duplicate-detection
		e          E-mail address to use for a notification message from htnotify
		n          Date to send out a notification e-mail message
		S          Subject for a notification e-mail message
		d          The text of links pointing to this document (e.g. <a href="docURL">description</a>)
		A          Anchors in the document (i.e. <A NAME=...)
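
As an illustration of the line layout described above, here is a minimal sketch of a parser for one document database line. It assumes that the document id is separated from the first field by a tab and that values contain no raw tab characters; neither assumption is stated in this page. The function name and the sample line are hypothetical.

```python
from typing import Dict, Tuple


def parse_doc_line(line: str) -> Tuple[str, Dict[str, str]]:
    """Split one document database line into (document id, fields)."""
    parts = line.rstrip("\n").split("\t")
    doc_id = parts[0]
    fields: Dict[str, str] = {}
    for part in parts[1:]:
        # Split on the first ':' only, so URLs keep their own colons.
        name, sep, value = part.partition(":")
        if sep:
            fields[name] = value
    return doc_id, fields


# Hypothetical sample line, not taken from a real dump.
sample = "7\tu:http://example.com/\tt:Example page\ta:0\ts:1234"
doc_id, fields = parse_doc_line(sample)
print(doc_id, fields["u"], fields["t"])
```

Splitting on the first colon only matters for the u field, whose value is a URL and therefore contains colons itself.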
		
	  
	  
	       
Word Database
	  
	  
	  
	  The first line of the ASCII word database is a comment,
	  prefixed with '#', that specifies the columns of the file,
	  separated by tabs.
	  The fields are:
	  
	  
word	  
document id	  
flags	  
location	  
anchor	  
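
A minimal sketch of reading the word database follows. It assumes every non-comment line holds exactly the five tab-separated fields listed above and that the four fields after the word are decimal integers; the latter is an assumption, not something this page states. The names WordEntry and parse_word_dump are hypothetical.

```python
from typing import Iterable, Iterator, NamedTuple


class WordEntry(NamedTuple):
    word: str
    doc_id: int
    flags: int
    location: int
    anchor: int


def parse_word_dump(lines: Iterable[str]) -> Iterator[WordEntry]:
    """Yield one WordEntry per data line, skipping '#' comments and blanks."""
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        word, doc_id, flags, location, anchor = line.split("\t")
        yield WordEntry(word, int(doc_id), int(flags), int(location), int(anchor))


# Hypothetical sample content, not taken from a real dump.
sample = [
    "#word\tdocument id\tflags\tlocation\tanchor\n",
    "example\t7\t0\t12\t0\n",
]
for entry in parse_word_dump(sample):
    print(entry.word, entry.doc_id, entry.location)
```

A malformed line raises ValueError in this sketch, which makes unexpected input visible rather than silently skipping it.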
	  
	  
	  
	  
	
	
	  
		
		  Files
		
	  
	  
		
		  
			
CONFIG_DIR/htdig.conf
		  
		  
			The default configuration file.
		  
		  
		       
DATABASE_DIR/db.docs
		  
		  
		       The default ASCII document database file.
		  
		  
		       
DATABASE_DIR/db.worddump
		  
		  
		       The default ASCII word database file.
		  
		
	  
	
	
	  
		
		  See Also
		
	  
	  
		
		htdig, htload and Configuration file format
	
	
	Last modified: $Date: 2004/06/12 13:39:13 $
  
