Module encoding from libxml2
Interface for the encoding conversion functions needed for XML basic encoding and iconv() support.
Related specs are:
rfc2044: (UTF-8 and UTF-16) F. Yergeau, Alis Technologies
[ISO-10646]: UTF-8 and UTF-16 in Annexes
[ISO-8859-1]: ISO Latin-1 characters codes.
[UNICODE]: The Unicode Consortium, "The Unicode Standard -- Worldwide Character Encoding -- Version 1.0", Addison-Wesley, Volume 1, 1991, Volume 2, 1992. UTF-8 is described in Unicode Technical Report #4.
[US-ASCII]: Coded Character Set--7-bit American Standard Code for Information Interchange, ANSI X3.4-1986.
Table of Contents
Enum xmlCharEncoding
Structure xmlCharEncodingHandler
struct _xmlCharEncodingHandler
Typedef xmlCharEncodingHandler * xmlCharEncodingHandlerPtr
int UTF8Toisolat1 (unsigned char * out, int * outlen, const unsigned char * in, int * inlen)
int isolat1ToUTF8 (unsigned char * out, int * outlen, const unsigned char * in, int * inlen)
int xmlAddEncodingAlias (const char * name, const char * alias)
int xmlCharEncCloseFunc (xmlCharEncodingHandler * handler)
int xmlCharEncFirstLine (xmlCharEncodingHandler * handler, xmlBufferPtr out, xmlBufferPtr in)
int xmlCharEncInFunc (xmlCharEncodingHandler * handler, xmlBufferPtr out, xmlBufferPtr in)
int xmlCharEncOutFunc (xmlCharEncodingHandler * handler, xmlBufferPtr out, xmlBufferPtr in)
Function type: xmlCharEncodingInputFunc
int xmlCharEncodingInputFunc (unsigned char * out, int * outlen, const unsigned char * in, int * inlen)
Function type: xmlCharEncodingOutputFunc
int xmlCharEncodingOutputFunc (unsigned char * out, int * outlen, const unsigned char * in, int * inlen)
void xmlCleanupCharEncodingHandlers (void)
void xmlCleanupEncodingAliases (void)
int xmlDelEncodingAlias (const char * alias)
xmlCharEncoding xmlDetectCharEncoding (const unsigned char * in, int len)
xmlCharEncodingHandlerPtr xmlFindCharEncodingHandler (const char * name)
xmlCharEncodingHandlerPtr xmlGetCharEncodingHandler (xmlCharEncoding enc)
const char * xmlGetCharEncodingName (xmlCharEncoding enc)
const char * xmlGetEncodingAlias (const char * alias)
void xmlInitCharEncodingHandlers (void)
xmlCharEncodingHandlerPtr xmlNewCharEncodingHandler (const char * name, xmlCharEncodingInputFunc input, xmlCharEncodingOutputFunc output)
xmlCharEncoding xmlParseCharEncoding (const char * name)
void xmlRegisterCharEncodingHandler (xmlCharEncodingHandlerPtr handler)
Description
Enum xmlCharEncoding
Enum xmlCharEncoding {
    XML_CHAR_ENCODING_ERROR = -1 : No char encoding detected
    XML_CHAR_ENCODING_NONE = 0 : No char encoding detected
    XML_CHAR_ENCODING_UTF8 = 1 : UTF-8
    XML_CHAR_ENCODING_UTF16LE = 2 : UTF-16 little endian
    XML_CHAR_ENCODING_UTF16BE = 3 : UTF-16 big endian
    XML_CHAR_ENCODING_UCS4LE = 4 : UCS-4 little endian
    XML_CHAR_ENCODING_UCS4BE = 5 : UCS-4 big endian
    XML_CHAR_ENCODING_EBCDIC = 6 : EBCDIC uh!
    XML_CHAR_ENCODING_UCS4_2143 = 7 : UCS-4 unusual ordering
    XML_CHAR_ENCODING_UCS4_3412 = 8 : UCS-4 unusual ordering
    XML_CHAR_ENCODING_UCS2 = 9 : UCS-2
    XML_CHAR_ENCODING_8859_1 = 10 : ISO-8859-1 ISO Latin 1
    XML_CHAR_ENCODING_8859_2 = 11 : ISO-8859-2 ISO Latin 2
    XML_CHAR_ENCODING_8859_3 = 12 : ISO-8859-3
    XML_CHAR_ENCODING_8859_4 = 13 : ISO-8859-4
    XML_CHAR_ENCODING_8859_5 = 14 : ISO-8859-5
    XML_CHAR_ENCODING_8859_6 = 15 : ISO-8859-6
    XML_CHAR_ENCODING_8859_7 = 16 : ISO-8859-7
    XML_CHAR_ENCODING_8859_8 = 17 : ISO-8859-8
    XML_CHAR_ENCODING_8859_9 = 18 : ISO-8859-9
    XML_CHAR_ENCODING_2022_JP = 19 : ISO-2022-JP
    XML_CHAR_ENCODING_SHIFT_JIS = 20 : Shift_JIS
    XML_CHAR_ENCODING_EUC_JP = 21 : EUC-JP
    XML_CHAR_ENCODING_ASCII = 22 : pure ASCII
}
Structure xmlCharEncodingHandler
struct _xmlCharEncodingHandler {
    char * name
    xmlCharEncodingInputFunc input
    xmlCharEncodingOutputFunc output
    iconv_t iconv_in
    iconv_t iconv_out
}
Function: UTF8Toisolat1
int UTF8Toisolat1 (unsigned char * out, int * outlen, const unsigned char * in, int * inlen)
Take a block of UTF-8 chars in and try to convert it to an ISO Latin 1 block of chars out.
out: a pointer to an array of bytes to store the result
outlen: the length of @out
in: a pointer to an array of UTF-8 chars
inlen: the length of @in
Returns: the number of bytes written if success, -2 if the transcoding fails, or -1 otherwise. The value of @inlen after return is the number of octets consumed if the return value is positive, else unpredictable. The value of @outlen after return is the number of octets produced.
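The contract above can be illustrated with a minimal sketch. The function `utf8_to_latin1` below is a hypothetical stand-alone helper, not the libxml2 implementation; it assumes that an incomplete trailing sequence is simply left unconsumed and that anything outside U+0000..U+00FF is reported as -2.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not the libxml2 implementation): converts UTF-8 to
 * ISO Latin 1 using the return-value contract documented above. */
int utf8_to_latin1(unsigned char *out, int *outlen,
                   const unsigned char *in, int *inlen)
{
    int ipos = 0, opos = 0;

    if (out == NULL || outlen == NULL || in == NULL || inlen == NULL)
        return -1;

    while (ipos < *inlen && opos < *outlen) {
        unsigned char c = in[ipos];
        /* Only lead bytes 0xC2 and 0xC3 encode U+0080..U+00FF. */
        int is_lead = (c == 0xC2 || c == 0xC3);

        if (c < 0x80) {
            out[opos++] = c;                  /* plain ASCII */
            ipos += 1;
        } else if (is_lead && ipos + 1 >= *inlen) {
            break;                            /* truncated sequence: leave it unconsumed */
        } else if (is_lead && (in[ipos + 1] & 0xC0) == 0x80) {
            out[opos++] = (unsigned char)(((c & 0x03) << 6) | (in[ipos + 1] & 0x3F));
            ipos += 2;
        } else {
            *inlen = ipos;                    /* invalid or not representable in Latin 1 */
            *outlen = opos;
            return -2;
        }
    }
    assert(ipos <= *inlen && opos <= *outlen);
    *inlen = ipos;                            /* octets consumed */
    *outlen = opos;                           /* octets produced */
    return opos;
}
```

Calling it on the five bytes of "café" in UTF-8 yields four Latin 1 bytes, while a three-byte sequence such as the euro sign makes it return -2.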
Function: isolat1ToUTF8
int isolat1ToUTF8 (unsigned char * out, int * outlen, const unsigned char * in, int * inlen)
Take a block of ISO Latin 1 chars in and try to convert it to a UTF-8 block of chars out.
out: a pointer to an array of bytes to store the result
outlen: the length of @out
in: a pointer to an array of ISO Latin 1 chars
inlen: the length of @in
Returns: the number of bytes written if success, or -1 otherwise. The value of @inlen after return is the number of octets consumed if the return value is positive, else unpredictable. The value of @outlen after return is the number of octets produced.
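As a rough illustration of the opposite direction, the hypothetical helper `latin1_to_utf8` below (again a sketch, not the libxml2 code) expands each Latin 1 byte of 0x80 or above into a two-byte UTF-8 sequence. It assumes that conversion simply stops when the output buffer cannot hold the next character.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not the libxml2 implementation): converts ISO Latin 1
 * to UTF-8 and reports consumed/produced octets through inlen/outlen. */
int latin1_to_utf8(unsigned char *out, int *outlen,
                   const unsigned char *in, int *inlen)
{
    int ipos = 0, opos = 0;

    if (out == NULL || outlen == NULL || in == NULL || inlen == NULL)
        return -1;

    while (ipos < *inlen) {
        unsigned char c = in[ipos];
        int need = (c < 0x80) ? 1 : 2;        /* output bytes for this char */

        if (opos + need > *outlen)
            break;                            /* output full: stop cleanly */
        if (c < 0x80) {
            out[opos++] = c;
        } else {
            out[opos++] = (unsigned char)(0xC0 | (c >> 6));
            out[opos++] = (unsigned char)(0x80 | (c & 0x3F));
        }
        ipos++;
    }
    assert(ipos <= *inlen && opos <= *outlen);
    *inlen = ipos;                            /* octets consumed */
    *outlen = opos;                           /* octets produced */
    return opos;
}
```

Because every Latin 1 byte maps to a valid code point, this direction has no -2 case; sizing the output at twice the input length is always enough.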
Function: xmlAddEncodingAlias
int xmlAddEncodingAlias (const char * name, const char * alias)
Registers an alias @alias for an encoding named @name. An existing alias will be overwritten.
name: the encoding name as parsed, in UTF-8 format (ASCII actually)
alias: the alias name as parsed, in UTF-8 format (ASCII actually)
Returns: 0 in case of success, -1 in case of error
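The overwrite behaviour can be pictured with a small stand-alone alias table. Everything in this sketch (`add_alias`, `get_alias`, the fixed table sizes, and the case-insensitive matching done by upper-casing the alias) is an assumption made for illustration and is not taken from the library source.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define MAX_ALIASES 16   /* hypothetical capacity of the sketch table */
#define MAX_NAME    32   /* hypothetical maximum name length, including NUL */

struct alias_entry { char alias[MAX_NAME]; char name[MAX_NAME]; };
struct alias_entry alias_table[MAX_ALIASES];
int alias_count = 0;

/* Copy src into dst upper-cased; returns -1 if it does not fit. */
int copy_upper(char *dst, const char *src)
{
    size_t n = strlen(src);
    assert(dst != NULL);
    if (n >= MAX_NAME)
        return -1;
    for (size_t i = 0; i <= n; i++)
        dst[i] = (char)toupper((unsigned char)src[i]);
    return 0;
}

/* Register alias for name; an existing alias is overwritten. */
int add_alias(const char *name, const char *alias)
{
    char key[MAX_NAME];

    if (name == NULL || alias == NULL || strlen(name) >= MAX_NAME ||
        copy_upper(key, alias) != 0)
        return -1;
    for (int i = 0; i < alias_count; i++) {
        if (strcmp(alias_table[i].alias, key) == 0) {
            strcpy(alias_table[i].name, name);   /* overwrite existing entry */
            return 0;
        }
    }
    if (alias_count == MAX_ALIASES)
        return -1;
    strcpy(alias_table[alias_count].alias, key);
    strcpy(alias_table[alias_count].name, name);
    alias_count++;
    return 0;
}

/* Look up the encoding name registered for alias, or NULL if none. */
const char *get_alias(const char *alias)
{
    char key[MAX_NAME];

    if (alias == NULL || copy_upper(key, alias) != 0)
        return NULL;
    for (int i = 0; i < alias_count; i++)
        if (strcmp(alias_table[i].alias, key) == 0)
            return alias_table[i].name;
    return NULL;
}
```

Registering the same alias twice leaves a single entry that points at the most recent encoding name.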
Function: xmlCharEncCloseFunc
int xmlCharEncCloseFunc (xmlCharEncodingHandler * handler)
Generic front-end for the encoding handler close function.
handler: char encoding transformation data structure
Returns: 0 if success, or -1 in case of error
Function: xmlCharEncFirstLine
int xmlCharEncFirstLine (xmlCharEncodingHandler * handler, xmlBufferPtr out, xmlBufferPtr in)
Front-end for the encoding handler input function, but handles only the very first line, i.e. limits itself to 45 chars.
handler: char encoding transformation data structure
out: an xmlBuffer for the output.
in: an xmlBuffer for the input
Returns: the number of bytes written if success, -1 in case of general error, or -2 if the transcoding fails (because *in is not a valid UTF-8 string or the result of the transformation can't fit into the encoding we want).
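One way to picture the 45-char limit is a thin wrapper that clamps the input length before delegating to a conversion function. This is only a sketch under stated assumptions: `convert_first_line`, `convert_func` and `copy_bytes` are hypothetical names, the clamp is applied to the input length, and the real function works on xmlBuffer objects rather than raw arrays.

```c
#include <assert.h>
#include <stddef.h>

/* Same shape as the conversion function types documented in this module. */
typedef int (*convert_func)(unsigned char *out, int *outlen,
                            const unsigned char *in, int *inlen);

#define FIRST_LINE_LIMIT 45   /* limit quoted in the description above */

/* Hypothetical wrapper: convert at most FIRST_LINE_LIMIT input bytes. */
int convert_first_line(convert_func conv, unsigned char *out, int *outlen,
                       const unsigned char *in, int *inlen)
{
    if (conv == NULL || out == NULL || outlen == NULL ||
        in == NULL || inlen == NULL)
        return -1;
    if (*inlen > FIRST_LINE_LIMIT)
        *inlen = FIRST_LINE_LIMIT;            /* ignore everything after the limit */
    return conv(out, outlen, in, inlen);
}

/* Trivial converter used only to exercise the wrapper: copies bytes as-is. */
int copy_bytes(unsigned char *out, int *outlen,
               const unsigned char *in, int *inlen)
{
    int n = (*inlen < *outlen) ? *inlen : *outlen;

    assert(n >= 0);
    for (int i = 0; i < n; i++)
        out[i] = in[i];
    *inlen = n;                               /* octets consumed */
    *outlen = n;                              /* octets produced */
    return n;
}
```

The point of such a limit is to convert just enough of the document to read the XML declaration, whose encoding attribute may change how the rest is decoded.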
Function: xmlCharEncInFunc
int xmlCharEncInFunc (xmlCharEncodingHandler * handler, xmlBufferPtr out, xmlBufferPtr in)
Generic front-end for the encoding handler input function.
handler: char encoding transformation data structure
out: an xmlBuffer for the output.
in: an xmlBuffer for the input
Returns: the number of bytes written if success, -1 in case of general error, or -2 if the transcoding fails (because *in is not a valid UTF-8 string or the result of the transformation can't fit into the encoding we want).
Function: xmlCharEncOutFunc
int xmlCharEncOutFunc (xmlCharEncodingHandler * handler, xmlBufferPtr out, xmlBufferPtr in)
Generic front-end for the encoding handler output function. A first call with @in == NULL has to be made first to initiate the output, in case a non-stateless encoding needs to initialize its state or the output (like the BOM in UTF16). In case of UTF8 sequence conversion errors for the given encoder, the content will be automatically remapped to a CharRef sequence.
handler: char encoding transformation data structure
out: an xmlBuffer for the output.
in: an xmlBuffer for the input
Returns: the number of bytes written if success, -1 in case of general error, or -2 if the transcoding fails (because *in is not a valid UTF-8 string or the result of the transformation can't fit into the encoding we want).
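The CharRef remapping mentioned above can be sketched as follows: decode the offending UTF-8 sequence to its code point and emit a numeric character reference in its place. The helpers `decode_utf8` and `char_ref_for_utf8` are hypothetical, the decimal `&#N;` form is an assumption of this sketch, and the decoder deliberately skips checks for overlong forms and surrogates.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical decoder: reads one UTF-8 sequence from in (len bytes
 * available), stores the code point in *cp and returns the number of bytes
 * used, or -1 on malformed or truncated input. */
int decode_utf8(const unsigned char *in, int len, unsigned int *cp)
{
    int extra;

    if (in == NULL || cp == NULL || len < 1)
        return -1;
    if (in[0] < 0x80) { *cp = in[0]; return 1; }
    else if ((in[0] & 0xE0) == 0xC0) { *cp = in[0] & 0x1F; extra = 1; }
    else if ((in[0] & 0xF0) == 0xE0) { *cp = in[0] & 0x0F; extra = 2; }
    else if ((in[0] & 0xF8) == 0xF0) { *cp = in[0] & 0x07; extra = 3; }
    else return -1;

    if (len < 1 + extra)
        return -1;
    for (int i = 1; i <= extra; i++) {
        if ((in[i] & 0xC0) != 0x80)
            return -1;                        /* not a continuation byte */
        *cp = (*cp << 6) | (in[i] & 0x3F);
    }
    return 1 + extra;
}

/* Hypothetical fallback: write "&#N;" for the first character of in into
 * buf. Returns the number of input bytes replaced, or -1 on error. */
int char_ref_for_utf8(const unsigned char *in, int len, char *buf, size_t size)
{
    unsigned int cp = 0;
    int used = decode_utf8(in, len, &cp);
    int n;

    if (used < 0 || buf == NULL)
        return -1;
    n = snprintf(buf, size, "&#%u;", cp);
    if (n < 0 || (size_t)n >= size)
        return -1;                            /* reference did not fit */
    assert(strlen(buf) == (size_t)n);
    return used;
}
```

With a Latin 1 target, for instance, the euro sign (U+20AC) cannot be encoded directly and would be written as `&#8364;`.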
Function type: xmlCharEncodingInputFunc
int xmlCharEncodingInputFunc (unsigned char * out, int * outlen, const unsigned char * in, int * inlen)
Take a block of chars in the original encoding and try to convert it to a UTF-8 block of chars out.
out: a pointer to an array of bytes to store the UTF-8 result
outlen: the length of @out
in: a pointer to an array of chars in the original encoding
inlen: the length of @in
Returns: the number of bytes written, -1 if lack of space, or -2 if the transcoding failed. The value of @inlen after return is the number of octets consumed if the return value is positive, else unpredictable. The value of @outlen after return is the number of octets produced.
Function type: xmlCharEncodingOutputFunc
int xmlCharEncodingOutputFunc (unsigned char * out, int * outlen, const unsigned char * in, int * inlen)
Take a block of UTF-8 chars in and try to convert it to another encoding. Note: a first call designed to produce heading info is called with in = NULL. If stateful this should also initialize the encoder state.
out: a pointer to an array of bytes to store the result
outlen: the length of @out
in: a pointer to an array of UTF-8 chars
inlen: the length of @in
Returns: the number of bytes written, -1 if lack of space, or -2 if the transcoding failed. The value of @inlen after return is the number of octets consumed if the return value is positive, else unpredictable. The value of @outlen after return is the number of octets produced.
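The initial call with in = NULL is easiest to see in an encoder that must emit a byte order mark. The sketch below is a hypothetical output function, `ascii_to_utf16le`, that only accepts the ASCII subset of UTF-8 to stay short; a real encoder would decode full UTF-8 sequences. Returning -1 when the BOM does not fit follows the contract documented above and is an assumption about how an implementation might behave.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical output function with the documented signature: writes
 * UTF-16LE, emitting the BOM on the initial call where in == NULL. */
int ascii_to_utf16le(unsigned char *out, int *outlen,
                     const unsigned char *in, int *inlen)
{
    int ipos = 0, opos = 0;

    if (out == NULL || outlen == NULL || inlen == NULL)
        return -1;

    if (in == NULL) {
        /* Heading call: produce the byte order mark, consume nothing. */
        *inlen = 0;
        if (*outlen < 2) {
            *outlen = 0;
            return -1;                        /* lack of space */
        }
        out[0] = 0xFF;
        out[1] = 0xFE;
        *outlen = 2;
        return 2;
    }

    while (ipos < *inlen && opos + 2 <= *outlen) {
        if (in[ipos] >= 0x80) {
            *inlen = ipos;                    /* outside this sketch's ASCII subset */
            *outlen = opos;
            return -2;
        }
        out[opos++] = in[ipos++];             /* low byte first (little endian) */
        out[opos++] = 0x00;
    }
    assert(ipos <= *inlen && opos <= *outlen);
    *inlen = ipos;                            /* octets consumed */
    *outlen = opos;                           /* octets produced */
    return opos;
}
```

A caller would invoke it once with in set to NULL to obtain the two BOM bytes, then repeatedly with real data, advancing its own buffers by the consumed and produced counts.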
Function: xmlCleanupCharEncodingHandlers
void	xmlCleanupCharEncodingHandlers	(void)
Clean up the memory allocated for the char encoding support; it unregisters all the encoding handlers and the aliases.
Function: xmlCleanupEncodingAliases
void	xmlCleanupEncodingAliases	(void)
Unregisters all aliases
Function: xmlDelEncodingAlias
int xmlDelEncodingAlias (const char * alias)
Unregisters an encoding alias @alias
alias: the alias name as parsed, in UTF-8 format (ASCII actually)
Returns: 0 in case of success, -1 in case of error
Function: xmlDetectCharEncoding
xmlCharEncoding xmlDetectCharEncoding (const unsigned char * in, int len)
Guess the encoding of the entity using the first bytes of the entity content according to the non-normative appendix F of the XML-1.0 recommendation.
in: a pointer to the first bytes of the XML entity, must be at least 2 bytes long (at least 4 if encoding is a UCS4 variant).
len: the length of the buffer
Returns: one of the XML_CHAR_ENCODING_... values.
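The kind of guess described here can be sketched by comparing the first bytes against the signatures listed in appendix F of the XML 1.0 recommendation. The enum `sketch_encoding` and the function `detect_encoding` are hypothetical names, and the sketch covers only a subset of the signatures (it ignores the unusual UCS-4 byte orders and UCS-4 byte order marks).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result type for this sketch, not the library enum. */
typedef enum {
    ENC_NONE = 0, ENC_UTF8, ENC_UTF16LE, ENC_UTF16BE,
    ENC_UCS4LE, ENC_UCS4BE, ENC_EBCDIC
} sketch_encoding;

/* Guess the encoding from the first len bytes of an XML entity. */
sketch_encoding detect_encoding(const unsigned char *in, int len)
{
    if (in == NULL || len < 2)
        return ENC_NONE;

    if (len >= 4) {
        /* Four-byte signatures of '<' or "<?xm" without a BOM. */
        if (memcmp(in, "\x00\x00\x00\x3C", 4) == 0) return ENC_UCS4BE;
        if (memcmp(in, "\x3C\x00\x00\x00", 4) == 0) return ENC_UCS4LE;
        if (memcmp(in, "\x4C\x6F\xA7\x94", 4) == 0) return ENC_EBCDIC;
        if (memcmp(in, "\x3C\x3F\x78\x6D", 4) == 0) return ENC_UTF8;
        if (memcmp(in, "\x3C\x00\x3F\x00", 4) == 0) return ENC_UTF16LE;
        if (memcmp(in, "\x00\x3C\x00\x3F", 4) == 0) return ENC_UTF16BE;
    }
    /* Byte order marks. */
    if (len >= 3 && memcmp(in, "\xEF\xBB\xBF", 3) == 0) return ENC_UTF8;
    if (memcmp(in, "\xFE\xFF", 2) == 0) return ENC_UTF16BE;
    if (memcmp(in, "\xFF\xFE", 2) == 0) return ENC_UTF16LE;

    assert(len >= 2);
    return ENC_NONE;                          /* nothing recognized */
}
```

A result of "none" does not mean the input is invalid; it only means the caller has to fall back on the encoding declaration or on external information.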
Function: xmlFindCharEncodingHandler
xmlCharEncodingHandlerPtr xmlFindCharEncodingHandler (const char * name)
Search in the registered set the handler able to read/write that encoding.
name: a string describing the char encoding.
Returns: the handler or NULL if not found
Function: xmlGetCharEncodingHandler
xmlCharEncodingHandlerPtr xmlGetCharEncodingHandler (xmlCharEncoding enc)
Search in the registered set the handler able to read/write that encoding.
enc: an xmlCharEncoding value.
Returns: the handler or NULL if not found
Function: xmlGetCharEncodingName
const char * xmlGetCharEncodingName (xmlCharEncoding enc)
The "canonical" name for XML encoding. Cf. http://www.w3.org/TR/REC-xml#charencoding Section 4.3.3 Character Encoding in Entities
enc: the encoding
Returns: the canonical name for the given encoding
Function: xmlGetEncodingAlias
const char * xmlGetEncodingAlias (const char * alias)
Lookup an encoding name for the given alias.
alias: the alias name as parsed, in UTF-8 format (ASCII actually)
Returns: NULL if not found, otherwise the original name
Function: xmlInitCharEncodingHandlers
void	xmlInitCharEncodingHandlers	(void)
Initialize the char encoding support; it registers the default supported encodings. NOTE: while public, this function usually doesn't need to be called in normal processing.
Function: xmlNewCharEncodingHandler
xmlCharEncodingHandlerPtr xmlNewCharEncodingHandler (const char * name, xmlCharEncodingInputFunc input, xmlCharEncodingOutputFunc output)
Creates and registers an xmlCharEncodingHandler.
name: the encoding name, in UTF-8 format (ASCII actually)
input: the xmlCharEncodingInputFunc to read that encoding
output: the xmlCharEncodingOutputFunc to write that encoding
Returns: the xmlCharEncodingHandlerPtr created (or NULL in case of error).
Function: xmlParseCharEncoding
xmlCharEncoding xmlParseCharEncoding (const char * name)
Compare the string to the encoding schemes already known. Note that the comparison is case insensitive, according to section [XML] 4.3.3 Character Encoding in Entities.
name: the encoding name as parsed, in UTF-8 format (ASCII actually)
Returns: one of the XML_CHAR_ENCODING_... values or XML_CHAR_ENCODING_NONE if not recognized.
Function: xmlRegisterCharEncodingHandler
void xmlRegisterCharEncodingHandler (xmlCharEncodingHandlerPtr handler)
Register the char encoding handler, surprising, isn't it?
handler: the xmlCharEncodingHandlerPtr handler block
Daniel Veillard
