DNA.
wyrm-expat
Version.
1.0.9
Namespace.
::wyrm::expat
Command.
::wyrm::expat::create
Command.
::wyrm::expat::tidy
Command.
::wyrm::expat::resume
Command.
::wyrm::expat::end
Command.
::wyrm::expat::validate
Language.
c
Manpage.
expat (1WY)
Testbase.
Test Script
Test Report
Import.
Interface.
wyrm-io
wyrmwif
Package.
wyrmwif
Export.
Executable.
tidy
Implementation.
wyrm-expat.c
Interface.
wyrm-expat.h
expat.uu
Makefile.
Makefile.irix
Package.
wyrmexpat.dylib

Expat Package and Tcl Interface

Sections.
Introduction
Command Interface
Parser Context
Callback Handlers
Testing
expat-1.95.1
Make.
Executable.
file mkdir $bin/$tcl_platform(os)
if {[string equal $::tcl_platform(os) IRIX64]} {
  set movemakefile "mv $source/Makefile.irix  $source/expat-tidy/tidy-source/Makefile\n"
} else {
  set movemakefile ""
}
rule $bin/$tcl_platform(os)/tidy $source/xmlparse.c "
  ${movemakefile}cd $source/expat-tidy/tidy-source; make
  cp $source/expat-tidy/tidy-source/tidy $bin/$tcl_platform(os)/tidy
  -mkdir $doc/tidy
  cp $source/expat-tidy/tidy-docs/htmldoc/* $doc/tidy
"
Package.
compile -V PVERSION -Iinclude/expat "'-DVERSION=\"expat_1.95.1\"'" -ld -cc -o [
  export package
] $source/xmlparse.c $source/xmlrole.c $source/xmltok.c [
  export implementation
] -- -list [
  export interface
] -list [
  import interface
] $bin/$tcl_platform(os)/tidy $source/xmlparse.c
Script.
rule $source/xmlparse.c $source/expat.uu "
  cd $source; uudecode expat.uu
  gunzip -c $source/expat.tar.gz | (cd $source; tar xf -)
  mv $source/expat-tidy/expat-source/*.c $source
  mkdir -p $include/expat
  -rm -f $include/expat/*
  mv $source/expat-tidy/expat-include/* $include/expat
"
rule $source/xmlrole.c $source/xmlparse.c
rule $source/xmltok.c $source/xmlparse.c
rule clean :: {} "
  -rm $test/wyrm-expat.TESTING
"
   

Introduction

   
top

1 :: This package is an interface between the expat package and Tcl code. The interface is through commands to create and terminate a parser, and feed text to the parser.

Copyright (C) 2002 SM Ryan

Wyrmwif Tcl extensions. For non-profit uses only, provided this copyright is preserved on all copies, this work may be freely copied, modified, redistributed, compiled, and incorporated in other works. This work is distributed with no warranty of any kind; no author or distributor accepts any responsibility for the consequences of using it, or for whether it serves any particular purpose or works at all, unless he or she says so in writing.

 
section    top

#ifndef WYRM_EXPAT_H
#define WYRM_EXPAT_H

	//	wyrm-expat.dna - Copyright (C) 2002 SM Ryan.  All rights reserved.
	
	#include "wyrmwif.h"

	int Wyrmexpat_Init(Tcl_Interp *intr);
	int Wyrmexpat_SafeInit(Tcl_Interp *intr);

#endif

		
 
section    top

static const char COPYRIGHT[] = "wyrm-expat.dna - Copyright (C) 2002 SM Ryan.  All rights reserved.";

#include "wyrm-expat.h"
#include "wyrm-io.h"
#include "expat.h"
#include <stdarg.h>

<TclContext>
<Parser token to parser context map>
<Switch around namespaced names>
<Property list pairs constructors>
<Item index management>
<Callback handlers>
<Commands and package>

		
   
   

Command Interface

   
top
4. expat (1WY) ::

NAME

expat — XML parser.

DESCRIPTION

This package is an interface between the expat package and Tcl code, and between the tidy prgram and Tcl.

wyrm::expat::create creates a new xml parser and returns a token to it. Any number of xml parsers can be active at the same time, each with its own token. The parser-token identifies which parser is being used. Strings from the XML source are passed to the parser with wyrm::expat::resume. The source can be copied from a file, or from a variety of files, or synthesised in another way needed. wyrm::expat::end is called to terminate the parser and return the parse list. The returned list is a two level list of items found in the source, such as start tags, end tags, or processing instructions. Each sublist is one item presented as property value pairs.

reader-script is used to get XML sources for an external entity reference. It is discussed below. If the caller does not wish to deal with external XML sources, this can be specified as --.

This package does not actually read files or other resources. The resources must be acquired in some other fashion, perhaps a uri get command or a read from a channel, and its the strings that are passed to this package.

wyrm::expat::syntax returns true if any <!ELEMENT...> have been seen; if not the input cannot be validated.

<Validate XML command description> <Pipe HTML through tidy to convert to XML>

Each sublist in the returned list has a property %index. This is a list of integers which uniquely identify this element or content or other construct. The length of the index nested is the depth of the nested elements; index positions are monotonic increasing in source order. The sublists of the returned list are:

<Callback handlers>

EXAMPLES

<Examples>
 
section    top
 
section    top

static int expat_createCommand(ClientData clientData,Intr intr,int N,Tcl_Obj *const P[]) {
	bool tidy = clientData!=0;
	Obj Reader = tidy ? (N>=3 ? P[2] : 0) : (N>=2 ? P[1] : 0);
	Obj Encoding = N>=3 && !tidy ? P[2] : 0;
	Obj HtmlChannel = N>=2 && tidy ? P[1] : 0;
	Tcl_Channel htmlChannel = 0,xmlChannel = 0;
	chars tokenName = 0;
	if (HtmlChannel) {
		<Pipe HTML through tidy to convert to XML>
	}
	if (!tidy && N>=4) {
		Tcl_ResetResult(intr);
		Tcl_AppendResult(intr,"usage: wyrm::expat::create [reader-script|-- [encoding]]",0);
		return TCL_ERROR;
	}else if (tidy && N<2) {
		Tcl_ResetResult(intr);
		Tcl_AppendResult(intr,"usage: wyrm::expat::tidy input-html-channel [reader-script|-- [tidy-arguments...]]",0);
		return TCL_ERROR;
	}else {
		static XML_Memory_Handling_Suite memory = {0,0,0};	
		XML_Parser p = 0;
		pTclContext c = heap(TclContext);
		chars reader = Reader?Tcl_GetStringFromObj(Reader,0):"--";
		chars encoding = Encoding?Tcl_GetStringFromObj(Encoding,0):0;
		Obj init = 0;
		zero(1,TclContext,c);
		if (!memory.malloc_fcn) {
			memory.malloc_fcn =(void*(*)(size_t size))Tcl_Alloc;
			memory.realloc_fcn = (void*(*)(void *ptr,size_t size))Tcl_Realloc;
			memory.free_fcn = (void(*)(void *ptr))Tcl_Free;
		}
		c->refcount = 1;
		c->parser = p = XML_ParserCreate_MM(encoding,&memory,"~");
		c->encoding = incr(Encoding);
		c->reader = streq(reader,"--") || streq(reader,"-") ? 0 : incr(Reader);
		c->intr = intr;
		c->htmlChannel = htmlChannel;
		c->xmlChannel = xmlChannel;
		c->anyElement = false;
		init = pairstr(init,"%what","init",-1);
		init = pairstr(init,"%encoding",encoding?encoding:"UTF-8",-1);
		init = pairstr(init,"%version",XML_ExpatVersion(),-1);
		{
			Tcl_DString buffer; chars cwd = Tcl_GetCwd(intr,&buffer);
			init = pairstr(init,"%base",cwd,-1);
			Tcl_DStringFree(&buffer);
		}
		pushIndex(c); init = pairIndex(init,c); incrIndex(c);
		c->result = incr(Tcl_NewListObj(1,&init)); decr(init);
		XML_SetReturnNSTriplet(p,0);
		XML_SetUserData(p,c);
		XML_SetElementHandler(p,startElement,endElement);
		XML_SetCharacterDataHandler(p,characterData);
		XML_SetProcessingInstructionHandler(p,processingInstruction);
		XML_SetCommentHandler(p,comment);
		XML_SetCdataSectionHandler(p,startCdataSection,endCdataSection);
		XML_SetNamespaceDeclHandler(p,startNamespaceDecl,endNamespaceDecl);
		XML_SetXmlDeclHandler(p,xmlDecl);
		XML_SetDoctypeDeclHandler(p,startDoctypeDecl,endDoctypeDecl);
		XML_SetElementDeclHandler(p,elementDecl);
		XML_SetAttlistDeclHandler(p,attlist);
		XML_SetNotationDeclHandler(p,notation);
		XML_SetNotStandaloneHandler(p,notStandaloneHandler);
		XML_SetEntityDeclHandler(p,entity);
		XML_SetExternalEntityRefHandler(p,externalEntityRef);
		XML_SetParamEntityParsing(p,XML_PARAM_ENTITY_PARSING_ALWAYS);
		c->token = createParserToken(p,tokenName);
		Tcl_SetResult(intr,c->token,TCL_VOLATILE);
		return TCL_OK;
	}
}

		
 
section    top
 
section    top

int mode; Obj pipe; int rc,i; ClientData data;
htmlChannel = Tcl_GetChannel(intr,Tcl_GetString(HtmlChannel),&mode);
if (!htmlChannel) {
	Tcl_ResetResult(intr);
	Tcl_AppendResult(intr,"unknown channel: ",Tcl_GetString(HtmlChannel),0);
	return TCL_ERROR;
}
if (!(TCL_READABLE&mode)) {
	Tcl_ResetResult(intr);
	Tcl_AppendResult(intr,"channel is not readable: ",Tcl_GetString(HtmlChannel),0);
	return TCL_ERROR;
}
pipe = oprintf(
		(Tcl_GetChannelHandle(htmlChannel,TCL_READABLE,(ClientData*)&data)==TCL_OK
				? "open [list |tidy <@ %{y}s --output-xml 1 --show-warnings 0 --quiet 1 %{y-}s] r"
				: "open [list |tidy << [read %{y}s] --output-xml 1 --show-warnings 0 --quiet 1 %{y-}s] r"),
			HtmlChannel,incr(Tcl_NewListObj(N-3,P+3)));
rc = Tcl_EvalObj(intr,pipe);
decr(pipe);
if (rc!=TCL_OK) return TCL_ERROR;
tokenName = Tcl_GetStringResult(intr);
xmlChannel = Tcl_GetChannel(intr,tokenName,&mode);
if (!xmlChannel) {
	Tcl_ResetResult(intr);
	Tcl_AppendResult(intr,"xml channel disappeared",0);
	return TCL_ERROR;
}
if (!(TCL_READABLE&mode)) {
	Tcl_ResetResult(intr);
	Tcl_AppendResult(intr,"readable xml channel is not readable",0);
	return TCL_ERROR;
}


		
 
section    top
 
section    top

static int expat_resumeCommand(ClientData clientData,Intr intr,int N,Tcl_Obj *const P[]) {
	bool end = clientData!=0;
	Obj parsertoken = N>=2 ? P[1] : 0;
	Obj input = N>=3 ? P[2] : 0;
	Tcl_ResetResult(intr);
	if (end && N!=2) {
		Tcl_AppendResult(intr,"usage: wyrm::expat::end <parser>",0);
		return TCL_ERROR;
	}else if (!end && N!=3) {
		Tcl_AppendResult(intr,"usage: wyrm::expat::resume <parser> <textline>",0);
		return TCL_ERROR;
	}else {
		XML_Parser p = getParser(Tcl_GetStringFromObj(parsertoken,0));
		pTclContext c = p ? XML_GetUserData(p) : 0;
		int ok,rc=TCL_OK;
		if (!c) {
			Tcl_AppendResult(intr,"unknown parser token: ",Tcl_GetString(parsertoken),0);
			return TCL_ERROR;
		}
		c->intr = intr;
		if (Tcl_IsShared(c->result)) {
			Obj t = incr(Tcl_DuplicateObj(c->result));
			decr(c->result);
			c->result = t;
		}
		if (input) {
			int n; chars s = Tcl_GetStringFromObj(input,&n);
			ok = XML_Parse(p,s,n,0);
		}else {
			ok = XML_Parse(p,"",0,1);
		}
		if (c->subparserError) {
			rc = TCL_ERROR;
			c->subparserError = false;
		}else if (!ok) {
			int o,n; chars s = (chars)XML_GetInputContext(p,&o,&n);
			chars preetc="",postetc=""; int npre=0; chars pre="";
			if (s) {
				if (o>0) {
					npre = o; pre = s;
					if (npre>15) {
						pre = s+o-15; npre = 15; preetc = "...";
					}
				}
				s += o; n -= o;
				if (n>20) {postetc = "..."; n = 20;}
			}else {
				s = ""; n = 0;
			}
			rprintf(intr,"[%d:%d] %s: %s%.*s##%.*s%s",
					XML_GetCurrentLineNumber(p),
					XML_GetCurrentColumnNumber(p),
					XML_ErrorString(XML_GetErrorCode(p)),
					preetc,npre,pre,
					n,s,postetc);
			rc = TCL_ERROR;
		}else if (input) {
			Tcl_ResetResult(intr);
		}else {
			Tcl_SetObjResult(intr,c->result);
		}
		if (!input || rc!=TCL_OK) deleteParser(Tcl_GetString(parsertoken));
		return rc;
	}
}

		
 
section    top

static int expat_syntaxCommand(ClientData clientData,Intr intr,int N,Tcl_Obj *const P[]) {
	Obj parsertoken = N>=2 ? P[1] : 0;
	if (N!=2) {
		Tcl_ResetResult(intr);
		Tcl_AppendResult(intr,"usage: wyrm::expat::syntax <parser>",0);
		return TCL_ERROR;
	}else {
		XML_Parser p = getParser(Tcl_GetStringFromObj(parsertoken,0));
		pTclContext c = p ? XML_GetUserData(p) : 0;
		int ok,rc=TCL_OK;
		if (!c) {
			Tcl_ResetResult(intr);
			Tcl_AppendResult(intr,"unknown parser token: ",Tcl_GetString(parsertoken),0);
			return TCL_ERROR;
		}
		Tcl_SetObjResult(intr,Tcl_NewBooleanObj(c->anyElement));
		return TCL_OK;
	}
}

		
 
section    top
 
section    top
 
section    top

14. Validate XML command :: The parse fsm is a list of states, each state is a list of token+next state pair (x and t); if the current token matches the token in the state, the next state is the index of the next state in the list. Next state -1 means the fsm is exhausted. The parser is a nondeterministic fsm; all possible states are advanced at each token; as long as one state is viable and advances, the input is validated. Each element is parsed with an fsm; the states of nested tags are stacked; the last two elements of the state stack are the current tag (e) and the list of viable states (os).

While within the tag, the next token (start, end, or content) is gotten, the current viable states are checked if they have this token; if so the next state is added to the list of viable states (NS and ns) for the next token.

Element declarations as seen are added to E. Start elements advance to enclosing tag fsm then push the inner fsm at its initial state 0. Content elements are token #PCDATA regardless of their actual content. End elements are token #END which should shift the fsm into the next state -1; they then pop the stacked fsms.

One complication is that if the fsm token is #ANY it matches all tokens.

The second complication is that white space is the document in tags without #PCDATA is elidable content, ignored during validation. White space might or might not be signficant. The token is marked elidable if white space, and all current viable states remain viable, in addition to any successors wit #PCDATA. If any the current states matches #PCDATA, the content is considerred nonelidable.

protocol
states—Parse state stack.
parses—The XML parse list being validated.
newparses—The editted XML parse list being constructed.
errors—Accumulated error messages.
E(tag)—Validation FSM of the tag.
NS—Next state.
token—Current tag or #PCDATA.
elidable—All blank #PCDATA where #PCDATA is not expected.

"proc validate parser {\n",
	"set anyElement [syntax $parser]\n",
	"set parses [end $parser]\n",
	"if {!$anyElement} {return $parses}\n",
	"set states {}\n",
	"set errors {}\n",
	"set newparses {}\n",
	"foreach parse $parses {\n",
		"array set S $parse\n",
		"switch $S(%what) {\n",
			"element {\n",
				"set E($S(%name)) $S(%fsm)\n",
				"lappend newparses $parse\n",
				"continue\n",
			"}\n",
			/*"attlist {\n",
				"set A($S(%element),$S(%attname),assigned) $S(%assigned)\n",
				"set A($S(%element),$S(%attname),required) $S(%required)\n",
				"if {[info exists S(%default)]} {set A($S(%element),$S(%attname),default) $S(%default)}\n",
			"}\n",*/
			"start {\n",
				"set token $S(%name)\n",
				"set elidable 0\n",
				/*"foreach a [array names S {[-a-zA-Z_.]*}] {set U($a) 1}\n",
				"foreach a [array names A $token,*,type] {\n",
					"set a [lindex [split $a ,] 1]\n",
					"array unset A $a\n",
					"if {[info exists S($a)]} {\n",
						"if {\n",
								"[string equal $A($token,$a,assigned) #FIXED]\n",
								"&& [info exists A($token,$a,default)]} {\n",
								"&& ![string equal $A($token,$a,default) $S($a)]\n",
						"} {\n",
							"lappend errors \"fixed value attribute has the wrong value: $a\"\n",
						"}\n",
					"} elseif {$A($token,$a,required)} {\n",
						"lappend errors \"missing required attribute: $a\"\n",
					"} elseif {[info exists A($token,$a,default)]} {\n",
						"lappend parse $a $A($token,$a,default)\n",
					"}\n",
				"}\n",
				"set L [array names U]\n",
				"if {[llength $L]} {lappend parse %unexpected $L}\n",*/
			"}\n",
			"end {\n",
				"set token #END\n",
				"set elidable 0\n",
			"}\n",
			"content {\n",
				"set token #PCDATA\n",
				"set elidable [regexp {^\\s*$} $S(%data)]\n",
			"}\n",
			"default {\n",
				"lappend newparses $parse\n",
				"continue\n",
			"}\n",
		"}\n",
		"set e [lindex $states end-1]\n",
		"set os [lindex $states end]\n",
		"set stillelidable $elidable\n",
		"if {[llength $states]>0 && [llength $os]>0 && [info exists E($e)]} {\n",
			"foreach s $os {\n",
				"if {$elidable} {\n",
					"set NS($s) 1\n",
				"}\n",
				"foreach {x t} [lindex $E($e) $s] {\n",
					"if {[string equal $x #PCDATA]} {set stillelidable 0}\n",
					"set EX(<$x>) 1\n",
					"if {[string equal $x $token] || [string equal $x #ANY]} {\n",
						"set NS($t) 0\n",
					"}\n",
				"}\n",
			"}\n",
			"if {[info exists NS(-1)]} {\n",
				"set ns {}\n",
			"} else {\n",
				"set ns [array names NS]\n",
				"if {[llength $ns]==0} {\n",
					"set msg \"\\[$S(%line):$S(%col)\\] invalid: <$token>, expected: \"\n",
					"set ex [lsort [array names EX]]\n",
					"switch [llength $ex] {\n",
						"0 {;}\n",
						"1 {append msg [lindex $ex 0]}\n",
						"2 {append msg \"[lindex $ex 0] or [lindex $ex 1]\"}\n",
						"default {append msg \"[join [lrange $ex 0 end-1] {, }], or [lindex $ex end]\"}\n",
					"}\n",
					"lappend errors [string map {<#END> \"end of element\" <#PCDATA> \"content data\"} $msg]\n",
				"}\n",
			"}\n",
			"if {[string equal $token #PCDATA]} {lappend parse %elidable $stillelidable}\n",
			"set states [lreplace $states end end $ns]\n",
			"unset -nocomplain NS\n",
			"unset -nocomplain EX\n",
		"}\n",
		"switch $S(%what) {\n",
			"start {\n",
				"lappend states $S(%name) 0\n",
			"}\n",
			"end {\n",
				"set states [lrange $states 0 end-2]\n",
			"}\n",
		"}\n",
		"unset S\n",
		"lappend newparses $parse\n",
	"}\n",
	"if {[llength $errors]} {\n",
		"error [join $errors \\n]\n",
	"}\n",
	"set newparses\n",
"}\n",

		
 
section    top

<Create parser command>
<Resume and end parser commands>
<Any element syntax>

int Wyrmexpat_Init(Tcl_Interp *intr) {
	Tcl_PkgProvide(intr,"wyrmexpat",PVERSION);
	Tcl_CreateObjCommand(intr,"::wyrm::expat::create",expat_createCommand,0,0);
	Tcl_CreateObjCommand(intr,"::wyrm::expat::tidy",expat_createCommand,"tidy",0);
	Tcl_CreateObjCommand(intr,"::wyrm::expat::resume",expat_resumeCommand,0,0);
	Tcl_CreateObjCommand(intr,"::wyrm::expat::end",expat_resumeCommand,"end",0);
	Tcl_CreateObjCommand(intr,"::wyrm::expat::syntax",expat_syntaxCommand,0,0);
	return Tcl_VarEval(intr,
			"namespace eval ::wyrm::expat {\n",
				"namespace export create\n",
				"namespace export tidy\n",
				"namespace export resume\n",
				"namespace export end\n",
				"namespace export validate\n",
				"namespace export syntax\n",
				<Validate XML command>
			"}\n",
	0);
}

int Wyrmexpat_SafeInit(Tcl_Interp *intr) {
	return Wyrmexpat_Init(intr);
}

		
   
   

Parser Context

   
top
type struct TclContext—wyrm-expat and Tcl context of the parser.
int refcount—How many parsers are using this context.
Intr intr—Interpretter the parser was created or resumed with.
XML_Parser parser—The XML parser.
chars token—The token of this parser.
Obj* index—Nesting unique identifier of items parsed.
int depth—Depth of the index.
int mdepth—Allocated depth of the index.
Obj result—The final result list.
Obj encoding—The input encoding.
Obj reader—External entity reference reader or NULL.
Obj pendingNamespace—Hold opening namespaces until start element is created.
bool subparserError—Percolates a subparser reported an error to the base.
int cdata—Whether in a CDATA block.
Tcl_Channel htmlChannel—If piping through tidy, the original HTML channel.
Tcl_Channel xmlChannel—If piping through tidy, the converted XML channel.

typedef struct {
	int refcount;
	Intr intr;
	XML_Parser parser;
	chars token;
	Obj *index;
	int depth,mdepth;
	Obj result;
	Obj encoding;
	Obj reader;
	Obj pendingNamespace;
	bool subparserError;
	int cdata;
	Tcl_Channel	htmlChannel;
	Tcl_Channel	xmlChannel;
	bool anyElement;
} TclContext,*pTclContext;

		
 
section    top

TCL_DECLARE_MUTEX(mapLock)
static Tcl_HashTable *tokenMap=0;

static chars createParserToken(XML_Parser parser,chars token) {
	Tcl_MutexLock(mapLock);
	Tcl_HashEntry *e; int new = 0; static int K; char T[40];
	while (!new) {
		if (!token) {sprintf(T,"wyrm-expat-parser-%d",++K); token = T;}
		if (!tokenMap) {
			tokenMap = heap(Tcl_HashTable); Tcl_InitHashTable(tokenMap,TCL_STRING_KEYS);
		}
		e = Tcl_CreateHashEntry(tokenMap,token,&new);
		if (new)  {Tcl_SetHashValue(e,(chars)parser); token = Tcl_GetHashKey(tokenMap,e);}
		else token = 0;
	}
	Tcl_MutexUnlock(mapLock);
	return token;
}

static void deleteParser(chars token) {
	Tcl_MutexLock(mapLock);
	Tcl_HashEntry *e = tokenMap ? Tcl_FindHashEntry(tokenMap,token) : 0;
	XML_Parser parser = e ? Tcl_GetHashValue(e) : 0;
	if (parser) {
		pTclContext c = XML_GetUserData(parser); int i;
		Tcl_DeleteHashEntry(e);
		if (c && (--(c->refcount))<=0) {
			decr(c->pendingNamespace);
			decr(c->result);
			decr(c->encoding);
			decr(c->reader);
			for (i=0; i<c->depth; i++) decr(c->index[i]);
			dispose(c->index);
			if (c->htmlChannel) Tcl_UnregisterChannel(c->intr,c->htmlChannel);
			if (c->xmlChannel) Tcl_UnregisterChannel(c->intr,c->xmlChannel);
			dispose(c);
		}
		XML_ParserFree(parser);
	}
	Tcl_MutexUnlock(mapLock);
}

static XML_Parser getParser(chars token) {
	Tcl_MutexLock(mapLock);
	Tcl_HashEntry *e = tokenMap ? Tcl_FindHashEntry(tokenMap,token) : 0;
	XML_Parser r = e ? Tcl_GetHashValue(e) : 0;
	Tcl_MutexUnlock(mapLock);
	return r;
}

		
 
section    top

static Obj normaliseNamespacedName(chars name) {
	chars tilde = strrchr(name,'~');
	if (tilde) {
		Obj x = Tcl_NewStringObj(tilde+1,-1);
		Tcl_AppendToObj(x,"~",1);
		Tcl_AppendToObj(x,name,tilde-name);
		return x;
	}else
		return Tcl_NewStringObj(name,-1);
}

		
 
section    top

static Obj pairstr(Obj list,chars prop,chars value,int len) {
	if (value) {
		if (!list) list = incr(Tcl_NewObj());
		Tcl_ListObjAppendElement(0,list,Tcl_NewStringObj(prop,-1));
		Tcl_ListObjAppendElement(0,list,Tcl_NewStringObj(value,len));
	}
	return list;
}
static Obj pairint(Obj list,chars prop,int value) {
	if (!list) list = incr(Tcl_NewObj());
	Tcl_ListObjAppendElement(0,list,Tcl_NewStringObj(prop,-1));
	Tcl_ListObjAppendElement(0,list,Tcl_NewIntObj(value));
	return list;
}
static Obj pairobj(Obj list,chars prop,Obj value) {
	if (value) {
		if (!list) list = incr(Tcl_NewObj());
		Tcl_ListObjAppendElement(0,list,Tcl_NewStringObj(prop,-1));
		Tcl_ListObjAppendElement(0,list,value);
	}
	return list;
}

		
 
section    top

static void pushIndex(pTclContext c) {
	if (!c->index) {
		c->index = nheap(10,Obj);
		c->depth = 1; c->mdepth = 10;
	}else if (++(c->depth)>c->mdepth) {
		c->mdepth = 2*(c->depth);
		c->index = reheap(c->mdepth,Obj,c->index);
	}
	c->index[c->depth-1] = incr(Tcl_NewIntObj(1));
}

static void popIndex(pTclContext c) {
	if (c->index && c->depth>0) {
		c->depth -= 1;
		decr(c->index[c->depth]);
	}
}

static void incrIndex(pTclContext c) {
	if (c->index && c->depth>0) {
		int n=0;
		Tcl_GetIntFromObj(0,c->index[c->depth-1],&n);
		n += 1;
		decr(c->index[c->depth-1]);
		c->index[c->depth-1] = incr(Tcl_NewIntObj(n));
	}
}

static Obj pairIndex(Obj list,pTclContext c) {
	return pairobj(list,"%index",c->index ? Tcl_NewListObj(c->depth,c->index) : Tcl_NewObj());
}

		
   
   

Callback Handlers

   
top

static int notStandaloneHandler(ptr userData) {
	return 1;
}

		
 
section    top
22. Callback handlers ::
start element
The start element <tag [ attributes ] ...> or empty content element <tag [ attributes ] .../> has been parsed. The property list includes
%what start
%name tag
%line number
Which line, based on how many newlines based through wyrm::expat::resume.
%col number
Which column of the current line.
%index {number...}
Unique element index.
%id attribute
If defined, the name of the attribute with type #ID.
%base url
xml:base url if specified.
%xmlns.pre url
URI of each namespace identifier.
%defaulted {attribute...}
All attributes whose values were defaulted.
And every actual and defaulted attribute are also added to the property list. If the attribute is prefixed with a declared namespace identifier, pre:attribute, the reported attribute is actually attribute~uri. If the namespace identifier has not been declared, it is left as specified.
attribute value
All attributes and their values.
end element
The end element </tag> or empty content element <tag [ attributes ] .../> has been parsed. The property list includes
%what end
%name tag
Same tag as the start.
%line number
Which line.
%col number
Which column of the current line.
%index {number...}
Same index as on the start.

static void startElement(ptr userData,const XML_Char *name,const XML_Char **att) {
	pTclContext c = userData; XML_Parser p = c->parser;
	int idoff = XML_GetIdAttributeIndex(p);
	int speclen = XML_GetSpecifiedAttributeCount(p);
	Obj defaulted = 0;
	Obj *ppending; int npending;
	Obj piece = 0;
	piece = pairstr(piece,"%what","start",-1);
	piece = pairobj(piece,"%name",normaliseNamespacedName(name));
	piece = pairint(piece,"%line",XML_GetCurrentLineNumber(p));
	piece = pairint(piece,"%col",XML_GetCurrentColumnNumber(p));
	piece = pairIndex(piece,c); pushIndex(c);
	for (; *att; att+=2,idoff-=2,speclen-=2) {
		if (idoff==0) piece = pairobj(piece,"%id",normaliseNamespacedName(*att));
		if (speclen<=0) {
			if (!defaulted) defaulted = incr(Tcl_NewObj());
			Tcl_ListObjAppendElement(0,defaulted,normaliseNamespacedName(*att));
		}
		if (streq((chars)(att[0]),"http://www.w3.org/XML/1998/namespace~base") || streq((chars)(att[0]),"xml:base")) {
			piece = pairstr(piece,"%base",att[1],-1);
		}
		if (speclen==0) piece = pairint(piece,"%specified",0);
		if (!piece) piece = incr(Tcl_NewObj());
		Tcl_ListObjAppendElement(0,piece,normaliseNamespacedName(att[0]));
		Tcl_ListObjAppendElement(0,piece,Tcl_NewStringObj(att[1],-1));
	}
	if (defaulted) {
		piece = pairobj(piece,"%defaulted",defaulted);
		decr(defaulted);
	}
	if (c->pendingNamespace) {
		if (Tcl_ListObjGetElements(0,c->pendingNamespace,&npending,&ppending)==TCL_OK) {
			for (; npending>0; npending-=2,ppending+=2) {
				int n; chars s = Tcl_GetStringFromObj(ppending[0],&n);
				static char xmlns[] = "%%xmlns.%s";
				chars S = nheap(n+sizeof xmlns,char);
				sprintf(S,xmlns,s);
				piece = pairobj(piece,S,ppending[1]);
				dispose(S);
			}
		}
		decr(c->pendingNamespace); c->pendingNamespace = 0;
	}
	Tcl_ListObjAppendElement(0,c->result,piece); decr(piece);
}

static void endElement(ptr userData,const XML_Char *name) {
	pTclContext c = userData; XML_Parser p = c->parser;
	Obj piece = 0;
	piece = pairstr(piece,"%what","end",-1);
	piece = pairobj(piece,"%name",normaliseNamespacedName(name));
	piece = pairint(piece,"%line",XML_GetCurrentLineNumber(p));
	piece = pairint(piece,"%col",XML_GetCurrentColumnNumber(p));
	popIndex(c); piece = pairIndex(piece,c); incrIndex(c);
	Tcl_ListObjAppendElement(0,c->result,piece); decr(piece);
}

		
 
section    top

static void characterData(ptr userData,const XML_Char *s,int n) {
	pTclContext c = userData; XML_Parser p = c->parser;
	int o,m,l; chars t = XML_GetInputContext(p,&o,&m),T; bool entity = false;
	Obj piece = 0;
	t += o; m -= o;
	if (c->cdata==0 && *t=='&' && m>2) {
		for (T=t,l=0; m>0 && !entity; m--,T++,l++) entity = *T==';';
	}
	piece = pairstr(piece,"%what","content",-1);
	piece = pairstr(piece,"%data",s,n);
	if (entity)	piece = pairstr(piece,"%entity",t,l);		
	piece = pairint(piece,"%line",XML_GetCurrentLineNumber(p));
	piece = pairint(piece,"%col",XML_GetCurrentColumnNumber(p));
	piece = pairIndex(piece,c); incrIndex(c);
	Tcl_ListObjAppendElement(0,c->result,piece); decr(piece);
}

		
 
section    top

static void processingInstruction(ptr userData,const XML_Char *target,const XML_Char *data) {
	pTclContext c = userData; XML_Parser p = c->parser;
	Obj piece = 0;
	piece = pairstr(piece,"%what","pi",-1);
	piece = pairstr(piece,"%target",target,-1);
	piece = pairstr(piece,"%data",data,-1);
	piece = pairIndex(piece,c); incrIndex(c);
	piece = pairint(piece,"%line",XML_GetCurrentLineNumber(p));
	piece = pairint(piece,"%col",XML_GetCurrentColumnNumber(p));
	Tcl_ListObjAppendElement(0,c->result,piece); decr(piece);
}

		
 
section    top

static void comment(ptr userData,const XML_Char *data) {
	pTclContext c = userData; XML_Parser p = c->parser;
	Obj piece = 0;
	piece = pairstr(piece,"%what","comment",-1);
	piece = pairstr(piece,"%data",data,-1);
	piece = pairint(piece,"%line",XML_GetCurrentLineNumber(p));
	piece = pairint(piece,"%col",XML_GetCurrentColumnNumber(p));
	piece = pairIndex(piece,c); incrIndex(c);
	Tcl_ListObjAppendElement(0,c->result,piece); decr(piece);
}

		
 
section    top

static void startCdataSection(ptr userData) {
	pTclContext c = userData; XML_Parser p = c->parser;
	Obj piece = 0;
	piece = pairstr(piece,"%what","cdata-begin",-1);
	piece = pairint(piece,"%line",XML_GetCurrentLineNumber(p));
	piece = pairint(piece,"%col",XML_GetCurrentColumnNumber(p));
	piece = pairIndex(piece,c); pushIndex(c);
	Tcl_ListObjAppendElement(0,c->result,piece); decr(piece);
	c->cdata += 1;
}

static void endCdataSection(ptr userData) {
	pTclContext c = userData; XML_Parser p = c->parser;
	Obj piece = 0;
	piece = pairstr(piece,"%what","cdata-end",-1);
	piece = pairint(piece,"%line",XML_GetCurrentLineNumber(p));
	piece = pairint(piece,"%col",XML_GetCurrentColumnNumber(p));
	popIndex(c); piece = pairIndex(piece,c); incrIndex(c);
	Tcl_ListObjAppendElement(0,c->result,piece); decr(piece);
	c->cdata -= 1;
}

		
 
section    top

static void startNamespaceDecl(ptr userData,const XML_Char *prefix,const XML_Char *uri) {
	pTclContext c = userData; XML_Parser p = c->parser;
	Obj piece = 0;
	piece = pairstr(piece,"%what","namespace-begin",-1);
	piece = pairstr(piece,"%prefix",prefix?prefix:"",-1);
	piece = pairstr(piece,"%uri",uri?uri:"",-1);
	piece = pairint(piece,"%line",XML_GetCurrentLineNumber(p));
	piece = pairint(piece,"%col",XML_GetCurrentColumnNumber(p));
	piece = pairIndex(piece,c); incrIndex(c);
	c->pendingNamespace = pairstr(c->pendingNamespace,prefix,uri,-1);
	Tcl_ListObjAppendElement(0,c->result,piece); decr(piece);
}

static void endNamespaceDecl(ptr userData,const XML_Char *prefix) {
	pTclContext c = userData; XML_Parser p = c->parser;
	Obj piece = 0;
	piece = pairstr(piece,"%what","namespace-end",-1);
	piece = pairstr(piece,"%prefix",prefix?prefix:"",-1);
	piece = pairint(piece,"%line",XML_GetCurrentLineNumber(p));
	piece = pairint(piece,"%col",XML_GetCurrentColumnNumber(p));
	piece = pairIndex(piece,c); incrIndex(c);
	Tcl_ListObjAppendElement(0,c->result,piece); decr(piece);
}

		
 
section    top

static void xmlDecl(ptr userData,const XML_Char *version,const XML_Char *encoding,int standalone) {
	pTclContext c = userData; XML_Parser p = c->parser;
	Obj piece = 0;
	piece = pairstr(piece,"%what",version?"xml":"text",-1);
	piece = pairstr(piece,"%version",version,-1);
	piece = pairstr(piece,"%encoding",encoding,-1);
	if (standalone>=0) piece = pairint(piece,"%standalone",standalone);
	piece = pairint(piece,"%line",XML_GetCurrentLineNumber(p));
	piece = pairint(piece,"%col",XML_GetCurrentColumnNumber(p));
	piece = pairIndex(piece,c); incrIndex(c);
	Tcl_ListObjAppendElement(0,c->result,piece); decr(piece);
}

		
 
section    top

static void startDoctypeDecl(ptr userData,
		const XML_Char *doctypeName,
		const XML_Char *systemId,
		const XML_Char *publicId,
		int has_internal_subset)
{
	pTclContext c = userData; XML_Parser p = c->parser;
	Obj piece = 0;
	piece = pairstr(piece,"%what","doctype-begin",-1);
	piece = pairstr(piece,"%name",doctypeName,-1);
	piece = pairstr(piece,"%system",systemId,-1);
	piece = pairstr(piece,"%public",publicId,-1);
	piece = pairint(piece,"%internal",has_internal_subset);
	piece = pairint(piece,"%line",XML_GetCurrentLineNumber(p));
	piece = pairint(piece,"%col",XML_GetCurrentColumnNumber(p));
	piece = pairIndex(piece,c); pushIndex(c);
	Tcl_ListObjAppendElement(0,c->result,piece); decr(piece);
}

static void endDoctypeDecl(ptr userData) {
	pTclContext c = userData; XML_Parser p = c->parser;
	Obj piece = 0;
	piece = pairstr(piece,"%what","doctype-end",-1);
	piece = pairint(piece,"%line",XML_GetCurrentLineNumber(p));
	piece = pairint(piece,"%col",XML_GetCurrentColumnNumber(p));
	popIndex(c); piece = pairIndex(piece,c); incrIndex(c);
	Tcl_ListObjAppendElement(0,c->result,piece); decr(piece);
}

		
 
section    top

typedef struct Fsm Fsm,*pFsm;
struct Fsm {
	pFsm	all,next,also;
	chars	trace,name;
	bool	epsilon,marked,cons,top;
	int		number;
};
static pFsm fsm(chars trace,chars name,pFsm next,pFsm also,pFsm *all) {
	pFsm fsm = heap(Fsm);
	fsm->all = (*all); (*all) = fsm;
	fsm->trace = trace; fsm->name = name; fsm->next = next; fsm->also = also;
	fsm->epsilon = false; fsm->marked = false; fsm->cons = false; fsm->top = true;
	fsm->number = -1;
	return fsm;
}
static void freeFsms(pFsm *all) {
	while ((*all)) {
		pFsm a = (*all)->all; dispose((*all)); (*all) = a;
	}
}
static void join(chars trace,pFsm from,pFsm to,pFsm *all) {
	pFsm p;
	for (p=from; p; p=p->also) if (p->next==to) return;
	if (from->next)	{
		from->also = fsm(trace,0,to,from->also,all);
		from->also->top = false;
	}else
		from->next = to;
}
static void dumpfsm(chars label,pFsm *all) {
	pFsm x,p; fprintf(stderr,"%s:\n",label);
	for (x=(*all); x; x=x->all) {
		if (!x->top) continue;
		fprintf(stderr,"  %p %s\n",x,x->name?x->name:"#MISSING");
		for (p=x; p; p=p->also) {
			fprintf(stderr,"    %p (%s) %s%s%s",
				p->next,p->trace,!p->next?"#null":p->next->name?p->next->name:"#MISSING",
				p->epsilon?" epsilon":"",
				p->cons?" cons":"");
			if (p->number>=0) fprintf(stderr," number=%d",p->number);
			fprintf(stderr,"\n");
		}
	}
}
static void epsilonMark(pFsm x,pFsm *all) {
	pFsm p = x;
	if (!x || x->marked) return;
	while (p) {
		if (p->epsilon) {
			p = p->also;
		}else if (p->next==x && !x->name) {
			p->epsilon = true;
			p = p->also;
		}else if (p->next==0) {
			p->epsilon = true;
			p = p->also;
		}else if (!p->next->name) {
			pFsm q;
			p->epsilon = true;
			for (q=p->next; q; q=q->also) {
				if (!q->epsilon) join("copy",x,q->next,all);
			}
			p = x;
		}else {
			p = p->also;
		}
	}
	x->marked = true;
	if (!x->name) x->name = "#MISSING";
	for (p=x; p; p=p->also) {
		if (!p->epsilon) {
			epsilonMark(p->next,all);
		}
	}
}

static void fsmNumber(pFsm x,int *N,pFsm *all) {
	if (!x || x->number>=0) {
		;
	}else if (x->next==0 && x->also==0) {
		x->number = -1;
	}else {
		pFsm p; x->number = *N; *N += 1;
		for (p=x; p; p=p->also) if (!p->epsilon) fsmNumber(p->next,N,all);
	}
}

static void fsmConstruct(pFsm x,Obj L,pFsm *all) {
	if (!x || x->cons) {
		;
	}else if (x->next==0 && x->also==0) {
		x->cons = true;
	}else {
		pFsm p; Obj S = incr(Tcl_NewObj());
		x->cons = true;
		for (p=x; p; p=p->also) {
			if (!p->epsilon) {
				S = pairint(S,p->next->name,p->next->number);
			}
		}
		Tcl_ListObjAppendElement(0,L,S); decr(S);
		for (p=x; p; p=p->also) {
			if (!p->epsilon) {
				fsmConstruct(p->next,L,all);
			}
		}
	}
}
		
static Obj xmlContent(XML_Content *model,Obj *display,pFsm *begin,pFsm *end,pFsm *all) {
	if (model) {
		Obj P[4],L; int i;
		static chars type[] = {"","empty","any","mixed","name","choice","seq"};
		static chars quant[] = {"none","opt","rep","plus"};
		chars sep = ""; pFsm last;
		P[0] = Tcl_NewStringObj(type[model->type],-1);
		P[1] = Tcl_NewStringObj(quant[model->quant],-1);
		P[2] = Tcl_NewStringObj(model->name?model->name:"",-1);
		P[3] = 0;
		L = Tcl_NewListObj(3,P);
		*end = fsm("end",0,0,0,all);
		switch (model->type) {
			case XML_CTYPE_ANY:
				*display = incr(Tcl_NewStringObj("ANY",-1));
				*begin = fsm("#any","#ANY",*end,0,all);
				join("any b->e",*begin,*end,all);
				join("any e->b",*end,*begin,all);
				break;
			case XML_CTYPE_NAME:
				*display = incr(Tcl_NewStringObj(model->name?model->name:"",-1));
				*begin = fsm("name begin",0,fsm("name",model->name,*end,0,all),0,all);
				break;
			case XML_CTYPE_MIXED:
				*display = incr(Tcl_NewStringObj("(#PCDATA"/*)*/,-1)); sep = "|";
				*begin = fsm("mixedbegin",0,fsm("#pcdata","#PCDATA",*end,0,all),0,all);
				break;
			case XML_CTYPE_CHOICE:
			case XML_CTYPE_SEQ:
				*display = incr(Tcl_NewStringObj("("/*)*/,-1));
				last = *begin = fsm("seqchoicebegin",0,0,0,all);
				break;
			default:
				*display = incr(Tcl_NewStringObj("EMPTY",-1));
				*begin = fsm("empty",0,*end,0,all);
				break;
		}
		for (i=0; i<model->numchildren; i++) {
			Obj C; pFsm b,e; Obj SL = xmlContent(&model->children[i],&C,&b,&e,all);
			if (SL) Tcl_ListObjAppendElement(0,L,SL);
			Tcl_AppendToObj(*display,sep,-1);
			Tcl_AppendObjToObj(*display,C); decr(C);
			switch (model->type) {
				case XML_CTYPE_MIXED:
				case XML_CTYPE_CHOICE:
					sep = "|";
					join("choice begin",*begin,b,all); join("choice end",e,*end,all);
					break;					
				case XML_CTYPE_SEQ:
					sep = ",";
					join("sequence and",last,b,all); last = e;
					break;
			}
		}
		switch (model->type) {
			case XML_CTYPE_MIXED:
				Tcl_AppendToObj(*display,/*(*/")",-1);
				if (model->numchildren>1) Tcl_AppendToObj(*display,"*",-1);
				join("mixed b->e",*begin,*end,all);
				join("mixed e->b",*end,*begin,all);
				break;
			case XML_CTYPE_CHOICE:
				Tcl_AppendToObj(*display,/*(*/")",-1);
				break;
			case XML_CTYPE_SEQ:
				Tcl_AppendToObj(*display,/*(*/")",-1);
				join("sequence ->e",last,*end,all);
				break;
		}
		switch (model->quant) {
			case XML_CQUANT_OPT:
				Tcl_AppendToObj(*display,"?",-1);
				join("? b->e",*begin,*end,all);
				break;
			case XML_CQUANT_REP:
				Tcl_AppendToObj(*display,"*",-1);
				join("* b->e",*begin,*end,all);
				join("* e->b",*end,*begin,all);
				break;
			case XML_CQUANT_PLUS:
				Tcl_AppendToObj(*display,"+",-1);
				join("+ e->b",*end,*begin,all);
				break;
		}
		return L;
	}else
		return 0;
}

static void elementDecl(ptr userData,const XML_Char *name,XML_Content *model) {
	pFsm all = 0;
	pTclContext c = userData; XML_Parser p = c->parser;
	Obj display,machine; pFsm begin,end; int N = 0;
	Obj piece = 0;
	c->anyElement = true;
	piece = pairstr(piece,"%what","element",-1);
	piece = pairstr(piece,"%name",name,-1);
	piece = pairobj(piece,"%model",xmlContent(model,&display,&begin,&end,&all));
	piece = pairobj(piece,"%display",display); decr(display);
	begin = fsm("#begin","#BEGIN",begin,0,&all); join("end->#end",end,fsm("#end","#END",0,0,&all),&all);
	epsilonMark(begin,&all);
	machine = incr(Tcl_NewObj()); fsmNumber(begin,&N,&all);
	fsmConstruct(begin,machine,&all); freeFsms(&all);
	piece = pairobj(piece,"%fsm",machine); decr(machine);
	piece = pairint(piece,"%line",XML_GetCurrentLineNumber(p));
	piece = pairint(piece,"%col",XML_GetCurrentColumnNumber(p));
	piece = pairIndex(piece,c); incrIndex(c);
	Tcl_ListObjAppendElement(0,c->result,piece); decr(piece);
	dispose(model);
}

		
 
section    top

static void attlist(ptr userData,const XML_Char *elname,
		const XML_Char *attname,
		const XML_Char *att_type,
		const XML_Char *dflt,
		int            isrequired)
{
	pTclContext c = userData; XML_Parser p = c->parser;
	chars assigned = dflt?(isrequired?"#FIXED":""):(isrequired?"#REQUIRED":"#IMPLIED");
	Obj display;
	Obj piece = 0;
	piece = pairstr(piece,"%what","attlist",-1);
	piece = pairstr(piece,"%element",elname,-1);
	piece = pairstr(piece,"%attname",attname,-1);
	piece = pairstr(piece,"%type",att_type,-1);
	piece = pairstr(piece,"%assigned",assigned,-1);
	piece = pairstr(piece,"%default",dflt,-1);
	piece = pairint(piece,"%required",isrequired);
	piece = pairint(piece,"%line",XML_GetCurrentLineNumber(p));
	piece = pairint(piece,"%col",XML_GetCurrentColumnNumber(p));
	display = incr(Tcl_NewStringObj(att_type,-1));
	if (dflt) {
		chars d;
		if (isrequired) {
			Tcl_AppendToObj(display," #FIXED",-1);
		}
		Tcl_AppendToObj(display," '",-1);
		for (d=dflt; *d; d++) {
			if (*d=='\'') {
				Tcl_AppendToObj(display,"&apos;",-1);
			}else {
				Tcl_AppendToObj(display,d,1);
			}
		}
		Tcl_AppendToObj(display,"'",-1);
	}else if (!dflt && isrequired) {
		Tcl_AppendToObj(display," #REQUIRED",-1);
	}else if (!dflt && !isrequired) {
		Tcl_AppendToObj(display," #IMPLIED",-1);
	}
	piece = pairobj(piece,"%display",display); decr(display);
	piece = pairIndex(piece,c); incrIndex(c);
	Tcl_ListObjAppendElement(0,c->result,piece); decr(piece);
}

		
 
section    top
32. Callback handlers ::
internal entity declaration
The entity declaration <!ENTITY name "value"> has been parsed. The property list includes
%what entityInternal
%name name
%value value
%line number
Which line.
%col number
Which column of the current line.
%index {number...}
Unique element index.
external unparsed entity declaration
The entity declaration <!ENTITY name [ PUBLIC publicliteral ] SYSTEM systemliteral NDATA notation-name> has been parsed. The property list includes
%what entityExternalUnparsed
%name name
%system systemliteral
%public publicliteral
%notation notation-name
%base uri
%line number
Which line.
%col number
Which column of the current line.
%index {number...}
Unique element index.
external parsed entity declaration
The entity declaration <!ENTITY name [ PUBLIC publicliteral ] SYSTEM systemliteral> has been parsed. The property list includes
%what entityExternalParsed
%name name
%system systemliteral
%public publicliteral
%base uri
%line number
Which line.
%col number
Which column of the current line.
%index {number...}
Unique element index.
internal parameter entity declaration
The entity declaration <!ENTITY % name "value"> has been parsed. The property list includes
%what parameterEntityInternal
%name name
%value value
%line number
Which line.
%col number
Which column of the current line.
%index {number...}
Unique element index.
external parameter entity declaration
The entity declaration <!ENTITY % name [ PUBLIC publicliteral ] SYSTEM systemliteral NDATA notation-name> has been parsed. The property list includes
%what parameterEntityExternal
%name name
%system systemliteral
%public publicliteral
%notation notation-name
%base uri
%line number
Which line.
%col number
Which column of the current line.
%index {number...}
Unique element index.

static void entity(ptr userData,
		const XML_Char *entityName,
		int            is_parameter_entity,
		const XML_Char *value,
		int            value_length,
		const XML_Char *base,
		const XML_Char *systemId,
		const XML_Char *publicId,
		const XML_Char *notationName)
{
	pTclContext c = userData; XML_Parser p = c->parser;
	Obj piece = 0;
	piece = pairstr(piece,"%what",
			is_parameter_entity
				? (value ? "parameterEntityInternal"
					: "parameterEntityExternal")
				: (value ? "entityInternal"
					: notationName ? "entityExternalUnparsed"
					: "entityExternalParsed")
	,-1);
	piece = pairstr(piece,"%name",entityName,-1);
	piece = pairstr(piece,"%value",value,value_length);
	piece = pairstr(piece,"%system",systemId,-1);
	piece = pairstr(piece,"%public",publicId,-1);
	piece = pairstr(piece,"%notation",notationName,-1);
	piece = pairstr(piece,"%base",base,-1);
	piece = pairint(piece,"%line",XML_GetCurrentLineNumber(p));
	piece = pairint(piece,"%col",XML_GetCurrentColumnNumber(p));
	piece = pairIndex(piece,c); incrIndex(c);
	Tcl_ListObjAppendElement(0,c->result,piece); decr(piece);
}

		
 
section    top

static void notation(ptr userData,
		const XML_Char *notationName,
		const XML_Char *base,
		const XML_Char *systemId,
		const XML_Char *publicId)
{
	pTclContext c = userData; XML_Parser p = c->parser;
	Obj piece = 0;
	piece = pairstr(piece,"%what","notation",-1);
	piece = pairstr(piece,"%system",systemId,-1);
	piece = pairstr(piece,"%public",publicId,-1);
	piece = pairstr(piece,"%notation",notationName,-1);
	piece = pairstr(piece,"%base",base,-1);
	piece = pairint(piece,"%line",XML_GetCurrentLineNumber(p));
	piece = pairint(piece,"%col",XML_GetCurrentColumnNumber(p));
	piece = pairIndex(piece,c); incrIndex(c);
	Tcl_ListObjAppendElement(0,c->result,piece); decr(piece);
}

		
 
section    top

34. Callback handlers :: If a reader-script was specified in the wyrm::expat::create command, the script has properties append to it, and then it is evaluated. The script is passed a parser token; this parser is given the XML source of the external entity of with wyrm::expat::resume, and it is ended with wyrm::expat::end when the source is completed and to continue the original XML source.

The added properties are

%parser parser-token
%base uri
%system systemliteral
%public publicliteral

These items bracket the items parsed from the external entity.

entity parser begin
This item indicates where the reader-script begins execution. The property list includes
%what entity-parser-begin
%system systemliteral
%public publicliteral
%encoding encoding
%base uri
%entity &name;
%line number
Which line.
%col number
Which column of the current line.
%index {number...}
Unique element index.
entity parser end
This item indicates where the reader-script ends execution. The property list includes
%what entity-parser-end
%line number
Which line.
%col number
Which column of the current line.
%index {number...}
Unique element index.

static int externalEntityRef(XML_Parser parser,
		const XML_Char *context,
		const XML_Char *base,
		const XML_Char *systemId,
		const XML_Char *publicId)
{
	pTclContext c = XML_GetUserData(parser);
	if (c->reader) {
		int o,m,l; chars t = XML_GetInputContext(parser,&o,&m),T; bool entity = false;
		Obj reader = incr(Tcl_DuplicateObj(c->reader)); int rc;
		Obj piece = 0;
		t += o; m -= o;
		if (c->cdata==0 && *t=='&' && m>2) {
			for (T=t,l=0; m>0 && !entity; m--,T++,l++) entity = *T==';';
		}
		c->refcount += 1;
		c->parser = XML_ExternalEntityParserCreate(parser,context,
				c->encoding?Tcl_GetStringFromObj(c->encoding,0):0);
		c->token = createParserToken(c->parser,0);
		reader = pairstr(reader,"%parser",c->token,-1);
		reader = pairstr(reader,"%base",base,-1);
		reader = pairstr(reader,"%system",systemId,-1);
		reader = pairstr(reader,"%public",publicId,-1);
		reader = pairobj(reader,"%encoding",c->encoding);
		piece = pairstr(piece,"%what","entity-parser-begin",-1);
		piece = pairstr(piece,"%base",base,-1);
		piece = pairstr(piece,"%system",systemId,-1);
		piece = pairstr(piece,"%public",publicId,-1);
		piece = pairobj(piece,"%encoding",c->encoding);
		piece = pairint(piece,"%line",XML_GetCurrentLineNumber(parser));
		piece = pairint(piece,"%col",XML_GetCurrentColumnNumber(parser));
		piece = pairIndex(piece,c); pushIndex(c);
		if (entity)	piece = pairstr(piece,"%entity",t,l);
		Tcl_ListObjAppendElement(0,c->result,piece); decr(piece);
		rc = Tcl_EvalObjEx(c->intr,reader,TCL_EVAL_DIRECT|TCL_EVAL_GLOBAL);
		if (Tcl_IsShared(c->result)) {
			Obj t = incr(Tcl_DuplicateObj(c->result));
			decr(c->result);
			c->result = t;
		}
		decr(reader); deleteParser(c->token);
		c->parser = parser;
		c->subparserError = rc!=TCL_OK;
		piece = pairstr(0,"%what","entity-parser-end",-1);
		piece = pairint(piece,"%line",XML_GetCurrentLineNumber(parser));
		piece = pairint(piece,"%col",XML_GetCurrentColumnNumber(parser));
		popIndex(c); piece = pairIndex(piece,c); incrIndex(c);
		Tcl_ListObjAppendElement(0,c->result,piece); decr(piece);
		return rc==TCL_OK;
	}else {
		return 1;
	}
}

		
   
   

Testing

   
top

35. Examples :: Parsing XML:

set token [wyrm::expat::create A]
wyrm::expat::resume $token <sometag>cdata</sometag>
wyrm::expat::end $token
returns
{
{
%what start
%name sometag
%line 1
%col 1
%index 1
}
{
%what content
%data cdata
%line 1
%col 10
%index {1 1}
}
{
%what end
%name sometag
%line 1
%col 15
%index 1
}
}

Parsing HTML:

set token [read wyrm::expat::tidy A [wyrm::data r "<head><title>T</title><p>x<br>y"]]
wyrm::expat::resume $token [read $token]
wyrm::expat::end $token
returns
{
{
%what start
%name html
...
}
{
%what start
%name html
}
{
%what start
%name title
}
{
%what content
%data T
}
{
%what end
%name title
}
{
%what end
%name head
}
{
%what start
%name body
}
{
%what start
%name p
}
{
%what content
%data x
}
{
%what start
%name br
}
{
%what end
%name br
}
{
%what content
%data y
}
{
%what end
%name p
}
{
%what end
%name body
}
{