Commit 2eaff263 authored by gerd

fixes for release


git-svn-id: https://godirepo.camlcity.org/svn/lib-pxp/trunk@748 dbe99aee-44db-0310-b2b3-d33182c8eb97
parent 182301ce
Copyright 1999 by Gerd Stolpmann
Copyright 1999-2009 by Gerd Stolpmann
The package PXP is copyright by Gerd Stolpmann.
......
......@@ -9,7 +9,7 @@ with_wlex_compat=1
with_ulex=1
with_pp=1
lexlist="utf8,iso88591,iso88592,iso88593,iso88594,iso88595,iso88596,iso88597,iso88598,iso88599,iso885910,iso885913,iso885914,iso885915,iso885916"
version="1.2.0test2"
version="1.2.1"
exec_suffix=""
help_lex="Enable/disable ocamllex-based lexical analyzer for the -lexlist encodings"
......
......@@ -50,6 +50,8 @@ for PXP; if you are looking for the stable distribution, please go
<sect1>
<title>Version History</title>
<ul>
<li><p>There is currently no development version.</p></li>
<--
<li>
<p><em>1.2.1:</em> Revised documentation</p>
<p>Addition: Pxp_event.unwrap_document</p>
......@@ -199,7 +201,7 @@ instruction, only misc* element misc* or whole documents are possible).
When an external entity A opens an external entity B, and B opens C,
relative paths of C have been interpreted wrong.</p>
</li>
-->
<!--
<li><p><em>1.1:</em> This is the new stable release!</p></li>
<li><p><em>1.0.99:</em></p>
......
<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE readme SYSTEM "readme.dtd" [
<!ENTITY % common SYSTEM "common.xml">
%common;
<!-- Special HTML config: -->
<!ENTITY % readme:html:up '<a href="../..">up</a>'>
<!ENTITY % config SYSTEM "config.xml">
%config;
]>
<readme title="Extensions of the XML specification">
<sect1>
<title>This document</title>
<p>PXP has some options that extend the XML specification. This document
explains them.
</p>
</sect1>
<sect1>
<title>Optional declarations instead of mandatory declarations</title>
<p>The XML spec demands that elements, notations, and attributes must be
declared. However, there are sometimes situations where a different rule would
be better: <em>If</em> there is a declaration, the actual instance of the
element type, notation reference or attribute must match the pattern of the
declaration; but if the declaration is missing, a reasonable default declaration
should be assumed.</p>
<p>I have an example that seems to be typical: The inclusion of HTML into a
meta language. Imagine you have defined some type of "generator" or other tool
working with HTML fragments, and your document contains two types of elements:
The generating elements (with a name like "gen:xxx"), and the object elements
which are HTML. As HTML is still evolving, you do not want to declare the HTML
elements; the HTML fragments should be treated as well-formed XML fragments. In
contrast to this, the elements of the generator should be declared and
validated because you can more easily detect errors.</p>
<p>The following two processing instructions can be included in the DTD:</p>
<ul>
<li><p><code><![CDATA[<?pxp:dtd optional-element-and-notation-declarations?>]]></code>
References to unknown element types and notations no longer cause an
error. The element may contain anything, but it must still be
well-formed. It may have arbitrary attributes, and every attribute is
treated as an #IMPLIED CDATA attribute.</p>
</li>
<li><p><code><![CDATA[<?pxp:dtd optional-attribute-declarations elements="x y ..."?>]]></code>
References to unknown attributes inside one of the enumerated elements
no longer cause an error. Such an attribute is treated as an #IMPLIED
CDATA attribute.
</p>
<p>If there are several "optional-attribute-declarations" PIs, they are all
interpreted (implicitly merged).</p>
</li>
</ul>
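<p>The effect of the second processing instruction can be sketched in a few
lines of OCaml. This is only a toy model using the standard library, not PXP
code: the names <em>att_type</em>, <em>outcome</em> and
<em>check_attribute</em> are invented for illustration, and only CDATA and
enumerated attribute types are modelled.
<code><![CDATA[
```ocaml
(* Toy model of the optional-attribute-declarations rule; NOT PXP code. *)
type att_type = Cdata | Enum of string list
type outcome = Valid | Invalid of string

(* [declared] models the ATTLIST entries of one element type.
   [optional] says whether the element type was enumerated in an
   optional-attribute-declarations PI. *)
let check_attribute ~declared ~optional name value =
  match List.assoc_opt name declared with
  | Some (Enum allowed) when not (List.mem value allowed) ->
    (* A declaration exists, so the instance must match it. *)
    Invalid ("value not allowed for " ^ name)
  | Some _ -> Valid
  | None when optional ->
    (* No declaration: behave as if it were an #IMPLIED CDATA attribute. *)
    Valid
  | None -> Invalid ("undeclared attribute " ^ name)
```
]]></code>
In this model a declared attribute is always checked against its declaration;
the option only changes what happens when the declaration is missing.</p>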
</sect1>
<sect1>
<title>Normalized namespace prefixes</title>
<p>
The XML standard refers to names within namespaces as <em>expanded
names</em>. This is simply the pair (namespace_uri, localname); the namespace
prefix is not included in the expanded name.</p>
<p>
PXP does not support expanded names, but it does support namespaces. However,
it uses a model that is slightly different from the usual representation of
names in namespaces: Instead of removing the namespace prefixes and converting
the names into expanded names, PXP prefers to normalize the namespace
prefixes used in a document, i.e. the prefixes are transformed such that they
refer uniquely to namespaces.</p>
<p>
The following text is valid XML:
<code><![CDATA[
<x:a xmlns:x="namespace1">
<x:a xmlns:x="namespace2">
</x:a>
</x:a>
]]></code>
The first element has the expanded name (namespace1,a) while the second element
has the expanded name (namespace2,a); so the elements have different types. As
already pointed out, PXP does not support the expanded names directly.
Instead, the
XML text is transformed while it is being parsed such that the prefixes become
unique. In this example, the transformed text would read:
<code><![CDATA[
<x:a xmlns:x="namespace1">
<x1:a xmlns:x1="namespace2">
</x1:a>
</x:a>
]]></code>
From a programmer's point of view, this transformation has the advantage that
you need not deal with pairs when comparing names, as all names are still
simple strings: here, "x:a", and "x1:a". However, the transformation may seem
somewhat arbitrary. Why not "y:a" instead of "x1:a"? The answer is that PXP allows
the programmer to control the transformation: You can simply demand that
namespace1 must use the normalized prefix "x", and namespace2 must use "y". The
normalized prefix to use can be set programmatically (by setting the
namespace_manager object), or it can be declared in the DTD:
<code><![CDATA[
<?pxp:dtd namespace prefix="x" uri="namespace1"?>
<?pxp:dtd namespace prefix="y" uri="namespace2"?>
]]></code>
There is another advantage of using normalized prefixes: You can safely refer
to them in DTDs. For example, you could declare the two elements as
<code><![CDATA[
<!ELEMENT x:a (y:a)>
<!ELEMENT y:a ANY>
]]></code>
These declarations are applicable even if the XML text uses different prefixes,
because PXP normalizes any prefixes for namespace1 or namespace2 to the
preferred prefixes "x" and "y".
</p>
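<p>The idea of prefix normalization can be sketched with the OCaml standard
library alone. The following is a toy model and not PXP's implementation: the
record <em>manager</em> and the functions <em>declare</em> and
<em>normalize</em> are hypothetical names, and the rule for inventing a fresh
prefix (appending a number) is an assumption chosen to reproduce the "x1"
example above.
<code><![CDATA[
```ocaml
(* Toy model of prefix normalization; NOT PXP's namespace_manager. *)
module SMap = Map.Make (String)

type manager = {
  mutable by_uri : string SMap.t;     (* namespace URI -> normalized prefix *)
  mutable by_prefix : string SMap.t;  (* normalized prefix -> namespace URI *)
}

let create () = { by_uri = SMap.empty; by_prefix = SMap.empty }

(* Fix the normalized prefix of [uri], in the spirit of the
   "namespace" processing instruction in the DTD. *)
let declare m ~prefix ~uri =
  m.by_uri <- SMap.add uri prefix m.by_uri;
  m.by_prefix <- SMap.add prefix uri m.by_prefix

(* Return the normalized prefix for [uri]. For an unknown URI, try the
   display prefix itself, then the display prefix followed by 1, 2, ...
   until a prefix is found that is not yet bound to another URI. *)
let normalize m ~display_prefix ~uri =
  match SMap.find_opt uri m.by_uri with
  | Some p -> p
  | None ->
    let rec pick n =
      let candidate =
        if n = 0 then display_prefix
        else display_prefix ^ string_of_int n in
      if SMap.mem candidate m.by_prefix then pick (n + 1) else candidate
    in
    let p = pick 0 in
    declare m ~prefix:p ~uri;
    p
```
]]></code>
With an empty manager, the two "x" prefixes of the example map to "x" and "x1".
After declaring "x" for namespace1 and "y" for namespace2, the same document
maps to "x" and "y", whatever prefixes the XML text itself uses.</p>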
<p>Since PXP-1.1.95, the namespace support has been extended. In
addition to prefix normalization, the parser now also stores the
scoping structure of the namespaces (in the namespace_scope
objects). More or less, this means that the parser remembers
which elements have which "xmlns" attributes. There are two
important applications of this feature:</p>
<p>First, it is now possible to look up the namespace URI when
only the original, non-normalized namespace prefix is known.
A number of XML standards, e.g. XSchema, use namespace prefixes
within data nodes. Of course, these prefixes are not normalized
by PXP, but simply remain as they are when the XML text is
parsed. To get the URI of such a prefix p in the context of node
n, just call
<code>
n # namespace_scope # uri_of_display_prefix p
</code>
In PXP terminology, the non-normalized prefixes are now called
"display prefixes".</p>
<p>The other application is that it is now even possible to
retrieve the original "display" prefix of node names, e.g.
<code>
n # display_prefix
</code>
returns it. However, the display prefix is only guessed in the
sense that when there are several prefixes bound to the same
URI, one of the prefixes may be taken. For instance, in
<code><![CDATA[
<x:a xmlns:x="sample" xmlns:y="sample"/>
]]></code>
both "x" and "y" are bound to the same URI "sample", and
the display_prefix method now selects one of the prefixes
at random.</p>
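<p>The lookup of a URI from a display prefix can be pictured as a walk up a
chain of scopes, one per element that carries "xmlns" attributes. The sketch
below is a stdlib-only model and not the real namespace_scope class: the record
<em>scope</em> is invented for illustration, and returning an option for an
unbound prefix is an assumption of the model, not a statement about PXP's
behaviour.
<code><![CDATA[
```ocaml
(* Toy model of a namespace scope chain; NOT PXP's namespace_scope class. *)
type scope = {
  declarations : (string * string) list;  (* (display prefix, URI) pairs *)
  parent : scope option;                  (* scope of the enclosing element *)
}

(* Innermost declaration wins; otherwise ask the enclosing scope. *)
let rec uri_of_display_prefix scope prefix =
  match List.assoc_opt prefix scope.declarations with
  | Some uri -> Some uri
  | None ->
    (match scope.parent with
     | Some p -> uri_of_display_prefix p prefix
     | None -> None)

(* The nested x:a example: "x" is rebound by the inner element. *)
let outer = { declarations = ["x", "namespace1"]; parent = None }
let inner = { declarations = ["x", "namespace2"]; parent = Some outer }
```
]]></code>
In this model, looking up "x" in the inner scope yields namespace2, while the
same prefix in the outer scope yields namespace1.</p>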
<p>It is now even possible to output the parsed XML text
with original namespace structure: The "display" method
outputs XML text where the namespaces are declared as in the
original XML text.</p>
<p>PXP treats the "xmlns" attributes in a very special
way. Not only may they be left undeclared in the DTD; any such declarations
would not even be applied to the actual "xmlns" attributes. For example,
it is not possible to declare a default value for "xmlns:x", as in
<code><![CDATA[
<!ATTLIST ... xmlns:x CDATA "mynamespaceuri">
]]></code>
The default value would be ignored. Furthermore, it is not possible to
declare "xmlns" attributes as being required - validation will always
fail even if the "xmlns" attribute is present.</p>
<p>The model behind this treatment is defined by the "XML information
set" standard. There are two kinds of attributes: normal attributes,
and namespace attributes. PXP validates only normal attributes.</p>
</sect1>
</readme>
......@@ -25,7 +25,7 @@ installrel = $(H)/homepage/ocaml-programming.de/packages/documentation/pxp/index
.PHONY: all
all: README INSTALL ABOUT-FINDLIB SPEC EXTENSIONS PREPROCESSOR
all: README INSTALL ABOUT-FINDLIB SPEC
README: README.xml common.xml config.xml readme.dtd
$(readme) -text README.xml >README
......
<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE readme SYSTEM "readme.dtd" [
<!ENTITY % common SYSTEM "common.xml">
%common;
<!-- Special HTML config: -->
<!ENTITY % readme:html:up '<a href="../..">up</a>'>
<!ENTITY % config SYSTEM "config.xml">
%config;
]>
<readme title="Release Notes for PXP (Last updated: 20-Jul-2001)">
<sect1>
<title>Linker problems</title>
<sect2>
<title>Assert_failure ("pxp_lexers.ml", 1573, 1585)</title>
<p>This issue applies to: <em>PXP 1.1</em></p>
<p><em>Problem description:</em>
<code>
"We just install the pxp parser and have trouble in using it.
We always obtained an exception that we can not suppress.
Do you know if it's a problem of setting configuration options.
# let d = parse_dtd_entity default_config (from_file "DAG.xml");;
^^^^^^^^^^^?
Uncaught exception: Assert_failure ("pxp_lexers.ml", 1573, 1585)."
</code>
</p>
<p><em>Explanation:</em> You need to link a lexer into the executable. (Admittedly,
the error message could be better; an Assert_failure is not very informative.)
Lexers are dynamically registered at runtime in the module Pxp_lexers, so
you can link an executable that does not contain a lexer without producing a
linker error.</p>
<p>Examples of how to link correctly:</p>
<ul>
<li><p>Direct linking without findlib:
<code>
ocamlc -o my_exec ... pxp_engine.cma pxp_lex_iso88591.cma
pxp_lex_link_iso88591.cmo
</code>
Don't forget to add the .cmo file! It contains the registration code that plugs
the lexers from pxp_lex_iso88591.cma into the registry. It _must_ be a .cmo
file, otherwise the initialization call would be "garbage-collected" at
link-time. (Use the corresponding .cmx file if ocamlopt is used.)
</p>
</li>
<li><p>Linking with findlib:
<code>
ocamlfind ocamlc -o my_exec ... -package "pxp_engine,pxp_lex_iso88591" -linkpkg
</code>
</p>
</li>
</ul>
<p>For the other lexers, the linker calls are similar.</p>
<p>The following combinations of lexers are reasonable:</p>
<ul>
<li><p>pxp_lex_iso88591 alone</p>
</li>
<li><p>pxp_lex_utf8 alone</p>
</li>
<li><p>pxp_lex_iso88591 + pxp_lex_utf8</p>
</li>
<li><p>pxp_wlex alone</p>
</li>
</ul>
<p>It is not reasonable to have no lexer at all, even if PXP is only
used to represent XML trees internally, and not as a parser. There must be a lexer
for the character set that is used for the internal representation of
strings. For example, if config.encoding = `Enc_utf8 (see module Pxp_yacc for
the config record), there must be a lexer for UTF-8 (i.e. pxp_lex_utf8 or
pxp_wlex).</p>
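<p>Why a missing lexer shows up only at runtime can be illustrated with a
small registry. This is a toy model using the standard library, not the real
Pxp_lexers module: the type <em>encoding</em> and the functions
<em>register</em> and <em>get_lexer</em> are invented names, and the lexer is
represented by a plain string.
<code><![CDATA[
```ocaml
(* Toy model of a runtime lexer registry; NOT the real Pxp_lexers module. *)
type encoding = Enc_iso88591 | Enc_utf8

let registry : (encoding, string) Hashtbl.t = Hashtbl.create 4

(* In this model, a "link" module would call this from its
   initialization code when it is linked into the executable. *)
let register enc lexer_name = Hashtbl.replace registry enc lexer_name

(* The lookup happens only when a parser asks for a lexer, so a
   missing registration is a runtime failure, never a linker error. *)
let get_lexer enc =
  match Hashtbl.find_opt registry enc with
  | Some lexer -> lexer
  | None -> failwith "no lexer linked for this encoding"
```
]]></code>
Nothing in the program refers to a particular lexer by name, so the linker has
no way to notice that none was registered. The same decoupling is the reason
the registering module has to be linked explicitly.</p>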
</sect2>
</sect1>
</readme>
default: odoc
.PHONY: odoc
odoc: html/pic/done
cd ../../src/pxp-engine && $(MAKE) doc
rm -rf html/ref
mkdir -p html/ref
cp style.css html/ref
ocamldoc -v -g ../../tools/src/odoc/chtml.cmo \
-t "PXP Reference" \
-I ../../src/pxp-engine \
-load ../../src/pxp-engine/pxp_engine.dump \
-d html/ref \
-colorize-code \
-css-style style.css -intro index.txt
clean:
rm -rf html man ps
rm -f src/readme.ent
CLEAN: clean
distclean:
rm -f src/*~
rm -f *~
rm -f ps/*.aux ps/*.dvi ps/*.log ps/*.tex
#-----------------------------------------------------------------
# The following is for the old manual. No longer updated, and totally
# out of date.
# Note: The following Makefile works for _me_. The stable releases of
# PXP contain an already built manual, so you do not need this Makefile.
......@@ -20,8 +53,6 @@ UPSOURCE = $(shell for x in $(SOURCE); do echo "../$$x"; done)
.PHONY: html ps
default: html ps
html: html/book1.htm html/pic/done
ps: ps/pxp.ps ps/pic/done
......@@ -89,29 +120,3 @@ ps/pic/done: src/pic/*.fig
.sgml.xml:
sx -xndata $< >$@; true
.PHONY: odoc
odoc: html/pic/done
cd ../../src/pxp-engine && $(MAKE) doc
rm -rf html/ref
mkdir -p html/ref
cp style.css html/ref
ocamldoc -v -g ../../tools/src/odoc/chtml.cmo \
-t "PXP Reference" \
-I ../../src/pxp-engine \
-load ../../src/pxp-engine/pxp_engine.dump \
-d html/ref \
-colorize-code \
-css-style style.css -intro index.txt
clean:
rm -rf html man ps
rm -f src/readme.ent
CLEAN: clean
distclean:
rm -f src/*~
rm -f *~
rm -f ps/*.aux ps/*.dvi ps/*.log ps/*.tex
......@@ -36,30 +36,21 @@ f doc/ABOUT-FINDLIB
f doc/INSTALL
f doc/README
f doc/DEV
f doc/EXTENSIONS
f doc/PREPROCESSOR
#f doc/RELEASE-NOTES
f doc/SPEC
f doc/design.txt
#d doc/manual
#f doc/manual/Makefile
#d doc/manual/html
#p doc/manual/html/.*\.html
#p doc/manual/html/.*\.css
#d doc/manual/html/pic
#p doc/manual/html/pic/.*\.gif
#d doc/manual/ps
#f doc/manual/ps/pxp.ps
#d doc/manual/src
#f doc/manual/src/extract.ml
#f doc/manual/src/getcode.ml
#f doc/manual/src/markup.css
#f doc/manual/src/markup.dsl
#f doc/manual/src/markup.sgml
#d doc/manual/src/extracted
#d doc/manual/src/pic
#p doc/manual/src/pic/.*\.fig
d doc/manual
f doc/manual/Makefile
d doc/manual/html
d doc/manual/html/ref
d doc/manual/html/pic
p doc/manual/html/ref/.*\.html
p doc/manual/html/ref/.*\.css
p doc/manual/html/pic/.*\.gif
f doc/manual/index.txt
d doc/manual/src
d doc/manual/src/pic
p doc/manual/src/pic/.*\.fig
d examples
f examples/Makefile
......@@ -151,7 +142,6 @@ f src/pxp-engine/pxp_core_parser.mli
f src/pxp-engine/pxp_core_parser.m2y
f src/pxp-engine/pxp_core_types.mli
f src/pxp-engine/pxp_core_types.ml
f src/pxp-engine/pxp_core_types_type.mli
f src/pxp-engine/pxp_dfa.ml
f src/pxp-engine/pxp_dfa.mli
f src/pxp-engine/pxp_document.ml
......@@ -184,12 +174,11 @@ f src/pxp-engine/pxp_top.ml
f src/pxp-engine/pxp_top.mli
f src/pxp-engine/pxp_tree_parser.mli
f src/pxp-engine/pxp_tree_parser.ml
f src/pxp-engine/pxp_type_anchor.mli
f src/pxp-engine/pxp_type_anchor.ml
f src/pxp-engine/pxp_types.ml
f src/pxp-engine/pxp_types.mli
f src/pxp-engine/pxp_yacc.ml
f src/pxp-engine/pxp_yacc.mli
p src/pxp-engine/intro_.*\.txt
d src/pxp-lex
f src/pxp-lex/char_classes_generic.def
......@@ -385,3 +374,7 @@ f tools/src/lexpp/uni_lexer.mll
f tools/src/lexpp/uni_parser.mly
f tools/src/lexpp/uni_types.ml
f tools/src/lexpp/ucs2_to_utf8.ml
d tools/src/odoc
f tools/src/odoc/Makefile
f tools/src/odoc/chtml.ml