Raptor RDF Parser Library - To Do List
Bugs and Features
Bugs should be reported at the Redland Issue Tracker.
Most of these should be migrated to the issue tracker above.
- Handle bad RDF URI references somehow, such as those containing
disallowed URI octets (RFC 2396 as updated by RFC 2732): either silently
convert on input or noisily convert on input. For example, > in a URI in RDF/XML would get turned into \u003E in an N-Triples output.
- Update the rdf/rdfs schema documents. Convert the Turtle labels
from Unicode character form into UTF-8 (DATA, DOCS)
- Ensure there is support to allow Rasqal to register a parser outside
raptor that provides RDF query results as triples. Alternatively, provide
a skeleton wrapper parser that allows the same by delegating all triple
generation to callbacks. (FEATURE)
- Add an N3 parser (FEATURE)
- Add a PNG parser, like pngmeta, that parses the embedded RDF/XML
into triples (FEATURE)
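The first item in the list above mentions turning > in a URI into \u003E for N-Triples output. A minimal sketch of that conversion follows. The function name escape_uri_for_ntriples and the exact set of escaped characters are assumptions made for illustration; this is not Raptor's API or its actual implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of Raptor): copy `uri` into `out`,
 * replacing characters that should not appear literally inside an
 * N-Triples <...> URI reference with \uXXXX escapes.
 *
 * Assumption: the escaped set is ASCII controls, space, DEL and
 * < > " { } | ^ ` \ ; all other bytes are copied through unchanged.
 *
 * Returns the number of bytes written (excluding the terminating NUL),
 * or -1 if `out` is too small. */
int escape_uri_for_ntriples(const char *uri, char *out, size_t out_size)
{
  size_t used = 0;

  assert(uri != NULL && out != NULL);

  for (; *uri; uri++) {
    unsigned char c = (unsigned char)*uri;
    int needs_escape = (c <= 0x20) || c == 0x7f ||
                       strchr("<>\"{}|^`\\", c) != NULL;

    if (needs_escape) {
      /* Need room for the 6 escape characters plus the final NUL. */
      if (out_size - used < 7)
        return -1;
      snprintf(out + used, 7, "\\u%04X", (unsigned int)c);
      used += 6;
    } else {
      /* Need room for this byte plus the final NUL. */
      if (out_size - used < 2)
        return -1;
      out[used++] = (char)c;
    }
  }

  if (used >= out_size)
    return -1;
  out[used] = '\0';
  return (int)used;
}
```

With this sketch, the input http://example.org/a>b is written out as http://example.org/a\u003Eb. Whether such a conversion should happen silently or with a warning is the open question in the item itself.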
Done
The most recent changes are at the top, each marked where appropriate
with the first release version that includes it.
- Describe in libraptor.3 relevant raptor.h structures -
describe important ones such as raptor_statement, raptor_locator and
raptor_feature as well as reference internal ones such as raptor_identifier,
raptor_identifier_type, raptor_uri_source, raptor_ntriples_term_type,
raptor_genid_type and raptor_uri_handler for the raptor_uri class.
(DOCS) (1.4.8)
- Add namespace/prefix declaration user level callback (FEATURE) (1.4.8)
- Add a guessing parser that sends an Accept: HTTP header for all
supported mime types and uses the returned headers to select a
parser. (FEATURE) (1.4.8)
- Add an XSLT 'parser' based on libxslt (FEATURE) (1.4.6)
- Escape the > delimiter when outputting N-Triples URI-references (1.4.3)
- Allow the supported parsers to be selected by configure (1.3.3)
- Errors which happened when fetching WWW content were always
printed to stderr. They are now passed to the main error routines
which allows applications to retrieve them. (1.3.2)
- In lax mode, warn when unknown rdf:parseType values are seen to prevent
things like 'owl:collection' and 'collection' passing through (1.3.1)
- Turtle parser: a bare ':' and qnames such as 'rdf:_1' now work (1.3.1)
- Describe in libraptor.3 the use of UTF-8 for strings and URIs. (1.3.1)
- Send an HTTP Accept: header with WWW requests corresponding to the
mime type of the parser selected, accepting all others at lower q (1.3.0)
- Guess parser from a mime type, content fragment and/or
content name such as a filename or URI (1.3.0)
- Turtle parser: use raptor_generate_id for blank node identifiers (1.3.0)
- Added --enable-xml-1-1-names to enable XML 1.1 name checking instead of XML 1.0 (1.1.0)
- Updated XML 1.1 name checking for ranges in the XML/Namespaces in XML 1.1 proposed recommendations (1.1.0)
- Added --disable-nfc-check to disable the NFC linking/checking with GNOME glib, even if that library is present (1.1.0)
- Made the N-Triples parser use raptor_generate_id for blank node identifiers (1.1.0)
- Updated the RDF/XML parser to handle the libxml 2.6.0 SAX2 API, which
changes the names of all of the SAX1 calls (1.1.0)
- Added an N-Triples Plus parser (1.1.0)
- Correct line counting for N-Triples files with \r\n (DOS) line endings
and when a line crosses a chunk boundary. (1.1.0)
- Handle WIN32 file URIs starting file://c: ... (1.1.0)
- Scanning (rapper --scan) for rdf:RDF in embedded XML does not work (0.9.13)
- URI retrieval - make sure it chops off the fragment before
fetching. (0.9.12)
- Make 'make check' not die if NFC tests fail, possible if
no GNOME glib2 is present. (0.9.12)
- Unicode character normalization NFC checks not implemented (0.9.11)
- XML (Exclusive) Canonicalization for XML Literals not implemented (0.9.11)
- libxml2 currently does not do XML attribute normalization i.e.
removing whitespace around attribute content. Added a fix (0.9.11)
- Added raptor_www_no_www_library_init_finish to allow once-only
www library startup/shutdown to be prevented. By default it is performed,
so that most higher level apps do not need to know or care (0.9.10)
- Docs updated for 0.9.7 to 0.9.10 API changes in the libraptor.3 manual page
- Escape XML attribute values in parseType literal content generation (0.9.10)
- Passing a NULL base URI to raptor_start_parse for the rdfxml parser
crashed it. Now the RDF/XML parser returns failure instead. (0.9.9)
- rdf:parseType="Literal" content with &, <, > and
Unicode characters was not escaped into entities/character entities
in the encoded string. (0.9.9)
- XML Namespaces declared with a prefix and no namespace name (URI)
were accepted - this is illegal (0.9.8)
- Empty files made rdf/xml parser crash with libxml2 and expat; now both
return failure since an empty doc is not allowed (0.9.8)
- rdf:bagID handling added (0.9.7)
- Can now configure on a system that has only expat using
./configure --with-xml-parser=expat (0.9.7)
- Fix compilation failing with libxml 2.3.5 and nearby versions on FreeBSD (which is 20 months old), caused by a change to the xmlSAXHandler structure (0.9.7)
- file: URIs were not correctly handled (0.9.6)
- Resilience to XML parser errors, RDF/XML grammar errors (0.9.6)
- Manual pages (0.9.6)
- Made CDATA section work with libxml (0.9.6)
- daml:collection is generating some wrong triples (BUG) (0.9.6)
- Compiling on OSX fails on most systems since it requires libtool 1.4.2
or CVS version which requires automake 2.50 and a newer version of
autoconf. This needs significant changes to the autoconfigure
system. See Fink porting libtool to OSX.
The fix for now (for the packaged sources only, not CVS) is to use
a patched libtool 1.4.2 (from Debian) that can generate a libtool
that knows OSX. (0.9.6)
- URI resolving to a base URI now working (0.9.5)
- Handle <prop:Elt rdf:ID="id" rdf:resource="http://example.org/obj"/> (RDF Core WG syntax change) (0.9.5)
- Add xml:base support - RDF Core WG syntax change (0.9.4)
- Perform xml:lang processing and pass to application (0.9.4)
- parseType literal broken (0.9.4)
- Tracking of user IDs/generated IDs available to user code (0.9.3)
- daml:collection parseType support (0.9.3)
- rdf:li used as a propertyElt does not work (0.9.3)
- parseType literal support complete (0.9.3)
- Fixed many crashes (0.9.3)
Could not duplicate problem list
- Turtle parser: the lexer lval->string values get overwritten on errors (BUG?)
- RSS tag soup parser junks/overwrites element content if it is
delivered in chunks rather than as one big CDATA, such as when
libxml2 sees entities and emits the content in bits (BUG)
Decided not to do list
- Add a gzip2/bzip2 content reading interface, using libxml2 to do the hard work (FEATURE): users can do this if they need it with the chunk API, or using curl with HTTP content compression
- Provide a Perl interface (FEATURE): use the Redland bindings
- LSID URN support (FEATURE)
- Record the xml parser used and make available from API (FEATURE)
- Other rdf:parseType support (FEATURE)
No need to do list
These follow decisions from the RDF Core WG, as recorded in the
attention developers area of the RDF Issue Tracking document.
- aboutEach support - removed from syntax
- aboutEachPrefix support - removed from syntax
- Special container support - not special anymore, just typed nodes
Copyright (C) 2001-2010 Dave Beckett
Copyright (C) 2001-2005 University of Bristol