reading/writing terms from/to binary streams

Discussion on the revision of the current Core standard

Moderator: Paulo Moura

Post by mh » Mon Aug 25, 2008 10:47 am

ISO Prolog has the concept of streams. A stream is either of type text or of type binary. Prolog terms can only be read from and written to text streams (read_term/write_term); binary streams only offer byte-level I/O (get_byte/put_byte). A stream can also have the repositioning property. My vendor, SICS, offers repositioning only on binary streams, for performance reasons. This brings me to the following.

Suppose I want to build a read-only database of facts, that is, a collection of Prolog terms (functor-argument structures) stored in an external file, and that I record in a table the position at which each term starts (obtained with stream_property/2). An index on the terms, combined with this table of starting positions (kept in internal memory) and the repositioning property, then gives me direct access to an arbitrary term in the external file. This is a very attractive feature of ISO Prolog, not available in the Edinburgh syntax.
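
The idea above can be sketched roughly as follows. This is only an illustration under several assumptions: the predicate names (store_terms/3, fetch_term/3 and their helpers) are made up for this post; the stream handling uses ISO predicates only; the two conversions between a term and its codes are not ISO and are written here with SWI-Prolog's with_output_to/2 and term_to_atom/2 (another system would need its own equivalents); all character codes are assumed to be below 256; and writeq/1 is assumed to escape newlines inside quoted atoms, so that byte 10 can safely end a record.

```prolog
% store_terms(+File, +Terms, -Positions)
% Write each term as a run of bytes ended by byte 10 and
% collect the stream position at which each term starts.
store_terms(File, Terms, Positions) :-
    open(File, write, S, [type(binary)]),
    store_all(Terms, S, Positions),
    close(S).

store_all([], _, []).
store_all([T|Ts], S, [P|Ps]) :-
    stream_property(S, position(P)),   % start of this record
    term_to_codes(T, Codes),
    put_bytes(Codes, S),
    put_byte(S, 10),                   % record terminator (assumption)
    store_all(Ts, S, Ps).

put_bytes([], _).
put_bytes([B|Bs], S) :-
    put_byte(S, B),
    put_bytes(Bs, S).

% fetch_term(+File, +Position, -Term)
% Jump straight to a stored position and read one record back.
fetch_term(File, Position, Term) :-
    open(File, read, S, [type(binary), reposition(true)]),
    set_stream_position(S, Position),
    get_record(S, Codes),
    close(S),
    codes_to_term(Codes, Term).

% Collect bytes up to the terminator or end of stream.
get_record(S, Codes) :-
    get_byte(S, B),
    (   ( B =:= 10 ; B =:= -1 )
    ->  Codes = []
    ;   Codes = [B|Rest],
        get_record(S, Rest)
    ).

% NOT ISO: vendor-specific conversions (SWI-Prolog assumed).
term_to_codes(Term, Codes) :-
    with_output_to(codes(Codes), writeq(Term)).

codes_to_term(Codes, Term) :-
    atom_codes(Atom, Codes),
    term_to_atom(Term, Atom).
```

A call such as store_terms('facts.bin', [f(a), g(b)], Ps) followed by fetch_term('facts.bin', P, T), with P taken from Ps, should retrieve a single fact without reading the ones before it. Only the last two predicates fall outside ISO Prolog, which is exactly the gap described below.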

The Problem
Reading Prolog terms from a binary stream is not supported in ISO Prolog. Instead, only get_byte is available, which reads a single byte from a binary stream. It is easy to flatten an arbitrary Prolog term to a list of bytes and write it to a binary stream using put_byte. It is far from trivial, however, to write a predicate that reads a series of bytes (using get_byte), terminated by some special byte (e.g. 46, a full stop), and converts it into a Prolog term. Problematic issues include the recursive nature of a Prolog term, operators, quoting inside quoted atoms, double quoting, and character conversion (ASCII/Unicode). Yet every Prolog system already has the tokenizing and parsing of Prolog terms built in, in the form of the read predicate and family. By definition read/write and family are able to handle bytes, as files are collections of bytes.

What are we missing in ISO Prolog?
ISO Prolog is missing the functionality of converting a list of bytes to a Prolog term and vice versa.

What is available on the market?
Vendors have recognized this omission and have implemented their own predicates. SICStus v4 has, among others, write_to_codes/2 and read_from_codes/2 in library(codesio) (section 10.8 on http://www.sics.se/sicstus/docs/latest4/html/sicstus.html/lib_002dcodesio.html#lib_002dcodesio). SWI-Prolog has read_line_to_codes/2 in library(readutil) (section A.19 on http://gollem.science.uva.nl/SWI-Prolog/Manual/readutil.html#read_line_to_codes/3). GNU Prolog has write_term_to_codes/3 and read_term_from_codes/3 (section 7.15 in http://www.gprolog.org/manual/gprolog.html#htoc164).

Suggestions
1. An 'explicit' suggestion.
As ISO Prolog already has char_code/2, atom_codes/2 and number_codes/2, what we are missing is their generalization term_codes/2, especially in mode term_codes(-,+), because tokenizing and parsing of terms is in general too difficult to do by oneself.
term_codes/2 might be extended with a third argument for specifying options, such as how to handle quoted atoms (with or without surrounding quotes), how to handle variables (numbervar-ed or not), how to handle operators (wrap into functors or not), etc.
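
To make the first suggestion concrete, here is a minimal sketch of what term_codes/2 and term_codes/3 could look like. It is not a proposal for the actual wording of the standard. It assumes SWI-Prolog, whose with_output_to/2 and term_to_atom/2 do the real work, and in this sketch the options are passed to write_term/2 only, so they affect the (+,-) direction and are ignored when parsing.

```prolog
% term_codes(?Term, ?Codes): hypothetical predicate from suggestion 1.
% Writes with quoted(true) so that the codes can be read back.
term_codes(Term, Codes) :-
    term_codes(Term, Codes, [quoted(true)]).

% term_codes(?Term, ?Codes, +Options)
% Options are ISO write_term/2 options such as quoted/1,
% ignore_ops/1 and numbervars/1.
term_codes(Term, Codes, Options) :-
    (   is_list(Codes)
    ->  % mode (-,+): parse the codes into a term.
        % Vendor-specific, not ISO (SWI-Prolog assumed).
        atom_codes(Atom, Codes),
        term_to_atom(Term, Atom)
    ;   % mode (+,-): serialize the term into codes.
        % with_output_to/2 is vendor-specific as well.
        with_output_to(codes(Codes), write_term(Term, Options))
    ).
```

For example, after atom_codes('f(X, [1,2])', Cs), the goal term_codes(T, Cs) should bind T to a term of the form f(_, [1,2]), and term_codes(1+2, Cs, [ignore_ops(true)]) should produce the codes of the term written in functional notation. A standardized version would also have to define which options apply to reading.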

2. An 'implicit' suggestion
As an alternative, the ISO Prolog predicates read and write and family could be adapted so that they automatically convert a term from/to a list of bytes when reading a term from or writing a term to a binary stream. This solution is in line with the way options of the open/3 and open/4 predicates are handled. When, for example, the option quoted(true) is specified when opening a stream, this means that for all successive write operations to this stream, each atom and functor will be quoted if this is necessary for the term to be input by read_term. So, when the option type(binary) is specified for a stream, this means that all I/O calls to/from this stream should be handled as binary, taking care of any conversion.

Does this make sense?
What goal did the designers of ISO Prolog have in mind when they defined binary streams without making them easy to handle? With ISO Prolog predicates only, one can only write to a binary stream a series of bytes corresponding to a non-hierarchical or pre-formatted Prolog term; otherwise one cannot (in general) read it back.
