Protocol buffers are Google's language-neutral, platform-neutral,
extensible mechanism for serializing structured data -- think XML, but
smaller, faster, and simpler. You define how you want your data to be
structured once. This takes the form of a template that describes the
data structure. You use this template to encode and decode your data
structure into wire-streams that may be sent-to or read-from your peers.
The underlying wire stream is platform independent, lossless, and may be
used to interwork with a variety of languages and systems regardless of
word size or endianness. Techniques exist to safely extend your data
structure without breaking deployed programs that are compiled against
the "old" format.
The idea behind Google's Protocol Buffers is that you define your
structured messages using a domain-specific language and tool
set. Further documentation on this is at
https://developers.google.com/protocol-buffers.
protobuf_parse_from_codes(+WireCodes:list(int), +MessageType:atom, -Term) is semidet- Process bytes (list of int) that is the serialized form of a message (designated
by
MessageType
), creating a Prolog term.
Protoc
must have been run (with the --swipl_out=
option and the resulting
top-level _pb.pl
file loaded. For more details, see the "protoc" section of the
overview documentation.
Fails if the message can't be parsed or if the appropriate meta-data from protoc
hasn't been loaded.
All fields that are omitted from the WireCodes
are set to their
default values (typically the empty string or 0, depending on the
type; or []
for repeated groups). There is no way of testing
whether a value was specified in WireCodes
or given its default
value (that is, there is no equivalent of the Python
implementation's =HasField`). Optional embedded messages and groups
do not have any default value -- you must check their existence by
using get_dict/3 or similar. If a field is part of a "oneof" set,
then none of the other fields is set. You can determine which field
had a value by using get_dict/3.
- Arguments:
-
WireCodes | - Wire format of the message from e.g., read_stream_to_codes/2.
(The stream should have options encoding(octet) and type(binary) ,
either as options to read_file_to_codes/3 or by calling set_stream/2
on the stream to read_stream_to_codes/2.) |
MessageType | - Fully qualified message name (from the .proto file's package and message ).
For example, if the package is google.protobuf and the
message is FileDescriptorSet , then you would use
'.google.protobuf.FileDescriptorSet' or 'google.protobuf.FileDescriptorSet' .
If there's no package name, use e.g.: 'MyMessage or '.MyMessage' .
You can see the packages by looking at
protobufs:proto_meta_package(Pkg,File,_)
and the message names and fields by
protobufs:proto_meta_field_name('.google.protobuf.FileDescriptorSet',
FieldNumber, FieldName, FqnName) (the initial '.' is not optional for these facts,
only for the top-level name given to protobuf_serialize_to_codes/3). |
Term | - The generated term, as nested dicts. |
- Errors
- -
version_error(Module-Version)
you need to recompile the Module
with a newer version of protoc
.
- See also
- -
library(protobufs)
: Google's Protocol Buffers
- bug
- - Ignores
.proto
extensions. - -
map
fields don't get special treatment (but see protobuf_map_pairs/3). - - Generates fields in a different order from the C++, Python,
Java implementations, which use the field number to determine
field order whereas currently this implementation uses field
name. (This isn't stricly speaking a bug, because it's allowed
by the specification; but it might cause some surprise.)
- To be done
- - document the generated terms (see library(http/json) and json_read_dict/3)
- - add options such as
true
and value_string_as
(similar to json_read_dict/3) - - add option for form of the dict tags (fully qualified or not)
- - add option for outputting fields in the C++/Python/Java order
(by field number rather than by field name).
protobuf_serialize_to_codes(+Term:dict, -MessageType:atom, -WireCodes:list(int)) is det- Process a Prolog term into bytes (list of int) that is the serialized form of a
message (designated by
MessageType
).
Protoc
must have been run (with the --swipl_out=
option and the resulting
top-level _pb.pl
file loaded. For more details, see the "protoc" section of the
overview documentation.
Fails if the term isn't of an appropriate form or if the appropriate
meta-data from protoc
hasn't been loaded, or if a field name is incorrect
(and therefore nothing in the meta-data matches it).
- Arguments:
-
Term | - The Prolog form of the data, as nested dicts. |
MessageType | - Fully qualified message name (from the .proto file's package and message ).
For example, if the package is google.protobuf and the
message is FileDescriptorSet , then you would use
'.google.protobuf.FileDescriptorSet' or 'google.protobuf.FileDescriptorSet' .
If there's no package name, use e.g.: 'MyMessage or '.MyMessage' .
You can see the packages by looking at
protobufs:proto_meta_package(Pkg,File,_)
and the message names and fields by
protobufs:proto_meta_field_name('.google.protobuf.FileDescriptorSet',
FieldNumber, FieldName, FqnName) (the initial '.' is not optional for these facts,
only for the top-level name given to protobuf_serialize_to_codes/3). |
WireCodes | - Wire format of the message, which can be output using
format('~s', [WireCodes]) . |
- Errors
- -
version_error(Module-Version)
you need to recompile the Module
with a newer version of protoc
. - - existence_error if a field can't be found in the meta-data
- See also
- -
library(protobufs)
: Google's Protocol Buffers
- bug
- -
map
fields don't get special treatment (but see protobuf_map_pairs/3). - -
oneof
is not checked for validity.
protobuf_var_int(?A:int)// is det[private]- Conversion between an int A and a list of codes, using the
"varint" encoding.
The behvior is undefined if
A
is negative.
This is a low-level predicate; normally, you should use
template_message/2 and the appropriate template term.
e.g. phrase(protobuf_var_int(300), S)
=> S = [172,2]
phrase(protobuf_var_int(A), [172,2])
-> A = 300
protobuf_tag_type(?Tag:int, ?WireType:atom)// is det[private]- Conversion between Tag (number) + WireType and wirestream codes.
This is a low-level predicate; normally, you should use
template_message/2 and the appropriate template term.
@arg Tag The item's tag (field number)
@arg WireType The item's wire type (see prolog_type//2 for how to
convert this to a Prolog type)
prolog_type(?Tag:int, ?PrologType:atom)// is semidet[private]- Match Tag (field number) + PrologType.
When Type is a variable, backtracks through all the possibilities
for a given wire encoding.
Note that 'repeated' isn't here because it's handled by single_message//3.
See also segment_type_tag/3.
- payload(?PrologType, ?Payload) is det[private]
- Process the codes into
Payload
, according to PrologType
TODO
: payload//2 "mode" is sometimes module-sensitive, sometimes not.
payload(enum, A)
// has A as a callable
all other uses of payload//2, the 2nd arg is not callable.
- This confuses check/0; it also makes defining an enumeration
more difficult because it has to be defined in module protobufs
(see
vector_demo.pl
, which defines commands/2)
single_message(+PrologType:atom, ?Tag, ?Payload)// is det[private]- Processes a single messages (e.g., one item in the list in
protobuf([...])
.
The PrologType, Tag, Payload are from Field =.. [PrologType, Tag, Payload]
in the caller
protobuf_message(?Template, ?WireStream) is semidet
protobuf_message(?Template, ?WireStream, ?Rest) is nondet- Marshals and unmarshals byte streams encoded using Google's
Protobuf grammars. protobuf_message/2 provides a bi-directional
parser that marshals a Prolog structure to WireStream, according
to rules specified by Template. It can also unmarshal WireStream
into a Prolog structure according to the same grammar.
protobuf_message/3 provides a difference list version.
- Arguments:
-
Template | - is a protobuf grammar specification. On decode,
unbound variables in the Template are unified with their respective
values in the WireStream. On encode, Template must be ground. |
WireStream | - is a code list that was generated by a protobuf
encoder using an equivalent template. |
- bug
- - The protobuf specification states that the wire-stream can have
the fields in any order and that unknown fields are to be ignored.
This implementation assumes that the fields are in the exact order
of the definition and match exactly. If you use
protobuf_parse_from_codes/3, you can avoid this problem.o
protobuf_segment_message(+Segments:list, -WireStream:list(int)) is det[private]
- protobuf_segment_message(-Segments:list, +WireStream:list(int)) is det[private]
- Low level marshalling and unmarshalling of byte streams. The
processing is independent of the
.proto
description, similar to
the processing done by protoc --decode_raw
. This means that
field names aren't shown, only field numbers.
For unmarshalling, a simple heuristic is used on length-delimited
segments: first interpret it as a message; if that fails, try to
interpret as a UTF8 string; otherwise, leave it as a "blob" (if the
heuristic was wrong, you can convert to a string or a blob by using
protobuf_segment_convert/2). 32-bit and 64-bit numbers are left as
codes because they could be either integers or floating point (use
int32_codes_when/2, float32_codes_when/2, int64_codes_when/2,
uint32_codes_when/2, uint64_codes_when/2, float64_codes_when/2 as
appropriate); variable-length numbers ("varint" in the Protocol
Buffers encoding
documentation),
might require "zigzag" conversion, int64_zigzag_when/2.
For marshalling, use the predicates int32_codes_when/2,
float32_codes_when/2, int64_codes_when/2, uint32_codes_when/2,
uint64_codes_when/2, float64_codes_when/2, int64_zigzag_when/2 to
put integer and floating point values into the appropriate form.
- Arguments:
-
Segments | - a list containing terms of the following form (Tag is
the field number; Codes is a list of integers):
varint(Tag,Varint) - Varint may need int64_zigzag_when/2
fixed64(Tag,Int) - Int signed, derived from the 8 codes
fixed32(Tag,Codes) - Int is signed, derived from the 4 codes
message(Tag,Segments)
group(Tag,Segments)
string(Tag,String) - String is a SWI-Prolog string
- packed(Tag,Type(Scalars)) -
Type is one of
varint , fixed64 , fixed32 ; Scalars
is a list of Varint or Codes , which should
be interpreted as described under those items.
Note that the protobuf specification does not
allow packed repeated string.
length_delimited(Tag,Codes)
repeated(List) - List of segments
Of these, group is deprecated in the protobuf documentation and
shouldn't appear in modern code, having been superseded by nested
message types.
For deciding how to interpret a length-delimited item (when
Segments is a variable), an attempt is made to parse the item in
the following order (although code should not rely on this order):
- message
- string (it must be in the form of a UTF string)
- packed (which can backtrack through the various
Type s)
- length_delimited - which always is possible.
The interpretation of length-delimited items can sometimes guess
wrong; the interpretation can be undone by either backtracking or
by using protobuf_segment_convert/2 to convert the incorrect
segment to a string or a list of codes. Backtracking through all
the possibilities is not recommended, because of combinatoric
explosion (there is an example in the unit tests); instead, it is
suggested that you take the first result and iterate through its
items, calling protobuf_segment_convert/2 as needed to reinterpret
incorrectly guessed segments. |
WireStream | - a code list that was generated by a protobuf
endoder. |
- See also
- - https://developers.google.com/protocol-buffers/docs/encoding
- bug
- - This predicate is preliminary and may change as additional
functionality is added.
detag(+Compound, -Name, -Tag, -Value, List, -CompoundWithList) is semidet[private]- Deconstruct
Compound
or the form Name(Tag,Value)
and create a
new CompoundWithList
that replaces Value
with List
. This is
used by packed_list/2 to transform [varint(1,0),varint(1,1)]
to
varint(1,[0,1])
.
Some of Compound
items are impossible for packed
with the
current protobuf spec, but they don't do any harm.
protobuf_segment_convert(+Form1, ?Form2) is multi[private]- A convenience predicate for dealing with the situation where
protobuf_segment_message/2 interprets a segment of the wire stream
as a form that you don't want (e.g., as a message but it should have
been a UTF8 string).
Form1
is converted back to the original wire stream, then the
predicate non-deterimisticly attempts to convert the wire stream to
a string
or length_delimited
term (or both: the lattter
always succeeds).
The possible conversions are:
message(Tag,Segments)
=> string(Tag,String)
message(Tag,Segments)
=> length_delimited(Tag,Codes)
string(Tag,String)
=> length_delimited(Tag,Codes)
length_delimited(Tag,Codes)
=> length_delimited(Tag,Codes)
Note that for fixed32, fixed64, only the signed integer forms are
given; if you want the floating point forms, then you need to do use
int64_float64_when/2 and int32_float32_when/2.
For example:
~~~{.pl}
?- protobuf_segment_convert(
message(10,[fixed64(13,7309475598860382318)]),
string(10,"inputType"))
.
?- protobuf_segment_convert(
message(10,[fixed64(13,7309475598860382318)]),
length_delimited(10,[105,110,112,117,116,84,121,112,101]))
.
?- protobuf_segment_convert(
string(10, "inputType"),
length_delimited(10,[105,110,112,117,116,84,121,112,101]))
.
?- forall(protobuf_segment_convert(string(1999,"\x1\\x0\\x0\\x0\\x2\\x0\\x0\\x0\"),Z), writeln(Z))
.
string(1999,
uint32_codes_when(?Uint32, ?Codes) is det[private]- Convert between a 32-bit unsigned integer value and its wirestream codes.
This is a low-level predicate; normally, you should use
template_message/2 and the appropriate template term.
This predicate delays until either Uint32
or Codes
is
sufficiently instantiated.
There is also a non-delayed uint32_codes/2
SWI-Prolog doesn't have a 32-bit integer type, so 32-bit integer
is simulated by doing a range check.
- Arguments:
-
Uint32 | - an unsigned integer that's in the 32-bit range |
Codes | - a list of 4 integers (codes) |
- Errors
- - Type,Domain if
Value
or Codes
are of the wrong
type or out of range.
int32_codes_when(?Int32, ?Codes) is det[private]- Convert between a 32-bit signed integer value and its wirestream codes.
This is a low-level predicate; normally, you should use
template_message/2 and the appropriate template term.
This predicate delays until either Int32
or Codes
is
sufficiently instantiated.
There is also a non-delayed int32_codes/2
SWI-Prolog doesn't have a 32-bit integer type, so 32-bit integer
is simulated by doing a range check.
- Arguments:
-
Int32 | - an unsigned integer that's in the 32-bit range |
Codes | - a list of 4 integers (codes) |
- Errors
- - Type,Domain if
Value
or Codes
are of the wrong
type or out of range.
float32_codes_when(?Value, ?Codes) is det[private]- Convert between a 32-bit floating point value and its wirestream codes.
This is a low-level predicate; normally, you should use
template_message/2 and the appropriate template term.
This predicate delays until either Value
or Codes
is
sufficiently instantiated.
There is also a non-delayed float32_codes/2
- Arguments:
-
Value | - a floating point number |
Codes | - a list of 4 integers (codes) |
uint64_codes_when(?Uint64, ?Codes) is det[private]- Convert between a 64-bit unsigned integer value and its wirestream codes.
This is a low-level predicate; normally, you should use
template_message/2 and the appropriate template term.
SWI-Prolog allows integer values greater than 64 bits, so
a range check is done.
This predicate delays until either Uint64
or Codes
is
sufficiently instantiated.
There is also a non-delayed uint64_codes/2
int64_codes_when(?Int64, ?Codes) is det[private]- Convert between a 64-bit signed integer value and its wirestream codes.
This is a low-level predicate; normally, you should use
template_message/2 and the appropriate template term.
SWI-Prolog allows integer values greater than 64 bits, so
a range check is done.
This predicate delays until either Int64
or Codes
is
sufficiently instantiated.
There is also a non-delayed int64_codes/2
float64_codes_when(?Value, ?Codes) is det[private]- Convert between a 64-bit floating point value and its wirestream codes.
This is a low-level predicate; normally, you should use
template_message/2 and the appropriate template term.
This predicate delays until either Value
or Codes
is
sufficiently instantiated.
There is also a non-delayed float64_codes/2
- Arguments:
-
Value | - a floating point number |
Codes | - a list of 8 integers (codes) |
- Errors
- - instantiation error if both
Value
and Codes
are uninstantiated.
- bug
- - May give misleading exception under some circumstances
(e.g.,
float64_codes(_, [_,_,_,_,_,_,_,_])
.
int64_zigzag_when(?Original, ?Encoded) is det[private]- Convert between a signed integer value and its zigzag encoding,
used for the protobuf
sint32
and sint64
types. This is a
low-level predicate; normally, you should use template_message/2 and
the appropriate template term.
SWI-Prolog allows integer values greater than 64 bits, so
a range check is done.
This predicate delays until either Original
or Encoded
is
sufficiently instantiated.
There is also a non-delayed int64_zigzag/2
- Arguments:
-
Original | - an integer in the original form |
Encoded | - the zigzag encoding of Original |
- Errors
- - Type,Domain if
Original
or Encoded
are of the wrong
type or out of range. - - instantiation error if both
Original
and Encoded
are uninstantiated.
- See also
- - https://developers.google.com/protocol-buffers/docs/encoding#types
uint64_int64_when(?Uint64:integer, ?Int64:integer) is det[private]- Reinterpret-cast between uint64 and int64. For example,
uint64_int64(0xffffffffffffffff,-1)
.
This predicate delays until either Uint64
or Int64
is
sufficiently instantiated.
There is also a non-delayed uint64_int64/2
- Arguments:
-
Uint64 | - 64-bit unsigned integer |
Int64 | - 64-bit signed integer |
- Errors
- - Type,Domain if
Value
or Codes
are of the wrong
type or out of range. - - instantiation error if both
Value
and Codes
are uninstantiated.
uint32_int32_when(?Uint32, ?Int32) is det[private]- Reinterpret-case between uint32 and int32.
This predicate delays until either Uint32
or Int32
is
sufficiently instantiated.
There is also a non-delayed uint32_int32/2
- Arguments:
-
Uint32 | - 32-bit unsigned integer (range between 0 and 4294967295). |
Int32 | - 32-bit signed integer (range between -2147483648 and 2147483647). |
- Errors
- - Type,Domain if
Int32
or Uint32
are of the wrong
type or out of range. - - instantiation error if both
UInt32
and Int32
are uninstantiated.
int64_float64_when(?Int64:integer, ?Float64:float) is det[private]- Reinterpret-cast between uint64 and float64. For example,
int64_float64(3ff0000000000000,1.0)
.
This predicate delays until either Int64
or Float64
is
sufficiently instantiated.
There is also a non-delayed uint64_int64/2
- Arguments:
-
Int64 | - 64-bit unsigned integer |
Float64 | - 64-bit float |
- Errors
- - Type,Domain if
Value
or Codes
are of the wrong
type or out of range. - - instantiation error if both
Value
and Codes
are uninstantiated.
int32_float32_when(?Int32:integer, ?Float32:float) is det[private]- Reinterpret-cast between uint32 and float32. For example,
int32_float32(0x3f800000,1.0)
.
This predicate delays until either Int32
or Float32
is
sufficiently instantiated.
There is also a non-delayed uint32_int32/2
- Arguments:
-
Int32 | - 32-bit unsigned integer |
Float32 | - 32-bit float |
- Errors
- - Type,Domain if
Value
or Codes
are of the wrong
type or out of range. - - instantiation error if both
Value
and Codes
are uninstantiated.
segment_to_term(+ContextType:atom, +Segment, -FieldAndValue) is det[private]- ContextType is the type (name) of the containing message
Segment is a segment from protobuf_segment_message/2
TODO
: if performance is an issue, this code can be combined with
protobuf_segment_message/2 (and thereby avoid the use of protobuf_segment_convert/2)
convert_segment(+Type:atom, +ContextType:atom, Tag:atom, ?Segment, ?Value) is det[private]- Compute an appropriate
Value
from the combination of descriptor
"type" (in Type
) and a Segment
.
Reversible on Segment
, Values
.
add_defaulted_fields(+Value0:dict, ContextType:atom, -Value:dict) is det[private]
message_field_default(+ContextType:atom, Name:atom, -DefaultValue) is semidet[private]
combine_fields(+Fields:list, +MsgDict0, -MsgDict) is det[private]- Combines the fields into a dict and sets missing fields to their default values.
If the field is marked as 'norepeat' (optional/required), then the last
occurrence is kept (as per the protobuf wire spec)
If the field is marked as 'repeat', then all the occurrences
are put into a list, in order.
This code assumes that fields normally occur all together, but can handle
(less efficiently) fields not occurring together, as is allowed
by the protobuf spec.
combine_fields_repeat(+Fields:list, Field:atom, -Values:list, RestFields:list) is det[private]- Helper for combine_fields/3
Stops at the first item that doesn't match
Field
- the assumption
is that all the items for a field will be together and if they're
not, they would be combined outside this predicate.
- Arguments:
-
Fields | - a list of fields (Field-Repeat-Value) |
Field | - the name of the field that is being combined |
Values | - gets the Value items that match Field |
RestFields | - gets any left-over fields |
field_and_type(+ContextType:atom, +Tag:int, -FieldName:atom, -FqnName:atom, -ContextType2:atom, -RepeatOptional:atom, -Type:atom) is det[private]- Lookup a
ContextType
and Tag
to get the field name, type, etc.
fqn_repeat_optional(+FqnName:atom, -RepeatOptional:atom) is det[private]- Lookup up
proto_meta_field_label(FqnName, _)
, proto_meta_field_option_packed(FqnName)
and set RepeatOptional to one of
norepeat
, repeat
, repeat_packed
.
fqn_repeat_optional_2(+DescriptorLabelEnum:atom, -RepeatOrEmpty:atom) is det[private]- Map the descriptor "label" to 'repeat' or 'norepeat'.
From
proto_meta_enum_value('.google.protobuf.FieldDescriptorProto.Label', Label, _)
.
field_descriptor_label_repeated(+Label:atom) is semidet[private]- From
proto_meta_enum_value('.google.protobuf.FieldDescriptorProto.Label', 'LABEL_REPEATED', _)
.
TODO
: unused
field_descriptor_label_single(+Label:atom) is semidet[private]- From
proto_meta_enum_value('.google.protobuf.FieldDescriptorProto.Label', Label, _)
.
term_to_segments(+Term:dict, +MessageType:atom, Segments) is det[private]- Recursively traverse a
Term
, generating message segments
field_segment_scalar_or_repeated(+Label, +Packed, +FieldType, +Tag, +FieldTypeName, ?Value, Segment) is det[private]FieldType
is from the .proto
meta information ('TYPE_SINT32', etc.)
protobuf_field_is_map(+MessageType, +FieldName) is semidet- Succeeds if
MessageType
's FieldName
is defined as a map<...> in
the .proto file.
protobuf_map_pairs(+ProtobufTermList:list, ?DictTag:atom, ?Pairs) is det- Convert between a list of protobuf map entries (in the form
DictTag{key:Key, value:Value}
and a key-value list as described
in library(pairs). At least one of ProtobufTermList
and Pairs
must be instantiated; DictTag
can be uninstantiated. If
ProtobufTermList
is from a term created by
protobuf_parse_from_codes/3, the ordering of the items is undefined;
you can order them by using keysort/2 (or by a predicate such as
dict_pairs/3, list_to_assoc/2, or list_to_rbtree/2.
The following predicates are exported, but not or incorrectly documented.