Logtilla’s internals: a tutorial on using ASN.1 and linked replies to interact between Erlang and port programs

I have recently released Logtilla, a framework for parsing and analysing web access log files, and showed in a previous article how to use Logtilla with a simple example. One of the most interesting aspects of Logtilla lies in its internals: Logtilla demonstrates how to use ASN.1 to communicate between an Erlang program and a C port program. This article is both an introduction to the basic concepts of ASN.1 and a tutorial on applying ASN.1 to integrate Erlang and C programs. The protocol defined in Logtilla is simple, but the approach described can be used to implement sophisticated distributed protocols, and is especially appropriate for systems management applications.

In Erlang, a port program (or C port) is a program that is executed as a subprocess of the Erlang interpreter, and communicates with Erlang via its standard input and output. In an Erlang program, a port program is started with the standard function open_port/2, and stopped with port_close/1. The Erlang program sends binary data to the process’ standard input using port_command/2, and receives data from the process’ standard output as a standard Erlang message of the form {Port, {data, Data}}, where Data is a binary if the port was opened with the binary option.
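As a minimal sketch of this lifecycle, the module below starts the Unix cat command as a stand-in port program, sends it a binary, and waits for the echoed data. Using cat is an assumption made for illustration only (Logtilla's real parser is not involved), and the module name port_echo and the 5-second timeout are hypothetical.

```erlang
-module(port_echo).
-export([echo/1]).

%% Start "cat" as a port program (assumed to be on the PATH), send it
%% Payload on its standard input, and return what it writes back on
%% its standard output.
echo(Payload) when is_binary(Payload) ->
    %% 'binary' makes received data arrive as binaries rather than lists.
    Port = open_port({spawn, "cat"}, [binary]),
    true = port_command(Port, Payload),
    Reply = receive
                {Port, {data, Data}} -> Data
            after 5000 ->
                timeout
            end,
    %% Closing the port closes cat's standard input, so cat exits.
    true = port_close(Port),
    Reply.
```

Calling port_echo:echo(<<"hello\n">>) should return the same binary. Without a packet option the port works in stream mode, so a large payload could come back split over several {data, Data} messages; this sketch only reads the first one.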

The main problem is to design and implement a protocol to structure the data exchanged between the Erlang program and the port program. One could choose, for instance, an XML-based protocol, or a simple text-based protocol with one command per line (simple, but it requires parsing on the receiving side). In the case of Logtilla, the purpose of the port program is to parse text files, and to return structured data to the Erlang program (Logtilla’s gen_log_analyzer module), so I needed to define a protocol with reasonably complex, but well-structured data types. Since Erlang/OTP has an excellent ASN.1 compiler, and ASN.1’s purpose is to define structured data types for protocols, ASN.1 is the formalism of choice.

An overview of ASN.1 and Logtilla’s datatypes

Logtilla’s datatypes corresponding to the Common Log Format and Apache’s Combined Log Format are defined in ASN.1 module WebAccessLog, in file asn1/WebAccessLog.asn1. The main type it defines is LogEntry:

LogEntry ::= SEQUENCE {
  -- Entries in the common log format:
  remote-host         NetworkAddress,
  client-identity [0] UTF8String OPTIONAL,
  auth-user       [1] UTF8String OPTIONAL,
  time                GeneralizedTime,
  request             UTF8String,
  status              HTTPStatusCode,
  length              INTEGER (0..MAX) OPTIONAL,
  -- Additional entries in the combined log format:
  referrer        [2] UTF8String OPTIONAL,
  user-agent      [3] UTF8String OPTIONAL,
  ...
}

This definition is straightforward. A LogEntry is a SEQUENCE, which is similar to a struct in C or a record in Erlang: an ordered set of fields, each of which has a name, e.g. remote-host, and a type, e.g. NetworkAddress. Here the fields are a direct transcription of the log formats. ASN.1 defines many standard types; only NetworkAddress and HTTPStatusCode are custom-defined here. ASN.1 types can be constrained, e.g. the (0..MAX) constraint on field length restricts this INTEGER field to non-negative values. A field can also be OPTIONAL, in which case it may be omitted from an encoded value. The ellipsis "..." at the end of the definition makes the type extensible: a decoder of this type can ignore any field that it does not know about. This ensures that future versions of the definition, with fields added after the ellipsis, remain compatible with older decoders.
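To make the analogy with Erlang records concrete, here is a hedged sketch of how a decoded LogEntry might be handled on the Erlang side. The record below is a hand-written stand-in for the one that OTP's ASN.1 compiler generates in a .hrl file, and the module name log_entry_demo, the sample values and the term representations chosen for each field are assumptions. The one point it relies on is that an absent OPTIONAL field appears as the atom asn1_NOVALUE.

```erlang
-module(log_entry_demo).
-export([sample/1, describe_length/1]).

%% Hypothetical mirror of the generated record; OPTIONAL fields
%% default to asn1_NOVALUE, meaning "not present in the encoding".
-record('LogEntry', {
    'remote-host',
    'client-identity' = asn1_NOVALUE,
    'auth-user'       = asn1_NOVALUE,
    time,
    request,
    status,
    length            = asn1_NOVALUE,
    referrer          = asn1_NOVALUE,
    'user-agent'      = asn1_NOVALUE
}).

%% Build an illustrative entry with the given length field.
sample(Length) ->
    #'LogEntry'{'remote-host' = {hostname, <<"example.org">>},
                time    = "20090101120000Z",
                request = <<"GET / HTTP/1.1">>,
                status  = ok,
                length  = Length}.

%% Distinguish a missing length from a transmitted one.
describe_length(#'LogEntry'{length = asn1_NOVALUE}) ->
    unknown;
describe_length(#'LogEntry'{length = N}) when is_integer(N), N >= 0 ->
    {bytes, N}.
```

Pattern matching on asn1_NOVALUE keeps the handling of optional fields explicit, instead of confusing a missing length with a length of zero.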

When encoding a LogEntry structure, an ASN.1 encoder encodes each field in sequence, and attaches a tag to each encoded field so that the decoder can identify it. Module WebAccessLog is declared with the IMPLICIT TAGS option. A field with no explicit tag is then encoded with the standard tag associated with its ASN.1 type. However, in some cases tags have to be disambiguated manually, as in the case of the successive referrer and user-agent fields, which are both of type UTF8String and OPTIONAL: if only one string is transmitted, it would be impossible for a decoder to determine from the type’s tag whether it corresponds to one field or the other. Therefore, I had to specify the context-specific tags [2] and [3] to disambiguate them, just like [0] and [1] for client-identity and auth-user.
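To see what such a tag looks like on the wire, the sketch below decodes the first (identifier) octet of an encoded field. It assumes the Basic Encoding Rules (BER), which this article does not confirm Logtilla uses, and it only handles tag numbers below 31, which fit in one octet. The module name ber_tag is hypothetical. A bare [2] in ASN.1 is a tag of the context-specific class, so under implicit tagging a referrer field would start with the octet 16#82 instead of 16#0C, the universal tag of UTF8String.

```erlang
-module(ber_tag).
-export([identifier/1]).

%% Decode a single-octet BER identifier: 2 bits of class, 1 bit
%% telling primitive from constructed, and 5 bits of tag number.
identifier(<<Class:2, Constructed:1, Number:5>>) when Number < 31 ->
    {class(Class), form(Constructed), Number}.

class(0) -> universal;
class(1) -> application;
class(2) -> context;
class(3) -> private.

form(0) -> primitive;
form(1) -> constructed.
```

For example, ber_tag:identifier(<<16#0C>>) gives {universal, primitive, 12}, whereas ber_tag:identifier(<<16#82>>) and ber_tag:identifier(<<16#83>>) give context tags 2 and 3, which is how a decoder could tell the two optional strings apart.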

The NetworkAddress type definition is straightforward too:

NetworkAddress ::= CHOICE {
  hostname        UTF8String,
  ip-address  [0] OCTET STRING(SIZE (4)),
  ip6-address [1] OCTET STRING(SIZE (16)),
  ...
}

It is a CHOICE, corresponding roughly to a union in C: a network address is either a hostname, an IPv4 address, or an IPv6 address (the "..." leaves room for further alternatives). Only the chosen alternative is encoded/decoded. Again, since the ip-address and ip6-address fields have the same type, they have to be disambiguated with context-specific tags, so that the decoder can determine which alternative has been chosen. An OCTET STRING is a simple sequence of bytes.
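On the Erlang side, a decoded CHOICE value is a two-element tuple {AlternativeName, Value}, so the chosen alternative can be selected by pattern matching. The following sketch converts a NetworkAddress to a printable string; the module name net_addr is hypothetical, and it assumes that UTF8String values are decoded as binaries.

```erlang
-module(net_addr).
-export([to_string/1]).

%% Convert a decoded NetworkAddress CHOICE value to a string.
to_string({hostname, Name}) when is_binary(Name) ->
    unicode:characters_to_list(Name);
to_string({'ip-address', <<A, B, C, D>>}) ->
    %% 4 octets become an IPv4 tuple of 4 bytes.
    inet:ntoa({A, B, C, D});
to_string({'ip6-address',
           <<A:16, B:16, C:16, D:16, E:16, F:16, G:16, H:16>>}) ->
    %% 16 octets become an IPv6 tuple of eight 16-bit groups.
    inet:ntoa({A, B, C, D, E, F, G, H}).
```

The SIZE constraints of the ASN.1 definition show up here as binary patterns of exactly 4 and 16 octets; a value of any other size matches no clause and fails loudly.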

The HTTPStatusCode type is an enumeration of the HTTP status codes, as defined in standard IETF RFC 2616:

HTTPStatusCode ::= ENUMERATED {
  continue(100),
  switching-protocols(101),
  ok(200),
  created(201),
  accepted(202),
  non-authoritative-information(203),
  no-content(204),
  reset-content(205),
  partial-content(206),
  -- etc. (remaining status codes elided)
  ...
}

An ENUMERATED type is similar to an enum in C.

Defining Logtilla’s protocol using ASN.1

I am using the terminology from the RM-ODP (Reference Model of Open Distributed Processing, ITU-T X.903). The most common concept for modeling interactions between two systems (a client and a server) is the concept of operation. There are two kinds of operations:

  • an interrogation, which consists of two interactions:
    • an invocation, initiated by a client to convey information to a server to request the server to perform a function, followed by
    • a termination, initiated by the server to convey information in response to the invocation.
  • an announcement, which consists of only an invocation.

Examples of announcements are Erlang messages and CORBA “oneway” remote method calls; common examples of interrogations are RPC calls and CORBA remote method calls. In many cases, the possible operations that can be performed with a server are specified in an IDL (Interface Definition Language). In the OSI world, the IDL is ROSE (Remote Operations Service Elements, ITU-T X.880-X.882). I use ROSE’s concepts to specify the protocol implemented in Logtilla. Unfortunately, the ASN.1 modules specified in ROSE cannot be compiled using the asn1c ASN.1-to-C compiler, which I use in Logtilla, so I wrote the protocol specification by hand in “pure” ASN.1.

Logtilla currently implements only one operation: ParseLogFile. It is an interrogation. In file asn1/WebAccessLogParserOperations.asn1, I defined one ASN.1 type for each PDU (protocol data unit), each used for either an invocation or a termination. There are two possible terminations for a ParseLogFile operation: EndOfFile signals the end of the parsing of a file, and CannotOpenFile signals an error when opening the file.

This protocol must be complemented by additional operations to allow the Logtilla file parser (the server) to send log entries (LogEntry data) back to the Erlang client. For this I use the concept of linked reply. This concept is used a lot in systems management protocols, the most important being CMIP (Common Management Information Protocol, ITU-T X.711). The concept of linked reply is one of the few concepts that are very specific to systems management, cf. ODMA (Open Distributed Management Architecture, ITU-T X.703).

A linked reply is an operation that is initiated by the server, is identified as part of an operation that was initiated by the client, and happens before the client operation terminates. In ROSE, this is implemented by adding a linked ID field to each invocation of a linked reply, which corresponds to the invoke ID of the client operation. In Logtilla, the log file parser returns LogEntry data in a sequence of ReturnLogEntry linked replies, whose PDU simply contains a linked ID and a LogEntry:

-- Invocation:
ReturnLogEntry ::= SEQUENCE {
  linked-id INTEGER, -- an invoke-id passed in a ParseLogFile
  argument LogEntry
}

Note that ReturnLogEntry is an announcement, i.e. it doesn’t require a termination, so there is no need to include an invoke ID in the ReturnLogEntry type.
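The role of the linked ID can be illustrated with a small sketch of the bookkeeping a client might do: it keeps a map from each outstanding invoke ID to the entries received so far, and uses the linked ID of each reply to find the operation it belongs to. This is not Logtilla's actual code; the module name linked_replies and the simplified PDU tuples (an ID paired with an opaque entry, rather than the generated records) are assumptions made to keep the example self-contained.

```erlang
-module(linked_replies).
-export([new/0, invoke/2, handle_pdu/2]).

%% Pending maps each outstanding invoke ID to the entries
%% received so far, most recent first.
new() -> #{}.

%% Record that an operation with this invoke ID has been started.
invoke(InvokeId, Pending) ->
    Pending#{InvokeId => []}.

%% A linked reply is attached to the operation named by its linked ID.
handle_pdu({'return-log-entry', {LinkedId, Entry}}, Pending) ->
    case Pending of
        #{LinkedId := Entries} ->
            {more, Pending#{LinkedId := [Entry | Entries]}};
        _ ->
            {unknown_id, Pending}
    end;
%% A termination closes the operation and releases its entries.
handle_pdu({'end-of-file', InvokeId}, Pending) ->
    case maps:take(InvokeId, Pending) of
        {Entries, Rest} -> {{done, lists:reverse(Entries)}, Rest};
        error           -> {unknown_id, Pending}
    end;
handle_pdu({'cannot-open-file', InvokeId}, Pending) ->
    {{error, cannot_open_file}, maps:remove(InvokeId, Pending)}.
```

Because every reply carries the ID of the operation it belongs to, several ParseLogFile operations could in principle be outstanding at the same time without their entries getting mixed up. A real client would more likely process each entry as it arrives than accumulate them all.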

The possible sequences of interactions / PDUs for a ParseLogFile operation are therefore:

  • success:
    client -----ParseLogFile----> server
    client <---ReturnLogEntry---- server
    client <---ReturnLogEntry---- server
    ...
    client <---ReturnLogEntry---- server
    client <------EndOfFile------ server
    
  • error:
    client ----ParseLogFile----> server
    client <---CannotOpenFile--- server
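The two sequences above can also be written down as a tiny checker, which may be handy when testing a port program. The sketch below accepts a list of the server-to-client PDU names received for one ParseLogFile operation and says whether it follows one of the two allowed shapes. The module name parse_trace is hypothetical, and treating an empty file (EndOfFile with no preceding entries) as valid is an assumption.

```erlang
-module(parse_trace).
-export([valid/1]).

%% Error case: the only PDU sent back is CannotOpenFile.
valid(['cannot-open-file']) -> true;
%% Success case: any number of linked replies, then EndOfFile.
valid(Trace) -> entries(Trace).

entries(['end-of-file']) -> true;
entries(['return-log-entry' | Rest]) -> entries(Rest);
entries(_) -> false.
```

For instance, a trace of two return-log-entry PDUs followed by end-of-file is accepted, while a trace that ends without a termination, or that continues after one, is rejected.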
    

We still have to group the PDU types to identify all the PDUs that can be sent by the Erlang client (and received by the C server), and vice-versa. This is easily done by defining two CHOICE types:

-- PDUs sent by the client to the server:
ConsumerPDU ::= CHOICE {
  parse-log-file [1] ParseLogFile,
  ...
}
 
-- PDUs sent by the server to the client:
SupplierPDU ::= CHOICE {
  cannot-open-file [1] CannotOpenFile,
  return-log-entry [2] ReturnLogEntry,
  end-of-file [3] EndOfFile,
  ...
}

Implementing the protocol in Erlang

Since communication with port programs is based on streams of bytes, one must find a way to segment the data into PDUs. This is done in Logtilla by passing the {packet, 2} option to the open_port/2 function when starting the C port program. With this option, each PDU sent to the port program is prefixed with its size as a 2-byte integer, and the data sent by the port program is segmented by reading a 2-byte PDU size, reading a binary of exactly that size, and passing that binary to the Erlang program in a {Port, {data, Data}} message. This takes care of segmenting. To transmit PDUs over TCP connections, instead of between Erlang programs and port programs, one can use the TPKT packet format (IETF RFC 2126), which similarly prepends the PDU size to each PDU.
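The framing performed by the {packet, 2} option can be reproduced by hand in a few lines, which also shows what the port program has to do on its side of the pipe. The sketch below is an illustration and not part of Logtilla; the module name packet2 is hypothetical. The length prefix is a 16-bit unsigned integer in big-endian byte order, which limits a PDU to 65535 bytes.

```erlang
-module(packet2).
-export([frame/1, unframe/1]).

%% Prefix a PDU with its size as a 16-bit big-endian integer.
frame(PDU) when is_binary(PDU), byte_size(PDU) =< 16#FFFF ->
    <<(byte_size(PDU)):16, PDU/binary>>.

%% Split a buffer into the complete PDUs it contains, and return
%% them with the leftover bytes of any incomplete trailing PDU.
unframe(Buffer) ->
    unframe(Buffer, []).

unframe(<<Size:16, PDU:Size/binary, Rest/binary>>, Acc) ->
    unframe(Rest, [PDU | Acc]);
unframe(Rest, Acc) ->
    {lists:reverse(Acc), Rest}.
```

For example, packet2:frame(<<"abc">>) yields <<0,3,"abc">>. The leftover bytes returned by unframe/1 are meant to be prepended to the next chunk read from the stream, since a PDU may be split across reads.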

Logtilla’s gen_log_analyzer only has to decode the encoded PDUs, to match the type of the PDU using normal pattern matching, and to call the behaviour module’s appropriate callbacks:

handle_info({Port, {data, EncPDU}}, State) when is_port(Port) ->
  {ok, PDU} = 'WebAccessLogParserOperations':decode('SupplierPDU', EncPDU),
  case PDU of
    {'cannot-open-file', CannotOpenFile} ->
      #'CannotOpenFile'{'invoke-id'=InvokeId} = CannotOpenFile, ...,
      {noreply, NewState};
    {'return-log-entry', ReturnLogEntry} ->
      #'ReturnLogEntry'{'linked-id'=LinkedId, 'argument'=LogEntry} = ReturnLogEntry,
      ..., Mod:handle_log_entry(LogEntry, ModState), ...,
      {noreply, NewState};
    {'end-of-file', EndOfFile} ->
      #'EndOfFile'{'invoke-id'=InvokeId} = EndOfFile, ...,
      {noreply, NewState}
  end.

Sending PDUs to the port program is also easy:

handle_call({parse, FileName}, {From, Tag}, State) ->
  Port = ...,
  % Build and encode the PDU into a binary:
  InvokeId = ...,
  Invoke = #'ParseLogFile'{'invoke-id'=InvokeId, 'argument'=FileName},
  {ok, PDU} = 'WebAccessLogParserOperations':encode(
    'ConsumerPDU', {'parse-log-file', Invoke}),
  % Using the {packet, 2} option for the port, the PDU will be
  % prefixed by its size as a 16-bit integer:
  port_command(Port, PDU),
  ...,
  {noreply, NewState};

All the encoding and decoding is implemented in modules automatically generated by OTP’s ASN.1 compiler, as are the record definitions (CannotOpenFile, ReturnLogEntry, EndOfFile, LogEntry, etc.).

I will describe the C side of the protocol implementation in a separate article.

Conclusion

This article describes a step-by-step approach to defining a lightweight protocol that allows interactions between heterogeneous systems (a C port program and an Erlang program, in the case of Logtilla). The encoded form of ASN.1 data is very compact, and ASN.1 is very expressive: it defines many standard datatypes, and supports constraints, optional fields and extensible types.

The linked reply pattern is very common in systems management: when querying a large set of managed objects for their states, we usually want to start processing replies as soon as managed objects send them back, without having to wait for all objects to have replied. This is also useful in the case of Logtilla, where linked replies allow the Erlang program to handle and analyze log entries as soon as they are parsed by the port program. I believe that the linked reply concept applies to many contexts where large volumes of data are exchanged, and should not be confined to systems management.