<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Erlang/OTP</title>
  <link rel="alternate" type="text/html" href="http://www.berabera.info/en/taxonomy/term/29"/>
  <link rel="self" type="application/atom+xml" href="http://www.berabera.info/en/taxonomy/term/29/atom/feed"/>
  <id>http://www.berabera.info/en/taxonomy/term/29/atom/feed</id>
  <updated>2008-01-29T13:48:41+09:00</updated>
  <entry>
    <title>Logtilla and GeoIP: analyze the geolocation of web clients</title>
    <link rel="alternate" type="text/html" href="http://www.berabera.info/en/node/276" />
    <id>http://www.berabera.info/en/node/276</id>
    <published>2009-09-21T15:06:26+09:00</published>
    <updated>2009-09-21T15:12:40+09:00</updated>
    <author>
      <name>Romain Lenglet</name>
    </author>
    <category term="Erlang/OTP" />
    <category term="Logtilla" />
    <category term="Web" />
    <summary type="html"><![CDATA[<p>This article presents a simple <a href="http://github.com/rlenglet/Logtilla">Logtilla</a> log analysis module, <a href="http://github.com/rlenglet/Logtilla/blob/master/src/log_geoip_stats.erl"><code>log_geoip_stats</code></a>, which gives the top N client countries, in terms of hits, from web access log files. This module uses the <a href="http://bitbucket.org/mattsta/libgeoip-erlang/src/">libgeoip-erlang</a> library to get geolocations from clients' IP addresses.</p>
    ]]></summary>
    <content type="html"><![CDATA[<p>This article presents a simple <a href="http://github.com/rlenglet/Logtilla">Logtilla</a> log analysis module, <a href="http://github.com/rlenglet/Logtilla/blob/master/src/log_geoip_stats.erl"><code>log_geoip_stats</code></a>, which gives the top N client countries, in terms of hits, from web access log files. This module uses the <a href="http://bitbucket.org/mattsta/libgeoip-erlang/src/">libgeoip-erlang</a> library to get geolocations from clients' IP addresses.</p>
<h2>libgeoip-erlang installation</h2>
<p>Install the prerequisite software: Mercurial and the GeoIP library. On Debian, these are the packages <a href="http://packages.debian.org/sid/mercurial">mercurial</a>, <a href="http://packages.debian.org/sid/libgeoip1">libgeoip1</a>, and <a href="http://packages.debian.org/sid/libgeoip-dev">libgeoip-dev</a>. Then, get the libgeoip-erlang source code, and compile it:</p>
<pre>hg clone <a href="http://bitbucket.org/mattsta/libgeoip-erlang/" title="http://bitbucket.org/mattsta/libgeoip-erlang/">http://bitbucket.org/mattsta/libgeoip-erlang/</a>
cd libgeoip-erlang/
make
</pre><p>Then, make sure that the generated <code>libgeoip-1.0.1</code> directory is in the code load path, e.g. by passing <code>-pz .../libgeoip-1.0.1</code> to the <code>erl</code> interpreter. I personally prefer to install everything into <code>/usr/local</code>, and to use <a href="http://www.gnu.org/software/stow/">GNU Stow</a> (Debian package <a href="http://packages.debian.org/sid/stow">stow</a>) to manage packages there:</p>
<pre>sudo mkdir -p /usr/local/stow/libgeoip-1.0.1/lib/erlang/lib/
sudo cp -r libgeoip-1.0.1 /usr/local/stow/libgeoip-1.0.1/lib/erlang/lib/
sudo chown -R root:root /usr/local/stow/libgeoip-1.0.1/
sudo stow -d /usr/local/stow/ libgeoip-1.0.1
</pre><p>This has the effect of installing <code>libgeoip-1.0.1</code> into <code>/usr/local/lib/erlang/lib/</code> with the possibility to easily uninstall it like any Stow package, with one command: <code>sudo stow -d /usr/local/stow/ -D libgeoip-1.0.1</code>. After installing additional Erlang libraries into <code>/usr/local/lib/erlang/lib/</code>, those can be loaded simply by setting the <code>ERL_LIBS=/usr/local/lib/erlang/lib</code> environment variable, as shown below.</p>
<p>Get <a href="http://www.maxmind.com/">MaxMind</a>'s free GeoLite City database. On Debian, this can be done by running:</p>
<pre>sudo sh /usr/share/doc/libgeoip1/examples/geolitecityupdate.sh
</pre><p>This command installs the database into <code>/usr/share/GeoIP/GeoIPCity.dat</code>.</p>
<p>Test that <code>libgeoip</code> works correctly:</p>
<pre>$ ERL_LIBS=/usr/local/lib/erlang/lib erl
&gt; application:start(libgeoip_app).
&gt; libgeoip:set_db("/usr/share/GeoIP/GeoIPCity.dat").
&gt; libgeoip:lookup(&lt;&lt;91,121,26,170&gt;&gt;).
</pre><p>This should give you the location of the <code>www.berabera.info</code> server in France.</p>
<h2>Using log_geoip_stats to analyze web client locations</h2>
<p>Logtilla's <a href="http://github.com/rlenglet/Logtilla/blob/master/src/log_geoip_stats.erl">log_geoip_stats</a> module uses the <code>libgeoip</code> library to count the number of parsed log entries per client country. Here is a sample usage to analyze a single Apache <code>access.log</code> file:</p>
<pre>$ cd src
$ PATH=../c_src:$PATH ERL_LIBS=/usr/local/lib/erlang/lib erl
&gt; application:start(libgeoip_app).
&gt; libgeoip:set_db("/usr/share/GeoIP/GeoIPCity.dat").
&gt; {ok, Pid} = gen_log_analyzer:start_link(log_geoip_stats, [], []).
&gt; ok = gen_log_analyzer:parse(Pid, "/var/log/apache2/access.log").
&gt; log_geoip_stats:get_stats(Pid, 10).
</pre><p>The <code>get_stats/2</code> function orders the countries by number of hits, converts the numbers of hits into percentages, and returns the top N countries (here, N=10). For one of my <code>access.log</code> files, this prints out:</p>
<pre>[{'US',40.94006639874192},
 {'JP',30.572543537771566},
 {'GB',4.403284990389656},
 {'FR',3.9664511619779836},
 {'BY',3.8266643368862483},
 {'TR',3.6286330013396237},
 {'CH',2.958821131108393},
 {'TH',0.9260877162327451},
 {'PR',0.9202632651872561},
 {'CA',0.9086143630962782}]
</pre><p>Most of the hits in that period came from the USA (about 41%) and Japan (about 31%).</p>
<h2>Implementation overview</h2>
<p>In module <code>log_geoip_stats</code>, most of the code is boilerplate to implement the <code>gen_log_analyzer</code> behaviour. The most interesting pieces are functions <code>handle_log_entry/2</code> and <code>get_stats/2</code>:</p>
<pre><em>% Analyze a parsed log entry:</em>
handle_log_entry(LogEntry, State) -&gt;
    case LogEntry#'LogEntry'.'remote-host' of
        {'ip-address', IPAddress} -&gt;
            case libgeoip:lookup(list_to_binary(IPAddress)) of
                {geoip, Country, _, _, _, _, _, _, _} -&gt;
                    <em>% Address found:</em>
                    State1 = update_country(list_to_atom(Country), State),
                    {ok, State1};
                [] -&gt;
                    <em>% Address unknown to the GeoIP library:</em>
                    State1 = update_country('unknown', State),
                    {ok, State1}
            end;
        _Else -&gt;
            <em>% If the client address is a hostname or an ip6-address,</em>
            <em>% count it as 'unknown':</em>
            State1 = update_country('unknown', State),
            {ok, State1}
    end.

<em>% Query the stats from the analysis process, and order and convert the data:</em>
get_stats(Name, Length) -&gt;
    Stats = dict:to_list(gen_log_analyzer:call(Name, get_stats)),
    Total = lists:foldl(fun({_, Count}, Acc) -&gt; Acc + Count end,
                         0, Stats),
    Stats1 = lists:sort(fun({_, C1}, {_, C2}) -&gt; C1 &gt; C2 end, Stats),
    Stats2 = lists:sublist(Stats1, Length),
    lists:map(fun({Country, Count}) -&gt; {Country, Count*100/Total} end, Stats2).
</pre><p>One possible improvement would be to handle timeouts in calls to <code>libgeoip:lookup/1</code>. The implementation of that function implicitly imposes an arbitrary timeout of 200 ms, which I have sometimes seen expire. My current implementation does not tolerate such timeouts.</p>
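<p>As a minimal sketch of that improvement, the lookup could be wrapped so that a timeout is counted as <code>'unknown'</code> instead of crashing the analyzer. This assumes that the timeout surfaces as an <code>exit</code> with a reason of the form <code>{timeout, _}</code>, as happens when a <code>gen_server:call</code> times out; whether <code>libgeoip:lookup/1</code> fails in exactly this way is an assumption. The module name <code>safe_geoip</code> and the function <code>country/2</code> are hypothetical, and the lookup function is passed as an argument so that the sketch does not depend on libgeoip:</p>
<pre>-module(safe_geoip).
-export([country/2]).

<em>% Lookup is a fun of one argument, e.g. fun libgeoip:lookup/1.</em>
<em>% Returns the country as an atom, or 'unknown' when the address is</em>
<em>% not found or when the lookup exits with a timeout (assumed shape).</em>
country(Lookup, Address) -&gt;
    try Lookup(Address) of
        {geoip, Country, _, _, _, _, _, _, _} -&gt;
            list_to_atom(Country);
        [] -&gt;
            unknown
    catch
        exit:{timeout, _} -&gt;
            unknown
    end.
</pre><p>In <code>handle_log_entry/2</code>, the nested <code>case</code> on the lookup result could then be replaced by a call such as <code>safe_geoip:country(fun libgeoip:lookup/1, list_to_binary(IPAddress))</code>.</p>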
    ]]></content>
  </entry>
  <entry>
    <title>Logtilla&#039;s internals: a tutorial on using ASN.1 and linked replies to interact between Erlang and port programs</title>
    <link rel="alternate" type="text/html" href="http://www.berabera.info/en/node/267" />
    <id>http://www.berabera.info/en/node/267</id>
    <published>2009-09-15T22:02:15+09:00</published>
    <updated>2009-09-15T22:01:42+09:00</updated>
    <author>
      <name>Romain Lenglet</name>
    </author>
    <category term="ASN.1" />
    <category term="Erlang/OTP" />
    <category term="Logtilla" />
    <summary type="html"><![CDATA[<p>I have recently released <a href="http://github.com/rlenglet/Logtilla">Logtilla</a>, a framework for parsing and analysing web access log files, and showed in a previous article <a href="/en/node/266">how to use Logtilla with a simple example</a>. One of the most interesting aspects of Logtilla lies in its internals. Logtilla demonstrates how to use <a href="http://en.wikipedia.org/wiki/ASN.1">ASN.1</a> to communicate between an Erlang program and a C port program. This article is both an introduction to the usual concepts of ASN.1, and a tutorial on applying ASN.1 to integrate Erlang and C programs. The protocol defined in Logtilla is simple, but the approach described can be used to implement sophisticated distributed protocols, and is especially appropriate for systems management applications.</p>
    ]]></summary>
    <content type="html"><![CDATA[<p>I have recently released <a href="http://github.com/rlenglet/Logtilla">Logtilla</a>, a framework for parsing and analysing web access log files, and showed in a previous article <a href="/en/node/266">how to use Logtilla with a simple example</a>. One of the most interesting aspects of Logtilla lies in its internals. Logtilla demonstrates how to use <a href="http://en.wikipedia.org/wiki/ASN.1">ASN.1</a> to communicate between an Erlang program and a C port program. This article is both an introduction to the usual concepts of ASN.1, and a tutorial on applying ASN.1 to integrate Erlang and C programs. The protocol defined in Logtilla is simple, but the approach described can be used to implement sophisticated distributed protocols, and is especially appropriate for systems management applications.</p>
<p>In Erlang, a <a href="http://www.erlang.org/doc/tutorial/c_port.html">port program</a> (or C port) is a program that is executed as a subprocess of the Erlang interpreter, and communicates with Erlang via its standard input and output. In an Erlang program, a port program is started using standard function <code>open_port/2</code>, and killed using <code>port_close/1</code>. The Erlang program sends binary data to the process' standard input using <code>port_command/2</code>, and can receive data from the process' standard output as a standard Erlang message in the form <code>{Port, {data, Data}}</code>, where <code>Data</code> is a binary.</p>
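<p>To make these functions concrete, here is a minimal sketch that uses the Unix <code>cat</code> command as a stand-in port program, since <code>cat</code> simply copies its standard input to its standard output. The module name <code>port_echo</code> is hypothetical, and the sketch assumes that <code>cat</code> is found in the <code>PATH</code>:</p>
<pre>-module(port_echo).
-export([echo/1]).

<em>% Start "cat" as a port program, send Data to its standard input,</em>
<em>% and wait for the first chunk it writes to its standard output.</em>
echo(Data) when is_binary(Data) -&gt;
    Port = open_port({spawn, "cat"}, [binary, stream]),
    true = port_command(Port, Data),
    Reply = receive
                {Port, {data, Echoed}} -&gt; {ok, Echoed}
            after 5000 -&gt;
                {error, timeout}
            end,
    true = port_close(Port),
    Reply.
</pre><p>In <code>stream</code> mode the port delivers bytes without any message boundaries, so a large reply may arrive split over several <code>{Port, {data, Data}}</code> messages. This sketch only waits for the first one.</p>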
<p>The main problem is to design and implement a protocol to structure the data exchanged between the Erlang program and the port program. One could choose for instance an XML-based protocol, or a simple text-based protocol with one command per line (simple, but requires parsing on the receiving side), etc. In the case of Logtilla, the purpose of the port program is to parse text files, and to return structured data to the Erlang program (Logtilla's <a href="http://github.com/rlenglet/Logtilla/blob/master/src/gen_log_analyzer.erl"><code>gen_log_analyzer</code></a> module), so I needed to define a protocol with reasonably complex, but well-structured data types. Since Erlang/OTP has an excellent ASN.1 compiler, and ASN.1's purpose is to define structured data types for protocols, ASN.1 is the formalism of choice.</p>
<h2>An overview of ASN.1 and Logtilla's datatypes</h2>
<p>Logtilla's datatypes corresponding to the <a href="http://en.wikipedia.org/wiki/Common_Log_Format">Common Log Format</a> and <a href="http://httpd.apache.org/docs/2.0/logs.html">Apache's Combined Log Format</a> are defined in ASN.1 module <code>WebAccessLog</code>, in file <a href="http://github.com/rlenglet/Logtilla/blob/master/asn1/WebAccessLog.asn1"><code>asn1/WebAccessLog.asn1</code></a>. The main type it defines is <code>LogEntry</code>:</p>
<pre><strong>LogEntry</strong> ::= SEQUENCE {
  <em>-- Entries in the common log format:</em>
  remote-host         <strong>NetworkAddress</strong>,
  client-identity [0] UTF8String OPTIONAL,
  auth-user       [1] UTF8String OPTIONAL,
  time                GeneralizedTime,
  request             UTF8String,
  status              <strong>HTTPStatusCode</strong>,
  length              INTEGER (0..MAX) OPTIONAL,
  <em>-- Additional entries in the combined log format:</em>
  referrer        [2] UTF8String OPTIONAL,
  user-agent      [3] UTF8String OPTIONAL,
  ...
}
</pre><p>This definition is straightforward. A <code>LogEntry</code> is a <code>SEQUENCE</code>, which is similar to a <code>struct</code> in C or a record in Erlang. It is an ordered set of fields, each of which has a name, e.g. <code>remote-host</code>, and a type, e.g. <code>UTF8String</code>. Here the fields are a direct transcription of the log formats. ASN.1 defines many standard types; only the types in bold above are custom-defined. ASN.1 types can be constrained, e.g. the <code>(0..MAX)</code> constraint on field <code>length</code> limits the range of values that this <code>INTEGER</code> field can take. A field can also be <code>OPTIONAL</code>. The ellipsis <code>"..."</code> at the end of the definition makes the type extensible: any parser of this type can ignore any new field added in future versions. This ensures that future versions of the definition, with fields added after the ellipsis, are backward-compatible with older decoders.</p>
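<p>On the Erlang side, such a <code>SEQUENCE</code> is handled as a record, and an absent <code>OPTIONAL</code> field holds the atom <code>asn1_NOVALUE</code>. The following sketch illustrates this with a hand-written record that only mirrors the field names of the ASN.1 definition above. It is a stand-in written for this illustration, not the generated <code>WebAccessLog.hrl</code>, and the module name <code>log_entry_sketch</code> and its functions are hypothetical:</p>
<pre>-module(log_entry_sketch).
-export([new/4, with_length/2, length_or_default/2]).

<em>% Hand-written stand-in for the record generated from the ASN.1 type.</em>
<em>% OPTIONAL fields default to asn1_NOVALUE, meaning "absent".</em>
-record('LogEntry', {'remote-host',
                     'client-identity' = asn1_NOVALUE,
                     'auth-user' = asn1_NOVALUE,
                     time, request, status,
                     length = asn1_NOVALUE,
                     referrer = asn1_NOVALUE,
                     'user-agent' = asn1_NOVALUE}).

<em>% Build an entry with only the mandatory fields set.</em>
new(RemoteHost, Time, Request, Status) -&gt;
    #'LogEntry'{'remote-host' = RemoteHost, time = Time,
                request = Request, status = Status}.

<em>% Set the optional length; the guard mirrors the (0..MAX) constraint.</em>
with_length(Entry, Length) when is_integer(Length), Length &gt;= 0 -&gt;
    Entry#'LogEntry'{length = Length}.

<em>% Read the optional length, falling back to Default when it is absent.</em>
length_or_default(#'LogEntry'{length = asn1_NOVALUE}, Default) -&gt; Default;
length_or_default(#'LogEntry'{length = Length}, _Default) -&gt; Length.
</pre>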
<p>When encoding a <code>LogEntry</code> structure, an ASN.1 encoder encodes each field in sequence, and attaches a tag to each encoded field. Module <code>WebAccessLog</code> is declared with option <code>IMPLICIT TAGS</code>. A field that has no explicit tag is then encoded with the standard tag associated with its ASN.1 type. However, in some cases tags have to be manually disambiguated, as in the case of the successive <code>referrer</code> and <code>user-agent</code> fields, which are both of type <code>UTF8String</code> and <code>OPTIONAL</code>: if only one string were transmitted, a decoder could not determine from the type's tag which of the two fields it corresponds to. Therefore, I had to specify context-specific tags <code>[2]</code> and <code>[3]</code> to disambiguate.</p>
<p>The <code>NetworkAddress</code> type definition is straightforward too:</p>
<pre><strong>NetworkAddress</strong> ::= CHOICE {
  hostname        UTF8String,
  ip-address  [0] OCTET STRING(SIZE (4)),
  ip6-address [1] OCTET STRING(SIZE (16)),
  ...
}
</pre><p>It is a <code>CHOICE</code>, corresponding roughly to a <code>union</code> in C: a network address is either a hostname, or an IPv4 address, or an IPv6 address, etc. Only the chosen field is encoded/decoded. Again, since the <code>ip-address</code> and <code>ip6-address</code> fields have the same type, they have to be disambiguated by specifying context-specific tags, for the decoder to determine which field has been chosen. An <code>OCTET STRING</code> is a simple sequence of bytes.</p>
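<p>In Erlang, a decoded <code>CHOICE</code> value is a <code>{FieldName, Value}</code> tuple, which can be handled with ordinary pattern matching. As a small sketch, the hypothetical function <code>network_address:format/1</code> below turns a <code>NetworkAddress</code> into a printable string. It accepts the octets either as a list of bytes or as a binary, since which of the two the decoder produces is an assumption here:</p>
<pre>-module(network_address).
-export([format/1]).

<em>% One clause per alternative of the CHOICE.</em>
format({hostname, Name}) -&gt;
    Name;
format({'ip-address', Octets}) -&gt;
    &lt;&lt;A, B, C, D&gt;&gt; = iolist_to_binary(Octets),
    inet:ntoa({A, B, C, D});
format({'ip6-address', Octets}) -&gt;
    &lt;&lt;A:16, B:16, C:16, D:16, E:16, F:16, G:16, H:16&gt;&gt; =
        iolist_to_binary(Octets),
    inet:ntoa({A, B, C, D, E, F, G, H}).
</pre><p>For instance, <code>network_address:format({'ip-address', [91,121,26,170]})</code> returns the string <code>"91.121.26.170"</code>.</p>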
<p>The <code>HTTPStatusCode</code> type is an enumeration of the HTTP status codes, as defined in standard <a href="http://tools.ietf.org/html/rfc2616">IETF RFC 2616</a>:</p>
<pre><strong>HTTPStatusCode</strong> ::= ENUMERATED {
  continue(100),
  switching-protocols(101),
  ok(200),
  created(201),
  accepted(202),
  non-authoritative-information(203),
  no-content(204),
  reset-content(205),
  partial-content(206),
  <em>etc. etc. etc.</em>
  ...
}
</pre><p>An <code>ENUMERATED</code> type is similar to an <code>enum</code> in C.</p>
<h2>Defining Logtilla's protocol using ASN.1</h2>
<p>I am using the terminology from the <a href="http://www.itu.int/rec/T-REC-X.903/en">RM-ODP (Reference Model of Open Distributed Processing, ITU-T X.903)</a>. The most common concept for modeling interactions between two systems (a client and a server) is the concept of <em>operation</em>. There are two kinds of operations:</p>
<ul>
<li>an <em>interrogation</em>, which consists of two interactions:
<ul>
<li>an <em>invocation</em>, initiated by a client to convey information to a server to request the server to perform a function, followed by</li>
<li>a <em>termination</em>, initiated by the server to convey information in response to the invocation.</li>
</ul>
</li>
<li>an <em>announcement</em>, which consists of only an invocation.</li>
</ul>
<p>Examples of announcements are Erlang messages and CORBA "oneway" remote method calls; common examples of interrogations are RPC calls and CORBA remote method calls. In many cases, the possible operations that can be performed with a server are specified in an IDL (Interface Definition Language). In the <a href="http://en.wikipedia.org/wiki/OSI_model">OSI</a> world, the IDL is <a href="http://www.itu.int/rec/T-REC-X.880/en">ROSE (Remote Operations Service Elements, ITU-T X.880-X.882)</a>. I use ROSE's concepts to specify the protocol implemented in Logtilla. Unfortunately, the ASN.1 modules specified in ROSE cannot be compiled using the <a href="http://lionet.info/asn1c/">asn1c ASN.1-to-C compiler</a>, which I use in Logtilla, so I wrote the protocol specification by hand in "pure" ASN.1.</p>
<p>Logtilla currently implements only one operation: <code>ParseLogFile</code>. It is an interrogation. I defined one ASN.1 type for each PDU, in file <a href="http://github.com/rlenglet/Logtilla/blob/master/asn1/WebAccessLog.asn1"><code>asn1/WebAccessLogParserOperations.asn1</code></a>, each used for either an invocation or a termination. There are two possible terminations for a <code>ParseLogFile</code> operation: <code>EndOfFile</code> signals the end of the parsing of a file, and <code>CannotOpenFile</code> signals an error when opening the file.</p>
<pre><em>-- Invocation:</em>
<strong>ParseLogFile</strong> ::= SEQUENCE {
  invoke-id INTEGER,
  argument UTF8String -- the name of the file to parse
}
 
<em>-- Erroneous termination of a ParseLogFile operation:</em>
<strong>CannotOpenFile</strong> ::= SEQUENCE {
  invoke-id INTEGER
}
 
<em>-- Normal termination:</em>
<strong>EndOfFile</strong> ::= SEQUENCE {
  invoke-id INTEGER
}
</pre><p>Since there may be several operations performed simultaneously by a client, the client must be able to associate a received termination to a pending operation. Therefore, each invocation PDU contains an <em>invoke ID</em> field, which is an integer identifier that is unambiguous to the client. This invoke ID is sent back by the server in each termination PDU, also in an <em>invoke ID</em> field.</p>
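<p>The client-side bookkeeping that this implies can be sketched as follows: allocate a fresh invoke ID for each invocation, remember who is waiting for it, and look the waiter up again when a termination arrives. This is only an illustration under my own naming, with a hypothetical module <code>pending_ops</code>; it is not how <code>gen_log_analyzer</code> necessarily stores its state:</p>
<pre>-module(pending_ops).
-export([new/0, add/2, take/2]).

<em>% The state is the next invoke ID to allocate, and a map from the</em>
<em>% invoke IDs of pending operations to their callers.</em>
new() -&gt;
    {0, #{}}.

<em>% Register a pending operation; returns its invoke ID and the new state.</em>
add(Caller, {NextId, Pending}) -&gt;
    {NextId, {NextId + 1, Pending#{NextId =&gt; Caller}}}.

<em>% Called when a termination PDU carrying InvokeId is received.</em>
take(InvokeId, {NextId, Pending}) -&gt;
    case maps:take(InvokeId, Pending) of
        {Caller, Pending1} -&gt; {ok, Caller, {NextId, Pending1}};
        error -&gt; error
    end.
</pre><p>Because every pending operation gets a distinct ID, a termination can be matched to its operation even when several operations are in progress at the same time.</p>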
<p>This protocol must be completed by additional operations to allow the Logtilla file parser (the server) to send back log entries (<code>LogEntry</code> data) to the Erlang client. I use the concept of <em>linked reply</em>. This concept is used a lot in systems management protocols, the most important being <a href="http://www.itu.int/rec/T-REC-X.711/en">CMIP (Common Management Information Protocol, ITU-T X.711)</a>. The concept of linked reply is one of the few concepts that are very specific to systems management, cf. <a href="http://www.itu.int/rec/T-REC-X.703/en">ODMA (Open Distributed Management Architecture, ITU-T X.703)</a>.</p>
<p>A linked reply is an operation that is initiated by the server, and identified as part of an operation that was initiated by the client, and that happens before the client operation terminates. In ROSE, this is implemented by adding a <em>linked ID</em> field to each invocation of a linked reply, which corresponds to the invoke ID of the client operation. In Logtilla, the log file parser returns <code>LogEntry</code> data in a sequence of <code>ReturnLogEntry</code> linked replies, whose PDU simply contains a linked ID and a <code>LogEntry</code>:</p>
<pre><em>-- Invocation:</em>
<strong>ReturnLogEntry</strong> ::= SEQUENCE {
  linked-id INTEGER, <em>-- an invoke-id passed in a ParseLogFile</em>
  argument LogEntry
}
</pre><p>Note that <code>ReturnLogEntry</code> is an announcement, i.e. it doesn't require a termination, so there is no need to include an invoke ID in the <code>ReturnLogEntry</code> type.</p>
<p>The possible sequences of interactions / PDUs for a <code>ParseLogFile</code> operation are therefore:</p>
<ul>
<li>success:<br />
<pre>client -----ParseLogFile----&gt; server
client &lt;---ReturnLogEntry---- server
client &lt;---ReturnLogEntry---- server
...
client &lt;---ReturnLogEntry---- server
client &lt;------EndOfFile------ server
</pre></li>
<li>error:<br />
<pre>client ----ParseLogFile----&gt; server
client &lt;---CannotOpenFile--- server
</pre></li>
</ul>
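<p>The two sequences above can be checked with a small function that walks over a list of already decoded server PDUs and collects the log entries linked to one given invoke ID. This is a sketch for illustration only: the module name <code>supplier_pdus</code> is hypothetical, and the records are hand-written stand-ins for the generated ones:</p>
<pre>-module(supplier_pdus).
-export([collect/2]).

<em>% Hand-written stand-ins for the records generated from the ASN.1 types.</em>
-record('CannotOpenFile', {'invoke-id'}).
-record('ReturnLogEntry', {'linked-id', argument}).
-record('EndOfFile', {'invoke-id'}).

<em>% Collect the entries of the operation identified by InvokeId.</em>
collect(InvokeId, PDUs) -&gt;
    collect(InvokeId, PDUs, []).

collect(Id, [{'return-log-entry',
              #'ReturnLogEntry'{'linked-id' = Id, argument = Entry}}
             | Rest], Acc) -&gt;
    collect(Id, Rest, [Entry | Acc]);
collect(Id, [{'end-of-file', #'EndOfFile'{'invoke-id' = Id}} | _], Acc) -&gt;
    {ok, lists:reverse(Acc)};
collect(Id, [{'cannot-open-file', #'CannotOpenFile'{'invoke-id' = Id}} | _],
        _Acc) -&gt;
    {error, cannot_open_file};
collect(Id, [_Other | Rest], Acc) -&gt;
    <em>% A PDU that belongs to another operation: skip it.</em>
    collect(Id, Rest, Acc);
collect(_Id, [], Acc) -&gt;
    <em>% No termination seen yet for this operation.</em>
    {incomplete, lists:reverse(Acc)}.
</pre>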
<p>We still have to group the PDU types to identify all the PDUs that can be sent by the Erlang client (and received by the C server), and vice-versa. This is easily done by defining two <code>CHOICE</code> types:</p>
<pre><em>-- PDUs sent by the client to the server:</em>
<strong>ConsumerPDU</strong> ::= CHOICE {
  parse-log-file [1] ParseLogFile,
  ...
}
 
<em>-- PDUs sent by the server to the client:</em>
<strong>SupplierPDU</strong> ::= CHOICE {
  cannot-open-file [1] CannotOpenFile,
  return-log-entry [2] ReturnLogEntry,
  end-of-file [3] EndOfFile,
  ...
}
</pre><h2>Implementing the protocol in Erlang</h2>
<p>Since communication with port programs is based on streams of bytes, one must find a way to segment the data into PDUs. This is done in Logtilla by passing the <code>{packet, 2}</code> option to the <code>open_port/2</code> function when starting the C port program. With this option each sent PDU is prepended with its size in 2 bytes, and the data sent by the port program is segmented by reading a PDU size in 2 bytes, reading a binary of exactly that size, and passing that binary to the Erlang program in a <code>{Port, {data, Data}}</code> message. This takes care of segmenting. To transmit PDUs over TCP connections, instead of between Erlang programs and port programs, one can use the <a href="http://tools.ietf.org/html/rfc2126">TPKT packet format (IETF RFC 2126)</a>, which similarly prepends the PDU size to each PDU.</p>
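<p>For illustration, the framing that <code>{packet, 2}</code> performs automatically can be written by hand with Erlang's binary syntax. The module <code>packet2</code> below is a hypothetical sketch, not code from Logtilla: <code>frame/1</code> prepends the 2-byte big-endian size, and <code>unframe/1</code> splits a buffer into complete PDUs plus the bytes left over:</p>
<pre>-module(packet2).
-export([frame/1, unframe/1]).

<em>% Prefix a PDU with its size as a 16-bit big-endian integer.</em>
frame(PDU) when is_binary(PDU), byte_size(PDU) =&lt; 16#FFFF -&gt;
    &lt;&lt;(byte_size(PDU)):16/big, PDU/binary&gt;&gt;.

<em>% Split a buffer into the complete PDUs it contains, and return</em>
<em>% them together with the remaining, incomplete bytes.</em>
unframe(Buffer) -&gt;
    unframe(Buffer, []).

unframe(&lt;&lt;Size:16/big, PDU:Size/binary, Rest/binary&gt;&gt;, Acc) -&gt;
    unframe(Rest, [PDU | Acc]);
unframe(Rest, Acc) -&gt;
    {lists:reverse(Acc), Rest}.
</pre><p>The C port program has to do the equivalent on its side: read 2 bytes, then read exactly that many bytes, and write the size before each PDU it sends.</p>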
<p>Logtilla's <code>gen_log_analyzer</code> only has to decode the encoded PDUs, to match the type of the PDU using normal pattern matching, and to call the behaviour module's appropriate callbacks:</p>
<pre>handle_info({Port, {data, EncPDU}}, State) when is_port(Port) -&gt;
  {ok, PDU} = 'WebAccessLogParserOperations':decode('SupplierPDU', EncPDU),
  case PDU of
    {'cannot-open-file', CannotOpenFile} -&gt;
      #'CannotOpenFile'{'invoke-id'=InvokeId} = CannotOpenFile, ...,
      {noreply, NewState};
    {'return-log-entry', ReturnLogEntry} -&gt;
      #'ReturnLogEntry'{'linked-id'=LinkedId, 'argument'=LogEntry} = ReturnLogEntry,
      ..., Mod:handle_log_entry(LogEntry, ModState), ...,
      {noreply, NewState};
    {'end-of-file', EndOfFile} -&gt;
      #'EndOfFile'{'invoke-id'=InvokeId} = EndOfFile, ...,
      {noreply, NewState}
  end.
</pre><p>Sending PDUs to the port program is also easy:</p>
<pre>handle_call({parse, FileName}, {From, Tag}, State) -&gt;
  Port = ...,
  <em>% Build and encode the PDU into a binary:</em>
  InvokeId = ...,
  Invoke = #'ParseLogFile'{'invoke-id'=InvokeId, 'argument'=FileName},
  {ok, PDU} = 'WebAccessLogParserOperations':encode(
    'ConsumerPDU', {'parse-log-file', Invoke}),
  <em>% Using the {packet, 2} option for the port, the PDU will be</em>
  <em>% prefixed by its size as a 16-bit integer:</em>
  port_command(Port, PDU),
  ...,
  {noreply, NewState};
</pre><p>All the encoding/decoding is implemented in modules automatically generated by OTP's ASN.1 compiler, as well as the record definitions (<code>CannotOpenFile</code>, <code>ReturnLogEntry</code>, <code>EndOfFile</code>, <code>LogEntry</code>, etc.).</p>
<p>I will describe the C side of the protocol implementation in a separate article.</p>
<h2>Conclusion</h2>
<p>This article describes a step-by-step approach to define a lightweight protocol to allow interactions between heterogeneous systems (a C port program and an Erlang program, in the case of Logtilla). The encoded form of ASN.1 data is very compact, and ASN.1 is very expressive: it defines many standard datatypes, etc.</p>
<p>The <em>linked reply</em> pattern is very common in systems management, since when querying a large set of managed objects for their states we usually want to be able to start processing replies as soon as managed objects send them back, without having to wait for all objects to have replied. This is also useful in the case of Logtilla, where using linked replies allows the Erlang program to handle and analyze replied log entries as soon as they are parsed by the port program. I believe that the <em>linked reply</em> concept applies to many contexts where large volumes of data are exchanged, and should not be confined to systems management.</p>
    ]]></content>
  </entry>
  <entry>
    <title>First release of Logtilla, a web access log analyzer in Erlang</title>
    <link rel="alternate" type="text/html" href="http://www.berabera.info/en/node/266" />
    <id>http://www.berabera.info/en/node/266</id>
    <published>2009-09-13T18:11:09+09:00</published>
    <updated>2009-09-13T18:53:22+09:00</updated>
    <author>
      <name>Romain Lenglet</name>
    </author>
    <category term="Erlang/OTP" />
    <category term="Logtilla" />
    <category term="Web" />
    <summary type="html"><![CDATA[<p>I have written a small Erlang framework for parsing web access logs, called <a href="http://github.com/rlenglet/Logtilla">Logtilla</a>, hosted on <a href="http://github.com/rlenglet">GitHub</a>. This framework supports parsing logs in the <a href="http://en.wikipedia.org/wiki/Common_Log_Format">Common Log Format</a>, or in <a href="http://httpd.apache.org/docs/2.0/logs.html">Apache's Combined Log Format</a>. Thanks to the use of a C port program to do the parsing, Logtilla is very efficient: it can parse and analyze 15,000 entries/sec on my 4-year-old laptop.</p>
    ]]></summary>
    <content type="html"><![CDATA[<p>I have written a small Erlang framework for parsing web access logs, called <a href="http://github.com/rlenglet/Logtilla">Logtilla</a>, hosted on <a href="http://github.com/rlenglet">GitHub</a>. This framework supports parsing logs in the <a href="http://en.wikipedia.org/wiki/Common_Log_Format">Common Log Format</a>, or in <a href="http://httpd.apache.org/docs/2.0/logs.html">Apache's Combined Log Format</a>. Thanks to the use of a C port program to do the parsing, Logtilla is very efficient: it can parse and analyze 15,000 entries/sec on my 4-year-old laptop.</p>
<h2>Installation</h2>
<p>To build it, pull the Git archive from <a href="http://github.com/rlenglet/Logtilla">the Logtilla Git repository</a>, and then initialize the build system, configure, and build:</p>
<pre>autoreconf -vi
./configure
make
</pre><p>This requires you to install <a href="http://www.gnu.org/software/autoconf/">Autoconf</a>, <a href="http://www.gnu.org/software/automake/">Automake</a>, the <a href="http://lionet.info/asn1c/">asn1c ASN.1-to-C compiler</a> which is used by Logtilla (I have tested that both the released 0.9.21 version and the version in the asn1c SVN repository are usable for Logtilla), and of course any recent version of <a href="http://www.erlang.org/">Erlang/OTP</a>.</p>
<h2>Overview of Logtilla</h2>
<p>Logtilla consists essentially of a single behaviour module: <code>gen_log_analyzer</code>, which defines the following callbacks:</p>
<ul>
<li><code>init/1</code>: Initialize the state.<br />
<pre><strong>init(</strong><span style="color: green;">Args</span>::any()<strong>)</strong> -&gt;
    {'ok', State::any()}
    | 'ignore'
    | {'stop', Reason::any()}.
</pre></li>
<li><code>handle_log_entry/2</code>: Handle a parsed log entry. The <code>LogEntry</code> record type is defined in header file <code>WebAccessLog.hrl</code>.<br />
<pre><strong>handle_log_entry(</strong><span style="color: green;">LogEntry</span>::#'LogEntry'(), <span style="color: green;">State</span>::any()<strong>)</strong> -&gt;
    {'ok', NewState::any()}
    | {'error', Reason::any(), NewState::any()}.
</pre></li>
<li><code>handle_call/3</code>: Handle an application-specific call. This callback is similar to the <code>gen_server:handle_call/3</code> callback.<br />
<pre><strong>handle_call(</strong><span style="color: green;">Msg</span>::any(), {<span style="color: green;">From</span>::pid(), <span style="color: green;">Tag</span>::any()}, <span style="color: green;">State</span>::any()<strong>)</strong> -&gt;
    {'reply', Reply::any(), NewState::any()}
    | {'reply', Reply::any(), NewState::any(), Timeout::timeout()}
    | {'noreply', NewState::any()}
    | {'noreply', NewState::any(), Timeout::timeout()}
    | {'stop', Reason::any(), Reply::any(), NewState::any()}.
</pre></li>
<li><code>handle_cast/2</code>: Handle an application-specific cast. This callback is similar to the <code>gen_server:handle_cast/2</code> callback.<br />
<pre><strong>handle_cast(</strong><span style="color: green;">Msg</span>::any(), <span style="color: green;">State</span>::any()<strong>)</strong> -&gt;
    {'noreply', NewState::any()}
    | {'noreply', NewState::any(), Timeout::timeout()}
    | {'stop', Reason::any(), NewState::any()}.
</pre></li>
<li><code>terminate/2</code>: Cleanup on termination. This callback is similar to the <code>gen_server:terminate/2</code> callback.<br />
<pre><strong>terminate(</strong><span style="color: green;">Reason</span>::any(), <span style="color: green;">State</span>::any()<strong>)</strong> -&gt;
    no_return().
</pre></li>
<li><code>code_change/3</code>: Update the state after a module upgrade. This callback is similar to the <code>gen_server:code_change/3</code> callback.<br />
<pre><strong>code_change(</strong>{'down', <span style="color: green;">OldVsn</span>::any()} | <span style="color: green;">OldVsn</span>::any(), <span style="color: green;">State</span>::any(), <span style="color: green;">Extra</span>::any()<strong>)</strong> -&gt;
    {'ok', NewState::any()}.
</pre></li>
</ul>
<p>The most important callbacks to implement are <code>init/1</code> and <code>handle_log_entry/2</code>.</p>
<h2>Running example</h2>
<p>Logtilla contains a basic example module, <code>log/logtilla_test</code>. It counts how many parsed log entries correspond to a query reply for which a length was returned, and how many don't have a length. This module has no practical purpose, but is useful to illustrate the behaviour callbacks. The module's most important parts are:</p>
<pre>-module(logtilla_test).

<em>% Implement Logtilla's gen_log_analyzer behaviour:</em>
-behaviour(gen_log_analyzer).
<em>% Include Logtilla's header for the definition of the LogEntry record:</em>
-include("WebAccessLog.hrl").

<em>% Define and initialize the state:</em>
-record(state, {count_without_length=0, count_with_length=0}).
init([]) -&gt;
  State = #state{},
  {ok, State}.

<em>% Analyze the log entry and update the state:</em>
handle_log_entry(LogEntry, State) -&gt;
  case LogEntry#'LogEntry'.length of
    asn1_NOVALUE -&gt;
      {ok, State#state{
        count_without_length=State#state.count_without_length+1}};
    _Length -&gt;
      {ok, State#state{
        count_with_length=State#state.count_with_length+1}}
  end.

<em>% Implement an application-specific call to return the stats:</em>
handle_call(get_stats, _, State) -&gt;
  {reply, {State#state.count_without_length, State#state.count_with_length},
   State}.
</pre><p>To execute this example to parse a file named <code>/var/log/apache2/access.log</code>:</p>
<pre>$ cd src
$ PATH=../c_src:$PATH erl
&gt; {ok, Pid} = gen_log_analyzer:start_link(logtilla_test, [], []).
&gt; ok = gen_log_analyzer:parse(Pid, "/var/log/apache2/access.log").
&gt; gen_log_analyzer:call(Pid, get_stats).
</pre><p>This prints out a tuple with the count of entries without a length and the count of entries with a length.</p>
<p>You must add the <code>c_src</code> directory to the <code>PATH</code>, as it is where the <code>logtilla_parser</code> program is generated, and this program is executed as a port program by <code>gen_log_analyzer</code> to parse the files, so this program must be found in the <code>PATH</code>.</p>
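<p>Whether a program can be found in the <code>PATH</code> can be checked from Erlang with the standard function <code>os:find_executable/1</code>, before starting any analyzer. The wrapper below is a small hypothetical sketch of such a check, not part of Logtilla:</p>
<pre>-module(port_program_check).
-export([find/1]).

<em>% Look up a program in the PATH, as is done when spawning a port program.</em>
find(Name) -&gt;
    case os:find_executable(Name) of
        false -&gt; {error, {not_in_path, Name}};
        Path -&gt; {ok, Path}
    end.
</pre><p>If <code>port_program_check:find("logtilla_parser")</code> returns an error tuple, the <code>PATH</code> passed to <code>erl</code> does not include the <code>c_src</code> directory.</p>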
<p>I will soon write other blog posts on the internals of Logtilla (which is the most interesting), and on future works.</p>
    ]]></content>
  </entry>
  <entry>
    <title>Coming changes in GNU Autoconf&#039;s Erlang support</title>
    <link rel="alternate" type="text/html" href="http://www.berabera.info/en/node/238" />
    <id>http://www.berabera.info/en/node/238</id>
    <published>2009-08-25T21:37:57+09:00</published>
    <updated>2009-08-25T22:07:24+09:00</updated>
    <author>
      <name>Romain Lenglet</name>
    </author>
    <category term="Erlang/OTP" />
    <category term="GNU Autoconf" />
    <category term="GNU Autotools" />
    <summary type="html"><![CDATA[<p>I have sent <a title="patches to GNU Autoconf to add new macros for testing Erlang modules, include files, and functions" href="http://lists.gnu.org/archive/html/autoconf-patches/2009-08/msg00010.html">patches to GNU Autoconf to add new macros for testing Erlang modules, include files, and functions</a>: <code>AC_ERLANG_CHECK_MOD</code>, <code>AC_ERLANG_CHECK_HEADER</code>, <code>AC_ERLANG_CHECK_LIB_HEADER</code>, and <code>AC_ERLANG_CHECK_FUNC</code>. I have also sent <a title="a patch to fix the AC_RUN_IFELSE macro" href="http://lists.gnu.org/archive/html/autoconf-patches/2009-08/msg00010.html">a patch to fix the <code>AC_RUN_IFELSE</code> macro</a> which executes Erlang test code, so that this macro cleanly fails if the code doesn't compile, and <a title="another patch to fix the AC_COMPILE_IFELSE macro" href="http://lists.gnu.org/archive/html/autoconf-patches/2009-08/msg00014.html">another patch to fix the <code>AC_COMPILE_IFELSE</code> macro</a>, which tests that Erlang test code compiles (this is <a title="a long known Autoconf bug" href="http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=454798">a <em>long</em> known Autoconf bug</a>).</p>
    ]]></summary>
    <content type="html"><![CDATA[<p>I have sent <a title="patches to GNU Autoconf to add new macros for testing Erlang modules, include files, and functions" href="http://lists.gnu.org/archive/html/autoconf-patches/2009-08/msg00010.html">patches to GNU Autoconf to add new macros for testing Erlang modules, include files, and functions</a>: <code>AC_ERLANG_CHECK_MOD</code>, <code>AC_ERLANG_CHECK_HEADER</code>, <code>AC_ERLANG_CHECK_LIB_HEADER</code>, and <code>AC_ERLANG_CHECK_FUNC</code>. I have also sent <a title="a patch to fix the AC_RUN_IFELSE macro" href="http://lists.gnu.org/archive/html/autoconf-patches/2009-08/msg00010.html">a patch to fix the <code>AC_RUN_IFELSE</code> macro</a> which executes Erlang test code, so that this macro cleanly fails if the code doesn't compile, and <a title="another patch to fix the AC_COMPILE_IFELSE macro" href="http://lists.gnu.org/archive/html/autoconf-patches/2009-08/msg00014.html">another patch to fix the <code>AC_COMPILE_IFELSE</code> macro</a>, which tests that Erlang test code compiles (this is <a title="a long known Autoconf bug" href="http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=454798">a <em>long</em> known Autoconf bug</a>).</p>
<p>&lt;!--break--></p>
<p>Only the <code>AC_RUN_IFELSE</code> patch has been committed so far. The other patches are functional, but the Autoconf maintainers would prefer a better integration with the other Autoconf macros, for instance by making Autoconf's <code>AC_CHECK_HEADER</code> macro work for Erlang code when the test language is set to Erlang, which could be used like this:</p>
<pre style="background-color: #efefef;">AC_LANG([Erlang])<br />AC_CHECK_HEADER([eunit/include/eunit.hrl])</pre><p>I agree that this is the right way to go, and it will benefit the other supported languages, but it will take me a lot more time to implement.</p>
<p>Ralf Wildenhues, one of GNU Autoconf's maintainers, has also found <a href="http://lists.gnu.org/archive/html/bug-autoconf/2009-08/msg00028.html">a bug in the <code>AT_CHECK_EUNIT</code> macro</a>, which <a href="/en/node/194">executes EUnit tests from a GNU Autotest testsuite</a>: the macro was always failing when used with Erlang/OTP versions prior to R13. We finally fixed that bug, so that <a href="http://lists.gnu.org/archive/html/autoconf-patches/2009-08/msg00039.html"><code>AT_CHECK_EUNIT</code> will work with prior versions of Erlang/OTP</a> when it first becomes available in the next release of Autoconf. Users don't have to worry about the details of that bug, but I just want to share below my frustration with using Erlang to implement command-line tools.</p>
<p>The problem with <code>AT_CHECK_EUNIT</code> was that the Erlang module generated internally by <code>AT_CHECK_EUNIT</code> for calling EUnit was calling function <code>init:stop/1</code> (taking an exit code as an argument). But that function was introduced only in R13, and only <code>init:stop/0</code> (without argument) was available in previous versions. There are only three high-level ways to stop the Erlang VM:</p>
<ul>
<li>by calling <code>halt/1</code>, which takes the exit code as an argument, and abruptly exits the VM with that exit code without flushing the standard output;</li>
<li>by calling <code>init:stop/0</code>, which cleanly flushes the standard output, and always exits the VM with exit code 0;</li>
<li>by calling <code>init:stop/1</code>, which both cleanly flushes the standard output, and exits the VM with the exit code given as an argument, but is available only from version R13.</li>
</ul>
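<p>To make the difference concrete, here is a minimal sketch of the three calls. The module name <code>exit_demo</code> and its function names are hypothetical, chosen only for this illustration:</p>
<pre style="background-color: #efefef;">-module(exit_demo).
-export([abrupt/0, clean_zero/0, clean_code/0]).

<em>% halt/1: exits immediately with the given code (here, 1).</em>
abrupt() -&gt;
  halt(1).

<em>% init:stop/0: shuts the VM down cleanly; the exit code is always 0.</em>
clean_zero() -&gt;
  init:stop().

<em>% init:stop/1: shuts the VM down cleanly with the given code (R13 and later only).</em>
clean_code() -&gt;
  init:stop(1).
</pre><p>After compiling it with <code>erlc exit_demo.erl</code>, each variant can be tried from a shell, for instance with <code>erl -noshell -s exit_demo clean_code ; echo $?</code>, to compare the exit codes reported.</p>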
<p>In <code>AT_CHECK_EUNIT</code>, we need both to return custom exit codes, to integrate with Autotest, and to flush the standard output, to let Autotest capture the whole output produced by EUnit. So only <code>init:stop/1</code> can be used. Internally, the Erlang test module was initially run within Autotest's standard <code>AT_CHECK</code> macro, basically like:</p>
<pre style="background-color: #efefef;">AT_CHECK([erl -s foobar start], [0])</pre><p><code>AT_CHECK</code> checks the exit code of the command given as its argument, and compares it to the expected exit code (here, 0). If it is the expected exit code, then the test succeeds. If the exit code is 77, the test is skipped, e.g. to indicate that a requirement of the test is not met. Otherwise, the test fails.</p>
<p>In the executed Erlang module, we determine that the exit code is 77 if EUnit is not available (i.e., if module <code>eunit</code> cannot be loaded); otherwise if EUnit was run successfully and the EUnit test passed it is 0; otherwise it is 1. That exit code was passed to <code>init:stop/1</code> to exit the VM. In the fixed version of <code>AT_CHECK_EUNIT</code>, the module now writes out the exit code into a temporary file, and then calls <code>init:stop/0</code> (and therefore the VM always exits with code 0 and normally never fails). The real exit code is then read from the file and checked in a subsequent test, like:</p>
<pre style="background-color: #efefef;">AT_CHECK([erl -s foobar start], [0])<br />AT_CHECK([test -f tempfile &amp;&amp; (exit `cat tempfile`)])</pre>    ]]></content>
  </entry>
  <entry>
    <title>EUnit integration into GNU Autotest</title>
    <link rel="alternate" type="text/html" href="http://www.berabera.info/en/node/194" />
    <id>http://www.berabera.info/en/node/194</id>
    <published>2009-08-02T17:31:35+09:00</published>
    <updated>2009-08-02T18:28:56+09:00</updated>
    <author>
      <name>Romain Lenglet</name>
    </author>
    <category term="Erlang/OTP" />
    <category term="GNU Autoconf" />
    <category term="GNU Autotest" />
    <category term="GNU Autotools" />
    <summary type="html"><![CDATA[<p><a title="GNU Autotest is GNU Autoconf's unit testing tool" href="http://www.gnu.org/software/autoconf/manual/autoconf.html#Using-Autotest">GNU Autotest is GNU Autoconf's unit testing tool</a>, and is very generic, portable and simple. Autotest is therefore well suited to run tests using other testing tools, such as <a title="EUnit (Erlang/OTP's unit testing tool)" href="http://erlang.org/doc/apps/eunit/index.html">EUnit (Erlang/OTP's unit testing tool)</a>, <a title="JUnit" href="http://www.junit.org/">JUnit</a>, etc. I just added to <a title="GNU Autoconf" href="http://www.gnu.org/software/autoconf/">GNU Autoconf</a> the <a title="AT_CHECK_EUNIT macro to run EUnit unit tests in Autotest testsuites" href="http://lists.gnu.org/archive/html/autoconf-patches/2009-07/msg00059.html"><code>AT_CHECK_EUNIT</code> macro to run EUnit unit tests in Autotest testsuites</a>. It is the first Autotest macro to integrate such an external testing tool, and will be included in the next version of GNU Autoconf (&gt; 2.64). A complete, working example project using <code>AT_CHECK_EUNIT</code> can be downloaded in <a title="autotest-eunit-demo-1.0.tgz" href="/files/autotest-eunit-demo-1.0.tgz"><code>autotest-eunit-demo-1.0.tgz</code></a>. This article shows how to use <code>AT_CHECK_EUNIT</code> by explaining the important parts of this example project.</p>
    ]]></summary>
    <content type="html"><![CDATA[<p><a title="GNU Autotest is GNU Autoconf's unit testing tool" href="http://www.gnu.org/software/autoconf/manual/autoconf.html#Using-Autotest">GNU Autotest is GNU Autoconf's unit testing tool</a>, and is very generic, portable and simple. Autotest is therefore well suited to run tests using other testing tools, such as <a title="EUnit (Erlang/OTP's unit testing tool)" href="http://erlang.org/doc/apps/eunit/index.html">EUnit (Erlang/OTP's unit testing tool)</a>, <a title="JUnit" href="http://www.junit.org/">JUnit</a>, etc. I just added to <a title="GNU Autoconf" href="http://www.gnu.org/software/autoconf/">GNU Autoconf</a> the <a title="AT_CHECK_EUNIT macro to run EUnit unit tests in Autotest testsuites" href="http://lists.gnu.org/archive/html/autoconf-patches/2009-07/msg00059.html"><code>AT_CHECK_EUNIT</code> macro to run EUnit unit tests in Autotest testsuites</a>. It is the first Autotest macro to integrate such an external testing tool, and will be included in the next version of GNU Autoconf (&gt; 2.64). A complete, working example project using <code>AT_CHECK_EUNIT</code> can be downloaded in <a title="autotest-eunit-demo-1.0.tgz" href="/files/autotest-eunit-demo-1.0.tgz"><code>autotest-eunit-demo-1.0.tgz</code></a>. This article shows how to use <code>AT_CHECK_EUNIT</code> by explaining the important parts of this example project.</p>
<p>&lt;!--break--></p>
<h2>How to write Autotest testsuites</h2>
<p>In Autotest, each project usually has one <em>testsuite</em> to run all its tests. A testsuite is a set of <em>test groups</em>. The purpose of Autotest is to generate a single, portable shell script that performs all the tests in a testsuite. A testsuite is specified in a text file with file extension <code>.at</code>, typically named <code>testsuite.at</code>, whose contents are <a title="M4" href="http://www.gnu.org/software/m4/">M4</a> macro calls. Autotest defines a set of M4 macros specific to implementing testsuites.</p>
<p>The main <code>testsuite.at</code> file must start with a call to <code>AT_INIT</code>, and optionally a call to macro <code>AT_COPYRIGHT</code>, and then typically includes other <code>.at</code> Autotest files that define the test groups. For instance, a <code>testsuite.at</code> file that includes <code>autotest_eunit_demo.at</code> is:</p>
<pre style="background-color: #efefef;"><strong>AT_INIT</strong><br /><strong>AT_COPYRIGHT</strong>([Copyright (c) 2009 Romain Lenglet])<br /><strong>m4_include</strong>([autotest_eunit_demo.at])<br />...</pre><p>An included <code>.at</code> file should contain a set of test groups. A call to <code>AT_BANNER</code> at the top of a <code>.at</code> file displays information about the set of test groups. Then, each test group consists of a number of tests enclosed between <code>AT_SETUP</code> and <code>AT_CLEANUP</code>. <code>AT_SETUP</code> takes a descriptive text as an argument. Tests in Autotest consist of creating files using macro <code>AT_DATA</code>, and executing commands using <code>AT_CHECK</code>, mixed with Bourne Shell code.</p>
<p>For instance, a set of simple tests of command <code>grep</code> is:</p>
<pre style="background-color: #efefef;"><strong>AT_BANNER</strong>([Trivial tests of grep])<br /><strong>AT_SETUP</strong>([grep])<br /><strong>AT_DATA</strong>([testfile],<br />[[line one<br />line two<br />]])<br /><strong>AT_CHECK</strong>([grep -q one testfile])<br /><strong>AT_CLEANUP</strong></pre><p><code>AT_CHECK</code> executes the given command line, and tests that its exit code is 0, and that the command outputs nothing on its standard output or standard error. <code>AT_CHECK</code> takes several optional arguments to specify which exit code is expected (0 is the default), whether the command's output should be ignored, etc.</p>
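<p>As a sketch of those optional arguments, reusing the <code>testfile</code> created by <code>AT_DATA</code> in the <code>grep</code> example (these checks would sit between <code>AT_SETUP</code> and <code>AT_CLEANUP</code>): the second argument is the expected exit code, and the third is the expected standard output, or <code>ignore</code> to accept any output:</p>
<pre style="background-color: #efefef;"><strong>AT_CHECK</strong>([grep -q three testfile], [1])
<strong>AT_CHECK</strong>([grep one testfile], [0], [line one
])
<strong>AT_CHECK</strong>([grep one testfile], [0], [ignore])</pre>
<p>The first check passes because <code>grep</code> exits with code 1 when no line matches. The second compares the output with the expected text, and the third only checks the exit code.</p>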
<h2>How to use the <code>AT_CHECK_EUNIT</code> macro</h2>
<p>The new <code>AT_CHECK_EUNIT</code> macro is similar to <code>AT_CHECK</code>. However, instead of executing a command line, it executes an EUnit test, given an EUnit test specification. Its syntax is the following:</p>
<p><code>AT_CHECK_EUNIT(<em>module</em>, <em>test-spec</em>, [<em>erlflags</em>], [<em>run-if-fail</em>], [<em>run-if-pass</em>])</code></p>
<p><em>test-spec</em> is the specification of the test to run. It must be a valid <a title="EUnit test specification" href="http://erlang.org/doc/apps/eunit/chapter.html#1.5">EUnit test specification</a>. <em>module</em> is a unique Erlang module name. Under the hood, <code>AT_CHECK_EUNIT</code> generates a "wrapper" Erlang module named <em>module</em>, which calls the EUnit library to execute the <em>test-spec</em>, compiles that wrapper module and executes it. If the EUnit test passes, <code>AT_CHECK_EUNIT</code> passes, otherwise it fails. <em>erlflags</em> are optional command-line options to pass to the Erlang interpreter to execute the wrapper module. This can be used for instance to specify the paths to the compiled modules under test.</p>
<p>For instance, to run all the EUnit tests associated with a module <code>autotest_eunit_demo</code>, one can use EUnit specification <code>{module, autotest_eunit_demo}</code>. To perform that test in Autotest, it is sufficient to write this <code>autotest_eunit_demo.at</code> file:</p>
<pre style="background-color: #efefef;">AT_BANNER([Tests demonstrating the integration of GNU Autotest and Eunit])<br />AT_SETUP([autotest_eunit_demo])<br /><strong>AT_KEYWORDS</strong>([autotest_eunit_demo])<br /><strong>AT_CHECK_EUNIT</strong>([test],<br />  [{module, autotest_eunit_demo}],<br />  [-pa "${abs_top_builddir}/src"])<br />AT_CLEANUP</pre><p>In this example, the project's build system is assumed to compile the tested modules (<code>autotest_eunit_demo.beam</code>, etc.) into the <code>src</code> subdirectory of the project. So <em>erlflags</em> is set to <code>-pa "${abs_top_builddir}/src"</code>, which configures the Erlang VM to load compiled modules from that directory.</p>
<p>The call to the <code>AT_KEYWORDS</code> macro, which is a standard Autotest macro, associates keywords with this test group. A testsuite can be set up to run only tests with specific keywords, instead of running all the tests in the testsuite.</p>
<h2>How to generate testsuites</h2>
<p>An Autotest testsuite should be generated using GNU Autoconf and GNU Automake. An Autotest testsuite is typically contained in a subdirectory <em>tests</em> of a project, which should contain all the <code>.at</code> files for the testsuite (<code>testsuite.at</code>, and all other included <code>.at</code> files), and a <code>Makefile.am</code> Automake file to generate and run the testsuite. This <code>Makefile.am</code> is mostly boilerplate given in the <a title="Autoconf manual" href="http://www.gnu.org/software/autoconf/manual/autoconf.html#Making-testsuite-Scripts">Autoconf manual</a>. What varies between projects is the list of <code>.at</code> files to process (in bold):</p>
<pre style="background-color: #efefef;"><em># Always generate package.m4 into the source directory, not into the<br /># build directory, since it must be distributed, along with testsuite,<br /># configure, etc.<br /></em>$(srcdir)/package.m4: $(top_srcdir)/configure.ac<br />        :;{ \<br />          echo '# Signature of the current package.' &amp;&amp; \<br />          echo 'm4_define([AT_PACKAGE_NAME],      [$(PACKAGE_NAME)])' &amp;&amp; \<br />          echo 'm4_define([AT_PACKAGE_TARNAME],   [$(PACKAGE_TARNAME)])' &amp;&amp; \<br />          echo 'm4_define([AT_PACKAGE_VERSION],   [$(PACKAGE_VERSION)])' &amp;&amp; \<br />          echo 'm4_define([AT_PACKAGE_STRING],    [$(PACKAGE_STRING)])' &amp;&amp; \<br />          echo 'm4_define([AT_PACKAGE_BUGREPORT], [$(PACKAGE_BUGREPORT)])'; \<br />          echo 'm4_define([AT_PACKAGE_URL],       [$(PACKAGE_URL)])'; \<br />        } &gt; $@-t<br />        mv $@-t $@<br />EXTRA_DIST = <strong>testsuite.at autotest_eunit_demo.at</strong> package.m4 $(TESTSUITE)<br />TESTSUITE = $(srcdir)/testsuite<br />check-local: atconfig $(TESTSUITE)<br />        $(SHELL) '$(TESTSUITE)' $(TESTSUITEFLAGS)<br />installcheck-local: atconfig $(TESTSUITE)<br />        $(SHELL) '$(TESTSUITE)' $(TESTSUITEFLAGS)<br />clean-local:<br />        test ! -f '$(TESTSUITE)' || \<br />          $(SHELL) '$(TESTSUITE)' --clean<br />AUTOM4TE = autom4te<br />AUTOTEST = $(AUTOM4TE) --language=autotest<br /><em># Always generate testsuite into the source directory, not into the<br /># build directory, since it must be distributed, along with<br /># package.m4, configure, etc.<br /></em>$(TESTSUITE): <strong>$(srcdir)/testsuite.at $(srcdir)/autotest_eunit_demo.at</strong> \<br />              $(srcdir)/package.m4<br />        $(AUTOTEST) -I '$(srcdir)' -o $@.tmp $@.at<br />        mv $@.tmp $@<br />DISTCLEANFILES = atconfig<br /></pre><p>The top directory must contain a <code>Makefile.am</code> that defines <code>SUBDIRS</code> to build at least the <code>tests</code> subdirectory (as well as the other subdirectories, etc.):</p>
<pre style="background-color: #efefef;">...<br />SUBDIRS += tests<br />...</pre><p>Finally, the <code>configure.ac</code> Autoconf configuration file must contain a call to macro <code>AC_CONFIG_TESTDIR(<em>dir</em>)</code> where <em>dir</em> is the subdirectory of the testsuite (in this case, <code>tests</code>). To enable the use of macro <code>AT_CHECK_EUNIT</code> in testsuites, <code>configure.ac</code> must also detect the Erlang interpreter and the Erlang compiler, using macros <code>AC_ERLANG_PATH_ERL</code> and <code>AC_ERLANG_PATH_ERLC</code> (or better, <code>AC_ERLANG_NEED_ERL</code> and <code>AC_ERLANG_NEED_ERLC</code>):</p>
<pre style="background-color: #efefef;">AC_PREREQ([2.64.6-683a0])<br />AC_INIT([Autotest Eunit integration demo], [1.0], [romain.lenglet@example.com],<br />  [autotest-eunit-demo])<br />AM_INIT_AUTOMAKE([1.10])<br />...<br /><strong>AC_ERLANG_NEED_ERL<br />AC_ERLANG_NEED_ERLC<br /></strong>...<br />AC_CONFIG_FILES([<br />  <strong>Makefile</strong><br /><strong>  tests/Makefile</strong><br />  ...])<br /><strong>AC_CONFIG_TESTDIR([tests])</strong><br />AC_OUTPUT<br /></pre><p>If the Erlang interpreter and compiler are not detected in <code>configure.ac</code> using those macros, all the test groups that use <code>AT_CHECK_EUNIT</code> will be skipped.</p>
<p>To obtain all the configuration and build files generated by Autoconf and Automake, and also the <code>tests/testsuite</code> script generated by Autotest to execute all the tests, run these commands:</p>
<pre style="background-color: #efefef;">autoreconf -vi<br />./configure<br />(cd tests ; make testsuite)</pre><p>Since the <code>AT_CHECK_EUNIT</code> macro requires a version of Autoconf newer than 2.64, and no such version has yet been released, you will first have to download and build Autoconf from the latest version in the <a title="Autoconf source repository" href="http://www.gnu.org/software/autoconf/#downloading">Autoconf source repository</a>, and make sure that the built versions of those tools are in the PATH when running the commands above.</p>
<p>The generated files (<code>configure</code>, <code>Makefile.in</code>, <code>tests/testsuite</code>, etc.) must be distributed to users. This is done automatically when generating an archive using the generated <code>Makefile</code>s and <code>make dist</code>:</p>
<pre style="background-color: #efefef;">./configure ; make dist</pre><h2>How to run Autotest testsuites</h2>
<p>As an end-user, all that is needed to run the test suite is to configure and build the software, and then execute <code>make check</code>:</p>
<pre style="background-color: #efefef;">./configure ; make ; make check</pre><p>For instance, to build and test my example project <a title="autotest-eunit-demo-1.0.tgz" href="/files/autotest-eunit-demo-1.0.tgz"><code>autotest-eunit-demo-1.0.tgz</code></a>, you only need to execute:</p>
<pre style="background-color: #efefef;">tar xzf autotest-eunit-demo-1.0.tgz<br />cd autotest-eunit-demo-1.0<br />./configure ; make ; make check</pre><p>End-users don't need to install and use Autoconf, Automake, or Autotest. The generated distributed files don't depend on those tools.</p>
<p>In the <code>tests</code> subdirectory, all that <code>make check</code> does is to execute the generated <code>tests/testsuite</code> shell script. <code>testsuite</code> can be executed directly, and accepts some options. For instance, running <code>testsuite</code> in verbose mode displays detailed information on the standard output, and also runs EUnit in verbose mode:</p>
<pre style="background-color: #efefef;">(cd tests ; ./testsuite --verbose)</pre><p>One can also restrict the executed tests to only those associated with specific keywords (as specified using macro <code>AT_KEYWORDS</code>), for instance:</p>
<pre style="background-color: #efefef;">(cd tests ; ./testsuite -k autotest_eunit_demo)</pre><h2>Conclusion</h2>
<p>Setting up an Autotest testsuite may seem tedious. However, it is quite easy once you have a working Autoconf / Automake configuration. Then, adding tests using <code>AT_CHECK_EUNIT</code> macros to run EUnit tests is trivial. This integration of Autotest and EUnit will mostly benefit developers and users of projects that use several languages, e.g. Erlang, Java, and C, who have to integrate tests written in different languages and using different testing systems. Autotest will be extended to integrate more and more testing systems, to be usable as a general driver of all a project's tests.</p>
    ]]></content>
  </entry>
  <entry>
    <title>News on Erlang support in GNU Autoconf 2.64</title>
    <link rel="alternate" type="text/html" href="http://www.berabera.info/en/node/193" />
    <id>http://www.berabera.info/en/node/193</id>
    <published>2009-08-02T11:37:53+09:00</published>
    <updated>2009-08-02T11:37:53+09:00</updated>
    <author>
      <name>Romain Lenglet</name>
    </author>
    <category term="Erlang/OTP" />
    <category term="GNU Autoconf" />
    <category term="GNU Autotools" />
    <summary type="html"><![CDATA[<p>I have restarted actively maintaining the Erlang support in <a title="GNU Autoconf" href="http://www.gnu.org/software/autoconf/">GNU Autoconf</a>. <a title="Autoconf version 2.64 has been released on July 26th" href="http://lists.gnu.org/archive/html/autoconf/2009-07/msg00079.html">Autoconf version 2.64 has been released on July 26th</a>, which contains a few changes to its Erlang support.</p>
    ]]></summary>
    <content type="html"><![CDATA[<p>I have restarted actively maintaining the Erlang support in <a title="GNU Autoconf" href="http://www.gnu.org/software/autoconf/">GNU Autoconf</a>. <a title="Autoconf version 2.64 has been released on July 26th" href="http://lists.gnu.org/archive/html/autoconf/2009-07/msg00079.html">Autoconf version 2.64 has been released on July 26th</a>, which contains a few changes to its Erlang support.<br />
Autoconf had <a title="a bug breaking the AC_ERLANG_CHECK_LIB macro" href="http://lists.gnu.org/archive/html/autoconf/2008-09/msg00097.html">a bug breaking the AC_ERLANG_CHECK_LIB macro</a>, between versions 2.61a and 2.63, making those versions of Autoconf practically unusable for Erlang projects. This bug was fixed in version 2.63b. This bug had remained undetected by the Autoconf maintainers because there were no unit tests for the Erlang macros. So I added <a title="unit tests for the Erlang-related macros in version 2.64" href="http://lists.gnu.org/archive/html/autoconf-patches/2009-07/msg00037.html">unit tests for the Erlang-related macros in version 2.64</a>, which should prevent such regressions in the future. This was my first encounter with <a title="GNU Autotest" href="http://www.gnu.org/software/autoconf/manual/autoconf.html#Using-Autotest">GNU Autotest</a>, as all of Autoconf's tests are written using Autotest. Although Autotest is basic, I find it very useful as a central driver for performing all the tests of a project. I will write more on that in another blog article.<br />
I finally <a title="added the AC_ERLANG_SUBST_ERTS_VER macro" href="http://lists.gnu.org/archive/html/autoconf-patches/2009-05/msg00016.html">added the AC_ERLANG_SUBST_ERTS_VER macro</a>. This macro was suggested (a long, long time ago) by Ruslan Babayev to ease the <a title="automatic generation of Erlang release resource files (.rel)" href="/en/node/69">automatic generation of Erlang release resource files (.rel)</a>. This is the only user-visible change in Autoconf 2.64's Erlang support.<br />
I plan to add more features to Autoconf's Erlang support in the next versions, and I started working on adding support to the other GNU Autotools and build system tools: Autotest, Automake, etc.</p>
    ]]></content>
  </entry>
  <entry>
    <title>Automatically generating Erlang/OTP .app and .rel files using GNU Autoconf</title>
    <link rel="alternate" type="text/html" href="http://www.berabera.info/en/node/69" />
    <id>http://www.berabera.info/en/node/69</id>
    <published>2008-02-16T19:39:02+09:00</published>
    <updated>2008-02-17T09:58:46+09:00</updated>
    <author>
      <name>Romain Lenglet</name>
    </author>
    <category term="Erlang/OTP" />
    <category term="GNU Autoconf" />
    <category term="GNU Autotools" />
    <summary type="html"><![CDATA[<p>
A user of the GNU Autotools (Autoconf and Automake) with Erlang/OTP has recently asked me about a way to automatically generate the <a href="http://www.erlang.org/doc/design_principles/applications.html#7.3"><code>.app</code> (application resource) files</a> and <a href="http://www.erlang.org/doc/design_principles/release_structure.html#10.2"><code>.rel</code> (release resource) files</a> for an Erlang/OTP application. This is a perfect occasion to present the ideas that <a href="http://ruslan.babayev.com/">Ruslan Babayev</a> had on the subject, <a href="http://www.berabera.info/oldblog/lenglet/archives/2006/09/index.html#e2006-09-05T20_16_04.txt">which I promised to publish back in Sep. 2006</a>. I am "only" 18 months late! (^_^);
</p>
    ]]></summary>
    <content type="html"><![CDATA[<p>
A user of the GNU Autotools (Autoconf and Automake) with Erlang/OTP has recently asked me about a way to automatically generate the <a href="http://www.erlang.org/doc/design_principles/applications.html#7.3"><code>.app</code> (application resource) files</a> and <a href="http://www.erlang.org/doc/design_principles/release_structure.html#10.2"><code>.rel</code> (release resource) files</a> for an Erlang/OTP application. This is a perfect occasion to present the ideas that <a href="http://ruslan.babayev.com/">Ruslan Babayev</a> had on the subject, <a href="http://www.berabera.info/oldblog/lenglet/archives/2006/09/index.html#e2006-09-05T20_16_04.txt">which I promised to publish back in Sep. 2006</a>. I am "only" 18 months late! (^_^);
</p>
<p>
For now, Automake provides no help to generate those files. This is Autoconf's job. What Autoconf can do is help you avoid the manual duplication of information between several files, and help you synchronize the content of the <code>.app</code> and <code>.rel</code> files with the state of your system and your Erlang/OTP installation. The trick is to let Autoconf generate parts of the <code>.app</code> and <code>.rel</code> files for you.
</p>
<p>
To let Autoconf substitute parts of those files, first add their names into the <code>AC_CONFIG_FILES</code> macro call, in your <code>configure.ac</code> file:
</p>
<p><code><br />
AC_CONFIG_FILES([... sample.app sample.rel])<br />
</code></p>
<p>
Then, name your files with extensions <code>.app.in</code> and <code>.rel.in</code>, e.g. <code>sample.app.in</code> and <code>sample.rel.in</code>, as the <code>configure</code> script will generate <code>sample.app</code> from <code>sample.app.in</code> and <code>sample.rel</code> from <code>sample.rel.in</code>. Your <code>.app.in</code> and <code>.rel.in</code> files should contain variables to substitute enclosed with <code>@...@</code>.
</p>
<p>
For instance, a <code>sample.app.in</code> file could be:
</p>
<p><code><br />
{application, @PACKAGE@,<br />
 [{description, "Sample Erlang server"},<br />
  {vsn, "@VERSION@"},<br />
  {modules, [sample_app, sample_sup, sample_server]},<br />
  {registered, [sample_sup, sample_server]},<br />
  {applications, [kernel, stdlib]},<br />
  {mod, {sample_app, []}}<br />
 ]}.<br />
</code></p>
<p>
The <code>@PACKAGE@</code> and <code>@VERSION@</code> parts are substituted with the information supplied in the <code>AC_INIT</code> macro call in <code>configure.ac</code>.
</p>
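<p>
As an illustration only, assume a <code>configure.ac</code> that also calls <code>AM_INIT_AUTOMAKE</code> and starts with the following call (the package name and version are hypothetical):
</p>
<p><code><br />
AC_INIT([sample], [1.0])<br />
</code></p>
<p>
The <code>sample.app</code> file generated by <code>configure</code> would then begin with:
</p>
<p><code><br />
{application, sample,<br />
 [{description, "Sample Erlang server"},<br />
  {vsn, "1.0"},<br />
</code></p>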
<p>
Likewise, a <code>sample.rel.in</code> file could be:
</p>
<p><code><br />
{release, {"@PACKAGE@", "@VERSION@"}, {erts, "5.6.1"},<br />
 [{kernel, "@ERLANG_LIB_VER_kernel@"},<br />
  {stdlib, "@ERLANG_LIB_VER_stdlib@"},<br />
  {@PACKAGE@, "@VERSION@"}]}.<br />
</code></p>
<p>
The variables <code>ERLANG_LIB_VER_kernel</code> and <code>ERLANG_LIB_VER_stdlib</code> contain the version numbers of the installed libraries named <code>kernel</code> and <code>stdlib</code>. To define such variables automatically with the versions of the libraries actually installed in your system, you just have to call the standard <code>AC_ERLANG_CHECK_LIB</code> Autoconf macro in your <code>configure.ac</code> file, e.g.:
</p>
<p><code><br />
AC_ERLANG_CHECK_LIB([kernel])<br />
AC_ERLANG_CHECK_LIB([stdlib])<br />
</code></p>
<p>
Note also that the ERTS version is not yet automatically determined by the standard Autoconf macros. However, Ruslan also proposed a new macro, <code>AC_ERLANG_SUBST_ERTS_VER</code>, to do that:
</p>
<p><code><br />
# AC_ERLANG_SUBST_ERTS_VER<br />
# -------------------------------------------------------------------<br />
AC_DEFUN([AC_ERLANG_SUBST_ERTS_VER],<br />
[AC_REQUIRE([AC_ERLANG_NEED_ERLC])[]dnl<br />
AC_REQUIRE([AC_ERLANG_NEED_ERL])[]dnl<br />
AC_CACHE_CHECK([for Erlang/OTP ERTS version],<br />
    [erlang_cv_erts_ver],<br />
    [AC_LANG_PUSH(Erlang)[]dnl<br />
     AC_RUN_IFELSE(<br />
        [AC_LANG_PROGRAM([], [dnl<br />
            Version = erlang:system_info(version),<br />
            file:write_file("conftest.out", Version),<br />
            halt(0)])],<br />
        [erlang_cv_erts_ver=`cat conftest.out`],<br />
        [AC_MSG_FAILURE([test Erlang program execution failed])])<br />
     AC_LANG_POP(Erlang)[]dnl<br />
    ])<br />
AC_SUBST([ERLANG_ERTS_VER], [$erlang_cv_erts_ver])<br />
])# AC_ERLANG_SUBST_ERTS_VER<br />
</code></p>
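<p>
To see which value the macro would detect on a given machine, the same <code>erlang:system_info(version)</code> call can be run by hand. This is only a quick check from a shell, and is not part of Ruslan's macro:
</p>
<p><code><br />
erl -noshell -eval 'io:format("~s~n", [erlang:system_info(version)]), init:stop().'<br />
</code></p>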
<p>
You only have to put the code of that macro in, for instance, an <code>acinclude.m4</code> file placed in the same directory as the <code>configure.ac</code> file, and to call it in <code>configure.ac</code>:
</p>
<p><code><br />
AC_ERLANG_SUBST_ERTS_VER<br />
</code></p>
<p>
That way, the <code>ERLANG_ERTS_VER</code> variable gets automatically defined with the detected installed ERTS version, and can be substituted in your <code>.rel.in</code> file, e.g. in <code>sample.rel.in</code>:<br />
<code><br />
{release, {"@PACKAGE@", "@VERSION@"}, {erts, "<b>@ERLANG_ERTS_VER@</b>"},<br />
 [{kernel, "@ERLANG_LIB_VER_kernel@"},<br />
  {stdlib, "@ERLANG_LIB_VER_stdlib@"},<br />
  {@PACKAGE@, "@VERSION@"}]}.<br />
</code></p>
<p>
Note to myself: I should really really take a little time to submit the definition of Ruslan's <code>AC_ERLANG_SUBST_ERTS_VER</code> macro to the GNU Autotools team for inclusion into the standard Autoconf...
</p>
    ]]></content>
  </entry>
  <entry>
    <title>Dryverl version 0.1.3 is out</title>
    <link rel="alternate" type="text/html" href="http://www.berabera.info/en/node/53" />
    <id>http://www.berabera.info/en/node/53</id>
    <published>2008-01-29T12:31:41+09:00</published>
    <updated>2008-01-29T13:48:41+09:00</updated>
    <author>
      <name>Romain Lenglet</name>
    </author>
    <category term="C" />
    <category term="Dryverl" />
    <category term="Erlang/OTP" />
    <summary type="html"><![CDATA[<p>I have published a new version of <a href="http://dryverl.objectweb.org/">Dryverl</a>, <a href="http://forge.objectweb.org/forum/forum.php?forum_id=1325">version 0.1.3</a>. This is a minor release that corrects two bugs.<br />
Dryverl supports <code>&lt;dev-c-local-variable/&gt;</code> elements to declare local variables in the generated C code. Such an element may be empty, which means that the generated C local variable declaration has no specific initial value. The first bug was that, in that case, Dryverl generated invalid declarations, such as:<br />
<code>int some_var = ();</code><br />
Dryverl 0.1.3 now correctly generates no initializer in such cases:</p>
    ]]></summary>
    <content type="html"><![CDATA[<p>I have published a new version of <a href="http://dryverl.objectweb.org/">Dryverl</a>, <a href="http://forge.objectweb.org/forum/forum.php?forum_id=1325">version 0.1.3</a>. This is a minor release that corrects two bugs.</p>
<p>Dryverl supports <code>&lt;dev-c-local-variable/&gt;</code> elements to declare local variables in the generated C code. Such an element may be empty, which means that the generated C local variable declaration has no specific initial value. The first bug was that, in that case, Dryverl generated invalid declarations, such as:</p>
<p><code>int some_var = ();</code></p>
<p>Dryverl 0.1.3 now correctly generates no initializer in such cases:</p>
<p><code>int some_var;</code></p>
<p>The second bug was in the generated code that encodes output data, i.e. data that is returned from the C side to the Erlang side of the driver. Since the Erlang port driver API offers several methods to communicate between C and Erlang, Dryverl-generated drivers try to automatically use the most efficient one. The most efficient of all is the <a href="http://www.erlang.org/doc/man/erlang.html#erlang:port_call/3">port call</a> method. When the C code is invoked by a port call from the Erlang side, <a href="http://www.erlang.org/doc/man/driver_entry.html#call">the emulator itself allocates a buffer that must be used for returning the encoded output data</a>. The code generated by Dryverl automatically uses that buffer whenever possible, to avoid unnecessary memory allocation. However, if the encoded data does not fit in that buffer, the generated code automatically switches to another method: it encodes the output data into a separately allocated Erlang binary term, and references that term from the output data encoded in the port call buffer.</p>
<p>The bug was that when the encoded data exceeded the port call output buffer, the data already encoded into that buffer was not copied into the newly allocated Erlang binary&#39;s buffer, i.e. the start of that binary was left as garbage. The code generated by Dryverl 0.1.3 now correctly handles that case by copying the port call output buffer data into the Erlang binary buffer, before using that Erlang binary to further encode output data.</p>
<p>That second bug is of a kind that is very hard to trigger, because of the very large number of possible combinations of methods that can be used to communicate between Erlang and C, and because the code generated by Dryverl, and the strategy chosen by that code, depend on several conditions: whether binaries are sent as input or as output, whether asynchronous threads are used, whether the encoded output data is too large to fit the port call output buffer, etc.</p>
<p>Thanks a lot to Hunter Morris for spotting those two bugs and for submitting a patch.</p>
    ]]></content>
  </entry>
</feed>
