Portable 64 bit integer type

Here I would like to write about coding a “portable” C programs in a sense that they produce same results when compiled and executed on 32 and 64 bits architectures.

The problem I faced with DES is that I need to make sure that my integers are exactly 32 bits on all platforms where I compile it. Also I must be able to create long integers which are exactly 64 bits wide.

What makes C code different on 64 bits platforms.

First thing which comes to mind is the size of integer data types on 32 bits and 64 bit platforms.
Using the common sense we expect 32 bits platforms to have (ILP32 model):

  • 4 bytes per pointer
  • 4 bytes per int
  • 4 bytes per long

and perhaps we expect that 64 bits platforms have:

  • 8 bytes per pointer
  • 4 bytes per int (or 8?)
  • 8 bytes per long

The problem is that you cannot be sure about these values.
The figures above for 64 bit platform are called LP64 (“Longs and Pointers are 64 bits: LP64″) data model.
In fact, there are different data models which denote various schemes.


size of integer data models LLP64 ILP64

Standard (C99) defines minimum size requirements (in number of bytes) for the int and long types.
It also says that long type should be at least as big as, int but they can have equal sizes.
However, when we port some code to a new platform, it is usually easy to find out the data model.

The following small test will show the sizes of main data types.

#include <stdio.h>

int main(int argc, char * argv[])
{

printf("sizeof(void*) = %d bytes\n", sizeof(void*));
printf("sizeof(char) = %d byte\n", sizeof(char));
printf("sizeof(short) = %d bytes\n", sizeof(short));
printf("sizeof(int) = %d bytes\n", sizeof(int));
printf("sizeof(long) = %d bytes\n", sizeof(long));
printf("sizeof(long int) = %d bytes\n", sizeof(long int));
printf("sizeof(float) = %d bytes\n", sizeof(float));
printf("sizeof(double) = %d bytes\n", sizeof(double));

//the next line can fail on some old compilers
printf("sizeof(long long) = %d bytes\n", sizeof(long long));

//this works only on C++ compilers
printf("sizeof(bool) = %d bytes\n", sizeof(bool));

return 0;
}

Some platforms (like HP/UX with LP64 and ILP32) support several models and you can change them by specifying proper compiler time flags and runtime libraries.

Is there a portable way to have 32 bits integers in C?

So we are interested in writing such code which will use exact 64 bit integer type on platforms where it is available. I would like to mention here that you can always do the proper stuff in *runtime* just by using sizeof() operator.

What is more tricky is the *compile time* solution.

If we speak about C++, the solution is also easy to implement with templates, see the section below.
Sometimes, compiler predefines a special macro to make life easier:


//this works on some systems:
#if __LP64__
typedef long I32;
#endif

However, you cannot be sure about all platforms.

A standard compliant way

The good news is that C99 standard for C language introduces some very handy headers:

#include <inttypes.h>
#include <stdint.h>

which allow you to use the following integer types with fixed size:

  • int8_t
  • int16_t
  • int32_t
  • int64_t

When “stdint.h” is available it is best solution since int32_t is guaranteed to be always 32 bits wide on all data models (platforms).

The problem is that some compilers do not support stdint.h header (after all, your compiler must be C99 compliant for this feature). For example, Microsoft Visual C compilers (I tried 2003 and 2008 versions) do not come with these headers. Unlike Micrisoft, most common versions of gcc support these headers.

In order to fix this Microsoft specific problem, one can use something like this:

#ifdef _MSC_VER
#include "ms_inttypes.h"
#include "ms_stdint.h"
#else
#include <inttypes.h>
#include <stdint.h>
#endif

Alexander Chemeris has made special versions of inttypes.h and stdint.h which work on Microsoft compilers.

One can download these headers, than rename it, adding prefix “ms_” to avoid conflicts on platforms with built-in support.

If your compiler does not support stdint.h you can also try to use pstdint.h header made by Paul Hsieh.

I would also suggest checking the stdint.h wikipedia page for some updates which may happen on this topic.

Lucky C++

More than ten years ago, Kevin S. Van Horn has published an article Compile-time Assertions in C++.

We can use this idea to assert the size of integer type:


template <bool t>
struct ctassert {
enum { N = 1 - 2 * int(!t) };
static char A[N];
};
template <bool t>
char ctassert<t>::A[N];

static ctassert<sizeof(int) == 4> TEST_INT_4_BYTES_LONG;
static ctassert<sizeof(long) == 8> TEST_LONG_8_BYTES_LONG;

The code above will fail to compile with weird messages about subscripts of A[N] array. Note, “static” is used here to make sure that assertions are not removed by optimizing compiler.

Boost your code

BOOST library does not provide you a special way to use integers with exact number of bytes.
However, there are other two very nice features if you consider using boost/cstdint.hpp header.

The following integer types are introduced in boost namespace:

  • Integer types with at least N bits requirement. This is a lower bound guarantee for the size of the integer value.
  • Fastest integer types with at least N bits requirement. The code tries to use platform specific features to make sure that N bits integer is represented in most optimal way.

So thats it.

It looks like until C99 standard is implemented in most compilers we do not have “portable” way to define integers with exact size. However, there are some tricks to make sure that code will either work or fail to compile.

The IP slash notation

How often you see something like this:

192.168.255.0/24

For those of us who do not frequently configure IP networks such notation may seem tricky and it has nothing to do with URI(s), ports and etc.

This is a “slash” notation which is formally called Classless Inter-Domain Routing (CIDR) notation.
It allows you to specify the network or the range of IP addresses in a more compact way.

Normaly for 192.168.255.0/24 the we would write:

Network: 192.168.255.0
Subnet mask: 255.255.255.0

which means there is network of 255 IP addresses starting at address 192.168.255.1.

The number after “/” sign denotes number of non-zero bits in the submask left to right.

Another example:

192.168.128.0/25 means:
Network: 192.168.128.0
Subnet mask: 255.255.128.0

which is a network with first address 192.168.128.1 and last address 192.168.255.255.

This notation is also called IP prefix notation.

It comes especially useful, when you want to use a trie in order to match the IP address with some predefined pool of partial values. Such task is relevant for IP routes and is known as Longest prefix match.

For example, let’s see how this all works together.
Let’s say in the routing table we have the following entries:

Route1 192.168.0.0/16
Route2 192.168.20.16/28
Route3 192.168.20.17/32

When these entries are put into Radix trie (Particia), we use binary representation so the prefix trie
would loook liks this (simplified view):


[ROOT_EMPTY_NODE]
|
11000000 10101000 [max index = 16, represents element 192.168.0.0]
|
00010100 0001 [max index 28, represents element 192.168.20.16]
|
0001 [max bit index = 32, represents element 192.168.20.17]

Radix trie has a nice feature which guarantees complexity of O(k) for all operations where k is the length of the input. In our special example the length of IP addresses is 32 bits so we will have no more than 32 comparisons to insert of delete new elements.

Why IP prefix notation is so useful.

See the “max index” values above?
It is the exact position where common parts of the addresses end. When we built Radix trie from raw data, we have to find this index every time we insert new entry.

When we have IP prefix notation, the index is already known to us so we can utilize this fact in
order to speed up the insertions!

If we look into algorithm of Patricia (Radix) trie which is used to store (IP routing) prefixes we would see
that trie will look different depending on the order of the insertions. One of the advantages of IP prefix notation is that it allows you to utilize the network mask bits in order to create a trie in a faster way, with less comparisons than in usual way.

Originally when new element is inserted in to the trie, we are comparing it with nodes bit by bit until
we find the common parts of the key. When branching index is already known (this is our /slash notation
number or subnet mask) we can keep all data sorted by this parameter before inserting into the trie.

Then inserting the entire routing table into the trie becomes faster since we only need to check the bit in the position of the branching index and store the pointer to the current node for the next insertion.

It turns out the slash notation is not only convenient for people but also for software.

For the implementation of the Patricia trie I would suggest the following article by Radu Gruian.

For the wonderful explanation and details about Radix trie and others please see Robert Sedgewick’s “Algorithms in C++“.

Sample SCTP code for Linux

In this post I will explain how to compile SCTP sample code on Linux with socket interface and with ACE.

First of all, make sure you have installed kernel module which adds support for SCTP:
I have been using

lksctp-tools-1.0.2

To install it, download it from online package repository and use rpm (rpm -i …).

Compiling the C test

You can use the following samples for SCTP client and SCTP server applications or read the wonderful article which includes similar code “Better networking with SCTP” from IBM.

Now, compile the sources with

g++ sctpclnt.cpp -L/usr/local/lib -lsctp -o sctp_client
g++ sctpserv.cpp -L/usr/local/lib -lsctp -o sctp_server

After running it you may get the error with errno 94 and description
Socket type not supported“.
This happens because sctp module is not enabled for linux kernel.
You can fix it by using the following shell command:

modprobe sctp

You do not need to recompile, just run the binaries and it should work fine.

Using ACE

ACE has SCTP performance test code which you can use as a starting point for your SCTP applications.

In order to run these tests you must make sure that ACE library is built with SCTP support.
One way to enable SCTP for ACE is described.

This way you can modify “platform_linux.GNU” which is under include/makeinclude/ of your ACE distribution directory. Unfortunately this never worked for me so I had to manually include
flags when buildilng ACE:

./configure CXXFLAGS="-DACE_HAS_LKSCTP" LDFLAGS="-L/usr/local/lib -lsctp"

This ensures that ACE_HAS_LKSCTP symbol is defined and linker knows where to get the SCTP library.

Now, when ACE library is built successfully, you get the “libACE.so”, place it into /usr/local/lib and run “ldconfig” to set run time bindings.

When building ACE SCTP performance test, you may notice it prints the following error message:

"SCTP was NOT installed when this binary was compiled."

The problem is that ACE test code is looking for ACE_HAS_SCTP symbol while ACE library is checking ACE_HAS_LKSCTP. For the ACE SCTP performance tests make sure ACE_HAS_SCTP is defined.

After rebuilding ACE code should work fine, given that sctp module is enabled as shown above.

Current state of SCTP

SCTP is a transport level protocol which works on top of IP.

For a formal introduction see RFC4960 and RFC3286.

While SCTP origin is from telecom, it turns out SCTP is pretty useful for desktop applications too.

A bit of history.

SCTP was designed primary as a reliable transport in SIGTRAN architecture. SIGTRAN provides means to send SS7 traffic over IP, allowing connecting public service telecom network with IP based applications.

Why not TCP?

TCP is not a good solution for telecom signaling switches because it has number of downsides, for example one packet loss makes stall for the entire byte sequence. This is not good when one connection controls several phone calls and all of them are waiting for some packet to be retransmitted.

A few other key points of using SCTP I have collected here:

  • inside connection (“association” in SCTP), several streams can be set with separate sequences
  • association can have up to 65K streams, loss of packet for one stream does not stall other streams
  • unordered delivery possible
  • multihoming: endpoint has set of {IP4, IP6} addresses, protocol engine chooses where to retransmit the packet
  • retransmit limit on the time, counts or buffer size
  • message made with one send operation is parsed by protocol engine. It works different from TCP byte oriented stream, where application is responsible for framing.
  • protection for SYN denial of service attacks by having stateless handshake procedure on the server side
  • heartbeat is implemented by default, while in TCP it must be turned on explicitly

One concert could be about the performance of SCTP versus TCP.

As it can be found in ACE SCTP performance test, the SCTP is not significantly slower:

SCTP performance test

You may be curious why SCTP is not used in other applictions?

It might work much better than TCP for real time or other QoS sensitive traffic.

The answer is that at this moment SCTP is relatively young protocol.

The immaturity of implementations results in possible flaws enabling DoS attacks.

See CVE-2006-1527:

The SCTP-netfilter code in Linux kernel before 2.6.16.13 allows
remote attackers to trigger a denial of service (infinite loop) via
unknown vectors that cause an invalid SCTP chunk size to be processed
by the for_each_sctp_chunk function.

However, the future of SCTP is bright. According to this study using SCTP in massive multiplayer online game environments shows that SCTP someday may beat TCP and UDP on this field.

How to use SCTP?

We have two major open source implementations for Unix based sysems and a few less known open source implementations for
Windows.

Linux:

Windows:

The list of commercial implementations could be found here.

Using SCTP

There is a wonderful SCTP introduction paper by Randall Steward, Michael Tüxen and Peter Lei which explains the features of SCTP and provides simple example of using sockets API on Linux for server and client applications.

If your Linux kernel is built with SCTP, everything is straightforward.

Good news for those of us who likes higher level of abstraction.

As it is explained here ACE library also supports SCTP protocol in both Linux based flavours (LKSCTP and OpenSS7).

ACE also provides clean and functional example in the SCTP performance test code.

Origins of DeadBeef

It was nice to find out in Jeremy Uejio’s Blog that 0xDEADBEEF is used on Linux (I assume it happens with debug versions) when you free memory.

According to this source, originally DEADBEEF was used in order to fill newly allocated memory.

On my Windows machine neither fresh nor deallocated memory is fancy.
Only 0xAB, 0xFE and nothing really meaningful.

For more historical stuff see The Jargon File.

Tricky question for C

This would be good as an interview question on C for those guys working with computations.

You have some code:


if (tmp != tmp)
{
printf("it happened!");
}

You are running this application with tmp to be any of built-in types with no overloaded members (plain C).

Is there possibility that you would get “it happened!” printed out
and what should be the type of tmp in order to get this behaviour?

Correct answer: tmp should be float since when float value is NaN the != comparison would be true.

This is by IEEE 754 standard which defines NaN values as a such that != comparisons involving them are always true.

Proof Carrying Code

In program analysis, there is an idea to use first order logic with theorem prover to “sign” a binary that it is safe with regard to some predefined safety policy.
Eventually it envolved into idea of “proof carrying code”.
A few slides introducing PCC are based on the original PCC paper.

Some fun with C

Do you know that the following is legal C code?

const char * x = "abcdefgh";
int pos = 4;
char tmp1 = (5*2-6)[x];
char tmp2 = pos[x];
assert(tmp1 == 'e');
assert(tmp2 == 'e');

The point here is that compiler evaluates x[y] as (char&)(x + y) or (char&)(y + x).
So regardless of what comes first, the result is the same.