12    Endianism Considerations

This chapter discusses the issues involved when migrating code and data between computer architectures that have different byte ordering.

12.1    Overview

A significant problem customers and ISVs must deal with when migrating their applications and databases from Tru64 UNIX to HP-UX 11i is endianism. Endianism refers to the order in which the bytes of a multibyte data type are stored, and therefore how those bytes are addressed. This is important because if you read binary data on a machine of a different endianism than the machine that wrote the data, the bytes of each multibyte value are interpreted in the opposite order and the values you read will be wrong. This is an especially significant problem when a database needs to be moved to a different endian architecture.

There are two types of endian machines: big-endian (forward byte ordering or Most Significant First, MSF) and little-endian (reverse byte ordering or Least Significant First, LSF). The mnemonics of "Big End In" and "Little End In" can be useful when discussing endianism. Figure 12-1 and Figure 12-2 compare the two byte ordering methodologies graphically.

Figure 12-1:  Big-Endian Byte Ordering

Figure 12-2:  Little-Endian Byte Ordering

On a little-endian operating system (Tru64 UNIX), the little end, or least-significant byte, is stored at the lowest address. This means a hexadecimal value such as 0x1234 is stored in memory as 0x34 0x12. The same is true for a 4-byte value; for example, 0x12345678 is stored as 0x78 0x56 0x34 0x12. A big-endian operating system does this in the reverse fashion, so 0x1234 is stored as 0x12 0x34 in memory.
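
The layout described above can be observed directly. The following is a minimal sketch rather than part of the original guide; the helper name dump_bytes is hypothetical. It copies the bytes of a 32-bit value into a buffer in increasing address order, so that the buffer shows how the value sits in memory on the host.

```c
#include <inttypes.h>
#include <string.h>

/* Hypothetical helper: copy the four bytes of value into out[0..3]
 * in increasing address order, exactly as they are stored in memory. */
void dump_bytes(uint32_t value, unsigned char out[4])
{
    memcpy(out, &value, sizeof value);
}
```

For the value 0x12345678, a little-endian host fills the buffer with 0x78 0x56 0x34 0x12 and a big-endian host fills it with 0x12 0x34 0x56 0x78.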

Consider the number 1025 (2 to the tenth power plus one) stored in a 4-byte integer. This is represented in memory as shown in Example 12-1.

Example 12-1:  Binary Representation of 1025

little endian: 00000001 00000100 00000000 00000000b
big endian: 00000000 00000000 00000100 00000001b

Like the Alpha processor, the Itanium® processor family supports both big- and little-endian memory addressing. The PA-RISC processor uses a big-endian addressing model. HP-UX 11i has always been a big-endian operating system, and will continue to be so. This means that applications already on HP-UX 11i will be able to read their existing binary data without modification. The Tru64 UNIX operating system is little-endian based, so applications running on HP-UX 11i may not be able to read binary data created on Tru64 UNIX without modification.

Converting data between these two types of systems is often referred to as the NUXI problem. Imagine the word UNIX stored in two 2-byte words. In big-endian systems, it is stored as UNIX. In little-endian systems, it is stored as NUXI.

12.2    Persistent Data

Byte order should be considered carefully when reading or writing data. Multibyte values should be preprocessed such that the endian type of the source and destination systems is unimportant. Consider the following code example, which assumes systems implementing the same endianism will be used for both writing and reading data:

Writer Code:

#include <unistd.h>
#include <inttypes.h>
 
int64_t val = 1;
ssize_t result = write( fileDes, &val, sizeof( val ) );

Reader Code:

#include <unistd.h>
#include <inttypes.h>
 
int64_t valRead;
ssize_t result = read( fileDes, &valRead, sizeof( valRead ) );

When both the reader and writer systems are of the same endian type, the contents of valRead will be 1. However, in situations where the reader and writer have different byte ordering, valRead will be 0x0100000000000000.
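
The value quoted above is simply the eight bytes of 1 taken in the opposite order. As an illustrative sketch (the function name reverse_bytes64 is hypothetical and does not come from the original guide), the following routine computes what a reader of the opposite endianism would see for any 64-bit value:

```c
#include <inttypes.h>

/* Hypothetical helper: reverse the eight bytes of v, which is the value
 * a reader of the opposite endianism would obtain from the same bytes. */
uint64_t reverse_bytes64(uint64_t v)
{
    uint64_t r = 0;
    int i;

    for (i = 0; i < 8; i++) {
        r = (r << 8) | (v & 0xFF);  /* append the current low byte of v */
        v >>= 8;                    /* advance to the next byte of v */
    }
    return r;
}
```

Applying the routine twice returns the original value, which is why the same swap can be used in both directions.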

Applications that store persistent data in native endian format will need to be redesigned to avoid endian issues in shared or migrated data sources. Conversion of the old data is best handled by a separate process, which allows the primary application to remain focused on endian-neutral development. There are several means of handling data storage in endian-neutral formats:

  1. Store the data in a defined endian format.

  2. Add additional data to indicate format.

  3. Store all data as ASCII strings.

The first method is the preferred method as it requires the least overhead.
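
The second method in the list above can be as simple as a tag at the start of the file. The following sketch assumes a hypothetical convention, not taken from the original guide, in which the writer stores the constant ORDER_TAG in its native byte order as the first 32-bit field. The reader then inspects the tag as it appears in its own byte order to decide whether the remaining fields need swapping.

```c
#include <inttypes.h>

/* Hypothetical tag value; any constant whose four bytes differ works. */
#define ORDER_TAG          UINT32_C(0x01020304)
#define ORDER_TAG_SWAPPED  UINT32_C(0x04030201)

/* Returns 0 if the data was written in the reader's byte order,
 * 1 if every multibyte field must be byte-swapped,
 * and -1 if the tag is not recognized. */
int needs_swap(uint32_t tag_as_read)
{
    if (tag_as_read == ORDER_TAG)
        return 0;
    if (tag_as_read == ORDER_TAG_SWAPPED)
        return 1;
    return -1;
}
```

The cost of this approach is the extra field and a check on every read path, which is the overhead the preferred first method avoids.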

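Storing the data in a defined endian format can be achieved by developing endian-neutral I/O functions. You can develop I/O functions by:

  1. Using the preprocessor to select byte-order-specific code at compile time.

  2. Testing the byte order of the host at run time.

  3. Using standard byte order APIs.

The following sections describe each of these methods.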
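
As a point of comparison with the methods in the following sections, a defined endian format can also be produced with shifts alone, without knowing the byte order of the host. This is a different technique from those the guide describes, shown here only as a sketch; the names put_be64 and get_be64 are hypothetical.

```c
#include <inttypes.h>

/* Store v into buf[0..7] with the most-significant byte first
 * (big-endian), regardless of the byte order of the host. */
void put_be64(unsigned char buf[8], uint64_t v)
{
    int i;

    for (i = 7; i >= 0; i--) {
        buf[i] = (unsigned char)(v & 0xFF);  /* low byte goes last */
        v >>= 8;
    }
}

/* Rebuild the value from a big-endian byte sequence. */
uint64_t get_be64(const unsigned char buf[8])
{
    uint64_t v = 0;
    int i;

    for (i = 0; i < 8; i++)
        v = (v << 8) | buf[i];
    return v;
}
```

A writer would call put_be64( buf, val ) and pass buf to write() instead of &val; the reader would read into a buffer of the same size and call get_be64().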

12.2.1    Preprocessor Controlled Byte Order

While the preprocessor can be used to select code that must be implemented differently based on endianism, there is no standardized way to do so. None of the current standards require that compilers provide a means of determining the endianism. However, this does not mean that it cannot be done. Developers who wish to do this must implement their own means of determining the endian type of a platform.

Compaq C and Compaq C++ provide no default definition, but the inclusion of machine/endian.h will define BYTE_ORDER. HP-UX 11i has no such header file, although the HP C and HP aC++ compilers provide the definition of _BIG_ENDIAN. Therefore, there is no common means of testing byte-order in the preprocessor. Example 12-2 provides one example of how you might develop code to handle this situation.

Example 12-2:  Supporting Multiple Byte Orderings Using The Preprocessor

#ifndef BIG_ENDIAN
#define BIG_ENDIAN 4321 
#endif
 
#ifndef LITTLE_ENDIAN
#define LITTLE_ENDIAN 1234
#endif
 
#ifndef BYTE_ORDER
#if defined(_BIG_ENDIAN) || defined(__hpux)
#define BYTE_ORDER BIG_ENDIAN
#elif defined(__osf__) || defined(__linux)
#define BYTE_ORDER LITTLE_ENDIAN
#endif
#endif /* BYTE_ORDER */

.
.
.
#if BYTE_ORDER == BIG_ENDIAN /* some code depending on big-endian byte ordering */
.
.
.
#else /* some code depending on little-endian byte ordering */
.
.
.
#endif

12.2.2    Run-Time Byte Order Control

Another means of developing endian-aware code is to dynamically test for the endian type at run time. By taking advantage of what is normally an endian bug in software, it is possible to detect the endian type of the running system. Example 12-3 shows how you can check to see if your code is running on a little- or big-endian system. Using the routine in Example 12-3 to check the endian order, implement your code so that at run time it dynamically acts on either little- or big-endian data.

Example 12-3:  Testing Byte Order

#include <stdbool.h>   /* bool, true, false */
#include <inttypes.h>
 
bool TestBigEndian(void)
{
  int16_t one = 1;
  char *cp = (char*)&one;
 
  if ( *cp == 0 ) {
   return true;
  }
  return false;
}
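
Building on the run-time test, the following sketch shows one way code might act on the result. It is an illustration under stated assumptions rather than the guide's own implementation: the names host_is_big_endian and to_big_endian32 are hypothetical, and the test is repeated here (using memcpy) only so that the fragment is self-contained.

```c
#include <inttypes.h>
#include <string.h>

/* Same idea as Example 12-3: inspect the first byte of a 16-bit 1. */
static int host_is_big_endian(void)
{
    uint16_t one = 1;
    unsigned char first;

    memcpy(&first, &one, 1);
    return first == 0;
}

/* Return v laid out so that its bytes are big-endian in memory.
 * On a big-endian host this is v itself; otherwise the bytes are swapped. */
uint32_t to_big_endian32(uint32_t v)
{
    if (host_is_big_endian())
        return v;
    return ((v & UINT32_C(0x000000FF)) << 24) |
           ((v & UINT32_C(0x0000FF00)) <<  8) |
           ((v & UINT32_C(0x00FF0000)) >>  8) |
           ((v & UINT32_C(0xFF000000)) >> 24);
}
```

Because the test is a run-time check, the same binary logic works on either kind of host, at the price of a branch that the preprocessor approach avoids.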

12.2.3    Using Standard Byte Order APIs

Using standardized endian-related APIs will ensure that your code is portable. One such set of APIs is the host-to-network family. By storing your application data using these APIs, you ensure that the data is stored in big-endian (network byte-order) and is therefore more portable than an endian-native format.

The host-to-network/network-to-host byte order conversion functions ( htons(3), htonl(3), ntohs(3), and ntohl(3)) are highly optimized and should reduce to only a few instructions on little-endian systems, and may be optimized away on big-endian systems. Example 12-4 uses the htons() function to convert a 16-bit integer from host byte order to network byte order.

Example 12-4:  Using the htons Routine

#include <stdio.h>
#include <inttypes.h>
#include <arpa/inet.h>   /* htons() */
uint16_t w = 0x1234;
printf("Host Order w=%04x\n", (unsigned int)w);
printf("Network Order w=%04x\n", (unsigned int)htons(w));

Example 12-5 uses the htonl() function to convert an unsigned 32-bit integer from host byte order to network byte-order.

Example 12-5:  Using the htonl Routine

#include <stdio.h>
#include <inttypes.h>
#include <arpa/inet.h>   /* htonl() */
uint32_t w = 0x12345678;
printf("Host Order w=%08x\n", (unsigned int)w);
printf("Network Order w=%08x\n", (unsigned int)htonl(w));

One problem with the host-to-network APIs is that they do not handle 64-bit data elements. Using the byte-order definitions from Example 12-2 and the swap macro from Example 12-8, it is possible to develop your own 64-bit version of these APIs. Example 12-6 shows one possible implementation.

Example 12-6:  64-Bit Host To Network Implementation

#include <inttypes.h>
 
uint64_t htonq(uint64_t input)
{
#if BYTE_ORDER == LITTLE_ENDIAN
  return SWAP_8_MACRO(input);
#else
  return input;
#endif
}
 
uint64_t ntohq(uint64_t input)
{
#if BYTE_ORDER == LITTLE_ENDIAN
  return SWAP_8_MACRO(input);
#else
  return input;
#endif
}

12.3    Byte Swapping

If data with multibyte values needs to be transferred between little-endian and big-endian systems, you must provide code that swaps the byte order. For network or socket communications, use the host-to-network APIs ( htons(3), htonl(3), ntohs(3), and ntohl(3)). Unfortunately, byte swapping in general cannot be done without knowledge of the data structure layout, because the manner in which you swap depends on the format of your data. Character strings typically do not get swapped, 64-bit elements get swapped eight bytes end-for-end, and 32-bit elements get swapped four bytes end-for-end. Any program that swaps data needs to know the data type, the endian order of the source data, and the endian order of the host.
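
To illustrate why the layout must be known, the following sketch swaps a record field by field. The structure and the names record, reverse_bytes, and swap_record are hypothetical and are not taken from the original guide. Each numeric field is reversed end-for-end over its own size, and the character field is left alone.

```c
#include <inttypes.h>
#include <stddef.h>

/* Hypothetical record layout used only for this illustration. */
struct record {
    char     name[8];   /* character data: not swapped */
    uint16_t flags;     /* swapped two bytes end-for-end */
    uint32_t count;     /* swapped four bytes end-for-end */
    uint64_t offset;    /* swapped eight bytes end-for-end */
};

/* Reverse the bytes of a single field in place. */
static void reverse_bytes(void *field, size_t size)
{
    unsigned char *p = field;
    size_t i;

    for (i = 0; i < size / 2; i++) {
        unsigned char tmp = p[i];
        p[i] = p[size - 1 - i];
        p[size - 1 - i] = tmp;
    }
}

/* Convert a record between byte orders, one field at a time. */
void swap_record(struct record *r)
{
    reverse_bytes(&r->flags,  sizeof r->flags);
    reverse_bytes(&r->count,  sizeof r->count);
    reverse_bytes(&r->offset, sizeof r->offset);
}
```

Reversing the whole structure as one block of bytes would scramble the string and mix bytes of neighboring fields. Structure padding and alignment are a separate portability concern that this sketch does not address.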

The following types of I/O are transparent to endianism and do not need swapping:

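Another way to swap bytes is to use the swab(3) (swap bytes) API, which is declared in unistd.h. It swaps each pair of adjacent bytes, so it is suited to data made up of 16-bit values. The syntax is: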

void swab(
        const void *src,
        void *dest,
        ssize_t nbytes );
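
As a small usage sketch, swab() reproduces the NUXI effect on the four characters of UNIX. The function name make_nuxi is hypothetical, and the _XOPEN_SOURCE definition is an assumption about what some systems require before unistd.h will declare swab().

```c
#define _XOPEN_SOURCE 700   /* assumption: exposes swab() on some systems */
#include <unistd.h>

/* Fill out[] with the characters of "UNIX" after each pair of
 * adjacent bytes has been exchanged, plus a terminating null. */
void make_nuxi(char out[5])
{
    static const char word[4] = { 'U', 'N', 'I', 'X' };

    swab(word, out, 4);   /* "UN" becomes "NU", "IX" becomes "XI" */
    out[4] = '\0';
}
```

Because swab() only exchanges neighboring bytes, applying it to 32-bit or 64-bit values does not produce a full end-for-end swap.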

You can also define a preprocessor macro. For 32-bit data, the code to convert little-endian to big-endian data might look as shown in Example 12-7.

Example 12-7:  32-Bit Endian Byte Swap Macro

#include <inttypes.h>
#define SWAP_4_MACRO(value) \
 ((( (value) & UINT32_C(0x000000FF)) << 24) | \
  (( (value) & UINT32_C(0x0000FF00)) << 8) | \
  (( (value) & UINT32_C(0x00FF0000)) >> 8) | \
  (( (value) & UINT32_C(0xFF000000)) >> 24))

The macro to swap the byte order of 64-bit data is shown in Example 12-8.

Example 12-8:  64-Bit Endian Byte Swap Macro

#include <inttypes.h>
#define EVENBYTESL UINT64_C(0x00FF00FF00FF00FF)
#define ODDBYTESL UINT64_C(0xFF00FF00FF00FF00)
#define EVENWORDSL UINT64_C(0x0000FFFF0000FFFF)
#define ODDWORDSL UINT64_C(0xFFFF0000FFFF0000)
 
#define SWAP_8_MACRO(value) \
 (((((((((value)>>8) & EVENBYTESL)     | \
 (((value)<<8) & ODDBYTESL))>>16) & EVENWORDSL) | \
 ((((((value)>>8) & EVENBYTESL)       | \
 (((value)<<8) & ODDBYTESL))<<16) & ODDWORDSL))>>32 ) | \
 ((((((((value)>>8) & EVENBYTESL)      | \
 (((value)<<8) & ODDBYTESL))>>16) & EVENWORDSL) | \
 ((((((value)>>8) & EVENBYTESL)       | \
 (((value)<<8) & ODDBYTESL))<<16) & ODDWORDSL))<<32 ))
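
One property of the macros above is worth noting: SWAP_8_MACRO expands its argument eight times, so an argument with side effects, such as a function call or an increment, would be evaluated repeatedly. A function-based alternative evaluates its argument once. The following is a sketch with hypothetical names (swap_u32 and swap_u64), not a replacement endorsed by the original guide.

```c
#include <inttypes.h>

/* Reverse the four bytes of a 32-bit value. */
uint32_t swap_u32(uint32_t v)
{
    return ((v & UINT32_C(0x000000FF)) << 24) |
           ((v & UINT32_C(0x0000FF00)) <<  8) |
           ((v & UINT32_C(0x00FF0000)) >>  8) |
           ((v & UINT32_C(0xFF000000)) >> 24);
}

/* Reverse the eight bytes of a 64-bit value by byte-reversing each
 * 32-bit half and exchanging the two halves. */
uint64_t swap_u64(uint64_t v)
{
    uint32_t low  = (uint32_t)(v & UINT32_C(0xFFFFFFFF));
    uint32_t high = (uint32_t)(v >> 32);

    return ((uint64_t)swap_u32(low) << 32) | (uint64_t)swap_u32(high);
}
```

An optimizing compiler will often inline functions this small, so the function form need not cost more than the macro.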
 

12.4    Unused Bytes

Sometimes code that tries to make efficient use of memory takes advantage of the fact that often not all four bytes in an integer are used. For example, if a particular int field in a record will hold only values in the range 0 to 10,000,000, the most-significant byte will always be 0. Rather than adding another element to a structure, the free byte is often used to store an element only requiring 1 byte of storage.

If the most-significant byte is accessed by means of a character array or by casting and dereferencing a pointer, then the code will not be portable and slightly different versions will be needed on big-endian and little-endian machines. Example 12-9 shows a sample of this direct byte access.

Example 12-9:  Direct Byte Access

#include <inttypes.h>
typedef union freebyte {
int32_t intdata;
char    chardata[4];
} mystruct;
#define set_int(s, x) \
 ((s).intdata = (((s).intdata&0xFF000000)|((x)&0x00FFFFFF)))
#define get_int(s) ((s).intdata & 0x00FFFFFF)
/* The char accessors are endian specific! Index 3 is the
   most-significant byte only on a little-endian machine. */
#define set_char(s, x) ((s).chardata[3] = (x))
#define get_char(s) ((s).chardata[3])

This problem can be corrected by implementing the spare byte as a named bit field, which enables the compiler to generate the correct instructions regardless of the endianism, as shown in Example 12-10.

Example 12-10:  Named Bit Fields

#include <inttypes.h>
typedef struct freebyte {
uint32_t _3byte:24;   /* unsigned: a signed 24-bit field cannot hold 10,000,000 */
uint32_t _1byte:8;
} mystruct;
#define set_int(s, x) (s)._3byte = x
#define get_int(s) (s)._3byte 
#define set_char(s, x) (s)._1byte = x
#define get_char(s) (s)._1byte

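Using bit fields in this manner grants the same memory footprint reduction and removes endianism problems for in-memory access. It also makes it possible to remove the accessor macros, because each element is now a named member that can be accessed directly. Note that the placement of bit fields within a storage unit is implementation-defined, so such a structure should still not be written to persistent storage as raw bytes.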
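
Where the packed word itself must have the same value on every platform, shifts and masks are another option, because they operate on values rather than on memory addresses. This is an alternative technique to the bit fields described above, shown only as a sketch; the names pack_fields, get_value24, and get_byte are hypothetical.

```c
#include <inttypes.h>

/* Pack a 24-bit value and a 1-byte value into one 32-bit word.
 * The byte occupies the most-significant 8 bits of the word. */
uint32_t pack_fields(uint32_t value24, unsigned char byte)
{
    return ((uint32_t)byte << 24) | (value24 & UINT32_C(0x00FFFFFF));
}

/* Extract the 24-bit value. */
uint32_t get_value24(uint32_t packed)
{
    return packed & UINT32_C(0x00FFFFFF);
}

/* Extract the byte stored in the most-significant 8 bits. */
unsigned char get_byte(uint32_t packed)
{
    return (unsigned char)(packed >> 24);
}
```

The compiler translates the shifts to whatever byte accesses the host requires, so the source is identical on big-endian and little-endian machines.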

12.5    Unions

Applications that utilize unions and make assumptions about the data layout within that union will have endian portability problems. Example 12-11 shows one such union.

Example 12-11:  Endianism and Unions

#include <inttypes.h>
union int_byte {
 int32_t int_val;
 char byte[4];
};
 
union int_byte my_var;
my_var.int_val = 1000000;
if(my_var.byte[3] == 0)
 printf("The number is divisible by 256\n");

On a big-endian machine, this code works correctly: byte[3] is the least-significant byte, so it is 0 only when the number is a multiple of 256 (including 0). However, on a little-endian machine, byte[3] is the most-significant byte, and the test gives the wrong answer. The easiest way to avoid this problem is not to try to outsmart the compiler. The same functionality can be achieved using endian-independent code. For example:

if((my_var.int_val & 0xFF) == 0)
 printf("The number is divisible by 256\n");

Or, better still:

if((my_var.int_val % 256) == 0)
 printf("The number is divisible by 256\n");

12.6    Initializing Multiword Entities in 32-Bit Chunks

Use care when porting code that initializes multiword entities with 32-bit entities. For example, on a little-endian system, an array of two 32-bit integer values is used to initialize a 64-bit double:

u.ul[0] = 0x7FFFFFFF;
u.ul[1] = 0xFFFFFFFF;

To produce the correct results on a big-endian system, such as on HP-UX 11i, the subscripts must be reversed to represent the correct byte-order. For example:

u.ul[1] = 0x7FFFFFFF;
u.ul[0] = 0xFFFFFFFF;

When possible, data elements should always be initialized using their natural type. The language standards include support for constant initializers large enough to initialize the largest supported data types. The limits.h and float.h header files contain information on the sizes as well as the macros for maximum and minimum values for numeric data types as defined by the ANSI C standard.
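
The following sketch illustrates both points. It assumes that double is an 8-byte IEEE 754 type and that the host uses the same byte order for integers and floating-point values; the names largest and double_from_bits are hypothetical. When a specific bit pattern is required, supplying it as a single 64-bit constant removes any dependence on the order of 32-bit words.

```c
#include <float.h>
#include <inttypes.h>
#include <string.h>

/* Natural-type initialization: no word order is involved. */
const double largest = DBL_MAX;

/* Build a double from one 64-bit pattern instead of two 32-bit halves.
 * Assumes sizeof(double) == 8 and matching integer/float byte order. */
double double_from_bits(uint64_t bits)
{
    double d;

    memcpy(&d, &bits, sizeof d);
    return d;
}
```

For example, under the IEEE 754 assumption the pattern 0x3FF0000000000000 yields 1.0 on both big-endian and little-endian hosts, with no subscripts to reverse.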

12.7    Hex Constants Used as Byte Arrays

An endian problem occurs when a 32-bit value is sometimes treated as an integer and sometimes as an array of 4 characters. For example, the following array is equivalent to the number 0x44332211 on little-endian machines and the number 0x11223344 on big-endian machines:

char a[4] = {0x11, 0x22, 0x33, 0x44};

Values that are masked using constants can also affect the result when a particular byte order is expected.
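
When a byte array must be interpreted as a number, assembling the number with shifts gives the same result on every host. The following is a minimal sketch; the function name be32_from_bytes is hypothetical, and the choice of treating the first array element as the most-significant byte is an assumption made for the illustration.

```c
#include <inttypes.h>

/* Combine four bytes into a 32-bit value, treating a[0] as the
 * most-significant byte, independent of the byte order of the host. */
uint32_t be32_from_bytes(const unsigned char a[4])
{
    return ((uint32_t)a[0] << 24) |
           ((uint32_t)a[1] << 16) |
           ((uint32_t)a[2] <<  8) |
            (uint32_t)a[3];
}
```

With the array shown above, this routine returns 0x11223344 on both little-endian and big-endian machines.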

12.8    Other Considerations

There is a trick in common use in little-endian code that is forbidden in cross-platform work: casting a pointer to an int to a pointer to a char and assuming that the least-significant byte will be at the address pointed to. For example:

unsigned int value = 0x03020100;
unsigned int *ptr = &value;
unsigned char charVal;
 
charVal = *(unsigned char*)ptr;

On a little-endian system charVal is assigned the value of 0. On a big-endian system it is assigned the value of 3. Such code should not be used in a portable program, but it is very common. In old code written for a little-endian platform, it is one of the hardest things to find and root out.

To accomplish the same thing in a portable way, use a temporary variable:

unsigned int temp = *ptr;
charVal = (unsigned char)temp;

The second line will take its value from the least-significant byte on every architecture whether it is at the high or low end of the temporary variable; the compiler handles the details for you.

Also, you should do endian conversion on input and output and not in the middle of compute routines. This may be obvious but it is sometimes overlooked.

12.9    Conclusion

Unfortunately, byte-swapping may be necessary when moving data between different architectures. However, it does not have to greatly affect the performance of your code. The code required to swap a two-, four-, or eight-byte value is just a few instructions and is easily done entirely in the registers. If you have a large amount of data to swap, such as large arrays, all of the code should fit in a small loop that fits well in the cache, and the data can be fetched sequentially from the data cache, which is very efficient. Be sure to understand the format of your data before migrating your code; doing so is the best protection against data integrity problems.