
2    Comparing IRIX and Tru64 UNIX

This chapter compares major features of IRIX and Tru64 UNIX. It also describes features available only on Tru64 UNIX. It presents porting issues involving the Alpha architecture and the 64-bit environment. If you are already familiar with Tru64 UNIX, you can skip this material.

This chapter presents the following topics:

2.1    Overview of Tru64 UNIX

The Tru64 UNIX operating system is a 64-bit advanced kernel architecture based on Carnegie-Mellon University's Mach V2.5 kernel design, with components from Berkeley Software Distribution (BSD) 4.3 and 4.4, UNIX System V, and other sources. Tru64 UNIX implements the Open Software Foundation (OSF) OSF/1 R1.0, R1.1, and R1.2 technology, Common Desktop Environment (CDE), and the Motif graphical user interface and programming environment. Under the X/Open UNIX branding program, Compaq Computer Corporation has received the UNIX 95 brand for the Tru64 UNIX operating system.

Tru64 UNIX provides symmetric multiprocessing (SMP), realtime support, and numerous features to assist programmers in developing applications that use shared libraries, multithread support, and memory-mapped files. All features of the X Window System, Version 11, Release 6 (X11R6) from the X Consortium, Inc. are fully supported. Selected features of Release 6.1 (X11R6.1) are also supported. POSIX standard 1003.1c pthreads are supported.

The Tru64 UNIX operating system complies with numerous other standards and industry specifications, including the X/Open XPG4 and XTI, POSIX, FIPS, and System V Interface Definition (SVID). Tru64 UNIX is compatible with Berkeley 4.3 and System V programming interfaces and conforms with the OSF Application Environment Specification (AES), which specifies an interface for developing portable applications that will run on a variety of hardware platforms.

Familiar User Interfaces and Tools

Tru64 UNIX provides user interfaces familiar to UNIX developers:

2.2    Real Time Programming

The Tru64 UNIX operating system supports facilities to enhance the performance of realtime applications. These realtime facilities make it possible for the operating system to guarantee that the realtime application has access to resources whenever it needs them and for as long as it needs them. That is, the realtime applications running on the operating system can respond to external events regardless of the impact on other executing tasks or processes. Realtime applications written to run on the operating system make use of and rely on the following system capabilities:

All of these realtime facilities work together to form the Tru64 UNIX realtime environment. In addition, realtime applications make full use of process synchronization techniques and facilities.

For more information on the realtime programming environment, see the Guide to Realtime Programming. For information on configuring the realtime kernel, see the System Administration guide.

2.3    Directories and Defines

Among the differences between IRIX and Tru64 UNIX are the layout of the directories and the values associated with some of the standard defines.

2.3.1    Directory Hierarchies

The hierarchies of system directories differ between IRIX and Tru64 UNIX. Figure 2-1 shows the Tru64 UNIX directory structure.

Figure 2-1:  Tru64 UNIX Directory Hierarchy

Table 2-1 describes the contents of the Tru64 UNIX directories.

Table 2-1:  Contents of the Tru64 UNIX Directories

Directory Description
/ The root directory of the file system.
/dev Block and character device files.
/etc System configuration files and databases; nonexecutable files.
/sbin Commands essential to boot the system; these commands do not depend on shared libraries or on the loader and can have other versions in /usr/bin or /usr/sbin.
/sbin/init.d System initialization files.
/sbin/rc0.d The run control files executed for system-state 0 (single-user state).
/sbin/rc2.d The run control files executed for system-state 2 (nonnetworked multiuser state).
/sbin/rc3.d The run control files executed for system-state 3 (networked multiuser state).
/sbin/subsys Loadable kernel modules required in single-user mode.
/lost+found Files recovered by fsck.
/usr Most user utilities and applications. Most commands in /usr/bin, /usr/sbin, and /usr/lbin have been built with the shared version of libc and will not work unless /usr is mounted.
/usr/.smdb. Subset installation control files used by setld.
/usr/bin Common utilities and applications.
/usr/ccs C compilation system; tools and libraries used to generate C programs.
/usr/examples Source code for example programs.
/usr/opt Optional application packages, such as layered products.
/usr/include Program header (include) files. For more information, see Chapter 4, Application Development Environment.
/usr/lib Libraries, data files, and symbolic links to library files located elsewhere; included for compatibility.
/usr/lbin Back-end executable files.
/usr/sbin System administration utilities and system utilities.
/usr/share Architecture-independent ASCII text files. This directory includes word lists, various libraries, and online reference pages.
/usr/sys Directories that contain system configuration files.
/usr/shlib Binary loadable shared libraries; shared versions of libraries in /usr/ccs/lib.
/opt Optional application packages, such as layered products.
/var Multipurpose log, temporary, transient, varying, and spool files.
/var/adm Common administrative files and databases. This directory includes the crash area, files for the cron daemon, configuration and database files for sendmail, and files generated by syslog.
/var/spool Miscellaneous printer and mail system spooling directories.
/tmp System-generated temporary files that are usually not preserved across a system reboot.
/vmunix Pure kernel executable file (the operating system loaded into memory at boot time).
/genvmunix Generic kernel executable file built with most options and device support (useful if vmunix becomes corrupted).

2.3.2    Defines for Numeric Values

Some maximum and minimum values assigned to standard defines vary according to system architecture. Table 2-2 shows values for defines in /usr/include/alpha/machlimits.h (Tru64 UNIX) and /usr/include/limits.h (IRIX and Tru64 UNIX).

IRIX defines for LONG and ULONG values vary according to whether the code is compiled with _MIPS_SZLONG == 32 or with _MIPS_SZLONG == 64. Compiling with _MIPS_SZLONG == 64 results in defines with the same values as those in Tru64 UNIX. IRIX code that depends on LONG or ULONG values defined when _MIPS_SZLONG == 32 must be modified.

Table 2-2:  Defines from limits.h and machlimits.h

Define      IRIX                                 Tru64 UNIX
CHAR_MAX    UCHAR_MAX (255U)                     127
CHAR_MIN    0                                    -128
MB_LEN_MAX  5                                    4
LONG_MAX    2147483647 (_MIPS_SZLONG == 32)      9223372036854775807
LONG_MIN    -2147483647-1 (_MIPS_SZLONG == 32)   (-LONG_MAX-1) (-9223372036854775807-1)
ULONG_MAX   4294967295U (_MIPS_SZLONG == 32)     18446744073709551615U
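
Code that must build on both systems can test the defines in limits.h rather than assume one platform's values. The following is a minimal sketch of that idea; the function names long_is_64_bits and char_is_signed are hypothetical and do not come from either operating system.

```c
#include <limits.h>

/* Hypothetical helper: returns 1 when long is wider than 32 bits, as on
   Tru64 UNIX or on IRIX with _MIPS_SZLONG == 64, and 0 otherwise. */
int long_is_64_bits(void)
{
#if LONG_MAX > 2147483647L
    return 1;
#else
    return 0;
#endif
}

/* Hypothetical helper: returns 1 when plain char is signed (CHAR_MIN is
   negative) and 0 when plain char is unsigned (CHAR_MIN is 0). */
int char_is_signed(void)
{
    return CHAR_MIN < 0;
}
```

Code that stores small negative numbers in a plain char is sensitive to the CHAR_MIN difference; declaring such variables as signed char removes the dependency.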

Defines for floating-point values for both IRIX and Tru64 UNIX appear in the /usr/include/float.h file. By default, the IRIX defines for floating-point values for LDBL (long double) values are defined in terms of the corresponding DBL (double) values. By default, Tru64 UNIX uses different values for long doubles. IRIX code that assumes the IRIX default values for long doubles must be modified, or you can compile the Tru64 UNIX code with the -_X_FLOAT == 1 option.

2.4    Unique Directory Features

There are two unique directory features in Tru64 UNIX:

2.4.1    Device Naming

Device special files appear in the /dev directory under the root directory (/). This directory contains subdirectories, each of which holds the device special files for one class of devices. A device class groups related types of devices, such as disks or nonrewind tapes. For example, the directory /dev/disk contains files for all supported disks, and /dev/ntape contains device special files for nonrewind tape devices. Currently, only the subdirectories for certain classes have been created.

For more information, see the section Device Naming and Device Special Files in the System Administration manual.

2.4.2    Context-Dependent Symbolic Links

The cluster environment provided by the TruCluster Server software has a single, clusterwide namespace for files and directories. This namespace gives each cluster member the same view of all file systems. All cluster members use the same file name to access a given file regardless of where the file actually resides in the cluster.

Some configuration files and directories cannot be shared by all cluster members. For example, a member's /etc/sysconfigtab file contains information specific to that member's kernel component configuration, and only that member should use that configuration. Context-dependent symbolic links (CDSLs) provide a mechanism that lets each member read and write the file named /etc/sysconfigtab, while actually reading and writing its own member-specific sysconfigtab file.

A CDSL is a special kind of symbolic link that contains a variable whose value is determined only during pathname resolution. The {memb} variable is used to access member-specific files in a cluster. The following example shows the CDSL for /etc/sysconfigtab:

/etc/sysconfigtab -> ../cluster/members/{memb}/etc/sysconfigtab  
 

When resolving a CDSL pathname, the kernel replaces the {memb} variable with a string of the form member<n>, where <n> is the member ID of the current member. Therefore, on a cluster member whose member ID is 2, the pathname /cluster/members/{memb}/etc/sysconfigtab resolves to /cluster/members/member2/etc/sysconfigtab.
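
The substitution can be pictured with a small user-space sketch. This is only an illustration of the string replacement described above, not the kernel's implementation; the function name expand_memb and its parameters are hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical illustration of CDSL resolution: copies path to out,
   replacing the first "{memb}" with "member" followed by member_id.
   Returns 0 on success, or -1 if the result does not fit in outlen bytes. */
int expand_memb(const char *path, int member_id, char *out, size_t outlen)
{
    static const char var[] = "{memb}";
    const char *hit = strstr(path, var);
    int n;

    if (hit == NULL)
        n = snprintf(out, outlen, "%s", path);
    else
        n = snprintf(out, outlen, "%.*smember%d%s",
                     (int)(hit - path), path,   /* text before {memb} */
                     member_id,
                     hit + strlen(var));        /* text after {memb}  */
    return (n >= 0 && (size_t)n < outlen) ? 0 : -1;
}
```

Called with the path /cluster/members/{memb}/etc/sysconfigtab and a member ID of 2, the sketch produces the member2 pathname given in the text.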

CDSLs also appear in standalone Tru64 UNIX systems, but are not used in the standalone environment.

For more information, see the section Context-Dependent Symbolic Links and Clusters in the System Administration manual.

2.5    64-Bit Considerations

The Alpha architecture is based on a 64-bit microprocessor, and Tru64 UNIX is a 64-bit operating system. These facts introduce a number of extended capabilities beyond 32-bit architectures. For example, 64-bit addressing allows Tru64 UNIX to support file system sizes greater than 2 gigabytes.

When porting a 32-bit application to the 64-bit environment of Tru64 UNIX, you face the same issues you would in porting that application to 64-bit IRIX. This section describes these issues. See SGI's MIPSpro 64-Bit Porting and Transition Guide for a discussion of 64-bit porting issues from the IRIX perspective.

IRIX follows the LP64 model, the industry programming model for 64-bit UNIX. LP64 is based on the Tru64 UNIX implementation in use since 1993. If your application is written for a 64-bit environment and compiles in 64-bit mode on IRIX (the MIPSpro -64 option), then the only 64-bit considerations in porting the application to Tru64 UNIX might be issues of data and structure alignments, covered later in this section.

Most porting concerns for applications written for a 32-bit environment are caused by three facts about the 64-bit environment:

Chances are that most code you end up changing incorrectly assumes the following:

sizeof(int) == sizeof(pointer) == sizeof(long)

2.5.1    Language Data Types

In the Tru64 UNIX 64-bit environment, long and pointer data types are 64 bits. Table 2-3 presents the Tru64 UNIX data types.

Table 2-3:  C Language Data Types

Data Type Tru64 UNIX Bits
char 8
short 16
int 32
float 32
pointer 64
long 64
long long 64
double 64
long double 128

Table 2-4 shows the alignment of various Tru64 UNIX data types. Note that long and pointer data types are aligned on 8-byte boundaries.

Table 2-4:  Natural Data Alignment

Data Type Alignment (Byte Multiple)
char 1
short 2
int 4
float 4
long 8
pointer 8
long long 8
double 8

With high-level languages, such as C, the compiler automatically attempts to align data and variables to their natural boundaries. In some situations the compiler might lack the information needed to make the correct alignment. Alignment errors can result from misuse of long and pointer data types in structure definitions that are shared between 32-bit and 64-bit systems.
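
The alignment a compiler actually uses can be queried rather than assumed. The sketch below relies on the C11 _Alignof keyword, which is an assumption about the compiler in use (older compilers need a different technique); the function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Each function reports the alignment the compiler uses for one type. */
size_t align_of_short(void)   { return _Alignof(short); }
size_t align_of_int(void)     { return _Alignof(int); }
size_t align_of_long(void)    { return _Alignof(long); }
size_t align_of_pointer(void) { return _Alignof(void *); }

/* Returns 1 if addr is a multiple of alignment, which must be a
   power of two; returns 0 otherwise. */
int is_aligned(const void *addr, size_t alignment)
{
    return ((uintptr_t)addr & (uintptr_t)(alignment - 1)) == 0;
}
```

A check such as is_aligned(p, align_of_long()) can be useful before code casts a byte buffer address to a long pointer.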

2.5.1.1    Data Access

Older Alpha processors (EV4 and EV5) only support memory access of longwords (32 bits) and quadwords (64 bits). Byte and word accesses are accomplished by multiple instructions that load a longword or quadword, mask, and shift to obtain the desired entity. The lack of a single, uninterruptible operation for byte and word access has implications both for the performance and for the correctness of an application.

Beginning with the 21164a Alpha chip (EV56), the Alpha processor supports direct access of byte and word data types, as well as direct access of longword and quadword data types.

In Tru64 UNIX Version 4.0 and higher, the operating system kernel includes an instruction emulator that allows any Alpha chip to execute and produce correct results from Alpha instructions, even if some of the instructions are not implemented on the chip. Applications that use emulated instructions will run correctly, but might incur significant emulation overhead at run time.

To learn the type of Alpha processor on your system, enter the following command:

/usr/sbin/psrinfo -v

2.5.1.2    Data Access Synchronization

Independently executing applications that access common data must synchronize access to that data. The Alpha architecture mandates that for naturally aligned quadwords, independent access to adjacent quadwords produces the same results regardless of the order of instruction execution. No such guarantee exists for char, byte, word, or longword data.

A multithreaded application or multiple processes that access adjacent char, byte, word, or longword data through a common address space must use either thread mutex locking functions or semaphore locks to ensure that access to the data is deterministic. Similarly, such processes using shared memory-mapped files are restricted to semaphore locks to avoid conflicts with access operations to adjacent data items of type char, byte, word, or longword.

A simpler alternative to locking functions or semaphores is to expand byte-, word-, or longword-length items to quadwords.
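
As a sketch of that alternative, the two structures below hold the same pair of flags. The structure and function names are hypothetical, and the layout comments assume an LP64 system such as Tru64 UNIX, where a long is a naturally aligned quadword.

```c
#include <stddef.h>

/* Two flags in adjacent bytes: both share one quadword, so independent
   writers can interfere on hardware without single-byte stores. */
struct packed_flags {
    char reader_done;
    char writer_done;
};

/* The same flags expanded to one long each, so that each flag occupies
   its own quadword on an LP64 system. */
struct quadword_flags {
    long reader_done;
    long writer_done;
};

/* Distance in bytes between the two flags in each layout. */
size_t packed_gap(void)
{
    return offsetof(struct packed_flags, writer_done);
}

size_t quadword_gap(void)
{
    return offsetof(struct quadword_flags, writer_done);
}
```

The cost of this approach is memory: each flag grows from 1 byte to 8.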

Quadword data items that are not naturally aligned (that is, whose addresses have nonzero low-order 3 bits) incur access penalties similar to those for byte and word access. However, the compiler takes care of correct alignment for quadword data.

Compiling an application with the -arch ev56 option reduces the chances of word tearing for byte-, word-, or longword-length items. Even so, there are still cases where the compiler will generate code that exposes word tearing.

The cc -granularity <size> option ensures that the compiler generates code that will not cause word tearing for the <size> specified. However, Tru64 UNIX system libraries and kernel interfaces frequently assume quadword granularity. The compiler alone cannot resolve all the issues if you require granularity of less than an 8-byte quadword.

Note

The default memory access size on a Tru64 UNIX system is 8 bytes (quadword). This means that when two or more threads of execution are concurrently modifying adjacent memory locations, those locations must be quadword-aligned to protect the individual modifications from being overwritten. Errors can occur, for example, if separate data items stored within a single quadword of a composite data structure are being concurrently modified.

For details on the problems that non-quadword alignment can cause and the various situations in which the problems can occur, see the Granularity Considerations section in the Guide to the POSIX Threads Library.

For more information on data alignment and threads, see Section 6.2.2.

2.5.2    Pointers

Pointers are 64 bits long. Treating pointers as though they are the same size as int values will likely cause unwanted results.

Code to be ported that casts a pointer or quadword (type long or u_long) to an int value results in the upper 32 bits being truncated. If that int value is then cast back to 64 bits, the resulting value is incorrect.
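
The following sketch contrasts a round trip through an integer type that is wide enough for a pointer with a round trip through 32 bits. It assumes the compiler provides intptr_t and uintptr_t in stdint.h; the function names are hypothetical, and uint32_t stands in for a 32-bit int so that the narrowing conversion is well defined.

```c
#include <stdint.h>

/* Round trip through intptr_t, an integer type wide enough to hold
   a pointer: the original pointer comes back unchanged. */
void *via_intptr(void *p)
{
    intptr_t bits = (intptr_t)p;
    return (void *)bits;
}

/* Round trip of an address value through 32 bits: the upper 32 bits
   are discarded and come back as zero. */
uintptr_t via_32_bits(uintptr_t address)
{
    uint32_t low = (uint32_t)address;
    return (uintptr_t)low;
}
```

When ported code must store a pointer in an integer variable, changing the variable's type from int to long or intptr_t avoids the truncation.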

The following summarizes the behavior of 64-bit pointers:

2.5.3    Constants

Constants can have different values on 32-bit and 64-bit systems. Table 2-5 lists some constants and their values.

Table 2-5:  Values of 64-Bit Constants

C Constant           Value      32-Bit Value   64-Bit Value
0xffffffff           2^32 - 1   -1             4,294,967,295
4294967296           2^32       0              4,294,967,296
0x100000000          2^32       0              4,294,967,296
0xffffffffffffffff   2^64 - 1   -1             -1

In the following code fragment, the expression in the if statement is true in a 32-bit environment but false in Tru64 UNIX:

long long_val = 0xffffffff;
if(long_val < 0)

In Tru64 UNIX, long and unsigned long constants are 64 bit, quadword values. For example:

sizeof(543210) = 4 bytes
sizeof(543210L) = 8 bytes
sizeof(543210UL) = 8 bytes
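
The sign difference described in this section can be checked with a short sketch. The function names are hypothetical; the 32-bit result noted in the comment is what typical compilers produce, because the conversion of an out-of-range value to a signed type is implementation-defined.

```c
/* Returns 1 if a long initialized with 0xffffffff is negative.
   With a 32-bit long the stored value is typically -1; with a 64-bit
   long, as on Tru64 UNIX, it is 4294967295. */
int all_ones_is_negative(void)
{
    long long_val = 0xffffffff;
    return long_val < 0;
}

/* Stating the intended value directly gives the same answer with
   either size of long. */
int minus_one_is_negative(void)
{
    long long_val = -1L;
    return long_val < 0;
}
```

Code that uses 0xffffffff to mean "all bits set" or -1 is better written with -1L or ~0UL, which do not depend on the width of long.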

2.5.3.1    Truncation of Longs

Because longs are 64-bit values, truncation can occur if a long is assigned to an int variable. For example:

int int_val;
.
.
.
int_val = 2147483660;

Because of truncation, the value of int_val is -2147483636.

Truncation can also occur if a long value is passed as an argument to a function expecting an int value. For example:

abs(2147483660) = 2147483636
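
One way to guard against silent truncation is to test the range before narrowing. The sketch below is a hypothetical helper, not a library routine.

```c
#include <limits.h>

/* Hypothetical helper: if value fits in an int, stores it in *out and
   returns 1; otherwise leaves *out unchanged and returns 0. */
int long_to_int_checked(long value, int *out)
{
    if (value < INT_MIN || value > INT_MAX)
        return 0;
    *out = (int)value;
    return 1;
}
```

For the abs example, the standard labs function takes and returns a long and therefore avoids the truncation altogether.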

2.5.3.2    Bit Shifts

A bit-shift operation on an integer constant always yields a 32-bit constant. For example, even though long_val is declared a long, the results of the following operations are 32-bit values:

long_val = 1 << 31 results in long_val = -2147483648 or 0xffffffff80000000
if ((1 << 31) > 0x7fffffff) is false

If you need a result of type long, use the L or UL suffix on the integer constant being shifted. Only the left operand of a shift operator determines the result type; the type of the shift count operand is irrelevant. When a 32-bit result is later widened to 64 bits, the top 32 bits depend on the type of the value shifted: signed values are sign extended, and unsigned values are zero extended.

long_val = 1L << 31 results in 2147483648 or 0x80000000
if ((1L << 31) > 0x7fffffff) is true

You obtain similar results by casting to a long. For example, when shifting bytes into a long value, cast each byte to a long; otherwise, the result is only a 32-bit value. The following example results in a 64-bit value. (Assume long_val is a long data type and bp is a pointer to bytes.)

long_val = (((u_long)bp[0] << 56) | ((u_long)bp[1] << 48));
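
A complete version of this pattern, extended to all 8 bytes, might look like the following sketch. It uses uint64_t from stdint.h instead of u_long so that the result is 64 bits on any system; the function names are hypothetical.

```c
#include <stdint.h>

/* Assembles 8 bytes, most-significant byte first, into a 64-bit value.
   Each byte is widened to uint64_t before the shift; without the cast
   the shift would be performed on an int. */
uint64_t bytes_to_u64(const unsigned char *bp)
{
    uint64_t result = 0;
    int i;

    for (i = 0; i < 8; i++)
        result |= (uint64_t)bp[i] << (56 - 8 * i);
    return result;
}

/* Bit 31 set in a 64-bit value: positive, because the left operand is
   widened before the shift. */
uint64_t bit31_wide(void)
{
    return (uint64_t)1 << 31;
}
```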
 

2.5.4    Variables

Variables declared as int are 32-bit entities on both 32-bit systems and in the 64-bit environment of Tru64 UNIX. Variables declared as long (and as pointers) are 64 bits in Tru64 UNIX.

If you have specific variables that need to be 32 bits in size on both Tru64 UNIX and 32-bit systems, define the type to be int. If the variable should be 32 bits on 32-bit systems but 64 bits on Tru64 UNIX systems, define the variable to be long.

2.5.5    Structures

The 64-bit environment can affect both the size and alignment of structures, as described in this section.

2.5.5.1    Size

Because pointers and longs are 64-bit values, structures and unions that include pointers or long data types are larger than the same structures and unions on 32-bit systems.

For example, the following structure, TextNode, doubles in size on a 64-bit system because the pointer types are doubled in size from 4 bytes to 8 bytes:

struct TextNode
{
    char *text;
    struct TextNode *left;
    struct TextNode *right;
};

If you are sharing data defined in structures between 32-bit and 64-bit systems, avoid using longs and pointers as members in shared structures.

2.5.5.2    Member Alignment

The compiler ensures that members of structures and unions are aligned on their natural boundaries. Table 2-6 shows the alignments of various data types.

Table 2-6:  Structure Alignments

Data Type Alignment
char byte
short word
int longword
long quadword
pointer quadword

This means that the compiler sometimes inserts padding to provide member alignment in structures and unions. On 64-bit Alpha systems, the size of the following structure is 32 bytes: 8 bytes for each of the three pointers, 4 bytes for the int member size, and 4 bytes of padding after size, so that the pointer left, which follows size, is aligned on a 64-bit boundary:

struct TextCountNode
{
    char *text;
    int size;
    struct TextCountNode *left;
    struct TextCountNode *right;
};

2.5.5.3    Structure Alignment

The compiler aligns structures according to the strictest aligned member. This aids in aligning structure members on their required boundaries. The compiler pads structures to ensure proper alignment. Padding can be added within the structure or at the end of the structure to terminate the structure on the same alignment boundary on which it started.

Because of padding, do not assume that the size of a structure is simply the accumulated size of all of the objects defined in it. The sizeof operator is a safer method for determining structure size.

In some cases, you can minimize the amount of padding needed in a structure by reordering the members.

The following structure is 40 bytes; the compiler adds 4 bytes of padding after each of the members size and count, to maintain alignment of the pointers on 64-bit boundaries:

struct TextCountNode
{
    char *text;
    int size;
    struct TextCountNode *left;
    int count;
    struct TextCountNode *right;
};

Placing the two int members together eliminates the padding and reduces the size of the structure to 32 bytes:

struct TextCountNode
{
    char *text;
    int size;
    int count;
    struct TextCountNode *left;
    struct TextCountNode *right;
};
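
The sizes quoted in this section can be confirmed with sizeof and offsetof. In the sketch below the two layouts are given distinct hypothetical names (interleaved and grouped) so that both can appear in one source file; the sizes in the comments assume 8-byte pointers and 4-byte ints.

```c
#include <stddef.h>

/* int members separated by pointers: 40 bytes with 8-byte pointers. */
struct interleaved {
    char *text;
    int size;
    struct interleaved *left;
    int count;
    struct interleaved *right;
};

/* int members placed together: 32 bytes with 8-byte pointers. */
struct grouped {
    char *text;
    int size;
    int count;
    struct grouped *left;
    struct grouped *right;
};

size_t interleaved_size(void) { return sizeof(struct interleaved); }
size_t grouped_size(void)     { return sizeof(struct grouped); }

/* Bytes of padding the compiler inserted between size and left. */
size_t padding_after_size(void)
{
    return offsetof(struct interleaved, left)
         - offsetof(struct interleaved, size)
         - sizeof(int);
}
```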

2.5.5.4    Unions

Problems arise when the use of a union is based on assumptions such as the following:

sizeof(double) == 2*sizeof(long) or sizeof(long) == 4*sizeof(char)
 

The following code fragment assumes that an array of two longs overlays a double:

union double_union {
    double d;
    unsigned long ul[2];
};

Changing the long to an int fixes the problem:

union double_union {
    double d;
    unsigned int ul[2];
};
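
An alternative that states its size assumption explicitly is sketched below. It uses the fixed-width types from stdint.h and a C11 _Static_assert, both of which are assumptions about the compiler in use; the function names are hypothetical, and the bit patterns in the usage note assume IEEE 754 doubles.

```c
#include <stdint.h>
#include <string.h>

/* Compilation fails if a double is not exactly 64 bits wide, instead
   of the overlay silently going wrong. */
_Static_assert(sizeof(double) == sizeof(uint64_t),
               "double must be 64 bits wide");

/* Returns the most-significant 32 bits of the representation of d. */
uint32_t double_high_word(double d)
{
    uint64_t bits;

    memcpy(&bits, &d, sizeof bits);
    return (uint32_t)(bits >> 32);
}

/* Returns the least-significant 32 bits of the representation of d. */
uint32_t double_low_word(double d)
{
    uint64_t bits;

    memcpy(&bits, &d, sizeof bits);
    return (uint32_t)(bits & 0xffffffffu);
}
```

For the value 1.0, the high word is 0x3ff00000 and the low word is 0. Because the words are selected by shifting rather than by array index, the functions do not depend on byte order.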

2.5.5.5    Bit Fields

Bit fields are allowed on any integral type on Alpha systems. (ANSI C requires only bit fields with int, signed int, and unsigned int types.)

In a C declaration, if one bit field immediately follows another in a structure declaration, the second bit field is packed into adjacent bits of the former unit. Because the long data type is 64 bits long on Alpha systems, consecutive declarations of bit fields of type long can contain multiple bit-field definitions, whereas this might not occur on 32-bit systems. This difference can cause unexpected results in operations on these bit fields.

To ensure the same behavior in operations on bit fields, change bit field definitions of type long to int.

The -Zp n option to the cc command and the #pragma pack directive let you specify the number of bytes used to align the members of a structure. For more information, see the Tru64 UNIX Programmer's Guide and cc(1).

2.5.6    Library Calls and Operators

The 64-bit data types also affect the following library calls and operator:

2.6    File System

The 64-bit Tru64 UNIX operating system allows you to build very large files and file systems. The off_t file offset is defined to be a long on Alpha systems (64 bits). Given this extended capability, you can build files and file systems that cannot be fully accessed by 32-bit systems. Consider this when working in a distributed environment in which file systems are shared between 32- and 64-bit systems.
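
An application that exchanges file offsets with 32-bit systems can test whether an offset is representable there. The following is a minimal sketch; the function name offset_fits_32_bits is hypothetical.

```c
#include <stdint.h>
#include <sys/types.h>

/* Hypothetical helper: returns 1 if offset can be represented in a
   signed 32-bit file offset, and 0 otherwise. */
int offset_fits_32_bits(off_t offset)
{
    return offset >= INT32_MIN && offset <= INT32_MAX;
}
```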

2.7    Endian Issues

IRIX on the MIPS architecture is big endian: It has forward byte ordering. Bit 0 is the least-significant bit and byte 0 is the most-significant byte. Alpha and Intel x86 architectures are little endian: Byte 0 is the least-significant byte.

Figure 2-2 illustrates the byte ordering on the Alpha architecture.

Figure 2-2:  Byte and Bit Ordering on Alpha Systems

For well-constructed code, the endianism of a system is almost always transparent. Those few cases in which endianism is a concern typically are caused by coding practices that mix types in unions or casts.

2.7.1    Unions

Unions such as the following can result in an endian portability problem:

union int_byte {
    int int_val;
    char byte[4];
};
union int_byte my_var;
my_var.int_val = 1000000;
if(my_var.byte[3] == 0)
    printf("The number is divisible by 256\n");

On a big-endian machine, this code works correctly: byte[3] is the least-significant byte, so it is zero only when the number is divisible by 256. However, on a little-endian machine, byte[3] is the most-significant byte. Either of the following methods fixes this problem:
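
A minimal sketch of an endian-neutral way to write the same test follows. It selects the least-significant byte by value, with a mask, instead of by position in memory. The function names are hypothetical.

```c
/* Returns 1 if value is a multiple of 256. The mask isolates the
   least-significant byte regardless of where that byte is stored. */
int divisible_by_256(unsigned int value)
{
    return (value & 0xffu) == 0;
}

/* Returns 1 on a little-endian host (such as Alpha) and 0 on a
   big-endian host (such as MIPS running IRIX), by examining the
   first byte in memory of the integer 1. */
int host_is_little_endian(void)
{
    unsigned int probe = 1;

    return *(unsigned char *)&probe == 1;
}
```

With the mask approach, 1000000 is reported as not divisible by 256 on both kinds of system (1000000 = 3906 * 256 + 64).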

2.7.2    Initializing Multiword Entities in 32-Bit Chunks

Use care when porting code that initializes quadword and other multiword entities with 32-bit entities. For example, on a big-endian system, an array of two 32-bit integer values is used to initialize a 64-bit double:

u.ul[1] = 0x7fffffff;
u.ul[0] = 0xffffffff;

To produce the correct results on a little-endian system, such as an Alpha, the subscripts must be reversed:

u.ul[0] = 0x7fffffff;
u.ul[1] = 0xffffffff;
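
Code that must run unchanged on both kinds of system can avoid the subscripts entirely by combining the two halves arithmetically. The sketch below is one such approach; the function name is hypothetical, and it assumes a 64-bit IEEE 754 double whose byte order matches that of 64-bit integers, as on MIPS and Alpha.

```c
#include <stdint.h>
#include <string.h>

/* Builds a double from the high and low 32-bit halves of its
   representation. The halves are combined by shifting, so no array
   subscript depends on byte order. */
double double_from_words(uint32_t high, uint32_t low)
{
    uint64_t bits = ((uint64_t)high << 32) | low;
    double d;

    memcpy(&d, &bits, sizeof d);
    return d;
}
```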

2.7.3    Unused Bytes

Sometimes code that is trying to make very efficient use of memory takes advantage of the fact that often not all 4 bytes in an integer are used. For example, if a particular int field in a record will hold only values in the range 0 to 10,000,000, the most-significant byte will always be zero. A 1-byte field could be stored in that byte to make the record 1 byte smaller.

If the most-significant byte is accessed by means of a character array or by casting and dereferencing a pointer, then the code will not be portable and slightly different versions will be needed on big-endian and little-endian machines. However, if bitwise operators are used to mask, merge, and shift bytes, then the code will be portable.
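
The portable variant can be sketched as follows, with a 1-byte tag kept in the most-significant byte of a 32-bit field whose value needs only 24 bits. The function names are hypothetical.

```c
#include <stdint.h>

/* Stores tag in the most-significant byte of a 32-bit field and
   value (which must be below 2^24) in the remaining three bytes. */
uint32_t pack_tag(uint32_t value, uint8_t tag)
{
    return (value & 0x00ffffffu) | ((uint32_t)tag << 24);
}

/* Recovers the 24-bit value. */
uint32_t unpack_value(uint32_t packed)
{
    return packed & 0x00ffffffu;
}

/* Recovers the 1-byte tag. */
uint8_t unpack_tag(uint32_t packed)
{
    return (uint8_t)(packed >> 24);
}
```

Because the shifts and masks operate on the value rather than on its bytes in memory, the same source gives the same results on big-endian and little-endian systems.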

2.7.4    Hex Constants Used As Byte Arrays

An endian problem occurs when a 32-bit value is treated sometimes as a 32-bit value (an integer) and sometimes as an array of 4 characters. For example, the following array is equivalent to the number 0x11223344 on big-endian machines and the number 0x44332211 on little-endian machines:

char a[4] = {0x11, 0x22, 0x33, 0x44};
 

2.7.5    Data Transfer

If data with multibyte values is being transferred between big-endian and little-endian systems, then it is a simple matter to provide code that swaps the bytes. Appendix A presents suggestions for doing this.
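
As a rough sketch of what such byte-swapping code can look like (this is not the code from Appendix A, and the function names are hypothetical):

```c
#include <stdint.h>

/* Reverses the order of the 4 bytes in a 32-bit value. */
uint32_t swap32(uint32_t v)
{
    return (v >> 24)
         | ((v >> 8) & 0x0000ff00u)
         | ((v << 8) & 0x00ff0000u)
         | (v << 24);
}

/* Reverses the order of the 8 bytes in a 64-bit value by swapping
   each half and exchanging the halves. */
uint64_t swap64(uint64_t v)
{
    return ((uint64_t)swap32((uint32_t)v) << 32)
         | swap32((uint32_t)(v >> 32));
}
```

For data sent over a network, the standard htonl and ntohl functions perform the 32-bit conversion to and from big-endian order and are no-ops on big-endian hosts.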

2.8    Write-to-Memory Operations and Memory Barriers

The Alpha architecture guarantees coherency of a processor's view of memory (that is, cache is updated, or the contents marked invalid and good data fetched elsewhere). The architecture has a shared-memory model that specifies no implicit ordering between the reads and writes issued on one processor, as viewed by a different processor. This approach allows a wide variety of high-performance implementation techniques. For example, it makes possible such implementations as the use of multibank caches, bypassed write buffers, write merging, and pipelined writes with retry on error.

When required, specify strict ordering of reads and writes by using explicit memory barrier (MB) instructions. Low-level hardware operations, such as device drivers, often make use of memory barrier instructions to ensure the order in which data are written to memory.

The following code fragment illustrates the use of a memory barrier:

device_intr()
{
    mb();
    bcopy (DMA_buffer, data, nbytes);
    /* If we need to update a device register, do: */
    mb();
    device->csr = DONE;
    mb();
}
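
User-level code does not normally call the kernel mb() routine. As a loosely analogous sketch, the C11 atomic_thread_fence function from stdatomic.h expresses the same ordering requirement portably. Using C11 atomics is an assumption about the compiler in use, and the names payload, ready, publish, and consume are hypothetical.

```c
#include <stdatomic.h>

static int payload;        /* ordinary data being handed over      */
static atomic_int ready;   /* flag announcing that payload is set  */

/* Writer: store the data, then a release fence, then the flag. The
   fence keeps the data store from being ordered after the flag store. */
void publish(int value)
{
    payload = value;
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&ready, 1, memory_order_relaxed);
}

/* Reader: returns 0 if nothing has been published yet. Otherwise an
   acquire fence orders the flag load before the data load, the data
   is copied to *out, and 1 is returned. */
int consume(int *out)
{
    if (!atomic_load_explicit(&ready, memory_order_relaxed))
        return 0;
    atomic_thread_fence(memory_order_acquire);
    *out = payload;
    return 1;
}
```

On Alpha, a compiler would typically implement such fences with MB instructions; on architectures with stronger ordering, they may compile to nothing.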

See Writing Device Drivers in the Tru64 UNIX Device Driver Kit for more information.