TDI - A Thread Debug Interface for Pthreads Implementations

A Thread Debug Interface (TDI) for Implementations of the POSIX Threads (Pthreads) Standard


TDI Introduction

Debugging multithreaded programs with common debuggers is very uncomfortable if the debugger is not thread aware. There are a few solutions to that problem, most of them based on GDB (the GNU Debugger). Their common disadvantage is that each is specialized to a specific thread implementation. The POSIX standard for a C multithreaded programming interface makes it possible to design a common debug interface for standard-compliant implementations. The TDI is an approach to provide such an interface for the "whole" variety of POSIX threads implementations. This includes kernel thread implementations (Linuxthreads) as well as pure user space solutions (FSU Pthreads, MIT Pthreads).

The TDI is an abstraction layer between the debugger (the SE tool) and the POSIX threads programming environment. Normally the debugger is not thread aware and the Pthreads package provides no debugging support. There are several approaches to getting debugging support for Pthreads. One way to distinguish them is by the role the implementation plays while it is being debugged. In the first approach the implementation is passive. This requires a vast amount of decoding and symbol lookup on the debugger side and reduces portability: all information about threads and related objects has to be retrieved by the debugger. The advantage is that the implementation needs no changes besides compilation with "-g". A much greater disadvantage is the coding effort necessary on the debugger side. Furthermore, the debugger has to be updated for every change to the Pthreads implementation, because the decoding logic depends on its internals.

The TDI uses the second approach, in which the Pthreads implementation takes an active role. Using the Pthreads implementation as a server for thread-related information makes it possible for every specific implementation to provide a kind of abstract pthreads implementation interface. As a consequence, the debugger only has to know how to debug an abstract pthread implementation. This makes the debugger independent of Pthreads development. The TDI core embodies the part of thread debugging that is independent of both the debugger and the pthreads implementation. It provides communication support for both the server side (pthreads) and the client side (debugger). The TDI manages the query requested by the debugger and evaluates it by using the abstract pthread implementation interface of the pthread implementation.

The following picture shows how the TDI is embedded into the debugging environment.

The TDI is mainly an information server. Every Pthreads implementation can be considered a dynamic database. Currently there are only three relations: threads, mutexes and CVs (condition variables). Their attributes are listed under POSIX Objects below. To make the database easier to access, a simple query system was developed.

TDI Perspectives

Because the TDI is an interface, it can be looked at from different sides. The interesting views of the TDI are described below.

User View

How visible the TDI is to the user depends on the debugger implementation. The TDI provides a query language to get information about threads, mutexes and condition variables. If the debugger implements a command that accesses this TDI facility directly, the user is able to use it. In addition, the debugger could translate symbol names into address values (which the TDI uses) to offer a user-friendly query interface (names instead of addresses).

An Example:

User question: Which threads are blocked on a mutex and have a priority greater than 0x20?

TDI-Query: "<TED>thread: id,prio,state,function,function_arg: status==0x2 && prio>0x20"

Debugger (Gdb) View

Developing a portable debugger is hard work, as the Gdb implementation shows impressively. One aim of the TDI was not to increase the complexity of Gdb. Instead, the complexity is placed in the TDI itself, so that Gdb's access to the TDI stays simple. All Gdb access to the TDI is done by sending human-readable ASCII sequences to the TDI and receiving results in a defined format. For example, a response to the query shown above could be "1 2f 2 805b000 805bc28#2 21 2 805c000 805cc28". The debugger should process this response according to the request to produce a readable result. The debugger could resolve the function and function_arg addresses to symbol names, because the TDI cannot be aware of symbol names.

Pthreads View

The Pthread implementation has to provide the functions that make up the abstract pthread implementation interface. For every kind of entity (threads, mutexes, CVs) there have to be attribute access functions. Implementing them takes little effort, which should be an incentive for pthread developers to add this extension. The Pthread implementation cannot see anything of the TDI. The TDI server is linked to the Pthreads library at runtime, which requires shared library support. (A library is always linked for pthread support, whether a library implementation or a kernel thread package is used.)

What does the TDI do?

The TDI is an interface between the debugger and the Pthreads implementation that gives the debugger a view of the application at the pthreads abstraction level. In short, the TDI makes all POSIX objects and their relationships visible at runtime. This information can be used by the debugger to provide low-level and high-level thread debugging.

POSIX Objects

If one imagines the POSIX Threads API as a world of objects, the following interesting objects can be identified:

Threads: ~ are an abstraction for concurrent (or, on multiprocessors, parallel) control flow. Each is associated with a start function that it executes. Threads can have many attributes, which should be accessible to the debugger. Because the standard defines only the name of the thread type (pthread_t), only the implementation knows how to access the thread attributes.

interesting attributes: ID, address (of the pthread_t object), priority, start function address, function argument address, status (RUNNING, READY, BLOCKED [Mutex, Condition Variable, Other Reason (IO, timer)], EXITING)
methods (thread-related C functions): init, create, exit

Mutexes: ~ are a means of thread synchronisation. The type name pthread_mutex_t is defined, but not the type itself.

interesting attributes: ID, address (of the pthread_mutex_t object), owner
methods: init, lock, unlock, destroy

Condition Variables: ~ are used to signal blocked threads. CVs establish a per-mutex signalling system.

interesting attributes: ID, address (of the pthread_cond_t object), associated mutex
methods: init,wait,timed_wait,signal,broadcast,destroy

Scheduler: ~ is not explicitly defined as a data type but is implicitly accessible through Pthreads functions. One can specify a scheduling policy and thread priorities.

POSIX Threads Implementations

There is a variety of thread implementations conforming to the POSIX threads standard. One way to implement the standard is the kernel thread solution. Operating systems like Mach, SunOS 5.5 (and higher), Linux and many more provide kernel threads. On such systems pthreads can be mapped to kernel threads. Scheduling is done by the kernel, and in most cases an MT-safe C library is provided. Many operating systems do not provide kernel threads. On these systems pthreads are pure user space implementations. The disadvantage of user space implementations for debugging is that all threads run in one single process and are invisible to the operating system kernel. They share the process time and have a user space scheduler. Some Pthread solutions come with their own thread-safe C libraries (e.g. MIT Threads). A TDI has to support all of these kinds of implementations.

We decided to give the TDI a layered architecture. There are two layers: the generic layer (TDI kernel) and the pthreads-dependent layer (TED). All TDI functionality that is common to all pthreads implementations (communication module, query processing, auxiliary functions and bookkeeping) is located in the TDI kernel. All TDI functionality that depends on the pthread implementation (reading and writing attributes, object set iteration) is implemented in the TED layer. The TED layer is a set of TED implementations, one TED support package for every pthreads implementation. There is only one TDI kernel. This makes it possible to develop portable thread debuggers on top of the TDI: no debugger modification is necessary to support a new thread package.

Abstract Pthread Implementation (Thread Extension for Debugging - TED)

As mentioned above, we need a low-level abstraction layer that encapsulates the internal pthreads representation of the POSIX threads objects. Additionally we need iteration methods for the object sets. We assume that every Pthreads developer can provide list iteration (getfirst, getnext) on every set of objects. Attribute access for all object types should also be possible in a standardized way. TED is meant to be the standard by which Pthreads developers implement the access and iteration functions. TED defines the prototypes for the iterators and the read/write functions. Implementing the read and write functions is the responsibility of the developers.

TDI Architecture

The TDI-kernel consists of three components:

communication module

query processing (parsing, evaluation)

bookkeeping and auxiliary functions

Communication Module

The communication is implemented using the SVR4 shared memory facility. The communication module has a modular design, so that other kinds of IPC can be added.

Query Transformation

set addressing (threads, mutexes or condition variables)

attribute selection (projection) and assignment

selection predicate

Syntactic and semantic checks are done by a precompiler. The precompiler generates a list of visible attributes and a list of assigned attributes.

Query Processing

Auxiliary Functions

object registration (object containers for POSIX objects not managed by the Pthreads implementation)
persistent ID support (process-persistent thread IDs)
registering of the TED functions

Because not all POSIX threads objects are managed by the pthreads implementation, the TDI provides object containers for them. Mutexes and condition variables are handled by the user and not by the pthreads implementation. To keep track of the object sets, the pthread implementation has to use the object container functions of the TDI to register an object at creation time (for example in pthread_create or pthread_mutex_init) and to unregister it at destruction (for example in pthread_exit or pthread_cond_destroy).

The data type of POSIX threads is pthread_t. If pthread_t is defined as a pointer to an internal thread structure, persistent thread IDs cannot be guaranteed. If the pthreads implementation uses the TDI object container functions, process-persistent IDs can be realized by assigning a new ID to every new container element.

It was a design goal of the TDI to allow debugging support for pthread implementations with an incomplete TED implementation. This is necessary because some functionality cannot be supported by both kernel and user space implementations. The thread attribute state illustrates this design goal. If a multithreaded application based on kernel threads is stopped while being debugged, all kernel threads have the state TRACED, even the running thread that hit the breakpoint. If it is impossible to obtain the state the kernel threads had just before they were stopped, no read and write functions for the state attribute are registered with the TDI kernel. If the user or the debugger then makes a query that includes the state attribute, the NULL value is returned for state, because no read and write functions are registered.

Installation Details

Feedback

You're welcome to make any suggestions and tell us your ideas. Please direct all TDI questions and bug reports to

tdi-bugs@informatik.hu-berlin.de

Last Modified: 1999-Apr-25

Author: Daniel Schulz