User:Nick Johnson/WhereToPutThisStuff

From Citizendium
Revision as of 11:26, 16 April 2007 by imported>Nick Johnson

On the extensive use of pointers in C

Since C has been widely used for many years, the usage of pointers in C warrants special attention here. Although C's pointer system permits unsafe usage, C programmers can use pointers to add many features to C, such as information hiding and abstraction, an object-oriented programming style with inheritance, and reinterpretation typecasts, in addition to typical data structures.

Still, many programmers would consider these techniques kludges, and would recommend higher-level languages that have full syntactic and semantic support for these concepts.


Object-Oriented Programming, Information hiding and abstraction

Main Article: object-oriented programming, information hiding and abstraction

C is not an object-oriented programming (OOP) language, but one can still write object-oriented programs in C. OOP is a design philosophy rather than a language feature; OOP languages add syntax and type checking that make the style easier to use.

In C, a programmer can group related information into a single object by using what is known as a structure. For instance, shown below is an Employee object, which has-a name and salary.

/* in Employee.h */
struct Employee
{
  char *name;
  int   salary;
};

This definition will work, but can create problems when the software needs to be modified in the future. Because of its transparent nature, programmers will tend to directly access its members. Continuing with our example, we could assume that there will be many places throughout the whole program which attempt to read the employee's name field.
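To make this dependency concrete, here is a minimal sketch of the kind of client code that tends to accumulate. The function formatBadge is hypothetical and does not come from any real program; it simply reads both fields of the structure directly.

```c
#include <stdio.h>

struct Employee
{
  char *name;
  int   salary;
};

/* Hypothetical client code: writes "name (salary)" into out and
   returns the number of characters written (as snprintf does).
   It depends on the exact field names of struct Employee. */
int formatBadge(const struct Employee *e, char *out, size_t outSize)
{
  return snprintf(out, outSize, "%s (%d)", e->name, e->salary);
}
```

Every function written this way is tied to the current layout of the structure.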

However, in the future, the programmers may decide that the Employee's name should be represented as three fields: the first, middle and last names. Such a modification might look like this:

/* in Employee.h */
struct Employee
{
  char *firstName,
       *middleName,
       *lastName;
  int   salary;
};

Again, this definition will work. However, because our representation has changed, many parts of the code will need to change as well. Software engineers have studied this problem and recommend information hiding and abstraction to reduce the amount of rework necessary when a data structure changes. In short, one should hide the internal representation of a data structure from the remainder of the program, and instead define a public interface that manipulates the data structure in a few well-defined and safe ways.

To employ information hiding, a C programmer would split the object's definition, so that only its interface is available to the remainder of the code. For example,

/* in Employee.h */
typedef struct s_employee *Employee;

Employee employee_new(void);
void employee_delete(Employee this);
void employee_setName(Employee this, const char *newName);
char *employee_getName(Employee this);

/* in Employee.c */
#include "Employee.h"

struct s_employee
{
  char *firstName,
       *middleName,
       *lastName;
  int   salary;
};

/* definitions of each method follow */

This technique necessarily depends on pointers. In the header file Employee.h, we declare a pointer to the s_employee structure without defining the structure itself, which leaves it an incomplete type. The compiler permits this because it needs to know only the size of a pointer, not the size or layout of the structure it points to. The remainder of the program can therefore pass pointers to s_employee around, but cannot create, copy, or inspect the structures directly. As a result, any attempt to access individual fields of the object outside Employee.c will produce a compile-time error; those fields have been made private, and the internal representation has been hidden.
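The method definitions are left out above. The following is one possible sketch of them, with the header and the implementation shown together so that it compiles as a single file. The helper copyChars, the rule for splitting a full name at its first and last space, and the choice that employee_getName returns a newly allocated string which the caller must free are all assumptions made for illustration, not part of the original interface description.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* ---- public interface, as in Employee.h ---- */
typedef struct s_employee *Employee;

Employee employee_new(void);
void employee_delete(Employee this);
void employee_setName(Employee this, const char *newName);
char *employee_getName(Employee this);

/* ---- private representation, as in Employee.c ---- */
struct s_employee
{
  char *firstName,
       *middleName,
       *lastName;
  int   salary;
};

/* Hypothetical helper: copy the n characters starting at s into a new string. */
static char *copyChars(const char *s, size_t n)
{
  char *copy = malloc(n + 1);
  assert(copy != NULL);  /* sketch only: real code would report the failure */
  memcpy(copy, s, n);
  copy[n] = '\0';
  return copy;
}

Employee employee_new(void)
{
  Employee this = malloc(sizeof *this);
  assert(this != NULL);
  this->firstName = this->middleName = this->lastName = NULL;
  this->salary = 0;
  employee_setName(this, "");  /* start with three empty name parts */
  return this;
}

void employee_delete(Employee this)
{
  if (this == NULL)
    return;
  free(this->firstName);
  free(this->middleName);
  free(this->lastName);
  free(this);
}

/* Assumed convention: text before the first space is the first name, text
   after the last space is the last name, anything between is the middle name. */
void employee_setName(Employee this, const char *newName)
{
  const char *firstSpace, *lastSpace;

  assert(this != NULL && newName != NULL);
  firstSpace = strchr(newName, ' ');
  lastSpace  = strrchr(newName, ' ');

  free(this->firstName);
  free(this->middleName);
  free(this->lastName);

  if (firstSpace == NULL) {  /* a single word is stored as the first name */
    this->firstName  = copyChars(newName, strlen(newName));
    this->middleName = copyChars("", 0);
    this->lastName   = copyChars("", 0);
  } else {
    this->firstName  = copyChars(newName, (size_t) (firstSpace - newName));
    this->middleName = (lastSpace > firstSpace)
      ? copyChars(firstSpace + 1, (size_t) (lastSpace - firstSpace - 1))
      : copyChars("", 0);
    this->lastName   = copyChars(lastSpace + 1, strlen(lastSpace + 1));
  }
}

/* Returns the non-empty parts joined by single spaces; the caller frees it. */
char *employee_getName(Employee this)
{
  char *full;

  assert(this != NULL);
  full = malloc(strlen(this->firstName) + strlen(this->middleName)
                + strlen(this->lastName) + 3);
  assert(full != NULL);
  strcpy(full, this->firstName);
  if (this->middleName[0] != '\0') { strcat(full, " "); strcat(full, this->middleName); }
  if (this->lastName[0]   != '\0') { strcat(full, " "); strcat(full, this->lastName); }
  return full;
}
```

Client code sees only the four functions, so switching between a one-field and a three-field representation changes nothing outside Employee.c.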

Inheritance

Main Article: inheritance

We shall extend the Employee object from the previous section. Suppose that, in addition to Employee objects, we also wish to have Manager objects. Managers will be the same as Employees, except that managers additionally have Secretaries. We want to create the Manager class as a derivative of the Employee class, and we do not want to rewrite all of the methods that have already been written for the Employee class.

This can be accomplished by embedding the parent class at the beginning of the child class. Because C guarantees that a pointer to a structure, suitably converted, also points to the structure's first member, we can define our Manager object and still manipulate it with Employee methods. For instance,

/* in Employee.h */
typedef struct s_employee *Employee;
typedef struct s_manager  *Manager;

Employee employee_new(void);
void employee_delete(Employee this);
void employee_setName(Employee this, const char *newName);
char *employee_getName(Employee this);

Manager manager_new(void);
void manager_delete(Manager this);
void manager_setSecretary(Manager this, Employee sec);
Employee manager_getSecretary(Manager this);

/* in Employee.c */
#include "Employee.h"

struct s_employee
{
  char *firstName,
       *middleName,
       *lastName;
  int   salary;
};

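struct s_manager
{
  struct s_employee Me;   /* parent class; must remain the first member */

  Employee mySecretary;
};

/* definitions for each method follow */

This works because the s_manager structure begins with an embedded s_employee structure, so its first few words have the same organization as an Employee. Thus, a method that retrieves the salary of an Employee object will also work when given a pointer to a Manager object cast to the Employee type. The programmer must be quite careful with this technique: the parent structure must stay the first member of the child, and the compiler cannot check that such a cast is valid.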
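As a minimal sketch of how a parent method can be reused, the code below calls salary accessors written for Employee on a Manager. The functions employee_setSalary, employee_getSalary and manager_asEmployee are hypothetical additions for illustration; they do not appear in the interface above.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct s_employee *Employee;
typedef struct s_manager  *Manager;

struct s_employee
{
  char *firstName,
       *middleName,
       *lastName;
  int   salary;
};

struct s_manager
{
  struct s_employee Me;   /* parent class; must remain the first member */
  Employee mySecretary;
};

/* Hypothetical accessors written only in terms of Employee. */
void employee_setSalary(Employee this, int salary)
{
  assert(this != NULL);
  this->salary = salary;
}

int employee_getSalary(Employee this)
{
  assert(this != NULL);
  return this->salary;
}

Manager manager_new(void)
{
  Manager this = malloc(sizeof *this);
  if (this != NULL) {
    this->Me.firstName = this->Me.middleName = this->Me.lastName = NULL;
    this->Me.salary = 0;
    this->mySecretary = NULL;
  }
  return this;
}

void manager_delete(Manager this)
{
  free(this);
}

/* Hypothetical upcast helper: a pointer to a structure, converted, points
   to its first member, so this yields the same address as &this->Me. */
Employee manager_asEmployee(Manager this)
{
  return (Employee) this;
}
```

Keeping the cast inside one helper such as manager_asEmployee confines the unchecked conversion to a single place.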

Reinterpretation Typecasts

Main Article: reinterpretation typecast

Sometimes it is useful to manipulate data in ways that are not directly supported by a programming language. Take, for instance, the problem of generating pseudorandom floating point numbers. When performed frequently, as is the case with many simulations, the speed of a routine to generate these numbers is paramount.

Generating a pseudorandom integer is relatively fast. Generating a pseudorandom floating-point number, however, may require a division operation, which is (relatively) slow. One trick is to generate pseudorandom integer data and to reinterpret it as a floating-point number through a pointer cast, as shown below:

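#include <stdlib.h>  /* rand() */

int goodFloatingPointNumber(double x);  /* boundary-case predicate, defined elsewhere */

double fastFrand(void)
{
  double result;
  /* View the double as raw bytes. Access through unsigned char * is always
     permitted, whereas writing through an int * would break C's aliasing rules. */
  unsigned char *parts = (unsigned char *) &result;
  size_t i;

  do
    for (i = 0; i < sizeof(double); i++)
      parts[i] = (unsigned char) (rand() & 0xFF);  /* low 8 bits of each draw */
  while ( ! goodFloatingPointNumber(result) );

  return result;
}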

Here, the predicate goodFloatingPointNumber() checks for a few boundary cases, such as bit patterns that do not encode a finite number.
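The text does not define goodFloatingPointNumber(). One possible definition is sketched below, under the assumption that the boundary cases to reject are the bit patterns that encode NaN or an infinity, and that the C99 isfinite macro from math.h is available.

```c
#include <math.h>  /* isfinite (C99) */

/* Assumed meaning of "good": the value is neither NaN nor infinite.
   Returns 1 for an acceptable value and 0 otherwise. */
int goodFloatingPointNumber(double x)
{
  return isfinite(x) ? 1 : 0;
}
```

Note that random bit patterns fall roughly evenly across all exponents, so the accepted values are not uniformly distributed over any interval; a simulation that needs a uniform distribution would need a further step.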