Subprogram

Redirected from Subroutine

In computer programming, a subprogram is a sequence of instructions within a computer program that is kept separate from the rest of the code and is called by other subprograms or by other parts of the program.

The possible motivations for the use of subprograms are numerous:

reducing redundancy,
promoting reuse of code,
decomposing complex problems into small pieces that are simpler,
improving readability of program code,
replicating mathematical functions,
emulating hardware functionalities,
introducing some layers within programs,
hiding or regulating part of the program.

Reduction of redundancy is one of the most common reasons: using subprograms, the program can make repeated use of one block of code that would otherwise be duplicated. Redundancy can be deliberately introduced to improve performance (see optimization).

Computer programs tend to be complex, and it is hard to understand a whole program at once. Programs are therefore generally written as a number of subprograms. Subprograms are sometimes made to mimic mathematical functions, such as sine and cosine, which lets programmers write mathematical expressions containing these functions.

Generally, subprograms are defined within computer programs. For consistency, built-in routines are often called in exactly the same way as user-defined subprograms. Where the two are called differently, the inconsistency is sometimes accidental and sometimes deliberate. Many languages, such as the C programming language, require all code to belong to some subprogram, and a specific one, the main function, is called at the beginning of execution.

Some programming languages, particularly languages following the structured programming methodology such as Pascal, support subprograms embedded in other subprograms.

Although uncommon, subprogram (meaning a part of a program) is a concise and precise term, because the other commonly used terms imply a philosophy about programs. Subroutine is the most commonly used term. The name arose because, soon after the invention of assembly languages and Fortran, it became common to set aside frequently used code, that is, a routine task, as routine code. Programmers tend to use this term in any language. Procedure is the most common term when algorithms are described, because in mathematics an algorithm is seen as a generalized sequence of computational steps. Some people prefer function, because in some languages a subprogram appears as part of an expression, that is, as something having a value. Generally, the precise meaning of each term varies between programming languages. Methods are a special kind of subprogram used in object-oriented programming. See method for details.

Typically, the caller waits for a subprogram to finish and continues execution only after the subprogram returns. Subprograms are often given parameters to refine their behavior or to perform a certain computation with given (variable) values. Values passed to a subprogram are called actual parameters (or, more commonly, arguments), and the list of values a subprogram expects to be given is known as its list of formal parameters (or simply "parameters").
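The difference between formal and actual parameters can be illustrated with a minimal C++ sketch. The names rectangle_area and area_of_room are hypothetical and chosen only for this illustration.

```cpp
// 'width' and 'height' are the formal parameters of rectangle_area.
int rectangle_area(int width, int height)
{
    return width * height;
}

// The caller waits for rectangle_area to return before it continues.
int area_of_room()
{
    int w = 4;
    int h = 5;
    // 'w' and 'h' are the actual parameters (arguments) of this call.
    int area = rectangle_area(w, h);
    return area;
}
```

The first argument is bound to the first formal parameter, the second to the second, so swapping w and h in the call would swap the values seen as width and height.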

In most imperative programming languages, subprograms may have so-called side effects; that is, they may cause changes that remain after the subprogram has returned. Compilers usually cannot predict whether a subprogram has a side effect, but they can determine whether it calls no other subprograms, or at least no other subprograms that have side effects. In imperative programming, compilers usually assume that every subprogram has a side effect, to avoid complex analysis of execution paths. Because of its side effects, a subprogram may return different results each time it is called, even if it is called with the same arguments. A simple example is a subprogram that returns a random number each time it is called. Such a subprogram is not a function in the strict mathematical sense. An exception to this common behaviour is found in purely functional programming languages, where subprograms have no side effects and always return the same result when called repeatedly with the same arguments. (Subprograms are referred to as functions in these languages.)
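As a rough illustration of the contrast, the following C++ sketch places a subprogram without side effects next to one that has a side effect. Both names (square and next_ticket) are hypothetical; a counter is used instead of a random number generator so that the behaviour is easy to follow.

```cpp
// No side effect: the result depends only on the argument.
int square(int x)
{
    return x * x;
}

// Side effect: the subprogram updates state that outlives the call, so
// repeated calls with the same (empty) argument list return different values.
int next_ticket()
{
    static int last_issued = 0; // persists between calls
    last_issued += 1;
    return last_issued;
}
```

Two consecutive calls to square(3) give the same result, whereas two consecutive calls to next_ticket() never do.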

Subprograms may or may not return a value to their caller. In many languages, such as Pascal, a subprogram that returns a value is called a function, in analogy with the mathematical sense of the term (see denotational semantics), while a subprogram that returns no value is called a procedure. In some languages, such as C, this distinction is not observed; instead, a special datatype, such as C's void, denotes that the function returns no value. A specification of the datatypes accepted and returned corresponds to specifying a domain and codomain in the mathematical sense. See function.
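In C++, which follows C in this respect, both kinds of subprogram are written with the same syntax and differ only in the return type. The sketch below uses the hypothetical names sum and clear_all.

```cpp
#include <vector>

// A "function" in the Pascal sense: it returns a value to its caller.
int sum(const std::vector<int>& values)
{
    int total = 0;
    for (int v : values) {
        total += v;
    }
    return total;
}

// A "procedure" in the Pascal sense: the return type void says that no
// value is returned; the subprogram is called only for its effect.
void clear_all(std::vector<int>& values)
{
    for (int& v : values) {
        v = 0;
    }
}
```

The parameter and return types in each header play the role of the domain and codomain mentioned above.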

Sometimes the declaration of a subprogram is distinguished from its actual definition. A few languages, notably C and C++, support declarations called prototypes, which give the compiler information about a function that has yet to be defined. Some languages require subprograms to be defined or declared before they are called, while others do not.
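A prototype is what makes the following C++ sketch compile: is_even calls is_odd before the body of is_odd has been seen. The two function names are hypothetical and serve only as an example of mutually dependent subprograms.

```cpp
// Prototype (declaration): gives the compiler the name, parameter type
// and return type of is_odd before its definition has been seen.
bool is_odd(unsigned n);

// is_even can call is_odd because the prototype above is already visible.
bool is_even(unsigned n)
{
    return n == 0 ? true : is_odd(n - 1);
}

// Definition: supplies the body that the prototype announced.
bool is_odd(unsigned n)
{
    return n == 0 ? false : is_even(n - 1);
}
```

Without the first line, a C++ compiler would reject the call to is_odd inside is_even, because the name would not yet be declared.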

In early Fortran implementations, a subprogram's local variables are typically stored in statically allocated (global) space. Calling a subprogram then consists of storing its return address, jumping to the code of the subprogram, and finally jumping back to the return address. Today, most programming languages use stack-based allocation to store the return addresses of subprograms. Each time a subprogram is called, a record is pushed onto a stack (usually the system stack); this record is removed when the subprogram returns. This mechanism enables recursive calling of subprograms. See the section on the calling sequence for more details.
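Recursion depends on each call having its own record. In the following minimal C++ sketch (the function name factorial is chosen for illustration), every nested call holds its own copy of n, which would not be possible if n lived at a single fixed location.

```cpp
// Each call to factorial gets its own record on the stack, holding its
// own copy of n and its own return address, so calls can nest.
unsigned long factorial(unsigned n)
{
    if (n <= 1) {
        return 1; // innermost call: its record is removed first
    }
    // The record for this call stays on the stack while the inner call runs.
    return n * factorial(n - 1);
}
```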

Table of contents

1 Calling sequence (stack-based)

2 Parameter passing

2.1 Call by reference, call by value and call by name

3 Polymorphism

3.1 In-line expansion

4 Kinds of subprograms

5 See also

Calling sequence (stack-based)

When a subprogram is called, the first step is to allocate memory to hold the return address, local variables, parameters and other bookkeeping information. The space is allocated on the top of the stack and is called a stack frame or activation record. The size of the memory allocated for a particular subprogram can be fixed, but may vary if some of the subprogram's local variables are dynamically sized arrays.

In most implementations, the stack pointer (sp) is decremented, and the frame pointer is set to the starting address of the record. The addresses of local variables and other items can be calculated by adding an offset (determined at compile time) to the frame pointer. If the size of the record is fixed, the frame pointer is not needed, because a fixed offset from the stack pointer can be used instead. If the size or number of arguments is unknown, formal parameters are placed below fixed objects such as local variables and fixed-size arguments.

These operations are preferably done by the subprogram itself: if the caller performed them, the same code would appear at each calling point, resulting in a large body of redundant code. After or before this process, the program counter (pc) is set to the entry point of the subprogram.

The process of returning from a subprogram is simple. The relevant stack frame must be deleted, and the program counter must be reset to the point in the code from which the subprogram was called.
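The push-on-call, pop-on-return discipline can be imitated in ordinary code. The C++ sketch below is a simplified model, not a description of any real calling convention: a std::vector stands in for the system stack, the hypothetical Frame struct stands in for an activation record, and a boolean flag stands in for the return address. It computes 1 + 2 + ... + n the way a recursive subprogram would.

```cpp
#include <vector>

// One activation record for the hypothetical subprogram sum_to(n).
struct Frame {
    int n;        // parameter of this call
    bool resumed; // stands in for the return address:
                  // false = at entry, true = just after the nested call
};

int sum_to(int n)
{
    std::vector<Frame> stack; // plays the role of the system stack
    int return_value = 0;     // plays the role of a return-value register

    stack.push_back({n, false}); // call: push a frame
    while (!stack.empty()) {
        Frame& top = stack.back();
        if (top.n <= 0) {
            return_value = 0;  // base case
            stack.pop_back();  // return: delete the frame
        } else if (!top.resumed) {
            top.resumed = true;           // remember where to resume
            int next = top.n - 1;         // copy before the stack may grow
            stack.push_back({next, false}); // nested call: push a new frame
        } else {
            return_value += top.n; // use the nested call's result
            stack.pop_back();      // return: delete the frame
        }
    }
    return return_value;
}
```

Real implementations keep the frames in one contiguous memory area and address their contents by offsets from the frame pointer or stack pointer, but the order of pushes and pops is the same.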

Parameter passing

The way in which arguments are passed to subprograms is a complex issue. Most of the time, subprograms accept not just one argument but a list of them, as functions in mathematics do. The order of the list of arguments corresponds to that of the parameters; that is, the first argument is bound to the first parameter of the subprogram, and so on. Formal parameters are often given a datatype. The datatype of an argument is usually the same as that of the corresponding formal parameter, such as a simple integer, but the two can differ as long as they are compatible.

The more complex the task a subprogram performs, the more parameters it tends to have. This makes it difficult for human programmers to remember the number, order and types of the arguments. Named parameters attach a name tag to each argument so that the order of arguments becomes irrelevant. Default parameters are ones that can be omitted from a list of arguments; in that case, values or expressions defined beforehand are used. C++ supports default parameters, and the Ada programming language supports both.
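Default parameters in C++ can be sketched as follows. The function pad_left is hypothetical; C++ has no named parameters, so only defaults are shown.

```cpp
#include <string>

// 'width' and 'fill' have default values, so callers may omit them.
std::string pad_left(const std::string& text, std::size_t width = 8, char fill = ' ')
{
    if (text.size() >= width) {
        return text; // already wide enough, nothing to add
    }
    return std::string(width - text.size(), fill) + text;
}
```

A call such as pad_left("abc") uses both defaults, pad_left("abc", 5) overrides only the width, and pad_left("7", 3, '0') supplies every argument. Only trailing arguments can be omitted, which is one reason named parameters are useful in languages that have them.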

In many operating system APIs, a composite datatype is used instead of a list of parameters, because C, which is typically used in such system programming, supports neither named parameters nor default parameters. Advocates of object-oriented programming recommend decomposing such a complex subprogram into classes or objects with small methods.
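The composite-datatype approach might look like the sketch below. It is written in C++ rather than C, and WindowOptions and describe_window are hypothetical names that do not belong to any real operating system API; the default member initializers are a C++ convenience that plain C structures lack.

```cpp
#include <string>

// Hypothetical option bundle: one composite value replaces a long
// parameter list. Each field is set by name, in any order.
struct WindowOptions {
    std::string title = "untitled";
    int width = 640;
    int height = 480;
    bool resizable = true;
};

// Hypothetical subprogram taking the bundle; it only summarizes the
// options here instead of creating a real window.
std::string describe_window(const WindowOptions& options)
{
    return options.title + " " + std::to_string(options.width) + "x" +
           std::to_string(options.height) +
           (options.resizable ? " resizable" : " fixed");
}
```

A caller fills in only the fields it cares about, for example by setting title and resizable on a fresh WindowOptions value, and leaves the rest at their initial values.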

[TODO: Add information about out, in and out/in modifiers]

Call by reference, call by value and call by name

In computer science, call-by-something (formally, the parameter-passing mode) describes the way in which arguments are passed to subprograms. There are three main modes: "call-by-reference", "call-by-value" and "call-by-name".

In the call-by-reference mechanism, the caller passes a reference (typically a pointer) to each argument instead of a copy of it. When a subprogram that has been called by reference modifies one of its parameters, the object changes from the caller's perspective as well. Call-by-reference can be simulated in some call-by-value languages (for example, in C, one can pass a pointer to an object as an argument, instead of a copy of the object itself).

In the call-by-value mechanism, a copy of each argument is passed to the subprogram, which prevents the subprogram from changing the caller's arguments. In Visual Basic, the ByVal modifier states that a parameter is passed by value.
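C++ offers both mechanisms, as well as the pointer-based simulation used in C, so the three can be compared side by side. The function names in this sketch are hypothetical.

```cpp
// Call by value: 'n' is a copy, so the caller's variable is unchanged.
void increment_copy(int n)
{
    n += 1;
}

// Call by reference: 'n' is another name for the caller's variable.
void increment_in_place(int& n)
{
    n += 1;
}

// Simulated call by reference, as in C: the pointer itself is passed
// by value, but it gives access to the caller's variable.
void increment_through_pointer(int* n)
{
    *n += 1;
}
```

Starting from a variable x equal to 1, increment_copy(x) leaves x at 1, increment_in_place(x) changes it to 2, and increment_through_pointer(&x) changes it to 3.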

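In the call-by-name mechanism, the argument expression is not evaluated before the call; instead, it is re-evaluated each time the subprogram uses the corresponding parameter. ALGOL 60 is the best-known language with call-by-name parameters.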
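C++ has no call-by-name parameters, but the idea (an argument expression that is evaluated anew at every use of the parameter) can be approximated by passing a small function object, often called a thunk. The sketch below assumes this approximation; sum_by_name and sum_of_squares are hypothetical names.

```cpp
#include <functional>

// 'term' is a thunk: the caller's expression is re-evaluated on every
// use instead of once before the call. 'i' is passed by reference so
// that the expression sees each new value of the loop variable.
int sum_by_name(int& i, int low, int high, const std::function<int()>& term)
{
    int total = 0;
    for (i = low; i <= high; ++i) {
        total += term(); // evaluates the caller's expression with the current i
    }
    return total;
}

// Sums k * k for k = 1..n by passing the expression "k * k" unevaluated.
int sum_of_squares(int n)
{
    int k = 0;
    return sum_by_name(k, 1, n, [&k]() { return k * k; });
}
```

With call by value, the expression k * k would have been evaluated once, before the call, and the loop would have added the same number every time.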

Polymorphism

See polymorphism for a definition.

Subprograms support polymorphic behavior to a greater or lesser degree through overloading and overriding. Overloading provides several subprograms of the same name that differ in the datatypes of their arguments and, in some languages, of their return value. What differentiates them is called the signature. This kind of polymorphism is called ad hoc polymorphism. Overriding, in turn, replaces a subprogram with another one, most often a method in a subclass. When combined with late binding, overriding is, besides overloading, the basic means of achieving polymorphism in an object-oriented design. Both are usually considered fundamental aspects of object-oriented programming, and many sophisticated languages, such as the Ada programming language and functional languages, support them.
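Both techniques can be shown in a short C++ sketch. All names here (describe, Shape, Circle, name_of) are hypothetical.

```cpp
#include <string>

// Overloading: two subprograms share a name but differ in signature.
std::string describe(int) { return "int"; }
std::string describe(const std::string&) { return "string"; }

// Overriding: a subclass replaces a method of its base class.
struct Shape {
    virtual ~Shape() = default;
    virtual std::string name() const { return "shape"; }
};

struct Circle : Shape {
    std::string name() const override { return "circle"; }
};

// Late binding: the method is chosen from the object's run-time type,
// even though the parameter is declared as a reference to Shape.
std::string name_of(const Shape& s)
{
    return s.name();
}
```

The overload is selected at compile time from the argument type, while the overriding method is selected at run time from the type of the object.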

In-line expansion

In-line expansion is a technique in which the compiler inserts the body of a subprogram at the point where it is called, instead of generating a call. It is a better solution to the problems that macros are used to solve. The main drawback is that it results in a larger binary, which can actually hurt performance if it affects locality of reference.

Often, using in-line expansion instead of calling a subprogram reduces the overhead of subroutine calls dramatically. Unlike a macro, the expanded code is guaranteed to be semantically equivalent to a subroutine call, though subtle bugs may be introduced in some circumstances, such as a compiler bug. Besides, the definition of a subroutine with in-line expansion is usually identical to the definition of one without it, apart from an annotation requesting the expansion. In the Ada programming language, a significant comment, or pragma, states that the subroutine should be expanded if possible.

C++ is well known for supporting in-line expansion, which C originally lacked. Because in-line expansion is difficult to implement for polymorphic subroutines, not many programming languages support it. Some Java runtimes (notably the HotSpot compiler) support aggressive inlining based on actual run-time call patterns; this is a "best of both worlds" solution, since it inlines only the parts that are frequently used.

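A subprogram marked for in-line expansion can be written in C++ like this:

inline int
max (int a, int b)
{
  return a > b ? a : b;
}

// Illustrative caller; the name "larger" is not part of the original example.
int
larger (int x, int y)
{
  return max (x, y); // this is basically equivalent to "x > y ? x : y"
}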

Kinds of subprograms

Wrapper functions (or commonly just wrappers) are subprograms that simply call another one. They are used to change the order or datatypes of arguments, or to bridge the gap between new and old versions of an interface by having the old subprograms (referred to as deprecated) just call the new ones. The Win16 API functions in 32-bit Windows operating systems that only call Win32 API functions are an example. Their role is sometimes similar to that of the adapter pattern.
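A wrapper of the kind described might look like the following C++ sketch. Both subprograms are hypothetical: power stands for a newer interface, and old_power for a deprecated one that is kept only so that existing callers continue to work.

```cpp
// Newer subprogram: takes the exponent first and uses a wider type.
long long power(unsigned exponent, long long base)
{
    long long result = 1;
    for (unsigned i = 0; i < exponent; ++i) {
        result *= base;
    }
    return result;
}

// Older, deprecated interface kept as a wrapper: it only reorders and
// converts its arguments, then calls the newer subprogram.
// Assumes a non-negative exponent and a result that fits in an int.
int old_power(int base, int exponent)
{
    return static_cast<int>(power(static_cast<unsigned>(exponent), base));
}
```

The wrapper contains no logic of its own, so a fix made in power automatically reaches callers of old_power as well.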

See also

reentrant (literally, "able to be entered again") - a subprogram that can safely be called again while an earlier call is still in progress
Putpixel - common name for a subroutine that stores a value in a computer's video RAM, which is later displayed as a pixel on the computer's screen
parameter - further discussion of parameters
closure - a subprogram, together with the variables it captures, that can be handled as a value

All Wikipedia text is available under the terms of the GNU Free Documentation License
