Thomas Weideman

9

An algorithm is a set of steps – instructions – for a program to follow. What runs the program is not really an issue. The algorithm could be implemented by a person – something that we do every time that we provide directions. Or it could be a machine, in which case the machine is now a computer. In either case when we speak of the computer we are speaking of hardware – the physical components of the computer.

But we need to direct the hardware on how to implement our algorithm. We do this by providing a program. The program is what we commonly think of as software: the instructions that make the hardware perform for us.

For a person we just tell them the steps that we would like them to follow. By doing this we have in effect programmed them. But what about a machine? How do we program a machine?

Low Level or High Level Languages

The process of programming the early mechanical computers required adjusting gears or levers. The electronic computers replaced the gears with wires or switches. In either case it was a physically intensive process. It was also a time consuming process in that it could take days or weeks to program the computer to implement an algorithm.

Software:

The instruction set that controls the computer’s hardware.

“What you blame when it doesn’t work.”

These programs were also temporary. Once the programming was complete the algorithm could be run. When it was complete the programmers would reset the computer and start the process of programming the computer for the next algorithm needed.

This type of programming was often done using wires and switches. It could take weeks to program the computer for a single algorithm. Once programmed it might return a result in minutes, but then the programmers would start the process of rewiring the computer for the next program. If an engineer wanted to run another case or adjust a model they might have to wait weeks until it was once again their turn on the computer.

Portability:

The ability for a program to be run on multiple computers without changes.

This was a result of the computer’s design – it was all hardware. All of the tubes, wires, transistors, and circuits were a single unit. To program the computer the programmer would determine how the circuitry would interact and then enter the program using switches and cables. The commands were written directly in what would be considered machine code.

The speed and complexity of programming the computer was a major problem. But it was not the only one. The software lacked portability.

One-Trick Pony:

Someone or something that is only good for one particular purpose, or at doing one particular thing.

When a programmer created the algorithm for their analysis it was written for the single computer upon which it would be run. They could not write it once and then run the program on different computers. It lacked portability. The program was, in effect, a one-trick pony.

To speed the process of programming, and at the same time make the program portable, the software needed to be separated from the hardware. This would require a separate means of writing the program, loading it into the computer, and then having the computer run it. This was accomplished by the creation of a standardized form for writing the instructions. This list of commands could then be used to program the computer quickly – at least much more quickly than what was currently being done. Along with this came repeatability and portability. By using this formalized set of instructions a programmer could enter their algorithm into the computer – or any computer that would accept the same commands – and run the program. And if necessary, the computer could run it over and over and over again.

This was the first use of a programming language.

Low-Level Languages

The first programs were implemented in a low-level language. The most common low-level programs are written directly in machine code, a series of binary instructions made up of 0s and 1s.

Low-Level Languages:

A programming language that contains instructions that are directly readable by the computer.

A low-level language is directly readable by the computer which makes it difficult for a programmer to understand. The benefit is that the programs often run very efficiently; both fast and using much less memory.

While efficient, low-level languages are difficult to write programs in. One type of low-level language, machine code, requires the programmer to provide the instructions that directly control the computer’s Central Processing Unit (the CPU). This might involve directly allocating and controlling individual memory addresses, or performing specific tasks such as loading a value or performing an arithmetic operation. Machine code might be written in the binary code that will be executed by the CPU.

An example of the machine code that would print Hello, world is shown in the figure Printing Hello, World in Machine Code. This example is written in hexadecimal rather than binary, and it operates by allocating, assigning, and accessing individual memory locations by their address.

b8	21	0a	00	00	#moving “!\n” into eax
a3	0c	10	00	06	#moving eax into first memory location
b8	6f	72	6c	64	#moving “orld” into eax
a3	08	10	00	06	#moving eax into next memory location
b8	6f	2c	20	57	#moving “o, W” into eax
a3	04	10	00	06	#moving eax into next memory location
b8	48	65	6c	6c	#moving “Hell” into eax
a3	00	10	00	06	#moving eax into next memory location
b9	00	10	00	06	#moving pointer to start of memory location into ecx
ba	10	00	00	00	#moving string size into edx
bb	01	00	00	00	#moving “stdout” number to ebx
b8	04	00	00	00	#moving “print out” syscall number to eax
cd	80				#calling the linux kernel to execute our print to stdout
b8	01	00	00	00	#moving “sys_exit” call number to eax
cd	80				#executing it via linux sys_call

Printing Hello, World in Machine Code. Thanks to cedriczirtacic on GitHub for the code.
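To see why the comments in the listing talk about pieces of text, it may help to decode the data bytes as ASCII characters. The short sketch below does this in Python, which is used here only for illustration and is not part of the original listing. The function name decode_immediate is hypothetical, and the four chunks are simply the ASCII codes for the pieces named in the comments (“Hell”, “o, W”, “orld”, and “!\n” followed by two zero bytes).

```python
def decode_immediate(hex_bytes):
    """Turn a string of space-separated hex bytes into the ASCII text they encode."""
    return bytes.fromhex(hex_bytes).decode("ascii")

# ASCII codes for the four pieces of the message, in the order they sit in memory.
chunks = ["48 65 6c 6c", "6f 2c 20 57", "6f 72 6c 64", "21 0a 00 00"]

message = "".join(decode_immediate(chunk) for chunk in chunks)
text = message.rstrip("\x00")   # drop the two zero bytes used as padding

print(repr(text))               # the readable message
print(len(message))             # total bytes stored, padding included
```

The total length, padding included, is 16 bytes, which is the value 10 in hexadecimal that the listing moves into edx as the string size.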

Assembly Languages:

Assembly language is a low-level programming language in which the instructions in the language correspond to the architecture’s machine code instructions.

While a step above machine code, Assembly, which is still considered a low-level language, requires the use of an assembler, a platform dependent program that converts the program to machine code. Each statement would still be a direct action for the computer. This might include allocating memory, accessing variables, and performing calculations. The syntax of the commands is often cumbersome and confusing.

As an example, the code in the figure Printing Hello World in Assembly Language would be used to print Hello World on an x86_64 processor running Linux.

.global main /* Make main a global function */
main: /* Start function main */
mov $msg, %rdi /* RDI gets pointer to message */
call puts /* puts(msg) */
mov $0, %rax /* RAX gets zero */
ret /* return(0) */
msg: /* Declare a label for the string */
.asciz "Hello World" /* Define the string */

Printing Hello World in Assembly Language. Thanks to Jack Brennen of Google.

Programs written in a low-level language run directly on the computer’s processor. This, inherently, will make the program fast to run. But the difficulties may far outweigh the speed.

The challenge in writing a program using a low-level language is obvious from the two examples. Part of this complexity is a result of the need for the program to direct the computer to perform every action – nothing is already understandable by the computer. In effect, the program must assign each item to an appropriate place in memory and then perform the actions using the memory locations. Each time a new variable is created memory must be allocated for the variable. When the variable is deleted memory must be freed up, or deallocated.

But the issues of low-level languages are more than just memory management. For most programmers the problem is that there is no simple, easily understood, command that can be used to direct the computer.

Abstraction:

Abstraction is the process in which actions are derived from their context.

It is this aspect of the low-level language that is being described when it is said that low-level languages lack abstraction: no operations can be inferred from a command, because they are not already built into the language.

To address the challenges of programming in machine code, and allow for portability of the software, programmers created high-level languages.

High-Level Languages

Most modern programs are written in one of the many different high-level languages. The original goal was to create a language for writing programs that was more similar to the syntax of a spoken language.

High-Level Languages:

A programming language with strong abstraction.

With the simplicity of a readable programming language, high level languages have strong abstraction. This means that many of the details of making the computer operate are already built into the language.

As an example, recall the low-level assembly code in the figure Printing Hello World in Assembly Language that was needed to print the phrase Hello World. Now let’s write the same code but in MatLab.

fprintf('Hello World\n'); % Prints the string

Printing Hello World in MatLab

The code in the figure Printing Hello World in MatLab is still a bit cryptic. Why fprintf and not just print? Why the \n? These items will be addressed later but for now if you showed a non-programmer the line of MatLab code and asked them what it did they would probably say it prints the phrase Hello World.

The difference between the two is that in the Assembly code there was no abstraction – we needed to explicitly tell the computer how to read the letters of the phrase into memory, where to store them, and how they could be printed.

But in the high level language there is strong abstraction. All that we have to do is say print the phrase and the programming language takes care of all of the other details. The computer still needs direction on how to do each of these actions; it is just that the high level language has taken care of it for us.
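The same contrast can be sketched inside a single high-level language. The example below uses Python purely for illustration (the text's own example is in MatLab), and both function names are hypothetical. The first function spells out a few of the steps that a print command hides: turning each letter into its numeric code, placing the codes in a block of memory, and handing that block to standard output, which is file number 1 just as in the machine code listing. The second function relies on the abstraction built into the language.

```python
import os

def print_with_little_abstraction(text):
    """Spell out steps that a print command normally hides."""
    buffer = bytearray()                 # a block of memory for the message
    for character in text:
        buffer.append(ord(character))    # store each letter as its numeric code
    buffer.append(ord("\n"))             # the newline must be added by hand
    os.write(1, bytes(buffer))           # 1 is the file number of standard output
    return bytes(buffer)

def print_with_strong_abstraction(text):
    """Let the language take care of the details."""
    print(text)

print_with_little_abstraction("Hello World")
print_with_strong_abstraction("Hello World")
```

Both calls put the same phrase on the screen. Even the first version leans on a good deal of abstraction, since Python still manages the memory for the buffer.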

Types of Programming Languages

There are two characteristics that are used to identify modern programming languages: how they work, and how the programmer implements them. As for how the language functions, a language can be procedural or object oriented. This is known as the programming model.

The second characteristic addresses how the program is written or implemented. With respect to how the programmer writes the program, the programming language can be compiled or interpreted.

Programming Model

The programming model describes how the program operates. The two primary models are object-oriented programming, and procedural programming.

Object-oriented programming is based upon the creation of classes and objects. A class is a structure consisting of both variables and methods – smaller program instructions – whereas an object is an instance of a class. Thus there can be many different objects that are all of the same class. The objects then interact with both each other and with data streams outside of the program.

Object-oriented programs can be thought of as modular. Each aspect of the program is a separate module and the modules interact with each other.
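As a minimal sketch of the difference between a class and an object, consider the Python example below. Python is used only for illustration, and the Counter class and its names are hypothetical. The class bundles variables with a method, and two objects of that class each keep their own data.

```python
class Counter:
    """A class bundles variables (data) with methods (instructions)."""

    def __init__(self, name):
        self.name = name      # variable held by each object
        self.count = 0        # every object starts with its own count

    def increment(self):
        """Method: a small set of instructions that acts on the object's data."""
        self.count += 1
        return self.count

# Two objects (instances) of the same class; each keeps its own data.
apples = Counter("apples")
pears = Counter("pears")

apples.increment()
apples.increment()
pears.increment()

print(apples.name, apples.count)
print(pears.name, pears.count)
```

Incrementing one object leaves the other untouched, which is what allows each object to act as a separate module.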

While object-oriented programming has been around since the 1950s, it has only become a commonly used programming paradigm in the last twenty years when it has surpassed the use of procedural programming.

Procedural programming is the original programming language model. In this approach the programmer creates separate procedures, or functions, or methods. These functions are often compared to mathematical functions in that the user provides a set of inputs to the function. The function then performs a set of operations on the input data. The function may then perform an output action such as printing or writing to a file, or more commonly it will return – or send – a result back to the function that called it.

A goal of procedural programming is to implement top-down design. This is when the program is broken down into a set of tasks. Each of these individual tasks is then broken down into subtasks. This process is repeated until each task that needs to be done is trivial. At this point, the trivial tasks are written as separate functions. The program then consists of a driver function that calls other functions to complete tasks as they are needed.

The programming model describes how the language operates in solving a task, but there are also differences in how certain languages are written and run. Languages can be compiled or interpreted. This describes how the program is translated from text into something that can be run – or executed – on a computer.
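Top-down design can be sketched with a small example. The Python code below is an illustration only; the task (reporting the mean of a few numbers) and all of the function names are hypothetical. The task is split into three trivial subtasks, each written as its own function, and a driver function calls them in turn.

```python
def read_values():
    """Subtask 1: supply the input data (fixed here to keep the sketch self-contained)."""
    return [2.0, 4.0, 9.0]

def compute_mean(values):
    """Subtask 2: take inputs, perform operations, and return a result to the caller."""
    return sum(values) / len(values)

def format_report(mean):
    """Subtask 3: turn the result into text for output."""
    return f"Mean: {mean:.2f}"

def main():
    """Driver function: calls the other functions as each task is needed."""
    values = read_values()
    mean = compute_mean(values)
    report = format_report(mean)
    print(report)
    return report

main()
```

Each function receives inputs and returns a result to the function that called it, so any one subtask can be rewritten without touching the others.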

Program Translation

Translating a program is the process of converting the plain text that was written by the programmer into a form that can be run on the computer. There are two common types of program translation – compiling or interpreting.

Compiled Programming Languages

A program that is written in a compiled language will start with one or more source code files. This is a text file with the individual program commands. While every language is a bit different, the source code file must be written as a complete program. If the language requires a specific start command and an end command, the source code must include these.

\begin{tikzpicture}[auto]

%Place nodes

\node [startstop] (funcStart) {Start};

\node [functionProcess, below of=funcStart, node distance=5.5em] (readFile) {\parbox{0.30\textwidth}{\vspace{0.5em}\begin{center}Parse Source\\Code\end{center}\vspace{0.5em}}};

\node [functionDecision, below of=readFile, node distance=6.5em] (decision) {\parbox{1.0\textwidth}{\vspace{-0.0em}\centering Syntax\\Errors?}};

\node [functionProcess, below of=decision, node distance=7em] (processTrue) {\parbox{0.30\textwidth}{\begin{center}Identify Syntax\\ Errors\end{center}}};

\node [functionProcess, right of=processTrue, node distance=12em] (processFalse) {\parbox{0.30\textwidth}{\begin{center}Create Object\\ Code\end{center}}};

\node [functionProcess, below of=processFalse, node distance=6em] (Link) {\parbox{0.30\textwidth}{\begin{center}Link Object to\\ Executable\end{center}}};

\node [startstop, below of=processTrue, node distance=6em] (funcStop) {End};

%\node [right of=functionCall, node distance = 4em, inner sep=-2pt, minimum size=0pt] (node1) {};

\path [line] (funcStart) -- (readFile);

\path [line] (readFile) -- (decision);

\path [line] (decision) -- node {Yes} (processTrue);

\path [line] (decision) -| node {No} (processFalse);

\path [line] (processFalse) -- (Link);

\path [line] (processTrue) -- (funcStop);

\path [line] (Link) -- (funcStop);

%\node [draw, thick, dotted, fit=(funcStart) (recursiveFunctionCall) (funcStop), inner sep=2em](dottedBox) {};

\end{tikzpicture}

Compiling a Program

When the source code is in a complete form it can be run through a compiler. The compiler is a platform dependent program that converts the source code into an executable file. More specifically, the compiler is a translator in that it translates the high level language of the source code into the machine code needed to run the program on the computer.

The compiler first strips all non-executable content from the file. This includes any comments or white space. It then parses the file. Parsing is the process of reading the file into memory one character at a time while checking it for correct syntax. If an error is found the compiling process stops and usually error messages are sent back to the programmer.

If no syntax errors are found, the program is converted to object code. Object code is a file in which the program has been converted into binary machine code. The object code is then linked; that is, the compiler uses the object code to create an executable file. The executable is the program file; it can be run by the computer. It is platform dependent, meaning that it can only be run on the same type of computer on which it was compiled. A powerful aspect of the executable is that once it is created it can be run on multiple computers – as long as each computer is running the same operating system – without the need for the computers to have the compiler or the source code.

The process of compiling can be time consuming. It requires the program to be in a complete state. This might mean only that it has a start and an end. The entire program is parsed and linked. If there is a single error in the program the process stops without a single line of code being run. Once the syntax error has been identified and corrected, the compiling process can be started again.
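The all-or-nothing behaviour of parsing the whole program first can be sketched with Python's built-in compile() function. This is an illustration under stated assumptions, not a model of a full compiler: compile() checks the entire text for syntax errors before anything runs, but it produces Python bytecode rather than object code or an executable. The two-line program and the variable names are hypothetical.

```python
ran = []   # records which lines of the program actually executed

# A two-line program; the second line is missing its closing parenthesis.
source = 'ran.append("line 1")\nran.append("line 2"\n'

try:
    # compile() parses the entire text before anything is executed.
    program = compile(source, "<example>", "exec")
except SyntaxError:
    outcome = "syntax error found; nothing was run"
else:
    exec(program, {"ran": ran})
    outcome = "no syntax errors; program was run"

print(outcome)
print("lines executed:", ran)
```

Although the first line of the program is valid, it never runs, because the error on the second line stops the process before execution begins.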

Compiled Programs:

The paradigm of compiled computer programs is that they are slow to code, but fast to run.

While modern compilers are fast, compiling a long program can be time consuming. The upside is that the executable program does not need to be checked for syntax errors each time it is run. Further, it is usually optimized by the compiler to run efficiently on the machine on which it is compiled. This results in a common saying about coding in a compiled programming language: slow to code, but fast to run.

Compiled programming languages became the standard because of the speed of execution when the program is run. But there is an alternative: interpreted programming languages. While interpreted programs run more slowly than compiled ones, with today’s fast processors they have made a resurgence.

Interpreted Programming Languages

The alternative to writing a program in a compiled programming language is to use an interpreted programming language. While compiled programs and interpreted programs may appear the same to the user, to the programmer they are very different.

The first difference in an interpreted program is semantic. The compiled program was written as source code. But the interpreted programming languages use a script. The reason is how they function. The script can be compared to a script that might be used in a play or a movie.

Run Lines:

In the theatre, when an actor is practicing the play with only one other person it is said they run lines. Similarly, the computer will run lines when it executes the script.

In a play script each line is either a dramatic action – something said – or a physical action – something done. While the actor performs that action the other actors are theoretically idle – they wait. This is similar to how the script is run.

\begin{tikzpicture}[auto]

%Place nodes

\node [startstop] (funcStart) {Start};

\node [functionDecision, below of=funcStart, node distance=7.5em] (decision) {\parbox{1.0\textwidth}{\vspace{-0.0em}\centering Lines\\To Run?}};

\node [startstop, left of=decision, node distance=9em] (funcStop1) {End};

\node [functionProcess, right of=decision, node distance=10.5em] (parseLine) {\parbox{0.20\textwidth}{\vspace{0.5em}\begin{center}Parse Next\\Line\end{center}\vspace{0.5em}}};

\node [functionDecision, below of=parseLine, node distance=6.5em] (checkLine) {\parbox{1.0\textwidth}{\vspace{-0.0em}\centering Syntax\\Error?}};

\node [functionProcess, right of=checkLine, node distance=10em] (idSyntax) {\parbox{0.20\textwidth}{\begin{center}Identify\\Error\end{center}}};

\node [startstop, below of=idSyntax, node distance=6em] (earlyStop) {End};

\node [functionProcess, below of=checkLine, node distance=8em] (machineCode) {\parbox{0.20\textwidth}{\begin{center}Translate to\\Machine\\Code\end{center}}};

\node [functionProcess, below of=machineCode, node distance=7em] (runLine) {\parbox{0.20\textwidth}{\begin{center}Execute\\Line\end{center}}};

% \node [startstop, below of=runLine, node distance=6em] (funcStop) {End};

%\node [right of=functionCall, node distance = 4em, inner sep=-2pt, minimum size=0pt] (node1) {};

\path [line] (funcStart) -- (decision);

%\path [line] (decision) -- (readLine);

\path [line] (decision) -- node {Yes} (parseLine);

\path [line] (decision) -- node {No} (funcStop1);

\path [line] (parseLine) -- (checkLine);

\path [line] (machineCode) -- (runLine);

\path [line] (checkLine) -- node {No} (machineCode);

%\path [line] (moreLines) -- node {Yes} (parseLine);

%\path [line] (processFalse) -- (Link);

\path [line] (checkLine) -- node {Yes} (idSyntax);

\path [line] (idSyntax) -- (earlyStop);

\path [line] (runLine) -| (decision);

%\node [draw, thick, dotted, fit=(funcStart) (recursiveFunctionCall) (funcStop), inner sep=2em](dottedBox) {};

\end{tikzpicture}

Interpreting a Program

Programs written in an interpreted programming language act on a single line at a time. To do this, the program requires an interpreter to translate the line of code to machine code to be run.

The interpreter is a program written for a specific operating system. In its simplest form, it opens the computer script and runs the program. More specifically, the interpreter reads the script into memory. It then parses the first line of the program. If it finds a syntax error, the interpreter stops the program and prints an error message. But if it does not find a syntax error the interpreter translates the single line of the script into machine language and runs the line. If there are more lines of code in the script, the interpreter goes to the next line and repeats the process. This continues until it either finds an error or there are no more lines of code in the script.
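The loop described above can be sketched as a toy interpreter. The Python code below is a simplified illustration, not how any real interpreter is built: the function run_script and the script command show are hypothetical, each statement is assumed to fit on one line, and compile() produces Python bytecode rather than machine code.

```python
def run_script(script):
    """Toy interpreter loop: parse one line, translate it, run it, repeat."""
    output = []
    namespace = {"show": output.append}    # hypothetical command available to the script
    for number, line in enumerate(script.splitlines(), start=1):
        try:
            translated = compile(line, f"<line {number}>", "exec")  # parse and translate
        except SyntaxError:
            output.append(f"syntax error on line {number}")         # identify the error
            break                                                   # and stop the program
        exec(translated, namespace)                                 # execute the line
    return output

script = "x = 2\nshow(x + 3)\nshow(x +\nshow('never reached')"
print(run_script(script))
```

The first two lines run and produce a result before the error on the third line is ever seen. This is the key contrast with a compiler, which would reject the whole program before running anything.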

Interpreted Programs:

The paradigm of interpreted computer programs is that they are fast to code, but slow to run.

One way to view interpreted programs is that the parsing and translating are performed on each line of the program every time the script is run. There is no executable created; instead the script is the program. Historically this meant that the program would run more slowly than would a compiled program. But there was an upside. The interpreter only needed a single line of code to run. The process of writing a script becomes incremental. You write a line and run the program. If it works, you write another line of code, building the program one line at a time. Each time the interpreter is run, all the previous code has been checked, and while it is parsed a second time, you already know that it is correct and will not need to be updated. This makes the process of coding faster than with a compiled language. Thus fast to code, but slow to run.

Note:

The idea that the interpreted programs are slow to run has become less of an issue with the faster processors in modern computers. While they still run more slowly than do compiled programs, the differences are much less noticeable.

Another difference between compiled and interpreted programs is with respect to platform independence. Recall that a compiled program is optimized to run on a single platform. Thus you cannot run an executable on just any computer. But the script for an interpreted program is plain text and can be run on any computer that has the proper interpreter installed. Thus interpreted programs are considered to be platform independent.

Summary

An algorithm is the process we follow to complete a task. To implement the algorithm we write a program. But to get the program to run on a computer requires a specific syntax, and thus a specific programming language.

While mechanical computers would be programmed by physically manipulating gears or levers, programs for electronic computers consist of a set of instructions that the computer can follow.

In its most basic form – the low level language – these instructions are directly readable by the computer, for example machine language, and control the computer’s central processing unit. In this type of language every step must be addressed explicitly. The difficulty in this type of programming becomes obvious for all but the most trivial programs.

High level languages have been designed to make the programming process more accessible. The goal is to enable the programmer to enter the commands in a syntax similar to that of a spoken language. If the programmer wants to add two values, the program tells the computer to add them. To print the result, the program would use a command that simply tells the computer to print.

While the idea of the many different high level languages is the same, the means of writing the program varies widely. With respect to the language model, they can be object-oriented or procedural. As for how the computer runs – or translates – the program, there are compiled programming languages and interpreted languages.

Object-oriented programming languages allow the programmer to create modules, sometimes described as small programs, that interact with each other or with devices outside of the computer. In a procedural programming language, the programmer creates separate routines, or functions or methods, to perform individual tasks.

Whether object-oriented or procedural, the program still needs to be translated into machine code to be run. Some languages do this by compiling source code while others interpret a script.

Compiled programming languages will parse the entire source code, identifying any syntax errors that might exist. If there are none, the compiler then translates the plain text source code file into object code. The object code is then linked to create an executable file. The executable is optimized binary machine code that can be run directly on the computer without further processing. The executable is platform dependent, meaning that it can only be run on the same type of computer on which it was compiled.

The process of compiling a program often makes the programming process time intensive. But once compiled the executable is efficient. Thus the paradigm of compiled languages as slow to code, but fast to run.

Interpreted programming languages require an interpreter that is run on the computer. The interpreter reads the plain text script – the program – into memory. It then parses a single line, and if there are no errors it converts the line of the script to machine code and runs it. It then goes on to the next line and the next until it either finds an error or finishes the program.

The process of parsing a single line and then running it makes the programming process a much faster endeavor. The programmer can write a single line of code, test it, and if it is correct add the next and the next. This can greatly speed up the programming process. The tradeoff is that each line must be parsed and translated individually every time the script is run. Thus interpreted languages are said to be fast to code, but slow to run.

While the interpreter must be written for a specific type of computer, the script need not be. Since interpreters are available for most of the common computer types, a script written on one computer can usually be run on any other computer regardless of whether it is the same type. Thus interpreted languages are considered platform independent.

MatLab is an interpreted programming language. MatLab programs can be written in an object-oriented style, but they can also be written in a procedural style.

License

Creative Commons Attribution 4.0 International License