Electrical Engineering is all about abstraction. We hide complex details and features and work on high-level objects to make our work easier. However, this also means that most often, we do not know how something works internally, even though we might know how to use it, like op-amps.
And one place, where we probably see the most abstraction is in the microcontroller/processor area.
I have been studying the hardware behind programmable chips recently to figure out how they work, in particular, how a program is converted into hardware signals. And it is a mess down there!
As you break down more and more layers of abstraction, it feels like you are opening Pandora’s Box and all sorts of weird and complex architectures and processing units come up.
But at the same time, it is beautiful. I mean, how does a program that we write in C, say, to add two numbers, cause changes in the chip at the hardware level that make certain pins go high and low and finally give us something as simple, and as complex, as the sum?
While studying that, I came across three terms that, although often neglected, are highly important when it comes to understanding and designing microprocessor architectures: Instruction Set, RISC, and CISC. But before we learn about those, we have to understand how a microprocessor really works.
Imagine you are training a dog. You teach it to understand certain specific instructions like ‘sit’ or ‘fetch’ and then associate those words with certain tasks. So that later on, when you say ‘fetch’, and throw a ball, the dog will fetch it.
A CPU (and I will use this term to mean both microcontrollers and processors in general) works in a similar way. There are certain instructions that the CPU knows and when we give them those instructions, different transistors inside it switch ON and OFF to perform those instructions.
The instructions that we input are in the form of 1’s and 0’s, known as machine code, where each instruction is identified by an opcode. Since it is hard for us to remember combinations of 1’s and 0’s, we tend to use shorthands for those instructions, called assembly language, and an assembler converts them into machine code.
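As a rough sketch of what an assembler does, it essentially maps each mnemonic onto its binary opcode. The mnemonics and opcode values below are hypothetical, purely for illustration:

```python
# Toy assembler: maps made-up mnemonics to made-up 8-bit opcodes.
# These opcode values are NOT from any real CPU; they are illustrative only.
OPCODES = {
    "LOAD":  0b00000001,
    "STORE": 0b00000010,
    "ADD":   0b00000011,
    "MUL":   0b00000100,
}

def assemble(mnemonic: str) -> str:
    """Return the opcode for a mnemonic as an 8-bit binary string."""
    return format(OPCODES[mnemonic], "08b")

print(assemble("MUL"))  # -> 00000100
```

A real assembler also encodes the operands (registers, addresses, immediates) into the instruction word, but the idea is the same: human-readable shorthand in, bits out.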
The number of instructions that a particular CPU can have is limited and the collection of all those instructions is called the Instruction Set.
The Instruction Set is very important. High-level programming languages are designed around it, and a good co-design of the hardware and the instruction set can determine how fast the CPU is.
The performance of a CPU is the number of programs it can run in a given time: the more programs it can run in that time, the faster the CPU is.
Performance is determined by the number of instructions a program has (more instructions means more time to perform them) and by the number of clock cycles each instruction takes (to know more, go here).
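This relationship is the classic CPU performance equation: execution time = instructions per program × cycles per instruction × seconds per clock cycle. A quick sketch with made-up numbers:

```python
def execution_time(instructions: int, cpi: float, clock_hz: float) -> float:
    """Classic CPU performance equation:
    time = instruction count * cycles per instruction * clock period."""
    return instructions * cpi / clock_hz

# Hypothetical numbers: a 1,000,000-instruction program,
# 2 cycles per instruction, on a 100 MHz clock.
t = execution_time(1_000_000, 2.0, 100e6)
print(t)  # -> 0.02 seconds
```

Halving either the instruction count or the cycles per instruction halves the execution time, which is exactly the lever that CISC and RISC each pull on.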
This means that there are only two ways to improve the performance: Either minimize the number of instructions per program or reduce the number of cycles per instruction.
We cannot easily do both, as they pull in opposite directions; optimizing one tends to sacrifice the other. And the optimizations that we have to make are embedded deep in the instruction set and the hardware of the CPU.
It is because of this that the CPU industry is divided between two very big camps, each backing one of the two techniques. While many Intel CPUs are based on the CISC architecture, Apple’s recent CPUs and ARM devices have RISC architectures under the hood.
CISC is shorthand for Complex Instruction Set Computer. The CISC architecture tries to reduce the number of instructions that a program has, thus optimizing the instructions-per-program part of the above equation. This is done by combining many simple instructions into a single complex one.
In the dog analogy, “Fetch” can be thought of as a CISC instruction. When a dog “Fetches” a ball, it is actually doing a series of instructions that include: “Follow the ball” then “Pick it up” followed by “Go back to human” and finally, “Give human the ball”.
It is obvious that giving a dog a single “fetch” instruction is easier and faster than giving it four separate instructions. And this is why initial CPU manufacturers like Intel designed CISC processors.
To illustrate a CISC instruction, let’s take the MUL instruction.
This instruction takes two operands: the memory locations of the two numbers to multiply. It then performs the multiplication and stores the result in the first memory location.
MUL 1200, 1201
Where MUL takes the value from either two memory locations (say 1200 and 1201) or two registers, finds their product and stores the result in location 1200.
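To make the semantics concrete, here is a sketch in Python of what such a memory-to-memory MUL does. The memory layout and the addresses 1200 and 1201 are hypothetical:

```python
# Simulated memory: address -> value. Addresses and contents are made up.
memory = {1200: 6, 1201: 7}

def cisc_mul(addr_a: int, addr_b: int) -> None:
    """CISC-style MUL: read both operands from memory, multiply them,
    and write the product back to the first address -- all as ONE instruction."""
    memory[addr_a] = memory[addr_a] * memory[addr_b]

cisc_mul(1200, 1201)
print(memory[1200])  # -> 42
```

The key point is that a single instruction does the loads, the multiply, and the store; the hardware decoder has to orchestrate all of that internally.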
This reduces the amount of work that the compiler has to do, as the instructions themselves are very high level. The instructions take up very little RAM, and most of the work is done by the hardware while decoding them.
Since in a CISC-style instruction the CPU has to do more work per instruction, clock speeds are slightly slower. Moreover, there are fewer general-purpose registers, as more transistors need to be used to decode the instructions.
On the other hand, in a Reduced Instruction Set Computer (RISC) architecture, programs need more instructions, but each instruction takes fewer cycles to perform; generally, a single instruction in a RISC machine will take only one CPU cycle. This might be a “sit” instruction that we give to a dog.
Multiplication in a RISC architecture cannot be done with a single MUL-like instruction. Instead, we have to first load the data from memory using the LOAD instruction, then multiply the numbers, and then store the result back in memory.
LOAD A, 1200
LOAD B, 1201
MUL A, B
STORE 1200, A
Here the LOAD instruction copies the data from a memory location like 1200 into a register A or B. MUL multiplies the values in the two registers and stores the result in A. Then, finally, we store the value of A back into 1200 (or any other memory location). Note that in RISC architectures, we can only perform operations on registers and not directly on memory; this is known as a load/store architecture.
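The same multiplication can be sketched in Python as a load/store sequence, with each function standing in for one single-cycle instruction (registers and addresses are hypothetical, as above):

```python
# Simulated memory and register file. Contents are made up for illustration.
memory = {1200: 6, 1201: 7}
registers = {"A": 0, "B": 0}

def load(reg: str, addr: int) -> None:
    """LOAD reg, addr -- copy a value from memory into a register."""
    registers[reg] = memory[addr]

def mul(dst: str, src: str) -> None:
    """MUL dst, src -- multiply two registers; result goes into dst.
    Note: operates ONLY on registers, never directly on memory."""
    registers[dst] = registers[dst] * registers[src]

def store(addr: int, reg: str) -> None:
    """STORE addr, reg -- copy a register's value back into memory."""
    memory[addr] = registers[reg]

# LOAD A, 1200 / LOAD B, 1201 / MUL A, B / STORE 1200, A
load("A", 1200)
load("B", 1201)
mul("A", "B")
store(1200, "A")
print(memory[1200])  # -> 42
```

Four instructions instead of one, but each is trivial for the hardware to decode and execute.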
This might seem like a lot of work, but in reality, since each of these instructions only takes up one clock cycle, the whole multiplication operation is completed in fewer clock cycles (I discuss more on this later).
However, the time advantage is not without its disadvantages. Since RISC has a simpler instruction set, complex high-level operations need to be broken down into many instructions by the compiler. While the instructions are simple and don’t need complex hardware to decode, it is the compiler’s job to break complex high-level programs down into many simple instructions.
This puts a lot of stress on the software and the software designers while reducing the work needed to be done by the hardware.
Since the decoding logic is simple, fewer transistors are required, and a greater number of general-purpose registers can be fit into the CPU.
While CISC tries to complete an action in as few lines of assembly code as possible, RISC tries to reduce the time taken for each instruction to execute.
For example, the MUL operation on two 8-bit numbers in registers on the 8086, which is a CISC device, can take as many as 77 clock cycles (link), whereas the complete multiplication operation on a RISC device like a PIC takes only 38 cycles (link)! That makes it nearly twice as fast!
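Using those figures (which come from the respective datasheets and depend heavily on operands and conditions), the speedup works out to roughly 2x:

```python
cisc_cycles = 77  # 8086 8-bit register MUL, worst case (per the linked reference)
risc_cycles = 38  # PIC multiplication routine (per the linked reference)

speedup = cisc_cycles / risc_cycles
print(round(speedup, 2))  # -> 2.03
```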
P.S.: Although I should be fair and say that it is not always possible to fairly compare two different kinds of microprocessors and microcontrollers and there might be other factors at play here.
Since CISC instructions take more cycles to execute, parallelism and pipelining of instructions is much harder. In RISC, however, since all instructions take one cycle, pipelining instructions is easier.
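A rough way to see the pipelining benefit: with single-cycle instructions and a classic 5-stage pipeline, N instructions finish in about N + 4 cycles instead of 5N, because a new instruction enters the pipeline every cycle. This is an idealized model that ignores hazards and stalls:

```python
def unpipelined_cycles(n_instructions: int, stages: int = 5) -> int:
    """Each instruction occupies the whole datapath for all of its stages."""
    return n_instructions * stages

def pipelined_cycles(n_instructions: int, stages: int = 5) -> int:
    """Idealized pipeline: once filled, one instruction completes per cycle."""
    return n_instructions + (stages - 1)

print(unpipelined_cycles(100))  # -> 500
print(pipelined_cycles(100))    # -> 104
```

Variable-length, multi-cycle CISC instructions break this neat one-per-cycle rhythm, which is why pipelining them takes much more hardware effort.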
However, the compiler plays an important role in RISC systems, and its ability to perform this “code expansion” can hinder performance.
SO WHICH IS BETTER?
There is really no “better” architecture, each has its own advantages and disadvantages that make it useful in different applications. CISC is most often used in automation devices whereas RISC is used in video and image processing applications.
When microprocessors and microcontrollers were first being introduced, they were mostly CISC. This was largely because of the lack of software support for RISC development at the time.
Later a few companies started delving into the RISC architecture, most notably, Apple, but most companies were unwilling to risk it (pun intended) with an emerging technology.
Fast forward a few decades, and the CISC architecture was becoming extremely unwieldy and difficult to improve and develop. Intel, however, had a lot of resources at hand and was able to plow through most of the major roadblocks. They did this mainly to keep all their hardware and software backward compatible with their original 8086 processors.
Nowadays, the boundary between RISC and CISC architectures is very blurred, and in most cases it is not important. ARM devices, PICs, and almost all smartphone manufacturers use RISC devices, as they are faster and less resource- and power-hungry. The only largely CISC design still in widespread use is probably the Intel x86 series.