How does a CPU work?

The CPU, also known as the microprocessor, is the heart and/or brain of a computer. Let’s dive deep into the core of the computer and understand how a CPU works, which will help us write computer programs efficiently.

A tool is usually more simple than a machine; it is generally used with the hand, whilst a machine is frequently moved by animal or steam power.

– Charles Babbage

A computer is a machine powered mostly by electricity, but its flexibility and programmability have helped it achieve the simplicity of a tool.

The CPU is the heart and/or the brain of a computer. It executes the instructions that are provided to it. Its main job is to perform arithmetic and logical operations and to orchestrate the instructions together. Before diving into the details, let’s start by looking at the main components of a CPU and what their roles are:

Two main components of a CPU (processor)

  • Control unit — CU
  • Arithmetic and logical unit — ALU

Control Unit — CU

The control unit (CU) is the part of the CPU that helps orchestrate the execution of instructions. It tells the other parts what to do. According to the instruction, it activates the wires connecting the CPU to the other parts of the computer, including the ALU. The control unit is the first component of the CPU to receive an instruction for processing.

There are two types of control unit:

  • hardwired control units.
  • microprogrammable (microprogrammed) control units.

Hardwired control units are implemented in hardware and need a change in the hardware to modify how they work, whereas a microprogrammed control unit can be programmed to change its behavior. A hardwired CU is faster at processing instructions, whereas a microprogrammed CU is more flexible.

Arithmetic and logical unit — ALU

The arithmetic and logical unit (ALU), as the name suggests, does all the arithmetic and logical computations. The ALU performs operations like addition and subtraction. It consists of logic circuitry, or logic gates, that perform these operations.

Most logic gates take in two inputs and produce one output.

Below is an example of a half adder circuit that takes in two inputs and outputs the result. Here A and B are the inputs, S is the sum output and C is the carry.

Half adder
Source: https://en.wikipedia.org/wiki/Adder_(electronics)#/media/File:Half_Adder.svg
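The half adder can be sketched in a few lines of Python. This is only an illustration of the logic, not of real hardware: the function name `half_adder` is made up for this example, and the XOR and AND gates are modelled with Python's bitwise operators.

```python
def half_adder(a: int, b: int) -> tuple[int, int]:
    """Return (S, C) for two 1-bit inputs A and B."""
    s = a ^ b  # an XOR gate produces the sum bit S
    c = a & b  # an AND gate produces the carry bit C
    return s, c


# Build the full truth table as rows of (A, B, S, C).
truth_table = [(a, b, *half_adder(a, b)) for a in (0, 1) for b in (0, 1)]
for row in truth_table:
    print(row)
```

Only the input pair 1 and 1 produces a carry, which is exactly the case where the sum no longer fits in a single bit.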

Storage — Registers and Memory

The main job of the CPU is to execute the instructions provided to it. To process these instructions, most of the time it needs data. Some of this data is intermediate data, some of it is input and the rest is output. This data, along with the instructions, is stored in the following kinds of storage: registers and RAM.


A register is a small set of places where data can be stored. A register is a combination of latches. Latches, which are closely related to flip-flops, are combinations of logic gates that store 1 bit of information.

A latch has two input wires, a write wire and a data input wire, and one output wire. We can enable the write wire to change the stored data. When the write wire is disabled, the output always remains the same.

NOR gate
An SR latch, constructed from a pair of cross-coupled NOR gates
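The behavior of a latch with a write wire and a data wire can be sketched as follows. This is a behavioral model only (it does not simulate the cross-coupled NOR gates), and the names `GatedLatch` and `Register` are assumptions made for this example. The register is simply built from several 1-bit latches that share one write wire.

```python
class GatedLatch:
    """Behavioral model of a 1-bit latch with a data input and a write input."""

    def __init__(self) -> None:
        self.output = 0

    def step(self, data: int, write: int) -> int:
        # The stored bit only changes while the write wire is enabled.
        if write:
            self.output = data & 1
        return self.output


class Register:
    """A register modelled as a row of latches sharing one write wire."""

    def __init__(self, width: int = 8) -> None:
        self.latches = [GatedLatch() for _ in range(width)]

    def step(self, value: int, write: int) -> int:
        # Feed bit i of the value into latch i and reassemble the outputs.
        bits = [latch.step((value >> i) & 1, write)
                for i, latch in enumerate(self.latches)]
        return sum(bit << i for i, bit in enumerate(bits))


latch = GatedLatch()
first = latch.step(data=1, write=1)   # write enabled: the latch stores 1
second = latch.step(data=0, write=0)  # write disabled: the output stays 1
third = latch.step(data=0, write=1)   # write enabled again: the latch stores 0

register = Register(width=8)
stored = register.step(0b10110010, write=1)
kept = register.step(0b00000000, write=0)
print(first, second, third, stored, kept)
```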

The CPU has registers to store its output data. Because this is intermediate data, sending it to the main memory (RAM) would be slow. Instead, the data is sent to other registers, which are connected by a bus. A register can store an instruction, output data, a storage address or any other kind of data.


RAM is a collection of registers arranged and packed together in an optimized way so that it can store a larger amount of data. RAM (Random Access Memory) is volatile, and its data gets lost when we turn off the power. Since RAM is a collection of registers, to read or write data it takes an address input (8 bits in this example), a data input for the actual data to be stored and, finally, read and write enable inputs, which work the same way as they do for latches.
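As a rough sketch of that interface, the following Python class models a RAM with an 8-bit address, a data input and read/write enable inputs. The class name `SimpleRAM`, the method name `access` and the choice of 8-bit cells are assumptions made for illustration; real RAM is of course built from hardware cells, not a Python list.

```python
from typing import Optional


class SimpleRAM:
    """Toy RAM: 2**address_bits cells, each holding one 8-bit value."""

    def __init__(self, address_bits: int = 8) -> None:
        self.cells = [0] * (2 ** address_bits)

    def access(self, address: int, data_in: int = 0,
               write_enable: bool = False,
               read_enable: bool = False) -> Optional[int]:
        if not 0 <= address < len(self.cells):
            raise IndexError("address does not fit in the address input")
        if write_enable:
            # Like a latch, a cell only changes while write is enabled.
            self.cells[address] = data_in & 0xFF
        if read_enable:
            return self.cells[address]
        return None


ram = SimpleRAM(address_bits=8)
ram.access(0b00001000, data_in=42, write_enable=True)      # store 42 at address 8
ignored = ram.access(0b00001000, data_in=99)               # no enable: nothing happens
value = ram.access(0b00001000, read_enable=True)           # read it back
print(len(ram.cells), value)
```

With an 8-bit address there are 2⁸ = 256 addressable cells, which is why the size of the address input limits how much memory can be addressed.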

What are Instructions

An instruction is the most granular level of computation a computer can perform. There are various types of instructions a CPU can process.

Instructions include:

  • Arithmetic such as add and subtract
  • Logic instructions such as and, or, and not
  • Data instructions such as move, input, output, load, and store
  • Control flow instructions such as goto, if … goto, call, and return
  • Halt, which notifies the CPU that the program has ended

Instructions are provided to a computer using assembly language, are generated by a compiler, or are interpreted in some high-level languages.

These instructions are hardwired inside the CPU. The ALU handles the arithmetic and logical instructions, whereas the control flow is managed by the CU.

In the simplest case a computer performs one instruction per clock cycle, but modern computers can perform more than one.

The group of instructions a computer can perform is called its instruction set.

CPU clock

Clock cycle

The speed of a computer is largely determined by its clock rate, which is the number of clock cycles per second the computer works at. A single clock cycle is very short, around 250 × 10⁻¹² sec. The higher the clock rate, the faster the processor is.

The CPU clock rate is measured in GHz (gigahertz). 1 GHz is equal to 10⁹ Hz (hertz). One hertz means one cycle per second, so 1 gigahertz means 10⁹ cycles per second.

The higher the clock rate, the more instructions the CPU can execute per second.

Clock cycle time = 1 / clock rate
CPU time = number of clock cycles / clock rate

This means that to improve CPU time we can either increase the clock rate or decrease the number of clock cycles by optimizing the instructions we provide to the CPU. Some processors provide the ability to increase the clock rate, but since this is a physical change it may cause overheating and even smoke or fire.
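The two formulas can be tried out with some numbers. The clock rates and cycle counts below are invented purely for illustration, and the function names are assumptions of this sketch.

```python
def clock_cycle_time(clock_rate_hz: float) -> float:
    """Clock cycle time = 1 / clock rate (in seconds)."""
    return 1.0 / clock_rate_hz


def cpu_time(clock_cycles: float, clock_rate_hz: float) -> float:
    """CPU time = number of clock cycles / clock rate (in seconds)."""
    return clock_cycles / clock_rate_hz


rate = 4e9                                # hypothetical 4 GHz processor
cycle = clock_cycle_time(rate)            # about 250 × 10⁻¹² sec
baseline = cpu_time(8e9, rate)            # hypothetical program needing 8 × 10⁹ cycles
fewer_cycles = cpu_time(4e9, rate)        # same clock, half as many cycles
faster_clock = cpu_time(8e9, 5e9)         # same cycles, clock raised to 5 GHz

print(f"cycle time:   {cycle:.3e} s")
print(f"baseline:     {baseline:.2f} s")
print(f"fewer cycles: {fewer_cycles:.2f} s")
print(f"faster clock: {faster_clock:.2f} s")
```

In this made-up case, halving the number of cycles helps more than the 25% clock increase, which is the reason optimizing the program is usually the first thing to try.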

How does an instruction get executed

Instructions are stored in RAM in sequential order. For a hypothetical CPU, an instruction consists of an OP code (operation code) and a memory or register address.

There are two registers inside the control unit: the instruction register (IR), which loads the OP code of the instruction, and the instruction address register, which loads the address of the currently executing instruction. There are other registers inside the CPU that store the value found at the address given by the last 4 bits of an instruction.

Let’s take an example of a set of instructions that adds two numbers. The following are the instructions along with their descriptions. The CPU works by executing these instructions:

STEP 1 — LOAD_A 8:

The instruction is initially saved in RAM as, let’s say, <1100 1000>. The first 4 bits are the op-code, which determines the instruction. This instruction is fetched into the IR of the control unit. The instruction is decoded as load_A, which means it needs to load the data at address 1000 (the last 4 bits of the instruction) into register A.


Similar to the step above, the next instruction loads the data at memory address 2 (0010) into CPU register B.


Now the next instruction is to add these two numbers. Here the CU tells the ALU to perform the add operation and save the result back to register A.


This is a very simple set of instructions that helps add two numbers.

We have successfully added two numbers!
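The fetch, decode and execute steps above can be sketched as a tiny simulator. Only the op-code 1100 for load_A comes from the example above. The op-codes chosen for LOAD_B, ADD and HALT, the placement of the program at addresses 4 to 7, and the values 7 and 5 stored at addresses 8 and 2 are all assumptions made so that the sketch is runnable.

```python
LOAD_A = 0b1100  # from the example above
LOAD_B = 0b1101  # assumed op-code
ADD = 0b1110     # assumed op-code
HALT = 0b1111    # assumed op-code


def run(ram: list, start: int) -> int:
    """Run the program in ram from address start; return register A at HALT."""
    reg_a = reg_b = 0
    address = start  # plays the role of the instruction address register
    while True:
        instruction = ram[address]       # fetch into the instruction register
        opcode = instruction >> 4        # decode: the first 4 bits are the op-code
        operand = instruction & 0b1111   # decode: the last 4 bits are the address
        address += 1
        if opcode == LOAD_A:
            reg_a = ram[operand]
        elif opcode == LOAD_B:
            reg_b = ram[operand]
        elif opcode == ADD:
            reg_a = (reg_a + reg_b) & 0xFF  # the ALU adds; the result goes to A
        elif opcode == HALT:
            return reg_a
        else:
            raise ValueError(f"unknown op-code {opcode:04b}")


ram = [0] * 16        # 4-bit addresses give 16 memory cells
ram[8] = 7            # data for register A (assumed value)
ram[2] = 5            # data for register B (assumed value)
ram[4] = 0b11001000   # LOAD_A 8
ram[5] = 0b11010010   # LOAD_B 2
ram[6] = 0b11100000   # ADD
ram[7] = 0b11110000   # HALT

result = run(ram, start=4)
print(result)  # 12
```

The program and its data share the same RAM, so the program is placed at addresses that do not collide with the data at addresses 2 and 8.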


All data between the CPU, registers, memory and I/O devices is transferred via a bus. To store the sum it has just computed in memory, the CPU puts the memory address on the address bus and the result of the sum on the data bus, and enables the appropriate signal on the control bus. In this way, the data is stored in memory with the help of the bus.

CPU bus
Photo src: https://en.wikipedia.org/wiki/Bus_(computing)#/media/File:Computer_system_bus.svg


The CPU also has a mechanism to prefetch instructions into its cache. As we know, a processor can complete millions of instructions within a second. This means that more time would be spent fetching instructions from RAM than executing them. So the CPU cache prefetches some of the instructions, and also data, so that execution gets faster.

If the data in the cache differs from the data in main memory, the cached data is marked as dirty using a dirty bit.
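The role of the dirty bit can be illustrated with a very small write-back cache. This is a simplified sketch under several assumptions: the class name `WriteBackCache` is invented here, each cache entry holds a single value, and there is no size limit or eviction policy, unlike in a real CPU cache.

```python
class WriteBackCache:
    """Toy cache in front of a list that stands in for main memory."""

    def __init__(self, memory: list) -> None:
        self.memory = memory
        self.lines = {}  # address -> [value, dirty bit]

    def read(self, address: int) -> int:
        if address not in self.lines:
            # Cache miss: fetch from main memory; the copy is clean.
            self.lines[address] = [self.memory[address], False]
        return self.lines[address][0]

    def write(self, address: int, value: int) -> None:
        # Only the cache is updated, so it now differs from main memory.
        self.lines[address] = [value, True]

    def flush(self) -> None:
        # Write every dirty entry back to main memory and clear its dirty bit.
        for address, line in self.lines.items():
            if line[1]:
                self.memory[address] = line[0]
                line[1] = False


memory = [0] * 16
memory[3] = 10
cache = WriteBackCache(memory)

before = cache.read(3)              # miss, then cached as clean
cache.write(3, 99)                  # cache and memory now disagree
dirty_after_write = cache.lines[3][1]
memory_before_flush = memory[3]
cache.flush()
memory_after_flush = memory[3]
print(before, dirty_after_write, memory_before_flush, memory_after_flush)
```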

Instruction pipelining

Modern CPUs use instruction pipelining to parallelize instruction execution. Each instruction goes through the stages fetch, decode and execute. While one instruction is in the decode phase, the CPU can already fetch the next instruction.

CPU clock cycle
photo source: https://en.wikipedia.org/wiki/Instruction_pipelining#/media/File:Pipeline,_4_stage.svg

This causes a problem when one instruction depends on another. So processors may execute instructions that do not depend on each other in a different order than the program specifies (out-of-order execution).
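To see why overlapping the stages helps, the following sketch builds the schedule of an idealized three-stage pipeline. It assumes that every stage takes exactly one clock cycle and that no instruction depends on another, so there are no stalls. The function name `pipeline_schedule` is made up for this example.

```python
STAGES = ["Fetch", "Decode", "Execute"]


def pipeline_schedule(num_instructions: int) -> list:
    """Return one dict per clock cycle, mapping stage name to instruction index."""
    total_cycles = num_instructions + len(STAGES) - 1
    schedule = []
    for cycle in range(total_cycles):
        active = {}
        for stage_index, stage in enumerate(STAGES):
            # Instruction i enters stage s in cycle i + s.
            instruction = cycle - stage_index
            if 0 <= instruction < num_instructions:
                active[stage] = instruction
        schedule.append(active)
    return schedule


schedule = pipeline_schedule(4)
pipelined_cycles = len(schedule)        # 4 + 3 - 1 = 6 cycles
sequential_cycles = 4 * len(STAGES)     # 12 cycles without overlapping

for cycle, active in enumerate(schedule):
    print(cycle, active)
print(pipelined_cycles, sequential_cycles)
```

In cycle 1 of this schedule, instruction 0 is being decoded while instruction 1 is being fetched, which is the overlap described above.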

Multicore computer

A multicore computer basically contains several CPUs (cores), which share some resources like the cache.


The performance of a CPU is determined by its execution time.

Performance = 1 / execution time

Let’s say it takes 20 ms for a program to execute. The performance of the CPU is then 1/20 = 0.05 per ms.

Relative performance = execution time 1 / execution time 2

Here the relative performance tells how many times faster the second system is than the first.
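As a quick check of these formulas, here is the 20 ms example in Python. The second execution time of 10 ms is an invented value for a hypothetical faster system, and the function names are assumptions of this sketch.

```python
def performance(execution_time_ms: float) -> float:
    """Performance = 1 / execution time (here in runs per millisecond)."""
    return 1.0 / execution_time_ms


def relative_performance(execution_time_1: float, execution_time_2: float) -> float:
    """How many times faster system 2 is than system 1."""
    return execution_time_1 / execution_time_2


perf = performance(20)                    # the 20 ms program from the text
speedup = relative_performance(20, 10)    # hypothetical system taking 10 ms
print(perf, speedup)
```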

The factors that come under consideration for CPU performance are the instruction execution time and the CPU clock speed. So to increase the performance of a program we either need to increase the clock speed or decrease the number of instructions in the program. Processor speed is limited, and modern multi-core computers can execute millions of instructions a second. But if the program we have written has a lot of instructions, this will decrease the overall performance.

Big O notation describes how the performance is affected as the size of the given input grows.
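One way to make this concrete is to count the steps two algorithms need for the same task. The sketch below compares a linear search, which is O(n), with a binary search, which is O(log n), on a sorted list. The step counters are a simplification: they count loop iterations, not actual CPU instructions.

```python
def linear_search_steps(items: list, target: int) -> int:
    """Count loop iterations of a linear search for target."""
    steps = 0
    for item in items:
        steps += 1
        if item == target:
            break
    return steps


def binary_search_steps(items: list, target: int) -> int:
    """Count loop iterations of a binary search on a sorted list."""
    steps = 0
    low, high = 0, len(items) - 1
    while low <= high:
        steps += 1
        mid = (low + high) // 2
        if items[mid] == target:
            break
        if items[mid] < target:
            low = mid + 1
        else:
            high = mid - 1
    return steps


data = list(range(1024))                    # sorted input of size n = 1024
linear = linear_search_steps(data, 1023)    # worst case: looks at every element
binary = binary_search_steps(data, 1023)    # halves the search range each step
print(linear, binary)  # 1024 11
```

Doubling the input size roughly doubles the work of the linear search but adds only one more step to the binary search.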

There are a lot of optimizations done in the CPU to make it faster and perform as much as it can. While writing any program, we need to consider how reducing the number of instructions we provide to the CPU will increase the performance of the program.

Interested in optimizing databases? Learn about it here: https://milapneupane.com.np/2019/07/06/how-to-work-optimally-with-relational-databases/