Registers
This is the sequence of steps that happens when the CPU (Central Processing Unit) fetches an instruction from memory. It involves several registers inside the CPU - in particular, the Program Counter. Here is a summary of the registers needed:
As well as these buses, there is another set of wires called the control bus. These specify whether the memory is being read or written to and send signals from the CPU to the memory (or vice-versa) that the data is ready on the various buses.
One wire in the control bus is referred to as Read/Write (or R/W). This is set to binary "1" (the voltage representing "true" - typically 5 volts) to indicate that the memory is to be read, and to binary "0" (the voltage representing "false" - typically 0 volts) to indicate that the memory is to be written to. The name is sometimes written with a bar over the W to show that it is the 0 value that means write, not the 1 value. Similarly, there is an address bus ready line (often called address bus enable), which the processor sets to 1 whenever there is a valid address on the address bus (it is the signal that says to the memory "Address is ready - come and get it!"), and a similar signal, data bus ready (or data bus enable), which either the memory or the CPU can set.
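As a rough illustration, the behaviour of the R/W line can be sketched in Python. The `Memory` class and its `access` method are hypothetical names of my own, and the sketch ignores the address-ready and data-ready signals and all timing.

```python
class Memory:
    """A toy memory that obeys a single R/W control line."""

    def __init__(self, size=65536):
        self.cells = [0] * size

    def access(self, address, read_write, data=0):
        # read_write = 1 means "read", 0 means "write", as on the R/W line.
        if read_write == 1:
            return self.cells[address]      # memory puts the byte on the data bus
        self.cells[address] = data & 0xFF   # memory takes the byte from the data bus
        return None


memory = Memory()
memory.access(100, 0, 42)        # R/W = 0: write 42 to address 100
value = memory.access(100, 1)    # R/W = 1: read it back
```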

This is the "Fetch" part of the cycle. The processor then executes the instruction, which, in this case, means two more steps:
After the instruction has been executed, the CPU increases its Program Counter by 1 so that it will point to the instruction after the INX instruction.
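The fetch, execute and Program Counter steps for a one-byte instruction such as INX can be sketched in Python. This is a simplified model, not a description of real 6502 timing: memory is a plain list, `step_inx` is a hypothetical helper, and the only detail taken from the 6502 itself is the INX op-code (E8 in hexadecimal).

```python
INX = 0xE8  # 6502 op-code for INX


def step_inx(memory, pc, x):
    """Fetch the instruction at pc, execute it as INX, and move pc on by 1."""
    opcode = memory[pc]              # fetch: read the byte the PC points to
    if opcode != INX:
        raise ValueError("this sketch only understands INX")
    x = (x + 1) & 0xFF               # execute: add 1 to X, wrapping 255 -> 0
    pc = pc + 1                      # point to the instruction after INX
    return pc, x


memory = [INX, INX]
pc, x = step_inx(memory, 0, 5)
```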
An example of a two-byte instruction - LDA #7
The instruction LDA #7 in the 6502 assembly language means "Load the accumulator with the value 7." It requires two bytes of memory - one to hold the instruction ("Load the accumulator with the byte that follows me in memory") and one to hold the byte itself:

The sequence of instructions is now as follows:
When the instruction has been carried out, the CPU will increase the value in the Program Counter so that it points to the instruction after the LDA instruction. If the CPU has a memory address register, then it will need to increase the Program Counter by 2, in order to bypass the LDA instruction itself and the number 7 in memory. If the CPU uses the Program Counter itself as the memory address register, then it will already be pointing to the 7 in memory, and will only need to be increased by 1 in order to point to the next instruction.
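Here is a minimal Python sketch of the two memory reads for LDA #7, modelling the case where the CPU has a memory address register and so increases the Program Counter by 2 at the end. The function name is hypothetical and the status flags are left out; A9 (hexadecimal) is the 6502 op-code for LDA with an immediate value.

```python
LDA_IMMEDIATE = 0xA9  # 6502 op-code for LDA #value


def step_lda_immediate(memory, pc):
    """Return the new Program Counter and the new accumulator value."""
    opcode = memory[pc]              # first read: the instruction itself
    if opcode != LDA_IMMEDIATE:
        raise ValueError("this sketch only understands LDA #")
    accumulator = memory[pc + 1]     # second read: the byte that follows it
    return pc + 2, accumulator       # skip both the op-code and the operand


memory = [LDA_IMMEDIATE, 7, 0xE8]    # LDA #7 followed by another instruction
pc, accumulator = step_lda_immediate(memory, 0)
```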
A two-byte instruction involving a memory write - STA 100
The instruction STA 100 means "Store the value currently in the accumulator in memory address 100". I have included it as it involves two memory reads (getting the instruction, and getting the address in which to store the number) and a memory write (actually storing the number itself).
I am assuming for simplicity's sake that the number 100, the address itself, is a one-byte number, so that the instruction will look like this:

The 6502 processor does have a version of the STA instruction which stores numbers using one-byte addresses, so this is a reasonable assumption. That version of the STA instruction can only store numbers in the first 256 bytes of memory (called the Zero Page), as one-byte numbers can't specify addresses above 255. Another version of the instruction exists (with a slightly different op-code) which is followed by 2 bytes, specifying the low and high bytes of a 2-byte memory address:

The memory address in this case would be 14 + 73 x 256 = 18702. However, this adds complexity to the process, so I won't be explaining this.
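The arithmetic for combining the two bytes can be checked with a couple of lines of Python. The function names are my own, chosen only for illustration.

```python
def absolute_address(low, high):
    """Combine a low byte and a high byte into one 16-bit address."""
    return low + high * 256


def split_address(address):
    """Split a 16-bit address back into (low byte, high byte)."""
    high, low = divmod(address, 256)
    return low, high


address = absolute_address(14, 73)   # the example from the text
```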
The write-cycle goes as follows:
The last thing the CPU does is to increase the Program Counter, either by 1 if it has been using the Program Counter instead of a memory address register, or by 2 if it really does have a memory address register.
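The two reads and one write can be sketched in Python as follows. This is a simplified model under my own assumptions: memory is a plain list, `step_sta_zero_page` is a hypothetical helper, and the Program Counter is increased by 2 at the end (the memory address register case). 85 (hexadecimal) is the 6502 op-code for the one-byte-address form of STA.

```python
STA_ZERO_PAGE = 0x85  # 6502 op-code for STA with a one-byte address


def step_sta_zero_page(memory, pc, accumulator):
    """Execute STA <address> at pc and return the new Program Counter."""
    opcode = memory[pc]              # read 1: the instruction
    if opcode != STA_ZERO_PAGE:
        raise ValueError("this sketch only understands zero-page STA")
    address = memory[pc + 1]         # read 2: the address to store into
    memory[address] = accumulator    # write: R/W would be 0 for this step
    return pc + 2


memory = [0] * 256
memory[0] = STA_ZERO_PAGE
memory[1] = 100                      # STA 100
pc = step_sta_zero_page(memory, 0, 42)
```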
An instruction involving the ALU - ADC #40
The ALU is the Arithmetic Logic Unit, which is used to perform arithmetic (adding and subtracting) and logic (AND or OR). The instruction ADC #40 means "Add the number 40 and the current value of the carry flag to the value in the accumulator, putting the answer back into the accumulator. Change the status flags accordingly".

You can see that the output from the ALU only goes back to the accumulator. No instruction can access the contents of the ALU directly. It has to read the result from the accumulator instead.
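What the ALU does for ADC can be sketched in Python. ADC adds the operand and the current carry flag to the accumulator. This sketch is my own simplification: it covers binary mode only, the function name `adc` and the order of the returned values are arbitrary choices, and the overflow flag is left out.

```python
def adc(accumulator, operand, carry_in):
    """Return (result, carry, zero, negative) for an 8-bit add with carry."""
    total = accumulator + operand + carry_in
    result = total & 0xFF                 # only 8 bits fit in the accumulator
    carry = 1 if total > 0xFF else 0      # set if the sum did not fit
    zero = 1 if result == 0 else 0        # set if the answer is exactly 0
    negative = (result >> 7) & 1          # a copy of the top bit of the answer
    return result, carry, zero, negative


result, carry, zero, negative = adc(10, 40, 0)   # ADC #40 with 10 in A
```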

This diagram shows a three-byte number being added to another three-byte number to produce a three-byte answer. There is a carry bit from the first byte to the second (which happens to be 1 in this case) and another carry from the second to the third (which is 0 in this case).
Let's suppose we had to implement this in 6502 assembly code. The two numbers to be added are in addresses 70, 71 and 72 for the first number and 80, 81 and 82 for the second number, and the answer has to be put in addresses 90, 91 and 92. Addresses 70 and 80 hold the least significant bytes of the numbers and address 90 is to hold the least significant byte of the answer. ("Least significant" means the byte on the right side.)
The code for this addition is:
| CLC | Clear the carry flag before the addition |
| LDA 70 | Load byte from address 70 into accumulator |
| ADC 80 | Add the byte in address 80, setting the carry flag if there is a carry to the next byte |
| STA 90 | Store the answer in address 90 |
| LDA 71 | Repeat calculation for bytes 71 and 81 |
| ADC 81 | |
| STA 91 | |
| LDA 72 | Repeat calculation for bytes 72 and 82 |
| ADC 82 | |
| STA 92 | |
The first addition is carried out in lines 2 to 4 (LDA 70, ADC 80, STA 90). If there is a carry from this byte to the next, then the carry flag is set automatically by the ADC instruction, and the LDA and STA instructions leave it unchanged. This is why there is no CLC instruction before the next byte addition (lines 5 to 7): the next ADC adds the saved carry in. Similarly, there may be a carry from the second byte to the third byte. The third byte addition happens in the last three lines. In this way, the carry is passed on from each byte of the addition to the next.
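To see the carries at work, here is a small Python sketch of the same three-byte addition. It uses the addresses from the text (70 to 72, 80 to 82 and 90 to 92), but the function name and the example numbers are my own, and the carry flag is modelled as an ordinary variable.

```python
def add_three_bytes(memory):
    """Add the 3-byte numbers at 70.. and 80.., storing the answer at 90..

    Returns the carry produced by each byte, least significant byte first.
    """
    carries = []
    carry = 0                                    # CLC
    for offset in range(3):
        total = memory[70 + offset] + memory[80 + offset] + carry   # LDA, ADC
        memory[90 + offset] = total & 0xFF       # STA
        carry = total >> 8                       # 1 if the sum exceeded 255
        carries.append(carry)
    return carries


memory = [0] * 256
memory[70:73] = [0xFF, 0x23, 0x01]   # first number, least significant byte first
memory[80:83] = [0x01, 0x00, 0x00]   # second number: 1
carries = add_three_bytes(memory)
```

With these example numbers the first byte produces a carry into the second, and the second produces none into the third.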
The example above shows the code for a three-byte addition. However, if we used similar code to add twenty-byte numbers, it would be a very long program. We can use the X register to keep a counter, so we can put the code in a loop. Here is a rewritten program which adds the twenty bytes in addresses 10 to 29 to the twenty bytes in addresses 50 to 69 and stores the answer in the twenty bytes from addresses 80 to 99.
| | CLC | Clear the carry flag before the addition |
| | LDX #0 | Load the index register X with 0 |
| | LDY #20 | Load the byte counter Y with 20 |
| .loop | LDA 10,X | Load the relevant byte of the first number into A |
| | ADC 50,X | Add the relevant byte from the second number |
| | STA 80,X | Store in relevant byte of answer |
| | INX | Add 1 to the index |
| | DEY | Subtract 1 from the byte counter |
| | BNE loop | Jump back to the loop if the counter has not reached 0 |
This program uses an addressing mode called indexed addressing. The instruction LDA 10,X means "Load the accumulator with the number in address (10 + the contents of the X register)". This means that when X holds the number 0, the instruction is equivalent to LDA 10, when X holds the number 1, the instruction is equivalent to LDA 11, when it holds 2, it is equivalent to LDA 12 etc. The instructions ADC 50,X and STA 80,X work in a similar manner.
The code works as a loop. The first time round the loop, the byte in 10 is added to the byte in 50 and the answer placed in the address 80. The second time round the loop, the byte in 11 is added to the byte in 51 and the answer placed in 81 etc. All the time the carry flag keeps track of any carries from one byte to the next. The X register selects the byte to work on, and the Y register counts how many bytes are left. A compare instruction such as CPX #20 is not used to test X, because a compare overwrites the carry flag and the carry from the last addition would be lost. INX and DEY leave the carry flag alone. The loop stops when Y reaches 0, at which point X has reached 20. As long as Y is not 0, the program jumps back. The instruction "BNE loop" means "Branch to the point marked 'loop' if the result of the last operation was Not Equal to zero".
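The effect the loop is meant to have can be checked with a short Python sketch. It is a model of the intended result rather than of the 6502 itself: `add_multibyte` and its parameters are hypothetical names, the carry flag is an ordinary variable, and `memory[first + x]` plays the part of the indexed instruction LDA 10,X.

```python
def add_multibyte(memory, first, second, answer, length):
    """Add two numbers of `length` bytes, least significant byte first.

    Returns the final carry (1 if the sum did not fit in `length` bytes).
    """
    carry = 0                                           # CLC
    x = 0                                               # LDX #0
    while x != length:
        total = memory[first + x] + memory[second + x] + carry   # LDA, ADC
        memory[answer + x] = total & 0xFF               # STA answer,X
        carry = total >> 8                              # carry kept for next byte
        x += 1                                          # INX
    return carry


memory = [0] * 256
a = 0x0123456789ABCDEF0123456789ABCDEF01234567          # two 20-byte examples
b = 0x00FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
memory[10:30] = a.to_bytes(20, "little")
memory[50:70] = b.to_bytes(20, "little")
final_carry = add_multibyte(memory, 10, 50, 80, 20)
answer = int.from_bytes(bytes(memory[80:100]), "little")
```

Comparing `answer` with Python's own `a + b` is a quick way to confirm that no carry has been dropped between bytes.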