Branch (computer science)

A branch is an instruction in a computer program that can cause a computer to begin executing a different instruction sequence and thus deviate from its default behavior of executing instructions in order.^[1] Branch (or branching, branched) may also refer to the act of switching execution to a different instruction sequence as a result of executing a branch instruction. Branch instructions are used to implement control flow in program loops and conditionals (i.e., executing a particular sequence of instructions only if certain conditions are satisfied).

A branch instruction can be either an unconditional branch, which always results in branching, or a conditional branch, which may or may not cause branching depending on some condition. Also, depending on how it specifies the address of the new instruction sequence (the "target" address), a branch instruction is generally classified as direct, indirect or relative, meaning that the instruction contains the target address, or it specifies where the target address is to be found (e.g., a register or memory location), or it specifies the difference between the current and target addresses.^[2]
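The three ways of specifying a target can be illustrated with a minimal sketch. The function name `resolve_target`, the mode strings, and the register dictionary are hypothetical and exist only for this illustration. The sketch also assumes a relative offset is measured from the address of the branch itself, whereas many real architectures measure it from the following instruction.

```python
def resolve_target(mode, operand, pc, registers):
    """Return the address a (hypothetical) branch instruction would jump to.

    mode      -- "direct", "indirect", or "relative"
    operand   -- the value encoded in the instruction
    pc        -- address of the branch instruction itself (an assumption)
    registers -- dict mapping register names to their current values
    """
    if mode == "direct":
        return operand              # the instruction contains the target address
    if mode == "indirect":
        return registers[operand]   # the instruction says where the target is stored
    if mode == "relative":
        return pc + operand         # the instruction holds a signed offset
    raise ValueError(f"unknown addressing mode: {mode}")


registers = {"r1": 0x2000}
print(hex(resolve_target("direct", 0x1000, pc=0x400, registers=registers)))
print(hex(resolve_target("indirect", "r1", pc=0x400, registers=registers)))
print(hex(resolve_target("relative", -16, pc=0x400, registers=registers)))
```

A negative relative offset, as in the last call, corresponds to a backward branch such as the one that closes a loop.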

Implementation

Mechanically, a branch instruction can change the program counter (PC) of a CPU. The program counter stores the memory address of the next instruction to be executed. Therefore, a branch can cause the CPU to begin fetching its instructions from a different sequence of memory cells.

When a branch is taken, the CPU's program counter is set to the branch's target address, so the next instruction executed is the one stored at that address and the flow of control changes.

When a branch is not taken, the program counter simply advances past the branch as it would after any other instruction, so the next instruction executed is the one immediately following the branch and the flow of control is unchanged.
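The effect of taken and not-taken branches on the program counter can be sketched with a toy interpreter. The instruction names (`add`, `jump`, `jump_if_neg`, `halt`) and the single accumulator register are assumptions made for this illustration and do not correspond to any real instruction set.

```python
def run(program, max_steps=1000):
    """Execute a toy program; return (accumulator, list of PC values visited)."""
    pc = 0       # program counter: index of the next instruction to execute
    acc = 0      # single accumulator register
    trace = []
    for _ in range(max_steps):
        if pc >= len(program):
            break
        trace.append(pc)
        op, arg = program[pc]
        if op == "add":
            acc += arg
            pc += 1                           # default: continue in order
        elif op == "jump":
            pc = arg                          # unconditional branch: always taken
        elif op == "jump_if_neg":
            pc = arg if acc < 0 else pc + 1   # taken, or fall through
        elif op == "halt":
            break
        else:
            raise ValueError(f"unknown opcode: {op}")
    return acc, trace


# Count up from -3 and leave the loop once the accumulator is no longer negative.
program = [
    ("add", -3),          # 0: acc = -3
    ("add", 1),           # 1: acc += 1
    ("jump_if_neg", 1),   # 2: branch back to 1 while acc < 0
    ("halt", None),       # 3
]
acc, trace = run(program)
print(acc, trace)
```

In the trace, the conditional branch at index 2 is followed by index 1 while it is taken and by index 3 on the one occasion it is not taken.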

The term branch applies to programs in high-level languages as well as to programs written in machine code or assembly language. In high-level programming languages, branches usually take the form of conditional statements that enclose the instruction sequence to be executed if the condition is satisfied. Unconditional branch statements such as GOTO "jump" to (begin execution of) a different instruction sequence regardless of any condition.

Machine-level branch instructions are sometimes called jump instructions. They typically have unconditional and conditional forms, where the latter may be taken or not taken depending on some condition. There are usually distinct forms for one-way jumps, often called jump, and for subroutine invocations, known as call, which automatically save the originating address as a return address on the stack (or, on some architectures, in a link register) so that a single subroutine can be invoked from multiple locations in code.

In CPUs with flag registers, an earlier instruction sets a condition in the flag register. That instruction may be an arithmetic or a logic instruction, and it is often close to the branch, though not necessarily immediately before it. The stored condition is then tested by a branch such as jump if overflow-flag set. This temporary information is usually held in a flag register but may also be located elsewhere. A flag register is simple to implement in slow, simple computers. In fast computers it can become a speed bottleneck, because instructions that could otherwise execute in parallel (in several execution units) must set the flag bits in a particular sequence.

There are also machines (or particular instructions) where the condition may be checked by the jump instruction itself, such as branch if register X negative. In simple computer designs, comparison branches execute more arithmetic and can use more power than flag register branches. In fast computer designs comparison branches can run faster than flag register branches, because comparison branches can access the registers with more parallelism, using the same CPU mechanisms as a calculation.

Some early and simple CPU architectures, still found in microcontrollers, may not implement a conditional jump, but rather only a conditional "skip the next instruction" operation. A conditional jump or call is thus implemented as a conditional skip of an unconditional jump or call instruction.
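A conditional jump built from a conditional skip can be sketched as follows. The toy opcodes (`skip_if_nonzero`, `jump`, `nop`, `halt`) are hypothetical. The point is only that the skip inverts the desired condition and steps over an ordinary unconditional jump.

```python
def run_skip_machine(program, acc):
    """Run a toy skip-based program; return the list of PC values executed."""
    pc = 0
    executed = []
    while pc < len(program):
        executed.append(pc)
        op, arg = program[pc]
        if op == "skip_if_nonzero":
            pc += 2 if acc != 0 else 1   # skip exactly one instruction, or none
        elif op == "jump":
            pc = arg
        elif op == "nop":
            pc += 1
        elif op == "halt":
            break
        else:
            raise ValueError(f"unknown opcode: {op}")
    return executed


# "Jump to 4 if acc == 0", expressed as a skip followed by an unconditional jump.
program = [
    ("skip_if_nonzero", None),   # 0: skips the jump unless acc == 0
    ("jump", 4),                 # 1: reached only when acc == 0
    ("nop", None),               # 2: fall-through path
    ("halt", None),              # 3
    ("halt", None),              # 4: branch target
]
print(run_skip_machine(program, acc=0))
print(run_skip_machine(program, acc=5))
```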

Examples

Depending on the computer architecture, the assembly language mnemonic for a jump instruction is typically some shortened form of the word jump or the word branch, often along with other informative letters (or an extra parameter) representing the condition. Sometimes other details are included as well, such as the range of the jump (the offset size) or a special addressing mode that should be used to locate the actual effective offset.

This table lists the machine-level branch or jump instructions found in several well-known architectures:

Condition or result	x86	PDP-11, VAX	ARM (partly 6502)	Flag equation
zero (implies equal for sub/cmp)	JZ; JNZ	BEQ; BNE	BEQ; BNE	zero; not zero
negative (N), sign (S), or minus (M)	JS; JNS	BMI; BPL	BMI; BPL	negative; not negative
arithmetic overflow (flag called O or V)	JO; JNO	BVS; BVC	BVS; BVC	overflow; not overflow
carry (from add, cmp, shift, etc.)	JC; JNC	BCS; BCC	BCS; BCC	carry; not carry
unsigned below (lower)	JB	BLO	BLO (*)	borrow
unsigned below or equal (lower or same)	JBE	BLOS	BLS (*)	borrow or zero
unsigned above or equal (higher or same)	JAE	BHIS	BHS (*)	not borrow
unsigned above (higher)	JA	BHI	BHI (*)	not borrow and not zero
signed less than	JL	BLT	BLT	sign≠overflow
signed less or equal	JLE	BLE	BLE	(sign≠overflow) or zero
signed greater or equal	JGE	BGE	BGE	sign=overflow
signed greater than	JG	BGT	BGT	(sign=overflow) and not zero

x86, the PDP-11, VAX, and some others set the carry flag to signal borrow and clear it to signal no borrow. ARM, the 6502, the PIC, and some others do the opposite for subtractive operations. This inverted function of the carry flag for certain instructions is marked by (*); that is, borrow = not carry in those parts of the table, and borrow ≡ carry wherever not otherwise noted. Carry on additive operations, however, is handled the same way by most architectures.
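The flag equations in the table can be checked with a small sketch that computes the flags left by a subtraction (as a compare instruction would) and then evaluates each condition. The helper names `sub_flags`, `CONDITIONS`, and `taken` are hypothetical, and the 8-bit register width is an arbitrary choice. The sketch records borrow directly. An x86-style carry flag would equal it, and an ARM-style carry flag would be its inverse.

```python
def sub_flags(a, b, bits=8):
    """Flags after computing a - b in a register `bits` wide."""
    mask = (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    return {
        "zero": result == 0,
        "sign": bool(result & sign_bit),
        "borrow": a < b,   # unsigned subtraction needed a borrow
        # Signed overflow: operand signs differ and the result's sign differs from a's.
        "overflow": bool((a ^ b) & (a ^ result) & sign_bit),
    }


# Each entry mirrors one row of the table's flag-equation column.
CONDITIONS = {
    "unsigned below": lambda f: f["borrow"],
    "unsigned below or equal": lambda f: f["borrow"] or f["zero"],
    "unsigned above or equal": lambda f: not f["borrow"],
    "unsigned above": lambda f: not f["borrow"] and not f["zero"],
    "signed less than": lambda f: f["sign"] != f["overflow"],
    "signed less or equal": lambda f: f["sign"] != f["overflow"] or f["zero"],
    "signed greater or equal": lambda f: f["sign"] == f["overflow"],
    "signed greater than": lambda f: f["sign"] == f["overflow"] and not f["zero"],
}


def taken(condition, a, b, bits=8):
    """Would a branch on `condition` be taken after comparing a with b?"""
    return CONDITIONS[condition](sub_flags(a, b, bits))


# 0x80 is 128 when read as unsigned but -128 when read as signed.
print(taken("unsigned below", 0x80, 0x01))
print(taken("signed less than", 0x80, 0x01))
```

The same bit pattern gives different answers for the unsigned and signed conditions, which is why instruction sets provide both families of branches.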

Performance problems with branch instructions

To achieve high performance, modern processors are pipelined. They consist of multiple stages that each partially process an instruction, feed their results to the next stage in the pipeline, and start working on the next instruction in the program. This design expects instructions to execute in a particular unchanging sequence. A conditional branch makes it impossible to know this sequence until the branch condition has been evaluated. Conditional branches can therefore cause "stalls", in which the pipeline has to be restarted on a different part of the program.
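The cost of such stalls can be estimated with a simple first-order model, sketched below. The model and all of its parameters (`base_cpi`, `branch_fraction`, `stall_fraction`, `penalty_cycles`) are assumptions introduced for illustration. The numbers are made up and do not describe any particular processor.

```python
def effective_cpi(base_cpi, branch_fraction, stall_fraction, penalty_cycles):
    """Average cycles per instruction under a simple stall model (an assumption).

    base_cpi        -- cycles per instruction when the pipeline never stalls
    branch_fraction -- fraction of executed instructions that are branches
    stall_fraction  -- fraction of those branches that cause a pipeline restart
    penalty_cycles  -- cycles lost each time the pipeline is restarted
    """
    return base_cpi + branch_fraction * stall_fraction * penalty_cycles


# Hypothetical workload: 20% branches, half of which stall for 3 cycles each.
print(effective_cpi(1.0, 0.20, 0.5, 3))
```

Under this model, lowering either the fraction of branches that stall or the penalty per stall reduces the average cost, which is what the techniques in the following sections aim at.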

Improving performance by reducing stalls from branches

Several techniques improve speed by reducing stalls from conditional branches.

Branch prediction hints

Historically, branch prediction was done in software: statistics gathered from test runs were used to optimize the code. A programmer would compile a test version of a program and run it with test data. The test code counted how often each branch was actually taken. The compiler then used these statistics to optimize the branches of the released code, arranging that the fastest branch direction (taken or not) was always the most frequently followed control flow path. To permit this, CPUs must be designed with (or at least have) predictable branch timing. Some CPUs have instruction sets (such as the Power ISA) that were designed with "branch hints" so that a compiler can tell a CPU how each branch is expected to go.

The problem with software branch prediction is that it requires a complex software development process.

Hardware branch predictors

Hardware branch predictors moved the statistics into the electronics, so that any software can benefit without special preparation. Branch predictors are parts of a processor that guess the outcome of a conditional branch. The processor's logic then gambles on the guess by beginning to execute the expected instruction flow. A simple hardware branch prediction scheme is to assume that all backward branches (i.e., to a smaller program counter value) are taken (because they are part of a loop), and that all forward branches (to a larger program counter value) are not taken (because they leave a loop). Better branch predictors are developed and validated statistically by running them in simulation on a variety of test programs. Good predictors usually count the outcomes of previous executions of a branch. Faster, more expensive computers can then run faster by investing in better branch prediction electronics. In a CPU with hardware branch prediction, branch hints let the compiler's presumably superior branch prediction override the hardware's more simplistic branch prediction.
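Both ideas can be sketched in a few lines. The function `static_predict` implements the backward-taken, forward-not-taken rule described above. The class `TwoBitPredictor` is one common textbook way of counting previous outcomes, a two-bit saturating counter per branch address. The class is not taken from the text; its name, its starting counter value, and the test pattern are assumptions made for this illustration.

```python
def static_predict(branch_pc, target_pc):
    """Static rule: predict taken for backward branches, not taken for forward."""
    return target_pc < branch_pc


class TwoBitPredictor:
    """One 2-bit saturating counter per branch (0-1: not taken, 2-3: taken)."""

    def __init__(self):
        self.counters = {}

    def predict(self, branch_pc):
        # Unseen branches start at 1, "weakly not taken" (an arbitrary choice).
        return self.counters.get(branch_pc, 1) >= 2

    def update(self, branch_pc, was_taken):
        count = self.counters.get(branch_pc, 1)
        count = min(count + 1, 3) if was_taken else max(count - 1, 0)
        self.counters[branch_pc] = count


def accuracy(predictor, branch_pc, outcomes):
    """Fraction of outcomes predicted correctly, updating after each one."""
    correct = 0
    for was_taken in outcomes:
        if predictor.predict(branch_pc) == was_taken:
            correct += 1
        predictor.update(branch_pc, was_taken)
    return correct / len(outcomes)


# A loop-closing branch: taken 9 times, then not taken once, repeated 10 times.
outcomes = ([True] * 9 + [False]) * 10
print(accuracy(TwoBitPredictor(), 0x40, outcomes))   # 0.89
```

Because the counter saturates, a single loop exit moves it only one step, so the predictor still guesses "taken" when the loop is entered again.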

Branch-free code

Some logic can be written without branches or with fewer branches. It is often possible to use bitwise operations, conditional moves or other branch predication instead of branches.^[3]
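Two well-known bitwise idioms are sketched below: selecting a minimum with a mask, and taking an absolute value with an arithmetic shift. The function names are hypothetical. Python is used only to show the arithmetic, since the interpreter itself executes many branches. In compiled code the comparison would typically become a set-on-condition instruction rather than a jump.

```python
def branchless_min(a, b):
    """Minimum of two integers using a mask instead of an if-statement."""
    mask = -(a < b)                # -1 (all ones) if a < b, otherwise 0
    return b ^ ((a ^ b) & mask)    # yields a when the mask is all ones, else b


def branchless_abs32(x):
    """Absolute value of x, assuming -2**31 < x < 2**31 (32-bit signed range)."""
    mask = x >> 31                 # arithmetic shift: -1 if x is negative, else 0
    return (x ^ mask) - mask       # negative x: flip the bits, then add one


print(branchless_min(3, 7), branchless_min(7, 3))
print(branchless_abs32(-5), branchless_abs32(5))
```

Both functions compute the same result on every path, so there is no outcome for a branch predictor to guess.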

Delay slot

Another technique is a branch delay slot. In this approach, the one instruction after a branch is always executed, whether or not the branch is taken. The computer can therefore use this instruction to do useful work whether or not its pipeline stalls. This approach was historically popular in RISC computers. In a family of compatible CPUs, it complicates multicycle CPUs (with no pipeline), faster CPUs with longer-than-expected pipelines, and superscalar CPUs (which can execute instructions out of order).
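The behaviour of a single delay slot can be sketched with a toy interpreter in which a jump takes effect only after the following instruction has run. The opcodes and the function name `run_with_delay_slot` are hypothetical. The sketch does not model a jump placed inside another jump's delay slot, a case that real architectures often restrict or leave undefined.

```python
def run_with_delay_slot(program):
    """Run a toy program where every jump has one delay slot.

    Returns (accumulator, list of PC values executed).
    """
    pc = 0
    acc = 0
    pending_target = None   # target of a jump whose delay slot is about to run
    executed = []
    while pc < len(program):
        op, arg = program[pc]
        executed.append(pc)
        next_pc = pc + 1
        if pending_target is not None:
            # This instruction is in the delay slot: run it, then redirect.
            next_pc, pending_target = pending_target, None
        if op == "add":
            acc += arg
        elif op == "jump":
            pending_target = arg   # takes effect after the next instruction
        elif op == "halt":
            break
        else:
            raise ValueError(f"unknown opcode: {op}")
        pc = next_pc
    return acc, executed


program = [
    ("jump", 3),      # 0: jump to 3, after the delay slot
    ("add", 10),      # 1: delay slot, always executed
    ("add", 100),     # 2: skipped by the jump
    ("halt", None),   # 3
]
print(run_with_delay_slot(program))
```

The instruction at index 1 runs even though the jump before it is taken, while the instruction at index 2 is never reached.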

References

[1] openlibrary.org. At least conceptually; see out-of-order execution. Sep 20, 2019.

[2] www.researchgate.net. "A Survey of Techniques for Dynamic Branch Prediction", S. Mittal, CPE 2018. Sep 20, 2019.

[3] openlibrary.org. Knuth, Donald (2008). The Art of Computer Programming. Volume 4, Pre-fascicle 1A (Revision 6 ed.). pp. 48–49. Sep 20, 2019.

[4] www.intel.com. Free IA-32 and x86-64 documentation. Sep 20, 2019.

[5] web.archive.org. The PDP-11 FAQ. Sep 20, 2019.

[6] www.simplemachines.it. The ARM instruction set. Sep 20, 2019.

[7] www.researchgate.net. A Survey of Techniques for Dynamic Branch Prediction. Sep 20, 2019.

[8] www.intel.com. Free IA-32 and x86-64 documentation. Sep 20, 2019.

[9] web.archive.org. The PDP-11 FAQ. Sep 20, 2019.

[10] www.simplemachines.it. The ARM instruction set. Sep 20, 2019.

[11] en.wikipedia.org. The original version of this page is from Wikipedia. Text is available under the Creative Commons Attribution-ShareAlike License. Additional terms may apply. See everipedia.org/everipedia-terms for further details. Sep 20, 2019.
