I'd like to get some varied perspectives (theoretical or otherwise) on why conditionals (i.e. the ability to write 'if -something- do -this- else do -that-') are important to classical computer architectures/languages. Clearly they're important - something to do with the fact that without them your machine isn't actually Turing-complete. Clearly there's something about them that differentiates them from other operations - see the limits imposed by branch operations in pipelining. Perhaps I just haven't found the correct description of why they are important yet - or perhaps I am just stupid (very likely) and am incapable of understanding.
Input welcome, references appreciated.
--
There are Turing-complete languages without conditionals, though - even without data, as such! An example is the lambda calculus. If you want to learn more about this, read up on "Church encoding" and "Scott encoding".
The basic idea is to use functions and the visitor pattern to encode a kind of conditional behaviour that varies with the input supplied to it.
True = (lambda (t f) (t))
False = (lambda (t f) (f))
If = (lambda (b tr fa) (b tr fa))
(lambda (x)
(If x
(lambda () (display "It was true!"))
(lambda () (display "It was false!"))))
Desk check the reductions of applying that function to True and to False. The "t" and "f" arguments to the boolean values act as a kind of "menu" (the visitor pattern), from which the datum selects in order to communicate what variant of data it is. The pattern extends readily to natural numbers, lists, and so on, giving a straightforward encoding of not only conditionals, but also (one-layer-deep) pattern matching.
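If it helps, here's the same encoding sketched in Python (the names TRUE/FALSE/IF are my own; the branches are zero-argument functions, matching the (t)/(f) calls above):

```python
# Church/Scott-style booleans: data encoded purely as functions.
TRUE = lambda t, f: t()    # True calls the first thunk
FALSE = lambda t, f: f()   # False calls the second thunk

def IF(b, tr, fa):
    # No built-in conditional here: the boolean value itself selects a branch.
    return b(tr, fa)

print(IF(TRUE, lambda: "It was true!", lambda: "It was false!"))
print(IF(FALSE, lambda: "It was true!", lambda: "It was false!"))
```

Reducing IF(TRUE, ...) by hand: IF hands both thunks to TRUE, which ignores the second and calls the first.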
So you wrote a conditional out of something that didn't have one? But for that to work, something in the "If" has to decide whether to evaluate the first one or the second one. That is, there is something that is equivalent to a conditional somewhere in the parts that you use to build the "If".
This may come down to what the definition of "conditional" is, though...
(the code will probably be more optimized / easier to optimize for the compiler but you get my point)
I'm having a hard time understanding the program you supplied though, the subsequent description did not really help (probs more my fault than your own). I certainly understand the visitor pattern (and have had a sneaky suspicion it is important) but the lambda calculus, not so much.
What type of syntax is the program you supplied in - is it more of a pseudocode or an actual formal language?
thanks for the search keywords - definitely helped - if you have a favorite textbook/arxiv reference/author on this stuff, those would be greatly appreciated as well.
True = (lambda (t f) (t))
-- Let True be the function that takes two parameters t and f and returns the first, t. --
False = (lambda (t f) (f))
-- Let False be the function that takes two parameters t and f and returns the second, f. --
If = (lambda (b tr fa) (b tr fa))
-- Let If be the function that takes three parameters, and applies the first parameter b to the second and third parameters. --
Notice that if the parameter b is the True value above, 'b tr fa' will return 'tr'. If 'b' is False, 'b tr fa' will return 'fa'.
(lambda (x)
(If x
(lambda () (display "It was true!"))
(lambda () (display "It was false!"))))
So it follows that this will pass either "It was true!" or "It was false!" to 'display', waving aside all the stuff you have to define that makes 'display' and STRINGS work as you'd expect.

This is where it becomes clear that you need a lazy system for this to make sense. In an eager system all three of the parameters to 'If' are evaluated and then their values are substituted into the body of 'If'. In a lazy system the expressions for the values are substituted for the parameters in the body, so only the parameter 'b' and one of the other parameters will be evaluated.
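To see the eager/lazy point concretely, here's a sketch in Python (an eager language), with made-up names, showing that without thunks both branches get evaluated before 'If' even runs:

```python
effects = []

def branch(label):
    # Record a side effect so we can see which branches actually ran.
    effects.append(label)
    return label

def If(b, tr, fa):
    return tr if b else fa

# Eager: Python evaluates both arguments before If is even entered.
If(True, branch("true branch"), branch("false branch"))
print(effects)  # ['true branch', 'false branch'] -- both ran!

# Thunked: lambdas delay evaluation; only the selected branch is called.
effects.clear()
result = If(True, lambda: branch("true branch"), lambda: branch("false branch"))()
print(effects)  # ['true branch'] -- only the taken branch ran
```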
When I play around with functional languages I would just write
True t f = t
False t f = f
If b tr fa = b tr fa
ShowIt x = If x (display "It was true!") (display "It was false!")
You could also write this in (I think this is) Scheme:

(define (True t f) t)
(define (False t f) f)
(define (If b tr fa) (b tr fa))
(define (ShowIt x) (If x (display "It was true!") (display "It was false!")))
Except that Scheme isn't usually lazy, so that version evaluates both display calls before If even runs. With the branches wrapped in thunks it behaves as intended:

~$ racket
Welcome to Racket v7.2.0.6.
> (define True (lambda (t f) (t)))
> (define False (lambda (t f) (f)))
> (define If (lambda (b tr fa) (b tr fa)))
> (define P (lambda (x)
(If x
(lambda () (display "It was true!"))
(lambda () (display "It was false!")))))
> (P True)
It was true!
> (P False)
It was false!
>
Seen differently, it's also close to the mathematical notation of the lambda calculus.

To learn about the lambda calculus, check out chapter 5 of "Types and Programming Languages", Benjamin C. Pierce, MIT Press 2002. You might also get something out of "Semantics Engineering with PLT Redex", Felleisen, Findler & Flatt, MIT Press 2009.
A common technique inside compilers is to separate out which code gets run, and which result gets used. If everything is side-effect-free, then there's no harm in running the code whose result won't get used. In highly parallel systems, you can just run both versions and select at the end. So you can rewrite:
if foo < 2:
    a = foo
else:
    a = 2
to: a = phi(foo < 2, foo, 2)
(the phi operator is taken from SSA notation. See https://llvm.org/docs/LangRef.html#phi-instruction if you're unfamiliar)

So Turing completeness is basically two things: 1) the ability to compose an arbitrary number of conditional statements, and 2) the ability to keep computing recursively until some condition becomes true.
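To connect this back to the phi/select rewrite above: assuming both sides are side-effect-free, the selection can even be done without any branch at all, as in this toy Python sketch (the function name `select` and its argument order, cond/if_true/if_false, are my own):

```python
def select(cond, if_true, if_false):
    # Branch-free selection via arithmetic on the boolean (0 or 1), the way
    # masked/vectorized code does it: both candidate values already exist.
    c = int(bool(cond))
    return c * if_true + (1 - c) * if_false

foo = 5
a = select(foo < 2, foo, 2)   # foo and 2 were both computed either way
print(a)  # 2
```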
A DFA or maybe a 2DFA (which are equally powerful) is basically what you get when all you have is conditionals.
Without that, what you have is a box that can read finite input and compute a value. Equivalent to a finite lookup table.
min(a,b) := if(a < b, a, b)
Obviously, the evaluation of this function loses information about either a or b. This is of course intentional, but it is incompatible with a closed (quantum) system, where information must be preserved.

Interestingly (though I think there is no deeper connection), vectorization uses masks to avoid branching: one implements both branches at the same time, and different operations are performed on a per-element level. A simple pseudo-numpy example would be
B[where(A < B)] = A
which should actually be read component by component; for instance, with A and B being matrices:

B_{ij} = A_{ij} if A_{ij} < B_{ij}
B_{ij} otherwise (a "no-op")
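Written out in plain Python (no numpy, so it stays self-contained), the per-element masked update looks like this:

```python
# Per-element masked update, the vectorized idiom spelled out in plain
# Python: both "branches" exist as data; a boolean mask picks per element.
A = [3, 7, 1, 9]
B = [5, 2, 8, 4]

mask = [a < b for a, b in zip(A, B)]                 # one comparison per element
B = [a if m else b for a, b, m in zip(A, B, mask)]   # B[where(A < B)] = A

print(B)  # [3, 2, 1, 4]
```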
One can write long programs following this paradigm, and they frequently excel in performance on GPUs and on modern CPUs (which are effectively vector machines).

From assembly (x86): most conditionals and loops break down to cmp/test plus jmp and conditional-jump (jz, jnz, jge, ...) instructions.
Computers need to make decisions based on comparisons of data or program state. You can abstract decision making however you want (e.g.: switch statements in place of if/else or usage of goto's) but the computer needs to test or compare and decide on what instruction to execute based on the result.
Let me invent a new conditional 'alot':
for i in range(0,100): alot i: exit(1)
instead of testing for boolean true/false. My conditional "alot" tests the result of the evaluation against the value range of the type: if the result is over half of the value range, it executes the conditional statement (exit()).
Of course I am being a bit silly, but my point is that my silly conditional, much like if/else, breaks down to comparing and jumping based on an evaluation (i.e., instead of a jz/jnz it will break down to a jge).
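For fun, a toy Python sketch of that invented 'alot' conditional, modeling the "type" as an unsigned 8-bit integer (all names here are made up):

```python
def alot(value, bits=8):
    # The invented "alot" test: true when the value is over half of the
    # type's value range (modeled as an unsigned integer of `bits` bits).
    return value > (2 ** bits) // 2

# Underneath it's still a compare-and-branch (a jge rather than a jz/jnz):
hits = [i for i in range(0, 256) if alot(i)]
print(hits[0])  # 129: the first value over half of the 8-bit range (128)
```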
input >>> output
By definition the output is conditional on the input, at least if the program works correctly. That's why conditional statements are important: they express the relationship between input and output.

Isn't this really basic stuff? Or am I missing something?
How is the output of the following not conditional upon the input? Are there any conditional statements in it?
F(x, y) = x + y
Yes, of course you can.
("Most" because you have programs like image classifiers that are trying to be similarity amplifiers. That turns out to be hard to do with the standard building blocks, which want to be difference amplifiers. Perhaps different building blocks would be more useful.)
The first computers were literally people: they computed. There wasn't really a lot of "conditional" work to what they did; they did math (principally numerical work) like generating tables of data and compiling the results. They still needed some conditionals, like: if a value was over a certain threshold, use a particular formula or change some parameter.
The early mechanical computers were largely tabulators of the same sort, but machines and not people. But as you want to express more complicated computations, you need the ability to branch. Conditionals are just a means for executing such a branch. Even the simplest of computational machines (in the theory sense) like regular expressions and deterministic finite automata include a branch or conditional element: Given input C while in state S go to state S'. A simple way to visualize this decision matrix is with a table:
Input | S_0 | S_1
-----------------
0 | S_0 | S_0
1 | S_1 | S_1
Accepting states = {S_0}
This simple machine recognizes any binary string ending in '0' (starting in S_0). In each state, a decision is made about the next state based on the input. This isn't even a Turing-complete system; DFAs can only recognize regular languages. Yet it cannot be built without some sort of conditional branching capability. You may be able to encode this in a hardware description language such that the conditional is obscured or obfuscated, but it's still there.

Even trying to compute taxes or payroll depended on conditionals (income level determines which tax brackets you pay into; over a certain amount of income you've maxed out Social Security). And that was one of the earliest (commercial) uses for computational devices.
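The transition table above can be run directly as data; a minimal Python sketch:

```python
# The DFA above, with the transition table as plain data:
# delta[state][input] -> next state.
delta = {
    "S0": {"0": "S0", "1": "S1"},
    "S1": {"0": "S0", "1": "S1"},
}
accepting = {"S0"}

def accepts(s, start="S0"):
    state = start
    for ch in s:
        state = delta[state][ch]   # the conditional lives in the table lookup
    return state in accepting

print(accepts("1010"))  # True:  ends in '0'
print(accepts("1011"))  # False: ends in '1'
```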
Now, the way we express conditionals may change from one language or system to another. In Erlang, you can remove a lot of `if` expressions using pattern matching, consider the simple computation of the Fibonacci numbers as an example:
fib(0) -> 1;
fib(1) -> 1;
fib(N) -> fib(N-1) + fib(N-2).
A conditional is right there: if N = 0, return 1; if N = 1, return 1; else return fib(N-1)+fib(N-2). Haskell and other functional languages use similar constructs, and you can include other guards:

fib(0) -> 1;
fib(1) -> 1;
fib(N) when N > 1, is_integer(N) -> fib(N-1) + fib(N-2).
This removes the potential for infinite recursion by adding an additional guard; the function is now only valid for non-negative, integral inputs. It has been transformed to: if N = 0, return 1; if N = 1, return 1; if N > 1 and N is an integer, return fib(N-1)+fib(N-2); else signal an error.
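For comparison, here's a clause-by-clause translation of that guarded version into Python (raising an error when no clause matches, roughly what Erlang's function_clause error does):

```python
def fib(n):
    # Clause-by-clause translation of the guarded Erlang version.
    if n == 0:
        return 1
    if n == 1:
        return 1
    if isinstance(n, int) and n > 1:
        return fib(n - 1) + fib(n - 2)
    raise ValueError(f"no matching clause for {n!r}")  # cf. Erlang function_clause

print(fib(6))  # 13
```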