His code here demonstrates why. It is both shorter and much easier to understand than anything the LLMs generated. It is not always as efficient as the LLM solutions (which often skip the third loop by computing the last factor directly), but it is definitely the code I would prefer to work with in most situations.
[1] https://www.udacity.com/course/design-of-computer-programs--...
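For the curious, the "skip the third loop" optimization mentioned above can be sketched like this (a minimal sketch; the function name is mine): enumerate the two smaller factors `a < b` and derive the third directly.

```python
def product_triples(n=108):
    # Enumerate the two smaller factors a < b, then compute the
    # third factor c = n // (a * b) directly, skipping the third loop.
    triples = []
    for a in range(1, n + 1):
        if n % a:
            continue
        for b in range(a + 1, n // a + 1):
            if (n // a) % b:
                continue
            c = n // (a * b)
            if c > b:  # keep only strictly increasing triples
                triples.append((a, b, c))
    return triples
```

This trades a little readability for dropping the innermost loop, which is exactly the trade-off discussed above.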
WITH all_numbers AS
(
    SELECT generate_series AS n
    FROM generate_series(1, 108) AS n
),
divisors AS
(
    SELECT *
    FROM all_numbers
    WHERE 108 % n = 0
),
permutations AS
(
    SELECT a.n AS n1, b.n AS n2, c.n AS n3
    FROM divisors AS a
    CROSS JOIN divisors AS b
    CROSS JOIN divisors AS c
)
SELECT *
FROM permutations
WHERE n1 * n2 * n3 = 108
  AND n1 < n2 AND n2 < n3
ORDER BY n1, n2, n3
https://codapi.org/embed/?sandbox=duckdb&code=data%3A%3Bbase...

It's amazing how he never uses the word "AI" once in this course, despite the fact that it is a straight-up AI course.
I revisit the course notes at least once a year and I still walk away with something new every time.
I think this makes sense. LLMs are trained on average code, so, on average, they produce average code.
from math import prod
from itertools import combinations

def find_products(k=3, N=108) -> set[tuple[int, ...]]:
    """The set of all ways in which `k` distinct positive integers have a product of `N`."""
    factors = {i for i in range(1, N + 1) if N % i == 0}
    return {ints for ints in combinations(factors, k) if prod(ints) == N}

find_products()

Edit: I see that he imported the typing Set, which is deprecated, instead of collections.abc.Set, which is still useful, so your comment is correct.
You can increase the compute by allowing more tokens for it to use as a "scratch pad", so the total compute available will be num_tokens * ops_per_token, but there is still an infinite number of problems you can ask that will not be computable within that constraint.
But you can offload computation by asking for a description of the computation instead of asking the LLM to compute it. I'm no mathematician, but I would not be surprised to learn that the above limit applies here as well in some sense (maybe there are solutions to problems that can't be represented in a reasonable number of symbols given our constraints; Kolmogorov complexity and all that). Still, for most practical (and beyond) purposes this is a huge improvement and should be enough for most things we care about. Letting the system describe the computation steps to solve a problem, executing that computation separately offline, and then feeding the result back if necessary is a necessary component if we want to do more useful things.
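A hypothetical sketch of that offloading loop, where `ask_model` stands in for any real LLM API call (the name and wiring are mine, not from any particular library):

```python
def solve_by_offloading(ask_model, problem):
    # Ask the model to *describe* the computation as code
    # rather than to perform it token by token.
    code = ask_model(f"Write a Python function solve() that answers: {problem}")
    namespace = {}
    exec(code, namespace)        # execute the described computation offline
    return namespace["solve"]()  # the result can be fed back to the model
```

In practice you would run the generated code in a sandbox rather than a bare `exec`, but the shape is the same: the model emits a program, and a conventional interpreter supplies the unbounded compute.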
llama3.1-70b, llama3.1-405b, deepseekcoder2.5, gemma-27b, mistral-large, qwen2.5-72b. https://gist.github.com/segmond/8992a8ec5976ff6533d797caafe1...
I like how the solution varies somewhat across most of them, though mistral and qwen look really similar.
There is a math term, "formal language", and both math and programming fit it perfectly: https://en.wikipedia.org/wiki/Formal_language
> Sometimes a natural language such as English is a good choice, sometimes you need the language of mathematical equations, or chemical equations, or musical notation, and sometimes a programming language is best. Written language is an amazing invention that has enabled human culture to build over the centuries (and also enabled LLMs to work). But human ingenuity has divised [sic] other notations that are more specialized but very effective in limited domains.
Of course, this is what Chomsky calls a “terminological dispute”, so I mean no offense and you’re ofc free to stand your ground that the only language is what appears in human brains! But if mathematical notation and programming languages aren’t languages, what are they…? Protocols? Recursively generative patterns? Maybe just grammars?
The best move in any terminological dispute is "my terms are more useful", so this seems like a good reason to keep language as it's defined by generative linguistics. Or, more broadly:
Saussure approaches the essence of language from two sides. For the one, he borrows ideas from Steinthal and Durkheim, concluding that language is a 'social fact'. For the other, he creates a theory of language as a system in and for itself which arises from the association of concepts and words or expressions. Thus, language is a dual system of interactive sub-systems: a conceptual system and a system of linguistic forms. Neither of these can exist without the other because, in Saussure's notion, there are no (proper) expressions without meaning, but also no (organised) meaning without words or expressions.
https://en.wikipedia.org/wiki/Theory_of_language

But if I must, I suppose I am indeed assuming a stance by taking potshots at this narrower (albeit more precise) use of the word language by (obliquely) pointing to counterexamples that could be considered languages in their own right; the sweeping claim that "language is not essential for thought" seems far broader than the narrow sense in which the term is construed in the actual paper.
I don't think that this is actually true at all (and modern neuroscience, sociolinguistics, etc. disagree pretty heavily with Chomsky), but it is impressive how far you can get with modelling natural language grammar with formal methods.
import itertools

def factors(n):
    result = set()
    for i in range(1, int(n**0.5) + 1):
        if n % i == 0:
            result.add(i)
            result.add(n // i)
    return sorted(result)

def main():
    n = 108
    factors_list = factors(n)
    triplets = []
    for a_idx in range(len(factors_list) - 2):
        a = factors_list[a_idx]
        for b_idx in range(a_idx + 1, len(factors_list) - 1):
            b = factors_list[b_idx]
            for c_idx in range(b_idx + 1, len(factors_list)):
                c = factors_list[c_idx]
                if a * b * c == n and a < b < c:
                    triplets.append((a, b, c))
    print("All ways in which three distinct positive integers have a product of 108:")
    for triplet in triplets:
        print(triplet)

if __name__ == "__main__":
    main()
Run: $ python a.py
All ways in which three distinct positive integers have a product of 108:
(1, 2, 54)
(1, 3, 36)
(1, 4, 27)
(1, 6, 18)
(1, 9, 12)
(2, 3, 18)
(2, 6, 9)
(3, 4, 9)
Too smart for its own good, maybe? Still, apparently correct on the first try.

There's an import for `itertools` which isn't used, just as noted in the article for 4o.
Can someone who knows what it's about say how optimal this version is, compared to other answers?
Note the added section compared to the leetcode question (https://leetcode.com/problems/two-sum-ii-input-array-is-sort...) is "And note also that index1 + 1 == index2."
If you enter that into 4o it'll just keep repeating leetcode 167 answers and insist binary search has nothing to do with it.
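For what it's worth, binary search does apply here: with the array sorted ascending, the adjacent sums nums[i] + nums[i+1] are non-decreasing in i, so you can binary-search over i. A minimal sketch (the function name is mine):

```python
def adjacent_two_sum(nums, target):
    # nums is sorted ascending, so the adjacent sums
    # nums[i] + nums[i+1] are non-decreasing in i,
    # which makes a binary search over i valid.
    lo, hi = 0, len(nums) - 2
    while lo <= hi:
        mid = (lo + hi) // 2
        s = nums[mid] + nums[mid + 1]
        if s == target:
            return (mid + 1, mid + 2)  # 1-based indices, LeetCode style
        elif s < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return None
```

That extra "index1 + 1 == index2" constraint is exactly what restores the monotone structure binary search needs, which is presumably why the question was modified that way.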
Tangent: o1-preview isn't going to invent things quite yet (I tried to get it to come up with something original in https://chatgpt.com/share/6715f3fe-cfb8-8001-8c7c-10e970c7d3... out of curiosity after watching https://www.youtube.com/watch?v=1_gJp2uAjO0), but it can sort of reason about things that are more well-established.
On a dark note, I wonder if increasing AI reasoning capability could have dangerous results. Currently, LLMs are relatively empathetic and seem to factor the complex human experience into their responses. Would making LLMs more logical, cold, and calculating result in them stepping on things which humans care about?
There's definitely anecdata supporting this. For some time, ChatGPT was better at a lot of arithmetic/logic-type tasks if you asked it to generate a Python program to do X than if you just asked it to do X, for example. I haven't tested this specifically on the latest generation, but my feeling is it has caught up a lot.
Because the "intelligence" is borrowed from language (lower entropy)
I would place English (or all spoken/written languages in general) first, Math (as discovered) second, and programming languages last.
If I understand correctly, Peter Norvig's argument is about the relative expressivity and precision of Python and natural language with respect to a particular kind of problem. He's saying that Python is a more appropriate language to express factorisation problems, and their solutions, than natural language.
Respectfully, very respectfully, I disagree. The much simpler explanation is that there are many more examples, in the training set of most LLMs, of factorisation problems and their solutions in Python (and other programming languages) than in natural language. Examples in Python etc. are also likely to share more common structure, even down to function and variable names [1], so there are more statistical regularities for a language model to overfit to during training.
We know LLMs do this. We even know how they do it, to an extent. We've known since the time of BERT. For example:
Probing Neural Network Comprehension of Natural Language Arguments
https://aclanthology.org/P19-1459/
Right for the Wrong Reasons: Diagnosing Syntactic Heuristics in Natural Language Inference
https://aclanthology.org/P19-1334/
Given these and other prior results Peter Norvig's single experiment is not enough, and not strong enough, evidence to support his alternative hypothesis. Ideally, we would be able to test an LLM by asking it to solve a factorisation problem in a language in which we can ensure there are very few examples of a solution, but that is unfortunately very hard to do.
______________
[1] Notice for instance how Llama 3.1 immediately identifies the problem as "find_factors", even though there's no such instruction in the two prompts. That's because it's seen that kind of code in the context of that kind of question during training. The other LLMs seem to take terms from the prompts instead.
Not immediately clear what this means or if it is good or bad or something else.
"But some of them forgot that 1 could be a factor of 108"
I struggle to take them seriously. The anthropomorphization of AI into something that can "know" or "forget" is ridiculous and shows a serious lack of caution and thinking when working with them.
Likewise it leads people into wasting time on "prompt engineering" exercises that produce overfit and worthless solutions to trivial problems because it makes them feel like they're revealing "hidden knowledge."
Genuinely disappointing use of human time and of commercial electricity.