- Java's been trying to add f/t-strings, but its designers appear to be perfectionists to a fault, unable to accept anything that doesn't solve every problem imaginable: [1].
- Go developers seem to have taken no more than 5 minutes considering the problem, then thoughtlessly discarded it: [2]. A position born from pure ignorance as far as I'm concerned.
- Python, on the other hand, has consistently put forth a balanced approach of discussing each new way of formatting strings for some time, deciding on a good enough implementation and going with it.
In the end, I find it hard to disagree with Python's approach. Its devs have been getting value from .format() (arguably the best variant of sprintf) since 2008, from f-strings since 2016, and now from t-strings.
[1]: https://news.ycombinator.com/item?id=40737095
[2]: https://github.com/golang/go/issues/34174#issuecomment-14509...
There are a million things in go that could be described this way.
Are they wrong about this issue? I think they are. There is a big difference in ergonomics between String interpolation and something like fmt.Sprintf, and the performance cost of fmt.Sprintf is non-trivial as well. But I can't say they didn't put any thought into this.
As we've seen multiple times with Go generics and error handling before, their slow progress on correcting serious usability issues with the language stems from the same basic reason we see with recent Java features: they are just being quite perfectionist about it. And unlike Java, the Go team would not even release an experimental feature unless they feel quite good about it.
A format function that arbitrarily executes code from within a format string sounds like a complete nightmare. Log4j is an example.
The rejection's example shows how that arbitrary code within the string could instead be fixed functions outside of a string. Safer, easier for compilers and programmers; unless an 'eval' for strings is what was desired. (Offhand I've only seen eval in /scripted/ languages; go makes binaries.)
An f/t-string is syntax, not runtime.
Instead of
"Hello " + subject + "!"
you write f"Hello {subject}!"
That subject is simply a normal code expression, but one that occurs after the opening quote of the literal and before its closing quote.

And instead of
query(["SELECT * FROM account WHERE id = ", " AND active"], [id])
you write query(t"SELECT * FROM account WHERE id = {id} AND active")
It's a way of writing string literals that, if anything, makes injection less likely.

When compiling, those can be lowered to simple string concatenation, just like any for loop can be lowered to and represented as a while.
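Since t-strings only land in Python 3.14, here's a toy sketch of what a query() helper could do with the (strings, values) pair that a t-string literal lowers to. `build_query` and the "?" placeholder convention are illustrative assumptions, not any real driver's API:

```python
# Toy sketch: consuming the (strings, values) pair a t-string lowers to.
# "build_query" is a hypothetical helper, not a real database API.

def build_query(strings, values):
    # Join the static SQL fragments with "?" placeholders; the captured
    # values are returned separately so the driver can parameterize them.
    # The user input never becomes part of the SQL text itself.
    sql = "?".join(strings)
    return sql, list(values)

account_id = 42
sql, params = build_query(
    ["SELECT * FROM account WHERE id = ", " AND active"],
    [account_id],
)
print(sql)     # SELECT * FROM account WHERE id = ? AND active
print(params)  # [42]
```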
So, a template? I certainly ain't gonna be using go for its mustache support.
But as is all too common in the go community, there seems to be a lot of confusion about what is proposed, and resistance to any change.
The issue you linked was opened in 2019 and closed with no new comments in 2023, with active discussion through 2022.
Do Ruby strings already allow lazy processing?
I'm not talking about wrapping them in a block and passing the block (all languages can do that with lambdas) but having a literal that eventually resolves to something when you use it.
fmt.Sprintf("This house is %s tall", measurements(2.5))
fmt.Sprint("This house is ", measurements(2.5), " tall")
And the Python f-string equivalent: f"This house is {measurements(2.5)} tall"
The Sprintf version sucks because for every formatting argument, like "%s", we need to stop reading the string and look for the corresponding argument to the function. Not so bad for one argument, but it gets linearly worse.

Sprint is better in that regard, we can read from left to right without interruptions, but is a pain to write due to all the punctuation, never mind refactoring. For example, try adding a new variable between "This" and "house". With the f-string you just type {var} before "house" and you're done. With Sprint, you're now juggling quotation marks and commas. And that's just a simple addition of a new variable. Moving variables or substrings around is even worse.
Summing up, f-strings are substantially more ergonomic to use and since string formatting is so commonly done, this adds up quickly.
_log(f"My variable is {x + y}")
Reads a lot more fluently to me than
_log("My variable is {}".format(x+y))
or _log("My variable is {z}".format(z=x+y))
It's nothing too profound.

Even PEP 498 (f-strings) was a battle.
STR."Hello \{this.user.firstname()}, how are you?\nIt's \{tempC}°C today!"
compared to scala
s"Hello ${this.user.firstname()}, how are you?\nIt's ${tempC}°C today!"
STR."" ? really?
I am super excited this is finally accepted. I started working on PEP 501 4 years ago.
There are also loads of people basically defaulting to "no" on new features, because they understand that there is a cost of supporting things. I will often disagree about the evaluation of that cost, but it's hard to say there is no cost.
Nobody wants a system that is unusable, slow, hard to implement for, or hard to understand. People sometimes just have different weights on each of these properties. And some people are in a very awkward position of overestimating costs due to overestimating implementation effort. So you end up in discussions like "this is hard to understand!" "No it isn't!"
Hard to move beyond, but the existence of these kinds of conversations serve, in a way, as proof that people aren't jumping on every new feature. Python is still a language that is conservative in what it adds.
This should actually inspire more confidence in people that features added to Python are _useful_, because there are many people who are defaulting to not adding new features. Recent additions to Python speeding up is more an indicator of the process improving and identifying the good stuff rather than a lowering of the bar.
[0]: I often think that these discussions get fairly intense. Understandability is definitely a core Python value, but I think sometimes discussions confuse "understandability" with "amount of things in the system". You don't have to fully understand pervasive hashing to understand Python's pervasive value equality semantics! A complex system is needed to support a simple one!
There have been processes put into place in recent years to try to curb the difficulty of things. One of those is that all new PEPs have to include a "how can you teach this to beginners" section, as seen here on this PEP: https://peps.python.org/pep-0750/#how-to-teach-this
As Nick mentioned, PEP 750 had a long and winding road to its final acceptance; as the process wore on, and the complexities of the earliest cuts of the PEPs were reconsidered, the two converged.
[0] The very first announcement: https://discuss.python.org/t/pep-750-tag-strings-for-writing...
[1] Much later in the PEP process: https://discuss.python.org/t/pep750-template-strings-new-upd...
https://lucumr.pocoo.org/2016/12/29/careful-with-str-format/
So, right now, you have two options to log:
1. `logger.debug(f'Processing {x}')` - looks great, but evaluates anyway, even if logging level > `logging.DEBUG`;
2. `logger.debug('Processing %s', x)` - won't evaluate till necessary.
What would be the approach with t-strings in this case? Would we get any benefits?
For a logger, t-strings are mostly just a more pleasant and less bug-prone syntax for #2.
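To make that concrete, here's a rough simulation, with a plain class standing in for the 3.14 Template type: interpolated values are captured eagerly (like an f-string), but the actual string rendering is deferred until a handler formats the record, which is the same win as the %s style.

```python
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("demo")

class DeferredTemplate:
    """Stand-in for PEP 750's Template: captures parts eagerly, renders lazily."""

    def __init__(self, strings, values):
        self.strings, self.values = strings, values

    def __str__(self):
        # The potentially expensive rendering only happens here, and logging
        # skips it entirely when the record's level is disabled.
        out = [self.strings[0]]
        for value, tail in zip(self.values, self.strings[1:]):
            out.append(str(value))
            out.append(tail)
        return "".join(out)

x = [1, 2, 3]
msg = DeferredTemplate(["Processing ", ""], [x])
log.debug("%s", msg)  # DEBUG is disabled here: __str__ is never called
log.info("%s", msg)   # INFO is enabled: renders "Processing [1, 2, 3]"
```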
Also: were prompt templates for LLM prompt chaining a use case that influenced the design in any way (examples being LangChain and dozens of other libraries with similar functionality)?
The main reason for not having deferred evaluation was that it over-complicated the feature quite a bit and introduced a new rune. Deferred evaluation also has the potential to dramatically increase complexity for beginners in the language, as it can be confusing to follow if you don't know what is going on. Which means "deferred by default" wasn't going to be accepted.
As for LLMs, it was not the main consideration, as the PEP process here started before LLMs were popular.
Maybe not directly, but the Python community is full of LLM users and so I think there's a general awareness of the issues.
>>> template = 'Hello, {name}'
>>> template.format(name='Bob')
'Hello, Bob'
Until this, there wasn't a way to use f-string formatting without interpolating the results at that moment:

>>> template = f'Hello, {name}'
Traceback (most recent call last):
File "<python-input-5>", line 1, in <module>
template = f'Hello, {name}'
^^^^
NameError: name 'name' is not defined
It was annoying being able to use f-strings almost everywhere, but str.format in enough odd corners that you have to put up with it.

The point of evaluation of the expressions is the same.
>>> template = t'Hello, {name}'
is still an error if you haven't defined name.

BUT the result of a t-string is not a string; it is a Template which has two attributes:
strings: ["Hello, ", ""]
interpolations: [name]
So you can then operate on the parts separately (HTML escape, pass to SQL driver, etc.).

There is an observation that you can use `lambda` inside to delay evaluation of an interpolation, but I think this lambda captures any variables it uses from the context.
Actually lambda works fine here
>>> name = 'Sue'
>>> template = lambda name: f'Hello {name}'
>>> template('Bob')
'Hello Bob'

Bummer. This could have been so useful:
statement_endpoint: Final = "/api/v2/accounts/{iban}/statement"
(Though str.format isn't really that bad here either.)

That's correct, they don't. Evaluation of t-string expressions is immediate, just like with f-strings.
Since we have the full generality of Python at our disposal, a typical solution is to simply wrap your t-string in a function or a lambda.
(An early version of the PEP had tools for deferred evaluation but these were dropped for being too complex, particularly for a first cut.)
def my_template(name: str) -> Template:
    return t"Hello, {name}"

>>> template = lambda name: f'Hello {name}'
>>> template('Bob')

I guess it's more concise, but differentiating between eager and delayed execution with a single character makes the language less readable for people who are not as familiar with Python (especially latest update syntax etc).
EDIT: to flesh out with an example:
class Sanitised(str):
    # init function that sanitises, or just use as a tag type that has an
    # external sanitisation function
    ...

def sqltemplate(name: Sanitised) -> str:
    return f"select * from {name}"

# Usage
sqltemplate(name=sanitise("some injection"))

# Attempt to pass unsanitised
sqltemplate(name="some injection")  # type check error
> just use a tag type and a sanitisation function that takes a string and returns the type
Okay, so you have a `sqlstring(somestring)` function, and the dev has to call it. But... what if they pass in an f-string?
`sqlstring(f'select from mytable where col = {value}')`
You haven't actually prevented/enforced anything. With template strings, it's turtles all the way down. You can enforce that they pass in a template, and you can safely escape anything that is a variable, because it's impossible to have a variable type (possible injection) in the template literal.
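A minimal sketch of that enforcement, using a stand-in Template class since the real one requires 3.14 (`execute` and the placeholder scheme here are hypothetical):

```python
class Template:
    """Minimal stand-in for the PEP 750 Template type."""

    def __init__(self, strings, values):
        self.strings, self.values = strings, values

def execute(query):
    # Refuse plain strings outright. An f-string is already a rendered str
    # by the time it arrives here, so it gets rejected too.
    if not isinstance(query, Template):
        raise TypeError("execute() requires a Template, not str")
    sql = "?".join(query.strings)  # placeholders where the values go
    return sql, list(query.values)

try:
    execute("SELECT * FROM t WHERE id = 42")  # what an f-string decays to
except TypeError as exc:
    print(exc)

print(execute(Template(["SELECT * FROM t WHERE id = ", ""], [42])))
```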
This example still works, the entire f-string is sanitised (including whatever the value of name was). Assuming sqlstring is the sanitisation function.
The “template” would be a separate function that returns an f-string bound from function arguments.
But normalizing one pattern ensures the whole community builds APIs around it. This creates a unified ecosystem.
And it's a very clean API that is a no brainer for the string user.
Modules, classes, protocols, functions returning functions: all options in Python, each works well for reuse, no need to use more than two at once. Yet the world swims upstream.
evil = "<script>alert('evil')</script>"
sanitized = Sanitized(evil)
whoops = f"<p>{evil}</p>"

If you create a subclass of str which has an init function that sanitises, then you can't create a Sanitised type by casting, right?
And even if you could, there is also nothing stopping you from using a different function to “html” that just returns the string without sanitising. They are on the same relative level of safety.
https://peps.python.org/pep-0750/#arbitrary-string-literal-p...
I mean, they took "yield" and @decorator, we have a trade deficit.
This looks really great! It's almost exactly like JavaScript tagged template literals, just with a fixed tag function of:
(strings, ...values) => {strings, values};
It's pretty interesting how what would be the tag function in JavaScript, and the arguments to it, are separated by the Template class. At first it seems like this will add noise since it takes more characters to write, but it can make nested templates more compact.

Take this type of nested template structure in JS:
html`<ul>${items.map((i) => html`<li>${i}</li>`)}</ul>`
With PEP 750, I suppose this would be:

html(t"<ul>{map(lambda i: t"<li>{i}</li>", items)}</ul>")
Python's unfortunate lambda syntax aside, not needing html() around nested templates could be nice (assuming an html() function would interpret plain Templates as HTML).

In JavaScript, reliable syntax highlighting and type-checking are keyed off the fact that a template can only ever have a single tag, so a static analyzer can know what the nested language is. In Python, separating the template creation from the processing could possibly introduce some ambiguities, but hopefully that's rare in practice.
I personally would be interested to see whether a special html() processing instruction could both emit server-rendered HTML and, say, lit-html JavaScript templates that could be used to update the DOM client-side with new data. That could lead to some very transparent fine-grained single-page updates, from what looks like traditional server-only code.
Agreed; it feels natural to accept plain templates (and simple sequences of plain templates) as HTML; this is hinted at in the PEP.
> html(t"<ul>{map(lambda i: t"<li>{i}</li>", items)}</ul>")
Perhaps more idiomatically: html(t"<ul>{(t"<li>{i}</li>" for i in items)}</ul>")
> syntax highlighting and type-checking are keyed off the fact that a template can only ever have a single tag
Yes, this is a key difference and something we agonized a bit over as the PEP came together. In the (very) long term, I'm hopeful that we see type annotations used to indicate the expected string content type. In the nearer term, I think a certain amount of "clever kludginess" will be necessary in tools like (say) black if they wish to provide specialized formatting for common types.
> a special html() processing instruction could both emit server-rendered HTML and say, lit-html JavaScript templates that could be used to update the DOM client-side with new data
I'd love to see this and it's exactly the sort of thing I'm hoping emerges from PEP 750 over time. Please do reach out if you'd like to talk it over!
html(['ul', {'class': 'foo'}, *(['li', item] for item in items)])
I guess template strings do make it more concise. Kind of like Racket's "#lang at-exp racket".

The benefit of the lisp-like representation is you have the entire structure of the data, not just a sequence of already-serialized and not-yet-serialized pieces.
html(t"<ul>{(t"<li>{i}</li>" for i in items)}</ul>")

One possibility would be to define __and__ on html so that you can write e.g. html&t"<b>{x}</b>" (or whichever operator looks the best).
Edit: Sorry I was snarky, its late here.
I already didn't like f-strings and t-strings just add complexity to the language to fix a problem introduced by f-strings.
We really don't need more syntax for string interpolation, in my opinion string.format is the optimal. I could even live with % just because the syntax has been around for so long.
I'd rather the language team focus on more substantive stuff.
Why stop there? Go full Perl (:
I think Python needs more quoting operators, too. Maybe qq{} qq() q// ...
[I say this as someone who actually likes Perl and chuckles from afar at such Python developments. May you get there one day!]
My issue with them is that you have to write your syntax in the string; complex expressions, dictionary access and such become awkward.
But, this whole thing is bike-shedding in my opinion, and I don't really care about the color of the bike shed.
My understanding of template strings is they are like f-strings but don't do the interpolation bit. The name binding is there but the values are not formatted into the string yet. So effectively this provides a "hook" into the stringification of the interpolated values, right?
If so, this seems like a very narrow feature to bake into the language... Personally, I haven't had issues with introducing some abstraction like functions or custom types to do custom interpolation.
The best use case I know of for these kinds of things is as a way to prevent sql injection. SQL injection is a really annoying attack because the "obvious" way to insert dynamic data into your queries is exactly the wrong way. With a template string you can present a nice API for your sql library where you just pass it "a string" but it can decompose that string into query and arguments for proper parameterization itself without the caller having to think about it.
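As a sketch of that decomposition against real sqlite3 (a plain class simulates the Template split, since t-strings need a 3.14 interpreter; `run` is a made-up helper, not a real library function):

```python
import sqlite3

class Template:
    """Stand-in for the PEP 750 Template: static strings plus captured values."""

    def __init__(self, strings, values):
        self.strings, self.values = strings, values

def run(conn, template):
    # Join the static fragments with "?" placeholders and hand the captured
    # values to the driver, which parameterizes them properly.
    sql = "?".join(template.strings)
    return conn.execute(sql, template.values).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER, name TEXT)")
conn.execute("INSERT INTO account VALUES (1, 'alice'), (2, 'bob')")

user_input = "1 OR 1=1"  # classic injection attempt
rows = run(conn, Template(["SELECT name FROM account WHERE id = ", ""],
                          [user_input]))
print(rows)  # [] -- the whole input is bound as one value, never parsed as SQL
```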
That's exactly what it is. It's just that they use the word "council" instead of "committee".
Whether or not this is technically a swift call is in the eye of the beholder.
[0]: https://docs.python.org/3/library/string.html#template-strin...
def f(template: Template) -> str:
    parts = []
    for item in template:
        match item:
            case str() as s:
                parts.append(s)
            case Interpolation(value, _, conversion, format_spec):
                value = convert(value, conversion)
                value = format(value, format_spec)
                parts.append(value)
    return "".join(parts)
Is this what idiomatic Python has become? 11 lines to express a loop, a conditional and a couple of function calls? I use Python because I want to write executable pseudocode, not excessive superfluousness.

By contrast, here's the equivalent Ruby:
def f(template) = template.map { |item|
  item.is_a?(Interpolation) ? item.value.convert(item.conversion).format(item.format_spec) : item
}.join

def f(template: Template) -> str:
    return "".join(
        item if isinstance(item, str) else
        format(convert(item.value, item.conversion), item.format_spec)
        for item in template
    )
Or, y'know, several other ways that might feel more idiomatic depending on where you're coming from.

def f(template):
    return (for item in template:
        isinstance(item, str) then item else
        format(convert(item.value, item.conversion), item.format_spec)
    ).join('')

Python has always been my preference, and a couple of my coworkers have always preferred Ruby. Different strokes for different folks.
Nah, idiomatic Python always used to prefer comprehensions over explicit loops. This is just the `match` statement making code 3x longer than it needs to be.
def _f_part(item) -> str:
    match item:
        case str() as s:
            return s
        case Interpolation(value, _, conversion, format_spec):
            return format(convert(value, conversion), format_spec)

def f(template: Template) -> str:
    return ''.join(map(_f_part, template))

The `match` part could still be written using Python's if-expression syntax, too. But this way avoids having very long lines like in the Ruby example, and also destructures `item` to avoid repeatedly writing `item.`.

I very frequently use this helper-function (or sometimes a generator) idiom in order to avoid building a temporary list to `.join` (or subject to other processing). It separates per-item processing from the overall algorithm, which suits my interpretation of the "functions should do one thing" maxim.
If I were tasked to modify the Python version to say, handle the case where `item` is an int, it would be immediately obvious to me that all I need to do is modify the `match` statement with `case int() as i:`, I don't even need to know Python to figure that out. On the other hand, modifying the Ruby version seems to require intimate knowledge of its syntax.
I don't particularly love the Ruby code either, though - I think the ideal implementation would be something like:
fn stringify(item) =>
    item.is_a(Interpolation) then
        item.value.convert(item.conversion).format(item.format_spec)
    else item.to_string()

fn f(template) => template.map(stringify).join()
[0] https://discuss.python.org/t/gauging-sentiment-on-pattern-ma...

What do you mean? Python has always been that way. "Explicit is better than implicit. [..] Readability counts." from the Zen of Python.
> By contrast, here's the equivalent Ruby:
Which is awful to read. And of course you could write it similarly short in Python. But it is not the purpose of documentation to write short, cryptic code.
Almost all Python programmers should be familiar with list comprehensions - this should be easy to understand:
parts = [... if isinstance(item, Interpolation) else ... for item in template]
Instead the example uses an explicit loop, coupled with the quirks of the `match` statement. This is much less readable IMO:

parts = []
for item in template:
    match item:
        case str() as s:
            parts.append(...)
        case Interpolation(value, _, conversion, format_spec):
            parts.append(...)
> [Ruby] is awful to read

I think for someone with a basic knowledge of Ruby, it's more understandable than the Python. It's a combination of basic Ruby features, nothing advanced.
I don't particularly love Ruby's syntax either, though - I think the ideal implementation would be something like:
fn stringify(item) =>
item.is_a(Interpolation) then
item.value.convert(item.conversion).format(item.format_spec)
else item.to_string()
fn f(template) => template.map(stringify).join()

fn stringify(item) =>
    item.is_a(String) then item else
    item.value.convert(item.conversion).format(item.format_spec)

fn f(template) => template.map(stringify).join()

Can't think of a good reason why I would need this rather than just a simple f-string.
Any unsafe string input should normally be sanitized before being added in a template/concatenation; leaving the sanitization to the end doesn't seem like the best approach, but OK.
One of the PEP's developers, Lysandros, presented this in our local meetup, so I am passingly familiar with it, but still, I might be missing something.
I guess the crux of it is that I don't understand why it's `t"some string"` instead of `Template("some string")`. What do we gain by the shorthand?
Because it's new syntax, it allows for parsing the literal ahead of time and eagerly evaluating the substitutions. Code like
bar = 42
spam = t"foo {bar*bar} baz"
essentially gets translated into

bar = 42
spam = Template("foo ", Interpolation(bar*bar), " baz")
That is: subsequent changes to `bar` won't affect the result of evaluating the template, but that evaluation can still apply custom rules.

With templates:
mysql.execute(t"DELETE FROM table WHERE id={id} AND param1={param1}")
Without templates:

mysql.execute("DELETE FROM table WHERE id=%s AND param1=%s", [id, param1])

So one less argument to pass if we use templates.

But yeah it does seem a bit confusing, and maybe kinda not pythonic? Not sure.
[1] https://docs.python.org/3/library/string.html#template-strin...
edit: this was mentioned by milesrout in https://news.ycombinator.com/item?id=43649607
I recently asked him:
--
Hi David! I am a huge long time fan of SWIG and your numerous epic talks on Python.
I remember watching you give a kinda recent talk where you made the point that it’s a great idea to take advantage of the latest features in Python, instead of wasting your time trying to be backwards compatible.
I think you discussed how great f-strings were, which I was originally skeptical about, but you convinced me to change my mind.
I’ve googled around and can’t find that talk any more, so maybe I was confabulating, or it had a weird name, or maybe you’ve just given so many great talks I couldn’t find the needle in the haystack.
What made me want to re-watch and link my cow-orkers to your talk was the recent rolling out of PEP 701: Syntactic formalization of f-strings, which makes f-strings even better!
Oh by the way, do you have any SWIG SWAG? I’d totally proudly wear a SWIG t-shirt!
-Don
--
He replied:
Hi Don,
It was probably the "Fun of Reinvention".
https://www.youtube.com/watch?v=js_0wjzuMfc
If not, all other talks can be found at:
https://www.dabeaz.com/talks.html
As for swag, I got nothing. Sorry!
Cheers, Dave
--
Thank you!
This must be some corollary of rule 34:
https://www.swigwholesale.com/swig-swag
(Don’t worry, sfw!)
-Don
--
The f-strings section starts at 10:24 where he's live coding Python on a tombstone with a dead parrot. But the whole talk is well worth watching, like all his talks!
I’m having trouble understanding this - Can someone please help out with an example use case for this? It seems like before with an f string we had instant evaluation, now with a t string we control the evaluation, why would we further delay evaluation - Is it just to utilise running a function on a string first (i.e. save a foo = process(bar) line?)
You don't completely control the evaluation.
From the PEP:
> Template strings are evaluated eagerly from left to right, just like f-strings. This means that interpolations are evaluated immediately when the template string is processed, not deferred or wrapped in lambdas.
If one of the things you are interpolating is, as a silly example, an invocation of a slow recursive fibonacci function, the template string expression itself (resulting in a Template object) will take a long while to evaluate.
Are you saying that calling:
template = t"{fib_slow()}"

Will immediately run the function, as opposed to when the __str__ is called (or is it because of __repr__?) - Apparently I might just have to sit down with the code and grok it that way, but thanks for helping me understand!

This is probably the best overview of why it was withdrawn:
https://mail.openjdk.org/pipermail/amber-spec-experts/2024-A...
That is, you must process a Template in some way to get a useful string out the other side. This is why Template.__str__() is spec'd to be the same as Template.__repr__().
If you want to render a Template like an f-string for some reason, the pep750 examples repo contains an implementation of an `f(template: Template) -> str` method: https://github.com/davepeck/pep750-examples/blob/main/pep/fs...
This could be revisited, for instance to add `Template.format()` in the future.
That said, I think this is a great bit of work and I look forward to getting to use it! Thank you!
sql"SELECT FROM ..."
or re"\d\d[abc]"
that the development environment could highlight properly, that would ... I don't know. In the end t and f strings don't do anything that a t() and f() function couldn't have done, except they are nice. So it would be nice to have more.

I think this gives you slightly more control before interpolating.
If you want control flow inside a template, jinja and friends are probably still useful.
It is now a generic expression evaluator and a template renderer!
> https://peps.python.org/pep-0750/#approaches-to-lazy-evaluat...
Hmm, I have a feeling there's a pitfall.
Excited to see what libraries and tooling comes out of this.
This is one place where s-expressions of Lisp make embedding these DSLs syntactically easier.
To borrow the PEP's HTML examples:
#lang racket/base
(require html-template)
(define evil "<script>alert('evil')</script>")
(html-template (p (% evil)))
; <p><script>alert('evil')</script></p>
(define attributes '((src "shrubbery.jpg") (alt "looks nice")))
(html-template (img (@ (%sxml attributes))))
; <img src="shrubbery.jpg" alt="looks nice">
You can see how the parentheses syntax will practically scale better, to a larger and more complex mix of HTML and host language expressions. (A multi-line example using the normal text editor autoindent is on "https://docs.racket-lang.org/html-template/".)

PEP 750 t-string literals work with Python's triple-quote syntax (and its lesser-used implicit string concat syntax):
lots_of_html = t"""
    <div>
        <main>
            <h1>Hello</h1>
        </main>
    </div>
"""
My hope is that we'll quickly see the tooling ecosystem catch up and -- just like in JavaScript-land -- support syntax coloring and formatting specific types of content in t-strings, like HTML.

When concatenating strings is the harder approach, it is really beautiful.
>>> hello_world = {"hello":"HELL" ,"world":"O'WORLD"}
>>> json_template='{"hello":"%(hello)s","world":"%(world)s"}'
>>> print(json_template % hello_world)
{"hello":"HELL","world":"O'WORLD"}
I mostly use Python in scientific contexts, and hitting end-of-life after five years means that for a lot of projects, code needs to transition language versions in the middle of a project. Not to mention the damage to reproducibility. Once something is marked "end of life" it means that future OS versions are going to have a really good reason to say "this code shouldn't even be able to run on our new OS."
Template strings seem OK, but I would give up all new language features in a heartbeat to get a bit of long term support.
And your scientific context is a distinct minority for Python now. Most new development for Python is for data/AI. Considering LLMs get updated every quarter and deprecated every year, there is no appetite for code that doesn't get updated for 5 years.
The code will be updated over five years, but there's no need to be on continual version churn on the underlying language. And frankly I'm surprised that it's tolerated so widely in the community. Trying to run a Node project from 5 years ago is often an exercise in futility, and it will be a big shame when/if that happens to Python.
Your Python interpreter will not spontaneously combust due to being end-of-life. It just eventually won't be able to run new versions of tools; but your existing tool versions should also work fine for a very long time. All you're missing out on is bugfixes, which third parties (such as a Linux distro) are often willing to provide.
When a language like Python doesn't innovate at this rate, eventually people will get annoyed about how primitive it ends up feeling compared to languages that have been around for less than half as long. The devs' time and resources are limited and they've clearly advertised their time preference and committed to a reliable schedule - this is an explicit attempt to accommodate users like you better, compared to the historical attitude of releasing the next minor version "when it's done". It also means that they're locked in to supporting five versions at a time while developing a sixth. There's only so much that can reasonably be expected here.
Seriously, what you're getting here is well above the curve for open-source development.
But it's not a long time in the OP's field of science. Unfortunately despite a strong preference for Python in the scientific community, the language's design team seem to ignore that community's needs entirely, in favour of the needs of large technology companies.
I was hopeful that in the transition from a BDFL-based governance system to a Steering Council, we would see a larger variety of experience and opinions designing the language. Instead, I don't think there has ever been a single scientist, finance worker etc on the Steering Council - it's always software developers, almost always employees of large software companies.
> There should be one-- and preferably only one --obvious way to do it.
Use f-strings if you can, otherwise use t-strings.
I'm really loving this lovecraftian space the "batteries included" and "one obvious way to do it" design philosophy brought us!
The stated use case is to avoid injection attacks. However the primary reason why injection attacks work is that the easiest way to write the code makes it vulnerable to injection attacks. This remains true, and so injection attacks will continue to happen.
Templates offer to improve this by adding interpolations, which are able to do things like escaping. However the code for said interpolations is now located at some distance from the template. You therefore get code that locally looks good, even if it has security mistakes. Instead of one source of error - the developer interpolated - you now have three. The developer forgot to interpolate, the developer chose the wrong interpolation, or the interpolation itself got it wrong. We now have more sources of error, and more action at a distance. Which makes it harder to audit the code for sources of potential error.
This is something I've observed over my life. Developers don't notice the cognitive overhead of all of the abstractions that they have internalized. Therefore over time they add more. This results in code that works "by magic". And serious problems if the magic doesn't quite work in the way that developers are relying on.
Templates are yet another step towards "more magic". With predictable consequences down the road.
Template.__str__() is equivalent to Template.__repr__(), which is to say that these aren't f-strings in an important sense: you can't get a useful string out of them until you process them in some way.
The expectation is that developers will typically make use of well-established libraries that build on top of t-strings. For instance, developers might grab a package that provides an html() function that accepts Template instances and returns some Element type, which can then be safely converted into a string.
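To make that concrete, here is a minimal sketch of the processing pattern. The real types live in string.templatelib as of Python 3.14 and are produced by t"..." literals; the stand-in Template/Interpolation classes and the render_html() function below are hypothetical simplifications for illustration, not the library API:

```python
import html as _html
from dataclasses import dataclass

# Stand-ins for Python 3.14's string.templatelib types (simplified;
# the real Template is created by a t"..." literal, not by hand).
@dataclass
class Interpolation:
    value: object      # the evaluated expression's value
    expression: str    # its source text, e.g. "user"

@dataclass
class Template:
    parts: tuple       # interleaved static str segments and Interpolations

def render_html(template: Template) -> str:
    """Escape every interpolated value; trust only the static segments."""
    out = []
    for part in template.parts:
        if isinstance(part, Interpolation):
            out.append(_html.escape(str(part.value)))
        else:
            out.append(part)
    return "".join(out)

# Rough equivalent of: render_html(t"<p>Hello {user}!</p>")
# with user = "<script>alert(1)</script>"
tmpl = Template(("<p>Hello ", Interpolation("<script>alert(1)</script>", "user"), "!</p>"))
print(render_html(tmpl))  # <p>Hello &lt;script&gt;alert(1)&lt;/script&gt;!</p>
```

The point is that the static HTML and the user-supplied value arrive at the library as distinct pieces, so the library can escape the latter without the caller remembering to.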
Stepping back, t-strings are a pythonic parallel to JavaScript's tagged template strings. They have many of the same advantages and drawbacks.
In PHP, people used to just call mysql_query on a string, and all the escaping was done with mysql_escape_string. By your logic, that nice locality of query construction and sanitization should have improved security, but my god did it ever not do that.
It was exactly layers of abstractions, moving things far away from the programmer, with prepared statements to ORMs, that meaningfully reduced the number of SQL injection vulnerabilities.
Another example is JavaScript: how many XSS vulnerabilities never happened because of all the shadow-DOM frameworks? Layers of abstraction like these (JSX, etc.) are a major reason we don't see many XSS vulnerabilities nowadays.
The idea is for the interpolation to be provided by the library - just as the library is expected to provide a quoting/escaping/sanitization function today. But now the interpolation function can demand to receive an instance of the new Template type, and raise a `TypeError` if given a pre-formatted string. And that work can perhaps also be rolled into the same interface as the actual querying command. And manually creating a Template instance from a pre-formatted string is difficult and sticks out like a sore thumb (and it would be easy for linters to detect the pattern).
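A sketch of that type-gated interface, using the same simplified stand-in types rather than the real string.templatelib API (the sql() function and its "?" placeholder scheme are hypothetical):

```python
from dataclasses import dataclass

# Simplified stand-ins; the real types are created by t"..." literals
# in Python 3.14's string.templatelib.
@dataclass
class Interpolation:
    value: object

@dataclass
class Template:
    parts: tuple  # interleaved static str segments and Interpolations

def sql(template):
    """Hypothetical query builder: turn a template into a parameterized
    query, refusing pre-formatted strings outright."""
    if isinstance(template, str):
        raise TypeError("sql() takes a Template, not a pre-formatted str")
    query, params = [], []
    for part in template.parts:
        if isinstance(part, Interpolation):
            query.append("?")          # placeholder; value goes to params
            params.append(part.value)
        else:
            query.append(part)
    return "".join(query), params

# Rough equivalent of: sql(t"SELECT * FROM users WHERE name = {name}")
q, p = sql(Template(("SELECT * FROM users WHERE name = ", Interpolation("Bob'; --"))))
print(q, p)  # SELECT * FROM users WHERE name = ? ["Bob'; --"]

# The unsafe spelling now fails loudly at runtime:
# sql("SELECT * FROM users WHERE name = '" + name + "'")  -> TypeError
```

Because a plain string raises TypeError, the vulnerable pattern cannot silently slip through the same entry point.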
> This is something I've observed over my life. Developers don't notice the cognitive overhead of all of the abstractions that they have internalized. Therefore over time they add more. This results in code that works "by magic". And serious problems if the magic doesn't quite work in the way that developers are relying on.
By this logic, we couldn't have languages like Python at all.
Could you give examples of this?
> The developer forgot to interpolate
What would this look like? The only way to get dynamic/user input into a template is either through interpolation or concatenation.
Before:
f"..html_str..." + user_provided_str # oops! should have: html_str + sanitize(user_provided_str)
After:
t"...html_template..." + user_provided_str # oops! should have: t"...html_template...{user_provided_str}"
Does this really leave us worse off?
Unless you're referring to something like this:
Before:
html = "love > war" # oops! should have been: html = "love &gt; war"
After:
html = "love > war" # oops! should have been: html = t"love > war"
But then the two scenarios are nearly identical.
> the developer chose the wrong interpolation
What kind of interpolation would be the "wrong interpolation"?
> or the interpolation itself got it wrong.
Isn't that analogous to sanitize(user_provided_str) having a bug?
> the developer chose the wrong interpolation

Not possible if the library converts from template to interpolation itself.

> or the interpolation itself got it wrong

Sure, but that would be library code.
Background: TXR Lisp already has quasi-string-literals, which are template strings that do implicit interpolation when evaluated. They do not produce an object where you can inspect the values and fixed strings and do things with these before the merge.
1> (let ((user "Bob") (greeting "how are you?"))
`Hello @user, @greeting`)
"Hello Bob, how are you?"
The underlying syntax behind the `...` notation is the sys:quasi expression. We can quote the quasistring and look at the car (head symbol) and cdr (rest of the list):
2> (car '`Hello @user, @greeting`)
sys:quasi
3> (cdr '`Hello @user, @greeting`)
("Hello " @user ", " @greeting)
So that is a bit like f-strings.
OK, now with those pieces, I just now made a macro te that gives us a template object.
4> (load "template")
nil
You invoke it with one argument as (te <quasistring>):
5> (let ((user "Bob") (greeting "how are you?"))
(te `Hello @user, @greeting`))
#S(template merge #<interpreted fun: lambda (#:self-0073)> strings #("Hello " ", ")
vals #("Bob" "how are you?"))
6> *5.vals
#("Bob" "how are you?")
7> *5.strings
#("Hello " ", ")
8> *5.(merge)
"Hello Bob, how are you?"
9> (set [*5.vals 0] "Alice")
"Alice"
10> *5.(merge)
"Hello Alice, how are you?"
You can see the object captured the values from the lexical variables, and we can rewrite them, like changing Bob to Alice. When we call the merge method on the object, it combines the template and the values. (We cannot alter the strings in this implementation; they are for "informational purposes only".)
Here is how the macro expands:
11> (macroexpand-1 '(te `Hello @user, @greeting`))
(new template
merge (lambda (#:self-0073)
(let* ((#:vals-0074
#:self-0073.vals)
(#:var-0075
[#:vals-0074
0])
(#:var-0076
[#:vals-0074
1]))
`Hello @{#:var-0075}, @{#:var-0076}`))
strings '#("Hello " ", ")
vals (vec user greeting))
It produces a constructor invocation (new template ...) which specifies values for the slots merge, strings and vals.
The initialization of strings is trivial: just a vector of the strings pulled from the quasistring.
The vals slot is initialized by a `(vec ...)` call whose arguments are the expressions from the quasistring. This gets evaluated in the right lexical scope where the macro is expanded. This is how we capture those values.
The most complicated part is the lambda expression that initializes merge. This takes a single argument, which is the self-object, anonymized by a gensym variable for hygiene. It binds the object's vals slot to another gensym lexical. Then a gensym local variable is bound for each value, referencing into consecutive elements of the value vector. E.g. #:var-0075 is bound to [#:vals-0074 0], the first value.
The body of the let is a transformed version of the original template, in which the interpolated expressions are replaced by gensyms, which reference the bindings that index into the vector.
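The same capture-now, merge-later mechanics can be sketched in Python (an illustrative analogue, not the TXR implementation; the TemplateObj class is invented for this sketch):

```python
class TemplateObj:
    """Analogue of the Lisp `template` struct: fixed strings plus a
    mutable vector of captured values, merged on demand."""
    def __init__(self, strings, vals):
        self.strings = list(strings)   # static segments ("informational")
        self.vals = list(vals)         # captured values, rewritable

    def merge(self):
        # Interleave segments with current values, like the generated lambda.
        out = []
        for i, val in enumerate(self.vals):
            out.append(self.strings[i])
            out.append(str(val))
        out.extend(self.strings[len(self.vals):])
        return "".join(out)

# Analogue of (te `Hello @user, @greeting`): values captured eagerly...
user, greeting = "Bob", "how are you?"
t = TemplateObj(["Hello ", ", "], [user, greeting])
print(t.merge())       # Hello Bob, how are you?
# ...but rewritable before the merge, as with Bob -> Alice above.
t.vals[0] = "Alice"
print(t.merge())       # Hello Alice, how are you?
```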
The complete implementation in template.tl (referenced by (load "template") in command line 4) is:
(defstruct template ()
merge
strings
vals)
(defun compile-template (quasi)
(match (@(eq 'sys:quasi) . @args) quasi
(let ((gensyms (build-list))
(exprs (build-list))
(strings (build-list))
(xquasi (build-list '(sys:quasi)))
(self (gensym "self-"))
(vals (gensym "vals-")))
(while-true-match-case (pop args)
((@(eq 'sys:var) @(bindable @sym))
exprs.(add sym)
(let ((g (gensym "var-")))
gensyms.(add g)
xquasi.(add g)))
((@(eq 'sys:expr) @expr)
exprs.(add expr)
(let ((g (gensym "expr-")))
gensyms.(add g)
xquasi.(add g)))
(@(stringp @str)
strings.(add str)
xquasi.(add str))
(@else (compile-error quasi
"invalid expression in template: ~s" else)))
^(new template
merge (lambda (,self)
(let* ((,vals (qref ,self vals))
,*[map (ret ^(,@1 [,vals ,@2])) gensyms.(get) 0])
,xquasi.(get)))
strings ',(vec-list strings.(get))
vals (vec ,*exprs.(get))))))
(defmacro te (quasi)
(compile-template quasi))
That Lisp Curse document, though off the mark in general, was right in its observation that social problems in languages like Python are just technical problems in Lisp (and often minor ones).
In Python you have to wait for some new PEP to be approved in order to get something that is like f-strings but gives you an object which intercepts the interpolation. Several proposals are tendered and then one is picked, etc. People waste their time producing rejected proposals, and time on all the bureaucracy in general.
In Lisp land, oh we have basic template strings already, let's make template objects in 15 minutes. Nobody else has to approve it or like it. It will backport into older versions of the language easily.
P.S.
I was going to have the template object carry a hash of those values that are produced by variables; while coding this, I forgot. If we know that an interpolation is @greeting, we'd like to be able to access something using the greeting symbol as a key.
(I don't see any of this as useful, so I don't plan on doing anything more to it. It has no place in Lisp because, for instance, we would not take anything resembling this approach for HTML generation, or anything else.)