Also, many statically-typed languages do not take advantage of the flexibility a good type-system can provide. For example, C, C++, Java.
Type systems like Haskell's add a lot of value, but you have to really know the language to make good use of them. I mean, just IO handling requires planning ahead. Small dynamic languages allow one to keep simple things simple. There's value in that too.
The distressing example that nags at me here is how the headline feature of Apache Spark 2.0 is that they Greenspunned a dynamic typing mechanism into it. And, in doing so, realized improvements in ergonomics, performance, and memory consumption. Understanding why and how is an instructive lesson.
The most promising typing system for arbitrary data wrangling that I've seen so far is the one that's built into a Python package called Dagster. It seems reminiscent, as best I can tell, of dependent typing. (Although I haven't actually spent any time with a dependently typed language, so that could be a misunderstanding on my part.) The actual checks are done at run time, of course, but I don't see any particular reason, aside perhaps from the complexity involved, why a similar mechanism couldn't be implemented as a static type check.
I’m speculating here, but as a software engineer who’s dealt with many different languages, I (personally) find that types increase my productivity, because they reduce errors at runtime. This is a huge benefit, and because I’ve spent a lot of time learning the target language’s type system, it doesn’t come at the cost of productivity. Like I said, I find it improves my downstream productivity.
I’ve read that scientists have had significant mistakes in their findings in papers due to bugs, which I know is anecdotal, but it seems like adopting typed languages would reduce those errors and increase confidence in studies.
I wonder whether, if they were able to recognize the downstream benefits of types, they would still shrug them off as irrelevant to their work.
However, it's possible to have a more lenient static type checker that only looks for cases where you're guaranteed to get an illegal operation at runtime. They answer the question "is this definitely wrong?"
For instance, if we have:
    def f(x):
        g(x)
        h(x)

    def g(x):
        x.a()

    def h(x):
        x.b()

If we're doing whole program analysis and there's no type that has both a() and b() methods, then f is guaranteed to call a non-existing method at one place or the other. This sort of "this is clearly wrong" type checking is much less intrusive than the more common "I'm not positive this is correct" type checking.
If you're not doing whole-program analysis, then you may restrict the type search to the types imported into the transitive closure of the module and its dependencies. This makes type checking slightly more intrusive by sometimes forcing more module imports, but it's still much less intrusive than common type checkers.
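A toy sketch of that "is this definitely wrong?" question, in Python. The tables `KNOWN_TYPES` and `CALLS` and the function `definitely_wrong` are made up for illustration; a real checker would derive them from the program rather than have them written by hand.

```python
from typing import Dict, FrozenSet, Set

# Hypothetical input: the methods each known class defines.
KNOWN_TYPES: Dict[str, FrozenSet[str]] = {
    "A": frozenset({"a"}),
    "B": frozenset({"b"}),
}

# Hypothetical input: the methods each function needs on its parameter,
# including what its callees need (f calls g(x) and h(x)).
CALLS: Dict[str, Set[str]] = {
    "g": {"a"},
    "h": {"b"},
    "f": {"a", "b"},
}


def definitely_wrong(function_name: str) -> bool:
    """Return True only if no known type offers every required method."""
    required = CALLS[function_name]
    return not any(required <= methods for methods in KNOWN_TYPES.values())


print(definitely_wrong("f"))  # no single type has both a() and b()
print(definitely_wrong("g"))  # type A satisfies g
```

Restricting `KNOWN_TYPES` to the types visible from a module's imports would correspond to the non-whole-program variant described above.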
Which in at least some cases means ignoring bugs. Pretty much every system I've worked on that used a language that didn't do type checking (or was too forgiving about it) has had runtime bugs because of type incompatibility or type coercion issues.
With type checkers for dynamic languages, a lot of the arguments in favor of static languages disappear.
Python’s type annotations aren’t checked at runtime, though - they’re also entirely static. So you either have to:
- Write code that’s as pedantically correct as in a traditional statically-typed language (e.g. by using `mypy --strict`), in which case you’d be better off using a language that supports AOT-compilation, or
- Be prepared for cases where the runtime type doesn’t match the annotation. This is unacceptable - I hate the idea of code that tells lies.
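To make the second case concrete, here is a minimal sketch (the function `double` is invented for the example). The interpreter stores the annotation but does not enforce it when the function is called, so the mismatch only shows up if a static checker is run.

```python
def double(x: int) -> int:
    # The annotation says int, but nothing checks it at call time.
    return x * 2


# A static checker would flag this call; the interpreter runs it happily.
result = double("ab")
print(result)  # prints "abab"
```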
Nonsense. It's very easy to use a runtime cast if and when you want to, even in Haskell. You're in control of exactly where typechecking doesn't happen.
In contrast, in a dynamic language with optional typechecking you have basically no guarantees. Even if 99% of your codebase is typechecked, you have no idea what errors might be lurking in the other 1% or how far they might propagate, since a type error can show up essentially arbitrarily far away from its actual cause. It's the worst of both worlds - you pay all the costs of a static type system but get hardly any of the benefits.
I'd say that the pool of python developers has been swamped by a tsunami of former java/c# developers who don't understand the first thing about what made python good and are demanding they get everything they were used to in their old languages in python.
It's a rather sad state of affairs, but it's been great for cashing out on a language I learned in my off time at highschool in the 00s, especially since I have my name as a contributor before I turned 18.
It can be nice to use an optional type/spec system to help with that but not required
This type of quick'n'dirty throw-away coding would be a lot less convenient if Python forced a strong static type system on me.
If Python is used for "real projects" with more than a few thousand lines of code, worked on by a team, static typing totally makes sense. But there are a lot of everyday tasks where this just gets in the way.
But the optional Python annotations? I was skeptical, but I began to see the value in it, especially in interfaces/APIs for external consumption, or more complicated areas.
But yeah don't ask me to type hint a "private" function that's only a couple of lines long and used in specific cases
As a trainer, experience teaches me that it's also much easier for beginners.
Now, types are nice to have for bigger code bases, but that doesn't mean there is no place for a language with optional typing.
It cuts both ways. Some days type advocates recognize why their type language is not their host language, i.e., there is such a thing as too many types.
If you believe types can express anything, and moreover that they are the best at expressing it, then why do you need the host language? Why not program using just types?
There is no silver bullet. Types won't make your code bug free.
What programming language do you know that uses types to express most tests?
Example: Everyone in your system has a role. There are 3 kinds of roles: User, Moderator, Admin
They have different attributes. All of them have a name, but only moderators and admins have dedicated areas of responsibility. And only moderators have a subset of at least one allowed permission (e.g. "edit posts" and/or "delete posts").
You can't really denote this in a way that the Java compiler understands at the use site. E.g. an action is executed and you want to see if it is allowed. So you want to check the kind of role and whether the attributes/permissions match the action. The compiler doesn't support this well at all.
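For comparison, here is one way the same roles might be sketched in typed Python. Everything here is an assumption made for illustration: the class layout, the `is_allowed` helper, and the policy that an admin may do anything inside their areas. Note that "at least one permission" is still enforced by a runtime check, not by the types.

```python
from dataclasses import dataclass
from typing import FrozenSet, Union


@dataclass(frozen=True)
class User:
    name: str


@dataclass(frozen=True)
class Moderator:
    name: str
    areas: FrozenSet[str]
    permissions: FrozenSet[str]

    def __post_init__(self) -> None:
        # Runtime check: the type alone cannot say "non-empty".
        if not self.permissions:
            raise ValueError("a moderator needs at least one permission")


@dataclass(frozen=True)
class Admin:
    name: str
    areas: FrozenSet[str]


Role = Union[User, Moderator, Admin]


def is_allowed(role: Role, action: str, area: str) -> bool:
    if isinstance(role, Admin):
        # Assumed policy: admins may do anything within their areas.
        return area in role.areas
    if isinstance(role, Moderator):
        # A checker narrows role to Moderator here, so .permissions is known.
        return area in role.areas and action in role.permissions
    return False  # plain users have no areas or permissions
```

A checker that understands `isinstance` narrowing would reject `role.permissions` in the `Admin` branch, which is roughly the use-site support being asked for.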
But (for new projects anyway) what's the point of using Python with types instead of using a real typed language like Java or C#?
Aside from that, an advantage is that the type system is more expressive than that of Java or C#. Most common Python idioms work just the same way with or without type annotations, so you can continue to write functions that take either int or str arguments (for better or worse) and the typechecker will understand your use of isinstance and make sure everything checks out. (But there are other fully statically typed languages that are also more expressive than Java or C#.)
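A small sketch of the int-or-str idiom mentioned above (the function `describe` is invented for the example). The point is that the `isinstance` test is ordinary Python, and a checker uses it to narrow the type in each branch.

```python
from typing import Union


def describe(value: Union[int, str]) -> str:
    if isinstance(value, int):
        # In this branch a checker treats value as int.
        return f"number {value + 1}"
    # Here only str remains, so .upper() is accepted.
    return f"text {value.upper()}"


print(describe(1))    # prints "number 2"
print(describe("a"))  # prints "text A"
```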
You can also continue to do weird metaprogramming and monkeypatching, and though the typechecker is not always able to make sense of it you can often wrap it in a safe interface so you can still get assurances for the rest of the project.
I don't see the hype here, just use a statically typed language if it's an important consideration for the project.
I find typed python more pleasant and expressive than similar Java, for a variety of reasons, among them, I find python's type system to be superior to Java's (and getting even better faster!)
Other than that, I actually don't know... a lot of other languages now have interpreted-dev envs...
(btw, I'd rather use Kotlin than java --- it's like Java++)
One can use any bindings to those libraries, including using modern versions of Fortran and C++ directly.
All of the other replies are subjective.
Interpreted languages are quick to develop/quick to learn, and with the support of typing that same "quick prototype" can easily mature into a full application without needing to be re-written in a "real" language.
    python task.py
vs
    javac Task.java
    java -cp . Task
And if you're in an IDE (in fairness, they're not popular for quick prototypes), you're just clicking the play button either way.
There are also toolchain options for running the code interpreted, JIT compiled, AOT compiled, or a mix of the previous options.
Languages are orthogonal to the toolchains.
I haven’t worked on large code bases, only run small ML analyses on tabular data. Can someone be concrete with the advantages and use cases?
Realizing when I'm writing that the expression I'm about to type is wrong because the types do not match still depends on my doing type checking in my head.
It seems to me that the case where typed wins is when my in-head type checker fails and I write something that is invalid. The typed language will catch that failure at compile time. The untyped won't catch it until runtime (and maybe not even then).
For the typed language, an IDE might be able to catch a mistake immediately so I don't have to wait until compilation to find it, and that immediate feedback might prevent me from writing dozens more lines with the same mistake, but even then I was using my in-head type checker to write that first mistake for the IDE to catch.
Yeah you can go in circles for an eternity about which is better and which is worse. I try to stay away from such arguments because they aren't productive.
I do now argue though that as an engineering team and code base gets bigger, you will need static typing to be able to continue to grow at a sustainable pace. Static typing provides a very strong safety net letting people make good, safe changes in code they may not be intimately familiar with.
If you have a small code base and/or small team, then static typing becomes much more of a personal preference.
I can leverage the compiler to do 90% of the refactoring work.
For example, say you want to remove a field from an object in C++. The compiler will tell you every place where the field is accessed and raise a compiler error. In python this won't manifest until the code is run in that particular spot. I can only imagine that in this type of language a well intentioned refactor could easily create 10 bugs that don't get triggered for a very long time.
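A minimal sketch of that refactoring hazard in Python. The `Order` class and `final_price` function are hypothetical; the removed `discount` field is left as a comment to show what changed.

```python
from dataclasses import dataclass


@dataclass
class Order:
    total: float
    # discount: float  <- imagine this field was just removed


def final_price(order: Order, apply_discount: bool) -> float:
    if apply_discount:
        # Stale access: this fails only when the branch actually runs.
        return order.total - order.discount
    return order.total


print(final_price(Order(10.0), apply_discount=False))  # prints 10.0
```

Running a static checker over this file would be expected to report the `order.discount` access without executing anything, which is the closest Python equivalent of the compiler error described above.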
While I endorse the other answers, I do want to highlight how the issues only appear at scale. I have no problem using a dynamically typed language for up to the low hundreds of lines. But that's about where I start to get nervous I've picked the wrong language. (Or, putting it a different way, about day 3 of working on the same code base continuously.)
When the whole program fits on the screen, essentially, it's no big deal to not have types.
But as the program grows, the problems emerge.
I think I have a minority view on what the problem that emerges is, though. I think the first problem a dynamically-typed code base usually encounters is that there is some function that accepts an object of some kind, and you discover that it actually needs to accept a list of that object sometimes instead. In a statically-typed language, you change the type from "MyObject" to "MyObject[]" and immediately change all the call sites. Possibly you even discover the change reverberates back up the program design, and you push it up higher, with the compiler helping all the way.
With dynamic languages, you tend instead to do something like:
    def myOldFunction(obj):
        if not isinstance(obj, list):
            obj = [obj]
        # function continues

Well, that's if you're lucky. It seems to be more popular to instead decorate the entire function with if statements every time "obj" is used, but let's take this instead.
Now you've taken your first step down a dark road, where you now have a function that accepts an object, or maybe an array of those objects. Then you decide to treat None/nil/whatever as an empty list. Then you realize that sometimes the return value also needs to be a list of whatever it used to return, but you have to add a parameter to the call now to specify you want a list in the return value so you don't break all the old callers.
You inevitably head down a road where the function is filled to the brim with entangled concerns from a lot of other code. Then you start getting lots of these functions in a codebase together, and the codebase can never again be refactored because it'll break everything. (Or, rather, it can be, but only in very constrained ways.)
By contrast, I prefer even to prototype in static languages now, because when I make a mistake in the signature a function should have, in just a minute or two, I can fix it, and it's gone like the mistake was never there because the compiler made me fix it (and, fortunately, helped). It is much easier to maintain a discipline where every function isn't deeply entangled with another when you're not carrying along the entire history of the function's input and output parameters forever.
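A rough sketch of what the "just change the signature" route looks like with Python annotations. `Item` and `total_weight` are invented names; the assumption is that the parameter used to be a single `Item` and was changed to a list.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Item:
    weight: float


# Signature after the change; it used to be `item: Item`.
def total_weight(items: List[Item]) -> float:
    return sum(item.weight for item in items)


# An updated call site passes a list.
updated_call = total_weight([Item(1.5), Item(2.0)])
print(updated_call)  # prints 3.5
```

A stale call such as `total_weight(Item(1.0))` would be reported by a static checker at every call site, whereas without one it only surfaces as a `TypeError` when that particular call runs.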
As you scale up, other problems emerge too, like the difficulty of using static analysis tools and the way documentation and unit tests have to carry a lot more water because the code is so much harder to get a grasp on... but whenever I'm starting a new code base, or even just a new module in an existing code base, it is always the above problem that is the first one I hit in a dynamic language and the first benefit of a static language I notice. A lot of the other scaling issues take months or even years to develop, but this is the one I notice in days, or in the worst cases even mere hours. I sometimes wonder how much of the "prototype one to throw away" comes from people using dynamic languages; usually with not much extra effort, my statically-typed "prototype" comes out production quality by the time I'm done working with it this way. The end result may not much resemble what I first sketched out, but I got there with a clear set of easy steps, and usually I don't even take that large a step backwards since I pair everything with unit tests and between the static types and tests I'm usually always moving forwards even as I make rather substantial changes in the codebase that I wouldn't dream of doing in dynamic languages because I know from experience that it's much harder to avoid breaking things and taking huge steps backwards even as I move some part incrementally forward.
- Library support for static types is not very good. This can be fixed, of course, but it's also very hard to fix in a concerted way. It'll just depend on the community getting on board.
- The syntax is limited. There isn't proper support for declaring generics, you have to declare a separate TypeVar, as a Python variable, somewhere else in scope and it just... gets used to approximate a generic. It mostly works, but sometimes it doesn't, and it's very unintuitive and awkward. And then concepts like Callable, Union, TypedDict, and Optional don't have dedicated syntax for readability; they're generic types that you have to import and parameterize. Etc.
- Support isn't great for highly "dynamic" data. TypeScript gives you powerful features for reasoning about dynamic property-sets of objects (dicts), combining and separating them, duck-typing, doing really complex inference, etc. These features in Python are usually some combination of unreliable, third-party, syntactically awkward, and so on.
- Inconsistency between different type-checkers. You'd think the fact that Python has standardized type syntax would help with consistency, but what it actually means is that everyone gets to define their own semantics for the same syntax. Different checkers mostly orbit around the same semantics, but there are always gaps. So for example, MyPy does a pretty good job of being strict and smart, but it's really slow. So you'll end up using an IDE-optimized checker for development, like Pyright, but Pyright will allow some things that MyPy doesn't and not allow some things that MyPy does. So you can use your IDE to get most of the way there, but you always have to remember to run a "real" type-check before you commit, or you may break the build in CI.
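To illustrate the syntax point in the list above, here is a small sketch of a generic function written the `TypeVar` way. The function `find_first` is invented for the example; `TypeVar`, `Callable`, `Optional`, and `Sequence` are the standard `typing` names.

```python
from typing import Callable, Optional, Sequence, TypeVar

# The type parameter is declared as an ordinary module-level variable.
T = TypeVar("T")


def find_first(items: Sequence[T], predicate: Callable[[T], bool]) -> Optional[T]:
    """Return the first item matching predicate, or None."""
    for item in items:
        if predicate(item):
            return item
    return None


print(find_first([1, 2, 3], lambda n: n > 1))  # prints 2
```

Note how `Callable` and `Optional` are imported and parameterized like any other generic type, which is the readability complaint being made.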
I should point out the one big advantage that Python has here: unlike TypeScript you don't need a build step, because Python interpreters can parse (and throw away) the type annotations. That's pretty nice, especially for gradual adoption/casual typing of scripts.
All of the problems (except maybe the syntax) are solvable, and I genuinely hope they get solved. For now, if you stick to primitives and core or class-based data structures you'll have a great experience with Python types. If you do anything more complex, the results will be mixed. This is of course much better than nothing, but it could be a lot better still. If you're picking between typed Python and TypeScript for a new project, it's worth factoring in.
Union is getting dedicated syntax in 3.10: https://www.python.org/dev/peps/pep-0604/
Optional could follow: https://www.python.org/dev/peps/pep-0645/
I'll add a couple more frustrating limitations to Python's typing:
1. You can define a function type somewhat clumsily (`Callable[[Arg1T, Arg2T], ReturnT]`), but if your callback uses keyword arguments (pervasive among Python programs), you're out of luck.
2. You can't define recursive types like JSON. E.g., `JSON = TypeVar("JSON", Union[str, int, None, bool, List[JSON], Dict[str, JSON]])`.
3. Getting mypy to accept third party definitions sometimes works perfectly and other times it doesn't work at all. You get a link to some troubleshooting tips that have never actually worked for me.
Beyond that, it's just the general usability issues that ultimately derive from Python's election to shoehorn a lot of typing functionality into minimal syntax changes (as opposed to TypeScript which can make whichever syntax changes it likes because it isn't trying to be valid JavaScript).
I think the idea was that they didn't want to introduce build system complexity by way of a compiler, which is an easy choice to criticize in hindsight but I might've made the same call. I haven't used TypeScript in anger so I can't say for certain, but the TypeScript grass certainly looks pretty green from the Python side. Moreover, on the Python side we aren't even absolved from build-time problems since we still have to fight tooth and nail to get Mypy to accept third party type annotations.
Mypy is a valiant effort, but like everything in the Python ecosystem, it's a problem made difficult by an accumulated legacy of unfortunate decisions and there just isn't enough investment to move it forward. I've since moved on to Go for everything I used to use Python for, and I haven't honestly looked back--Go solves all of my biggest Python pain points: performance, tooling [especially package management], and dealing with the low-quality code that you tend to get when your colleagues don't have a type system keeping them on the rails. Go also doesn't impose many pain points of its own (generics, but at a certain point in your career you realize that "good code" is not "maximally abstract code" nor "maximally DRY code", and the actual valid use cases for generics are fewer and farther between). TypeScript piques my interest, but it seems a bit too complex and configurable and in particular I'm not looking forward to figuring out how to wire together a JavaScript build system. Maybe I'll try Deno if I hear enough positive feedback about it, but for now it's hard to beat `go build`. Rust seems cool but the borrow checker tax is too steep for my blood.
I've personally never missed the ability to add keyword arguments to a Callable. If you want to anyway, I understand that at least mypy has syntax for this:
* https://mypy.readthedocs.io/en/stable/protocols.html#callbac...
* https://mypy.readthedocs.io/en/stable/additional_features.ht...
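A minimal sketch of the callback-protocol approach, assuming Python 3.8+ where `typing.Protocol` is available. The names `Formatter`, `render`, and `fixed` are invented for the example.

```python
from typing import Protocol


class Formatter(Protocol):
    # Describes a callable that takes a keyword-only `precision` argument,
    # which plain Callable[[...], ...] cannot express.
    def __call__(self, value: float, *, precision: int = ...) -> str: ...


def render(value: float, formatter: Formatter) -> str:
    return formatter(value, precision=2)


def fixed(value: float, *, precision: int = 2) -> str:
    return f"{value:.{precision}f}"


print(render(3.14159, fixed))  # prints "3.14"
```

It is more verbose than a `Callable[...]` alias, so it probably only pays off for callbacks that really do rely on keyword arguments.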
> Beyond that, it's just the general usability issues that ultimately derive from Python's election to shoehorn a lot of typing functionality into minimal syntax changes
Many syntax niceties will be introduced in upcoming Python releases. From the article:
* Union types are shortened to X | Y (PEP 604)
* (?) Optional types shortened to X? (PEP 645)
* Type Hinting Generics In Standard Collections (PEP 585) - Can use list[T], dict[K, V], etc in place of List[T] and Dict[K, V].
With those changes, you won't generally need to import the `typing` module at all anymore.
> there just isn't enough investment to move it forward.
Dropbox has dedicated engineers maintaining mypy. And of course a few volunteers such as myself. And Guido. :)
Deno is currently trying to bridge that gap - putting TS directly in your interpreter - but right now it comes with some caveats around being a separate runtime with incompatible system APIs, unfortunately. We'll see if it takes off enough that that becomes less of an issue.
Aside:
> it's a problem made difficult by an accumulated legacy of unfortunate decisions and there just isn't enough investment to move it forward
This is funny to me because JS has an even bigger accumulated legacy of unfortunate decisions, it's just gotten an obscene amount of investment to move it forward despite all odds ;)
1. a -> b :: Function(a)[b]
(a, k1=t1, k2=t2) -> b :: Function(a, k1=t1, k2=t2)[b]
2. JSON = RecTypeVar('JSON', lambda Self: Union[None, bool, int, float, str, List[Self], Dict[str, Self]])
huh!
Granted. I do think it's improving over time. For example the django-stubs project is a really nice addition to the regular Django distribution: https://pypi.org/project/django-stubs/
Know that the signatures vary with version, so select the correct one.
To check these annotations, you'll need a third-party type checker somewhere in your build process. I use pylance with VS Code, as it can detect errors as I type: https://marketplace.visualstudio.com/items?itemName=ms-pytho...
Then just open a small script you know well, let's say less than 200 lines, and check what "missing" type annotations mypy complains about.
For most cases, the IDE will be able to tell you what it recommends, and from then you can start reading the docs for specific types for a deeper dive.
pylance is much stricter, but it seems to have a lot of false positives
from what I've used, the implementation varies wildly...
I would say get good enough with haskell to write a 2 human player tic-tac-toe game by following some online tutorials. Then you'll be able to pick up types in python like it was nothing.
Otherwise if you have used a type checked language... Python types are really straight forward. I never needed a tutorial. Whenever I needed a feature (such as type variables) I would look it up via google.
Anyway, I don't actually know if you've ever played with a type checked language but I'm assuming you haven't because python type checking is pretty easy to pick up without a tutorial based resource if you have had prior experience in other languages.
Type checking mainly starts pulling its weight in large codebases, or otherwise in long-lived and large programs maintained by rotating groups of people over long periods.
1. no multiline lambda function? - breaks flow of writing fp code
2. default-arguments are 'shared' by default? - have to 'break' this sharing by assigning 'None'... - very unintuitive / I don't know any language that does this...
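For point 2, a minimal sketch of the shared-default behavior and the usual `None` workaround. The function names are invented for the example.

```python
def add_shared(item, bucket=[]):
    # The default list is created once, when the function is defined,
    # and reused by every call that omits `bucket`.
    bucket.append(item)
    return bucket


def add_fresh(item, bucket=None):
    # Using None as the default and creating the list inside the call
    # breaks the sharing.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket


first = add_shared(1)
second = add_shared(2)
print(second)        # prints [1, 2]
print(add_fresh(2))  # prints [2]
```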
Anyway, not everything in python is 'pragmatic', and adding types is one of the first steps in going in the right direction.
sort(items, key=lambda x: x.size)
but really anything much bigger should have a name stamped on it dammit. I find JS largely unintelligible due to lambda overuse and nesting.
    (lambda x:
        (x+x)
        /x)(n)
Will work, though in Python I usually just create an outside function and pass that in.
But yes, not everything in Python is nice for working with fp code. Using tuples, coming from Clojure, is painful. Things like raising exceptions for `StopIteration` are annoying. Mutability of lists outside of a lexical scope without using deepcopy. Though functools and itertools make it more bearable.
But I also don't think types is the answer. I think something like Racket's contracts or Clojure's spec is closer to what I'd prefer. Gradual typing is nice though.
Because you’ve got an existing code base.
Because you want the benefits of a dynamic language with some of the benefits of type checking.
Because the editor experience is way better when you selectively sprinkle in some types.
Because you can choose how deep you go on types depending on the impact.
We have avoided countless production issues just by annotating when something is Optional or not.
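A small sketch of the kind of issue an `Optional` annotation surfaces. The lookup table and both functions are made up for illustration.

```python
from typing import Dict, Optional

EMAILS: Dict[str, str] = {"ada": "ada@example.com"}


def find_email(user: str) -> Optional[str]:
    # The annotation admits that the lookup can come back empty.
    return EMAILS.get(user)


def email_domain(user: str) -> str:
    email = find_email(user)
    if email is None:
        # Without this check, a checker would flag email.split below.
        return "unknown"
    return email.split("@")[1]


print(email_domain("ada"))  # prints "example.com"
print(email_domain("bob"))  # prints "unknown"
```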
Because stating the type of something important can expose design issues (e.g. "wait a minute, it isn't always a X! in case Y it can be a Z! Forbid or support?") before they become a serious problem.
You don't have to use it if you don't want to.
As a side note, for a related project linked to typing, like TypedDict, I found that Pydantic is pretty amazing.
For me this is true, I've come back to Python because of it for personal projects.
I'm still trying to cargo cult myself into thinking that Python async makes any sense however. Why not just go with a genserver abstraction and expose the primitives over this mess we have?
Not that I know of. I used "renaissance" to connote an increased positive focus, invigoration, and activity.
maybe something very opinionated about how you're supposed to use the features, sort of a "Hitchhiker's Guide to Type Checking".
if I Google around I find either documentation or fairly low quality blog posts.
Docs and the relevant PEP are your best bet for now.
However a unit test may simply inform you that you have something wrong with your program, whereas a type checker will (more often than not) show you exactly where the error is, and usually as you were typing it.
Reality is that unit tests and type checking both have their place even if there is overlap between the two. Type checkers can however remove the need for trivial and repetitive unit tests.
For all the time a developer spends not writing out types, it then has to be made up in manually written unit tests.
Write a bunch of new code.
Repeat 4-6 times, in rapid succession:
    Run program to manually test.
    Find basic error (like a missing import).
    Fix basic error.
Debug/fix deeper errors in the new code
With this, just about anything will help, including static typing.
The problem is the following line
    Run program to manually test.
You should avoid manual testing at (almost) all cost. Write an automated test instead. It is effort you are expending anyway, why throw it away? And while you're there, write the test beforehand, to clarify the specification of the code you're trying to write.
With such a TDD cycle, those problems wouldn't have happened.
Repeat
    Write a test
    Make it pass
    [Commit]
    Refactor
    Commit
Mind you, static typing can still help in this scenario, but not in getting the code to work and helping it remain functional. It helps as verified documentation that eases understanding when reading the code.
"Type" systems help with representation errors, but those tend to be easy to find without static typing.
One problem with static typing is that I'm forced to address all of the representation issues before I can do any testing. That "premature optimization" is hell on incremental development.
Another problem is that static typing tends to encourage languages which are designed around making type checking easy. That impacts program design.
I really hope to start seeing some immutable architecture and structuring in Python more than type checking, as my first impression.