The main difference is that pytype is a static analyzer (i.e. it inspects the code and tries to figure out what types various things are), whereas PyAnnotate is a profiler hook, so you have to run your code and it observes types as your code runs.
Both have their pros and cons. Static analysis would (in my personal opinion) be ideal because you don't have to run your code, and in theory it can be much more complete, but it's also much harder (often impossible) in Python. The runtime analysis of PyAnnotate has real downsides: it can't give you types for code it never observed running, and it can't know whether it has seen all the possible types for a parameter or return value. The upside is that it was quick to implement something useful, and it gets you quickly to pretty decent type annotations for your main code paths. Which is nice, because in a large untyped codebase it effectively lays down a rough draft of type annotations, making it a lot easier to fix up and fill in edge cases by hand.
Pytype is similar to mypy in that it can do type checking against proper annotations. But in addition to using annotations, pytype can also do inference based on static analysis.
I don't have much experience with mypy, but the last time I used it, it couldn't infer from `return x == y` that the function returns a bool. Pytype can correctly infer many simple forms of argument and return types, and even some more complex ones.
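As a minimal sketch of the kind of function I mean (the function and its name are mine, not from either tool):

```python
def equal(x, y):
    # No annotations anywhere. A static inferencer like pytype can
    # still conclude this returns bool, because == evaluates to a
    # bool here; at least when I tried it, mypy inferred nothing
    # useful for an unannotated function like this.
    return x == y
```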
From reading the project, PyAnnotate relies entirely on runtime profiling info to _help_ you get to a first round of annotations. We have a similar project that gathers types at runtime and helps people annotate their code. Type information gathered this way has its limitations (the PyAnnotate project calls this out as well: you should only use it on legacy code, not on newly written code).
To give an example: if PyAnnotate observes the function below accepting a list of ints and returning an int, it may conclude that the type of this function is `Callable[[List[int]], int]`:
```
def foo(xs):
    ret = 0
    for x in xs:
        ret += x
    return ret
```
But it actually works on any iterable (because of the for-in loop), and the items in `xs` can be any number (because of the `__iadd__` call on the integer 0). With static analysis, the correct inferred type might be `Callable[[Iterable[Union[int, float]]], Union[int, float]]`.
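Concretely, the hand-written annotations capturing that wider contract might look like this (a sketch, not output from any tool):

```python
from typing import Iterable, Union

Number = Union[int, float]

def foo(xs: Iterable[Number]) -> Number:
    # Works on any iterable of numbers, not just List[int]:
    # the for loop only needs __iter__, and += only needs numbers.
    ret: Number = 0
    for x in xs:
        ret += x
    return ret
```

With this signature, callers can pass a list, a tuple, or a generator, which the runtime-observed `List[int]` annotation would have rejected.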
I just tried pytype and it basically did nothing but spit out some errors about imports not being found. I didn't have time to investigate further, and the documentation seems to be almost non-existent. Surprising for a project with 1780 commits.
I think the next generation of successful languages will all be statically typed (whether they will run natively or in a virtual machine is a different (even if related) question).
Paradigms are getting mixed too. Rust, Kotlin and Swift are all imperative languages with heavy functional inspiration.
Traditional statically typed OOP languages such as Java are what people want to get away from.
I'm not sure why you started that sentence with "No". I actually agree with it.
There are many good ideas that have emerged separately in different languages, and are being combined in some of the new languages.
All I'm saying is, successful upcoming languages will probably be more statically typed than dynamically typed.
In other words, the dynamic typing paradigm is failing the real world test.
> Go is like a statically typed language with lots of dynamic features.
Go is statically typed.
The part that lacks static typing (no generics) is the worst part of the language that gets the most flak.
> Traditional statically typed OOP languages such as Java is what people want to get away from.
Java's problem is that it's just way too verbose.
User user = new User(....); // This is not even that verbose
// Maybe more like this:
User user = new User(new UserProfile(....), UserManagerFactory.getDefaultUserManager());
People have misdiagnosed the problem and thought it was the static typing. It turns out to be the lack of support for functions as objects. So for many things you end up having to create dummy classes and objects just to wrap functions.
Even C supported passing function pointers around.
So in this regard, Java is less expressive than C.
This allows for fast prototyping and, when done correctly, makes it easy to add type safety later. For example, you can prototype the code, make sure it works, add more tests, then add type checking while cleaning it up and documenting it. That would be my ideal workflow.
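A toy sketch of that workflow (my own example, assuming a checker like mypy or pytype runs over the final version):

```python
from typing import Dict

# Step 1 (prototype): this started life as just
#     def word_counts(text): ...
# with no annotations, so it could be iterated on quickly.
# Step 2 (harden): once it worked and had tests, annotations were
# added so a type checker can verify the function and its callers.
def word_counts(text: str) -> Dict[str, int]:
    counts: Dict[str, int] = {}
    for word in text.split():
        counts[word] = counts.get(word, 0) + 1
    return counts
```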
Anecdotally, I've developed large projects in C++ and Java (I know, they're pretty lame static type systems -- but certainly the most popular ones) and also in Python and Clojure, and I really haven't seen much benefit from static typing when it comes to software defect rate or quality. Static typing makes auto-complete and refactoring tools easier, for sure, but it also slows down experimentation (and writing generic code can be painful, although other static languages, especially type-inferred ones, fare better here). I buy into Rich Hickey's view on this topic[1], and that's one reason why I like Clojure: it gets out of the way, but it provides me with the tools I need to verify or validate my data (e.g. on module or application boundaries).
I've played around with languages that have fancier type systems (Haskell, various ML's, briefly ATS) and am very interested in Rust (but have yet to use it), but they haven't really provided enough benefits for the effort of describing the types.
Note that I used to be very heavily in the static typing camp and I still very much like the idea of static typing, I just don't think we have found a static type system yet that has the right balance of convenience and safety and actually catches the right kinds of errors (as described in the below talk).
I guess my point is that it's not quite clear that the next generation of successful languages will all be statically typed. In fact, current trends would suggest otherwise (most of the popular languages are dynamically typed), although perhaps that depends on your definition of "successful".
"Please don't be an uninformed Rich Hickey talk"
<Clicks>
"Oh, it's an uninformed Rich Hickey talk"
Hold it right there. I've never seen anyone argue that static type systems prevent bugs.
I mean they do prevent silly bugs that occur from mistyping variable/property names but I've never seen anyone claim that they eliminate other classes of bugs.
The biggest benefit of static type checking is you know what all the variables are.
```
def checkout_cart(customer, payment_methods, cart, session):
    # body
```
What the hell is customer? What is payment_methods? What fields are available on these objects? What methods can you call on them? No freaking idea. Of course, this kind of code is confusing in Java as well, but for a different reason: Java conventions encourage an obtuse programming style where everything is hidden behind layers of factories and managers, so that even when everything is typed, you're not sure what anything is doing, because all the data that matters is private and so are all the methods that actually do anything useful. All you're left with is an abstract interface that can sometimes be just as bad as an untyped variable. But this is mostly a cultural problem. (I've digressed.)
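For contrast, even lightweight annotations answer most of those questions at a glance. The domain classes here are hypothetical, purely for illustration:

```python
from typing import List, Optional

# Hypothetical domain classes, invented for this example.
class Customer: ...
class PaymentMethod: ...
class Cart: ...
class Session: ...
class Receipt: ...

def checkout_cart(
    customer: Customer,
    payment_methods: List[PaymentMethod],
    cart: Cart,
    session: Session,
) -> Optional[Receipt]:
    # With the types declared, an editor can show what fields and
    # methods each argument supports, and what the function returns.
    ...
```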
> Static typing make auto complete and refactoring tools easier, for sure, but it also slows down ease of experimentation
Java slows down ease of experimentation because it requires tons of boilerplate code for even the simplest tasks.
It's not the static type checking.
If anything, static type checking helps experimentation because you can change your mind quickly and the compiler will help you catch all the stupid mistakes that can occur from mismatching types or mistyping variable names. This removes a huge cognitive tax and makes programming more enjoyable. Although I will concede this is subjective.
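A toy illustration of that point (hypothetical code): change a signature and the checker flags every stale call site, instead of a runtime surprise later.

```python
# Suppose area() originally took a single `size` tuple and was later
# refactored to take two floats. A static checker would immediately
# flag any old call left behind:
#     area((3.0, 4.0))   # checker error: wrong argument type/count
# whereas in a dynamic language that line would only fail if and
# when it actually executed.
def area(width: float, height: float) -> float:
    return width * height

print(area(3.0, 4.0))
```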
I get why people started rejecting statically typed languages. I work with many different languages, but I started with various forms of static typing (Pascal was my first, then C and C++). Dynamic languages offer flexibility that statically typed languages lack. But most of the time, rejecting that flexibility ends up being a good thing: it encourages sane code that the compiler can analyze to prevent runtime bugs, so I've always accepted the sacrifice of flexibility as a feature rather than a hindrance.
However, I've recently found myself writing many solutions in TypeScript[0], and I'm finding its static type system, which lets you configure the strictness, to be incredibly powerful. It's helped by fantastic support for union types, type inference and structural ("duck") typing. I find that the compiler, which is little more than a transpiler with a static analyser bolted on top, catches errors I make and greatly reduces runtime WTFs, while still letting me write terse, simple code that isn't littered with unnecessary type annotations. I add typings where either the type inference can't work out the type for me or where they reduce cognitive overhead while reading the code, and leave them out when the opposite is true. This moves the bar back a little toward the flexibility side -- I can do things that are completely illegal in C#, with less code, without increasing my runtime bugs or decreasing readability in the process. And if there's something truly hairy that has to be done, I can tell the compiler to simply ignore the violation and give me the JavaScript output I want, regardless of what it thinks is going to happen.
...and I find myself longing for that type system everywhere else. So while I am still a huge proponent of statically typed languages and I don't see that changing any time soon, precisely how the static type system works and what features it supports is becoming very important to me.
[0] Which was a huge surprise, considering how much I hate JavaScript and that TypeScript is basically ES6 JavaScript with optional type annotations.