Using @dataclass the example from OP would look like:

from dataclasses import dataclass

@dataclass
class Point3D:
    x: float
    y: float
    z: float
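To make concrete what the decorator buys you, here is a minimal stdlib-only sketch of the generated behavior:

```python
from dataclasses import dataclass

@dataclass
class Point3D:
    x: float
    y: float
    z: float

p = Point3D(1.0, 2.0, 3.0)
# __init__, __repr__ and __eq__ are all generated from the annotations:
print(p)                            # Point3D(x=1.0, y=2.0, z=3.0)
print(p == Point3D(1.0, 2.0, 3.0))  # True
```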
[1]: https://docs.python.org/3/library/dataclasses.html

This article was correct and addressed a very real need in Python programming, for the year 2016. By now it is obsolete, and today's standard library module `dataclasses` does all of that and more.
I don't know much about attrs, having only come to Python professionally since 3.7, but I'm not going to bring it in if there's something sufficient in the language.
Lots of love to Attrs, which is a great library and is a component of a lot of great software. It was my go-to library for years before Pydantic matured, but I think a lot of people have rightly started to move on to Pydantic, particularly with the popularity of FastAPI
I've done some truly amazing things with attrs because of that composability. If I'd wanted the same things with Pydantic, it would have had to be a feature request.
Yes, you can do validation in attrs, but it's not meant to be used the same way as pydantic. For serialization, you need cattrs, which is a completely different package.
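attrs runs validators at construction time via `field(validator=...)`, and cattrs handles structuring/unstructuring separately. A rough stdlib analogue of that construction-time validation, using `__post_init__` (the `User` class here is illustrative, not from the thread):

```python
from dataclasses import dataclass, asdict

@dataclass
class User:
    name: str
    age: int

    def __post_init__(self):
        # crude construction-time check; attrs would attach a
        # validator to the field instead
        if not isinstance(self.age, int) or self.age < 0:
            raise ValueError(f"invalid age: {self.age!r}")

u = User("alice", 30)
print(asdict(u))   # {'name': 'alice', 'age': 30}
```

`asdict` covers only the simplest serialization cases; this is exactly the gap cattrs (and Pydantic) fill.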
If you’re worried about the performance hit of extra crap happening at runtime… dear lord use another programming language.
Dataclasses is just… meh. Pydantic and Attrs just have so many great features, I would never use dataclasses unless someone had a gun to my head to only use the standard library. I don’t know of a single Python project that uses dataclasses where Pydantic or Attrs would do (I’m sure they exist, but I’ve never run across it).
Dataclasses honestly seems very reactionary by the Python devs: Attrs was getting so popular and used everywhere that it got a little embarrassing for Python that something so obviously needed in the language just wasn't there. Those who weren't using Attrs runtime validators often did something similar by abusing NamedTuple with type hints. There were tons of "why isn't Attrs in the stdlib" comments, which is an annoying type of comment to make, but it happens. So they added dataclasses, but having all the many features that Attrs has isn't a very standard-library-like approach, so we got… dataclasses. Like "look, it's what you wanted, right!?". Well, no, not really; thanks, we'll just keep using Attrs, and then Pydantic.
As I think I made clear in PEP 557, and every time I discuss this with anyone, dataclasses owes a lot to attrs. I think attrs made some great design decisions, in particular the decision not to use metaclasses or base classes.
Python can be used quite successfully in high-performance environments if you are judicious about how you use it; set performance budgets, measure continuously, make sure to have vectorized interfaces, and have a tool on hand, like PyO3, Cython, or mypyc (you should probably NOT be using C these days, even if "rewrite in C" is the way this advice was phrased historically) ready to push very hot loops into something with higher performance when necessary. But if you redundantly validate everything's type on every invocation at runtime, it does eventually become untenable for anything but slow batch jobs if you have any significant volume of data.
Attrs just has the features I need for now. It certainly feels a touch verbose, but I'm happy to pay the price.
For any larger program, pervasive type annotations and "compile"-time checking with mypy are a really good idea though, which somewhat lessens the need for runtime checking.
I don't expect any type-related thing to be remotely safe in Python without applying at least mypy and pylint, and potentially pyright as well; plus, as always with an interpreted language, unit tests for typing issues that would be caught by a compiler in another language.
Overall, I really don't see the appeal. It makes the already simple cases simpler (was that Point3D implementation really that bad?) and does nothing for the more complicated cases which make up the majority of object relationships.
These are useful even if only due to the "I can take the three related pieces of information I have and stick them next to each other". That is, if I have some object I'm modelling and it has more than a single attribute (a user with a name and age, or an event with a timestamp and message and optional error code), I have a nice way to model them.
Then, the important thing is that these are still classes, so you can start with:

@dataclass
class User:
    name: str
    age: int
and have that evolve over time to:

@dataclass
class User:
    name: str
    age: int
    ...
    permissions: PermissionSet

    @property
    def location(self):
        # send off an rpc, or query the database for some complex thing.
        ...

and since it's still just a class, it'll still work. It absolutely makes modelling the more complex cases easier too.

I don't see how this:

class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age
is any worse than this:

@dataclass
class Person:
    name: str
    age: int
I'm not writing an eq method or a repr method in most cases, so it just doesn't add much for the cost.

The minimal trivial case doesn't look much different, but if you stacked up 10 data classes with read-only fields vs. bare class implementations with private members plus properties to implement read-only, you would start to see a bigger lift from attrs, as there would be a bunch of boring duplicated logic.
(Or not - if your usecases are all trivial then of course don’t use the library for more complex usecases. But hopefully you can see why this gets complex in some codebases, and why some would reach for a framework.)
It’s a pretty good abstraction that doesn’t feel half as magic as it is.
> I'm not writing an eq method or a repr method in most cases, so it just doesn't add much for the cost.
That's part of the appeal. With vanilla classes, `__repr__`, `__eq__`, `__hash__` et al. are each an independent, complex choice that you have to intentionally make every time. It's a lot of cognitive overhead. If you ignore it, the class might be fit for purpose for your immediate needs, but later when debugging, inspecting logs, etc., you will frequently have to incrementally add these features to your data structures, often in a haphazard way. Quick, what are the invariants you have to verify to ensure that your `__eq__`, `__ne__`, `__gt__`, `__le__`, `__lt__`, `__ge__` and `__hash__` methods are compatible with each other? How do you verify that an object is correctly usable as a hash key? The testing burden for all of this stuff is massive if you want to do it correctly, so most libraries that try to eventually add all these methods after the fact for easier debugging and REPL usage usually end up screwing it up in a few places and having a nasty backwards compatibility mess to clean up.
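For what it's worth, the stdlib dataclasses get that consistency too: one set of decorator arguments generates all the comparison methods and the hash from the same field tuple, so they cannot disagree (a sketch, not attrs itself; the `Version` class is illustrative):

```python
from dataclasses import dataclass

@dataclass(order=True, frozen=True)
class Version:
    major: int
    minor: int

a, b = Version(1, 2), Version(1, 3)
# __eq__, __lt__, __le__, __gt__, __ge__ and __hash__ are all
# derived from the same (major, minor) tuple, so the invariants
# hold by construction.
print(a < b)                        # True
print(len({a, b, Version(1, 2)}))   # 2: frozen=True makes instances hashable
```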
With `attrs`, not only do you get this stuff "for free" in a convenient way, you also get it implemented in a way which is very consistent, which is correct by default, and which also provides an API that allows you to do things like enumerate fields on your value types, serialize them in ways that are much more reliable and predictable than e.g. Pickle, emit schemas for interoperation with other programming languages, automatically provide documentation, provide type hints for IDEs, etc.
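The field-enumeration API described above has a stdlib counterpart as well; attrs' own version (`attr.fields`) is richer, but the shape is the same (stdlib sketch with an illustrative `Event` class):

```python
from dataclasses import dataclass, fields, asdict

@dataclass
class Event:
    timestamp: float
    message: str

e = Event(12.5, "boot")
# fields() lets tooling enumerate the schema; asdict() gives a
# predictable, inspectable serialization (unlike Pickle).
print([f.name for f in fields(e)])   # ['timestamp', 'message']
print(asdict(e))                     # {'timestamp': 12.5, 'message': 'boot'}
```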
Fundamentally attrs is far less code for far more correct and useful behavior.
Until you need them for debugging.
And dataclasses make them free, at least syntactically.
Granted, I'm not reviewing my third-party dependencies line by line when I upgrade them. But also I'm more afraid of the security risks of large amounts of in-house code that aren't exposed to public scrutiny, and so a policy that dissuaded the use of even high-quality and well-regarded third-party dependencies seems like it would do more harm than good.
Besides that, it helps that I happen to have met the maintainer of attrs at PyCon (and attrs has only one uploader in PyPI), and therefore I'm less concerned about supply-chain attacks against it, whether of the malicious-maintainer variety or the maintainer-got-scammed-or-hacked variety, than, again, most of my other dependencies whose maintainers I've never heard of. I'm not sure this scales particularly well, but I do feel like there's still something in the open source community being a community.
Sticking with the "rusty, leaking batteries included!" standard library is a bad call, and I don't believe it is safe, either; most of the stdlib is abandonware that is just being shipped for backward compatibility's sake. Don't make future product decisions, design decisions, etc. based on the Python team's deprecation requirements!
I've been writing Python a long time and have grown quite frustrated by some of its warts. But every time I look at seriously investing in another language attrs is one of the few things I wouldn't want to give up. It's not perfect but I'll take very, very good when I can get it, yeah?
Their differences are highlighted in the dataclasses PEP: https://www.python.org/dev/peps/pep-0557/#why-not-just-use-n...
Comparatively named tuples are an older language feature which essentially allow you to define named accessors for tuple elements. IIRC, these days you can also define type annotations for them.
Their use cases essentially overlap. Personally I much prefer data classes.
You can even type dictionaries this way: https://docs.python.org/3/library/typing.html#typing.TypedDi...
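TypedDict gives you checked keys without changing the runtime type; the object stays a plain dict, and the annotations exist for static checkers like mypy rather than for runtime validation (a short sketch, class name illustrative):

```python
from typing import TypedDict

class UserDict(TypedDict):
    name: str
    age: int

u: UserDict = {"name": "alice", "age": 30}
# At runtime it is an ordinary dict; mypy would flag a wrong key
# or value type, but nothing is enforced here.
print(type(u) is dict)   # True
```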
Benefits: Saving at best 10-15 lines of boilerplate per data class. Much less if namedtuple works for you.
If you want to save lines in __init__ you can write "for k, v in locals().items(): setattr(self, k, v)" (skipping the "self" entry). But you shouldn't.
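To see why "you shouldn't": the trick works, but it blindly copies every local, is opaque to IDEs and type checkers, and breaks as soon as __init__ grows a temporary variable (sketch):

```python
class Person:
    def __init__(self, name, age):
        # copy every argument onto the instance, skipping self;
        # any other local defined before this loop would be copied too
        for k, v in locals().items():
            if k != "self":
                setattr(self, k, v)

p = Person("alice", 30)
print(p.name, p.age)   # alice 30
```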
Edit: Forgot to add to the most important cost: Magic. You don't need to know a lot of Python to understand how the standard self.x = x initialization works. However, you do need to understand a lot of Python internals to grok x = attr.ib().
attrs is not “relatively unknown” as Python libraries go.
> Using arcane class decorators and unusual syntactic constructs: @attr.s and x = attr.ib() (a pun?).
There have been conventional, SFW aliases for the punny ones for...a long time.
Incidentally, I'd recommend against Named Tuples for non-trivial software. Because they can be indexed by integer and unpacked like tuples, additions of new fields are backwards-incompatible with existing code.
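Concretely: any call site that tuple-unpacks or integer-indexes a NamedTuple silently depends on the field count, so appending a field breaks it (illustrative sketch; the two classes stand for one type before and after the addition):

```python
from typing import NamedTuple

class PointV1(NamedTuple):
    x: int
    y: int

class PointV2(NamedTuple):   # the "same" type after someone adds a field
    x: int
    y: int
    z: int

x, y = PointV1(1, 2)         # fine today...
try:
    x, y = PointV2(1, 2, 0)  # ...but the same unpacking now blows up
except ValueError as e:
    print(e)                 # too many values to unpack (expected 2)
```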
No more than with namedtuples (in fact, both use essentially the same magic: code generation and `eval`).
https://github.com/python/cpython/blob/3.10/Lib/dataclasses....
Kind of blew my mind
Attrs – The python library everyone needs (2016) - https://news.ycombinator.com/item?id=17160262 - May 2018 (2 comments)
Using attrs for everything in Python - https://news.ycombinator.com/item?id=12359522 - Aug 2016 (101 comments)
The One Python Library Everyone Needs - https://news.ycombinator.com/item?id=12285342 - Aug 2016 (1 comment)
[0] https://www.python.org/dev/peps/pep-0557/#why-not-just-use-a...
If you take 10 seconds to read the attrs website, they do go over the differences, and maybe discussing those would be more valuable than some cheap snark.
You can decompose classes that become too big for their own good. You can design your software, layer abstractions intelligently etc. so that having to do such refactoring isn't a big issue.
Python is a language that demands an above average level of discipline compared to many other programming languages I have used, but only because it IMO leans strongly towards empowering the developer instead of restricting them.
The article mentions quaternions. If you make a quaternion type (class), you can define addition, multiplication, comparison, etc. for it (methods). If you represent a quaternion any other way, you can't say a * b. Or maybe you can, but I don't know how.
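Here is how that looks: with a class you can overload the operator, so `a * b` just works. A minimal quaternion with a Hamilton-product `__mul__` (my own sketch, assuming the usual w + xi + yj + zk layout, not code from the article):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Quaternion:
    w: float
    x: float
    y: float
    z: float

    def __mul__(self, o: "Quaternion") -> "Quaternion":
        # Hamilton product of (w, x, y, z) quaternions
        return Quaternion(
            self.w*o.w - self.x*o.x - self.y*o.y - self.z*o.z,
            self.w*o.x + self.x*o.w + self.y*o.z - self.z*o.y,
            self.w*o.y - self.x*o.z + self.y*o.w + self.z*o.x,
            self.w*o.z + self.x*o.y - self.y*o.x + self.z*o.w,
        )

i = Quaternion(0, 1, 0, 0)
j = Quaternion(0, 0, 1, 0)
print(i * j)   # i*j = k, i.e. Quaternion(w=0, x=0, y=0, z=1)
```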
But all I can really feel is gratitude for all of us not having to do namedtuple/slot contortions anymore. Good riddance.
Voila!