Show HN: Array – A Better Python List

(github.com)

87 points · lauriat · 5y ago · 66 comments

54 comments · 14 top-level

pedrovhb · 5y ago · 15 in thread

I think this is neat but I'm not sure it's the best way to go about things.

> all(map(func3, filter(func2, map(func1, zip(a, b)))))

> a.zip(b).map(func1).filter(func2).forall(func3)

The original is indeed terrible and the second version is a bit better. A lot better than either one, though, is splitting your logic into multiple lines and assigning a descriptive identifier to each step. Maybe even throw in some inline comments if you're particularly respectful of others' time.

As tempting as it is to do something super clever and cram a ton of functionality into a small number of lines or characters (it does feel good), it's just better to be a bit more verbose and write simple, obvious code. I feel like code should be read like a book, not a puzzle.

brundolf · 5y ago

What I like about "cramming a ton of functionality into [a single expression]" is that it doesn't leak any intermediates to the rest of the block, and it doesn't allow for mutation. There's a single output exposed; you can't accidentally use the wrong value downstream. You could wrap it all in an inner function, I guess, but that seems like overkill unless you plan to reuse it.

Though to be fair, having explicit intermediate variables is idiomatic in Python, from what I've seen. It's one of my biggest pet-peeves about the language, but it's not without precedent.

techdragon · 5y ago

This is exactly the main situation where I'll happily "get clever" with my code.

It's not being reused and one of the following is true: I don't want to leave behind intermediary objects for whatever reason is relevant, or I feel it's worth it to compress the logic to make it possible to use a language feature that requires an expression, like lambdas or list/dict comprehensions.

bko · 5y ago

> a.zip(b).map(func1).filter(func2).forall(func3)

Let's make this a somewhat concrete example.

---

    # Assumes heights and widths are Arrays and to_area takes a (height, width) pair
    heights = Array(1, 2, 3)
    widths = Array(4, 5, 6)

    # printing areas greater than 10

    # functional
    heights.zip(widths).map(to_area).filter(lambda area: area > 10).forall(lambda a: print(f"Area {a}"))

    # verbose way
    hw_zipped = heights.zip(widths)
    areas = hw_zipped.map(to_area)
    big_areas = areas.filter(lambda a: a > 10)
    for a in big_areas: print(f"Area {a}")

---

Which do you prefer? I would argue the right level of abstraction is the functional way in this example, and it's often the case in my experience, especially in python where you don't often use a namespace to store these intermediary variables and you can't rely on typing.

claytonjy · 5y ago

As another point of comparison, as of python 3.8 you can do this in one list comp without nesting or double-computing areas with the walrus:

    result = [area for x,y in zip(heights,widths) if (area := to_area(x,y)) > 10]

I don't think that's very easy to read; I'd opt for two list comps like

    areas = [to_area(x,y) for x,y in zip(heights,widths)]
    result = [area for area in areas if area > 10]

But I agree with OP that map+filter is easier to read.

syrrim · 5y ago

  for x, y in zip(a,b):
      area = to_area(x, y)
      if area > 10:
          print(f"Area {area}")

>in python where you don't often use a namespace to store these intermediary variables

Hm? Most python code is within a function, in my experience.

lauriat (OP) · 5y ago

I agree, and yes, the line may be a bit excessive. The idea of Arrays is not just to cram a heap of functions into a single line. The readability (at least to me) is improved even with e.g. a single map

  arr.map(func)

vs.

  list(map(func, arr))

snicker7 · 5y ago

> assigning a descriptive identifier to each step

Working with data scientists, in practice, these identifiers are usually "arr1", "arr2", &c. I'd rather have method chaining. Often the intermediates are not meaningful.

disgruntledphd2 · 5y ago

I agree with you in general, people (especially data scientists) are bad at naming things.

It's probably the core skill of good programmers though, so it should be taught more. I don't think anyone sets out to use misleading names, but it's easy for name and code to diverge, and it's crippling to readability.

However, often when refactoring/updating such data scientist code (or even understanding), I need to break apart the long method chains, and this is much, much more annoying than dealing with crummy names.

At least I can print the values associated with the names, which is not easily possible in the really long method chain.

derwiki · 5y ago

Code is read more often than it’s written; optimize for reading.

dragonwriter · 5y ago

> As tempting as it is to do something super clever and cram a ton of functionality into a small number of lines or characters (it does feel good), it's just better to be a bit more verbose and write simple, obvious code.

I find fluent style often clearer as well as more terse than with superfluous intermediate variables. Verbosity isn't the same thing as clarity.

(But in Python, comprehensions/genexps are often clearer than either.)

ElevenPhonons · 5y ago

Are these really the same?

The idiomatic Python 3 version uses generators to compose the computation and to avoid unnecessary memory allocations. Does funct.Array also do this?

- https://docs.python.org/3/library/functions.html#map - https://docs.python.org/3/library/functions.html#filter
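
To make the laziness question concrete, here is a minimal sketch of how the built-in `map` defers work compared with an eager list comprehension. It says nothing about how funct.Array behaves; the `square` helper and the `calls` log are illustrative names only.

```python
calls = []  # log of every value the function was actually applied to


def square(x):
    calls.append(x)
    return x * x


numbers = [1, 2, 3, 4]

lazy = map(square, numbers)    # builds an iterator; nothing is computed yet
calls_before = list(calls)     # snapshot of the log: still empty

first = next(lazy)             # pulls exactly one element through the function
calls_after_one = list(calls)  # only the first input has been processed

eager = [square(x) for x in numbers]  # a list comprehension computes everything at once
```

The same deferral applies to `filter` and to generator expressions, which is why a chain of them allocates no intermediate lists until something consumes the result.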

6gvONxR4sf7o · 5y ago

You can split the a.b.c.d onto different lines and comment each, which is a decent middle ground sometimes (a\n.b\n.c\n.d). A problem, still, is exceptions and debugging. You get paged and see that something went wrong in that expression that does so many different things, and it’s much more frustrating to track down the bug. It makes step debugging trickier too. I’d love better error message/debugger support for that kind of programming.

rowanG077 · 5y ago

I disagree with this. Splitting this simple pipeline into more variables makes stuff a lot less readable. Splitting it into variables would very clearly indicate to me the intermediate computations are used elsewhere. Which wouldn't be the case here.

Phemist · 5y ago

This feels like a strawman example. I feel like list comprehension results in a much more readable example here. I think, at least.

> all(func3(a) for h,w in zip(a,b) for a in func1(h,w) if func2(a))

lauriat (OP) · 5y ago

Fair enough. Readability is subjective but I understand the sentiment. Constructing list comprehensions of such long chained expressions can be rather tedious and error prone, though (as your example shows).

jamespwilliams · 5y ago · 5 in thread

Looks cool.

    bool (__bool__) Returns whether all elements evaluate to True.

I’d be worried that this will trip people up who use the

    if l:
        print(l[0])  # or whatever

pattern

fantod · 5y ago

To be fair, using "if something" in Python is pretty much always a good way to trip yourself up.

nemetroid · 5y ago

I've yet to see a (popular) style guide recommend against "if something:".

pansa2 · 5y ago

PEP8 recommends using `if seq:` instead of more verbose alternatives like `if len(seq):`.

lauriat (OP) · 5y ago

Thanks!

Good point. However setting

  def __bool__(self): return self.nonEmpty

would mess up certain methods e.g. .index for nested Arrays as __eq__ is computed elementwise and bool(Array(False, False)) would evaluate to True.

Maybe a warning would be appropriate? (as is the case with ndarrays)

pansa2 · 5y ago

> bool(Array(False, False)) would evaluate to True

Isn't that consistent with the built-in `list`, though, because `bool([False, False])` is True?
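
The two meanings of truthiness being debated in this subthread can be put side by side. This is a minimal sketch: `AllTruthy` is a hypothetical list subclass written for illustration, not the library's Array class.

```python
flags = [False, False]

list_is_truthy = bool(flags)  # built-in list: truthy because it is non-empty
all_elements = all(flags)     # element-wise check: False
empty_is_truthy = bool([])    # built-in list: empty means falsy


class AllTruthy(list):
    """Hypothetical list subclass whose truthiness means 'every element is truthy'."""

    def __bool__(self):
        return all(self)


mixed = AllTruthy([0, 1])
empty = AllTruthy()

mixed_is_truthy = bool(mixed)      # False even though the container is non-empty
empty_all_is_truthy = bool(empty)  # True, because all() of nothing is True
```

With the second definition, an `if l:` guard followed by `l[0]` would let an empty container through and raise an IndexError, which is the failure mode raised at the top of this subthread.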

Immortal333 · 5y ago · 4 in thread

Chaining has its own benefits. But I think this doesn't fit the definition of "Pythonic". Again, "Pythonic" is highly debatable. But, You can always break down big chain of operations, into smaller chain using good variable naming in-between.

Many operations on lists, like filter and groupby, are implemented as iterators in Python. Looking at your code, it looks like you're not doing lazy computation (correct me if I'm wrong). This could have a huge performance impact, depending on how the list is used.

lauriat (OP) · 5y ago

I understand the unpythonic nature of Arrays may startle some hardcore pythonistas, but the ability to chain functions was one of the main reasons why I wrote the package, as I find nested function calls ugly and sometimes rather hard to decipher.

Regarding the performance, Arrays aren't meant to be super high performing but rather a simple way to manipulate sequences. For the best performance you should go with generic python, toolz or other.

nerdponx · 5y ago

I am with you on this. Personally, I would rather continue using Toolz (https://github.com/pytoolz/toolz), and contribute additional helper/utility methods to that library.

The whole point of some things being functions versus methods is that they are generic rather than specialized. The generic iterator protocol is probably the best feature about the Python language, and it's both a damn shame and bad design to not use it.

If you really wanted to make an improvement over built in lists, the thing to do would be to implement some kind of fully lazy "query planning" engine, like what Apache Spark has. Every method call registers a new method to be applied with the query planner, but does not execute it. Execution only occurs when you explicitly request it. That way you can effectively compile in efficient but readable code that takes multiple passes over the data into efficient operations internally that only make one pass, or at least fewer passes. This also naturally lends itself to parallelization/concurrency.
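
A very reduced sketch of the deferred-execution idea described above: each method call only records a step, and the recorded steps run in a single pass when results are requested. `LazyPipeline` and `collect` are hypothetical names, and this does no real query planning or optimization of the kind Spark performs.

```python
class LazyPipeline:
    """Hypothetical wrapper that records map/filter steps and runs them on demand."""

    def __init__(self, source, steps=()):
        self._source = source
        self._steps = tuple(steps)

    def map(self, func):
        # Record the step; nothing is executed here.
        return LazyPipeline(self._source, self._steps + (("map", func),))

    def filter(self, predicate):
        return LazyPipeline(self._source, self._steps + (("filter", predicate),))

    def __iter__(self):
        # One pass over the source: each item flows through every recorded step.
        for item in self._source:
            keep = True
            for kind, func in self._steps:
                if kind == "map":
                    item = func(item)
                elif not func(item):
                    keep = False
                    break
            if keep:
                yield item

    def collect(self):
        """Explicitly request execution and materialize the results."""
        return list(self)


pipeline = LazyPipeline(range(10)).map(lambda x: x * x).filter(lambda x: x % 2 == 0)
even_squares = pipeline.collect()
```

Because each call returns a new pipeline instead of mutating the old one, a partially built chain can be reused as the base for several different queries.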

jmuhlich · 5y ago

Dask does the lazy evaluation and query planning thing on numpy arrays and pandas dataframes, and can execute in parallel. It mimics most of their native interfaces which makes it a pretty easy drop-in.

https://docs.dask.org/en/latest/

feanaro · 5y ago

> But, You can always break down big chain of operations, into smaller chain using good variable naming in-between.

I don't think so. Very frequently the intermediate values represent nothing in particular and naming them simply results in visual noise.

I think this is comparable to SQL or LINQ statements. Consider what those would look like if you had to name every intermediate value instead of being able to filter and group on-the-fly.

Of course you can make a mess out of those too, by building huge unreadable expressions, but that's also an extreme, similar to naming every intermediate step.

RocketSyntax · 5y ago · 3 in thread

Thank you for improving things and sharing.

I use numpy & pandas, lists & dicts every day. I read your docs/github page, but can you help me see the value?

However, I do think there are lots of common tasks that need to be done with lists that should be methods rather than fancy footwork =)

For example: https://stackoverflow.com/questions/3462143/get-difference-b...

As you allude to w your zip loop: https://stackoverflow.com/questions/1919044/is-there-a-bette...

lauriat (OP) · 5y ago

Thank you for taking the time to check it out!

Naturally if you're dealing with big arrays/tensors, numpy is the best choice for operating on sequences.

However, ndarrays have downsides for certain use cases: as ndarrays are fixed size, adding elements is very slow; they don't support functional methods (or rather, you have to create a new array every time you apply e.g. a map); and ndarrays of any type other than numbers don't really make sense.

Many of the methods are wrappers for built-ins, but I find the syntax of Arrays cleaner than the weirdness of the builtins.

For example, while applying an async "starmap" to an Array is just a method call, with built-in lists you would have to go through the whole hassle of importing both ThreadPoolExecutor and starmap, creating an executor, scheduling the function, and finally converting the result back to a list.
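
For comparison, one way the standard-library route can look. This is a sketch under stated assumptions, not a description of what Array does internally: the `add` function and the `pairs` data are made up, and `executor.map` is used here in place of scheduling `starmap` by hand.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import starmap


def add(x, y):
    return x + y


pairs = [(1, 2), (3, 4), (5, 6)]

# Sequential baseline: starmap unpacks each tuple into positional arguments.
sequential = list(starmap(add, pairs))

# Threaded version: executor.map expects one iterable per positional argument,
# so the pairs are transposed with zip(*pairs) before being passed in.
with ThreadPoolExecutor(max_workers=2) as executor:
    threaded = list(executor.map(add, *zip(*pairs)))
```

`executor.map` yields results in input order, so the two lists match; whether the threads buy any speedup depends on the function being mapped.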

lunixbochs · 5y ago

asyncmap using a thread pool with more than one worker by default is a little silly. Unless you map to a C function, you're just spawning a bunch of threads to contend for the GIL anyway.

RocketSyntax · 5y ago

ndarrays "create a new array every time you apply"

That resonates with me now that you explain that I can't do it.

I do like chaining things in pandas like `df.select_dtypes("float").head(100).plot.hist()`

topper-123 · 5y ago · 3 in thread

I'd like to have a chaining operator in Python, like R is getting. Then the example could be:

> a |> zip(b) |> map(func1) |> filter(func2) |> forall(func3)

The advantage would be that this would work with all lists/iterables, so no need to make special types.

brundolf · 5y ago

The hard part with this is it sort of requires currying once you have >1 arguments, or something equivalent. I suppose Python could carve out an implicit behavior where the first or last argument is what gets fed into, but that feels potentially confusing as the calling syntax is now "lying" to you. In JavaScript doing a proper currying style isn't too hard because of arrow-syntax, but using python's function definition syntax to make a curried function would be hideous (not to mention, the standard library isn't done that way). Maybe you could have a "curryify" higher-order function. Or, the final option would be to have an explicit "insert previous value here" syntax as a part of the pipeline syntax, which is something the JS proposal has played with. Makes things more verbose (|> double(#) instead of |> double), but is maximally flexible and minimally confusing.

In short: it's a lot more complicated than it seems, but I agree that this style makes this type of thing 1000x more readable.

housecarpenter · 5y ago

Python does have a "partial" function, which does partial application (close enough to currying for this purpose):

  from functools import partial

  a |> partial(zip, b) |> partial(map, func1) |> partial(filter, func2) |> partial(forall, func3)

Obviously it's a bit more verbose than if the currying was done implicitly, but it's not too bad, I think. You could also import partial under a shorter name if you want.

partial does have an advantage over implicit currying in that you can use keyword arguments to neatly curry on a parameter other than the first, although this isn't properly utilized by Python because most of the built-in functions have place-based rather than keyword arguments. In languages with implicit currying you have to use anonymous function expressions or functions like flip (flip(f, x, y) = f(y, x)) to deal with this.

It might also be worth noting that |> doesn't essentially need to be an operator, it would just be syntactic sugar:

  def chain(x, *fs):
      y = x
      for f in fs:
          y = f(y)  # feed the previous result, not the original input, into the next step
      return y

  chain(a, partial(zip, b), partial(map, func1), partial(filter, func2), partial(forall, func3))

Obviously having it as an infix operator is nicer, and produces fewer parentheses.
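
An infix form can be approximated today by borrowing the existing `|` operator. The `Pipe` class below is a hypothetical helper, not part of any library mentioned in this thread; like `partial`, it passes the piped value as the last positional argument.

```python
class Pipe:
    """Hypothetical helper: `value | Pipe(f, *args)` evaluates f(*args, value)."""

    def __init__(self, func, *args):
        self.func = func
        self.args = args

    def __ror__(self, value):
        # Python falls back to this reflected method when the left operand's
        # type does not handle `|` with a Pipe on the right.
        return self.func(*self.args, value)


def is_even(n):
    return n % 2 == 0


result = [1, 2, 3, 4] | Pipe(map, lambda n: n * 3) | Pipe(filter, is_even) | Pipe(list)
```

This relies on the left operand not claiming `|` for itself, so a type that accepts arbitrary right-hand operands in its own `__or__` would bypass the helper.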

topper-123 · 5y ago

Just letting it implicitly be the first parameter would be good enough IMO, and a nice symmetry to `self` in methods. That'd be very simple, which would be a plus in my book.

Pandas allows the first param in a pipe to be a tuple[callable, str], where the second argument would signify the parameter location, e.g. `val |> (func, "param_name")` which gives some flexibility.

But yeah, if you open up to piping, there are a lot of possible choices to be made and easy to go overboard also IMO.

brian_herman · 5y ago · 3 in thread

Can you do the same thing with dicts and make it so d['non_existant_key'] does not create an exception?

basdftrewq · 5y ago

    from collections import defaultdict

    d = defaultdict(int)

    d['non_existant_key']

st0le · 5y ago

Not quite the same as what OP asked for. This will create the key and assign value 0 to it.

lauriat (OP) · 5y ago

You can already do that with

  d.get("non_existant_key", default)
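
The difference between the two suggestions in this subthread is whether the lookup mutates the dict. A minimal sketch, with `"missing"` as an arbitrary example key:

```python
from collections import defaultdict

counts = defaultdict(int)
value = counts["missing"]            # returns int() == 0 and inserts the key
was_inserted = "missing" in counts   # the lookup itself changed the dict

plain = {}
fallback = plain.get("missing", 0)   # returns the default without touching the dict
still_absent = "missing" not in plain
```

`get` behaves the same way on a defaultdict, so it can be used there too when a lookup should not create an entry.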

njharman · 5y ago · 2 in thread

> all(map(func3, filter(func2, map(func1, zip(a, b)))))

That is super readable to me. Working left to right or inside out. There is one clear, balanced, familiar, consistently used punctuation to guide you (parens) if you need it, and it adds little noise if you don't.

The “bunch of functions taking and returning an iterator” is a great paradigm. So clean, flexible, and powerful. Especially combined with Python’s “many things are iterable” and the fact that it is trivial to write your own iterator.

faitswulff · 5y ago

I come from Ruby, but it’s pretty unreadable to me. Not that I couldn’t, I just don’t want to. So I doubt that it’s objectively super easy to read.

Any sort of reading inside out, right to left is a barrier to easy reading. This is why people like pipes in functional languages, right? You just read it in one direction.

lifthrasiir · 5y ago

I have used Python for decades (not so much nowadays, but still) and it is very unreadable for me. It's clear that it is a data pipeline but the input and filters are all in a wrong order, thus backtracking is required for reading. I have the same complaint about str.join.

nerdponx · 5y ago · 1 in thread

I feel obligated to point out the existence of the "array" package in the Python standard library: https://docs.python.org/3/library/array.html

I'm sure the author is aware of it, but readers might not be.

lauriat (OP) · 5y ago

That's why the A is capitalised ;)

orf · 5y ago · 1 in thread

> all(map(func3, filter(func2, map(func1, zip(a, b)))))

You definitely wouldn’t do this in “traditional Python”. You’d use a comprehension of some kind, or even the walrus operator, which is quite possibly faster and more readable than several chained lambdas.

lauriat (OP) · 5y ago

Fair enough, the example is a bit exaggerated. You could implement it with comprehensions

  all(func3(y) for y in (func1(x) for x in zip(a, b)) if func2(y))

It most likely is a bit faster, but I wouldn't say it's more readable.

goodside · 5y ago · 1 in thread

I know this is an early/experimental project, but the README could use more motivation before diving into basic usage. Asking someone to change their general-purpose containers is a big ask.

It looks like Array mostly consolidates functional features already available in standard libraries, and the main innovation is a redesigned swiss-army-knife API.

Good APIs are important, but my instinct is they aren’t this important. Using enhanced versions of built-in container types sounds nice, but do you really want to be keeping track of whether something is a normal list or an Array? Do you want to force people who read your code to learn this library to work with something as fundamental as lists? It’s not an impossible bar to clear (e.g. NumPy, Pandas, Dask, xarray) but it’s a high one.

lauriat (OP) · 5y ago

Thanks for the feedback! Redesigned swiss-army-knife is well put.

I’m sure Array’s not for everyone, but for some, including me, it’s a nifty tool. I don’t expect people to memorise all the features of the library; the aim was to name and document each feature clearly so that finding the right method would be easy with the help of an IDE.

asimjalis · 5y ago · 1 in thread

I find this really useful and plan to use it. Thanks for writing and sharing.

One use case for the chaining/FP style that I find particularly powerful is building out logic on the REPL. The chaining style allows me to incrementally grow my chain like a unix pipeline, see the results, use that to tweak the chain, until I finally have what I want.

This type of instantaneous feedback loop is both highly productive and also extremely fun.

lauriat (OP) · 5y ago

cheers!

notretarded · 5y ago · 1 in thread

Why would I use this over numpy?

lauriat (OP) · 5y ago

If you're doing matrix multiplication or other math operations on fixed size sequences, you shouldn't.

If, however, you need the dynamic nature of the built-in list or functional methods with a touch of numpyness, you should give Array a spin.

aldanor · 5y ago

Why the JS-like naming, weird method naming conventions with strange underscores, and capitalized module name? I can't remember a single commonly used Python library with naming this strange.

E.g.

    def removeByIndex(self, b):
        """ Removes the value at specified index or indices. """
        ...

    def removeByIndex_(self, b):
        """ Removes the value at specified index or indices in-place. """
        ...

If you were to follow typical naming conventions, these would be either

    def remove_by_index(self, b): ...
    def remove_by_index_inplace(self, b): ...

Or pandas-like:

    def remove_by_index(self, b, inplace=False): ...

Or, one more step, use explicit typing as well (which also makes it more clear that the method returns self), and give a better name to the method argument rather than 'b':

    def remove_by_index(
        self, 
        index: Union[int, Iterable[int]], 
        inplace: bool = False,
    ) -> 'Array': ...

Explicit type signatures in libraries like this make many things self-explanatory, like the one above.

lunixbochs · 5y ago

As this doesn't use `__slots__`, every empty Array() will be 176 bytes vs the 56 bytes of [], and incur a dict allocation per array.

This is due to classes without `__slots__` gaining a `__dict__` attribute for dynamic attribute assignment.

Currently:

    >>> sys.getsizeof([])
    56

    >>> a = Array()
    >>> sys.getsizeof(a)
    72
    >>> sys.getsizeof(a.__dict__)
    104

with `__slots__ = []` in the Array class definition:

    >>> a = Array()
    >>> sys.getsizeof(a)
    56
    >>> sys.getsizeof(a.__dict__)
    AttributeError: 'Array' object has no attribute '__dict__'
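
The effect can be reproduced without the library. In this sketch `PlainList` and `SlottedList` are stand-in list subclasses, not the actual Array class, and the byte counts are left unasserted because they vary between Python versions and platforms.

```python
import sys


class PlainList(list):
    """Stand-in subclass without __slots__: each instance can carry a __dict__."""


class SlottedList(list):
    """Stand-in subclass with empty __slots__: no per-instance __dict__."""

    __slots__ = ()


plain = PlainList()
slotted = SlottedList()

plain_has_dict = hasattr(plain, "__dict__")
slotted_has_dict = hasattr(slotted, "__dict__")

# Sizes of the bare objects; getsizeof does not include a separately allocated __dict__.
plain_size = sys.getsizeof(plain)
slotted_size = sys.getsizeof(slotted)

plain.label = "ok"  # dynamic attributes work only on the class without __slots__
```

The trade-off is that a slotted class rejects ad-hoc attribute assignment, which matters if users or subclasses rely on it.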
