Very interesting. This PEP is still in draft state, but I am interested to see how the community will react. For me, I have a few thoughts:
1) This is really close to Erlang/Elixir pattern matching and will make fail-early code much easier to write and easier to reason about.
2) match/case means double indentation, which I see they reasoned about later in the "Rejected ideas". Might have a negative impact on readability.
3) Match is an already used word (as acknowledged by the authors), but I think this could have been a good case for actually using hard syntax. For me, perhaps because I'm used to it, Elixir's "{a, b, c} = {:hello, "world", 42}" just makes sense.
4) I hope there won't be a big flame-war debacle like with :=
5) And then finally there is the question of: "It's cool, but do we really need it? And will it increase the surprise factor?" And here I'm not sure. Again, this was the concern with the new assignment expression. The assignment expression is legitimately useful in some cases (no more silly while True), but it might reduce the learnability of Python. Python is often used as an introductory programming language, so the impact would be that curricula need to be adjusted, or beginner programmers will encounter some surprising code along the road.
I can't say this is a good or bad proposal, I want to see what other opinions are out there, and what kind of projects out there in the world would really benefit from syntax like this.
> While matching against each case clause, a name may be bound at most once, having two name patterns with coinciding names is an error.
    match data:
        case [x, x]:  # Error!
            ...
Which is a bit of a shame. This comes in handy in Elixir to say "the same value must appear at these places in the collection". I.e. for a Python tuple pattern `(x, y, x)`, `(3, 4, 5)` would not match but `(3, 4, 3)` would.
Overall, though, I think this will be a great addition to Python. Pattern matching is generally a huge boost for expressiveness and readability, in my opinion.
More the other way around. Guards look more general (being able to express not-equal, less-than, etc.), so special semantics for `x, x` only seem warranted if it's very common.
Which equality would you use if it's not numbers, but something more complicated?
What if `(x, y, x)` is matched against some `(a, b, c)` where `a == c` but not `a is c`? If it matches, and you then mutate `x` in the body, does that mutate `a` or `c`?
I think at least making it an error now can leave it open to defining it later.
There's certainly no reason to use reference equality. In fact, I think equality is the wrong question: it should duplicate the match. `case (x, y, x)` should `__match__((a, b, c))` IFF `a.__match__(c)` (Not sure if that's exactly the right syntax for the Python calls, but hopefully the idea is clear.)
(Elixir has a similar operation to "pin" in a match: if you use an existing variable as the pattern, but pin it, the value must match whatever the variable is already bound to.)
> I think at least making it an error now can leave it open to defining it later.
What's interesting here is this is gated on the new parser from PEP-617 [1]. Based on mailing list discussions [2], and the PEP, the use of `match` will be context-sensitive, so it shouldn't be as disruptive of an introduction as `async`.
I have a visceral dislike of pattern matching. Lisp shows just how much people will abuse it in real-world production codebases. It becomes impossible to understand even simple logic without comments. I’d link to some examples, but I’m on mobile; suffice to say, pull up the emacs codebase and read through some of the more advanced modules like edebug.el. I’m not certain that one uses pattern matching, but it’s a perfect example of “this codebase cannot be understood without extensive study of language features.”
You may argue that I am simply not versed enough in pattern matching. “You should study harder.” I would argue that simplicity is worth striving for.
I hope this PEP never moves beyond draft.
It’s also shocking that most people here seem to be tacitly supporting this, or happy about it. Yes, it’s cool. Yes, it might simplify a few cases. But it will also give birth to codebases that you can’t read in about, say, 5 years. And then you’ll have a bright line between people in the camp of “This is perfectly readable; it does so and so” and the rest of us regular humans that just want to build reliable systems.
And oh yes, it becomes impossible to backport to older python versions. Lovely.
Even if a fail-safe version of destructuring-bind is used to validate and get the basic shape, it's still tedious:
    (destructuring-case expr
      ...
      ((a (b c))
       (if (and (eq a 'and)
                (eq b 'not))
           ;; now check that c is a list of nothing but (not x) forms
           ...)))
I don't have a pattern matcher in TXR Lisp. That is such a problem that it's holding up compiler work! Because having to write grotty code just to recognize patterns and pull out pieces is demotivating. It's not just demotivating as in "I don't feel like doing the gruntwork", but demotivating as in, "I don't want to saddle my project with the technical debt caused by cranking out that kind of code", which will have to be rewritten into pattern matching later.
People use pattern matching all the time in ML, Rust, Haskell, Scala, or Elm. It's totally uncontroversial there, and it's helpful to readability, not harmful. Erlang and Elixir also show it works pretty well in untyped languages.
The defining feature of lisp is that it can support whatever you want, because the AST is available at compile time and is completely regular. If your point is about dicts specifically, then you might be technically correct, but I assure you that the majority of lisp codebases do support exactly the sort of pattern matching in this PEP. And the abuses are frankly egregious. Racket is the worst offender of them all, with syntax matching.
I’ve never had this issue with lisp codebases, and I’ve read through quite a bit of lisp by now. Lisps (whether CL, Clojure, or Emacs Lisp) are some of the easiest languages in which to identify and correct the source of a bug.
2. Everything in between () is "has an attribute with value".
3. List means "the attribute should be treated as a tuple of".
etc.
Very confusing; this definitely needs another syntax, because both newcomers and experienced devs will be prone to reading it as plain `==`, since that's how enums and primitives will work.
This syntax goes against Zen:
It’s implicit -- inside a match/case, expressions don't mean what they regularly mean.
It’s complicated -- basically it’s another language (like regex) which is injected into Python.
I’m a big believer in this feature, it just needs some other syntax.
Using {} instead of () makes it a lot better. Now no way to confuse it with simple equality.
    match node:
        case Node{children=[Leaf{value="("}, Node{}, ...]}:
            ...
It's worth noting that there's a truly massive amount of precedent in other languages for Python implementing it using the syntax as proposed. Languages that have or are planning to include pattern-matching where the pattern syntax exactly mirrors the expression syntax like this include Rust, Swift, OCaml, Haskell, C++, Ruby, Erlang, and many, many more.
I understand the worry that newcomers might struggle, but I don't think it's going to be the case: newcomers regularly learn the languages listed above without stumbling across that problem. And if Python did choose a syntax like the one you're proposing, it'd also be the odd one out among dozens of mainstream languages including this feature, which I think would be even more confusing!
Yes, I think that's what the original commenter was missing. Pattern matching is not new (but it is awesome); there are already expectations of how it would look in Python.
> Very confusing, this definitely needs another syntax
The entire point of structural pattern matching is that structuring and destructuring look the same.
> This syntax goes against Zen: It’s implicit -- when using match case expressions don't mean what they regularly mean.
There's nothing implicit to it. The match/case tells you that you're in a pattern-matching context.
> I’m a big believer in this feature, it just needs some other syntax. Using {} instead of () makes it a lot better. Now no way to confuse it with simple equality.
Makes it even better by… looking like set literals and losing the clear relationship between construction and deconstruction?
At the current rates, it seems like it's only going to be another 5 years or so before Python is straight-up a more complicated language than Perl 5. What it lacks in frankly bizarre corner cases it's going to make up for in subtly bizarre corner cases.
I used to feel like I could define __getattr__ or __setattr__ and understand the implications, but that's getting increasingly terrifying.
Uhhh... yeah but that's true of almost every language feature, right? I love pattern matching, but I think the point is that Python risks turning into a "kitchen sink" language with too many syntactic forms to keep in one's head if this continues.
Sorry, are you trying to argue for or against the proposition that Python is on track to be straight-up more complicated than Perl 5 at this rate? Your tone suggests you think you're arguing against it, but all the evidence you're bringing is for the proposition.
Pattern matching traditionally resembles instantiation except potentially with wildcards, because the “pattern” in “pattern matching” is an object representation template in the same representation used elsewhere in the language.
Using a different syntax for pattern matching loses the main point of pattern matching.
And the counter-point is: you learn one time that it's not instantiation and remember it forever, and you get to resume reusing/overloading muscle-memory syntax to get the job done.
glad to see this! though it's a shame that the proposed `match/case` is a statement, not an expression:
> "We propose the match syntax to be a statement, not an expression. Although in many languages it is an expression, being a statement better suits the general logic of Python syntax."
no matching in lambdas unless those get overhauled too :(
instead, let's get excited for a whole bunch of this:
    match x:
        case A:
            result = 'foo'
        case B:
            result = 'bar'
> glad to see this! though it's a shame that the proposed `match/case` is a statement, not an expression:
> no matching in lambdas unless those get overhauled too :(
Hate it or like it, but it's congruent with the rest of Python design.
Guido has been hostile to introducing too much FP into Python.
While I find it frustrating from time to time, especially when I come back to Python from another language, in the long run I must say I understand.
I don't like to work with FP languages, because unless your team is really good, the code ends up hard to read and debug. It's possible to make very beautiful code in FP, but it makes it so easy to create huge chains of abstract things.
And devs are not reasonable creatures. Give them a gun with 6 bullets, they'll shoot them all, and throw some stones once it's empty.
To me, the average dev is not responsible enough with code quality, and should not be trusted. I'm all for tooling enforcing as much as you can. Linters and formatters help a lot. But sometimes, language design limiting them is a godsend.
I've seen the result with Python: you can put a math teacher, a biologist and frontend dev on their first backend project, and they will make something I can stand reading.
I wonder if this is intentional. FP constructs are increasingly discouraged in Python, taking reduce [1] as an example. Not that this is necessarily a bad thing, per se -- I think it does simplify the language. But it's an opinionated decision that will discourage FP enthusiasts from finding a home in Python.
> I wonder if this is intentional. FP constructs are increasingly discouraged in Python, taking reduce [1] as an example.
That single comment from Guido 15 years ago, which was largely backtracked on—map and filter remain in the core library and reduce moved to a library in stdlib, lambda remains—is not really an example of something accurately described as “increasingly” now. In fact, given the backtracking, it's arguably decreasingly true from the high point of that post.
> You would not get through code review with that in any shop that has discipline.
... and i'd never think to try! that's why i mentioned using it in a REPL and called it a hack :) it's just fun. i played around with bytecode-rewriting too, doesn't mean i'd use it in production.
PS. for a less extreme version, the 1-elem-tuple trick is useful if you need an intermediate variable in a listcomp:
    [
        y + y
        for x in my_list
        for y in (foo(x),)
    ]
but it's (obviously) ugly & only useful if you're in a REPL and don't want to rewrite the whole thing.
> "[...] making it an expression would be inconsistent with other syntactic choices in Python. All decision making logic is expressed almost exclusively in statements, so we decided to not deviate from this."
so it's just keeping in line with Python's general imperativeness.
> FP constructs are increasingly discouraged in Python [...] Not that this is necessarily a bad thing, per se
i, on the other hand, do think it's a bad thing :) but what can you do...
Not sure about "increasingly" - note the comment you link to is 15 years old! It seems more like a recognition that map/filter/reduce never were idiomatic Python and there are better ways to do the same. Python has never tried to be a functional language.
Just as named functions replace multiline lambdas, I expect that named matcher functions (functions that consist entirely of a match where each arm is a return) will replace match expressions in python.
Abstractly, I'd rather have match expressions, too, but this does fit the rest of Python better.
it's a shame :( and honestly, i think the whole "if it needs statements, it should be a named function anyway" argument is misguided (and maybe kind of a post-hoc justification of the limitations of Python lambdas). introducing a named function for something that's only going to be used in one place messes up the order (you have to keep the function in your head until it gets used) thus often making the code harder to read.
Lately, Elixir has dethroned Python as the language that I get the most joy from using. Pattern matching is one of the big reasons. Great to see that Python core contributors (including Guido!) want to see this feature in Python as well! Hopefully it will be well integrated and not feel like a tacked-on feature.
If you're curious about why this is so useful, and reading the (quite dry) PEP isn't your thing, I would heartily recommend playing with Elixir for a few hours. Pattern matching is a core feature of the language, you won't be able to avoid using it. The language is more Ruby-like than Python-like, but Python programmers should still have an easy time grokking it. When I was getting started I used Exercism [1] to have some simple tasks to solve.
"The match and case keywords are proposed to be soft keywords, so that they are recognized as keywords at the beginning of a match statement or case block respectively, but are allowed to be used in other places as variable or argument names."
That's interesting. Python 3.6 had "async" and "await" as soft keywords, before they became reserved keywords in 3.7 [1]. However, soft keywords have just been added to Python more generally [2], so aren't such a special case anymore.
Interesting proposal, but I'm cringing at yet another overload for the * symbol.
    a * b    # "a times b"
    a ** b   # "a to the power of b"
    f(*a)    # "call f by flattening the sequence a into args of f"
    f(**a)   # "call f by flattening the map a into key/value args for f"
    [*a]     # "match a sequence with 0 or more elements, call them a"
Am I missing something? I know these all occur in different contexts, still the general rule seems to be "* either means something multiplication-y, or means 'having something to do with a sequence' -- depends on the context". It's getting to be a bit much, no?
>>> [x, *other] = range(10)
>>> x
0
>>> other
[1, 2, 3, 4, 5, 6, 7, 8, 9]
So this instance of the syntax is not that novel. If it's a mistake, it's too late to fix it.
A unary asterisk before a name means the name represents a sequence of comma-separated items. If it's a name you're assigning to (LHS) that means packing the sequence into the name, if it's a name you're reading from (RHS) that means unpacking a sequence out of the name.
Yes, the form used in function declaration (the inverse of the form in function calls) which is pretty much exactly the same as the new use, “collect a sequence of things specified individually into a list with the given name”.
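A minimal sketch of both directions of the unary asterisk side by side (the function and variable names here are invented for illustration):

```python
# Packing: in a function signature, *rest collects extra positional
# arguments into a tuple.
def take(first, *rest):
    return first, rest

# Unpacking: in a call, *nums spreads the list into positional arguments.
nums = [1, 2, 3, 4]
head, tail = take(*nums)
print(head, tail)   # 1 (2, 3, 4)

# Packing again, on the left-hand side of an assignment (PEP 3132),
# which is the form the match-statement `[*a]` pattern mirrors.
x, *other = range(5)
print(x, other)     # 0 [1, 2, 3, 4]
```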
Unpacking was huge 10 years ago, but nowadays even JS has object destructuring, so Python was lagging behind.
It feels like they really spent a lot of time designing this: Python has clearly not been made for that, and they had to balance legacy design with the new feature.
I think the matching is a success in that regard, and the __match__ method is a great idea. The guards will be handy, while the '_' convention is finally something official. And thank god for not doing the whole async/await debacle again. Breaking people's code is bad.
On the other hand, I understand the need for @sealed, but this is the kind of thing that shows that Python was not designed with type hints from the beginning. Haskell devs must be having a laugh right now.
We can thank Guido for the PEG parser in 3.9, which allows him to co-author this as well.
I expect some adjustments to be made, because we will discover edge cases and performance issues, for sure. Maybe they'll change their mind on generalized unpacking: I do wish to be able to use that for dicts without having to create a whole block.
But all in all, I believe it will be the killer feature of 3.10, and while I didn't see the need to move from 3.7, walrus or not, 3.10 will be my next target for upgrade.
There is also switchlang¹ which isn't quite the same thing, but provides some of the functionality of PEP-622 and pampy. I believe it is notable for including a nice descriptive README, and also having a small/simple implementation.
I personally prefer the pampy internals, but quite like the context manager usage from switchlang. I don't even know which bikeshed I want to paint, let alone the colour.
This is the first PEP I’m really excited about. I hope the design is given more careful consideration, though, because destructuring in this manner more broadly across the language would also be killer.
My biggest concern is the class matching syntax. I feel like that would be much better deferred to a lambda style function or similar. The syntax matches instantiating a new class instance exactly, which seems like it could cause a lot of problems for tools that read and manipulate syntax.
Can someone explain to me the history behind Python's aversion to switch statements? I get Python is opinionated and I'm not trying to start a language war, it was just never clear to me why the `if ... elif` pattern was the preferred idiom.
I came to Python after ~5 years of programming in C-style languages (about 15 years ago) and the lack of a switch statement may be the one thing about Python that demonstrably made me a better coder.
When I discovered there wasn't one, I was really annoyed and went digging for an explanation. The one I found was a suggestion that if you are reaching for a switch statement, you're (probably) doing something wrong. This is not to impugn anyone else's approach or style, and I will freely admit there are times when all you need is a switch and if/elif/else gets ugly, but most times I find the replacement for switch is not that, but a dictionary holding a callable or similar. I recently showed this approach to a peer in a PR, and watching it click for her was awesome. She ripped out most of what she'd done and replaced some of our more ponderous permission checking with a dictionary of functions to apply.
I'd say the other thing I do when I wish I had a switch statement is realize I am writing code that is halfway to doing things The Right Way and refactor the block into smaller functions.
Can you give an example? I've never heard of this recommendation and I have a bit of trouble imagining it in a way that is simpler than a bunch of if-else
I came up with the following, and that definitely doesn't convince me.
Taking this question from the opposite angle, what benefits do switch statements give us over if...elif?
Not actually a heavy Python user, but even though most languages I use regularly are more switch/case-heavy, I've never quite grasped why there are two largely interchangeable ways to do the one thing.
In lower level languages, when you use switch, you make it easier for the compiler to generate a jump table.
In higher level languages with strong, static types, the switch statement can indicate to the compiler that you want to match on the type of the variable; it can then do analysis to make sure your pattern match is exhaustive.
These are good examples. Especially for lower-level languages this makes a lot of sense.
For higher level languages, I would imagine typeguards would provide some of the desired functionality here, while being much more lightweight than a full alternative conditional syntax.
Specific to Ruby (so this is not a generic answer): switch/case has syntax for "short form" expressions that makes the conditionals very concise, by requiring only the variable part of what would otherwise be whole repeated expressions.
Making an example is simpler than formulating a definition :-):
    case instance
    when MyClass
      ...
    when MyOtherClass
      ...
    ...
    end
which would otherwise be:
    if instance.is_a?(MyClass)
      ...
    elsif instance.is_a?(MyOtherClass)
      ...
    ...
    end
if you consider that this has support for many other expressions (regular expressions, ranges...), you'll see it's a very concise construct for the purpose.
Switch might "just" be a special case (har har) of if/elif, but it's a significant one: where you're branching only on the value of one variable, and branching on equality. I'd say it's significant enough to merit some special syntax.
To put it another way: when you're branching on the value of a single variable by comparing it to an enumerated set of values, the obvious way to do so is with switch/case.
> what benefits do switch statements give us over if...elif?
Clarity of intent, much as comprehensions provide over imperative loops (even though switch is still imperative in most languages). Though whether it's enough benefit to be warranted is another question; I think basic switch is not clearly compelling, though adding even basic smarter matching (like Ruby has long had, for instance) makes it more so.
Having both switch and if/elif allows writing code with less redundancy in it (unnecessary keywords and variables can be elided by choosing one selection construct over the other).
I'm not averse to switch statements in general, but, in Python specifically, I just fail to see the point. In a higher-syntax language like C, a switch statement can save a lot of clutter, and also has some different semantics. In a language like ML, match statements come with a whole lot of extra static checking.
But Python's if/elif/else syntax is already clean enough that there's just not much clutter to remove. And C-style semantics on switch statements wouldn't be acceptable. And Python is a very dynamic language. So, in the end, you would end up with something that's generally the same line count and the same semantics as an equivalent if-statement, meaning it would be more like semantic Splenda than semantic sugar.
I think culturally, because “explicit is better than implicit” (from the Zen of Python). Switch statements have a lot of implicit-ness to them (implicit invocation of equality comparison, to start) and it never seemed quite necessary, given how spare Python syntax is anyway.
I'd say the opposite: switch is explicitly about comparing a single variable against an enumerated set of possibilities, the equivalent if/elif construct has those same semantics only implicitly. The surface area of what switch/case means is very small and compact.
You choose to interpret a switch statement through an if statement, but that's arbitrary: the other way around is also possible (interpreting/rewriting an if-else construction as switch statements).
The article links to PEP 3103, which is a proposal to add a switch statement to the language. Interestingly, it was written by Guido himself, which you'd usually expect to give a pretty good chance of being accepted, especially since he was the sole decider of what would be accepted at the time! The rejection notice says simply:
> A quick poll during my keynote presentation at PyCon 2007 shows this proposal has no popular support. I therefore reject it.
It's a remnant from the time when Python strived for simplicity and resisted adding syntax. Just like the decision against adding a ternary operator, etc. The alternative ways (dict of functions or if chain) were considered good enough weighed against the sin of adding relatively rarely used syntax.
Suppose you are dispatching on token types. You could write something like this:
    handlers = {'start': handle_start, 'jump': handle_jump, 'end': handle_end, ...}

    for tok, arg in tokens:
        if tok not in handlers:
            raise ValueError('Wrong token type, must be any of %s.' % ', '.join(handlers))
        handlers[tok](arg)
I see, I do follow that design pattern for larger switch statements or where the dictionary had to be dynamically populated (even in non-Python languages) but I hadn't considered using it for smaller switch statements as well.
Rust was my first experience with pattern matching, and I really learned to love it there. Seeing it come to Python will be great as well. I'm glad this is on the docket, and I can't wait to see how this draft evolves.
I'm excited, but it seems like setting names by default is very odd. Quite a bit of the PEP is dedicated to the odd situations created by "case x" actually setting x rather than reading x ("case .x" would read x). Wouldn't this be a natural place for := ? So you would do:
case x := _
to match and assign to x. "_" would always be the matcher. You always have access to the original item that the match was made on, so pulling out the matched items is often not needed, AFAICT. This would be explicit, and not too surprising. Then the whole dotted names part can be dropped - it works like normal Python at that point.
The PEP already suggests this for capturing parts of the match, why not just use it for all saved matches? It's more verbose, but consistent, with fewer caveats, and not always needed.
Disclaimer: My languages don't happen to include one with good pattern matching, so I'm not strongly familiar with it.
With pattern matching you normally want bindings, at least local to the match construct (that the PEP proposes bindings with normal Python function scope, rather than local to the construct has plusses and minuses), so creating extra verbosity for bindings is complicating the normal case.
One subtle thing which I noticed is the distinction between class patterns and name patterns (bindings). In particular, it is possibly confusing that the code `case Point:` matches anything and binds it to the value Point, whereas `case Point():` checks if the thing is an instance of Point and doesn’t bind anything.
Yeah, that seems like it's going to cause trouble, because you can make a mistake without noticing. If you write `case Point:` when you mean `case Point():` you won't get an exception or a missing name, it'll just look like it thinks all objects are Points.
Linters could help. You're shadowing `Point`, and because `case Point:` matches any value, if there's another case after that then something is wrong. But you can't always rely on linters.
It's not quite syntactic sugar, but you're right that (probably) no new object would be created. Based on the "The __match__() Protocol" section [0] of the PEP, matching will call into `Node`'s `__match__` method with `node` as the arg. If Node doesn't have any custom logic here, it'll use the default __match__ implementation [1], which checks `isinstance(node, Node)` like you said, then returns `node` for the Python interpreter to check that `node.children` a) is a sequence, b) has two elements, c) has its first element matching `LParen`, using `LParen`'s `__match__` method, and d) has its second element matching `RParen`, using `RParen`'s `__match__` method. If none of these `__match__` methods are overridden, then it does basically function as the parent poster said (though I think it would work even if node.children were some other sequence type (e.g. tuple) containing `LParen()`, `RParen()`).
This is great. I'm currently trying to rewrite a heavily object-oriented library into a more functional one (the rewrite is necessary because of licensing issues, and functional because the original code is a clusterf*ck of mutation) and despite the whole company working on top of Python, I was seriously considering implementing it in SML[1] specifically due to pattern matching making the underlying algorithm of the main data structure incredibly easier to reason about and implement.
[1] Yes I know coconut-lang is a thing, but I didn't want to introduce something that looks a lot like Python but isn't in our codebase
>Note that because equality (__eq__) is used, and the equivalency between Booleans and the integers 0 and 1, there is no practical difference between the following two:
> case True: ...
> case 1: ...
From a practical perspective this is great, but I can imagine many cases where one would want to differentiate between them.
On one hand, the number can usually be nested inside a structure, and matching on == is more flexible. On the other hand, matching on 'is' would still let users relax this behaviour, and would allow matching on the type of primitives as well.