
Adding type hints to Python has increased my productivity by at least a factor of 10. They allow you to reason about code in a local function without having to track back up dozens of call sites to ensure you're getting what you think you're getting. That alone is worth the price of admission, both when editing code and when reviewing someone else's. It's fantastic, particularly in a very large code base.

The editor experience only increases that factor as you can navigate a lot more freely. Even for small code bases type hints are a revelation. Knowing whether something is nullable or not has, on its own, caught hundreds of bugs in waiting.
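To make the nullable point concrete, here is a minimal sketch. The function names and the `users` mapping are made up for illustration. With the `Optional[str]` return type in place, a checker such as mypy flags any call to `.split` on the result until the `None` case has been handled.

```python
from typing import Optional


def find_user_email(users: dict[int, str], user_id: int) -> Optional[str]:
    """Return the email for user_id, or None when the user is unknown."""
    return users.get(user_id)


def email_domain(users: dict[int, str], user_id: int) -> str:
    email = find_user_email(users, user_id)
    # Removing this check makes a type checker complain, because
    # `email` may be None and None has no `.split` method.
    if email is None:
        return "unknown"
    return email.split("@")[-1]
```

Without the annotation, the missing-user case only shows up at runtime as an `AttributeError`.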

Most of the complaints here feel like attacking a strawman, particularly the ones arguing what's the point if you're not running mypy. Run mypy! It's like saying what's the point of writing unit tests if you don't run the test suite. For any sufficiently large code base you absolutely should be running mypy with any plugins for your framework of choice added.

For heterogeneous dictionaries use `dict[str, object]` and then you're forced to use the existing duck-typing mechanisms you're used to. Or, you refactor into better types and/or structures.
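A small sketch of what that looks like in practice. The `describe` function and the `name`/`age` keys are hypothetical. Because the values are typed as `object`, the checker will not let you treat them as `str` or `int` until an `isinstance()` check has narrowed them.

```python
def describe(record: dict[str, object]) -> str:
    name = record.get("name")
    age = record.get("age")
    # `object` supports almost no operations, so each value has to be
    # narrowed with isinstance() before it can be used as a str or int.
    if not isinstance(name, str):
        raise TypeError("name must be a string")
    if isinstance(age, int):
        return f"{name} ({age})"
    return name
```

If the checks start piling up, that is usually the hint to refactor into a dataclass or similar structure.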

Is there friction? Sometimes. You don't have to annotate everything. If the cost is too prohibitive for a particular structure or function, ignore it. Sometimes there are mypy bugs that make writing correct code impossible. `# type: ignore[reason]` that sucker. Don't throw the baby out with the bath water.

We use a flake8 linter to enforce that all new or modified functions at least annotate their return type which makes interacting with functions quite a bit nicer, and encourages all devs to at least do the minimum. Usually you find that most args are appropriately typed at the same time.



Why not just use a statically typed language? I really don't understand why Python is so popular outside of a few specialized areas.


Because ecosystems are what makes people productive and languages are just a means to access them. Python's ecosystem is absolutely massive and high-quality.


Most popular languages have massive ecosystems. I don't see Python's being a differentiator except in a few specialized areas. In fact, the project I'm on runs on Spark wrapped by Python libraries, so really it's the Scala libraries that do all the heavy lifting.


We use Spark extensively. We also use @pandas_udf to run some part of our Spark pipeline through some Python-only ML libraries.


Ah, but what if I'm a masochist and I want all the verbosity of a statically typed language with none of the benefits?


Try F#: Strongly typed but 100% inference of types => best of both worlds.


It's basically a lisp with better syntax.


I'm baffled by people loving type hints in Python when strongly typed languages have been widely available for so long. This is why people liked Java and C#.


There are way more differences between Python and Java than just "having explicit types". If that was the only difference, your comment would make more sense.

I would even go so far as to say that Java's type system is the very one that left such a bad taste in people's mouth that many people swore off explicitly typed languages for a decade or two. It really was that bad, especially before the last few years. It added a lot of extra boilerplate for minimal practical benefit. People used Java because it was fast, cross-platform, and less likely to go horribly wrong than unmanaged languages like C and C++. I doubt very many people used it because they just loved typing tons of boilerplate or because they thought the type system was good enough to catch tons of errors.

The type systems in Rust, TypeScript, and other more recent languages are far more expressive (and capable of catching real-world bugs) than what Java offered. If you think Java has a good type system... things have come a long way since then. Java gets updated continually, and some people will surely jump at my comment and claim things are great these days, but, no. Java is not up to par, even today. C# has done a better job of keeping up, in my opinion, but even it still lacks extremely useful features like proper Sum Types.

I don't have much experience with Python's type hinting, so I can't comment as directly there.


And many of us prefer simplicity in languages.

When a language like TypeScript supports too many obscure features, or multiple ways to do the same thing, the learning curve becomes too steep to become proficient.


> C# has done a better job of keeping up, in my opinion, but even it still lacks extremely useful features like proper Sum Types

Does type pattern matching come close? https://dotnetfiddle.net/Oz2Qyd

Yeah, it doesn't prevent passing an unexpected type as object to the Print func and getting runtime exceptions. And returning the "object" type isn't helpful if we want to prevent runtime errors. But then again, it can be written to spit out compile-time errors: https://dotnetfiddle.net/XfKVaa


F# is a great example of a strongly typed language where you almost never have to write types explicitly. The best of both worlds IMHO.


You misunderstand the ecosystem. Many folk program in Python because it’s largely forced upon them. It’s often the easiest language to work with for data science, ML Eng, or data engineering, despite many frameworks actually running in the JVM. It’s simply more accessible. The appeal of Python has not been its provenance or design for over 15 years, but rather the ecosystem.


Most engineers I work with are very happy to work with python. They would not say it is forced upon them. You are free to provide a much better alternative (that does not exist as far as I know).


The same thing happened to the JS world with TypeScript (though TypeScript seems more powerful than Python type annotations).

It's fine though - I'm glad the culture is trending towards understanding that strong typing saves you time rather than costing you extra time. This is true even for very small and simple programs, and even if you're the only developer.


The typing debate has been ongoing for decades, and it's switched back and forth a few times. JS, Python and Ruby all came out in the 90s, after there were plenty of statically typed languages being used at the time. Alan Kay has argued that types are too restrictive and usually don't describe the kind of data the program is really about. Maybe that's not as true for an advanced typing language like Haskell. And maybe Kay had C and Pascal in mind, when he thought of objects as being the basis for a proper dynamic type system.


C, Pascal, even C# and C++ need type annotations mostly to actually make the program work. The type systems in TypeScript and F# are more about helping the developer than helping the compiler.

Python type hints are slightly different because there are type hints as a syntax feature and then there is type checking, which can be static like mypy or during runtime like Beartype. Every type checker comes with its own type system, though they are hopefully somewhat similar...


TypeScript has a more powerful type system, but arguably it's less powerful because the hints are invisible to runtime code. Python doesn't do any runtime type checking by default, but the hints are accessible in code, which unlocks a fair amount of power. For example, Strawberry is a GraphQL server library that allows you to define your GraphQL schema using the same syntax as dataclasses.
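As a rough illustration of hints being readable at runtime, here is a sketch using only the standard library. This is not how Strawberry is implemented; `Book` and `schema_for` are invented names, and the helper only handles plain classes such as `str` and `int`.

```python
from dataclasses import dataclass
from typing import get_type_hints


@dataclass
class Book:
    title: str
    pages: int


def schema_for(cls: type) -> dict[str, str]:
    """Map each annotated field to the name of its type.

    Assumes every hint is a plain class with a __name__ attribute.
    """
    return {name: hint.__name__ for name, hint in get_type_hints(cls).items()}
```

A library can walk the same information to build a schema, a validator, or a serializer from an ordinary class definition.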


Python is, and has always been, a strongly typed language.

Do you mean "statically typed"?


I really like C# and have done a bunch of work with it in the past! But I also love Python for reasons other than just static vs dynamic typing.

To be honest, I think a lot of Python [or arbitrary dynamic language] developers can be quite sloppy, throwing dictionaries around everywhere because it's easy and leading to some really hard to read code. Type hints guide people away from that sloppiness while still allowing access to most of the features that make dynamic languages so useful and expressive.

It's a best-of-both-worlds situation and I'm here for it.


Many languages also demonstrated Python-like succinctness with strong static types and type inference. E.g. OCaml dates back to the 90's and ML dates back to the 70's.


If the type system is not powerful enough, it takes extra work just to get around its inadequacy. If C# had some support for sum types and structural subtyping like Python does, I would be using it. I mean, even Haskell does not have proper records. Python is definitely not the best, but it is quite nice among the current options.
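For anyone who has not seen those two features in Python, a minimal sketch follows. The shape classes and function names are made up for illustration. `Union` gives a sum type that `isinstance()` narrows, and `Protocol` gives structural subtyping without inheritance.

```python
import math
from dataclasses import dataclass
from typing import Protocol, Union


@dataclass
class Circle:
    radius: float


@dataclass
class Rect:
    width: float
    height: float


# A sum type: a Shape is exactly one of these alternatives.
Shape = Union[Circle, Rect]


def area(shape: Shape) -> float:
    if isinstance(shape, Circle):
        return math.pi * shape.radius ** 2
    # The checker has narrowed `shape` to Rect on this branch.
    return shape.width * shape.height


class HasWidth(Protocol):
    width: float


def double_width(item: HasWidth) -> float:
    # Any object with a float `width` attribute is accepted;
    # Rect never has to declare that it implements HasWidth.
    return item.width * 2
```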


There's a difference between putting on your own seatbelt and the car forcefully tying you into the chair every time you sit.


Nice. How do you force types just in the new code?


My team started with an untyped codebase a year ago. I added a rule to CI that every PR that touches files with type errors must decrease the type errors in the touched files by at least 5. When we started we had around 30k type errors. A year later, about 5k are left. This was a small wrapper script over mypy/pyright that just called them twice, once on master and once on your branch, to compare error counts.

I did occasionally get pushback on it, but stricter checks are always an inconvenience at first, and I argued they'd help over time. Now that we've had it for a year it's pretty well accepted, in the same way many would accept running a formatter.
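A rough sketch of the comparison step, not the actual script. It assumes mypy's default `path.py:LINE: error: message` output (pyright's format differs and would need its own pattern), and it assumes that files with fewer than five errors left must be cleared completely. All names here are hypothetical.

```python
import re

# Assumed shape of an error line: "pkg/mod.py:12: error: message"
# (an optional ":COLUMN" after the line number is tolerated).
ERROR_LINE = re.compile(
    r"^(?P<path>[^:\n]+\.py):\d+(?::\d+)?: error: ", re.MULTILINE
)


def count_errors(checker_output: str, touched: set[str]) -> int:
    """Count error lines that belong to files touched by the PR."""
    return sum(
        1
        for match in ERROR_LINE.finditer(checker_output)
        if match.group("path") in touched
    )


def pr_passes(
    base_output: str,
    branch_output: str,
    touched: set[str],
    required_drop: int = 5,
) -> bool:
    """Return True when the branch removed enough errors from touched files."""
    before = count_errors(base_output, touched)
    after = count_errors(branch_output, touched)
    # Assumption: when fewer than `required_drop` errors exist,
    # the PR has to remove all of them.
    return after <= before - min(required_drop, before)
```

In CI, `base_output` and `branch_output` would be the captured checker output from the two runs, and `touched` the file list from the diff.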


The problem with these sorts of "rules" is they create needless changes outside of the PR's primary focus. The PR's purpose is to do X: add a feature, fix a bug, whatever. Now because I've done X, you want me to do Y, a set of totally unrelated changes that ultimately are a distraction.


Why wouldn’t you submit two different branches to review? One where you fix errors for types and another for the feature with the fixes:

    (git main) > git checkout -b abc-1234/type-fixes
    <do work on types and commit it>
    (git abc-1234/type-fixes) > git checkout -b abc-1234/feature-branch
    <do work on feature and commit it>

Then you can submit the feature branch as a PR on top of the type fixing branch and a PR for that branch to main. You only see the type fixing changes on that PR and only see the feature changes on its PR.


Seems fine to me, but in my experience these “rules” are often tied to a single PR and various automated CI checks.


Mypy can be configured to ignore modules based on their names.

Though honestly I think it’s worth it to just bite the bullet and spend a full day or two going through and fixing every single error. After this you will have a much easier time navigating the code base. I mean, your “new code” is probably modifications to your old code or calls into your old code, right? So you want to have at least the boundary between new and old typed. It’s quite satisfying anyway, as you will likely learn a lot and discover tons of bugs along the way.


With any large legacy Python codebase, that's usually a month-long undertaking for one engineer, if not longer!

Python by design encourages duck typing, which means that you'll have plenty of places that simply take multiple different types, and just slapping a union on them is usually non-trivial and not very beneficial either (e.g. you'd be randomly patching things with fakes and mocks in tests).

If you think it's a single-day job, I am sure my company (and plenty of others) would gladly pay you a single day's rate to get our codebase migrated in a day :-D


> day or two

Your code must be quite small.


For any large code base, it'll go on for months, since it gets done between higher priority work. You'll discover some problems, sure. But if they were really important, you'd have hit them earlier. My experience is these sorts of efforts create a lot of busy work and not much more.


You can pipe the diff into specific flake8 checkers. Disable the check by default so it's not run over the entire codebase, and have a separate line for running specific checks on just the diff. Eg:

    git diff -U0 --relative origin/master... | flake8 --diff --select <custom-codes>

I'll leave the flake8 plugin as an exercise to the reader as ours is intertwined with code I can't share right now.
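For readers attempting the exercise, the core of such a check can be sketched with the standard `ast` module. This is not our plugin, and the flake8 registration boilerplate is left out. `missing_return_annotations` is a hypothetical helper name.

```python
import ast


def missing_return_annotations(source: str) -> list[tuple[int, str]]:
    """Return (line, name) for every function lacking a return annotation."""
    tree = ast.parse(source)
    found = [
        (node.lineno, node.name)
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
        and node.returns is None
    ]
    # ast.walk does not guarantee source order, so sort by line number.
    return sorted(found)
```

A real plugin would turn each `(line, name)` pair into a reported violation with its own error code.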


The same (and only) way you’d reasonably be able to do it in any other language that uses a series of text files for an input.

File by file.


I generally force it on a per-file basis by adding

    # mypy: disallow-untyped-defs

at the top of new Python files.
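A new file under that rule might look like the sketch below. The function names are invented. The code runs either way, since the comment only affects mypy, which should report a `no-untyped-def` error for the second function.

```python
# mypy: disallow-untyped-defs


def total_length(words: list[str]) -> int:
    """Fully annotated, so mypy accepts it under the inline setting."""
    return sum(len(word) for word in words)


def shout(text):
    # No annotations: with the comment at the top of this file,
    # mypy reports this definition as an untyped def.
    return text.upper()
```

Files without the comment keep whatever the project-wide configuration says, so old code is left alone.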


You can use flakehell for that.



