> It might be good to remind the readers that the GIL removal has very little chance to break a Python-only codebase
Is this actually true? I was under the impression that some multi-threaded Python code relies on some operations being implicitly thread safe due to the GIL. For example, adding an item to the same list from two concurrent threads is never going to corrupt the list, simply because the threads never do that operation in parallel (the GIL prevents the threads from running in parallel). If you remove the GIL, suddenly you'll punish this kind of code, just like C++ quickly punishes concurrent mutation of an std::vector.
I'm not 100% sure of this, but I find this sentence a bit suspect.
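To make the pattern concrete, here is a minimal sketch of the kind of code I mean: several threads appending to one shared list with no explicit lock. The function names and counts are made up for illustration. On a GIL build the appends never run in parallel, so every item should be kept; whether that still holds without the GIL depends on the interpreter providing its own protection for the list.

```python
import threading

def append_many(shared, count):
    # Each append is a single list operation; no explicit lock is taken here.
    for i in range(count):
        shared.append(i)

def run(num_threads=4, per_thread=10_000):
    shared = []
    threads = [
        threading.Thread(target=append_many, args=(shared, per_thread))
        for _ in range(num_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return shared

result = run()
print(len(result))
```

The order of items across threads is not defined, only that none of them go missing.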
Yes, this is a thing that was trivially protected by the GIL. Go has the same issue, for example: mutating the same map concurrently can crash the program with a fatal error.
PEP 703 goes over this in the "Container Thread-Safety" section (I think "container" here refers to objects that hold references to other objects; these are the objects that CPython already special-cases so the garbage collector can track them):
> This PEP proposes using per-object locks to provide many of the same protections that the GIL provides. For example, every list, dictionary, and set will have an associated lightweight lock. All operations that modify the object must hold the object’s lock. Most operations that read from the object should acquire the object’s lock as well; the few read operations that can proceed without holding a lock are described below.
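As a rough mental model of the quoted idea, here is a toy Python sketch of a list that carries its own lock and takes it on every modifying or reading operation. `LockedList` is a hypothetical name; this is not how CPython implements it (the real locks live in C inside the interpreter), it only illustrates "one lightweight lock per object".

```python
import threading

class LockedList:
    """Toy model of a list paired with its own lock (hypothetical, not CPython's code)."""

    def __init__(self):
        self._items = []
        self._lock = threading.Lock()

    def append(self, item):
        # Modifying operations hold the object's lock.
        with self._lock:
            self._items.append(item)

    def pop(self):
        with self._lock:
            return self._items.pop()

    def __len__(self):
        # Most reads take the lock as well in this simplified model.
        with self._lock:
            return len(self._items)

def fill(target, count):
    for i in range(count):
        target.append(i)

locked = LockedList()
workers = [threading.Thread(target=fill, args=(locked, 1_000)) for _ in range(4)]
for w in workers:
    w.start()
for w in workers:
    w.join()
print(len(locked))
```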
Note that container concurrency is not specific to Python anyway.
For example, Java implements different versions of containers for single-threaded and multithreaded usage, because multithreaded containers have an obvious performance penalty.
> For example, Java implements different versions of containers for single thread and multithread usage, because multithreaded containers have obvious performance penalty.
Very few codebases in Java are single-threaded; specialty frameworks like Netty are the exception, not the rule.
Likewise, there are not different containers for single threaded and multithreaded usage; there are containers that have different strategies for dealing with multi-threaded usage.
Hashtable is the oldest and is notable for it still being present and still being fundamentally flawed. It will lock reads and writes, but does not describe a way to lock for a transaction - e.g. changes based on read data. As such, it fundamentally has race conditions you can't protect against.
HashMap and the rest of the Java 1.2 collections API set a slightly better pattern: they don't internally try to maintain safety, but provide mechanisms like synchronizedMap() to let you hold the monitor for the length of your transaction.
However, this could only be so good, because the monitors in Java are pretty fundamentally broken as well. An object's monitor is both part of its public API (e.g. "synchronized(foo) {...}") and part of its implementation (e.g. "public synchronized void bar() { ... }"). This means that external code can affect your internal operation if you leverage the monitor that you get by default through your "this" instance.
As such, a synchronized collection involves three monitors:
1. The monitor on the interface-implementing collection type itself, e.g. on the HashMap. This likely is never used.
2. The monitor on the object returned by the 'synchronizedXXX' wrapping method. This is used to protect transactional access, such as iterating through while removing items.
3. The monitor used as a mutex inside the object returned by the 'synchronizedXXX' wrapping method, which protects the integrity of the collection data type if it is used by multithreaded code which does not hold monitor #2. That code may have a race condition, but it won't put the collection itself into an inconsistent structural state.
The 'synchronizedXXX'-returned wrapper objects are pretty expensive, and if you can, you should just internalize those collections into a business object that does any needed synchronization itself.
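The same "internalize the collection" idea carries over to other languages. Here is a minimal Python sketch, with `Inventory` and `reserve` as hypothetical names: the dict and its lock are both private, so outside code cannot grab the lock, and the only operations exposed are whole transactions.

```python
import threading

class Inventory:
    """Hypothetical business object that owns its collection and its lock."""

    def __init__(self, stock):
        self._stock = dict(stock)
        self._lock = threading.Lock()  # private: callers cannot synchronize on it

    def reserve(self, item, qty):
        """Check and decrement as one transaction; returns True on success."""
        with self._lock:
            available = self._stock.get(item, 0)
            if available < qty:
                return False
            self._stock[item] = available - qty
            return True

    def available(self, item):
        with self._lock:
            return self._stock.get(item, 0)

inventory = Inventory({"widget": 5})
outcomes = []

def buyer():
    outcomes.append(inventory.reserve("widget", 1))

buyers = [threading.Thread(target=buyer) for _ in range(10)]
for b in buyers:
    b.start()
for b in buyers:
    b.join()
print(outcomes.count(True), inventory.available("widget"))
```

Because the check and the decrement happen under one lock acquisition, no more than five reservations can succeed, regardless of how the threads interleave.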
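ConcurrentHashMap and the like avoid a single collection-wide lock (reads are generally lock-free, and writes use compare-and-swap or fine-grained locking), and are built with the idea that you can perform the changes needed through atomic operations rather than transactions. This isn't always true, but often is.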
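A rough Python analogue of "one atomic operation instead of a transaction" is `dict.setdefault`, which does the "insert if absent, then return what is stored" step in a single call. As far as I know this behaves atomically in CPython for built-in key types such as `str`; treat that as an assumption of this sketch rather than a language guarantee. The `registry` and `register` names are made up.

```python
import threading

registry = {}
seen = []

def register(name, value):
    # A single call covers both the check and the write,
    # so there is no gap for another thread to slip into.
    stored = registry.setdefault(name, value)
    seen.append(stored)

threads = [threading.Thread(target=register, args=("config", i)) for i in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Every thread observes the value written by whichever thread got there first.
print(set(seen))
```

Compare this with `if name not in registry: registry[name] = value`, which is two operations and needs an external lock to be safe.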
For a collection which is always held by a single thread, the atomic operation overhead may still cause a performance impact - after all, the atomic operations are still processor state synchronization points. It is also possible to beat ConcurrentHashMap with regular HashMap on certain usage in multithreaded environments, when you are properly protecting access to the HashMap yourself.
It might be challenging to find scenarios where ConcurrentHashMap doesn't beat the 'synchronizedMap()' wrapper, just because the wrapper implementation itself is really expensive.
Thank you! That makes sense, and it also explains why removing the GIL has a negative performance impact as discussed in other comments. Taking a lock every time a container is accessed is significant overhead, which is why languages like C++ don't make basic containers thread-safe.
> Effectively the GIL is incurring that overhead on every data structure whether you need it or not.
Not really. The GIL is taken and released quite infrequently (only when the Python interpreter decides it's time to do a context switch), whilst the new locks for each data structure are taken/released every time you do a basic operation on those data structures.
Holding a lock that is rarely taken/released incurs very little overhead.
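For reference, the interval at which CPython asks the running thread to give up the GIL is visible from Python through `sys.getswitchinterval()` (the documented default is 5 ms). A tiny sketch:

```python
import sys

# How long a thread may run before the interpreter requests a switch.
interval = sys.getswitchinterval()
print(f"switch interval: {interval} s")
```

A few milliseconds is a long time compared with a single list or dict operation, which is the point being made above about how rarely the GIL changes hands.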
not sure there's much they can do about this, other than protecting all the built-in data structure operations with mutexes, like java's original data structures (Hashtable, Vector, etc)
(but then how do you get a non-synchronized [] if you want one?)
My impression was that that was exactly what they were going to do: replace the GIL with fine-grained locking on the objects themselves. I can't imagine they'd let multiple Python threads manipulate Python data structures concurrently; the interpreter would segfault immediately.
> (but then how do you get a non-synchronized [] if you want one?)
You don't. This is one of the reasons why using the GIL is higher performance for single-threaded use-cases: stuff like lists and dicts can be non-synchronized.
> not sure there's much they can do about this, other than protecting all the built-in data structure operations with mutexes, like java's original data structures (Hashtable, Vector, etc)
However - this is fundamentally the incorrect approach, because Vector and Hashtable aren't protected from read-then-write race conditions.
Such internal locking guarantees that the collection stays structurally sound, but not that code accessing it is dealing with a single consistent state until it finishes.
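The same read-then-write problem is easy to sketch in Python. In the hypothetical example below, each individual dict read and write is safe on its own, but the increment is two steps, so the unsafe version can lose updates even with the GIL; holding a lock across the whole read-modify-write is what makes the total exact.

```python
import threading

counts = {"hits": 0}
lock = threading.Lock()

def unsafe_increment(n):
    for _ in range(n):
        # A read followed by a separate write: another thread can run
        # in between, so some increments may be lost.
        counts["hits"] = counts["hits"] + 1

def safe_increment(n):
    for _ in range(n):
        # The lock covers the whole transaction, not just each operation.
        with lock:
            counts["hits"] = counts["hits"] + 1

def run(worker, num_threads=4, per_thread=20_000):
    counts["hits"] = 0
    threads = [threading.Thread(target=worker, args=(per_thread,))
               for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counts["hits"]

safe_total = run(safe_increment)
print(safe_total)
```

Running `run(unsafe_increment)` may or may not show a lower total on any given run, which is exactly what makes this class of bug hard to reproduce.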
If there's a refcount of 1 you can mutate the value safely because no other thread could be trying to read/write to it. And the only thread that can give it to others would be the one that's doing that mutation, so it can't suddenly change.
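For anyone who wants to poke at reference counts, CPython exposes them through `sys.getrefcount`. The sketch below only shows that creating another reference raises the count; the absolute numbers vary between CPython versions and builds (the call itself may hold a temporary reference), so it is not a reliable exclusivity check from Python code. `refcount_demo` is a made-up helper.

```python
import sys

def refcount_demo():
    value = []                        # referenced only by the local name
    before = sys.getrefcount(value)   # exact number depends on the interpreter
    alias = value                     # a second reference to the same list
    after = sys.getrefcount(value)
    return before, after

before, after = refcount_demo()
print(before, after)
```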
I'd assume that importing any sort of module level variable would imply an increment of the counter, but unsure.
Yeah, for Python I feel like the difference between fenced vs unfenced doesn't matter. The primary cost is around your L3 cache getting slammed with contended atomics but your L3 is already absolutely fucked if you're using Python.
Nope. Cause users would just ignore or silence the warning, continue on, get subtly incorrect, difficult to reproduce behavior, submit tickets/issues, and just cause a lot of dead weight overall.
It's not remotely a trivial problem, and I assure you just about every naive solution has been considered and rejected.
[1] https://discuss.python.org/t/pep-703-making-the-global-inter...