Just search for "chess LLM leaderboard"; there are already several. Also check https://www.reddit.com/r/llmchess/ (although admittedly it doesn't get a lot of traffic).
I agree, and honestly it may as well be considered a form of ABI incompatibility. They should make this explicit, such that existing C extensions need to be updated to use some new API call for initialization to flag that they are GIL-less ready, so that older extensions cannot even be loaded when the GIL is disabled.
This has already been done. There is a 't' suffix in the ABI tag.
You have to explicitly compile the extension against a free-threaded interpreter in order to get that ABI tag and even be able to load the extension. The extension then has to opt in to free threading in its initialization.
If it does not opt in, a message appears saying the GIL has been enabled, and the interpreter continues to run with the GIL.
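If you want to see which situation you're in at runtime, something like this works on CPython 3.13+ (rough sketch; note that sys._is_gil_enabled() is a private underscore API):

    import sys
    import sysconfig

    # 1 if this interpreter is a free-threaded build (the "t" ABI tag, e.g. cp313t)
    print(sysconfig.get_config_var("Py_GIL_DISABLED"))

    # On free-threaded builds: False means the GIL is really off;
    # True means it was re-enabled, e.g. because a loaded extension didn't opt in.
    if hasattr(sys, "_is_gil_enabled"):
        print(sys._is_gil_enabled())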
This may seem a little strange but is helpful. It means the person running Python doesn't have to keep both a regular and a free-threaded Python around, with duplicate sets of extensions etc. They can just have the free-threaded one; anything loaded that requires the GIL gives you the normal Python behaviour.
What is a little more problematic is that some of the standard library is marked as supporting free threading even though the audit and update work is still outstanding.
Also, the last time I checked, the compiler thread sanitizers can't work with free-threaded Python.
the problem with that is it affects the entire application and makes the whole thing free-threading incompatible.
it's quite possible to make a python app that requires libraries A and B to be loadable into a free-threaded application, but which doesn't actually do any unsafe operations with them. we need to be able to let people load these libraries, but say: this thing may not be safe, add your own mutexes or whatever.
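there is at least a blunt escape hatch today: PYTHON_GIL=0 (or -X gil=0) forces the GIL off even when a loaded extension never opted in, and then the thread safety is on you. a rough sketch of the "add your own mutexes" part (library_a and do_work are made-up names):

    # run with: PYTHON_GIL=0 python app.py   (or: python -X gil=0 app.py)
    # this forces the GIL off even though library_a never declared support,
    # so the application serializes access to it itself
    import threading

    import library_a  # hypothetical extension that is not marked free-threading safe

    _a_lock = threading.Lock()

    def safe_do_work(*args, **kwargs):
        # all calls into the untrusted extension go through one mutex
        with _a_lock:
            return library_a.do_work(*args, **kwargs)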
I'm curious, what is the use case for open-ended labeling like this? I can think of clustering, i.e. finding similar tweets, but that can also just be done via vector similarity. Otherwise maybe the labels contain interesting semantics, but 6000 sounds like too many to analyze by hand. Maybe you are using LLMs to do further clustering and working on a graph or hierarchical "ontology" of tweets?
Hey, OP here. The use case is to give an agent the ability to post on my behalf. It can use these class labels to figure out "what are my common niches", come up with keyword search terms to find what's happening in those spaces, and then draft some responses that I can curate, edit, and post.
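For the "common niches" step, the first pass can be as simple as counting label frequencies; rough sketch (the file name and JSON shape are made up for illustration):

    from collections import Counter
    import json

    # hypothetical input: one JSON object per line, e.g.
    # {"labels": ["ml-infra", "generative-audio"]}
    with open("labeled_tweets.jsonl") as f:
        counts = Counter(
            label for line in f for label in json.loads(line)["labels"]
        )

    # the most frequent labels approximate "my common niches";
    # the agent can turn these into keyword search terms
    for label, n in counts.most_common(20):
        print(f"{label}: {n}")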
This is the kind of work you typically hire cheap social media managers overseas to do through Fiverr. However, the variance in quality is very high, and the burden of managing people on the other side of the world can be a lot for solo entrepreneurs.
I am not sure this is true. Complexity is a function of architecture. Scalability can be achieved by abstraction; it doesn't necessarily imply a highly coupled architecture. In fact, scalability benefits from decoupling as much as possible, which effectively reduces complexity.
If you have a simple job to do that fits in an AWS Lambda, why not deploy it that way? Scalability is essentially free. But the real advantage is that by writing it as a Lambda you are forced to think of it in stateless terms. On the other hand, if it suddenly needs to coordinate with 50 other Lambdas or services, then you have complexity -- and usually scalability will suffer in this case, as things become more and more synchronous and interdependent.
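To make the "stateless terms" point concrete, a toy handler might look like this (boto3 and the bucket name are assumptions for illustration, not anyone's actual setup):

    import json
    import boto3

    s3 = boto3.client("s3")

    def handler(event, context):
        # all state arrives in the event and leaves via external storage;
        # nothing is kept between invocations
        job_id = event["job_id"]
        result = {"job_id": job_id, "length": len(event.get("payload", ""))}
        s3.put_object(
            Bucket="my-results-bucket",  # hypothetical bucket
            Key=f"results/{job_id}.json",
            Body=json.dumps(result).encode(),
        )
        return {"statusCode": 200, "body": json.dumps({"job_id": job_id})}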
> The monolith is composed of separate modules (modules which all run together in the same process).
It's of course great to have a modular architecture, but whether or not they run in the same process should be an implementation detail. Barriers should be explicit. By writing everything to depend on local, synchronous, same-process logic, you are likely building in all sorts of implicit barriers that will become hidden dangers when you suddenly do need to scale. And by the way, that's one of the reasons to think about scaling in advance: when the need comes, it comes quickly.
It's not that you should scale early. But if you're designing a system architecture, I think it's better to think about scaling, not because you need it, but because doing so forces you to modularize, decouple, and make synchronization barriers explicit. If done correctly, this will lead to a better, more robust system even when it's small.
Just like premature optimization -- it's better not to get caught up doing it too early, but you still want to design your system so that you'll be able to do it later when needed, because that time will come, and the opportunity to start over is not going to come as easily as you might imagine.
> If you have a simple job to do that fits in an AWS Lambda, why not deploy it that way? Scalability is essentially free. But the real advantage is that by writing it as a Lambda you are forced to think of it in stateless terms.
What you are describing is already an example of premature optimization. The moment you are thinking of a job in terms of "fits in an AWS Lambda", you are automatically stuck with "use S3 to store the results" and "use a queue to manage the jobs" decisions.
You don't even know if that job is the bottleneck that needs to scale. For all you know, writing a simple monolithic script and deploying it onto a VM/server would be a much simpler deployment. Just use RAM or the filesystem as the cache. Write the results to the filesystem or a database. When the time comes to scale, you know exactly which parts of your monolith are the bottleneck that needs to be split. For all you know, you can simply replicate your monolith, shard the inputs, and the scaling is already done. Or just use the DB's replication functionality.
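For contrast, the boring version of the same job, sketched with made-up names (filesystem cache plus Postgres via psycopg 3; the table schema is hypothetical):

    import json
    import pathlib
    import psycopg  # assumes psycopg 3

    CACHE = pathlib.Path("/var/cache/myapp")
    CACHE.mkdir(parents=True, exist_ok=True)

    def process(job_id: str, payload: dict) -> dict:
        # filesystem as the cache
        cached = CACHE / f"{job_id}.json"
        if cached.exists():
            return json.loads(cached.read_text())
        result = {"job_id": job_id, "size": len(json.dumps(payload))}  # stand-in for real work
        cached.write_text(json.dumps(result))
        # results straight into the database
        with psycopg.connect("dbname=myapp") as conn:
            conn.execute(
                "INSERT INTO results (job_id, result) VALUES (%s, %s)",
                (job_id, json.dumps(result)),
            )
        return result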
To put things into perspective, even a cheap Raspberry Pi or entry-level cloud VM gives you thousands of Postgres queries per second. Most startups I worked at NEVER hit that number. Yet their deployment stories started off with "let's use Lambdas, S3, etc.". That's just added complexity. And a lot of bills, if it weren't for the "free cloud credits".
> The moment you are thinking of a job in terms of "fits in an AWS Lambda", you are automatically stuck with "use S3 to store the results" and "use a queue to manage the jobs" decisions.
I think the most important one you get is that inputs/outputs must always be < 6 MB in size. It makes sense as a limitation for Lambda's scalability, but you will definitely dread it the moment a 6.1 MB use case makes sense for your application.
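When that day comes, the usual workaround is to pass pointers instead of payloads, which of course drags S3 back in. Rough sketch (bucket and key names are hypothetical):

    import boto3

    s3 = boto3.client("s3")

    def handler(event, context):
        # the event carries an S3 key instead of the >6 MB payload itself
        obj = s3.get_object(Bucket="my-input-bucket", Key=event["input_key"])
        data = obj["Body"].read()
        result = data.upper()  # stand-in for the real work
        out_key = event["input_key"] + ".out"
        s3.put_object(Bucket="my-input-bucket", Key=out_key, Body=result)
        # the response stays small too: just another pointer
        return {"output_key": out_key}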
The counterargument to this point is also incredibly weak: it forces you to have clean interfaces to your functions, and to think about where the application state lives and how it's passed around inside your application.
That's equivalent to paying attention in software engineering 101. If you can't get those things right on one machine, you're going to be in a world of hurt dealing with something like Lambda.
I'd say the real advantage is that if you need to change it you don't have to redeploy your monolith. Of course, the relative benefit of that is situationally dependent, but I was recently burned by a team that built a new replication handler we needed into their monolith; every time it had a bug, the fix had to wait, because the monolith only got deployed once a week. I begged them to put it into a Lambda, but every week it was "we'll get it right next week", for months. So it does happen.
> It's of course great to have a modular architecture, but whether or not they run in the same process should be an implementation detail
It should be, but I think "microservices" somehow screwed that up. Many developers think "modular architecture == separate services communicating via HTTP/network that can be swapped", failing to realize you can do exactly what you're talking about. It doesn't really matter what the barrier is, as long as it's clear; more often than not, the network seems to be the default barrier when it doesn't have to be.
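A small sketch of the same idea in code: the caller depends on an explicit interface, and whether it's backed in-process or over HTTP is just wiring (all names and the endpoint are made up):

    import json
    import urllib.request
    from typing import Protocol

    class Billing(Protocol):
        def charge(self, user_id: str, cents: int) -> str: ...

    class InProcessBilling:
        # same process, no network involved
        def charge(self, user_id: str, cents: int) -> str:
            return f"local-receipt-{user_id}-{cents}"

    class HttpBilling:
        # same interface, but the barrier happens to be the network
        def __init__(self, base_url: str):
            self.base_url = base_url  # hypothetical remote billing service

        def charge(self, user_id: str, cents: int) -> str:
            req = urllib.request.Request(
                f"{self.base_url}/charge",
                data=json.dumps({"user_id": user_id, "cents": cents}).encode(),
                headers={"Content-Type": "application/json"},
            )
            with urllib.request.urlopen(req) as resp:
                return json.load(resp)["receipt_id"]

    def checkout(billing: Billing, user_id: str) -> str:
        # caller code is identical either way; only the wiring changes
        return billing.charge(user_id, 4999)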
The complexity that makes money is all the essential complexity of the problem domain. The "complexity in the architecture" can only add to that (and often does).
This is the part that is about math as a language for patterns as well as research for finding counter-examples. It’s not an engineering problem yet.
Once you have product-market fit, then it becomes an engineering problem.
Remote: Yes (Remote or Hybrid, EU Timezone)
Willing to relocate: No
Technologies: [Deep Learning] PyTorch, TensorFlow, MLFlow; [Languages] Python, C, C++; [Infrastructure] AWS, Docker, PostgreSQL, DynamoDB.
I'm a Senior Machine Learning Engineer with 10+ years of R&D. My core expertise is Deep Learning Model Development and Pipeline Engineering, taking specialized models from concept to reliable output.
My recent work spans Computer Vision (traffic scenario analysis, SLAM, skin deformation analysis) and Generative Audio (speech synthesis focused on naturalness, novel voice generation, and controllability/editability).
I understand the full ML lifecycle, from novel research to scalable, cloud-ready API deployments. Seeking hands-on Senior-level roles and Lead positions to drive innovative model development. As Head of ML R&D (3 years), my work included model development, AWS deployment, and rapid prototyping of LLM/GenAI applications for demos, all very hands-on, alongside a team of 10.
Background: PhD & Master's in Music Technology and Audio-Haptic Robotics (McGill).
CV and more details: https://sinclairs.gitlab.io/cv/sinclair_cv2025.pdf
Email: stephen.sinclair [..at ..] nonnegativ.com