The skills gap for Fortran looms large in HPC (nextplatform.com)
84 points by rbanffy on May 3, 2023 | 120 comments


10 years ago or so I really wanted to NOT use Fortran for my PhD.

In my effort to try to use Python instead in my field, I fell down countless rabbit holes along the way (like being for a year or so one of the core contributors to Cython, in an effort to make "a better Fortran").

In the end though -- I decided it was time to just Get It Done, and just did what my supervisor had patiently and subtly hinted for years. Enough rabbit holes, just get it done in Fortran, and hand in the thesis...

I think what I learned the most (having an interest in compilers and programming languages) is how ignorant the rest of the programming language community, and computer science, is about the kinds of things you actually care about when doing HPC: which language features matter and which idioms make sense.

Learning Fortran is super simple... that is perhaps part of the problem: even people who are barely programmers can write (bad) programs in it. So what this must mean is simply a lack of programmers, full stop, being attracted to the field.

And...C++ is truly the most wrong language for this space there could be.


>C++ is truly the most wrong language for this space there could be

What? Why???

I switched from Python to C++ because Cython, Numba, etc. just weren't cutting it for my CPU-intensive research needs (program synthesis), and I've never looked back.


It's bloated beyond recognition and full of subtle and not so subtle footguns that you simply don't have to worry about in Fortran.


Gitgud


C++ and Python. You picked the two worst programming languages to learn.


I've both worked in HPC and written Fortran. C++ is almost always preferable because Fortran is too limited as a language outside of numerical code. Most HPC stopped being limited by the performance of numerical code 20 years ago. Instead, HPC codes were and are commonly limited by memory bandwidth and locality, two things C++ is almost uniquely suited for.

The kind of C++ written for HPC tends to be much simpler than the kind of C++ used to write e.g. database engines. The complexity of C++ is not that onerous in context and Fortran isn't exactly an exemplar of ease of use.


I can't remember the exact quote right now, but what you say echoes one that goes something like: if you make a tool with such a low barrier to entry that unskilled people can use it, and inevitably write low-quality code with it, then that is the definition of a good tool.


I'm in a similar position right now to the one you were in: a PhD student in the HPC domain who really wants to/needs to use C++ for everything (I love Python but have now come to terms with its limitations). Your last sentence is scary for me to hear - can you please elaborate a bit?


Of all the commonly used languages, C++ is the most complex, baroque, and filled with subtle and not-so-subtle footguns. Of course with all that downside comes the upside that C and C++ are the lingua franca for low level programming where performance matters, not only HPC (yes, I hope Rust will be there one day, but I digress..). Fortran is pretty much limited to numerical computing; you're not going to find many Fortran libraries outside that domain.

So if you envision doing programming like that outside HPC after your PhD, C++ might be worth the investment.

But of course a lot comes down to what language other people in your group, your particular subfield, and your supervisors are using. Doing a PhD is hard enough without being the odd guy out who's using a language nobody else in your community is using.


Ever heard of Perl, PHP, JavaScript, sh or Fortran itself for that matter? Plenty of languages are far more "baroque" than C++.


Ignoring all the cliche comments about C++ complexity I'm gonna say something that is a controversial opinion of sorts: if you're a PhD student in CS and you're doing systems research and you can't write/read passable C/C++ then you shouldn't be allowed to graduate.


CS is NOT computational science or HPC (generally).


>you're doing systems research


Not sure what the parent meant but I always think of https://travisdowns.github.io/blog/2019/08/26/vector-inc.htm...


My god, though, why Python 10 years ago? There were better languages available then, like Ruby or Perl. Was NumPy just too big a thing even then?


The scientific Python stack (numpy, scipy, matplotlib + a plethora of more specialized things) was definitely a very big thing 10 years ago. Deep learning was just on the cusp of taking off in a major way, but Python was very big in traditional numerical computing before that.


Ruby and Perl in HPC? I mean this in the nicest way possible if you’re being sincere, but I can’t tell if you are trolling or not


perl had some history in bioinformatics but ruby and php never made any inroads into science or HPC to the level Python has. Python has almost entirely replaced perl in this context.

I chose Python over perl about 25 years ago and never would have considered ruby or php (both seemed squarely aimed at web developers not scientists). I believe Numeric Python was released around that time, and it was a revelation- especially to matlab users, who recognized the syntax and behavior.


Yeah I did some bioinformatics work years ago and there were still a few Perl things, but not many.


I worked in an HPC lab at a medium-sized university around 10 years ago. Python was definitely the most popular interpreted language among the students. I remember it was years(?) earlier that MIT had switched much of its curriculum to Python.

Some older folks used Perl in places where others would've just written bash; mostly scripting invocations of actual number-crunching programs. Ruby would've raised some eyebrows for sure!


Yeah, the switch was around '08, when MIT went to Python.


I think it is far from a foregone conclusion that Perl and Ruby are better than Python, or were ten years ago.


Python was definitely around and common 10 years ago. I did my Master’s 8 years ago and we had a python bootcamp for people not familiar with it and numpy was part of it.


You're not going to get traction using Ruby or Perl for a Ph.D. in astrophysics :)


The only ones judgier about using the "wrong" language than computer scientists are engineers and other scientists.


I reckon that's very true, but it's important in one's closed work environment. If one bypasses accepted practice then one must be very good, or have a very good reason, or both.


I love perl but it is awful at signal handling. The threading and shared-memory interfaces are not as good as those of other languages either.

Python has a low barrier for entry and can be useful immediately to newer programmers. It is also fairly well rounded.

That being said, I am not partial to python as OP was with fortran, and steer around it regularly.


how are ruby and perl better solutions to whatever has you so appalled in your first sentence? if anything, they're just _different_ solutions, based on personal preference and nothing more


> The good news for some HPC simulations and models, both inside of the National Nuclear Security Administration program at the DOE and in the HPC community at large, is that many large-scale physics codes have been rewritten or coded from scratch in C++

When C++ one day goes out of fashion, like Fortran already has, this will turn out to be bad news. Fortran is not a big language, and any programmer can learn Fortran in a matter of weeks. But if you'd need to teach people C++ just so that they'd be able to maintain a legacy code base, that will be considerably more difficult than with Fortran.


Probably in a thousand years nobody will be writing C++. In one year, there will still be a helluva lot of C++, including new projects.

Point being, the premise of your argument isn't granted. Is C++ going away soon? Or is it holding steady and even growing? How can someone objectively know either way?


It’s hard to predict much about 10 years out, let alone 1000.

But we keep expecting languages like COBOL and MUMPS to die, and they keep living on.

I think there’s a almost 100% chance that 1000 years from now, when all that’s left of human civilization is a bunch of bots trading hustle culture spam and culture war arguments on a zombie internet, that the whole thing will still run on some mission critical code written in C. Buggy, unsafe, but mission critical and hard to replace.

And of course there will be heated bot arguments about how it should all be scrapped and rewritten in Rust++.


If AI maintains a decent rate of advancement, it will eventually become possible to just ask an AI to... rewrite it in a different language.

You can already give GPT a decent-sized chunk of C code and ask it to rewrite it in, say, Rust, and it works with some rate of success.

For the last 50 years, this was impossible to do in a generic way. In the last 6 months, we managed to move the needle from impossible to possible. Give it another 12000 months, and what are the chances this would not move towards "trivial"?


Sure. But that's not what keeps systems written in COBOL or MUMPS. It's not what's kept the US air traffic control system running on 1970s-era hardware and software until very recently. Humans are perfectly capable of rewriting these systems. AI will be, as well.

The problem with changing a mission-critical system is risk, and cost. Maybe omniscient AIs in 3023 won't have this issue, but my ghost will be entirely unsurprised if there are still mission critical systems written in C lying around, because whatever benefits a rewrite would give are dwarfed by the risk, perceived or real, of what happens if the rewrite isn't 100% perfect or doesn't 100% match the current system.


> Is C++ going away soon? Or is it holding steady and even growing? How can someone objectively know either way?

No. It's not going away anytime soon, but you can look at high-performance languages that are gaining in popularity, and that may point to change in the future. TIOBE isn't perfect, but it's a good indicator of interest, and Rust and Go seem to continue to gain popularity.

Interestingly, and related to this article, Fortran has moved up from #31 to #20 on TIOBE's index, which really does speak to the importance of a math-optimized, high performance, compiled language. Another interesting change is MATLAB moving up from #20 to #14.


TIOBE index is rivaled by the annual Stack Overflow Developer Survey in some regards https://survey.stackoverflow.co/2022/#technology-most-popula... :

> Most popular technologies: This year, we're comparing the popular technologies across three different groups: All respondents, Professional Developers, and those that are learning to code.

Most popular technologies > Programming, scripting, and markup languages

The Top500 and Green500 lists (and the TechEmpower Web Framework benchmarks) are also great resources for estimating what people did this past year; e.g. what the "BigE" of our models is in terms of water and kWh of [directly or PPA-offset] sourced clean energy.


for contrast, tiobe also believes visual basic classic (a language which hasn't been supported since 2008) is in the top 20


Having seen banking stacks, I'm inclined to believe TIOBE.


Most programming is not the kind of programming you see talked about on HN. VB is still huge in legacy stacks.


right, but how much of that code is using a version of the language that hasn't been supported for 15 years? according to tiobe, the answer is more than the amount of bash, powershell, and perl put together (and the amount is dramatically increasing)


Rust and Go still have a lot to catch up in HPC though.


IMHO the biggest roadblock to using AI to rewrite code and blindly trusting the results is having good tests to validate those results. This is moot for simulations and models.

Also, neural ODEs are already gaining traction. It won't be hard to replace lots of obscure spaghetti code that deals with edge cases in models with maybe-inscrutable neural networks that deliver the same results. Better tools for dealing with the interpretability of NNs are already being developed, as higher-level blocks become standard abstractions.


Do you mean to say simulations and models have good tests? In my (very limited) experience with scientific computing that's absolutely not the case. There's a decent chance the software is impossible to test because it can only be run all together and is nondeterministic.


Modern C++ is just Rust minus the borrow checker. Future high performance programmers will have no trouble with it.


I would expect that most HPC programmers can learn Fortran or Cobol in a week or two. They are reasonably small languages, and Fortran would be sufficiently familiar to people who have both C/C++ and Matlab/Julia experience (typical for HPC). It would be more proper to say it is not a skills gap, but rather a "desire to work with old codebases" gap.

Which the article does admit:

> The skills issue with Fortran is apparently not just about learning Fortran, but more about being associated with Fortran and all of the legacy baggage that has given its vintage and its low marketability going forward.


I learned Fortran back in 2016 for a HPC course. It was indeed quick to learn, but the resources online at the time for understanding compiler and runtime errors were scant compared to more popular languages. I mostly referenced PDFs of old books and course notes of various vintage scattered across academic pages for learning the language, and had sparse SO answers for help with gfc errors.

This quote from the article rings true to me:

>First, the lack of a standard library, a common resource in modern programming languages, makes mundane general-purpose programming tasks difficult. Second, building and distributing Fortran software has been relatively difficult, especially for newcomers to the language. Third, Fortran does not have a community maintained compiler like Python, Rust or Julia has, that can be used for prototyping new features and is used by the community as a basis for writing tools related to the language. Finally, Fortran has not had a prominent dedicated website – an essential element for new users to discover Fortran, learn about it, and get help from other Fortran programmers.


The folks behind https://fortran-lang.org have been working on many of these issues.


I wonder what the driver was? Anyway, first release of Fortran Package Manager was in November 2020: https://github.com/fortran-lang/fpm/releases/tag/v0.1.0 - more recently than I expected.


It was, and AFAIU still is, a grassroots effort started by people who liked Fortran as a language but decided that the state of package management and the availability of "standard" libraries was behind other languages, and decided to do something about it other than whining in discussion forums. And they decided that the lack of an attractive website introducing the language to newcomers and pointing to the new tools like package managers and package repositories was a barrier to entry, hence the creation of the https://fortran-lang.org/en/ website.


> low marketability going forward.

I'm not so sure that will continue to be the case. Fortran has moved up substantially in TIOBE (to #20), so there's clearly growing interest.


True, but for FORTRAN maybe a month.

FORTRAN IV/77 has some statements that, IIRC, have been or will be removed from newer Fortran and that I think no other language has. Things like computed GOTOs, and a few others I have forgotten. Those could cause a bit of confusion for people looking at this old code on HPC systems.

Unlike COBOL, at least moving from FORTRAN you will not need to deal with rounding issues on floats.


"FORTRAN IV/77 has some statements that IIRC correctly have or will be removed from the newer fortran"

Right, importantly GOTO. The question I've puzzled over for a while is how many legacy (unmodified) FORTRAN IV/77 applications/libraries still exist and are in regular use, and whether porting this old code to later Fortran or other languages remains a significant problem.

My interest stems from the fact my first language was FORTRAN IV.


Fortran was old in the early 1980s, when I first taught it to myself for use on a research project. I'd taught myself BASIC, x86 assembler, and APL (for another physics course) by then. All the serious physicists were using Fortran though, for all their simulations.

My research project required use of the university mainframe for the big calculations, the 1000x1000 Markov matrices I was working on. I taught myself then how to write out-of-core code, as well as how to exploit the symmetries inherent in my models.

My Ph.D. code was all written in Fortran, with a little Perl thrown in for good luck. 30+ years later, all of it works without a problem on my laptop/desktop machines.

I learned C in 1996 and C++ in 1997. The C code I wrote still works, though the C++ code needs special compilation options to be able to work.

I've not used Fortran in maybe 15 years now. I'm not up on the modern bits, I stopped using it professionally as F90 was becoming a thing.

I taught graduate HPC programming courses at my alma mater in the CS department, and gave the choice of Fortran or C to the kids. They blanched at the thought of using a nearly (at the time) 60 year old language. They correctly reasoned that learning the language would not help their employability.

Today, I use Julia for heavy computation. Python as a glue language, and C++ when required. I read the report, and I disagreed with the conclusions for a number of reasons, but what it comes down to is that many research groups in Physics/Chem still use Fortran, and aren't about to do the port to C/C++ due to funding. You can't get funding for this conversion. So codes will go on, students will learn what they need, and profs will keep publishing papers.

Moreover, this isn't the first, second, etc. time that people have predicted the death of Fortran. I heard this in 1990s as I was finishing up my Ph.D in theoretical/computational physics.

In 20 years or so, after I have (hopefully) retired, I'd bet that Fortran is still in widespread use, with Python following Java down the path to legacy. And the newer HPC language(s) will be quite happy to talk to Fortran.

I could be wrong, and I'm ok with it. But I doubt it.


"Physics/Chem still use Fortran, and aren't about to do the port to C/C++ due to funding."

If funding's a problem across many fields, then wouldn't it perhaps make better sense to just accept Fortran as the most appropriate language for certain well-defined application types/fields and concentrate on improving the language, that is, fixing its actual or perceived limitations in current environments?

It would seem more efficient to centralize effort to improving just the language and its tools than to convert or rewrite a whole world of disparate applications that have been developed over many decades. After all, take English for instance: at any point in history, say 1600, its grammar and vocabulary were appropriate for the time. Moving on 400+ years till now, we didn't chuck English away and replace it with a seemingly better language such as Esperanto but progressively updated it to modern requirements.

I have to admit to some bias here in that Fortran was my first language, so it acted as the template for others. That said, I am not convinced that the enormous plethora of different languages that has flooded programming in recent decades has benefited computing and CS to the extent that perhaps it ought to have, as there has been a great deal of unnecessary duplication and overlap that's led to wasted human effort: the need to learn many different languages, lack of uniformity, etc.

The large number of languages and the lack of agreed consensus/standards (on the most appropriate language for a given class of applications, etc.) have also led to language ghettoization, where programmers swear by the one language they've become familiar with and continue to use it for jobs where another would be more appropriate. And one can't blame them for not wanting to learn a new language seemingly every other year.

It seems to me that rationalizing and simplifying the language problem ought to be a high priority for CS. Given its long and mature history, its entrenched position in certain fields and its proven suitability for math-intensive work, that process could begin with Fortran as it would likely be the least disruptive.


> If funding's a problem across many fields, then wouldn't it perhaps make better sense to just accept Fortran as the most appropriate language for certain well-defined application types/fields and concentrate on improving the language, that is, fixing its actual or perceived limitations in current environments?

Yes it would. It would make far more sense than writing whitepapers detailing the risks of staying with the language, for example.

> It would seem more efficient to centralize effort to improving just the language and its tools than to convert or rewrite a whole world of disparate applications that have been developed over many decades.

There are ISO committees dedicated to improving Fortran[1].

...

> It seems to me that rationalizing and simplifying the language problem ought to be a high priority for CS.

It is not. This would relegate CS to a different role at a university, more of a tool building and improvement (e.g. engineering) than a "science". Moreover, there is no real money to be made, or reputation to be created by improving a tool. Especially one that has been in use so long.

CS loves to follow/lead with the new shiny thing. This is how the profs get grants. Show their value to the community. Get their students hired and starting companies. Or taking a leave from university, and going to work as chief scientist of AI at large global companies. (cough cough)

Most of these researchers would prefer to show their value, and the value of their thoughts/work, by creating new and shiny things in languages, or new languages. Yes, this is cynical. I've seen it first hand. I've watched fads/trends wax and wane in CS for a while now, often with people unaware that much work was being repeated.

[1] https://j3-fortran.org/


This shouldn't really be a problem. Everyone always says that a good programmer can learn a new language - the concepts are universal, it's just syntax.

The real problem is that no company wants to train and they all expect experts on day one. Even if one company was willing to train, the employee would be SoL when searching for a new job since no other company wants to train.

I'm starting to think that replacing a bachelor requirement for most dev jobs with an associate degree followed by an apprenticeship would be better and cheaper. Perhaps an industry-wide union setting career development paths would be (marginally) better than what we have today, at least for people just starting out or switching jobs.


Maybe I've worked in big tech and finance too long but most good companies (excluding companies looking specifically for C++ expertise) assume you're able to pick up whatever languages you need.

Coding is the easy part of software engineering.

With that said, there's obviously a gap between folks that are mainly just programming vs. folks that are engineering systems. And both personas have been lumped in together. The fundamentals that a good CS, math, and stats education teach you are invaluable when you actually have to build things that scale or work on difficult problems.


"The fundamentals that a good CS, math, and stats education teach you are invaluable"

I agree. What I'm saying is a 2-year program would satisfy that. Basically, 2 of the 4 years of college are for learning non-domain-related junk (health class from HS, music history, English proficiency from HS, etc).

I mean, engineers and programmers get lumped together because there is no real separation today. Putting "programmers" on a project designed by "engineers" is a disaster. Documentation and communication always leave something to be desired. Not to mention everyone calls themselves an engineer. Also, I work in finance IT.


> I'm starting to think that replacing a bachelor requirement for most dev jobs with an associate degree followed by an apprenticeship would be better and cheaper.

I was looking into applying to a job at a national lab. Not anything crazy, no HPC or scientific programming, just back office enterprise crap, but they seemed so intent on wanting a degree. I'm not sure places like that in particular, which are primarily seen as "scientific" organizations, will be willing to ignore credentialism.


A good programmer can learn every language I suppose, but the concepts in each language don't completely overlap. Sometimes there's very little in common.

Some languages on a CV imply extensive knowledge of complex libraries that aren't technically part of the language but considered necessary. Think JavaScript.


Agreed


None of this is particularly surprising. I think there are a multitude of factors:

* Universities are teaching Python more than ever. When I started my undergrad in Physics everyone took C++; by the time I graduated it was all Python.

* Academia as a career track is not stable so people want to write code in something that they might have a hope of getting a job in afterwards. C/C++ are much more applicable in the wider world so anyone with one eye on the exit is likely to prefer one of those…

* Fortran itself is fine and even improving with fpm and the ecosystem changes happening, but the big Fortran scientific code bases are not much fun to work on.

* There’s no performance benefit to sticking with Fortran. You can make code fast in any language. The main argument that Fortran is faster than C/C++ is to do with aliasing which anyone writing performance critical code will know how to deal with anyway. Performance portability codes like Kokkos that let you run on OpenMP/MPI/CUDA but write code once are pretty much all C++ too.

* A few years ago, CUDA could only be used in Fortran with the PGI compiler which was commercial. This made it less attractive vs C/C++. This has improved now but it was a major factor for a few years. Even now I think they’re not at feature parity although it’s been a long time since I looked.


> * There’s no performance benefit to sticking with Fortran. You can make code fast in any language. The main argument that Fortran is faster than C/C++ is to do with aliasing which anyone writing performance critical code will know how to deal with anyway. Performance portability codes like Kokkos that let you run on OpenMP/MPI/CUDA but write code once are pretty much all C++ too.

Nonsense, any physics PhD student could write meaningful Fortran code over a weekend that's significantly faster than all but the most optimized C code. And even then there might still be a small gap.

It's a huge difference in skill-required effort and hours-spent effort, maybe as high as 100 to 1.


> any physics PhD student

Ignoring that Fortran should be easier for a compiler to optimize than C, this is more of a measure of how much performance "fast" code leaves on the table. You can get very good speedups by just paying attention to the problem, but short of reinventing the universe there's only so much you can do.

Ab initio, just being a physics PhD doesn't magically give you optimization skills or (say) knowledge of how caches work. I know these skills do often come together, but in my experience, for example, every single academic "programmer" I have ever met has been basically a layperson: fumbling around in the dark, no taste, no idea how the machine works, no idea how to write code that isn't wrong, etc.


The huge difference is that Fortran defaults to what restrict pointers are in C: non-overlapping memory regions.

This is something most programmers have not even heard about...

This means simple Fortran programs give parallelizing, optimizing compilers more room than C programs without restrict annotations.

https://en.m.wikipedia.org/wiki/Restrict
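
To make that concrete, here's a minimal C sketch (function names made up for illustration). The restrict version makes the same no-overlap promise that Fortran makes by default for procedure arguments:

    #include <stddef.h>

    /* Without restrict, the compiler must assume out, a, and b may
       overlap, so it is conservative about reordering and
       vectorizing the loop. */
    void add_may_alias(double *out, const double *a,
                       const double *b, size_t n) {
        for (size_t i = 0; i < n; i++)
            out[i] = a[i] + b[i];
    }

    /* With restrict, the caller promises the arrays don't overlap
       (the Fortran default), so the compiler can vectorize freely. */
    void add_no_alias(double *restrict out, const double *restrict a,
                      const double *restrict b, size_t n) {
        for (size_t i = 0; i < n; i++)
            out[i] = a[i] + b[i];
    }

Compare the two with e.g. gcc -O3 -S and you'll typically see the restrict version drop the runtime overlap checks.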


Utterly bizarre that I've been reading about the pointer aliasing/restrict problems with C since 1993 and the problem hasn't been solved in 20 years.


The reason is that it isn't really a problem for C programs and is in fact very desirable for some applications (OSes for example :) )


This comment isn't really an argument; it veers close to a complaint about people you've interacted with. To be honest, the vast majority of developers do not know how a cache works.


This was certainly the case while I was a grad student.


> anyone writing performance critical code will know how to deal with anyway

There is a point about making it as easy and idiomatic to write performant code. If a naïve implementation performs as well (or almost as well) as a more difficult-to-understand, explicitly optimized one, the former tends to win. Easily usable libraries turn that on its head (as we see with Python and its numerics tools on desktops - not sure how adoption in HPC is).


> There is a point about making it as easy and idiomatic to write performant code.

There is certainly truth in that, and certainly Fortran has an advantage here vs. C++ due to the aliasing rules and the general lack of subtle performance footguns in the usage of the language. But TBH, I think a lot of the reason why Fortran is perceived to be fast is that out of the box it doesn't come with anything like a data structures and algorithms library, and domain experts who don't have a strong CS background are either not aware of other data structures or can't be bothered to implement them, so they code like the array is the only data structure known to man. And CPUs love it.
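
As a toy C sketch of the difference (illustrative only; the names are made up): the arithmetic is identical, but the memory behavior is wildly different once the data outgrows cache.

    #include <stddef.h>

    /* Flat array: sequential loads; the hardware prefetcher can
       stream data in ahead of the loop. */
    double sum_array(const double *x, size_t n) {
        double s = 0.0;
        for (size_t i = 0; i < n; i++)
            s += x[i];
        return s;
    }

    /* Linked list: each load depends on the previous node's pointer,
       so every cache miss stalls the entire traversal. */
    struct node { double val; struct node *next; };

    double sum_list(const struct node *p) {
        double s = 0.0;
        for (; p != NULL; p = p->next)
            s += p->val;
        return s;
    }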


Fortran has a huge performance benefit if you are using an HPC architecture where the CPU manufacturer provides highly optimized, processor-specific math libraries written in Fortran. Basically, you are working in a purpose-built HPC development environment. This is the case for the HPC systems I use.


All the processor-specific libraries I've used could be called from either language family. This was certainly the case for Cray LibSci libraries (which e.g. supported Fortran BLAS and LAPACK interfaces and the equivalent C wrapper CBLAS) and MKL. Because they're the same underlying library, there is no performance difference.

What sort of architecture and libraries are you making use of where this is the case?


Honestly, I'm not an expert in this, but I do use BLAS/LAPACK with Fortran code on HPC systems for some of my research. If these libraries are written in Fortran, wouldn't it be faster to use them directly in Fortran? Wouldn't there be overhead translating C calls to Fortran code, e.g. due to 0 vs 1 indexing, and row-major vs column-major order for arrays?


It’s hard to say without specifics but most BLAS/LAPACK implementations today aren’t written in Fortran, they just provide a Fortran interface - it’s worth noting that BLAS/LAPACK original implementations were written in the 1970s in Fortran but they are not generally used today because other libraries with the same interface have been written that provide better performance. The open source implementations OpenBLAS and Atlas are written in C and commercial cuBLAS, rocBLAS and MKL are C++. This is also true for many other OSS scientific libraries I’ve used - Sundials, FFTW, etc.

In terms of column- vs row-major order: the underlying implementation will have its own default way of doing it (it depends on the implementation), but you can if necessary recast matrix operations in such a way as to avoid actually doing transposes in the interface layer with the other language. So there's not really a performance penalty, as no additional work is done. cuBLAS and MKL's BLAS implementations even let you pass in parameters about your input data's memory layout.
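
For example, with an OpenBLAS-style cblas.h (a sketch, not any particular vendor's docs), the layout is just the first argument:

    #include <cblas.h>

    int main(void) {
        /* 2x2 matrices stored row-major, as C programmers expect. */
        double A[4] = {1, 2, 3, 4};   /* [[1,2],[3,4]] */
        double B[4] = {5, 6, 7, 8};   /* [[5,6],[7,8]] */
        double C[4] = {0, 0, 0, 0};

        /* C = 1.0*A*B + 0.0*C. CblasRowMajor describes our memory
           layout, so no transpose copies are made at the interface. */
        cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
                    2, 2, 2,       /* M, N, K */
                    1.0, A, 2,     /* alpha, A, lda */
                    B, 2,          /* B, ldb */
                    0.0, C, 2);    /* beta, C, ldc */

        /* C now holds [[19,22],[43,50]]. */
        return 0;
    }

Build with something like cc demo.c -lopenblas.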


That makes sense. However, it does sort of remind me of the old saying "a good C programmer can write C code in any language." It seems like if you use very simplified, easy-to-optimize code that more or less looks like Fortran-style coding, it can compile from any language to basically the same thing the Fortran HPC code would compile to, with the same performance.


>The main argument that Fortran is faster than C/C++ is to do with aliasing which anyone writing performance critical code will know how to deal with anyway.

Literally incorrect, to this day.


RPG (1959) is almost as old as Fortran and is still in use!

https://en.m.wikipedia.org/wiki/IBM_RPG


So is Lisp (released 1960; development began in 1958): https://en.wikipedia.org/wiki/Lisp_(programming_language)

Interestingly, Lisp and Fortran were both initially implemented on the same vacuum-tube based computer, the IBM 704: https://en.wikipedia.org/wiki/IBM_704#Landmarks


In retrospect there is no particular reason why the "two-language" approach of Python to numerically intensive computations should not have been primarily in Fortran rather than C/C++.

A Python/Fortran combination would have made the development of high-performing HPC type applications more accessible to scientists, more fun, and would have given Fortran a more secure future.

The decisions or non-decisions, the culture, the approach to community, etc. of a language ecosystem's key stakeholders do have implications.


> A Python/Fortran combination would have made the development of high-performing HPC type applications more accessible to scientists, more fun, and would have given Fortran a more secure future.

SciPy makes heavy use of Fortran. Numpy uses Fortran (but it's optional). f2py compiles fortran into python modules with little fuss. Python's ctypes module can directly call fortran libraries.

But to say Python should have been implemented in Fortran rather than C to enable Fortran interop by default -- that doesn't make sense. Python was designed as a scripting language for C-based operating systems, so it needed to be able to use pre-built unix libraries written in C, and it also needed to be embeddable in C programs. At the time neither of these things was easy via Fortran (cpython predates the Fortran 90 standard).


Cython got started in 2007. F90 was very mature by then. In any case it's not a binary choice, C vs Fortran, and definitely not about a Fortran implementation of Python itself. It is about the ease of integrating performant bits written in something other than C/C++, not by HPC specialists but by domain scientists. The examples you mention show that this is possible, but it's not what people have been doing in the broader data science area. There might be other benefits to adopting C++ instead of F90 as the computing layer, but overall it might have been a missed opportunity.


Cython is not cpython. cpython started in 1990.

Edit: In case you're confused, cpython is the de-facto standard implementation of Python. Cython is something entirely different (it compiles Python to C). Here's a good summary of Python's history: https://www.geeksforgeeks.org/history-of-python/.


But I never suggested that cpython should have been implemented in Fortran. The discussion is about how Python could have tapped F90's HPC capabilities instead of relying heavily on C++ for this. Think Cython, pybind11, Boost.python etc.


Except Python doesn't use C++ at all -- using C++ from Python is not very easy. The reason is that Python's native FFI interface is in C, because the interpreter is written in C. The reason Jython, PyPy, IronPython, etc. never got traction is because of all the Python core libraries written in C.

From a Python implementor standpoint, you have to implement Python's interpreter in the language you want to do the tight loops in. So for example, Jython was a reimplementation of Python on the JVM to make it possible to easily use Java for the tight loops. IronPython was the same, but for .NET.

You say it's cultural, but it's all about ergonomics and backwards compatibility. Making it easy to write tight loops in Fortran means changing the FFI API to be Fortran-friendly, which it is not.
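
For anyone who hasn't seen that FFI surface, a minimal CPython extension module looks roughly like this (a sketch; the "fastloop" module and its function are invented):

    #include <Python.h>

    /* The tight loop lives in C; Python just calls it. */
    static PyObject *sum_squares(PyObject *self, PyObject *args) {
        long n;
        double s = 0.0;
        if (!PyArg_ParseTuple(args, "l", &n))
            return NULL;
        for (long i = 0; i < n; i++)
            s += (double)i * (double)i;
        return PyFloat_FromDouble(s);
    }

    static PyMethodDef methods[] = {
        {"sum_squares", sum_squares, METH_VARARGS,
         "Sum of i*i for 0 <= i < n."},
        {NULL, NULL, 0, NULL}
    };

    static struct PyModuleDef fastloop_module = {
        PyModuleDef_HEAD_INIT, "fastloop", NULL, -1, methods
    };

    PyMODINIT_FUNC PyInit_fastloop(void) {
        return PyModule_Create(&fastloop_module);
    }

Everything there (argument parsing, object creation, module registration) is C structs and C calls, which is why wrapping yet another C library is trivial and wrapping Fortran historically wasn't.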

Just try writing your MPI code with a combination of Fortran and Python. I did it circa 2005 and it was a technical nightmare.


Python has been around since the early 90s and its main implementation was written in C though. It's always had first-class support for writing C extensions, so it's not illogical that wrappers for functions written in other languages interface with it by targeting that C API.


> In retrospect there is no particular reason why the "two-language" approach of Python to numerically intensive computations should not have been primarily in Fortran rather than C/C++.

Python’s focus on C-extensibility predates its explosion in science and is a “particular reason” for this.

It's not an abstract, perfect-world, ivory tower reason, but most real reasons aren't.


It seems like a historical accident. Not every combination of things or new direction that (in principle) might make some sense gets to receive timely attention.


Previous thread about the same report, with 71 comments: https://news.ycombinator.com/item?id=35738763


Fun to think that the two least sexy programming languages, COBOL and FORTRAN, run on the sexiest computers ever built.

Code has an almost surreal longevity.


If you work on supercomputers they feel a lot less sexy. As the old quote goes "supercomputers are devices to turn compute bound problems into IO bound problems". So you spend a lot of time fighting IO. (And thinking about memory movements.)


I called it "sysadmin by committee" because the decisions were made for the cluster were usually made by someone else who didn't know what they were doing.


I miss the Seymour Cray designs as well.


I would argue Mainframes (modern ones) may actually be the sexiest but your point still stands!


I think that too. They are a lot more interesting than the brute force of a modern supercomputer.

But, then, imagine an HPC parallel sysplex of a thousand five-drawer z16's filled to the gills with Nvidia GPUs for the number crunching.


According to Tiobe, Fortran is the 20th most popular language. Clojure is in the "next 50 languages" section – that is, somewhere between 51 and 100 in popularity.

As a Clojure lover, this hurts.

https://www.tiobe.com/tiobe-index/


No way this rating reflects reality. Kotlin and TypeScript should be way higher, certainly higher than 35/38.


I learned Fortran IV as an introductory language. And I read here that it is so elementary, and that these other languages are more complex and faster and better for software engineering, and that software engineers are better now... so I ask: why haven't these new and improved SEs devised a translation program to compile Fortran into C++? Back in the 1980s we used software to reverse-build flow diagrams of Fortran programs. They gave us full-path test coverage, unique variables, reused variables, loop termination conditions, etc. Before that, on mainframes, we'd use JCL (job control language) to output assembler code from submitted Fortran, COBOL, or PL/1.

How can we not feed Fortran source in and receive Language XYZ as output? Perhaps the new wave of ChatGPT will write this and save society.


Not sure - I'm sure I've seen a COBOL to Java translator (paid).

I guess there's not enough commercial demand, and any hobbyist who's good enough at Fortran probably isn't interested in doing it.

edit: apparently f2py (Fortran to Python) is a thing.


f2py is a tool for calling Fortran code from Python, it doesn't translate Fortran into Python.


This could be good in the long run. I think one of the problems in HPC is that a lot of libraries and tools are very ossified into a particular design that doesn’t work as well for new use cases (like genetics) and that’s partly because a lot of people are building off very old software. When that software is migrated or abandoned, what replaces it will be fresh in people’s minds and able to be modified to generalize past the openmp+mpi paradigm.

It’s like in professional software, ignoring that we’re generally too willing to decide on a rewrite, one of the main benefits of a rewrite is your current team builds expertise on all the logic or behavior that predates them (and usually uncovers hidden knowledge like implicit behavior, subtle bugs, the code being ugly because it needed to be to solve a bug).


This is just a symptom of the fact that developers are treated like cogs, where you drop them in and expect them to work with "legacy code", whereas the guy who just moved on/was laid off had all the expertise on the logic and behavior that predates you. If people decided to actually build with an eye towards longevity (in this case, I mean keeping people around so they can train the newbies), then this wouldn't be a problem.


RPG is almost as old as Fortran and is still in use!


Fortran (1957) is older than RPG (1959)


that is what the parent said.


>The nerds all learned to program Fortran, which was two years younger than COBOL

A typo? I thought FORTRAN was a bit older than COBOL :)

Anyway, a very nice article.


Good catch! Fortran had, somewhat famously, the first optimizing compiler and the first commercially available compiler. My rule of thumb is that if I've heard of it, it probably came after Fortran.


I have to work with Fortran HPC code a lot. The code and conventions are often extremely difficult to read.

What I've started doing recently is just pasting into ChatGPT and having it explain the code to me. It actually works great.


Could LLMs help here to translate between programming languages? Same with COBOL. Is a coding assistant the way to get legacy systems updated -- sounds like a consulting niche.


For Fortran, it is really a niche. This is part of what I do for a living. For the first time, I offered my services in the traditional "who is hiring/looking for work/freelancer" threads to see if people would react to it, because of a previous discussion about this report[0].

For the moment, nothing; I am not really surprised. It is very much a "you know someone who..." kind of niche. To port or upgrade scientific code from Fortran to a new language, you need to have some minimal, sometimes extensive, domain knowledge, plus Fortran knowledge and target-language knowledge.

It is hard to find the right people, and with not that many people around, you do not want to start a big migration and lose the head developer along the way, etc. This is why, if you read the report and the forums, what people want is to "reboot" the community/ecosystem and put new life into it. But that is also hard if the ecosystem is starting to fossilize.

I love and really enjoy working in Fortran but, paradoxically, because of the situation described above, it is not that easy to find Fortran work.

[0]: https://news.ycombinator.com/item?id=35738763


I doubt it. Non-idiomatic translation is already covered by off-the-shelf compiler-based technologies. Idiomatic translation would require an LLM that could ingest large chunks of the code base; it is not sufficient to translate one function at a time (which is, again, a solved problem), and LLMs still have scaling problems as you go up. And this is code, not human language. It doesn't take very many small mistakes, where the code did one obscure thing but the LLM thought it was a more popular thing, before the whole translation is worthless. In my playing with ChatGPT, if you deliberately bait it with code that looks like some really popular task but is actually doing another, it is very strongly attracted to the popular thing.

An AI that uses LLMs but is more than just an LLM, maybe.


There have been automatic translators from Fortran to C for many years, but they don't produce idiomatic C code.


Could but who’s going to verify that they are not just feeding them garbage?


This is the right answer. Almost always in manual translation processes, there are questions about intent: either ambiguities in the original source (is this a bug or do we keep it?) or clarifications in project planning (this code is unused... should we leave it out?).

How does an AI know to surface those concerns?


We'll sure find out, because there's no way to keep people from it.


Is the use of "codes" in the article, versus "code" a sign that the author/editor is not "code" literate?

Asking for a friend.


It's an interesting idiom you see in HPC and scientific computing that can be jarring when first encountered. "Codes" can be read as "programs" or "libraries".


No, that is more national lab slang.


hey, can you make some viewgraphs about the new dynamics codes?


yeah sure. I just need a cost center and program code to charge my time to...


It's just a bit antiquated. Rest assured people were writing codes for decades before any consumers had computers



