For some context: WRF (https://github.com/wrf-model/WRF), the current state-of-the-art weather modeling system (also developed by NCAR), can (to my limited knowledge) only run on CPUs. There have been efforts to run it on GPUs (https://wrfg.net), though that project doesn't look like it's been kept up to date.
It seems like FastEddy mostly replaces WRF-LES, which is used for high-resolution localized modeling.
There have been at least two CUDA implementations of WRF-ARW AFAIK, and it seems they are not widely recommended because of numerical differences from the reference implementation. However, having run ARW a lot, I would definitely not do it again on CPU.
I’ve been out of the field for a decade now, but I’m not surprised WRF is still the dominant model. So how different is the CUDA version from the reference implementation? Model physics are not quite real physics anyway, and your initial conditions have some garbage and low sig figs to begin with. Is the error propagation measurable, i.e. does it make long-term forecasts significantly worse?
I'm not a domain expert but I would expect weather patterns to be chaotic and thus even small perturbations (errors) can lead to significant divergence.
I'm a level above an amateur in this, having studied data assimilation in college (I was a math major). Numerical errors are a given, and errors in general too, especially in the measurements used to fit the hidden parameters (it is very similar to supervised ML). Data assimilation is a collection of techniques used to tackle that issue: 3DVar, 4DVar, the Kalman filter, the extended Kalman filter, the ensemble Kalman filter, particle filters, and many others are used to find the most likely (minimum-energy, etc.) hidden state (a mean and covariance) from a given set of measurements and their associated covariances; that resulting hidden state is then used to run the model "forward" (generally in time). AIUI, DA was specifically developed for weather modeling, and weather modeling is definitely a heavy employer of it in any case. This is all on top of the methods used to solve the PDEs themselves, i.e. so that they are forward/backward stable, etc.
The errors in the weather forecast are not the result of chaos: they are the result of errors in the measurements (recorded in the observation covariance), the sparsity of the measurements themselves relative to the size of the Earth, and limitations in model resolution (consider a FEM grid over the entire surface of the Earth). The effect of chaos just compounds these errors near bifurcations around fixed points.
Perturbations are not used in the way you think; think of Taylor series approximations around specific points of interest.
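If it helps to make that concrete, here's a minimal sketch of the simplest of those techniques, the (linear) Kalman filter analysis step. The toy numbers and variable names are mine, not from any particular weather code:

```python
import numpy as np

def kalman_update(x, P, z, H, R):
    """One Kalman analysis step: combine a forecast state (mean x,
    covariance P) with a measurement (z, error covariance R)."""
    y = z - H @ x                          # innovation: obs minus predicted obs
    S = H @ P @ H.T + R                    # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)         # Kalman gain: how much to trust the obs
    x_new = x + K @ y                      # analysis mean
    P_new = (np.eye(len(x)) - K @ H) @ P   # analysis covariance (uncertainty shrinks)
    return x_new, P_new

# Toy problem: a 2-component hidden state, one noisy measurement of component 0.
x = np.array([1.0, 0.0])        # forecast mean
P = np.eye(2) * 0.5             # forecast covariance
H = np.array([[1.0, 0.0]])      # observation operator (we only observe x[0])
R = np.array([[0.1]])           # observation error covariance
z = np.array([1.3])             # the (sparse, flaky) measurement
x, P = kalman_update(x, P, z, H, R)  # then run the model "forward" from (x, P)
```

Roughly speaking, 3DVar/4DVar and the ensemble variants are different ways of getting at the same analysis when the state is far too big to carry P around explicitly.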
Anyway, I work on compilers/auto-vectorization now (lol), so I'll defer to The Expert, if such a person wants to chime in.
I was an atmospheric scientist, but I never got too deep into the modeling (if I had, I'd have actually received the PhD lol). But yes, data assimilation is very important in weather models. You have some idea of what's going on out in the oceans, where instruments are sparse, because air measured over land is advected (blown) over the ocean. You still need to assimilate to correct the model toward real-world physics.
The data itself is noisy. Bad readings aren't uncommon. Common hygrometers (humidity sensors) have hysteresis, wind is turbulent, and radiosondes stop transmitting in midair. Some data is really weird, like GPS occultation data, which gives temperature mixed with humidity along a 200 km-long cylinder. Suffice it to say that while higher-order approximations in modeling have helped, DA is super important because measurements are both sparse and flaky.
But that’s why I asked: numerical errors are typically dwarfed by measurement errors. So it shouldn’t be worse than a member of an ensemble model, right?
I've mostly been out of it for 7 years or so as well, and have only been watching from afar as these things have changed, waiting for something like the model in the original article to come into existence.
> The Navier–Stokes equations are nonlinear partial differential equations in the general case and so remain in almost every real situation. In some cases, such as one-dimensional flow and Stokes flow (or creeping flow), the equations can be simplified to linear equations. The nonlinearity makes most problems difficult or impossible to solve and is the main contributor to the turbulence that the equations model.
“CPUs excel at performing multiple tasks, including control, logic, and device-management operations, but their ability to perform fast arithmetic calculations is limited. GPUs are the opposite. Originally designed to render 3D video games, GPUs are capable of fewer tasks than CPUs, but they are specially designed to perform mathematical calculations very rapidly.”
I thought the advantage of the GPU was not speed but parallelism. Or are modern power-saving processors also slow compared to GPUs on non-parallel tasks?
Also, is it common to use registered trademarks (FastEddy) in a paper these days? I know a lot of universities try to commercialize research; is that the reason for the trademark?
It's useful to think of GPUs as CPUs with low core frequency but incredibly wide SIMD instructions. They are very good at doing the exact same instructions in parallel (apply this formula to 4 million pixels), but they are fairly bad at anything that involves branches or loops (those don't fit the SIMD model) or random memory access (there's only so much memory bandwidth to go around between all the cores).
That makes GPUs generally good at arithmetic tasks, but horrible at control or logic tasks since those usually involve lots of branching.
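As a rough illustration in NumPy (standing in for the GPU's wide SIMD; the formulas are made up):

```python
import numpy as np

# "Apply this formula to 4 million pixels" is the GPU-shaped problem:
pixels = np.random.rand(4_000_000).astype(np.float32)

# Branch-free and data-parallel: maps directly onto wide SIMD lanes.
bright = np.sqrt(pixels)

# A per-pixel branch doesn't fit the SIMD model, so it effectively becomes
# predication: both sides get computed for every element and a mask picks
# the result, which is roughly what happens when GPU warp lanes diverge.
result = np.where(pixels > 0.5, np.sqrt(pixels), pixels * 2.0)
```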
> They are very good at doing the exact same instructions in parallel (apply this formula to 4 million pixels)
In numerical fluid mechanics, you instead want to apply the same formula to the nearest 27 (3x3x3) or 125 (5x5x5) cells in a 3-dimensional array, and then store the result to another 3-dimensional array.
Or maybe for calculating a value in a 3d array A, you need to apply a formula that looks at the nearest 27 values in both array A itself and also in another 3d array B. Maybe also C.
This is true for pixels too; in a median or blur operation, or a 2d fluid sim, you might gather and combine the nearest NxN pixels. 2d and 3d are nearly identical; the main difference is just indexing. The parent comment was talking about the single formula that produces the output value written to a cell, which, as you say, might well involve reading inputs from many nearby cells. Fluid mechanics can also be thought of as a single formula per output cell. (And normally it's most natural to organize GPU threads so that they are 1:1 with output cells.)
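A minimal sketch of that access pattern in NumPy (periodic boundaries via np.roll; the 27-point box average is just a stand-in for whatever the real formula is):

```python
import numpy as np

def box27(field):
    """Mean over each cell's 3x3x3 neighborhood: one output value per cell,
    exactly the loop a GPU would run with one thread per output cell."""
    out = np.zeros_like(field)
    for dz in (-1, 0, 1):
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                out += np.roll(field, shift=(dz, dy, dx), axis=(0, 1, 2))
    return out / 27.0

A = np.random.rand(64, 64, 64)
B = np.random.rand(64, 64, 64)
# One formula per output cell, reading 27 neighbors from both A and B:
C = 0.5 * (box27(A) + box27(B))
```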
The architecture of GPUs is less exotic than people would have you believe. A given graphics card will have a couple dozen relatively slow cores (arranged in a NUMA hierarchy), each with ~10 logical threads and 32- or 64-wide SIMD. The 10 logical threads per core enable many concurrent memory operations to be in flight at the same time, while the wide SIMD enables massive parallelism.
There's some cleverness in the programming model however: the code the programmer writes is executed on a single SIMD lane so 32 or 64 copies of it can be run in lockstep. In total, to keep every lane of every logical thread of every core busy requires thousands of concurrent threads.
(There is also some special purpose hardware for graphics related tasks, but that is less relevant to GPGPU workloads)
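Back-of-the-envelope with the numbers above (all of them illustrative, not any specific card):

```python
cores = 24             # "a couple dozen" cores
threads_per_core = 10  # logical threads per core, to keep memory ops in flight
simd_width = 32        # lanes per thread (32 on NVIDIA, often 64 on AMD)

# Total lanes to fill before the card is actually busy:
print(cores * threads_per_core * simd_width)  # 7680, hence "thousands of threads"
```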
> I thought the advantage of the GPU was not speed but parallelism. Or are modern power-saving processors also slow compared to GPUs on non-parallel tasks?
Here, fast and parallel are two ways of saying the same thing: the wide parallelism of the GPU's many math cores is what makes it faster than the CPU's relatively small capacity to do math in parallel. (The speedup factor is so large that power-saving features don't make a meaningful difference.) The tradeoff is that you need a highly parallel workload to be efficient and that much faster on the GPU.
I think they're actually asking: if you have a single "thread" of a GPU, is that one thread quicker at math than a single thread of a CPU? The fairest comparison IMO might be a single "warp" (in NVIDIA terms) against a single CPU thread, since a warp is roughly similar to what a CPU does with AVX.
It’s much slower. GPU single-thread performance is very low. Unless you are doing some specialized operation that has a dedicated instruction on the GPU, it will be slower.
GPU clock speeds are lower in general, and the CPU has a much wider superscalar pipeline: 4-8 wide vs. single or double issue for most GPUs. The CPU also has many latency-hiding techniques for a single thread; the GPU has essentially none, and instead uses multiple threads to hide latency.
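Rough, illustrative arithmetic (made-up clock speeds, not benchmarks) for one warp vs. one CPU core doing AVX FP32:

```python
# GPU: low clock, 32 lanes per warp, roughly one vector op issued per cycle.
gpu_gflops_per_warp = 1.5 * 32 * 1   # ~48 single-precision Gflop/s

# CPU: high clock, 8 FP32 lanes per AVX register, 2 vector ports (superscalar).
cpu_gflops_per_core = 4.0 * 8 * 2    # ~64 single-precision Gflop/s

# Comparable per warp/core; the GPU wins by running thousands of warps at once,
# while a single GPU *lane* (low clock, no latency hiding) is far slower than a CPU thread.
print(gpu_gflops_per_warp, cpu_gflops_per_core)
```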
Well, this is the coolest thing I've seen in weather tech in a while. I wonder if they will pair this with additional city-level data collection: anemometers/mini weather stations on the sides of buildings, etc. I'm the barometer-data-from-phones guy running All Clear on the US Play Store (https://play.google.com/store/apps/details?id=com.allclearwe...), with an interest in hyperlocal weather forecasting.
I didn't see much about forecast accuracy in this article, but, still, extremely cool. I do wonder how much accuracy is possible - you cannot necessarily know how trucks and other human-caused short-term atmosphere changes will affect the weather.
But why? What do I do with a forecast about wind speeds in urban areas? Let's be generous and say the forecast will be able to tell me, one hour in advance, the wind speed and direction on every street in my neighbourhood. Now I order a drone-delivered pizza. What does the drone do with this knowledge? It still needs to get the pizza from A to B, ASAP.
Drones/sUAVs tend to be pretty sensitive to wind and the weather in general. Being able to know, as you put it, the wind speed and direction on every street means potentially being able to avoid significant slowdowns, and/or avoid running out of battery from having to use more power to compensate for the wind.
Of course, that's only if you can predict the weather with good enough temporal and spatial resolution. But to me it seems potentially useful.
IFR manned flight requires enough fuel to fly to your alternate destination airport + 45 minutes.
Considering that flying a drone with something valuable on it is already a theft/vandalism/collision nuisance that will probably never be used outside a few test markets, I find it highly unlikely that "running out of power" will be a concern. Assuming hypothetically that they were allowed to operate on a larger scale than they are today (which, again, is a dubious assumption to begin with), they will have similar reserve requirements, which are easy to calculate based on observed wind conditions at a nearby airport.
This seems a bit much just to enable good pizza delivery. The funders seem more interested in urban airflow during bioweapon attacks, or other slightly more critical problems.
This might enable simulation, but forecasting is a chaotic problem, isn't it? It's essentially like trying to predict a stream of random numbers: even if they follow a certain attractor, it's still impossible to know exactly what form they'll take.
Partial differential equations over multiple dimensions are innately chaotic, with exponential error growth (the longer any simulation runs, the exponentially bigger the errors get).
But getting more and more accurate predictions is the goal of any weather modeler. If the exponential error bounds currently give you 1 day's worth of predictions before the models go to crap... maybe an improved algorithm (or 10x more compute power) can get you 2 days' worth of predictions instead.
Even if you don't get a major change (maybe going from 1 day's worth of predictions to 1.1 days), you might be able to convert that into needing 1/10th the compute power (e.g. lower the accuracy back down to a 1-day prediction but cut back dramatically on the compute needed to perform the simulation).
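A toy demonstration of that exponential error growth, using the classic Lorenz-63 system (textbook parameters; forward Euler is crude but fine for illustration):

```python
import numpy as np

def lorenz_step(s, dt=0.01, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """One forward-Euler step of Lorenz-63, the standard toy model of atmospheric chaos."""
    x, y, z = s
    return s + dt * np.array([sigma * (y - x), x * (rho - z) - y, x * y - beta * z])

a = np.array([1.0, 1.0, 1.0])
b = a + np.array([1e-8, 0.0, 0.0])  # a tiny initial "measurement error"

for step in range(2501):
    if step % 500 == 0:
        print(step, np.linalg.norm(a - b))  # separation grows exponentially, then saturates
    a, b = lorenz_step(a), lorenz_step(b)
```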
Forecasting does get done, and it's done by creating ensembles of weather simulations.
Weather is chaotic, which is why forecast accuracy drops off rapidly the further out in time you make predictions. But still, forecasts can be accurate enough, far enough into the future, to be extremely useful.
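Ensemble forecasting in miniature, with the same kind of toy chaotic model (all numbers made up): perturb the initial conditions, run many members, and read the spread as your confidence at each lead time.

```python
import numpy as np

def lorenz_step(s, dt=0.01, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    x, y, z = s
    return s + dt * np.array([sigma * (y - x), x * (rho - z) - y, x * y - beta * z])

rng = np.random.default_rng(0)
# 50 ensemble members, each starting from a slightly perturbed analysis.
members = [np.array([1.0, 1.0, 1.0]) + 1e-4 * rng.standard_normal(3) for _ in range(50)]

for step in range(1, 2001):
    members = [lorenz_step(m) for m in members]
    if step % 500 == 0:
        xs = np.array([m[0] for m in members])
        print(step, xs.mean(), xs.std())  # mean = best guess, std = how much to trust it
```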
It does seem to me like a lot of that accuracy comes from real-time metric collection from all of the stations around the world (and there are many more stations now, too).
For example, in London the iPhone gives you an almost real-time "current weather" state, stuff like "it's raining now and it's going to stop raining in the next 5 minutes." You can only do that by collecting radar data in real time and correlating it with GPS. That's not so much prediction.
Side note: I think it'd be cool/crazy to see some kind of city-level planning system that models cars moving on the ground in real time, things (drones) flying in the air, and the 3D buildings/space they take up... that would be a cool system to set up. I thought Airbus was doing something like that for their Vahana, since it isn't piloted.
That looks cool; it appears to be OSM-based and able to run in the browser. I wonder if the paths for the roads are drawn automatically. Anyway, the 3D part is just an extruded layer/box, I suppose.