Whenever you use [DllImport], the analyzer will nudge you to auto-fix it to [LibraryImport] which source-generates a marshalling stub (if any is needed) that then calls an inner [DllImport] that does not require runtime marshalling. This is very cheap since function address gets cached into a readonly static which then gets baked into the machine code once the JIT produces Tier-1 compilation for your method.
On NativeAOT, you can instead use "DirectPInvoke" which links against specified binary and relies on system loader just like C/C++ code would. Then, you can also statically link and embed the dependency into your binary (if .lib/.a is available) instead which will turn pinvokes into direct calls (marshalling if applicable and GC frame transition remain, on that read below).
Lastly, it is beneficial to annotate short-lived PInvoke calls with [SuppressGCTransition] which avoids some deoptimizations and GC frame transition calls around interop and makes the calls as cheap as direct calls in C + GC poll (a single usually not-taken branch). With this the cost of interop effectively evaporates which is one of the features that makes .NET as a relatively high-level runtime so good at systems programming.
Unmanaged function pointers have similar overhead, and identical if you apply [SuppressGCTransition] to them in the same way.
* LibraryImport is not needed if pinvoke signature only has primitives, structs that satisfy 'unmanaged' constraint or raw pointers since no marshalling is required for these.
On NativeAOT, you can instead use "DirectPInvoke" which links against specified binary and relies on system loader just like C/C++ code would. Then, you can also statically link and embed the dependency into your binary (if .lib/.a is available) instead which will turn pinvokes into direct calls (marshalling if applicable and GC frame transition remain, on that read below).
Lastly, it is beneficial to annotate short-lived PInvoke calls with [SuppressGCTransition] which avoids some deoptimizations and GC frame transition calls around interop and makes the calls as cheap as direct calls in C + GC poll (a single usually not-taken branch). With this the cost of interop effectively evaporates which is one of the features that makes .NET as a relatively high-level runtime so good at systems programming.
Unmanaged function pointers have similar overhead, and identical if you apply [SuppressGCTransition] to them in the same way.
* LibraryImport is not needed if pinvoke signature only has primitives, structs that satisfy 'unmanaged' constraint or raw pointers since no marshalling is required for these.