Writing performance-critical C# code in C++

I am currently working on performance-critical code, and I have a special situation where I would like to write the entire application in C#, but for performance reasons C++ ends up being FAR faster.

I ran several tests on two different implementations of some code (one in C#, the other in C++), and the timings showed that the C++ version was 8 times faster, with both versions built in release mode and with all optimizations turned on. (In fact, the C# version had the advantage of being compiled as 64-bit; I forgot to enable that for the C++ timing.)

So I figure I can write most of the code base in C# (which is very easy to write), and then write native versions of the parts where performance is critical. The particular piece of code that I tested in C# and C++ was one of the critical areas, where more than 95% of the processing time is spent.

What is the recommended wisdom for writing native code here? I have never written a C# application that calls native C++, so I have no idea how to go about it. I want to do this in a way that minimizes the cost of making the native calls as much as possible.

Thanks!

Edit: Below is most of the code I am actually trying to work on. It is for an n-body simulation. 95-99% of the CPU time will be spent in Body.Pairwise().

    class Body
    {
        public double Mass;
        public Vector Position;
        public Vector Velocity;
        public Vector Acceleration;

        // snip

        public void Pairwise(Body b)
        {
            Vector dr = b.Position - this.Position;
            double r2 = dr.LengthSq();
            double r3i = 1 / (r2 * Math.Sqrt(r2));
            Vector da = r3i * dr;
            this.Acceleration += (b.Mass * da);
            b.Acceleration -= (this.Mass * da);
        }

        public void Predict(double dt)
        {
            Velocity += (0.5 * dt) * Acceleration;
            Position += dt * Velocity;
        }

        public void Correct(double dt)
        {
            Velocity += (0.5 * dt) * Acceleration;
            Acceleration.Clear();
        }
    }

I also have a class that simply drives the simulation with the following methods:

    public static void Pairwise(Body[] b, int n)
    {
        for (int i = 0; i < n; i++)
            for (int j = i + 1; j < n; j++)
                b[i].Pairwise(b[j]);
    }

    public static void Predict(Body[] b, int n, double dt)
    {
        for (int i = 0; i < n; i++)
            b[i].Predict(dt);
    }

    public static void Correct(Body[] b, int n, double dt)
    {
        for (int i = 0; i < n; i++)
            b[i].Correct(dt);
    }

The main loop looks like this:

    for (int s = 0; s < steps; s++)
    {
        Predict(bodies, n, dt);
        Pairwise(bodies, n);
        Correct(bodies, n, dt);
    }

The above is just a minimal excerpt of the larger application I am working on. There is more going on, but the most performance-critical work happens in these three functions. I know the pairwise function is slow (it is O(n^2)), and I have other methods that are faster (Barnes-Hut for one, which is O(n log n)), but that is beyond the scope of this question.

The C++ code is almost identical:

    struct Body
    {
    public:
        double Mass;
        Vector Position;
        Vector Velocity;
        Vector Acceleration;

        void Pairwise(Body &b)
        {
            Vector dr = b.Position - this->Position;
            double r2 = dr.LengthSq();
            double r3i = 1 / (r2 * sqrt(r2));
            Vector da = r3i * dr;
            this->Acceleration += (b.Mass * da);
            b.Acceleration -= (this->Mass * da);
        }

        void Predict(double dt)
        {
            Velocity += (0.5 * dt) * Acceleration;
            Position += dt * Velocity;
        }

        void Correct(double dt)
        {
            Velocity += (0.5 * dt) * Acceleration;
            Acceleration.Clear();
        }
    };

    void Pairwise(Body *b, int n)
    {
        for (int i = 0; i < n; i++)
            for (int j = i + 1; j < n; j++)
                b[i].Pairwise(b[j]);
    }

    void Predict(Body *b, int n, double dt)
    {
        for (int i = 0; i < n; i++)
            b[i].Predict(dt);
    }

    void Correct(Body *b, int n, double dt)
    {
        for (int i = 0; i < n; i++)
            b[i].Correct(dt);
    }

The main loop:

    for (int s = 0; s < steps; s++)
    {
        Predict(bodies, n, dt);
        Pairwise(bodies, n);
        Correct(bodies, n, dt);
    }

There is also a vector class that works just like a regular math vector, which I don't include for brevity.
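For readers who want to compile the snippets above, here is a rough guess at what such a vector type might look like in C++. The original class was not posted, so everything here is an assumption: three components, the field names X/Y/Z, and exactly the operations the Body code uses (subtraction, scalar multiplication, +=, -=, LengthSq, Clear).

```cpp
// Hypothetical stand-in for the omitted Vector class (assumed 3D).
struct Vector {
    double X = 0.0, Y = 0.0, Z = 0.0;

    Vector() = default;
    Vector(double x, double y, double z) : X(x), Y(y), Z(z) {}

    // Squared Euclidean length; avoids a sqrt when only r^2 is needed.
    double LengthSq() const { return X * X + Y * Y + Z * Z; }

    // Reset all components to zero.
    void Clear() { X = Y = Z = 0.0; }

    Vector& operator+=(const Vector& v) { X += v.X; Y += v.Y; Z += v.Z; return *this; }
    Vector& operator-=(const Vector& v) { X -= v.X; Y -= v.Y; Z -= v.Z; return *this; }
};

// Component-wise difference a - b.
inline Vector operator-(const Vector& a, const Vector& b) {
    return Vector(a.X - b.X, a.Y - b.Y, a.Z - b.Z);
}

// Scalar times vector, matching expressions like `r3i * dr`.
inline Vector operator*(double s, const Vector& v) {
    return Vector(s * v.X, s * v.Y, s * v.Z);
}
```

Because this is a plain value type, temporaries such as `b.Position - this->Position` live on the stack in C++ and involve no heap allocation.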

+4
7 answers

You will need to interop with native code. You can put it in a DLL and use P/Invoke. That works well when you don't cross the managed/native boundary very often and the interface is thin. The most flexible and fastest solution is to create a ref class wrapper in C++/CLI. See this magazine article for guidance.
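To make the "thin interface, few transitions" point concrete, here is a minimal sketch of what the native side could look like for the n-body loop in the question: one exported function that runs all the steps in a single call. The names `BodyData`, `NBODY_API` and `nbody_run` are made up for illustration, and the flat 10-double layout is an assumption chosen so that a C# struct with the same sequential fields could be passed as an array.

```cpp
#include <cmath>
#include <type_traits>

// Export macro (hypothetical name); dllexport only applies on Windows.
#if defined(_WIN32)
#define NBODY_API extern "C" __declspec(dllexport)
#else
#define NBODY_API extern "C"
#endif

// Assumed plain-data layout: doubles only, no pointers or virtuals.
struct BodyData {
    double mass;
    double pos[3];
    double vel[3];
    double acc[3];
};
static_assert(std::is_standard_layout<BodyData>::value, "must be plain data");
static_assert(sizeof(BodyData) == 10 * sizeof(double), "unexpected padding");

// Runs `steps` predict/pairwise/correct iterations in ONE native call,
// so the managed/native transition cost is paid once, not once per pair.
NBODY_API void nbody_run(BodyData* b, int n, int steps, double dt) {
    for (int s = 0; s < steps; s++) {
        // Predict: half-step velocity update, then position update.
        for (int i = 0; i < n; i++)
            for (int k = 0; k < 3; k++) {
                b[i].vel[k] += 0.5 * dt * b[i].acc[k];
                b[i].pos[k] += dt * b[i].vel[k];
            }
        // Pairwise: accumulate accelerations for every unordered pair.
        for (int i = 0; i < n; i++)
            for (int j = i + 1; j < n; j++) {
                double dr[3], r2 = 0.0;
                for (int k = 0; k < 3; k++) {
                    dr[k] = b[j].pos[k] - b[i].pos[k];
                    r2 += dr[k] * dr[k];
                }
                const double r3i = 1.0 / (r2 * std::sqrt(r2));
                for (int k = 0; k < 3; k++) {
                    b[i].acc[k] += b[j].mass * r3i * dr[k];
                    b[j].acc[k] -= b[i].mass * r3i * dr[k];
                }
            }
        // Correct: second half-step velocity update, then clear acceleration.
        for (int i = 0; i < n; i++)
            for (int k = 0; k < 3; k++) {
                b[i].vel[k] += 0.5 * dt * b[i].acc[k];
                b[i].acc[k] = 0.0;
            }
    }
}
```

The design choice is to push the whole loop across the boundary rather than exporting Pairwise itself; calling a native function once per pair of bodies would pay the transition cost O(n^2) times per step.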

Last but not least, you really need to profile the C# code. A factor of 8 is quite excessive. Don't start on this until you have at least half an idea why it is so slow. You don't want to reproduce the cause in the C++ code, which could waste a week of work.

And beware of wrong instincts. 64-bit code is not actually faster; it is usually a bit slower than x86 code. It gets a bunch of extra registers, which is very nice. But all pointers are double the size, and you don't get double the CPU cache.
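A small illustration of the pointer-size point, using a made-up `Node` type: on typical 32-bit x86 ABIs this struct occupies 12 bytes, while on typical x86-64 ABIs it occupies 24 (two 8-byte pointers, a 4-byte int, and padding), so the same number of nodes takes roughly twice the cache space.

```cpp
#include <cstddef>

// Hypothetical pointer-heavy node, e.g. one cell of a doubly linked list.
struct Node {
    Node* next;
    Node* prev;
    int   value;
};

// Bytes occupied by `count` nodes stored contiguously.
inline std::size_t footprint(std::size_t count) {
    return count * sizeof(Node);
}
```

Data made mostly of doubles, like the Body fields in the question, is much less affected, since a double is 8 bytes in both modes.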

+8

You have two options: P/Invoke and C++/CLI.

P/Invoke

Using P/Invoke, or Platform Invoke, .NET (and therefore C#) can call unmanaged code (your C++ code). This can be a bit daunting at first, but it is certainly possible to have your C# code call performance-critical C++ code.

Some MSDN links to get you started:

Basically, you create a C++ DLL that exports all the unmanaged functions you want to call from C#. Then, in C#, you use DllImportAttribute to import each function.

For example, suppose you have a C++ project that builds Monkey.dll with the following function:

 extern "C" __declspec(dllexport) void FastMonkey(); 

Then you would declare it in C# as follows:

    using System.Runtime.InteropServices;

    class NativeMethods
    {
        [DllImport("Monkey.dll", CallingConvention = CallingConvention.Cdecl)]
        public static extern void FastMonkey();
    }

You can then call the C++ function from C# by calling NativeMethods.FastMonkey().

A few common pitfalls and notes:

  • Spend time learning interop marshaling. Understanding it will go a long way toward writing correct P/Invoke definitions.
  • The default calling convention for P/Invoke is StdCall, but C++ defaults to Cdecl.
  • The default character set is ANSI, so if you want to marshal Unicode strings, you will need to update the DllImport definition (see MSDN - DllImport.CharSet).
  • http://www.pinvoke.net/ is a useful resource for learning how to P/Invoke standard Windows API functions. You can also use it to figure out how to marshal something if you know of a similar Windows API call.

C++/CLI

C++/CLI is a set of extensions to C++ created by Microsoft for building .NET assemblies with C++. C++/CLI also lets you mix unmanaged and managed code together in a "mixed-mode" assembly. You can create a C++/CLI assembly that contains both your performance-critical code and any .NET wrapper you want around it.

For more information on C++/CLI, I recommend starting with MSDN - Language Features for Targeting the CLR and MSDN - Native and .NET Interoperability.

I recommend you start with the P/Invoke route. I have found that a clear separation between unmanaged and managed code helps keep things simple.

+2

In the C# version, is Vector a class or a struct? I suspect it is a class, and Arthur Stankevich hit the nail on the head with his observation that you may be allocating many of them. Try making Vector a struct, or reusing the same Vector objects.

+1

The easiest way to do this is to create a C++ ActiveX DLL.

You can then reference it in a C# project, and Visual Studio will generate interop wrappers around the ActiveX COM object.

You can use the interop code just like ordinary C# code, with no additional wrapper code.

More about ActiveX / C#:

Create and use an ActiveX C++ component in a .NET environment

0

"I did some tests on two different implementations of some code (one in C#, the other in C++) and the timings showed that the C++ version was 8 times faster"

I have done numerical computations in C#, C++, Java and a bit of F#, and the biggest difference I saw between C# and C++ was a factor of 3.5.

Profile your C# version and find the bottleneck (there may be I/O-related problems or unnecessary allocations).

0

P/Invoke is definitely simpler than COM interop for a simple case. However, if you are implementing large chunks of a class model in C++, you should seriously consider C++/CLI or COM interop.

ATL lets you knock together a class quickly, and once the object has been created, the call overhead is basically as small as with P/Invoke (unless you use dynamic dispatch via IDispatch, but that should be obvious).

Of course, C++/CLI is the best option, but it will not work everywhere. P/Invoke can be made to work everywhere. COM interop is supported in Mono to the extent

0

It looks like your code makes many implicit allocations of Vector objects:

    Vector dr = b.Position - this.Position;
    ...
    Vector da = r3i * dr;
    this.Acceleration += (b.Mass * da);
    b.Acceleration -= (this.Mass * da);

Try reusing the already allocated memory.
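One way to read this advice, sketched here in C++ (the same shape carries over to C#): write the pairwise step with scalar temporaries and accumulate into the existing acceleration fields, so no intermediate vector objects are created at all. The `FlatBody` type and its field names are invented for this sketch and assume a 3D simulation.

```cpp
#include <cmath>

// Hypothetical body with components stored inline (no Vector objects).
struct FlatBody {
    double Mass;
    double Px, Py, Pz;  // position
    double Ax, Ay, Az;  // accumulated acceleration
};

// Same arithmetic as Body.Pairwise, but every temporary is a local scalar
// and results are written in place, so nothing is allocated per call.
inline void Pairwise(FlatBody& a, FlatBody& b) {
    const double dx = b.Px - a.Px;
    const double dy = b.Py - a.Py;
    const double dz = b.Pz - a.Pz;

    const double r2  = dx * dx + dy * dy + dz * dz;
    const double r3i = 1.0 / (r2 * std::sqrt(r2));

    const double sa = b.Mass * r3i;  // scale applied to a's acceleration
    const double sb = a.Mass * r3i;  // scale applied to b's acceleration

    a.Ax += sa * dx;  a.Ay += sa * dy;  a.Az += sa * dz;
    b.Ax -= sb * dx;  b.Ay -= sb * dy;  b.Az -= sb * dz;
}
```

In C#, the equivalent would be to keep the components as fields (or make Vector a struct), so the inner loop does not produce garbage for the collector.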

0

Source: https://habr.com/ru/post/1347386/

