Wednesday, November 11, 2015

MKL and C# for dummies (Part 2)



As we saw, it is quite easy to use this library. Obviously you have to search the right function you need marshalling in C# and of course to know something of algebra. Although the esoteric marshalling can create some problem, when you start to face some huge system, big arrays, these functions boost your programs performance.


Memory Limitation (System.OutOfMemoryException)

It is easy to exceed the 2GB RAM .NET memory limit , that involve both x86 and x64 platforms. This was a original choice of Microsoft .NET developer team, the array max size into the heap was 2GB.

By the way, from .NET 4.5 and platform 64-bit it is possible to overcome this limit. But how?
It is enough to add this element gcAllowVeryLargeObjects into the config file App.config.

MSDN Link.

...
  <runtime>
    <gcallowverylargeobjects enabled="true">
  </gcallowverylargeobjects>
  </runtime>

There is always an index limitation, in fact the array index is a represented by Int32 number, so you can have maximum 23997907 items. But it is very difficult to have such big array, while and array like Complex[12000, 12000] is quite normal, enough to overcome 2GB.

Strange Exceptions 
(System.ArgumentException: Array size exceedsadressing limitation)

Every thing works fine now, my class use MKL functions without problems until you reach again 2GB, but why? The Microsoft condition was respected. This was the exception: System.ArgumentException: Array size exceeds dressing limitation!

I spent a lot of time to understand where was the issue, but it seemed nobody knew the solution. The same example I did in C# worked properly if wrote in C++.

At the end I found that the default marshaling doesn't like such large arrays. It was necessary to pin the array and pass a pointer to its first element to the native function. But how?

This was the hint:

using System.Numerics;
using System.Runtime.InteropServices;

public unsafe class MyWrapper
{
    static void Main(string[] args)
    {
        Complex[,] A = new Complex[12000, 12000];

        A[0, 0] = new Complex(0.1, -0.1);
        A[0, 1] = new Complex(11.0, 0.1);
        A[0, 5] = new Complex(55, -0.5);
        A[0, 9] = new Complex(99.0, 0.9);

        fixed (Complex* p = &A[0, 0])
            C_foo(p);
    }

    [DllImport("DLL1.dll", CallingConvention = CallingConvention.Cdecl, ExactSpelling = true, SetLastError = false)]
    internal static extern int C_foo(Complex* A);
}

So my class becames:

[SuppressUnmanagedCodeSecurity]
    public unsafe class MKLWrapper
    {

…

 [DllImport("mkl_rt.dll", ExactSpelling = true, SetLastError = false, CallingConvention = CallingConvention.Cdecl)]
        internal static extern int LAPACKE_zgesv(
            int matrix_layout,
            int n,
            int nrhs,
            [In, Out] Complex* A,
            int lda,
            [In, Out] int[] ipiv,
            [In, Out] Complex* B,
            int ldb
        );

public static Complex[,] MKLResolve(Complex[,] MSL, Complex[,] VTN)
        {
            int n = MSL.GetLength(0);
            int nrhs = 1;
            int lda = n;
            int ldb = 1;
            int info = 1;

            int[] ipiv = new int[n];

            fixed (Complex* p1 = &MSL[0,0])
            {
                fixed (Complex* p2 = &VTN[0,0])
                {
                    info = MKLWrapper.LAPACKE_zgesv(LAPACK_ROW_MAJOR, n, nrhs, p1, lda, ipiv, p2, ldb);
                }
            }
            
            if (info == 0)
            {
                //Console.WriteLine("Calculation completed!");               
                return VTN;
            }
            else
            {
                //Console.WriteLine("Calculation Error!");
                return null;
            }
        }

In order to use this sintax the class MKLWrapper need to be declared unsafe:
public unsafe class MKLWrapper


Then the array parameter became:
[In, Out] Complex* A,


And the function needs to be called like this:
fixed (Complex* p1 = &MSL[0,0])
  {
    fixed (Complex* p2 = &VTN[0,0])
    {
      info = MKLWrapper.LAPACKE_zgesv(LAPACK_ROW_MAJOR, n, nrhs, p1, lda, ipiv, p2, ldb);
    }
  }


The fixed statement prevents the garbage collector from relocating a movable variable. The fixed statement is only permitted in an unsafe context. Fixed can also be used to create fixed size buffers. 

Here Microsoft document.

lapack_int 

Another thing to take in account, is that the lapack_int type could be int or long in C# but if declared int, you are sure this declaration is valid between the 32 or 64 bit platforms.


Now it works well!!!!!

Part 1 | Part 2 | Part 3 | Part 4


Wednesday, October 14, 2015

MKL and C# for dummies (Part 1)


Introduction

It is time to boost my application! A computational program that resolve linear systems of sparse matrices and other algebra calculations.

Until now, I implemented all the algorithms myself in c#, searching on internet or in math books, I parallelize also some of these, by the way the performance in some cases remains very slow.

After a research on internet, I decide to adopt the Intel Math Kernel Library (MKL). It comes with Intel “Parallel Studio XE” for commercial solutions. Intel  MKL  is  built  using  the  Intel®  C++  and  Fortran  Compilers  and  threaded  using  OpenMP*.  Its  algorithms  are constructed  to  balance  data  and  tasks  for  efficient  use  of  multiple  cores  and processors.

MKL is written in C++, and Fortran too, so you can have the sources, but it provides also dlls, already compiled x86 and x64, that can be used in other languages like C#. The main difficulties is to declare the functions on C# because: or you find examples, or you have to become an expert of marshalling with c#.

What is Marshaling?

Marshaling is the process of creating a bridge between managed code and unmanaged code; it is the homer that carries messages from the managed to the unmanaged environment and reverse. It is one of the core services offered by the CLR (Common Language Runtime.)

Because much of the types in unmanaged environment do not have counterparts in managed environment, you need to create conversion routines that convert the managed types into unmanaged and vice versa; and that is the marshaling process.
As a refresher, we call .NET code “managed” because it is controlled (managed) by the CLR. Other code that is not controlled by the CLR is called unmanaged.

Why Marshaling?
You already know that there is no such compatibility between managed and unmanaged environments. In other words, .NET does not contain such types HRESULT, DWORD, and HANDLE that exist in the realm of unmanaged code. Therefore, you need to find a .NET substitute or create your own if needed. That is what called marshaling.

An example is the unmanaged DWORD; it is an unsigned 32-bit integer, so we can marshal it in .NET as System.UInt32. Therefore, System.UInt32 is a substitute for the unmanaged DWORD. On the other hand, unmanaged compound types (structures, unions, etc.) do not have counterparts or substitutes in the managed environment. Thus, you’ll need to create your own managed types (structures/classes) that will serve as the substitutes for the unmanaged types you use.

(http://stackoverflow.com/questions/2240804/marshalling-what-is-it-and-why-do-we-need-it).


First Steps
Once installed “Parallel Studio XE”, I had to understand where to find the libraries and codes. I found these directories:


DLLs C:\Program Files (x86)\IntelSWTools\compilers_and_libraries_2016.0.110\windows\redist\
CPP Headers C:\Program Files (x86)\IntelSWTools\compilers_and_libraries_2016.0.110\windows\mkl\include
DOCUMENTATION C:/Program Files (x86)/IntelSWTools/compilers_and_libraries_2016/windows/documentation/en/mkl/common/mklman_c/index.htm






In the DLL Folder, there were plenty files (192MB of files in x86 platform), so I had to find which of these was useful for my purpose.

As I mentioned above, I needed to compute a system resolution of sparse matrices, with Complex array data type. Reading the documentation I found that I had to use ?gesv() function.
? is a char that depends of data type you want to manage, in this case zgesv() was my right function.

This function was in LAPAKE namespace, so I found useful for my purpose these files:

  • mkl_rt.dll
  • mkl_core.dll
  • mkl_avx2.dll
  • mkl_intel_thread.dll
  • libiomp5md.dll

44MB of 192MB

I didn’t spend time to understand what is the specificity of each dlls in this list, but with these files my program works fine.

The code below it is a class, than incapsulates some MKL function used on my program.


 

[SuppressUnmanagedCodeSecurity]
public sealed class MKLWrapper
{

 public static int LAPACK_ROW_MAJOR = 101;
 public static int LAPACK_COL_MAJOR = 102;

 //TRANSPOSE
 public static int CBLAS_NO_TRANS = 111;
 public static int CBLAS_TRANS = 112;
 public static int CBLAS_CONJ_TRANS = 113;

 public const char UPLO_U = 'U'; 
 public const char UPLO_L = 'L'; 

 private MKLWrapper()
 {
 }

 #region MKL API

 
 [DllImport("mkl_rt.dll", ExactSpelling = true, SetLastError = false, CallingConvention = CallingConvention.Cdecl)]
 internal static extern int LAPACKE_zgesv(
  int matrix_layout,
  int n,
  int nrhs,
  [In, Out] Complex[,] input_matrix,
  int lda,
  [In, Out] int[] ipiv,
  [In, Out] Complex[,] b,
  int ldb
 );

 /** CBLAS native cblas_cgemm declaration */
 [DllImport("mkl_rt.dll", CallingConvention = CallingConvention.Cdecl, ExactSpelling = true, SetLastError = false)]
 internal static extern void cblas_zgemm(
  int matrix_layout,
  int TransA,
  int TransB,
  int M,
  int N,
  int K,
  Complex alpha,
  [In] Complex[,] A,
  int lda,
  [In] Complex[,] B,
  int ldb,
  Complex beta,
  [In, Out] Complex[,] C,
  int ldc
 );

 //lapack_int LAPACKE_zsytrf(int matrix_layout, char uplo, lapack_int n, lapack_complex_double* a, lapack_int lda, lapack_int* ipiv);
 [DllImport("mkl_rt.dll", ExactSpelling = true, SetLastError = false, CallingConvention = CallingConvention.Cdecl)]
 internal static extern int LAPACKE_zsytrf(
  int matrix_layout,
  char uplo,
  int n,
  [In, Out] Complex[,] input_matrix,
  int lda,
  [In, Out] int[] ipiv
 );


 //LAPACKE_zsytri(int matrix_layout, char uplo, lapack_int n, lapack_complex_double* a, lapack_int lda, const lapack_int* ipiv );
 [DllImport("mkl_rt.dll", ExactSpelling = true, SetLastError = false, CallingConvention = CallingConvention.Cdecl)]
 internal static extern int LAPACKE_zsytri(
  int matrix_layout,
  char uplo,
  int n,
  [In, Out] Complex[,] input_matrix,
  int lda,
  [In, Out] int[] ipiv
 );

 #endregion

 
 // Multiplication between matrices
 public static Complex[,] MKLMul(Complex[,] A, Complex[,] B)
 {
  int m = A.GetLength(0);
  int n = B.GetLength(1);
  int k = A.GetLength(1);

  Complex[,] c = MatrixGeneric.Create<complex>(m, n);

  Complex alpha = 1;
  Complex beta = 0;

  int lda = k;
  int ldb = n;
  int ldc = n;

  MKLWrapper.cblas_zgemm(LAPACK_ROW_MAJOR, CBLAS_NO_TRANS, CBLAS_NO_TRANS, m, n, k, alpha, A, lda, B, ldb, beta, c, ldc);

  return c;
 }

 
 // Lynear System Resolution (Sparse matrices)
 public static Complex[,] MKLResolve(Complex[,] MSL, Complex[,] VTN)
 {
  int n = VTN.GetLength(0);
  int nrhs = 1;
  int lda = n;
  int ldb = 1;
  int info = 1;

  int[] ipiv = new int[n];

  info = MKLWrapper.LAPACKE_zgesv(LAPACK_ROW_MAJOR, n, nrhs, MSL, lda, ipiv, VTN, ldb);
  if (info == 0)
  {
   //Console.WriteLine("Calculation completed!");
   Complex[,] res = MatrixGeneric.Create<complex>(n, true);
   for (int i = 1; i <= VTN.Length; i++)
   {
    res[i, 1] = VTN[i, 1];
   }
   return res;
  }
  else
  {
   //Console.WriteLine("Calculation Error!");
   return null;
  }

 }

 
 // Symmetrica Matrix Inversion
 public static Complex[,] MKLInvert(Complex[,] A)
 {

  Complex[,] c = (Complex[,])A.Clone();

  int n = A.GetLength(0);
  int lda = n;
  int info = 1;

  int[] ipiv = new int[n];

  info = MKLWrapper.LAPACKE_zsytrf(LAPACK_ROW_MAJOR, UPLO_U, n, c, lda, ipiv);

  info = MKLWrapper.LAPACKE_zsytri(LAPACK_ROW_MAJOR, UPLO_U, n, c, lda, ipiv);


  for (int i = 1; i <= A.GetLength(0); i++)
  {
   for (int j = 1; j <= A.GetLength(1); j++)
   {
    if (i > j)
     c[i, j] = c[j, i];
   }
  }


  if (info == 0)
   return c;
  else
   return null;
 }
}



Part 1 | Part 2 | Part 3 | Part 4