Wednesday, October 14, 2015

MKL and C# for dummies (Part 1)


Introduction

It is time to boost my application! A computational program that resolve linear systems of sparse matrices and other algebra calculations.

Until now, I implemented all the algorithms myself in c#, searching on internet or in math books, I parallelize also some of these, by the way the performance in some cases remains very slow.

After a research on internet, I decide to adopt the Intel Math Kernel Library (MKL). It comes with Intel “Parallel Studio XE” for commercial solutions. Intel  MKL  is  built  using  the  Intel®  C++  and  Fortran  Compilers  and  threaded  using  OpenMP*.  Its  algorithms  are constructed  to  balance  data  and  tasks  for  efficient  use  of  multiple  cores  and processors.

MKL is written in C++, and Fortran too, so you can have the sources, but it provides also dlls, already compiled x86 and x64, that can be used in other languages like C#. The main difficulties is to declare the functions on C# because: or you find examples, or you have to become an expert of marshalling with c#.

What is Marshaling?

Marshaling is the process of creating a bridge between managed code and unmanaged code; it is the homer that carries messages from the managed to the unmanaged environment and reverse. It is one of the core services offered by the CLR (Common Language Runtime.)

Because much of the types in unmanaged environment do not have counterparts in managed environment, you need to create conversion routines that convert the managed types into unmanaged and vice versa; and that is the marshaling process.
As a refresher, we call .NET code “managed” because it is controlled (managed) by the CLR. Other code that is not controlled by the CLR is called unmanaged.

Why Marshaling?
You already know that there is no such compatibility between managed and unmanaged environments. In other words, .NET does not contain such types HRESULT, DWORD, and HANDLE that exist in the realm of unmanaged code. Therefore, you need to find a .NET substitute or create your own if needed. That is what called marshaling.

An example is the unmanaged DWORD; it is an unsigned 32-bit integer, so we can marshal it in .NET as System.UInt32. Therefore, System.UInt32 is a substitute for the unmanaged DWORD. On the other hand, unmanaged compound types (structures, unions, etc.) do not have counterparts or substitutes in the managed environment. Thus, you’ll need to create your own managed types (structures/classes) that will serve as the substitutes for the unmanaged types you use.

(http://stackoverflow.com/questions/2240804/marshalling-what-is-it-and-why-do-we-need-it).


First Steps
Once installed “Parallel Studio XE”, I had to understand where to find the libraries and codes. I found these directories:


DLLs C:\Program Files (x86)\IntelSWTools\compilers_and_libraries_2016.0.110\windows\redist\
CPP Headers C:\Program Files (x86)\IntelSWTools\compilers_and_libraries_2016.0.110\windows\mkl\include
DOCUMENTATION C:/Program Files (x86)/IntelSWTools/compilers_and_libraries_2016/windows/documentation/en/mkl/common/mklman_c/index.htm






In the DLL Folder, there were plenty files (192MB of files in x86 platform), so I had to find which of these was useful for my purpose.

As I mentioned above, I needed to compute a system resolution of sparse matrices, with Complex array data type. Reading the documentation I found that I had to use ?gesv() function.
? is a char that depends of data type you want to manage, in this case zgesv() was my right function.

This function was in LAPAKE namespace, so I found useful for my purpose these files:

  • mkl_rt.dll
  • mkl_core.dll
  • mkl_avx2.dll
  • mkl_intel_thread.dll
  • libiomp5md.dll

44MB of 192MB

I didn’t spend time to understand what is the specificity of each dlls in this list, but with these files my program works fine.

The code below it is a class, than incapsulates some MKL function used on my program.


 

[SuppressUnmanagedCodeSecurity]
public sealed class MKLWrapper
{

 public static int LAPACK_ROW_MAJOR = 101;
 public static int LAPACK_COL_MAJOR = 102;

 //TRANSPOSE
 public static int CBLAS_NO_TRANS = 111;
 public static int CBLAS_TRANS = 112;
 public static int CBLAS_CONJ_TRANS = 113;

 public const char UPLO_U = 'U'; 
 public const char UPLO_L = 'L'; 

 private MKLWrapper()
 {
 }

 #region MKL API

 
 [DllImport("mkl_rt.dll", ExactSpelling = true, SetLastError = false, CallingConvention = CallingConvention.Cdecl)]
 internal static extern int LAPACKE_zgesv(
  int matrix_layout,
  int n,
  int nrhs,
  [In, Out] Complex[,] input_matrix,
  int lda,
  [In, Out] int[] ipiv,
  [In, Out] Complex[,] b,
  int ldb
 );

 /** CBLAS native cblas_cgemm declaration */
 [DllImport("mkl_rt.dll", CallingConvention = CallingConvention.Cdecl, ExactSpelling = true, SetLastError = false)]
 internal static extern void cblas_zgemm(
  int matrix_layout,
  int TransA,
  int TransB,
  int M,
  int N,
  int K,
  Complex alpha,
  [In] Complex[,] A,
  int lda,
  [In] Complex[,] B,
  int ldb,
  Complex beta,
  [In, Out] Complex[,] C,
  int ldc
 );

 //lapack_int LAPACKE_zsytrf(int matrix_layout, char uplo, lapack_int n, lapack_complex_double* a, lapack_int lda, lapack_int* ipiv);
 [DllImport("mkl_rt.dll", ExactSpelling = true, SetLastError = false, CallingConvention = CallingConvention.Cdecl)]
 internal static extern int LAPACKE_zsytrf(
  int matrix_layout,
  char uplo,
  int n,
  [In, Out] Complex[,] input_matrix,
  int lda,
  [In, Out] int[] ipiv
 );


 //LAPACKE_zsytri(int matrix_layout, char uplo, lapack_int n, lapack_complex_double* a, lapack_int lda, const lapack_int* ipiv );
 [DllImport("mkl_rt.dll", ExactSpelling = true, SetLastError = false, CallingConvention = CallingConvention.Cdecl)]
 internal static extern int LAPACKE_zsytri(
  int matrix_layout,
  char uplo,
  int n,
  [In, Out] Complex[,] input_matrix,
  int lda,
  [In, Out] int[] ipiv
 );

 #endregion

 
 // Multiplication between matrices
 public static Complex[,] MKLMul(Complex[,] A, Complex[,] B)
 {
  int m = A.GetLength(0);
  int n = B.GetLength(1);
  int k = A.GetLength(1);

  Complex[,] c = MatrixGeneric.Create<complex>(m, n);

  Complex alpha = 1;
  Complex beta = 0;

  int lda = k;
  int ldb = n;
  int ldc = n;

  MKLWrapper.cblas_zgemm(LAPACK_ROW_MAJOR, CBLAS_NO_TRANS, CBLAS_NO_TRANS, m, n, k, alpha, A, lda, B, ldb, beta, c, ldc);

  return c;
 }

 
 // Lynear System Resolution (Sparse matrices)
 public static Complex[,] MKLResolve(Complex[,] MSL, Complex[,] VTN)
 {
  int n = VTN.GetLength(0);
  int nrhs = 1;
  int lda = n;
  int ldb = 1;
  int info = 1;

  int[] ipiv = new int[n];

  info = MKLWrapper.LAPACKE_zgesv(LAPACK_ROW_MAJOR, n, nrhs, MSL, lda, ipiv, VTN, ldb);
  if (info == 0)
  {
   //Console.WriteLine("Calculation completed!");
   Complex[,] res = MatrixGeneric.Create<complex>(n, true);
   for (int i = 1; i <= VTN.Length; i++)
   {
    res[i, 1] = VTN[i, 1];
   }
   return res;
  }
  else
  {
   //Console.WriteLine("Calculation Error!");
   return null;
  }

 }

 
 // Symmetrica Matrix Inversion
 public static Complex[,] MKLInvert(Complex[,] A)
 {

  Complex[,] c = (Complex[,])A.Clone();

  int n = A.GetLength(0);
  int lda = n;
  int info = 1;

  int[] ipiv = new int[n];

  info = MKLWrapper.LAPACKE_zsytrf(LAPACK_ROW_MAJOR, UPLO_U, n, c, lda, ipiv);

  info = MKLWrapper.LAPACKE_zsytri(LAPACK_ROW_MAJOR, UPLO_U, n, c, lda, ipiv);


  for (int i = 1; i <= A.GetLength(0); i++)
  {
   for (int j = 1; j <= A.GetLength(1); j++)
   {
    if (i > j)
     c[i, j] = c[j, i];
   }
  }


  if (info == 0)
   return c;
  else
   return null;
 }
}



Part 1 | Part 2 | Part 3 | Part 4