Scientific Computing on JRuby github.com/prasunanand
Objective ● A scientific library is memory-intensive, and speed counts. How can JRuby be used effectively to create a great tool/gem? ● A general-purpose GPU library for Ruby that can be used by industry in production and by academia for research.
● Ruby Science Foundation ● SciRuby has been trying to push Ruby for scientific computing. ● Popular Rubygems: 1. NMatrix 2. Daru 3. Mixed_models
NMatrix NMatrix is SciRuby’s numerical matrix core, implementing dense matrices as well as two types of sparse (linked-list-based and Yale/CSR). It currently relies on ATLAS/CBLAS/CLAPACK and standard LAPACK for several of its linear algebra operations.
Daru
Mixed_models
Nyaplot
Why nya?
Contributors wanted ● IRC: #sciruby ● Slack channel: #sciruby ● Google group: sciruby
Known for performance ● JRuby is 10 times faster than CRuby. ● With Truffle it is around 40 times faster than CRuby.
Say hello
NMatrix for JRuby ● MDArray does not provide a unified interface for SciRuby gems. ● MDArray is a great gem for linear algebra. ● However, every gem that uses NMatrix as a dependency would need to be reimplemented with MDArray. ● Hence the effort put into optimizing NMatrix for JRuby instead.
NMatrix for JRuby ● Parallelism: JRuby has no Global Interpreter Lock, unlike MRI ● Easy deployment (Warbler gem)
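To illustrate the parallelism point, here is a minimal sketch that splits an elementwise addition across plain Ruby threads. The `parallel_add` helper and its `workers` parameter are hypothetical and are not part of NMatrix. The code runs on both interpreters; the assumption is that only on JRuby can the threads execute Ruby code on several cores at once.

```ruby
# Hypothetical helper (not NMatrix API): add two equally sized arrays,
# giving each thread its own contiguous slice of indices.
def parallel_add(a, b, workers = 4)
  raise ArgumentError, "size mismatch" unless a.length == b.length
  raise ArgumentError, "workers must be positive" unless workers.positive?

  # Pre-size the result so threads only assign to existing, distinct slots.
  result = Array.new(a.length, 0.0)
  return result if a.empty?

  chunk = (a.length.to_f / workers).ceil
  threads = (0...a.length).step(chunk).map do |start|
    Thread.new do
      stop = [start + chunk, a.length].min
      (start...stop).each { |i| result[i] = a[i] + b[i] }
    end
  end
  threads.each(&:join)
  result
end

a = Array.new(10) { |i| i.to_f }
b = Array.new(10) { 1.0 }
p parallel_add(a, b, 3)
```

Each thread touches a disjoint index range, so no locking is needed for this particular pattern.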
How NMatrix works ● N-Dimensional ● 2-Dimensional NMatrix
N-dimensional NMatrix N-dimensional matrices are stored as a one-dimensional Array.
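A minimal sketch of how N-dimensional coordinates can map onto a one-dimensional backing array. It assumes row-major order (the last index varies fastest); the `flat_index` helper is hypothetical and only illustrates the idea, it is not taken from the NMatrix source.

```ruby
# Hypothetical helper: convert N-dimensional coordinates into a position
# in the flat backing array, assuming row-major storage.
def flat_index(shape, coords)
  raise ArgumentError, "rank mismatch" unless shape.length == coords.length

  index = 0
  shape.each_with_index do |dim, axis|
    c = coords[axis]
    raise IndexError, "coordinate #{c} out of range on axis #{axis}" unless (0...dim).cover?(c)
    # Shift everything accumulated so far by this axis' extent, then add c.
    index = index * dim + c
  end
  index
end

shape   = [2, 3, 4]                                   # a 2 x 3 x 4 matrix
storage = Array.new(shape.reduce(:*)) { |i| i.to_f }  # 24 elements, flat
puts storage[flat_index(shape, [1, 2, 3])]
```

With this layout, reshaping only changes `shape`; the flat storage itself stays untouched.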
Elementwise Operation ● Iterate through the elements ● Access the array; do the operation, return it ● [:add, :subtract, :sin, :gamma]
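The iterate, operate, return pattern above can be sketched in plain Ruby over flat arrays. The two helpers below are hypothetical stand-ins, not the NMatrix implementation; `Math.sin` and `Math.gamma` come from Ruby's standard `Math` module.

```ruby
# Hypothetical helpers: apply a block element by element over flat storage.
def elementwise_binary(a, b)
  raise ArgumentError, "size mismatch" unless a.length == b.length
  Array.new(a.length) { |i| yield(a[i], b[i]) }
end

def elementwise_unary(a)
  Array.new(a.length) { |i| yield(a[i]) }
end

a = [0.11, 0.05, 0.34, 0.14]
b = [0.21, 0.05, 0.14, 0.14]

sum    = elementwise_binary(a, b) { |x, y| x + y }   # :add
diff   = elementwise_binary(a, b) { |x, y| x - y }   # :subtract
sines  = elementwise_unary(a) { |x| Math.sin(x) }    # :sin
gammas = elementwise_unary(a) { |x| Math.gamma(x) }  # :gamma

p sum, diff, sines, gammas
```

Because the shape never enters the loop, the same code serves matrices of any dimension.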
Determinants and Factorization ● Two-dimensional matrix operations ● In NMatrix-MRI, BLAS level-3 and LAPACK routines are implemented using their respective libraries ● NMatrix-JRuby depends on Java functions.
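As a rough picture of what a determinant routine does on a two-dimensional matrix, here is a pure-Ruby sketch using Gaussian elimination with partial pivoting. This is a stand-in for illustration only: it is neither the LAPACK path used by NMatrix-MRI nor the Java functions used by NMatrix-JRuby, and the `determinant` helper is hypothetical.

```ruby
# Hypothetical helper: determinant of a square matrix (array of rows)
# via elimination with partial pivoting.
def determinant(matrix)
  n = matrix.length
  raise ArgumentError, "matrix must be square" unless matrix.all? { |row| row.length == n }

  a = matrix.map { |row| row.map(&:to_f) }  # work on a Float copy
  det = 1.0
  n.times do |k|
    # Pick the row with the largest entry in column k as the pivot row.
    pivot = (k...n).max_by { |i| a[i][k].abs }
    return 0.0 if a[pivot][k].zero?         # singular matrix

    if pivot != k
      a[k], a[pivot] = a[pivot], a[k]
      det = -det                            # a row swap flips the sign
    end
    det *= a[k][k]

    # Eliminate column k from the rows below the pivot.
    ((k + 1)...n).each do |i|
      factor = a[i][k] / a[k][k]
      (k...n).each { |j| a[i][j] -= factor * a[k][j] }
    end
  end
  det
end

puts determinant([[1, 2], [3, 4]])
```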
Mixed models ● After NMatrix for doubles was ready, I tested it with mixed_models.
Challenges ● Autoboxing and multiple data types ● Minimise copying of data ● Handling large arrays
Autoboxing ● :float64 => double only ● Strict dtypes => creating data type in Java: not guessing ● Errors => that can’t be reproduced :P
[0.11, 0.05, 0.34, 0.14] + [0.21, 0.05, 0.14, 0.14] = [0, 0, 0, 0]
([0.11, 0.05, 0.34, 0.14] + 5) + ([0.21, 0.05, 0.14, 0.14] + 5) - 10 = [0.32, 0.1, 0.48, 0.28]
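One way to read "strict dtypes, not guessing" is to convert every element to a double explicitly before it crosses into Java, and to fail loudly on anything that is not a number. The sketch below assumes that reading; `to_float64` is a hypothetical helper, not NMatrix code.

```ruby
# Hypothetical helper: force every element to a Ruby Float (a 64-bit
# double) up front, instead of letting the Java side infer a type
# per element.
def to_float64(values)
  values.map do |v|
    raise TypeError, "expected Numeric, got #{v.class}" unless v.is_a?(Numeric)
    Float(v)
  end
end

data = to_float64([0.11, 5, 0.34, 14])
p data

# On JRuby only (assumption: running under JRuby), the checked array
# could then be handed over as a primitive double[]:
#   java_doubles = data.to_java(:double)
```

Mixed Integer and Float input is the case the check targets: without the explicit conversion, an integer-looking element could end up boxed as a different Java type from its neighbours.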
Minimise copying of data ● Make sure you don't make unnecessary copies of data
Handling large arrays ● Array size ● Accessing elements ● Chaining to Java methods ● Speed and memory required
Ruby Code

    require 'java'
    require 'benchmark'

    # Allocate two 15,000 x 15,000 Java double arrays, then fill b from Ruby.
    b = Java::double[15_000, 15_000].new
    c = Java::double[15_000, 15_000].new
    index = 0
    puts Benchmark.measure {
      (0...15_000).each do |i|
        (0...15_000).each do |j|
          b[i][j] = index
          index += 1
        end
      end
    }
    # 43.260000 3.250000 46.510000 ( 39.606356)

    # Copy b into c element by element, still from Ruby.
    index = 0
    puts Benchmark.measure {
      (0...15_000).each do |i|
        (0...15_000).each do |j|
          c[i][j] = b[i][j]
          index += 1
        end
      end
    }
    # 67.790000 0.070000 67.860000 ( 65.126546)
    # RAM consumed => 5.4GB
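Java Code

    public class MatrixGenerator {
      // row, col and the shared fields are not shown on the slide;
      // they are assumed here so that both methods compile.
      static final int row = 15000;
      static final int col = 15000;
      static double[][] b;
      static double[][] c;

      // Allocate both matrices and fill b, entirely in Java.
      public static void test1() {
        b = new double[row][col];
        c = new double[row][col];
        for (int index = 0, i = 0; i < row; i++) {
          for (int j = 0; j < col; j++) {
            b[i][j] = index;
            index++;
          }
        }
      }

      // Copy b into c, entirely in Java.
      public static void test2() {
        for (int index = 0, i = 0; i < row; i++) {
          for (int j = 0; j < col; j++) {
            c[i][j] = b[i][j];
            index++;
          }
        }
      }
    }

    # Called from JRuby:
    puts Benchmark.measure { MatrixGenerator.test1 }
    # 0.032000 0.001000 00.032000 ( 00.03100)
    puts Benchmark.measure { MatrixGenerator.test2 }
    # 0.034000 0.001000 00.034000 ( 00.03300)
    # RAM consumed => 300MB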
Results ● Moving the loops into Java improves: ● Speed: about 1000 times faster ● Memory: about 10 times less
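The round figures can be checked against the wall-clock timings and memory numbers reported on the two preceding slides. The arithmetic below uses only those reported values; the `ratio` helper is hypothetical, and 1 GB is treated as 1000 MB for simplicity.

```ruby
# Hypothetical helper: how many times larger the first value is.
def ratio(slow, fast)
  slow / fast
end

# Wall-clock seconds as reported on the slides.
ruby_fill_s = 39.606356
java_fill_s = 0.031
ruby_copy_s = 65.126546
java_copy_s = 0.033

# Memory as reported on the slides, in MB (assuming 1 GB = 1000 MB).
ruby_mem_mb = 5400.0
java_mem_mb = 300.0

puts format("fill speedup:  %.0fx", ratio(ruby_fill_s, java_fill_s))
puts format("copy speedup:  %.0fx", ratio(ruby_copy_s, java_copy_s))
puts format("memory saving: %.0fx", ratio(ruby_mem_mb, java_mem_mb))
```

Both speedups come out above 1000 and the memory ratio above 10, so the slide's figures read as conservative order-of-magnitude summaries.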
Benchmarking NMatrix functionalities
System Specifications ● CPU: AMD FX8350 octa-core 4.2GHz ● RAM: 16GB
Addition
Subtraction
Gamma
Matrix Multiplication
Determinant
Factorization
Benchmark conclusion ● NMatrix-JRuby is considerably faster than NMatrix-MRI for elementwise operations on N-dimensional matrices. ● NMatrix-MRI is faster for 2-dimensional matrices when computing matrix multiplication, determinants and factorizations.
Improvements ● Make NMatrix-JRuby faster than NMatrix-MRI using BLAS level-3 and LAPACK routines. ● How? ● Why not JBlas?
Future Work ● Add support for complex dtype. ● Convert NMatrix-JRuby Enumerators to Java code. ● Add sparse support.
Am I done?
Nope!
Enter GPU
A General-Purpose GPU library ● Combine the beauty of Ruby with transparent GPU processing ● This will work both on client computers and on servers that make use of Tesla GPUs and Intel Xeon Phi solutions. ● Developer activity and support for the current projects are mixed at best, and the projects are tough to use: they involve writing kernels and require a lot of effort on buffer/RAM optimisation.
ArrayFire-rb ● Wraps ArrayFire library
Using ArrayFire
MRI ● C extension ● Architecture is inspired by NMatrix and NArray ● The C++ function is placed in a namespace (e.g., namespace af { }) or is declared static if possible. The C function receives the prefix af_, e.g., af_multiply() (this function also happens to be static). ● C macros are capitalized and generally have the prefix AF_, as with AF_DTYPE(). ● C functions (and macros, for consistency) are placed within extern "C" { } blocks to turn off C++ mangling.
JRuby ● The approach is the same as for NMatrix-JRuby. ● Java Native Interface (JNI) ● Work on ArrayFire-Java
Benchmarking ArrayFire
System Specification ● CPU: AMD FX octa-core 4.2GHz ● RAM: 16GB ● GPU: Nvidia GTX 750 Ti ● GPU RAM: 4GB GDDR5
Matrix Addition
Matrix Multiplication
Matrix Determinant
Factorization
Transparency ● Integrate with NArray ● Integrate with NMatrix ● Integrate with Rails
Applications ● Endless possibilities ;) ● Bioinformatics ● Integrate TensorFlow ● Image Processing ● Computational Fluid Dynamics
Conclusion
Useful Links ● https://github.com/sciruby/nmatrix ● https://github.com/arrayfire/arrayfire-rb ● https://github.com/prasunanand/arrayfire-rb/tree/temp
Acknowledgements 1. Pjotr Prins 2. Charles Nutter 3. John Woods 4. Alexej Gossmann 5. Sameer Deshmukh 6. Pradeep Garigipati
Thank You Github: prasunanand Twitter: @prasun_anand Blog: prasunanand.com
