EKON 29

Statistic Packages for Delphi and Python

A truly cross-platform library for numerical analysis and statistical functions isn’t easy to find in Delphi, especially since quality and licensing also matter. In this session, we’ll delve into the most useful packages and libraries for statistical calculations, all the way up to advanced modeling and data visualization. We’ll also compile statistics on the tops and flops of Delphi projects over the past 30 years.

https://entwickler-konferenz.de/delphi-innovations-fundamentals/statistic-packages-fuer-delphi-oder-python

Hands-On

  • Setting up the mrMath and mrMatrix mrAI toolchain
  • Open AI for Delphi Demo
  • Calculating DMath correlations, pattern recognition, and trends
  • Introducing the 7 well-known statistical methods
  • Demonstrating the 5 most important chart types (Bar Chart, Scatter Plot,
    Histogram, Box Plot and Correlation Matrix)
  • Setting up project statistics for an AGSI energy storage time series
  • Descriptive statistics with the reference dataset for moral statistics (Guerry, “HistData”)
  • Data Science Tutorial AGSI or Guerry

Contents

  • Overview of the statistical packages with Delphi and Python (P4D).
  • Configuration and spec features of DMath, SKLearn, Statsmodels, & DataLab library
  • Troubleshooting: Typical miscalculations and their solutions with cleaned data
  • First steps in implementing regression, cluster analysis, and correlation matrix

We download the Guerry dataset, a collection of historical data used in support of
André-Michel Guerry’s 1833 Essay on the Moral Statistics of France. The data set is hosted
online in comma-separated values (CSV) format by the Rdatasets repository. We could download the file
locally and then load it using read_csv, but pandas takes care of all of this automatically for us.

https://github.com/friendly/Guerry
                            OLS Regression Results
==============================================================================
Dep. Variable:                Lottery   R-squared:                       0.414
Model:                            OLS   Adj. R-squared:                  0.392
Method:                 Least Squares   F-statistic:                     19.30
Date:                Thu, 26 Jun 2025   Prob (F-statistic):           1.47e-09
Time:                        17:35:22   Log-Likelihood:                -375.28
No. Observations:                  86   AIC:                             758.6
Df Residuals:                      82   BIC:                             768.4
Df Model:                           3
Covariance Type:            nonrobust
================================================================================
                      coef    std err          t      P>|t|     [0.025    0.975]
--------------------------------------------------------------------------------
Intercept         194.3951     37.768      5.147      0.000    119.263   269.527
Wealth              0.2820      0.093      3.024      0.003      0.097     0.468
Literacy           -0.3840      0.127     -3.033      0.003     -0.636    -0.132
np.log(Pop1831)   -25.2363      6.047     -4.174      0.000    -37.265   -13.207
==============================================================================
Omnibus:                        7.602   Durbin-Watson:                   1.890
Prob(Omnibus):                  0.022   Jarque-Bera (JB):                7.051
Skew:                          -0.651   Prob(JB):                       0.0294
Kurtosis:                       3.524   Cond. No.                     1.13e+03
==============================================================================

Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
[2] The condition number is large, 1.13e+03. This might indicate that there are
strong multicollinearity or other numerical problems.
mX5 executed: 26/06/2025 17:35:23 Runtime: 0:0:47.680 Memload: 58% use
Partial Regression Plot Grid

https://spatialanalysis.github.io/geodaData/reference/guerry.html

 ExecString('import statsmodels.api as sm; import statsmodels.formula.api as smf; import numpy as np');
 ExecStr('model = smf.ols("Lottery ~ Wealth + Literacy + np.log(Pop1831)", data=df).fit()');
 ExecStr('print(model.summary())');
 ExecStr('sm.graphics.plot_partregress_grid(model)');

Statsmodels is a Python library designed for statistical modeling, hypothesis testing, and data exploration. It provides a wide range of statistical models, including linear regression, time series analysis, and generalized linear models.
The library supports both formula-based modeling (similar to R) and direct use of NumPy arrays. The data science tutorial explains the so-called AGSI storage data and its visualization as a timeline. AGSI is the Aggregated Gas Storage Inventory; its website lets you stay up to date whenever one of its data providers posts a new service announcement or update.

Data representation of gas in storage as a timeline AGSI dataset.

https://blogs.embarcadero.com/why-a-data-scientist-chooses-delphi-for-powerful-real-world-visualizations/

Scikit-learn’s model.score(X, y) returns the coefficient of determination, R^2, for regression models. It is a simple function that you call as model.score(X_test, y_test). You do not have to supply y_predicted yourself: the method computes the predictions internally and uses them in the calculation.
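
To make the R^2 calculation concrete, here is a minimal sketch in plain Python, without scikit-learn. The names r2_score and LineModel are hypothetical helpers for illustration, not library classes; the point is only that score() calls predict() itself, so no y_predicted has to be passed in.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred))
    ss_tot = sum((yt - mean_y) ** 2 for yt in y_true)
    return 1.0 - ss_res / ss_tot


class LineModel:
    """Hypothetical one-feature least-squares model (not a scikit-learn class)."""

    def fit(self, x, y):
        n = len(x)
        mean_x, mean_y = sum(x) / n, sum(y) / n
        sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
        sxx = sum((xi - mean_x) ** 2 for xi in x)
        self.slope = sxy / sxx
        self.intercept = mean_y - self.slope * mean_x
        return self

    def predict(self, x):
        return [self.intercept + self.slope * xi for xi in x]

    def score(self, x, y):
        # Predictions are computed here, so the caller never passes y_predicted.
        return r2_score(y, self.predict(x))


# Made-up data lying exactly on y = 2x, so the test score is a perfect fit.
model = LineModel().fit([1, 2, 3, 4], [2, 4, 6, 8])
print(model.score([5, 6], [10, 12]))
```

A model that always predicts the mean of y_true gets a score of 0, and a model worse than that gets a negative score, which is why R^2 is not restricted to the range 0 to 1 on test data.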

  • Mean Function: Calculates the average of an array.
  • Covariance Function: Computes the covariance between two arrays.
  • Correlation: Uses covariance and standard deviations for the correlation coefficient.
  • ComputeCorrelationMatrix: Iterates through all variable pairs to compute corr-matrix.
  • PrintMatrix: Outputs the matrix to the console.

Delphi has no built-in correlation matrix function, and no heatmap for one either, but you can implement both using standard math operations.
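
For comparison, the five building blocks listed above can be sketched in plain Python. This is an illustration under stated assumptions, not the session's script: the function names mirror the list, the sample data is made up, and the sample covariance (division by n - 1) is assumed. A constant column would cause a division by zero and is not handled here.

```python
import math


def mean(xs):
    return sum(xs) / len(xs)


def covariance(xs, ys):
    """Sample covariance (divides by n - 1)."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def correlation(xs, ys):
    """Pearson r = cov(x, y) / (sd(x) * sd(y))."""
    return covariance(xs, ys) / math.sqrt(covariance(xs, xs) * covariance(ys, ys))


def compute_correlation_matrix(data):
    """data is a list of observation rows; returns an nVars x nVars matrix."""
    columns = list(zip(*data))              # transpose rows into columns
    n = len(columns)
    corr = [[1.0] * n for _ in range(n)]    # diagonal is 1 by definition
    for i in range(n):
        for j in range(i + 1, n):
            # The matrix is symmetric, so each pair is computed once.
            corr[i][j] = corr[j][i] = correlation(columns[i], columns[j])
    return corr


def print_matrix(matrix):
    for row in matrix:
        print(" ".join(f"{v:8.4f}" for v in row))


# Made-up data: column 1 is twice column 0, column 2 falls as column 0 rises.
data = [[1.0, 2.0, 9.0], [2.0, 4.0, 7.0], [3.0, 6.0, 4.0], [4.0, 8.0, 1.0]]
corr = compute_correlation_matrix(data)
print_matrix(corr)
```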

Guerry Dataset – from a csv file to a dataframe

The mrMath, mrStats, mrMatrix, mrImgUtils package includes:

  • Standard Fisher LDA classifier
  • Robust (and Fast Robust) version of this classifier
  • Incremental (and Robust) Fisher LDA classifier learning.
  • Support Vector Machines (least squares and lagrangian learning)
  • Naive Bayes
  • Simple Decision stumps
  • Radial basis function
  • C4.5 Decision trees.
  • K-means
  • Ensemble classifiers: AdaBoost, Gentle Boost, Bagging
  • Simple feed forward Neural Nets
mX5.2 with Statsmodels Console Output

Learn how to install statsmodels, a Python package for statistical modeling, using Anaconda, PyPI, source or development version.

Statsmodels Py 3.13.4

You can compute a correlation matrix in Delphi by iterating over all pairs of variables, extracting columns, and applying the Pearson correlation formula. For more advanced matrix operations or large datasets, consider using a Delphi matrix library.

 procedure ComputeCorrelationMatrix(const Data: DMatrix; var CorrMatrix: DMatrix);
 var
   i, j, k, nVars, nObs: Integer;
   colI, colJ: array of Double;
 begin
   nObs:= Length(Data);
   nVars:= Length(Data[0]);
   //SetLength(CorrMatrix, nVars, nVars);
   SetMatrixLength(CorrMatrix, nVars, nVars);
   for i:= 0 to nVars-1 do begin
     SetLength(colI, nObs);
     for j:= 0 to nObs - 1 do
       colI[j]:= Data[j][i];
     for j:= i to nVars-1 do begin
       SetLength(colJ, nObs);
       for k:= 0 to nObs - 1 do
         colJ[k]:= Data[k][j];
       CorrMatrix[i][j]:= PearsonCorrelation(colI, colJ);
       CorrMatrix[j][i]:= CorrMatrix[i][j];  // Matrix is symmetric
     end;
   end;
 end;

To transform the CSV data from file to matrix and dataframe you need 4 steps:

 S:= TStringList.Create;
 try
   //S.StrictDelimiter := True;
   S.LineBreak := #10;
   //S1.Delimiter := ',';
   S.LoadFromFile(exepath+'\examples\1417_export_dataframe.csv');
   writ('size: '+itoa(S.Count));
   SetMatrixLength(mData, 86, 6);
   TStringListToMatrix(S, mData);
 finally
   S.Free;  // release the list once the matrix is filled
 end;

Input: The StringList contains rows of data as strings, separated by commas.

Parsing: Each string is split into columns using DelimitedText of a temporary TStringList.

Matrix Population: The parsed values are stored in a 2D array (Matrix).

Output: The matrix is printed to verify the conversion.

 procedure TStringListToMatrix(strList: TStringList; var matrix: DMatrix);
 var
   i, j: Integer;
   RowData: TStringList;
 begin
   if strList.Count = 0 then Exit;
   // Create a temporary TStringList to parse each row
   RowData:= TStringList.Create;
   try
     RowData.Delimiter:= ',';  // Assuming comma-separated values
     RowData.StrictDelimiter:= True;
     //RowData.commatext
     // Resize matrix to match the TStringList dimensions
     SetLength(matrix, strList.Count);
     // Start at 1: row 0 (presumably the CSV header) is skipped and stays empty
     for i:= 1 to strList.Count - 1 do begin
       RowData.DelimitedText:= strList[i];
       //writ('debug '+itoa(rowdata.count));
       SetLength(matrix[i], RowData.Count);
       for j:= 4 to RowData.Count - 15 do begin  //slice 4-9
         matrix[i][j-4]:= StrToFloat(RowData[j]);
         //writ('debug '+flots(matrix[i][j]));
       end;
     end;
   finally
     RowData.Free;
   end;
 end;
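
The same four steps (input, parsing, matrix population, output) can be sketched in Python with only the standard library. The function name csv_lines_to_matrix, its column-range parameters, and the sample rows are assumptions for illustration; they are not taken from the Guerry file. Unlike the Delphi procedure, this version drops the header row instead of leaving an empty first row.

```python
import csv
import io


def csv_lines_to_matrix(lines, first_col, last_col):
    """Parse CSV lines into a list of float rows, skipping the header line."""
    rows = csv.reader(lines)              # input: any iterable of text lines
    next(rows, None)                      # skip the header row
    matrix = []
    for fields in rows:                   # parsing: each line is split on commas
        if not fields:                    # ignore blank lines
            continue
        # matrix population: keep only the numeric column slice
        matrix.append([float(v) for v in fields[first_col:last_col + 1]])
    return matrix


# Made-up sample with two label columns followed by two numeric columns.
sample = io.StringIO(
    "id,label,x,y\n"
    "1,A,1.5,2.5\n"
    "2,B,3.0,4.0\n"
)
matrix = csv_lines_to_matrix(sample, first_col=2, last_col=3)
for row in matrix:                        # output: print the rows to verify
    print(row)
```

Passing an open file object instead of the StringIO sample works the same way, since csv.reader only needs an iterable of lines.
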
Partial Regression Plot with Statsmodels

debug: 208-RuntimeError: CPU dispatcher tracer already initlized 865 err:20
debug: 209-RuntimeError: CPU dispatcher tracer already initlized 865 err:20
Exception: RuntimeError: CPU dispatcher tracer already initlized at 865.3134

 //# Fit an OLS regression model
 //eng.Execstring('model = smf.ols("Lottery ~ Literacy + np.log(Pop1831)", data=data).fit()');
 ExecStr('model = smf.ols("Lottery ~ Wealth + Literacy + np.log(Pop1831)", data=df).fit()');
 //# Display the summary of results
 ExecStr('print(model.summary())');
 {Notice that there is one missing observation in the Region column.
  We eliminate it using a DataFrame method provided by pandas:}
 ExecStr('df = df.dropna()');
 ExecStr('sm.graphics.plot_partregress("Lottery","Wealth", ["Region","Distance"],'+
         'data=df, obs_labels=False)');
 ExecStr('plt.show()');
sample or observation tracer obs_labels=True

Scripts at:

https://sourceforge.net/projects/maxbox5/files/EKON29/1385_DCorrelation3SeabornPyCompare2_uc.txt/download

https://sourceforge.net/projects/maxbox5/files/examples/1417_statsmodels_64_delphi_python3.12.4debug30.txt/download

Python for Delphi at maXbox statsmodels console

30 Years of Delphi – 30 Years of 4Gewinnt Prize Draw

You can find the script to run in maXbox VM

https://github.com/maxkleiner/maXbox5/blob/main/examples/170_4gewinnt_main3_2025.txt

Script: https://github.com/maxkleiner/maXbox5/blob/main/examples/4gewinnt2025.txt
Bug hunting in V5.2.9.196
The Heat Beat

Weatherapp5.4 with OpenWeathermap and OWM forecaster plus mapbox city maps

Char Statistic

The task is to count how many of the characters in ‘stones’ also appear in ‘jewels’. So, for example, if passed “aAAbbbb” for ‘stones’ and “aA” for ‘jewels’, the function should return 3.

 function CommonLetters(S1,S2: string): integer;
 {Count the letters in S1 that are found in S2}
 var
   i, j: integer;
 begin
   result:=0;
   for i:=1 to Length(S1) do
     for j:=1 to Length(S2) do
       if S1[i]=S2[j] then Inc(result);
 end;

 procedure ShowJewelsStones(Memo: TMemo; Jewels,Stones: string);
 {Show one Jewels-Stones comparison}
 begin
   // Write to the memo passed in as a parameter
   Memo.Lines.Add(Jewels+' '+Stones+' '+IntToStr(CommonLetters(Jewels,Stones)));
 end;

Both strings can contain any number of upper or lower case letters. However, in the case of ‘jewels’, all letters must be distinct.

 const PYFUNC = 'def countJewels(s, j): '+LF+
                ' return sum(x in j for x in s) ';

 with TPythonEngine.Create(Nil) do begin
   loadDLL;
   try
     ExecString('import sys');
     ExecString(PYFUNC);
     println('pyout:'+evalstr('countJewels("taAAbbbb","aA"),sys.version'));
     println('pyout:'+evalstr('countJewels("12345643","34"),sys.version'));
   except
     raiseError;
   finally
     unloadDll;
     Free;
   end;
 end;

https://rosettacode.org/wiki/Jewels_and_stones#Python

    result:= 'GEO_Code Data Map Out: '+UTF8ToString(lStream.ReadString(lStream.Size));

mX52 SynEdit Upgrade
mX5.2.9.198 Upgrade
EKON History

The slides for the session:

maxbox_starter157.pdf