feat: Support log for Decimal128 and Decimal256 #17023

theirix · 2025-08-03T11:27:59Z

Which issue does this PR close?

Closes feat: support decimal for math functions: log #17140.

Rationale for this change

Add Decimal128 and Decimal256 support to the log UDF.
It is the most generic logarithm function: it accepts an explicit base and defaults to log10, which makes it a good first candidate for large decimals. The calculate_binary_math helper could also simplify many other UDFs (a subject for future PRs).

Since decimals only support integer logarithms, the result is rounded and then converted back to a float. The computation itself is still done at the i128/i256 level rather than by rounding the input parameters as before.
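To illustrate the idea of taking the logarithm at the integer level and converting to a float only at the end, here is a minimal sketch. The names int_log and decimal_log are hypothetical, and the floor-style rounding and the handling of scale are assumptions made for illustration, not the code in this PR.

```rust
/// Integer logarithm: the largest `n` such that `base^n <= value`.
/// Returns `None` for non-positive values or bases below 2.
fn int_log(value: i128, base: i128) -> Option<u32> {
    if value <= 0 || base < 2 {
        return None;
    }
    let mut remaining = value;
    let mut exponent = 0u32;
    // Repeated integer division never leaves the i128 domain.
    while remaining >= base {
        remaining /= base;
        exponent += 1;
    }
    Some(exponent)
}

/// Logarithm of a decimal given as (unscaled value, scale).
/// Assumption: fractional digits are dropped before taking the logarithm,
/// so values below 1 yield `None` in this sketch.
fn decimal_log(unscaled: i128, scale: u32, base: i128) -> Option<f64> {
    let divisor = 10i128.checked_pow(scale)?;
    // The float conversion happens only on the small integer result.
    int_log(unscaled / divisor, base).map(|n| n as f64)
}
```

For example, decimal_log(12345, 2, 10) treats the input as 123.45 and returns Some(2.0), while the 36-digit value 10^35 is handled without ever passing through f64.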

Also, if numbers are parsed as floats, precision is lost to floating-point handling, so the majority of tests target parse_float_as_decimal=true, as in #14612. Otherwise, the log result could differ by one or two due to rounding (see the regression SLT).
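The precision loss mentioned here can be seen without DataFusion at all. The sketch below (the helper name loses_precision_as_f64 is hypothetical) checks whether an integer survives a round trip through f64, which has a 53-bit significand.

```rust
/// Returns true when converting `value` to f64 and back changes it,
/// i.e. the value cannot be represented exactly as a double.
fn loses_precision_as_f64(value: i128) -> bool {
    (value as f64) as i128 != value
}
```

Small powers of ten such as 10^22 round-trip exactly, but 10^23 and 2^53 + 1 do not, so a large decimal literal parsed as a float is already perturbed before log sees it.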

Notably, math for 256-bit values is still missing. Arrow's i256 type uses the BigInt implementation, which could provide at least log10, but decimal arithmetic could also be extended in Arrow itself.

What changes are included in this PR?

Support for decimals
A generic function for binary functors simplifying math UDF development
Additional unit and SLT tests
A minor follow-up fix to feat: Add ScalarValue::{new_one,new_zero,new_ten,distance} support for Decimal128 and Decimal256 #16831 for zero scale
Updated some tests to reflect default float64 calculation
[chore] Allow env logging in functions crate
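As a rough picture of what a generic helper for binary functors buys a math UDF, here is a simplified sketch over plain slices. It is not the signature of calculate_binary_math from this PR: the name binary_math, the use of Option for nulls and the String error type are all assumptions chosen to keep the example free of Arrow types.

```rust
/// Applies a fallible binary function element-wise over two nullable columns.
/// A null (None) on either side produces a null in the output.
fn binary_math<L, R, O, F>(
    left: &[Option<L>],
    right: &[Option<R>],
    op: F,
) -> Result<Vec<Option<O>>, String>
where
    L: Copy,
    R: Copy,
    F: Fn(L, R) -> Result<O, String>,
{
    if left.len() != right.len() {
        return Err(format!(
            "length mismatch: {} vs {}",
            left.len(),
            right.len()
        ));
    }
    left.iter()
        .zip(right.iter())
        .map(|(l, r)| match (l, r) {
            // Both present: run the operation and propagate its error, if any.
            (Some(l), Some(r)) => op(*l, *r).map(Some),
            // Any null input yields a null output.
            _ => Ok(None),
        })
        .collect()
}
```

With a helper of this shape, a UDF only supplies the per-element closure, while null propagation and error handling live in one place.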

Please note that the more specialised functions log2, log10 and ln are not handled by LogFunc. They could be migrated to this UDF later by providing the base value explicitly.
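The migration idea rests on the change-of-base identity log_b(x) = ln(x) / ln(b). A minimal f64 sketch follows; the function names are hypothetical and the argument order (base first) mirrors the SQL form log(base, value).

```rust
/// Generic logarithm with an explicit base, via the change-of-base identity.
fn log_with_base(base: f64, value: f64) -> f64 {
    value.ln() / base.ln()
}

/// The specialised functions expressed through the generic one.
fn log2_via_log(value: f64) -> f64 {
    log_with_base(2.0, value)
}

fn log10_via_log(value: f64) -> f64 {
    log_with_base(10.0, value)
}

fn ln_via_log(value: f64) -> f64 {
    log_with_base(std::f64::consts::E, value)
}
```

Results agree with the dedicated f64 methods up to floating-point rounding, so comparisons should use a tolerance rather than exact equality.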

Are these changes tested?

Unit tests
Regression SLT tests
Manual invocation of datafusion-cli
Manual comparison of results to other large decimal math implementations (Python, WolframAlpha)

Are there any user-facing changes?

No, except the precision of log calculation can increase for float inputs (since f64 is used). For decimals, results could become more accurate since no float downcasting is involved.

Signed-off-by: theirix <theirix@gmail.com>

Jefffrey · 2025-09-02T08:04:39Z

datafusion/functions/src/math/log.rs

+ Numeric(1),
+ Numeric(2),
+ Exact(vec![DataType::Float32]),
+ Exact(vec![DataType::Float64]),
+ Exact(vec![DataType::Float32, DataType::Float32]),
+ Exact(vec![DataType::Float64, DataType::Float64]),
+ Exact(vec![
+ DataType::Int64,
+ DataType::Decimal128(DECIMAL128_MAX_PRECISION, 0),
+ ]),
+ Exact(vec![
+ DataType::Float32,
+ DataType::Decimal128(DECIMAL128_MAX_PRECISION, 0),
+ ]),
+ Exact(vec![
+ DataType::Float64,
+ DataType::Decimal128(DECIMAL128_MAX_PRECISION, 0),
+ ]),
+ Exact(vec![
+ DataType::Int64,
+ DataType::Decimal256(DECIMAL256_MAX_PRECISION, 0),
+ ]),
+ Exact(vec![
+ DataType::Float32,
+ DataType::Decimal256(DECIMAL256_MAX_PRECISION, 0),
+ ]),
+ Exact(vec![
+ DataType::Float64,
+ DataType::Decimal256(DECIMAL256_MAX_PRECISION, 0),
+ ]),


Are the Exact(...) signatures superseded by Numeric(1) and Numeric(2), considering that integers, floats and decimals are all numeric types?

theirix · 2025-09-10

Yes, Numeric is a class of types including integers, floats and decimals.

The reason I included the third signature onwards is the inability of Numeric(1) to accept a null value; we need to support select log(null); for the SLT and for PostgreSQL feature parity.

If you have an idea how to make it work with nulls, I'd appreciate the help. I don't like the current list of signatures.

Jefffrey · 2025-09-11

Hmm, good point. It also seems to make queries like this fail:

1. query failed: DataFusion error: Error during planning: Internal error: Function 'log' failed to match any signature, errors:
Error during planning: Function 'log' expects 1 arguments but received 2,
Error during planning: For function 'log' Decimal128(3, 1) and Decimal256(76, 0) are not coercible to a common numeric type.
This issue was likely caused by a bug in DataFusion's code. Please help us to resolve this by filing a bug report in our issue tracker: https://github.com/apache/datafusion/issues
No function matches the given name and argument types 'log(Decimal128(3, 1), Decimal256(76, 0))'. You might need to add explicit type casts.
Candidate functions:
log(Numeric(1))
log(Numeric(2))
[SQL] select log(10.0, 100000000000000000000000000000000000::decimal(76,0));
at /Users/jeffrey/Code/datafusion/datafusion/sqllogictest/test_files/decimal.slt:836

I'll try to look at the coercion code a bit and understand what's going on here 🤔

(I'm assuming Numeric(2) actually means coerce to a common type)

theirix · 2025-09-11

I think adding a null coercion for numerics could help, because it worked with float32 in main. Null handling is hard because we have to conform with PostgreSQL and/or the standard.

Jefffrey · 2025-09-11

I think we can safely remove these signatures:

Exact(vec![DataType::Float32]),
Exact(vec![DataType::Float64]),

They are indeed superseded by Numeric(1). Numeric(2) will need to remain alongside the rest of the signatures, because it only guarantees coercing all the arguments to a common numeric type, which I don't think we want for every invocation of log. This is the part I misunderstood; I'll perhaps look into raising a PR to clarify some of the documentation later.

datafusion/functions/src/math/log.rs

Refactoring decimal128 conversions Co-authored-by: Jeffrey Vo <jeffrey.vo.australia@gmail.com>

datafusion/functions/src/math/log.rs

datafusion/functions/src/utils.rs

Jefffrey

LGTM 👍

We can raise a separate issue for 256-bit support (the values that can't be cast to 128-bit).
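The distinction between 256-bit values that fit in 128 bits and those that do not can be sketched as follows. Wide256 is a hypothetical stand-in for a 256-bit integer stored as a signed high half and an unsigned low half; the error text and the function log_input are likewise assumptions for illustration, not the PR's code.

```rust
/// Hypothetical 256-bit two's-complement integer: `high` holds the upper
/// 128 bits (signed), `low` holds the lower 128 bits (unsigned).
#[derive(Clone, Copy, Debug, PartialEq)]
struct Wide256 {
    high: i128,
    low: u128,
}

impl Wide256 {
    /// Widens an i128 by sign-extending it into the high half.
    fn from_i128(v: i128) -> Self {
        Wide256 {
            high: if v < 0 { -1 } else { 0 },
            low: v as u128,
        }
    }

    /// Narrows to i128 when the high half is only sign extension.
    fn to_i128(self) -> Option<i128> {
        let low_signed = self.low as i128;
        let sign_extension = if low_signed < 0 { -1 } else { 0 };
        (self.high == sign_extension).then_some(low_signed)
    }
}

/// Values that fit in 128 bits take the existing path; anything wider is
/// reported as unsupported instead of being silently truncated.
fn log_input(value: Wide256) -> Result<i128, String> {
    value
        .to_i128()
        .ok_or_else(|| "not yet implemented: log of values wider than 128 bits".to_string())
}
```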

Jefffrey · 2025-09-14T02:16:44Z

Created #17555 just so we have something to track the NotYetImplemented item

theirix · 2025-09-14T12:34:17Z

Thank you, @Jefffrey !

theirix added 4 commits August 3, 2025 11:27

Enable env_logger for datafusion-functions crate

a597e3a

Fixup ScalarValue decimal constructors

6a71c64

Support decimals in log UDF

62fcf93

Add sqllogic test for log on Decimals

74e8270

github-actions bot added the sqllogictest (SQL Logic Tests (.slt)), common (Related to common crate) and functions (Changes to functions implementation) labels Aug 3, 2025

theirix added 7 commits August 3, 2025 13:31

Fix test for scalar new_ten

e693f12

Loosen requirements on a return type

9dec4cf

Remove extra logging

ae26d5c

Improve handling scale and mix of base/value types

e296b8a

Signed-off-by: theirix <theirix@gmail.com>

Format

d6c463f

Merge branch 'main' into log-decimal

b034768

Adjust ScalarFunctionArgs construction

c2559ba

theirix marked this pull request as ready for review August 10, 2025 07:23

Jefffrey reviewed Sep 2, 2025

theirix and others added 3 commits September 8, 2025 23:59

Apply suggestions from code review

af40af9

Refactoring decimal128 conversions Co-authored-by: Jeffrey Vo <jeffrey.vo.australia@gmail.com>

Update tests

756398b

Update tests and SLT

a27cbb9

Jefffrey reviewed Sep 11, 2025

datafusion/functions/src/math/log.rs (resolved)

datafusion/functions/src/utils.rs (resolved)

theirix added 3 commits September 11, 2025 18:13

Improve test for decimal128_to_i128

75ed4ac

Improve type signature for log UDF

186c004

Fix clippy

33b2a15

Jefffrey approved these changes Sep 12, 2025

Merge branch 'main' into log-decimal

d71fde4

Jefffrey merged commit 572c204 into apache:main Sep 14, 2025
30 checks passed

Jefffrey mentioned this pull request Sep 14, 2025

Native decimal 32/64/256 bit support for log #17555

Open
