Pulse · apache/spark · GitHub

July 3, 2025 – July 10, 2025

Overview

33 Active pull requests
- 0 Merged pull requests
- 33 Open pull requests

0 Active issues
- 0 Closed issues
- 0 New issues

33 Pull requests opened by 28 people

[WIP][SPARK-52646][PS] Avoid CAST_INVALID_INPUT of `__eq__` in ANSI mode
#51370 opened Jul 4, 2025
[WIP] [SPARK-52689][SQL] Send DML Metrics to V2Write
#51377 opened Jul 4, 2025
[SPARK-52659][SQL]Misleading modulo error message in ansi mode
#51378 opened Jul 5, 2025
[SPARK-52545][SQL][DOCS] Update string literal docs for quote escaping rules
#51379 opened Jul 5, 2025
[SPARK-52617][SQL]Cast TIME to/from TIMESTAMP_NTZ
#51381 opened Jul 5, 2025
[SPARK-52696][SQL] Strip `__is_duplicate` metadata after analysis
#51389 opened Jul 7, 2025
approx_top_k_combine
#51393 opened Jul 7, 2025
[IN PROGRESS] Support getting pod state using Informers/Listers
#51396 opened Jul 8, 2025
[SPARK-52751][PYTHON][CONNECT] Don't eagerly validate column name in `dataframe['col_name']`
#51400 opened Jul 8, 2025
[SPARK-52716][SDP][SQL] Remove comment from Flow trait and references
#51406 opened Jul 9, 2025
[SPARK-52720][PS] Fix float32 type widening in `add`/`radd` under ANSI
#51408 opened Jul 9, 2025
[DRAFT] Parameter markers in DDL.
#51410 opened Jul 9, 2025
[SPARK-52722][CORE] Deprecate JdbcRDD class
#51415 opened Jul 9, 2025
[SPARK-52724][SQL] Enhance broadcast join OOM error handling with SHUFFLE_MERGE hint support
#51417 opened Jul 9, 2025
[SPARK-52729][SQL] Add GENERAL_TABLE v2 table capacity
#51419 opened Jul 9, 2025
[SPARK-52730][SQL] Store underlying driver and database version in JDBCRDD
#51421 opened Jul 9, 2025
[SPARK-46941][SQL][3.5] Can't insert window group limit node for top-k computation if contains SizeBasedWindowFunction
#51422 opened Jul 9, 2025
[SPARK-52725][CORE] Delay resource profile manager initialization until plugin is loaded
#51424 opened Jul 9, 2025
[SPARK-52750][TESTS] Add eventually in flaky test LocalTableScanExec
#51425 opened Jul 9, 2025
[SPARK-52737][CORE] Pushdown predicate and number of apps to FsHistoryProvider when listing applications
#51428 opened Jul 9, 2025
[SPARK-52740] [SS] Fix NPE in HDFSBackedStateStoreProvider accessing StateStoreConf.sqlConf when checkpoint format version is >=2
#51431 opened Jul 9, 2025
[SPARK-52741][SQL] RemoveFiles ShuffleCleanup mode doesnt work with non-adaptive execution
#51432 opened Jul 9, 2025
[SPARK-52727][SQL] Refactor Window resolution in order to reuse it in single-pass analyzer
#51433 opened Jul 10, 2025
[SPARK-52745][SQL] Ensure one of the `schema` and `columns` in the Table interface is implemented and `columns` is preferable
#51434 opened Jul 10, 2025
[SPARK-52752][CORE] Check if tasks on executor are finished when killing executor
#51437 opened Jul 10, 2025
[SPARK-52753] Make parseDataType binary compatible with previous versions
#51438 opened Jul 10, 2025
[SPARK-52726][SQL] Normalize project list under Window
#51439 opened Jul 10, 2025
[SPARK-52735][FOLLOWUP] Drop created UDF UDFs in SQL query tests
#51440 opened Jul 10, 2025
[WIP][SQL] Support TIME subtract
#51441 opened Jul 10, 2025
[SPARK-52757][CONNECT] Rename "plan" field in DefineFlow to "relation"
#51442 opened Jul 10, 2025
[SPARK-52760][PS] Fix float32 type widening in `sub`/`rsub` under ANSI
#51444 opened Jul 10, 2025
[SPARK-52759][SDP][SQL] Throw exception if pipeline has no tables or persisted views
#51445 opened Jul 10, 2025
[WIP][SQL] Clarify schema mismatch types in insertInto error
#51446 opened Jul 10, 2025

31 Unresolved conversations

Sometimes conversations happen on old items that aren’t yet closed. Here is a list of all the Issues and Pull Requests with unresolved conversations.

[SPARK-52187][SQL] Introduce Join pushdown for DSv2
#50921 commented on Jul 10, 2025 • 114 new comments
[SPARK-52582][SQL] Improve the memory usage of XML parser
#51287 commented on Jul 8, 2025 • 54 new comments
[SPARK-47547] BloomFilter fpp degradation
#50933 commented on Jul 8, 2025 • 19 new comments
[SPARK-52495][SQL] Allow including partition columns in the single variant column
#51206 commented on Jul 8, 2025 • 15 new comments
[SPARK-52588][SQL] Approx_top_k: accumulate and estimate
#51308 commented on Jul 9, 2025 • 8 new comments
[SPARK-48359][SQL] Built-in functions for Zstd compression and decompression
#46672 commented on Jul 8, 2025 • 8 new comments
[SPARK-52565] [SQL] Enforce ordinal resolution before other sort order expressions
#51268 commented on Jul 10, 2025 • 5 new comments
Enable -Xsource:3 compiler flag
#50474 commented on Jul 10, 2025 • 4 new comments
[SPARK-42746][SQL][FIXUP] Fix optimizer failure for SortOrder in the LISTAGG function
#51117 commented on Jul 10, 2025 • 1 new comment
[DRAFT][PYTHON] Improve Python UDF Arrow Serializer Performance
#51225 commented on Jul 7, 2025 • 1 new comment
[SPARK-52577][SDP] Add tests for Declarative Pipelines DatasetManager with Hive catalog
#51283 commented on Jul 10, 2025 • 1 new comment
[MINOR][DOCS] Updated the docstring of DataStreamWriter.foreach() method
#51316 commented on Jul 9, 2025 • 0 new comments
[SPARK-52638][SQL] Allow preserving Hive-style column order to be configurable
#51342 commented on Jul 4, 2025 • 0 new comments
[SPARK-52598][DOCS] Reorganize Spark Connect programming guide
#51305 commented on Jul 4, 2025 • 0 new comments
[SPARK-52640][SDP] Propagate Python Source Code Location
#51344 commented on Jul 7, 2025 • 0 new comments
[SPARK-52669][PySpark]Improvement PySpark choose pythonExec in cluster yarn client mode
#51357 commented on Jul 8, 2025 • 0 new comments
[WIP][PYTHON] Arrow UDF for aggregation
#51292 commented on Jul 8, 2025 • 0 new comments
[SPARK-52673][CONNECT][CLIENT] Add grpc RetryInfo handling to Spark Connect retry policies
#51363 commented on Jul 8, 2025 • 0 new comments
[CORE] Let LocalSparkContext clear active context in beforeAll
#51284 commented on Jul 7, 2025 • 0 new comments
[SPARK-52544][SQL] Allow configuring Json datasource string length limit through SQLConf
#51235 commented on Jul 7, 2025 • 0 new comments
[WIP][SPARK-51224][BUILD] Test Maven 4
#51230 commented on Jul 9, 2025 • 0 new comments
[SPARK-52444][SQL][CONNECT] Add support for Variant/Char/Varchar Literal
#51215 commented on Jul 9, 2025 • 0 new comments
[SPARK-52486][SQL] Fix Spark Driver Planning OOM issue due to unworthwhile dpp expression before Execution when enabling AQE
#51184 commented on Jul 8, 2025 • 0 new comments
[SPARK-51168][BUILD] Test Hadoop 3.4.2
#51127 commented on Jul 4, 2025 • 0 new comments
[SPARK-52012][CORE][SQL] Restore IDE Index with type annotations
#50798 commented on Jul 8, 2025 • 0 new comments
[WIP][SPARK-52011][SQL] Reduce HDFS NameNode RPC on vectorized Parquet reader
#50765 commented on Jul 4, 2025 • 0 new comments
[SPARK-51883][DOCS][PYTHON] Python Data Source user guide for filter pushdown
#50684 commented on Jul 9, 2025 • 0 new comments
[SPARK-51647][INFRA] Add a job to guard REPLs: spark-sql and spark-shell
#50423 commented on Jul 10, 2025 • 0 new comments
[SPARK-51637][PYTHON] Implement createColumnarReader for Python Data Source
#50414 commented on Jul 10, 2025 • 0 new comments
[SPARK-51359][CORE][SQL] Set INT64 as the default timestamp type for Parquet files
#50215 commented on Jul 7, 2025 • 0 new comments
[SPARK-49547][SQL][PYTHON] Add iterator of `RecordBatch` API to `applyInArrow`
#49005 commented on Jul 4, 2025 • 0 new comments