@trueleo (Contributor) commented Dec 15, 2023

Fixes #575.

Description

DataFusion's PartitionedFile struct relies on a correct file URL and file size. The file size is used to calculate the offset for reading the Parquet footer, so a wrong file size can lead to query issues.

This PR aims to fix that issue.
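
To illustrate why the file size matters, here is a minimal sketch of how a reader locates the Parquet footer. It relies only on the Parquet file layout (the file ends with a 4-byte little-endian metadata length followed by the magic bytes `PAR1`). The function names `metadata_range` and `fake_parquet` are hypothetical and are not part of DataFusion or of this PR; the real reader does more work than this.

```rust
use std::ops::Range;

/// Every Parquet file ends with a 4-byte little-endian metadata length
/// followed by the 4-byte magic "PAR1".
const FOOTER_LEN: u64 = 8;
const MAGIC: &[u8; 4] = b"PAR1";

/// Hypothetical helper: given the object's bytes and the size the caller
/// *claims* it has, return the byte range of the footer metadata.
/// The claimed size is the only anchor used to find the footer.
fn metadata_range(data: &[u8], claimed_size: u64) -> Result<Range<u64>, String> {
    if claimed_size < FOOTER_LEN || claimed_size > data.len() as u64 {
        return Err(format!("claimed size {} is out of bounds", claimed_size));
    }
    let tail_start = claimed_size - FOOTER_LEN;
    let tail = &data[tail_start as usize..claimed_size as usize];
    if !tail.ends_with(MAGIC) {
        return Err("footer magic not found at the claimed end of file".to_string());
    }
    let meta_len = u32::from_le_bytes([tail[0], tail[1], tail[2], tail[3]]) as u64;
    // The metadata sits immediately before the 8-byte footer.
    let meta_start = tail_start
        .checked_sub(meta_len)
        .ok_or_else(|| "metadata length exceeds file size".to_string())?;
    Ok(meta_start..tail_start)
}

/// Hypothetical helper: build a fake file with the Parquet framing
/// (leading magic, body, metadata, metadata length, trailing magic).
fn fake_parquet(body: &[u8], metadata: &[u8]) -> Vec<u8> {
    let mut out = Vec::new();
    out.extend_from_slice(MAGIC);
    out.extend_from_slice(body);
    out.extend_from_slice(metadata);
    out.extend_from_slice(&(metadata.len() as u32).to_le_bytes());
    out.extend_from_slice(MAGIC);
    out
}
```

With a 25-byte file built by `fake_parquet(b"rowgroups", b"meta")`, passing the true size returns the range that holds `meta`, while a size that is off by even one byte makes the lookup fail, because the reader no longer finds the magic bytes where it expects them.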


This PR has:

  • been tested to ensure log ingestion and log query work.
  • added comments explaining the "why" and the intent of the code wherever it would not be obvious for an unfamiliar reader.
  • added documentation for new or modified features or behaviors.
@nitisht nitisht requested review from nitisht and theteachr December 15, 2023 11:58
@nitisht (Member) left a comment

LGTM & Tested

@nitisht nitisht merged commit 7c83641 into parseablehq:main Dec 15, 2023
@github-actions github-actions bot locked and limited conversation to collaborators Dec 15, 2023
@trueleo trueleo deleted the fix_file_size branch December 19, 2023 06:30