
Error reading job file for Dataflow Flex Template with "Unable to open template file" error #9153

@belwalshubham

Description


Hi @beccasaurus @nicain @lukesneeringer @hfwang, I'm encountering an error while running a custom Dataflow job from a Flex Template on Google Cloud Platform (GCP).
I created a custom pipeline from a custom template and provided a JSON metadata file. The pipeline launched successfully and was scheduled to run. However, at some point during execution, an error occurred and the pipeline failed.
Environment:
Apache Beam version: apache-beam[gcp]==2.44.0

The error message is as follows:

```
Failed to read the job file: gs://dataflow-staging-us-central1-713358881388/staging/template_launches/2023-02-20_18_27_45-8498022740013370621/job_object with error message: (c20b1cad16245ca5): Unable to open template file: gs://dataflow-staging-us-central1-713358881388/staging/template_launches/2023-02-20_18_27_45-8498022740013370621/job_object..
```
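As a first sanity check on a "Unable to open template file" failure, it can help to confirm the object in the error message actually exists in the bucket. Below is a minimal, pure-Python sketch (the helper name `split_gcs_uri` is my own, not part of any GCP SDK) that splits a `gs://` URI into bucket and object path, which can then be passed to `gsutil stat` or the `google-cloud-storage` client:

```python
def split_gcs_uri(uri: str) -> tuple:
    """Split a gs://bucket/path/to/object URI into (bucket, object_path)."""
    if not uri.startswith("gs://"):
        raise ValueError("Not a GCS URI: %s" % uri)
    bucket, _, obj = uri[len("gs://"):].partition("/")
    return bucket, obj

# The failing path from the error message above:
bucket, obj = split_gcs_uri(
    "gs://dataflow-staging-us-central1-713358881388/staging/"
    "template_launches/2023-02-20_18_27_45-8498022740013370621/job_object"
)
print(bucket)  # dataflow-staging-us-central1-713358881388
print(obj)     # staging/template_launches/.../job_object
```

With `bucket` and `obj` in hand, `gsutil stat gs://<bucket>/<obj>` (run with the same service account as the job) shows whether the object is missing or merely unreadable due to permissions.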

I have also verified that the options for the job are set correctly. Here's an example of how I'm setting the options using the PipelineOptions class in Python:

```python
pipeline_options = PipelineOptions.from_dictionary({
    'runner': 'DataflowRunner',
    'project': 'testcircle-350611',
    'region': 'us-central1',
    'staging_location': 'gs://dataflow-staging-us-central1-713358881388/staging/',
    'temp_location': 'gs://dataflow-staging-us-central1-713358881388/tmp/',
    'template_location': 'gs://dataflow-staging-us-central1-713358881388/staging/template_launches/',
    'service_account_email': 'xxxx-compute@developer.gserviceaccount.com'
})
```
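One detail worth double-checking: `template_location` above ends with a trailing `/`. If (as with classic templates) that option is expected to name a single GCS object rather than a directory, a trailing slash could explain why the launcher cannot open the file. A small sketch of such a local check (the `check_gcs_options` helper and its warning messages are my own, not an official Beam validator, and the trailing-slash rule is an assumption to verify against the docs):

```python
REQUIRED_GCS_OPTIONS = ("staging_location", "temp_location", "template_location")

def check_gcs_options(options: dict) -> list:
    """Return a list of warning strings about GCS path options (informal sketch)."""
    warnings = []
    for key in REQUIRED_GCS_OPTIONS:
        value = options.get(key)
        if value is None:
            warnings.append("%s is missing" % key)
        elif not value.startswith("gs://"):
            warnings.append("%s does not look like a GCS path: %s" % (key, value))
        elif key == "template_location" and value.endswith("/"):
            # Assumption: template_location should name one object, not a "directory".
            warnings.append("%s ends with '/': %s" % (key, value))
    return warnings
```

Running this over the dictionary from the issue would flag the `template_location` value, which may be worth a look.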

Here is my JSON file:

"resources": { "sdkPipelineOptions": { "description": "Apache Beam SDK pipeline options", "properties": { "saveMainSession": "true", "runner": "DataflowRunner", "project": "testcircle-350611", "region": "us-central1", "staging_location": "gs://dataflow-staging-us-central1-713358881388/staging/", "temp_location": "gs://dataflow-staging-us-central1-713358881388/tmp/", "template_location": "gs://dataflow-staging-us-central1-713358881388/staging/template_launches/", "service_account_email": "-xxxxcompute@developer.gserviceaccount.com" } } }, 

And here is my Dockerfile:

```dockerfile
RUN pip install --upgrade pip
RUN apt-get update && apt-get install -y default-jdk postgresql-client
ARG WORKDIR=/dataflow/template
RUN mkdir -p ${WORKDIR}
WORKDIR ${WORKDIR}
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY etl.py .
ENV FLEX_TEMPLATE_PYTHON_REQUIREMENTS_FILE="${WORKDIR}/requirements.txt"
ENV FLEX_TEMPLATE_PYTHON_PY_FILE="${WORKDIR}/etl.py"
ENTRYPOINT ["/opt/google/dataflow/python_template_launcher"]
```

Metadata

Labels

- priority: p2 (Moderately-important priority. Fix may not be included in next release.)
- samples (Issues that are directly related to samples.)
- triage me (I really want to be triaged.)
- type: bug (Error or flaw in code with unintended results or allowing sub-optimal usage patterns.)
