Skip to content

Experienced '504 Deadline Exceeded' error when tried to delete large number rows by partitioned dml #199

@FengnaLiu

Description

@FengnaLiu

There are 1 billion old records need to be cleaned. According the documents, It seems Partitioned DML should be the good choice. I choose java client because we can set timeout by using setPartitionedDmlTimeout(Duration.ofHours(24L)). However, I still experienced the '504 Deadline Exceeded' after a few hours that shorter than 24 hours I set. Why setPartitionedDmlTimeout did not work? How should I solve the problem? Is it not possible to delete so large number of data by using Partitioned DML.

Following are the details.

Environment details

  1. Spanner
    3 nodes, 2 billion records in a table and 1 billion old records need to be cleaned
  2. OS type and version: MacOs version 10.15.3
  3. Java version: java 8
  4. spanner version(s): the latest version

Steps to reproduce

  1. Write 2 billion records with timestamp in to the spanner table with dataflow
  2. Run a clean job to delete the old 1 billion records with Partitioned DML by java client
  3. Set the timeout to be 24 hours by using setPartitionedDmlTimeout(Duration.ofHours(24L))
  4. Run the java code

Code example

SpannerOptions options = SpannerOptions.newBuilder().setPartitionedDmlTimeout(Duration.ofHours(24L)).build(); Spanner spanner = options.getService(); try { DatabaseId db = DatabaseId.of(options.getProjectId(), args[0], args[1]); // [END init_client] //This will return the default project id based on the environment. String clientProject = spanner.getOptions().getProjectId(); if (!db.getInstanceId().getProject().equals(clientProject)) { System.err.println( "Invalid project specified. Project in the database id should match the" + "project name set in the environment variable GOOGLE_CLOUD_PROJECT. Expected: " + clientProject); } // [START init_client] DatabaseClient dbClient = spanner.getDatabaseClient(db); SimpleDateFormat formatter = new SimpleDateFormat("dd/MM/yyyy hh:mm:ss.SSS"); String sql="DELETE FROM MyTable WHERE updateTimestamp<'2020-05-02T03:00:00Z'"; Date currentDate = new Date(); System.out.println(String.format("Delete Start:%s",formatter.format(currentDate))); long lStartTime = Instant.now().toEpochMilli(); deleteUsingPartitionedDml(dbClient,sql); long lEndTime = Instant.now().toEpochMilli(); long elaspedTime = lEndTime - lStartTime; System.out.println("Elapsed time in : " + elaspedTime/1000); } finally { spanner.close(); } // [END init_client] System.out.println("Closed client"); } private static void deleteUsingPartitionedDml(DatabaseClient dbClient,String sql) { long rowCount = dbClient.executePartitionedUpdate(Statement.of(sql)); System.out.printf("%d records deleted.\n", rowCount); }

Metadata

Metadata

Assignees

Labels

api: spannerIssues related to the googleapis/java-spanner API.priority: p1Important issue which blocks shipping the next release. Will be fixed prior to next release.type: bugError or flaw in code with unintended results or allowing sub-optimal usage patterns.

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions