Closed
Labels
api: bigquerystorage (Issues related to the googleapis/java-bigquerystorage API.)
Description
We use the executeSelect API to run a SQL query and read the results from BigQuery. Based on this article, we expected good read speed.
However, reading data with the executeSelect API is extremely slow.
Reading 100_000 rows takes 23930 ms (about 4,180 rows per second).
Profiling showed no obvious hotspot where most of the time is spent.
Are there any recent changes that might cause performance degradation for such an API?
Do you have a benchmark to understand what performance we should expect?
Thanks!
Environment details
- com.google.cloud:google-cloud-bigquery:2.43.3
- macOS Sonoma, M1
- Java version: 17
Code example
Mono.fromCallable { bigQueryOptionsBuilder.build().service }
    .flatMap { context ->
        val connectionSettings = ConnectionSettings.newBuilder()
            .setRequestTimeout(10L)
            .setUseReadAPI(true)
            .setMaxResults(1000)
            .setNumBufferedRows(1000)
            .setUseQueryCache(true)
            .build()
        val connection = context.createConnection(connectionSettings)
        val bqResult = connection.executeSelect(sql)
        val result = Flux.usingWhen(
            Mono.just(bqResult.resultSet),
            { resultSet -> resultSet.toFlux(bqResult.schema) },
            { _ -> Mono.fromRunnable<Unit> { connection.close() } }
        )
        Mono.just(Data(result, bqResult.schema.toSchema()))
    }
...
fun ResultSet.toFlux(schema: Schema): Flux<DataRecord> {
    return Flux.generate<DataRecord> { sink ->
        if (next()) {
            sink.next(toDataRecord(schema))
        } else {
            sink.complete()
        }
    }
}
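Since profiling showed no clear hotspot, it may help to time the raw row iteration separately from the Reactor pipeline. The following is only a minimal sketch: `DrainStats` and `drainAndTime` are hypothetical helper names, not part of the BigQuery client, and the in-memory counter in `main` is a stand-in for a real result set.

```kotlin
import kotlin.system.measureNanoTime

// Outcome of draining a row source: rows read and wall-clock time spent.
data class DrainStats(val rows: Long, val elapsedMs: Double) {
    // Rows per second; 0.0 when no measurable time elapsed.
    val rowsPerSecond: Double
        get() = if (elapsedMs > 0.0) rows * 1000.0 / elapsedMs else 0.0
}

// Calls `next` until it returns false, counting rows and timing the whole loop.
// `next` is a stand-in for the result set's next() call (assumption).
fun drainAndTime(next: () -> Boolean): DrainStats {
    var rows = 0L
    val nanos = measureNanoTime {
        while (next()) rows++
    }
    return DrainStats(rows, nanos / 1_000_000.0)
}

fun main() {
    // Hypothetical in-memory source of 100_000 rows, used only to exercise the harness.
    var remaining = 100_000
    val stats = drainAndTime { remaining-- > 0 }
    println("rows=${stats.rows} elapsedMs=${stats.elapsedMs} rows/s=${stats.rowsPerSecond}")
}
```

Assuming the result set exposes `next()` as the `toFlux` helper above relies on, something like `drainAndTime { bqResult.resultSet.next() }` would give a baseline without `Flux.generate` or `toDataRecord` in the path. Comparing that number with the end-to-end 23930 ms should indicate whether the time goes to fetching rows or to the downstream conversion.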