-
- Notifications
You must be signed in to change notification settings - Fork 3.7k
Description
As-is behavior:
First-time execution of Dapper's QueryAsync<T> extension method consistently takes over 1 minute to return when supplied with a parameterless Oracle SQL query that returns 21497 rows with 31 values each. The same query when executed in SQL Developer on the same machine with the same connection string takes 12s (In SQL Developer, spooling to file was used to avoid overly optimistic timing results resulting from pagination.)
Expected behavior:
Dapper's QueryAsync<T> should return in roughly the same amount of time as the native query; no longer than 2X as a ballpark.
Environment:
- Processor: Intel Xeon W-2102 2.9GHz
- Memory: 64GB
- OS: Windows 10 Enterprise 22H2
- .NET: v7.0
- Oracle.ManagedDataAccess.Core: v3.21.90
- Dapper: v2.0.143
Code:
Argument values to QueryAsync:
- T: See sanitized POCO below.
- sql: see sanitized query below.
- param: null
- transaction: null
- commandTimeout: 600
- commandType: Text
Out of a total 1:06 execution time, the following block of code in SqlMapper.Async takes 1:02:
while (await reader.ReadAsync(cancel).ConfigureAwait(false)) { object val = func(reader); buffer.Add(GetValue<T>(reader, effectiveType, val)); }Stepping through this block via symbol file indicates no observable performance lag. No branch other than the first is ever hit in the GetValue<T> method:
[MethodImpl(MethodImplOptions.AggressiveInlining)] private static T GetValue<T>(DbDataReader reader, Type effectiveType, object val) { if (val is T tVal) { return tVal; } else if (val == null && (!effectiveType.IsValueType || Nullable.GetUnderlyingType(effectiveType) != null)) { return default; } else if (val is Array array && typeof(T).IsArray) { var elementType = typeof(T).GetElementType(); var result = Array.CreateInstance(elementType, array.Length); for (int i = 0; i < array.Length; i++) result.SetValue(Convert.ChangeType(array.GetValue(i), elementType, CultureInfo.InvariantCulture), i); return (T)(object)result; } else { try { var convertToType = Nullable.GetUnderlyingType(effectiveType) ?? effectiveType; return (T)Convert.ChangeType(val, convertToType, CultureInfo.InvariantCulture); } catch (Exception ex) { #pragma warning disable CS0618 // Type or member is obsolete ThrowDataException(ex, 0, reader, val); #pragma warning restore CS0618 // Type or member is obsolete return default; // For the compiler - we've already thrown } } }Below is a mapping of the properties of the sanitized POCO class used as the generic type for QueryAsync/Query to the corresponding value types from the Oracle query:
public class DataRow { public string Value1 { get; set; } // VARCHAR2(32) NULLABLE public DateTime Value2 { get; set; } // DATE public string Value3 { get; set; } // VARCHAR2(50) NULLABLE public string Value4 { get; set; } // VARCHAR2(150) public string Value5 { get; set; } // VARCHAR2(32) NULLABLE public string Value6 { get; set; } // VARCHAR2(150) public string Value7 { get; set; } // VARCHAR2(200) public string Value8 { get; set; } // VARCHAR2(50) NULLABLE public string Value9 { get; set; } // VARCHAR2(500) NULLABLE public string Value10 { get; set; } // VARCHAR2(50) public string Value11 { get; set; } // VARCHAR2(50) NULLABLE public string Value12 { get; set; } // VARCHAR2(50) NULLABLE public long Value13 { get; set; } // NUMBER public long Value14 { get; set; } // NUMBER public string Value15 { get; set; } // VARCHAR2(601) [Calculated value from query] public double Value16 { get; set; } // NUMBER [Calculated value from query] public double Value17 { get; set; } // NUMBER [Calculated value from query] public string Value18 { get; set; } // VARCHAR2(50) public double? Value19 { get; set; } // NUMBER NULLABLE [Calculated value from query] public bool Value20 { get; set; } // VARCHAR2(5) NULLABLE public long? Value21 { get; set; } // NUMBER(38,0) NULLABLE [from query LEFT JOIN] public string Value22 { get; set; } // VARCHAR(200) NULLABLE public double? Value23 { get; set; } // NUMBER(38,0) NULLABLE public double? Value24 { get; set; } // NUMBER(38,0) NULLABLE public double? Value25 { get; set; } // NUMBER(38,0) NULLABLE public double? Value26 { get; set; } // NUMBER(38,0) NULLABLE public double? Value27 { get; set; } // FLOAT NULLABLE public double? Value28 { get; set; } // FLOAT NULLABLE public double? Value29 { get; set; } // FLOAT NULLABLE public double? Value30 { get; set; } // FLOAT NULLABLE public double? Value31 { get; set; } // NUMBER [Calculated value from query] }Below is the sanitized SQL query
SELECT wc.wc_row_id AS value13 ,SUBSTR(wc.wc1, 2) AS value1 ,wc.wc2 AS value2 ,wc.wc3 AS value3 ,wc.wc4 AS value12 ,wc.wc5 AS value4 ,wc.wc6 AS value6 ,wc.wc7 AS value5 ,wc.wc8 AS value7 ,wc.wc9 AS value8 ,wc.wc10 AS value9 ,wc.wc11 AS value11 ,ic.ic1 AS value14 ,ic.ic2 || '\' || ic.ic3 as value15 ,ic.ic4 AS value10 ,TRUNC(iss.ic5,2) AS value16 ,TRUNC(100*iss.ic6 / iss.ic7,2) AS value17 ,wz.wz1 AS value18 ,GREATEST(ABS(wz.wz2), ABS(wz.wz3), ABS(wz.wz4), ABS(wz.wz5)) as value19 ,wz.wz6 as value20 ,adc.adc1 As value21 ,dc.dc1 as value22 ,adc.adc2 as value23 ,adc.adc3 as value24 ,adc.adc4 as value25 ,adc.adc5 as value26 ,adc.adc6 as value27 ,adc.adc7 as value28 ,adc.adc8 as value29 ,adc.adc9 as value30 ,TRUNC(COALESCE(COUNT(ad.adc1) OVER (PARTITION BY adc.adc1) / iss.ic7, 0),6) AS value31 FROM wc JOIN ic ON ic.wc_row_id = wc.wc_row_id LEFT JOIN iss ON iss.wc_row_id = wc.wc_row_id AND iss.ic_row_id = ic.ic_row_id LEFT JOIN wz ON wz.ic_row_id = ic.ic_row_id AND wz.wc_row_id = wc.wc_row_id AND wz.wz8 = 'KURT' AND wz.wz9 = 'Original' LEFT JOIN adc ON adc.ic_row_id = ic.ic_row_id LEFT JOIN dc ON dc.dc_id = adc.dc_id LEFT JOIN ad ON ad.wc_row_id = adc.wc_row_id AND ad.ic_row_id = adc.ic_row_id AND ad.adc_row_id = adc.adc_row_id WHERE 1=1 AND wc.wc1 IN 'ThisValue' AND wc.TIME>to_date('08/09/2023 10:00:00','mm/dd/yyyy hh24:mi:ss') AND wc.TIME<=to_date('08/09/2023 10:10:00','mm/dd/yyyy hh24:mi:ss') Other observations:
- Same duration using generic-less
QueryAsync(not surprising since it is the same code under the hood; however this suggests it doesn't have to do with my POCO). - Same duration using
Query<T>(also leverages samewhileloop as above) - Response time improves as columns are eliminated, but no obvious indications that any given column(s) are problematic in particular. Delta is only obvious when a significant fraction of the columns are removed.
- No spikes in CPU or memory consumption during execution of
QueryAsync<T>. CPU ranges between 20-30% and memory was flat at 43%. - Subsequent execution is faster thanks to buffering; but first-time execution speed is critical since the query is parameterized during normal usage and repeat parameterization is rare.
- The buffer parameter is not available for manipulation with
QueryAsync<T>; however when I tried setting it to false withQuery<T>, the performance is about the same. - The queries used in actual production would need to pull 10-20X as much data in an efficient manner. It seems that the time to get data back from
QueryAsyncscales exponentially with the data volume, although this is just a rough estimate.
Specific inquiries:
- Is it reasonable to expect faster first-time response from QueryAsync or is a 6-7X degradation compared to the native SQL query expected?
- If better performance is a reasonable expectation, what's your best guess as to the source of the problem: my query, the POCO, the Dapper code, some configuration?
- What can I do to improve matters or get a better handle on benchmarking? (Would a stored procedure be likely to be more performant?)
- Let me know what other information I can provide that would help make this actionable for you.
Really hoping someone can assist. Dapper is deeply embedded in our code but the performance especially for large queries is becoming critical.