Specification

Introduction

This section defines the SQL commenter algorithm which augments a SQL statement with a comment containing serialized key value pairs that are retrieved from the various ORMs and frameworks in your programming language and environment of choice.

A preview of the result can be seen as per exhibit

SELECT * FROM FOO /*action='%2Fparam*d',controller='index,'framework='spring', traceparent='00-5bd66ef5095369c7b0d1f8f4bd33716a-c532cb4098ac3dd2-01', tracestate='congo%3Dt61rcWkgMzE%2Crojo%3D00f067aa0ba902b7'*/ 

Read along to see how you can conform to the specification and produce similar output.

Format

The final comment SHOULD be affixed to the final SQL statement in the format

<SQL STATEMENT> /*<ATTRIBUTE_KEY_VALUE_PAIRS>*/ 

Comment escaping

Comments within SQL comments are of the format

If a comment already exists within a SQL statement, we MUST NOT mutate that statement.

Separator

Each key value pair MUST be separated by a comma “,” so for example, given

Values: [<FIELD_1>, <FIELD_2>, <FIELD_3>, ...]

Expected concatenation result: <FIELD_1>,<FIELD_2>,<FIELD_3>,<FIELD_N...>

Meta characters

Meta characters such as ' should be escaped with a slash \. That creates the following algorithm:

Algorithm

algorithm(value): escaped := value.escape_with_slash_any_of(') return escaped 

Key serialization

  1. URL encode the key e.g. given route parameter, that’ll become route%20parameter

Which produces the following algorithm:

Algorithm

key_serialization(key): encoded := url_encode(key) meta_escaped := escape_meta_characters(encoded) return meta_escaped 

Value serialization

  1. URL encode the value e.g. given /param first, that SHOULD become %2Fparam%20first

  2. Escape meta-characters within the raw value; a single quote ' becomes \'

  3. SQL escape the value by placing it within two single quotes e.g.

    DROP should become 'DROP'

    FOO 'BAR should become 'FOO%20\'BAR'

And when generalized into an algorithm:

Algorithm

value_serialization(value): encoded := url_encode(value) meta_escaped := escape_meta_characters(encoded) final := sql_escape_with_single_quotes(meta_escaped) return final 

and running the algorithm on the following table will produce

value url_encode(value) sql_escape_with_single_quotes
DROP TABLE FOO DROP%20TABLE%20FOO 'DROP%20TABLE%20FOO'
/param first %2Fparam%20first '%2Fparam%20first'
1234 1234 '1234'

Key Value format

Given a key value pair (key, value):

  1. Run the Key serialization algorithm on key

  2. Run the Value serialization algorithm on value

  3. Using an equals sign =, concatenate the result from 1. and 2. to give

    <SERIALIZED_KEY>=<SERIALIZED_VALUE> gotten from:

    serialize_key(key)=serialize_value(value)

Thus given for example the following key value pairs

key value pair serialized_key serialized_value Final
route=/polls 1000 route '%2Fpolls%201000' route='%2Fpolls%201000'
name='DROP TABLE FOO' route '%2Fpolls%201000' route='%2Fpolls%201000'
name''="DROP TABLE USERS'" name='' DROP%20TABLE%20USERS' name=''=‘DROP%20TABLE%20USERS'’

Sorting

With a list of serialized key=value pairs, sort them by lexicographic order.

Algorithm

sort(key_value_pairs): sorted = lexicographically_sort(key_value_pairs) return sorted 

Exhibit

Thus

 sort([ traceparent='00-5bd66ef5095369c7b0d1f8f4bd33716a-c532cb4098ac3dd2-01', tracestate='congo%3Dt61rcWkgMzE%2Crojo%3D00f067aa0ba902b7', route='%2Fparam*d', controller='index', ]) 

produces

 [ controller='index', route='%2Fparam*d', traceparent='00-5bd66ef5095369c7b0d1f8f4bd33716a-c532cb4098ac3dd2-01', tracestate='congo%3Dt61rcWkgMzE%2Crojo%3D00f067aa0ba902b7', ] 

Concatenation

After all the keys and values have been serialized and sorted, they MUST be joined by a comma ,.

If no values are present, concatenate MUST return the empty value ''

Algorithm

concatenate(key_value_pairs): if len(key_value_pairs) == 0: return '' return ','.join(key_value_pairs) 

Exhibit

Therefore

 concatenate([ controller='index', route='%2Fparam*d', traceparent='00-5bd66ef5095369c7b0d1f8f4bd33716a-c532cb4098ac3dd2-01', tracestate='congo%3Dt61rcWkgMzE%2Crojo%3D00f067aa0ba902b7', ]) 

produces

controller='index',route='%2Fparam*d',traceparent='00-5bd66ef5095369c7b0d1f8f4bd33716a-c532cb4098ac3dd2-01',tracestate='congo%3Dt61rcWkgMzE%2Crojo%3D00f067aa0ba902b7' 

Affix comment

After serialization, sorting, concatenation, the final form MUST be placed between /* and */

Algorithm

affix_comment(sql, concatenated): if is_empty(concatenated): return sql // Do NOT modify the SQL if concatenated is blank. affixed := sql + '/*' + concatenated + '*/' return affixed 

Exhibit

for example given

affix_comment('SELECT * from FOO', '') 

produces

SELECT * from FOO 
affix_comment('SELECT * from FOO', "route='%2Fparam*d'") 

produces

SELECT * from FOO /*route='%2Fparam*d'*/ 

SQL commenter

Wrapping all the steps together, we thus have the following algorithm

sql_commenter(sql, attributes): if contains_sql_comment(sql): return sql # DO NOT mutate a statement with an already present comment. serialized_key_value_pairs := [] for each attribute in attributes: serialized := serialize_key_value_pair(attribute) if serialized: serialized_key_value_pairs.append(serialized) sorted := sort(serialized_key_value_pairs) concatenated := concatenate(sorted) final := affix_comment(sql, concatenated) return final 

Exhibit

Running sql_commenter on an ORM integration that extracts the respective attributes:

sql_commenter('SELECT * FROM FOO', [ tracestate='congo%3Dt61rcWkgMzE%2Crojo%3D00f067aa0ba902b7', traceparent='00-5bd66ef5095369c7b0d1f8f4bd33716a-c532cb4098ac3dd2-01', framework='spring', action='%2Fparam*d', controller='index', ]) 

finally produces

SELECT * FROM FOO /*action='%2Fparam*d',controller='index,'framework='spring', traceparent='00-5bd66ef5095369c7b0d1f8f4bd33716a-c532cb4098ac3dd2-01', tracestate='congo%3Dt61rcWkgMzE%2Crojo%3D00f067aa0ba902b7'*/ 

Parsing

Parsing is the step to reverse sql-commenter and extract the key value attributes.

It’ll follow the following steps:

  1. Find the last comment so search for and strip out /* and */
  2. Split the comment by comma ,
  3. Split each key='value' pair so extract key and 'value'
    • 3.1. For key, unescape_meta_characters then url_decode
    • 3.2. For value, sql_unescape/trim the ' at the beginning and end of 'value' -> value
      • 3.2.1. Unescape the meta characters in value
      • 3.2.2. URL Decode the value

Algorithm

parse(sql_with_comment): if !contains_sql_comment(sql_with_comment): return sql_with_comment, null // Since we now have a SQL comment, let's extract the serialized attributes.  sql_stmt, serialized_attrs := extract_sql_commenter(sql_with_comment) if is_empty(serialized_attrs): return sql_stmt, null attrs := {} kv_splits := split_by_comma(serialized_attrs) for kv in kv_splits: e_key, e_value := split_by_equals(kv) key := decode_key(e_key) value := decode_value(e_value) attrs[key] = value // Some attributes such as traceparent, tracestate, sampled  // might need need some grouping and reconstruction.  final := deconstruct_and_group_attributes(attrs) return sql_stmt, final 

Exhibit

Given the value from SQLCommenter exhibit

SELECT * FROM FOO /*action='%2Fparam*d',controller='index,'framework='spring', traceparent='00-5bd66ef5095369c7b0d1f8f4bd33716a-c532cb4098ac3dd2-01', tracestate='congo%3Dt61rcWkgMzE%2Crojo%3D00f067aa0ba902b7'*/ 

Running parse on the value

sql, attributes = parse(`SELECT * FROM FOO /*action='%2Fparam*d',controller='index,'framework='spring', traceparent='00-5bd66ef5095369c7b0d1f8f4bd33716a-c532cb4098ac3dd2-01', tracestate='congo%3Dt61rcWkgMzE%2Crojo%3D00f067aa0ba902b7'*/`) 

produces

sql: SELECT * FROM FOO attributes: { controller: 'index', framework: 'spring', action: '/param*d', trace: { sampled: true, span_id: 'c532cb4098ac3dd2', trace_id: '5bd66ef5095369c7b0d1f8f4bd33716a', trace_state: [{'congo': 't61rcWkgMzE'}, {'rojo': '00f067aa0ba902b7'}], }, } 

References

Resource URL
URL Encoding https://en.wikipedia.org/wiki/Percent-encoding
Comments within SQL comments https://docs.oracle.com/cd/B12037_01/server.101/b10759/sql_elements006.htm