| 1 | +# 06 Split strings into maps |
| 2 | + |
| 3 | + |
| 4 | + |
| 5 | +> :bulb: This example will show how you can create a map of key/value pairs by splitting string values using `STR_TO_MAP`. |
| 6 | + |
| 7 | +The source table (`customers`) is backed by the [`faker` connector](https://flink-packages.org/packages/flink-faker), which continuously generates rows in memory based on Java Faker expressions. |
| 8 | + |
| 9 | +Flink SQL supports many data types, which fall into these groups: Character Strings, Binary Strings, Exact Numerics, Approximate Numerics, Date and Time, Constructed Data Types, User-Defined Types and Other Data Types. |
| 10 | +Some examples are `VARCHAR/STRING`, `CHAR`, `DECIMAL`, `DATE`, `TIME`, `TIMESTAMP`, `ARRAY`, `MAP` and `ROW`. You can find more information about these data types in the [Flink SQL Data Types Reference](https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/types/). |
| 11 | + |
| 12 | +In this recipe, we'll combine two `STRING` columns, `postal_address` and `residential_address`, into a single `MAP` column. |
| 13 | + |
| 14 | +This table DDL creates a `customers` table. It contains an identifier, the full name of a customer, the address to which you send mail and the address where the customer lives. |
| 15 | + |
| 16 | +## Script |
| 17 | + |
| 18 | +```sql |
| 19 | +-- Create source table |
| 20 | +CREATE TABLE `customers` ( |
| 21 | + `identifier` STRING, |
| 22 | + `fullname` STRING, |
| 23 | + `postal_address` STRING, |
| 24 | + `residential_address` STRING |
| 25 | +) WITH ( |
| 26 | + 'connector' = 'faker', |
| 27 | + 'fields.identifier.expression' = '#{Internet.uuid}', |
| 28 | + 'fields.fullname.expression' = '#{Name.firstName} #{Name.lastName}', |
| 29 | + 'fields.postal_address.expression' = '#{Address.fullAddress}', |
| 30 | + 'fields.residential_address.expression' = '#{Address.fullAddress}', |
| 31 | + 'rows-per-second' = '1' |
| 32 | +); |
| 33 | +``` |
| 34 | + |
| 35 | +After creating this table, we use the [`STR_TO_MAP`](https://nightlies.apache.org/flink/flink-docs-stable/docs/dev/table/functions/systemfunctions/#string-functions) function in our `SELECT` statement. |
| 36 | +This function splits a `STRING` value into one or more key/value pairs using delimiters. |
| 37 | +The default pair delimiter is `,`, but you can change it by providing a second argument to the function. In this example, we change the pair delimiter to `;` because our addresses can contain `,`. |
| 38 | +The default key-value delimiter is `=`. In this example, we change it to `:` by providing a third argument to the function. |
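
To make the two delimiters concrete, here is a minimal sketch that calls `STR_TO_MAP` on literal strings. The keys `k1`/`k2`, the values `v1`/`v2` and the column aliases are made up for illustration. Both queries should return a map in which `k1` maps to `v1` and `k2` maps to `v2`.

```sql
-- Default delimiters: ',' separates pairs, '=' separates a key from its value
SELECT STR_TO_MAP('k1=v1,k2=v2') AS `default_delimiters`;

-- Custom delimiters: ';' separates pairs, ':' separates a key from its value
SELECT STR_TO_MAP('k1:v1;k2:v2', ';', ':') AS `custom_delimiters`;
```

Whichever delimiters you choose, they should not appear inside the keys or values themselves, otherwise the string is likely to be split in unexpected places.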
| 39 | + |
| 40 | +To build the input string for `STR_TO_MAP`, we use `||` to concatenate multiple `STRING` values. |
| 41 | +We hardcode the first key as `'postal_address:'`, which includes the key-value delimiter, and append the value of the `postal_address` column. |
| 42 | +We then hardcode the second key as `';residential_address:'`, which starts with the pair delimiter `;` and ends with the key-value delimiter `:`, and append the value of the `residential_address` column. |
| 43 | +Finally, we pass `';'` and `':'` as the second and third arguments to override the default pair delimiter and key-value delimiter. |
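
If you want to inspect the string that `STR_TO_MAP` receives, one option is to select the concatenation on its own first. This is only a debugging sketch, and the alias `raw_addresses` is a name chosen here for illustration.

```sql
-- Show the concatenated input string before it is split into a map.
-- Each row should look like: postal_address:<address>;residential_address:<address>
SELECT
  `identifier`,
  'postal_address:' || `postal_address` || ';residential_address:' || `residential_address` AS `raw_addresses`
FROM `customers`;
```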
| 44 | + |
| 45 | +```sql |
| 46 | +SELECT |
| 47 | + `identifier`, |
| 48 | + `fullname`, |
| 49 | + STR_TO_MAP('postal_address:' || `postal_address` || ';residential_address:' || `residential_address`, ';', ':') AS `addresses` |
| 50 | +FROM `customers`; |
| 51 | +``` |
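
Once the addresses are in a `MAP` column, individual values can be looked up by key with the `map[key]` syntax. The sketch below wraps the query above in a temporary view and reads both entries back. The view name `customer_addresses` and the output aliases are assumptions made for this example, not part of the recipe.

```sql
-- Hypothetical view that exposes the MAP column built by STR_TO_MAP
CREATE TEMPORARY VIEW `customer_addresses` AS
SELECT
  `identifier`,
  `fullname`,
  STR_TO_MAP('postal_address:' || `postal_address` || ';residential_address:' || `residential_address`, ';', ':') AS `addresses`
FROM `customers`;

-- Look up single values in the map by their key
SELECT
  `identifier`,
  `addresses`['postal_address'] AS `postal`,
  `addresses`['residential_address'] AS `residential`
FROM `customer_addresses`;
```

A lookup with a key that is not present in the map is expected to return `NULL` rather than fail, so a misspelled key can go unnoticed.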
| 52 | + |
| 53 | +## Example Output |
| 54 | + |
| 55 | + |