Commit 0c7c7de

Authored by MartijnVisser
Merge pull request #57 from MartijnVisser/recipe/split_strings_into_maps
Recipe: Split strings into maps
2 parents b33313c + 31086e7 commit 0c7c7de

4 files changed: +58 -2 lines changed

README.md

Lines changed: 2 additions & 1 deletion
@@ -38,6 +38,7 @@ The cookbook is a living document. :seedling:
 3. [Filtering out Late Data](other-builtin-functions/03_current_watermark/03_current_watermark.md)
 4. [Overriding table options](other-builtin-functions/04_override_table_options/04_override_table_options.md)
 5. [Expanding arrays into new rows](other-builtin-functions/05_expanding_arrays/05_expanding_arrays.md)
+6. [Split strings into maps](other-builtin-functions/06_split_strings_into_maps/06_split_strings_into_maps.md)

 ### User-Defined Functions (UDFs)
 1. [Extending SQL with Python UDFs](udfs/01_python_udfs/01_python_udfs.md)
@@ -62,6 +63,6 @@ Learn more about Flink at https://flink.apache.org/.

 ## License

-Copyright © 2020-2021 Ververica GmbH
+Copyright © 2020-2022 Ververica GmbH

 Distributed under Apache License, Version 2.0.

other-builtin-functions/05_expanding_arrays/05_expanding_arrays.md

Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
-# 04 Expanding arrays into new rows
+# 05 Expanding arrays into new rows

 ![Twitter Badge](https://img.shields.io/badge/Flink%20Version-1.3%2B-lightgrey)

other-builtin-functions/06_split_strings_into_maps/06_split_strings_into_maps.md

Lines changed: 55 additions & 0 deletions
@@ -0,0 +1,55 @@
# 06 Split strings into maps

![Twitter Badge](https://img.shields.io/badge/Flink%20Version-1.3%2B-lightgrey)

> :bulb: This example shows how to create a map of key/value pairs by splitting string values using `STR_TO_MAP`.

The source table (`customers`) is backed by the [`faker` connector](https://flink-packages.org/packages/flink-faker), which continuously generates rows in memory based on Java Faker expressions.

Flink SQL supports many different data types. They can be grouped into Character Strings, Binary Strings, Exact Numerics, Approximate Numerics, Date and Time, Constructed Data Types, User-Defined Types and Other Data Types.
Some examples are `VARCHAR/STRING`, `CHAR`, `DECIMAL`, `DATE`, `TIME`, `TIMESTAMP`, `ARRAY`, `MAP` and `ROW`. You can find more information about these data types in the [Flink SQL Data Types Reference](https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/types/).

In this recipe, we'll convert two `STRING` columns containing a `postal_address` and a `residential_address` into a `MAP` column.

This table DDL creates a `customers` table. It contains an identifier, the full name of a customer, the address to which you send mail and the address where the customer lives.

## Script

```sql
-- Create source table
CREATE TABLE `customers` (
  `identifier` STRING,
  `fullname` STRING,
  `postal_address` STRING,
  `residential_address` STRING
) WITH (
  'connector' = 'faker',
  'fields.identifier.expression' = '#{Internet.uuid}',
  'fields.fullname.expression' = '#{Name.firstName} #{Name.lastName}',
  'fields.postal_address.expression' = '#{Address.fullAddress}',
  'fields.residential_address.expression' = '#{Address.fullAddress}',
  'rows-per-second' = '1'
);
```

After creating this table, we use the [`STR_TO_MAP`](https://nightlies.apache.org/flink/flink-docs-stable/docs/dev/table/functions/systemfunctions/#string-functions) function in our `SELECT` statement.
This function splits a `STRING` value into one or more key/value pairs using delimiters.
The default pair delimiter is `,`, but this can be changed by providing a second argument to the function. In this example, we change the pair delimiter to `;` since our addresses can contain `,`.
The default key-value delimiter is `=`. In this example, we change it to `:` by providing a third argument to the function.
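
To make the two delimiters concrete, here is a minimal sketch that calls `STR_TO_MAP` on string literals, first with the defaults and then with the custom delimiters used in this recipe. The keys `k1`/`k2` and values `v1`/`v2` are placeholders chosen for illustration and are not part of the recipe.

```sql
-- Default delimiters: ',' separates pairs, '=' separates a key from its value.
-- The literal below is a made-up example input.
SELECT STR_TO_MAP('k1=v1,k2=v2') AS `default_delimiters`;

-- Custom delimiters: ';' separates pairs, ':' separates a key from its value.
SELECT STR_TO_MAP('k1:v1;k2:v2', ';', ':') AS `custom_delimiters`;
```

Both statements should produce a map with the keys `k1` and `k2`. According to the Flink documentation, both delimiters are interpreted as regular expressions, so special characters need to be escaped if you want to use them literally.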

To create our `MAP` column, we use `||` to concatenate multiple `STRING` values.
We hardcode the first key as `'postal_address:'`, which already includes the key-value delimiter, and append the value from the `postal_address` column.
We then hardcode the second key as `';residential_address:'`. It starts with the pair delimiter `;` and ends with the key-value delimiter `:`, and is followed by the value from the `residential_address` column.
To complete the function call, we pass `;` and `:` as the second and third arguments to override the default pair delimiter and key-value delimiter.
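
As a sketch of the concatenation step on its own, the query below builds the delimited string from two literal addresses instead of the `customers` columns. The address values are invented for illustration.

```sql
-- Hypothetical literal addresses stand in for the `postal_address` and
-- `residential_address` columns; neither value contains ';' or ':'.
SELECT
  'postal_address:' || '1 Main St, Springfield' ||
  ';residential_address:' || '2 Oak Ave, Shelbyville' AS `concatenated`;
```

The resulting string has the shape `key:value;key:value`, which is the input that `STR_TO_MAP` expects when `;` and `:` are passed as delimiters.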

```sql
SELECT
  `identifier`,
  `fullname`,
  STR_TO_MAP('postal_address:' || postal_address || ';residential_address:' || residential_address, ';', ':') AS `addresses`
FROM `customers`;
```
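
Once the `addresses` column exists, individual values can be read back by key. The following is a minimal sketch, assuming the `customers` table defined above; the subquery alias `with_addresses` and the output column names are chosen here for illustration.

```sql
-- Wrap the recipe's query in a subquery, then look up values by key
-- using Flink SQL's map[key] access syntax.
SELECT
  `identifier`,
  `addresses`['postal_address'] AS `postal`,
  `addresses`['residential_address'] AS `residential`
FROM (
  SELECT
    `identifier`,
    STR_TO_MAP('postal_address:' || postal_address || ';residential_address:' || residential_address, ';', ':') AS `addresses`
  FROM `customers`
) AS `with_addresses`;
```

The keys used for the lookup must match the hardcoded key names from the concatenation exactly, without the trailing `:`.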

## Example Output

![06_create_maps](06_split_strings_into_maps.png)

other-builtin-functions/06_split_strings_into_maps/06_split_strings_into_maps.png

Binary file, 520 KB
