Sub command: Custom

Introduction

The mock data tool mocks tables based on the data type of each column; it is not smart enough to determine whether a column holds a name, an email address, and so on. The custom subcommand hands control to the user, who decides how data is mocked into the tables. With it, the user can:

- pick which columns to skip, letting the mock data tool decide the best data for them
- control what kind of data goes into a column by feeding in a custom dataset (values are picked at random during mocking)
- select from the list of supported realistic data keys

NOTE: For the list of all supported realistic keys, check out the page.

The custom subcommand gives the user a file containing a plan of how data will be loaded into each column. The file can then be edited and fed back to the tool to control the dataset used for mocking.

Short Hand: The short hand of the custom subcommand is c

Preference Order

There are three ways to load data using the custom subcommand:

- User provided dataset
- Realistic dataset
- Random dataset

When two or more options are set for a column, the kind of data used to mock the table is chosen in the order listed above, i.e. a user provided dataset is given preference over a realistic dataset, and so on.
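
The preference order can be sketched as a small function. This is a minimal illustration in Python, not the tool's actual implementation: the function name `pick_source` and its return labels are hypothetical, and only the UserData and Realistic keys come from the plan file.

```python
def pick_source(user_data, realistic):
    """Return which dataset a column would use, following the documented order.

    user_data: list of values from the column's UserData key (may be empty)
    realistic: string from the column's Realistic key (may be empty)
    Hypothetical helper, for illustration only.
    """
    if user_data:          # 1. a non-empty user provided dataset wins
        return "user"
    if realistic:          # 2. otherwise a realistic key, if one is set
        return "realistic"
    return "random"        # 3. otherwise random data based on the column type


# A column with both options set uses the user provided values.
print(pick_source(["M", "F", "O"], "NameGenderAbbrev"))
print(pick_source([], "NameFullName"))
print(pick_source([], ""))
```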

Usage

The usage of the custom subcommand is

    [gpadmin@gpdb-m ~]$ mock custom --help
    Control the data being written to the tables

    Usage:
      mock custom [flags]

    Aliases:
      custom, c

    Flags:
      -f, --file string         Mock the tables provided in the yaml file
      -h, --help                help for custom
      -t, --table-name string   Provide the table name whose skeleton need to be copied to the file

    Global Flags:
      -a, --address string    Hostname where the postgres database lives
      -d, --database string   Database to mock the data (default "gpadmin")
      -q, --dont-prompt       Run without asking for confirmation
      -i, --ignore            Ignore checking and fixing constraints
      -w, --password string   Password for the user to connect to database
      -p, --port int          Port number of the postgres database (default 3000)
      -r, --rows int          Total rows to be faked or mocked (default 10)
      -u, --username string   Username to connect to the database
      -v, --verbose           Enable verbose or debug logging

Example

As indicated above, there are three ways to control the data loaded into a table. The sections below cover each one, so you can jump to the one you are interested in.

User Generated Dataset

Let's take the example of a table that has a check constraint (for example, a partitioned table in a Greenplum database, or your own PostgreSQL table).
Now let's build a plan for this table:
```
mock custom --table-name sales

-- OR --

mock c -t sales
```
NOTE:
- If the table is not in the default public schema, use mock c -t <schema-name>.<table-name>
- If you want to generate a plan for multiple tables, use mock c -t <schema-name1>.<table-name1>,<schema-name2>.<table-name2>...<schema-nameN>.<table-nameN>
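
When the list of tables is long, the comma-separated value for the table-name flag can be assembled programmatically. The following is a minimal sketch in Python, assuming the `<schema-name>.<table-name>` format from the note above; the helper name `table_arg` and the `hr` schema are hypothetical.

```python
def table_arg(tables):
    """Join (schema, table) pairs into the value passed to --table-name."""
    return ",".join(f"{schema}.{table}" for schema, table in tables)


# Hypothetical tables; replace with your own schema and table names.
tables = [("public", "sales"), ("hr", "employee")]
command = ["mock", "custom", "--table-name", table_arg(tables)]
print(" ".join(command))
```
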
Once the plan is generated, the tool reports the location of the YAML file at the end:

    The YAML is saved to file: <PATH>/<FILENAME>

Edit the generated file using any text editor of your choice.
- On the column you want to control, add the array of values to be picked from at random under the UserData key. For example, we take control of the date column below:
```
Custom:
- Schema: public
  Table: sales
  Column:
  - Name: id
    Type: integer
    UserData: []
    Realistic: ""
  - Name: date
    Type: date
    UserData:
    - 2016-01-01
    - 2016-03-01
    - 2016-04-01
    Realistic: ""
  - Name: amt
    Type: numeric(10,2)
    UserData: []
    Realistic: ""
```
- Continue this procedure for the rest of the columns you are interested in.
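
Typing a long list of values by hand is tedious, so the UserData entries can be generated and then pasted into the plan. The sketch below, in Python, prints the first day of each month in a range as YAML list items. The helper name `month_starts` and the chosen range are assumptions for illustration, and the output still has to be copied under the column's UserData key manually.

```python
from datetime import date


def month_starts(year, first_month, last_month):
    """Return the first day of each month from first_month to last_month."""
    return [date(year, m, 1) for m in range(first_month, last_month + 1)]


# Print the values as YAML list items, ready to paste under UserData.
for d in month_starts(2016, 1, 4):
    print(f"- {d.isoformat()}")
```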

Using the custom generated plan, feed the YAML to the mock tool:

    mock custom --file <filename or path/filename>

    -- OR --

    mock c -f <filename or path/filename>

If you want more rows, use the rows flag:

    mock custom --file <filename or path/filename> --rows <total rows number>

    -- OR --

    mock c -f <filename or path/filename> -r <total rows number>

Realistic Dataset

Let's create a table, for example:

    CREATE TABLE employee (
        name VARCHAR(100),
        email VARCHAR(120),
        mobile VARCHAR(50),
        gender VARCHAR(2),
        address VARCHAR(500)
    );

Let's generate a plan for the table

    mock custom --table-name employee

    -- OR --

    mock c -t employee

Edit the YAML generated by the above command to include realistic keys, as shown below. For the complete list of available realistic keys, check the relevant part of the tool's source code.

    Custom:
    - Schema: public
      Table: employee
      Column:
      - Name: name
        Type: character varying(100)
        UserData: []
        Realistic: "NameFullName"
      - Name: email
        Type: character varying(120)
        UserData: []
        Realistic: "InternetEmail"
      - Name: mobile
        Type: character varying(50)
        UserData: []
        Realistic: "PhoneNumberString"
      - Name: gender
        Type: character varying(2)
        UserData: []
        Realistic: "NameGenderAbbrev"
      - Name: address
        Type: character varying(500)
        UserData: []
        Realistic: "AddressString"

Using the custom generated plan, feed the YAML to the mock tool:

    mock custom --file <filename or path/filename>

    -- OR --

    mock c -f <filename or path/filename>

Random / User Generated / Realistic Dataset

Combining all three (randomly generated, user provided, and realistic data) gives you many possible ways of loading the data. Let's take an example.

Let us create a table

    CREATE TABLE employee (
        name VARCHAR(100),
        password_hash VARCHAR(30),
        gender VARCHAR
    );

Let's generate a plan for the table

    mock custom --table-name employee

    -- OR --

    mock c -t employee

Edit the YAML generated by the above command. Here:

- the name column is fed realistic data
- the password_hash column is generated randomly by the tool
- the gender column is filled from a user generated dataset

So our YAML now looks like:

    Custom:
    - Schema: public
      Table: employee
      Column:
      - Name: name
        Type: character varying(100)
        UserData: []
        Realistic: "NameFullName"
      - Name: password_hash
        Type: character varying(30)
        UserData: []
        Realistic: ""
      - Name: gender
        Type: character varying
        UserData: ["M", "F", "O"]
        Realistic: ""

Using the custom generated plan, feed the YAML to the mock tool:

    mock custom --file <filename or path/filename>

    -- OR --

    mock c -f <filename or path/filename>
