Create your first data model
Cube follows a dataset-oriented data modeling approach, which is inspired by and expands upon dimensional modeling. Cube incorporates this approach and provides a practical framework for implementing dataset-oriented data modeling.
When building a data model in Cube, you work with two dataset-centric objects: cubes and views. Cubes usually represent business entities such as customers, line items, and orders. In cubes, you define all the calculations within the measures and dimensions of these entities. Additionally, you define relationships between cubes, such as "an order has many line items" or "a user may place multiple orders."
Views sit on top of a data graph of cubes and create a facade of your entire data model, with which data consumers can interact. You can think of views as the final data products for your data consumers - BI users, data apps, AI agents, etc. When building views, you select measures and dimensions from different connected cubes and present them as a single dataset to BI or data apps.
Working with cubes
To begin building your data model, click on Enter Development Mode in Cube Cloud. This will take you to your personal developer space, where you can safely make changes to your data model without affecting the production environment.
In the previous section, we generated four cubes from the Snowflake schema. To see the data graph of these four cubes and how they are connected to each other, navigate to the Visual Model page.
Let's review the orders cube first and update it with additional dimensions and measures.
Once you are in developer mode, navigate to the Data Model and click on the orders.yml file in the left sidebar inside the model/cubes directory to open it.
You should see the following content of model/cubes/orders.yml file.
cubes: - name: orders sql_table: ECOM.ORDERS joins: - name: users sql: "{CUBE}.USER_ID = {users}.ID" relationship: many_to_one dimensions: - name: status sql: STATUS type: string - name: id sql: ID type: number primary_key: true - name: created_at sql: CREATED_AT type: time - name: completed_at sql: COMPLETED_AT type: time measures: - name: count type: countAs you can see, we already have a count measure that we can use to calculate the total count of our orders.
Let's add an additional measure to the orders cube to calculate only completed orders. The status dimension in the orders cube reflects the three possible statuses: processing, shipped, or completed. We will create a new measure completed_count by using a filter on that dimension. To do this, we will use a filter parameter of the measure and refer to the existing dimension.
Add the following measure definition to your model/cubes/orders.yml file. It should be included within the measures block.
- name: completed_count type: count filters: - sql: "{CUBE}.status = 'completed'"With these two measures in place, count and completed_count, we can create a derived measure. Derived measures are measures that you can create based on existing measures. Let's create the completed_percentage derived measure.
Add the following measure definition to your model/cubes/orders.yml file within the measures block.
- name: completed_percentage type: number sql: "(100.0 * {CUBE.completed_count} / NULLIF({CUBE.count}, 0))" format: percentBelow you can see what your updated orders cube should look like with two new measures. Feel free to copy this code and paste it into your model/cubes/order.yml file.
cubes: - name: orders sql_table: ECOM.ORDERS joins: - name: users sql: "{CUBE}.USER_ID = {users}.ID" relationship: many_to_one dimensions: - name: status sql: STATUS type: string - name: id sql: ID type: number primary_key: true - name: created_at sql: CREATED_AT type: time - name: completed_at sql: COMPLETED_AT type: time measures: - name: count type: count - name: completed_count type: count filters: - sql: "{CUBE}.status = 'completed'" - name: completed_percentage type: number sql: "(100.0 * {CUBE.completed_count} / NULLIF({CUBE.count}, 0))" format: percentClick Save All in the upper corner to save changes to the data model. Now, you can navigate to Cube’s Playground. The Playground is a web-based tool that allows you to query your data without connecting any tools or writing any code. It's the fastest way to explore and test your data model.
You can select measures and dimensions from different cubes in playground, including your newly created completed_percentage measure.
Working with views
When building views, we recommend following entity-oriented design and structuring your views around your business entities. Usually, cubes tend to be normalized entities without duplicated or redundant members, while views are denormalized entities where you pick as many measures and dimensions from multiple cubes as needed to describe a business entity.
Let's create our first view, which will provide all necessary measures and dimensions to explore orders. Views are usually located in the views folder and have a _view postfix.
Create model/views/orders_view.yml with the following content:
views: - name: orders_view cubes: - join_path: orders includes: - status - created_at - count - completed_count - completed_percentage - join_path: orders.users prefix: true includes: - city - age - stateWhen building views, you can leverage the cubes parameter, which enables you to include measures and dimensions from other cubes in the view. You can build your view by combining multiple joined cubes and specifying the path by which they should be joined for that particular view.
After saving, you can experiment with your newly created view in the Playground. In the next section, we will learn how to query our orders_view using a BI tool.