Getting Started With Superset On Duperset

Get your own Superset instance

power_settings_newFirst Principles

Duperset is a SaaS Analytics service based on open source Superset.

Duperset allows you to get started in minutes with Superset and take advantage of a rich set of features focused on data exploration and visualization, with charts and dashboards.

There are a few key concepts you need to familiarize yourself with in order to get started with Superset.
Once you've done that, everything is pretty self-explanatory in Superset’s UI.
Databases and Datasources and Tables and Fields and Slices, Oh My!
Superset provides a tight level of control (and simplification) to accessing your data whereby, unless you explicitly make them available as Datasources, tables and their fields in your database are not available for visualization.

Datasources can be created either by defining available Tables and associated Fields; or by creating SQL Queries and defining datasources out of them.

In either case, you’ll need to first configure your Databases - i.e. the database(s) you want to explore.
You'll then define a combination of:
. Tables -the tables you want to enable access to- and associated Fields -the tables’ fields you want to allow access to.
and/or
. SQL Queries - SQL Select statements that you can save and make available as Datasources.

Then your users will be able to create and edit
. Slices - visualizations on your Tables and related Fields
. Dashboards - sets of Slices that can be shared

We'll walk you through all these below.
Everything Is A Record
In the Superset UI (and underneath its hood) everything is internally maintained as a record - i.e. the configurations of Databases, Tables, Slices, etc... are maintained as records.
This is extremely powerful, but can also be confusing at first:
. It’s powerful because it’s a unified model that allows fine granularity of access control
. It can be confusing because you’ll see language in the UI that says “show record” that in fact means “show the details of this database configuration”. As an admin, you’ll also see configuration data -such as the core metrics you’ll define- available as records to view, edit, or delete.

(Note that these are actual records that are stored in a separate configuration database which we automatically provision and maintain for your own instance of Superset on Duperset)

warningChanging The admin Password

The Superset instance we create for you comes with a default 'admin' account.
Your first step should be to log in and change its password:

Log in with Username "admin" and Password "admin".

Navigate to your Profile

Click on the "Reset my password" button.

and reset your admin's password.


We also recommend that you reset your admin's email:
Navigate back to your Profile, click on "Edit User", change your email, and save your changes.

import_exportConnecting To Your Target Database(s)

Your next step is to connect to the databases you want to explore.
(We recommend connecting to slave/replicated DBs rather than your primary production DB.)

Navigate to (Menu -> Sources -> Databases)

and click the + sign on the far right to configure a new Database source

The available fields are self-explanatory - one critical thing to note:
SQLAlchemy URI: if you’re on Heroku, this is going to be the database URL specified in your Settings / Config Vars.
It should be in the format: postgresql://{userid}:{password}@{host}:5432/{dbname} - where 5432 is the port for Postgres if you’ve sticked to the defaults.
Make sure that the URL starts with postgresql and not just postgres - postgres:// will kinda-work, but you’ll run into issues down the road.

view_moduleCreating Datasources Using Tables

Once you’ve configured the database(s) you want to connect to, you can create Datasources for your visualizations either by configuring Tables, or by defining queries using SQL Lab.

When using Tables, you need to explicitly define which tables in your database you want to give access to, and for each table, which fields you want to give access to.

The initial steps for adding a new Table are similar to adding a new Database: Navigate to (Menu -> Sources -> Tables)

and click the + sign on the far right to configure a new Table source.


The one thing that differs is that you need to then go back and “Edit the Record” you just created to actually configure which fields will be available.

Once on the Edit Table screen, navigate to the List Columns tab.


This is the place where you’re not only defining which fields are available to your users, but also the underlying semantic - as to whether they can be grouped, counted, summed, or considered as dates/times.

In turn, these become available as metrics available to your users.

At this point, you and your users have the basic blocks in place to start creating visualizations (“Slices”) and Dashboards.

view_columnsCreating Datasources Using SQL Lab

The alternative to defining Tables as Datasources is building SQL Queries using SQL Lab.

Navigate to (Menu -> SQL Lab).
The SQL editor allows you to write and run SQL queries on your configured databases.

Click on the Visualize button when you want to make a query available as a Datasource.

You can then select the type of visualization you want to build (you'll be able to change that later); but also change the name of the Datasource, to make it more explicit, and define the dimensions you're making available.

dashboardCreating Slices And Dashboards

When you want to create a new visualization (“Slice”) from scratch, you do so in much of the same way as everything else - by navigating to the Slices page and adding a “new record”.


You then chose a datasource and a visualization type

and roll into the Exploration screen.


A couple of things to note:
. As you modify your Slice, you’ll need to click [Query] in order to see your changes reflected - unless the configuration you’re modifying is labeled with a [flash] sign, in which case, the change takes effect immediately

. The Slice’s title is going to be “[undefined] - untitled” and when you hover over it it will tell you “you don’t have the rights to alter this title” - but yes you do: just save your new Slice and give it a name

. The documentation is sparse and the depth of functionality very extensive - we recommend you play around with the different types of visualizations, Advanced Analytics and SQL features, and Filters

Dashboards are created along similar steps as Slices.


For additional documentation on how to explore your data, create Slices and Dashboards, you can look at to the official documentation of the Superset project at https://superset.incubator.apache.org/tutorial.html

groupSuperset Security Model

Users
Create new users by navigating to (Menu -> Security -> List Users); then click on the + icon to “Add a new record”.
Enter all required fields, and specify the roles this user should have (more on that below).

A few things to note:
. Your new user should most likely be marked with the Active checkbox on
. The User Name needs to be unique
. Email address is required but currently not being used on Duperset - in particular, the system does NOT send any form of notification to your new user that their account has been created. You will need to do that on your own.
. You need to specify (and confirm) a password for your new user. You should instruct your users to immediately change these.
Default Roles
Superset ships with 5 default base roles:
(Note that even though it’s possible to modify them, it is not recommended to alter them in any way)

Admin - Admins have all possible rights, including granting or revoking rights from other users and altering other people’s slices and dashboards.

Alpha - These are your power-users. They have access to all data sources. However they cannot grant or revoke access from other users. They are also limited to only altering the objects that they own. Alpha users can add and alter data sources.

Gamma - Gamma users have a very controlled access - and can create slices and dashboards only on the data sources they have been granted access to.
They can only consume data coming from data sources they have specifically been given access to through another complementary role.
In other words - they do NOT have access to anything until specifically granted.
When granted access to data sources, they only have access to view the slices and dashboards made from those data sources they have been granted access to.
Gamma users are not able to alter or add data sources.
Also note that when Gamma users look at the dashboards and slices list view, they will only see the objects that they have access to.

sql_lab - The sql_lab role grants access to SQL Lab. Note that while Admin users have access to all databases by default, both Alpha and Gamma users need to be given access on a per database basis.

Note that the Public role -which allows anonymous users to view dashboards- is disabled on Duperset.

We’re walking through the process of customizing Roles and access in the next section.
Customizing Access
Your security strategy should be to compose user access based on these base roles and roles that you create.
For instance you could create a role Financial Analyst that would be made of set of permissions to a set of data sources (tables) and/or databases. Users would then be granted Gamma, Financial Analyst, and perhaps sql_lab.

Here’s how to provide Gamma users access to only specific datasets:
. First make sure the users with limited access have [only] the Gamma role assigned to them.
. Second, create a new role (Menu -> Security -> List Roles) and click the + sign on the far right.

This new window allows you to give this new role a name, attribute it to users and select the tables in the Permissions dropdown.
To select the data sources you want to associate with this role, simply click in the dropdown and use the typeahead to search for your table names.

You can then confirm with your Gamma users that they see the objects (dashboards and slices) associated with the tables related to their roles.