Skip to content

ericjhwang24/rudderstack-profiles-basic-example

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

27 Commits
 
 
 
 
 
 
 
 

Repository files navigation

RudderStack Profile Builder

RudderStack Profile Builder (PB) is a CLI tool that allows you to create customer 360 profiles by defining configurations in a SQL like meta language. PB can stitch data together from multiple sources right in your Snowflake data warehouse (Redshift coming soon). PB can stitch user identities and user features from multiple sources, including RudderStack ETL, Fivetran ETL, or other tables in your Snowflake data warehouse. The resulting customer 360 tables can be used to send customer data to downstream tools such as email marketing, chat, or CRM, along with many other destinations using RudderStack Reverse ETL. Profile Builder is also very flexible and can also be used to create profiles for users, companies, sessions, or any other entity you choose.

Getting Started

Clone This Repo

https://github.com/rudderlabs/rudderstack-profiles-basic-example.git

Make sure the items listed below are in your .gitignore file for hygiene and security purposes (this is done for you by default):

.DS_Store
output
logs

Install Profile Builder

In a terminal shell, install profile builder using python pip:

pipx install profiles-rudderstack # isolated virtual environment
-or - 
pip3 install profiles-rudderstack

Validate your installation of pb:

pb version
pb --help # view list of commands

Create the configuration file for connection to your warehouse: this can be modified later at (/Users/your_user/.pb/siteconfig.yaml)

pb init connection

Enter your information into the prompts (example is for Snowflake):

Select a warehouse. (Enter s for Snowflake): s
Enter Connection Name: your_connection_name <rs-profiles-test> # you will need this to ref your connection later
Enter target:  (default:dev):  # Just press enter, leaving it to default
Enter account: your_account.your_region <abc12345.us-east-1>
Enter warehouse: your_warehouse_name
Enter dbname: your_db_name
Enter schema: your_schema_name #create a schema then ref here
Enter user: your_user_name
Enter password: your_password
Enter role: your_role
Append to /Users/<user_name>/.pb/siteconfig.yaml? [y/N]
yes

After installing PB and configuring your connections, you need to update inputs.yaml with the names of your source tables. Navigate to the inputs.yaml file and update the table: information for both tables. Keep the table names and only change the schema and database if you want to use the sample data.

  - name: rsIdentifies
    table: <your_warehouse_db>.<your_warehouse_schema>.<table_name> # change this to your fully qualified input table name 

  - name: rsTracks
    table: <your_warehouse_db>.<your_warehouse_schema>.<table_name> # change this to your fully qualified input table name 
    occurred_at_col: timestamp

Notice the names are mentioned as edge_sources in profiles.yaml and define specs for creating ID stitcher/feature table. This has been done for you.

ID Stitcher Example: edge_sources and user_id stitcher in profiles.yaml

models:
  - name: user_id_stitcher
    model_type: id_stitcher
    model_spec:
      edge_sources:
        - from: inputs/rsIdentifies
        - from: inputs/rsTracks

Feature Table Example: in profiles.yaml

  - name: user_profile
    model_type: feature_table_model
    model_spec:
      entity_key: user
      vars:
        - entity_var:
            name: first_seen
            select: min(timestamp::date)
            from: inputs/rsTracks

If you plan to use the sample data, run the command below. It will insert two sample tables into the database and schema you defined during setup. (Be sure you are in the project directory in your terminal first):

pb insert

Use this command to validate that your project will be able to access the warehouse specified in pb init connection and create objects in that warehouse.

pb validate access

You can use this command to generate the SQL that will run in your warehouse, and this will also tell you if there are syntax errors in your model YAML file.

pb compile

If there are no errors, you can use this command to create the output table in your warehouse. If using the sample data, this should execute in about 60 seconds:

pb run

View User Features & Profiles in Your Warehouse

What user profiles were created in my warehouse?

The query below will give you the user profiles view generated by PB. This view is pointed to your most up-to-date user profile table. PB automatically maintains a history of profiles tables so you can see what a profile looked like at a particular snapshot in the past.

select * from YOUR_DB.YOUR_SCHEMA.USER_PROFILE limit 5
USER_MAIN_ID VALID_AT FIRST_SEEN USER_LIFESPAN DAYS_ACTIVE
rid93c0681d775e73f01830351e693a610e 2023-06-30 18:50:11.685 2022-11-14 4 2
ridcb1b32379f00d727ee6648777534b8e5 2023-06-30 18:50:11.685 2022-11-15 59 9
rid0379ebf6a4cc85cedbf436efe9bb422d 2023-06-30 18:50:11.685 2022-11-18 56 11
rid1bdbc498de7458039510d81b565ef6ba 2023-06-30 18:50:11.685 2022-05-13 0 1
rid168ce3120988c676d8c3604c0971d632 2023-06-30 18:50:11.685 2022-11-28 11 8

What User IDs were stitched to make the profile?

The query below will provide a sample of the Other IDs connected together to create the user profiles. The table below shows 3 anonymous_ids, 1 user_id, and 1 email as the Other IDs that were stitched into 1 profile ID. Note: The email record was added for illustration purposes and is not in the sample dataset. The next query will show the total number of Other IDs stitched together.

select * from YOUR_DB.YOUR_SCHEMA.USER_ID_STITCHER limit 5
USER_MAIN_ID OTHER_ID OTHER_ID_TYPE VALID_AT
rid00e6b900e23df0c9aba09928ffcd0d31 089511773507192a39cbf1f94e34e366 anonymous_id 2022-06-06 19:16:45.000
rid00e6b900e23df0c9aba09928ffcd0d31 99c6d8b0d3afc5650d3ad9b5eaa06780 anonymous_id 2022-06-06 19:16:45.000
rid00e6b900e23df0c9aba09928ffcd0d31 1ef94c5bf009d0da48ac7a227aeb43be anonymous_id 2022-06-06 19:16:45.000
rid00e6b900e23df0c9aba09928ffcd0d31 1945306b10849bbe946a738f6fd9372f user_id 2022-06-06 19:16:45.000
rid00e6b900e23df0c9aba09928ffcd0d31 [email protected] email 2022-06-06 19:16:45.000

How Many IDs were part of the user profile?

This query shows the total number of IDs used to make each profile. id00e6b900e23df0c9aba09928ffcd0d31 has 24 different anonymous_ids and 1 user_idthat went into the profile creation.

select USER_MAIN_ID as RUDDER_USER_ID,other_id_type,count (distinct other_id) as "OTHER_ID_COUNT"
from profiles_demo_db.rs_profiles_7_1.USER_ID_STITCHER
group by USER_MAIN_ID,other_id_type
order by user_main_id asc
limit 5
USER_MAIN_ID OTHER_ID COUNT_OF_IDs
rid00e6b900e23df0c9aba09928ffcd0d31 user_id 1
rid00e6b900e23df0c9aba09928ffcd0d31 anonymous_id 24
rid0379ebf6a4cc85cedbf436efe9bb422d user_id 1
rid0379ebf6a4cc85cedbf436efe9bb422d anonymous_id 30
rid0386089d15c9669fec23c6835fdf2ac6 anonymous_id 24
rid0386089d15c9669fec23c6835fdf2ac6 user_id 1

Conclusion

You are now up and running with Profiles! When used with the RudderStack Platform, these profiles can create audiences and sync customer360 data to tools like Salesforce, Braze, HubSpot, or Klaviyo. They can also be used for paid advertising audience creation in Google Ads, Facebook, or other tools. Lastly, they can be used for other personalizations like search powered by Algolia, chat powered by Intercom, or subscriptions managed by Stripe — along with MANY more applications.

Learn More about RudderStack

RudderStack Profile Builder
RudderStack Warehouse Native CDP

More Advanced Uses

Shopify Features
Shopify Churn Features
General eCommerce Features
Stripe Features

Questions?

Join RudderStack Slack Community and post your question in the #profile-builder channel.

About

quick-start for learning about rudderstack profiles

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published