Einstein Discovery Data Prep and Create Stories Flashcards
You can remedy data issues in two ways:
Fix the issue in the CRM Analytics dataset using data prep tools.
Fix the issue in the story using story settings. Story fixes don’t affect data in the dataset.
Data Prep Terminology
Variables - Category of data. Columns in dataset.
Observations - Row Value in Dataset
Data Type - Numerical, Categorical (text), Date
Max # Observations in dataset
20M
What is a Story
A story contains answers, explanations, predictions, and suggested actions that are arranged into an organized presentation with logical flow and related sections. The story is filled with insights about your data as they relate to the outcome you’re interested in. Einstein Discovery walks you through what has happened and why, what has changed, what is likely to happen, and what you can do about it.
Two Types of Stories
Insights Only - only descriptive
Insights and Predictions - all insight types.
2 ways to create a story
Dataset or Template
Ways to create a Story from a Dataset
- Create -> Create from dataset.
- While viewing a lens.
- From dataset dropdown.
Stories and Security Predicates
All users who access the story can see the results of the story. They don’t need the same row-level access as the story creator.
What data in a dataset is a story based on?
A snapshot of the data. Initial data snapshot taken when story is created. If data has changed in source dataset, users with sufficient privileges can refresh story based on most recent data. Otherwise, subsequent changes to the story do not affect the snapshot, and subsequent changes to the dataset are ignored.
Occurrences
Performs an extensive query analysis of dataset values by calculating the number of times a value occurs in a column, including interactions with other columns. For example, the color red occurs 30% of the time in an Automobile dataset, and of those rows the most frequent body type is coupe.
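The counting idea behind this card can be sketched in a few lines of Python. This is only an illustration of occurrence counting, not how Einstein Discovery implements it; the rows and the helper names `value_share` and `most_frequent_within` are hypothetical.

```python
from collections import Counter

# Hypothetical rows from an automobile dataset: (color, body_type)
rows = [
    ("red", "coupe"), ("red", "coupe"), ("red", "sedan"),
    ("blue", "sedan"), ("blue", "suv"),
    ("white", "sedan"), ("white", "suv"),
    ("black", "coupe"), ("black", "sedan"),
    ("silver", "suv"),
]

def value_share(rows, column_index):
    """Return the fraction of rows in which each value of one column occurs."""
    counts = Counter(row[column_index] for row in rows)
    total = len(rows)
    return {value: count / total for value, count in counts.items()}

def most_frequent_within(rows, filter_index, filter_value, target_index):
    """Among rows where one column equals filter_value, return the most
    common value of a second column (a simple two-column interaction)."""
    counts = Counter(
        row[target_index] for row in rows if row[filter_index] == filter_value
    )
    return counts.most_common(1)[0][0]

color_share = value_share(rows, 0)                          # share of each color
top_body_for_red = most_frequent_within(rows, 0, "red", 1)  # body type among red cars
```

With this sample data, red accounts for 3 of 10 rows, and coupe is the most frequent body type among the red rows.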
What does template overview provide?
Description, List of Supported Objects, Sample Insights
Issue: Story concurrency limits exceeded
No more than two stories can be created concurrently
Dataflow run limits exceeded
During app creation, a story template runs a dataflow twice: once to create the dataset used to train the predictive model, and once to use the predictive model to generate prediction scores and write them back to the CRM.
If you exceed the maximum number of dataflow runs in your org in a 24-hour period, app creation fails.
Data Sync-related limits exceeded
Story templates can add objects to Data Sync. If the org has already reached the maximum number of Data Sync objects, app creation fails.
Data Sync-related errors
If app creation triggers data sync-related errors, address them in data manager before trying to create the template again.
Elements of a story interface
- Story Headline - Name of story, goal, most recent version
- Story toolbar
- Variables Panel - list of explanatory variables and their correlation to the outcome
- Story Version summary - summary of insights, version comparison
- Insight Summary Panels - list of variables, ordered by correlation, that positively or negatively impact a story
What does a story headline contain?
The basis of the story: Story Name, Version Update, Story Goal, Story Version
What does the story version summary contain?
Goal
Row Count (# obs in analysis)
# Change in Row Count from previous version
Outcome Avg
% Change in outcome avg from previous version
What changed between versions
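The two change figures in the version summary are simple arithmetic. A minimal sketch, assuming the row-count change is an absolute difference and the outcome change is a percentage of the previous average; the function name and the sample numbers are hypothetical.

```python
def version_change(prev_rows, curr_rows, prev_avg, curr_avg):
    """Compare two story versions.

    Returns (change in row count, % change in outcome average).
    The percentage is None when the previous average is 0,
    because the ratio is undefined in that case.
    """
    row_change = curr_rows - prev_rows
    if prev_avg == 0:
        return row_change, None
    pct_change = (curr_avg - prev_avg) * 100 / prev_avg
    return row_change, pct_change

# Hypothetical versions: 10,000 rows averaging 200.0, then 10,500 rows averaging 210.0
row_change, pct_change = version_change(10_000, 10_500, 200.0, 210.0)
```

Here the new version has 500 more rows and an outcome average that is 5% higher than the previous version.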
How to Edit Story
Open Story
Click Edit Story
Can change columns and update the story to reflect the latest dataset changes
Use the Correlation column to see how much each field contributed to the outcome. Remove columns that have little to no impact.
What column contains fields that you can improve, such as fields with outliers or duplicates?
Data Alert Column
What can you edit in the general settings tab?
Analysis Type (Insights Only or Insights and Predictions)
Algorithm (GLM, GBM, XGBoost, Random Forest)
- select Model Tournament to have ED run all algorithms and show the results of algorithm that performed best
Validation Type -
- Training/Validation Ratio
- Validation Dataset (can specify a CRM Analytics dataset). You only see datasets that match the schema of your story’s dataset.
- None (default) - uses only k-fold validation.
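The difference between a training/validation ratio and k-fold validation can be sketched generically. This is a textbook illustration, not Einstein Discovery's internal procedure; the function names are hypothetical, and real implementations usually shuffle rows before splitting.

```python
def ratio_split(n_rows, training_ratio):
    """Split row indices once: the first part trains, the rest validates."""
    cut = int(n_rows * training_ratio)
    indices = list(range(n_rows))
    return indices[:cut], indices[cut:]

def k_fold_splits(n_rows, k):
    """Yield k (training, validation) index pairs.

    Every row lands in exactly one validation fold, so all rows are used
    for both training and validation across the k rounds.
    """
    indices = list(range(n_rows))
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_rows // k + (1 if i < n_rows % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        validation = indices[start:start + size]
        training = indices[:start] + indices[start + size:]
        yield training, validation
        start += size

train, valid = ratio_split(10, 0.8)   # one fixed 80/20 split
folds = list(k_fold_splits(10, 4))    # four rotating splits
```

A ratio split holds out one fixed slice of rows, while k-fold rotates the held-out slice so that no separate validation dataset is needed.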
Configure Number Variables
Change settings for individual number fields in your story.
In Story settings, click a number field. You can:
Analyze for bias (select to exclude a variable from the model. A SHIELD icon will appear next to the title of the insight to remind you it’s a sensitive variable)
Transform - Replace missing values, projected predictions
Bucket Values by (count, width, manual)
Number of buckets - specify number of buckets to show in charts
Include only – adds min and max values to starting values and ending value fields
Preview - Graph shows how many values occur in each number range.
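Bucketing by width and bucketing by count give quite different pictures of skewed data. The sketch below is a generic illustration of the two strategies, not Einstein Discovery's own algorithm; the function names and sample values are hypothetical.

```python
def bucket_by_width(values, n_buckets):
    """Equal-width buckets: each bucket spans the same numeric range.
    Returns the number of values that fall in each bucket."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_buckets
    counts = [0] * n_buckets
    if width == 0:
        # All values are identical, so they all belong to the first bucket.
        counts[0] = len(values)
        return counts
    for v in values:
        # The maximum value would index one past the end, so clamp it.
        idx = min(int((v - lo) / width), n_buckets - 1)
        counts[idx] += 1
    return counts

def bucket_by_count(values, n_buckets):
    """Equal-count buckets: each bucket holds roughly the same number of values.
    Returns the values grouped into buckets."""
    ordered = sorted(values)
    n = len(ordered)
    return [
        ordered[i * n // n_buckets:(i + 1) * n // n_buckets]
        for i in range(n_buckets)
    ]

values = [1, 2, 2, 3, 4, 5, 9, 10, 50, 100]   # skewed sample data
width_counts = bucket_by_width(values, 4)
count_buckets = bucket_by_count(values, 4)
```

With skewed values like these, equal-width buckets put most rows in the first bucket, while equal-count buckets keep the groups balanced at the cost of uneven ranges.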
What are projected predictions?
Providing trending data for numeric variables that factor into your predictions to make them more accurate
Configure Projected Predictions
Provide dataset that contains trend data.
Tell the story about the dataset:
Unique identifier column
Variable column (maps to selected variable in story)
Time interval column
Number of time intervals to project ahead
Seasonality (auto, none, or a number)
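To make "number of intervals to project ahead" concrete, here is a minimal sketch that extends a numeric variable's history with a straight-line trend. The card does not say which forecasting method Einstein Discovery uses, so the least-squares line, the lack of seasonality handling, and the name `project_ahead` are all assumptions for illustration only.

```python
def project_ahead(history, n_intervals):
    """Fit a least-squares line to evenly spaced history values and
    extend it n_intervals steps past the last observation."""
    n = len(history)
    if n < 2:
        raise ValueError("need at least two observations to fit a trend")
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(history) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, history))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    intercept = mean_y - slope * mean_x
    # Interval n is the first one after the observed history.
    return [intercept + slope * (n + step) for step in range(n_intervals)]

# Hypothetical trend data for one record: a value that grows by 10 per interval
history = [100, 110, 120, 130]
projected = project_ahead(history, 2)
```

For this perfectly linear history, the two projected values continue the pattern at 140 and 150.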