Chapter 1: Introduction Flashcards Preview

Flashcards in Chapter 1: Introduction Deck (22):
1

New Sources of Data 

  • Tweets (12 TB)
  • Facebook (25 TB)
  • Google, YouTube ...
  • RFID
  • Smart Meters
  • Cameras
  • GPS

2

#1 Source of large data

  • Customer transactional data --> how do customers behave?

3

Traditional Data Warehousing 

  • Several Sources (e.g. online transaction system) -->
  • Extractor / Monitor -->
  • Integration system (metadata) -->
  • Data warehouse (management decision support)
  • --> Clients

4

Volume, Velocity, and Variety 

  • Volume: Enterprises are awash with ever-growing data of all types.

    • Turn 12 terabytes of Tweets each day into improved product sentiment analysis

    • Convert 350 billion annual meter readings to better predict power consumption 

  • Velocity: For time-sensitive processes such as catching fraud, big data must be used as it streams into your enterprise in order to maximize its value.

    • Scrutinize 5 million trade events created each day to identify potential fraud

    • Analyze 500 million daily call detail records in real time to predict customer churn faster

  • Variety: Big data is any type of data - structured and unstructured data such as text, sensor data, audio, video, click streams, log files and more.

    • Monitor hundreds of live video feeds from surveillance cameras to target points of interest

    • Exploit the 80% data growth in images, video and documents to improve customer satisfaction

5

Aggregating Data from Different Sources 

The challenge for most organizations is to manage and analyze the various sources of structured, unstructured, and streaming data.

  • Websites
  • Billing, ERP, CRM
  • RFID
  • Network switches
  • Social media

6

New Trends in Data Organization 

  • Main-memory databases can answer in seconds queries that previously took hours
  • Distributed file systems allow for effective parallelization (e.g., Apache Hadoop)

7

Business Analytics (Definition)

Business analytics makes extensive use of statistical analysis, including explanatory and predictive modeling, and fact-based management to drive decision making. It is therefore closely related to management science. Analytics may be used as input for human decisions or may drive fully automated decisions. 

8

Descriptive Analytics 

What has occurred?

How much did I sell?

BI, Data engineering, statistics ...

Data Engineering and Statistics:

Organize data, execute large queries, describe means, trends, and test hypotheses 
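
As a small illustration of the descriptive questions above ("How much did I sell?"), the sketch below totals sales per month, computes a mean, and reads off a simple month-over-month trend. The sales records are made up for the example; in practice they would come from a query against the data warehouse.

```python
from collections import defaultdict
from statistics import mean

# Made-up sales records (month, amount); illustrative only.
sales = [
    ("Jan", 120.0), ("Jan", 80.0),
    ("Feb", 150.0), ("Feb", 90.0),
    ("Mar", 200.0), ("Mar", 100.0),
]

# "How much did I sell?": total per month.
totals = defaultdict(float)
for month, amount in sales:
    totals[month] += amount

# Describe the data: mean sale and month-over-month change (a crude trend).
average_sale = mean([amount for _, amount in sales])
monthly = list(totals.values())  # dicts keep insertion order: Jan, Feb, Mar
changes = [later - earlier for earlier, later in zip(monthly, monthly[1:])]

print(dict(totals), average_sale, changes)
```

Everything here looks backward at what has already happened, which is what separates descriptive from predictive analytics.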

9

Predictive Analytics 

What will occur?

Try to understand behaviour, e.g. customers who switch.

Data Mining and Econometrics

Forecast events, time series, or the discrete choice decisions of customers.

10

Prescriptive Analytics 

What should occur? 

Network flow, Management science ...

Algorithms and Optimization

Develop algorithms and optimization models for planning, scheduling, pricing, and revenue management.

11

Relationship to Business Intelligence (BA relates to predictive analytics / inductive statistics; BI relates to descriptive analytics / statistics)

  • Business analytics (related to predictive analytics / inductive statistics)

    • focuses on developing new insights and understanding of business performance based on data and statistical methods.

    • may be used as input for human decisions or may drive fully automated decisions.

  • Business intelligence (related to descriptive analytics / statistics)

    • traditionally focuses on using a consistent set of metrics to both measure past performance and guide business planning, which is also based on data and statistical methods.

    • is often associated with querying, reporting, OLAP, and "alerts". 

12

From Data to Information (Flow)

  • Data consolidation (data input and queries) --> DWH
  • Selection and processing (make sense out of large table)
  • Business analytics (model that fits data)
  • Interpretation and evaluation (insights)

13

Predictive Analytics 

  • Algorithms and Databases

    • Association Rule Algorithms

    • Algorithm Design Techniques

    • Algorithm Analysis

  • Statistics and Econometrics

    • Bayes Theorem

    • Regression Analysis

    • EM Algorithm

    • Clustering

    • Time Series Analysis 

  • Machine Learning and Data Mining

    • Decision Tree and other Classification Algorithms

    • Clustering

    • Neural Networks 

14

Numerical prediction

Given a collection of data with known numeric outputs, create a function that outputs a predicted value from a new set of inputs.

E.g. Given gestation time of an animal, predict its maximum life span. 
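
A minimal sketch of numerical prediction, assuming a straight-line relationship and using ordinary least squares as one possible technique (the card itself does not prescribe a method). The numbers are synthetic and are not real animal measurements; `fit_line` and `predict` are names chosen for this example.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def predict(model, x):
    """Apply the fitted function to a new input."""
    intercept, slope = model
    return intercept + slope * x

# Synthetic training data with known numeric outputs (lies exactly on a line).
inputs = [10, 20, 30, 40]
outputs = [3.0, 4.0, 5.0, 6.0]

model = fit_line(inputs, outputs)
print(predict(model, 50))  # prediction for an input not seen during fitting
```

The fitted pair (intercept, slope) is the "function" the card refers to: once built from known outputs, it can be applied to any new input.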

15

Classification 

  • From data with known labels, create a classifier that determines which label to apply to a new observation
  • E.g. Identify new loan applicants as low, medium, or high risk based on existing applicant behavior. 
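
A minimal sketch of classification, using k-nearest neighbours as one simple classifier among many. The applicant features (debt-to-income ratio, number of missed payments) and all values are hypothetical, chosen only to mirror the loan-risk example.

```python
import math
from collections import Counter

def knn_classify(train, new_point, k=3):
    """Label new_point by majority vote among its k nearest labelled neighbours."""
    by_distance = sorted(train, key=lambda row: math.dist(row[0], new_point))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Hypothetical existing applicants: (debt-to-income ratio, missed payments) -> known label.
applicants = [
    ((0.10, 0), "low"), ((0.15, 0), "low"), ((0.20, 1), "low"),
    ((0.35, 2), "medium"), ((0.40, 2), "medium"), ((0.45, 3), "medium"),
    ((0.70, 5), "high"), ((0.80, 6), "high"), ((0.90, 7), "high"),
]

# Classify a new applicant from the labelled examples.
print(knn_classify(applicants, (0.38, 2)))
```

Note that the two features are on different scales, so the missed-payments count dominates the distance; a real application would normally rescale features first.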

16

Clustering 

  • Identify “natural” groupings in data

  • Unsupervised learning, no predefined groups

  • E.g. Identify clusters of “similar” customers. 

Difference from classification: the groups are not known in advance.
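
A minimal sketch of clustering, using k-means as one common algorithm (the card does not name a specific one). The customer data (visits per month, average basket value) is made up, and starting from the first k points is a simplification; real implementations usually pick starting centroids more carefully.

```python
import math

def kmeans(points, k, iterations=20):
    """Plain k-means: alternate between assigning points and moving centroids."""
    centroids = list(points[:k])  # naive, deterministic start: the first k points
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: each centroid moves to the mean of its members.
        for i, members in enumerate(clusters):
            if members:
                centroids[i] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return centroids, clusters

# Made-up customers: (visits per month, average basket value). No labels are given.
customers = [(1, 10), (2, 12), (1, 11), (10, 50), (11, 52), (12, 49)]

centroids, clusters = kmeans(customers, k=2)
print(clusters)
```

No labels enter the algorithm at any point; the two groups emerge only from the distances between customers.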

17

Association Rule Analysis 

  • Identify relationships in data from co-occurring terms or items.
  • E.g., analyze grocery store purchases to identify items most commonly purchased together. 

Market basket analysis (milk, sugar, eggs)
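
A minimal sketch of market basket analysis on pairs of items, using the two standard measures support and confidence. The baskets are invented for illustration; only pairs are counted here, whereas full association rule algorithms also handle larger item sets.

```python
from collections import Counter
from itertools import combinations

# Invented shopping baskets, one set of items per purchase.
baskets = [
    {"milk", "sugar", "eggs"},
    {"milk", "eggs"},
    {"milk", "sugar"},
    {"sugar", "eggs", "flour"},
    {"milk", "eggs", "flour"},
]

# Count how often each item and each (alphabetically ordered) pair occurs.
item_counts = Counter(item for basket in baskets for item in basket)
pair_counts = Counter(
    pair for basket in baskets for pair in combinations(sorted(basket), 2)
)

def support(a, b):
    """Share of all baskets that contain both a and b."""
    return pair_counts[tuple(sorted((a, b)))] / len(baskets)

def confidence(antecedent, consequent):
    """Share of baskets with the antecedent that also contain the consequent."""
    return pair_counts[tuple(sorted((antecedent, consequent)))] / item_counts[antecedent]

print(pair_counts.most_common(3))
print(support("milk", "eggs"), confidence("milk", "eggs"))
```

Support answers "how common is this combination overall?", while confidence answers "given the first item, how likely is the second?".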

18

What is a Model? 

Mathematical functions

  • Mathematical combination of attribute values
  • E.g. linear model, non-linear model
  • CPU performance prediction 
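
To make "mathematical combination of attribute values" concrete, here is a minimal sketch of a linear model as a weighted sum of attributes plus a constant. The attribute names (`clock_mhz`, `cache_kb`) and the weights are hypothetical and are not taken from any real CPU performance study.

```python
def linear_model(weights, bias):
    """Return a function computing bias + sum(weight * attribute value)."""
    def predict(attributes):
        return bias + sum(weights[name] * value for name, value in attributes.items())
    return predict

# Hypothetical weights for a CPU performance score; illustrative only.
cpu_model = linear_model({"clock_mhz": 0.05, "cache_kb": 0.5}, bias=10.0)

score = cpu_model({"clock_mhz": 200, "cache_kb": 64})
print(score)
```

A non-linear model would replace the weighted sum with some other function of the same attribute values.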

E.g.:

  • Decision tree:
    • Study: >= 10hrs --> Do homework <10 hours test well...
  • Neural networks:

19

Model selection

  • Build model
  • Evaluate performance
  • Meet criteria? No --> build model again
  • Yes --> interpret model
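
The build / evaluate / check loop can be sketched as follows. This is one possible reading, assuming the criterion is a maximum mean absolute error on held-out data and that "build model again" means trying the next candidate; the two candidate models, the data, and the threshold are all made up for the example.

```python
def build_constant(train):
    """Candidate 1: always predict the mean of the training outputs."""
    mean_y = sum(y for _, y in train) / len(train)
    return lambda x: mean_y

def build_linear(train):
    """Candidate 2: least squares line through the training data."""
    n = len(train)
    mx = sum(x for x, _ in train) / n
    my = sum(y for _, y in train) / n
    slope = (sum((x - mx) * (y - my) for x, y in train)
             / sum((x - mx) ** 2 for x, _ in train))
    return lambda x: my + slope * (x - mx)

def mean_abs_error(model, data):
    return sum(abs(model(x) - y) for x, y in data) / len(data)

def select_model(builders, train, holdout, max_error):
    for name, build in builders:
        model = build(train)                    # build model
        error = mean_abs_error(model, holdout)  # evaluate performance
        if error <= max_error:                  # meet criteria?
            return name, model, error           # yes: ready to interpret
    return None                                 # no candidate met the criteria

# Made-up data, split into a training part and a held-out part.
train = [(1, 2.1), (2, 3.9), (3, 6.2), (4, 7.8)]
holdout = [(5, 10.1), (6, 11.9)]

result = select_model(
    [("constant", build_constant), ("linear", build_linear)],
    train, holdout, max_error=0.5,
)
print(result[0], result[2])
```

Evaluating on data the model was not built from is what keeps the "meet criteria?" check honest.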

20

Most important algorithms

  • Regression
  • Decision tree
  • Cluster analysis

21

Examples of Analytics in Retailing 

  • Campaign management

  • Product recommendations

  • Customer profitability analysis

  • Customer segmentation analysis

  • Pricing products

  • Forecasting revenues

  • Analysis of clickstream data 

22

CRM Marketing and examples

  • CRM marketing is #1 area to which data mining is applied.
  • Recommender systems (systems for recommending items): Amazon, Netflix ...
    • Increase sales
    • Customer A buys items
    • Customer B searches for something A bought and is shown what else A bought...
  • Collaborative Filtering 

    • Maintain a database of many users’ ratings of a variety of items. For a given user, find other similar users whose ratings strongly correlate with the current user.

    • Recommend items rated highly by these similar users, but not rated by the current user.
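
The two collaborative filtering steps above can be sketched as follows, assuming Pearson correlation over commonly rated items as the similarity measure. The users, items, ratings, and both thresholds (`min_similarity`, `min_rating`) are hypothetical choices for this example.

```python
import math

# Hypothetical ratings database: user -> {item: rating on a 1-5 scale}.
ratings = {
    "ann":  {"A": 5, "B": 4, "C": 1, "D": 5},
    "bob":  {"A": 5, "B": 5, "C": 1},
    "cara": {"A": 1, "B": 2, "C": 5, "E": 4},
}

def pearson(u, v):
    """Pearson correlation of two users' ratings over the items both have rated."""
    common = sorted(set(ratings[u]) & set(ratings[v]))
    if len(common) < 2:
        return 0.0
    xs = [ratings[u][item] for item in common]
    ys = [ratings[v][item] for item in common]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)

def recommend(user, min_similarity=0.5, min_rating=4):
    """Items rated highly by similar users that the given user has not rated."""
    similar = [v for v in ratings if v != user and pearson(user, v) >= min_similarity]
    seen = set(ratings[user])
    return sorted({
        item
        for v in similar
        for item, rating in ratings[v].items()
        if rating >= min_rating and item not in seen
    })

print(recommend("bob"))
```

Here ann's ratings correlate strongly with bob's while cara's run the opposite way, so only ann's highly rated, unseen items are candidates for bob.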