Final Flashcards
(42 cards)
Business Intelligence
- The umbrella term that includes the applications, infrastructure and tools, and best practices that enable access to and analysis of information to improve and optimize decisions and performance.
- The use of data visualization and reporting to become aware of and understand “what happened and what is happening”
- Done by charts, tables, and dashboards to display, examine and explore data
- The process of going from raw data to interpreted, intelligent information
Business Analytics
- Extensive use of data, statistical and quantitative analysis, explanatory and predictive models, and fact-based management to drive decisions and actions
- Practice and art of bringing quantitative data to bear on decision making
- Subset of business intelligence
- Relies on a number of different disciplines to collect and analyze data
- Ex: upselling to customers
Data Mining
- Business analytics methods that go beyond counts, descriptive techniques, reporting, and methods based on business rules.
- Extracts useful info from large data sets (finding gold)
- Process of exploration and analysis of large quantities of data in order to discover meaningful patterns and rules
- Employs pattern recognition technologies as well as statistical and mathematical techniques
- Ex: evaluating which customers are going to switch, for customer retention (phone provider)
- Subset of business intelligence
- Intersection of IT and statistics
Unsupervised Learning
- Search for patterns and structure among all variables
- No predefined outcome groups (no dependent or outcome variables)
- Define groups of cases with similar characteristics
- Find out the structure of the data
- Ex: average characteristics of measures of data in groups or clusters
- Find out classifier
- Ex: cluster analysis
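A minimal sketch of cluster analysis, assuming a one-variable k-means (one common clustering method, not necessarily the one used in class). The function name `kmeans_1d`, the starting centers, and the spend figures are all made up for illustration. Note there is no outcome variable: the groups come only from the data.

```python
def kmeans_1d(values, centers, iterations=10):
    """Assign each value to its nearest center, then recompute the centers."""
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Keep the old center if a cluster ends up empty.
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters

# Hypothetical monthly spend for six customers (made-up numbers).
spend = [10, 12, 11, 50, 52, 51]
centers, clusters = kmeans_1d(spend, centers=[0.0, 100.0])
print(centers)   # average spend of each discovered group
print(clusters)  # which cases fell into which group
```

The cluster averages are the "average characteristics" of each group mentioned in the card.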
Supervised Learning
- Have a target variable
- Example is regression analysis
- Predefined outcome groups or variable (know dependent variable)
- Decide to which class each case belongs (by calculating membership score of each case)
- Find out major characteristics that differentiate predefined groups
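A rough sketch of the supervised idea, assuming a nearest-group-mean rule as a stand-in for a membership score (a simplification, not a method named in the cards). The labels, the numbers, and the helper names `group_means` and `classify` are hypothetical.

```python
def group_means(cases):
    """cases: list of (value, label). Returns the mean value per known label."""
    totals = {}
    for value, label in cases:
        s, n = totals.get(label, (0.0, 0))
        totals[label] = (s + value, n + 1)
    return {label: s / n for label, (s, n) in totals.items()}

def classify(value, means):
    """Assign the case to the predefined group whose mean is closest."""
    return min(means, key=lambda label: abs(value - means[label]))

# Hypothetical complaint calls per year, with the known outcome for each customer.
labeled = [(2, "stay"), (3, "stay"), (4, "stay"),
           (8, "switch"), (9, "switch"), (10, "switch")]
means = group_means(labeled)      # characteristic that differentiates the groups
new_case = classify(7, means)     # class assigned to an unseen customer
print(means, new_case)
```

Unlike clustering, the groups ("stay", "switch") are known in advance and the target variable drives the model.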
Data Mining Techniques
- Prediction
- Classification
- Association
Prediction
- Dependent (response) variable is a continuous variable
- Formula or model to predict future observations
- Ex: multiple regression and decision trees
- Ex: predicting amount of time to sort when using gloves
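A minimal sketch of prediction with a continuous response, assuming a simple one-predictor least-squares line rather than full multiple regression. The glove-thickness and sort-time numbers are made up, and `fit_line` is a hypothetical helper name.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical data: glove thickness (mm) vs. time to sort (minutes).
thickness = [1, 2, 3, 4]
sort_time = [3, 5, 7, 9]
intercept, slope = fit_line(thickness, sort_time)

# Use the fitted formula to predict a future observation.
predicted = intercept + slope * 5
print(intercept, slope, predicted)
```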
Classification
- Dependent variable is a categorical variable
- Identify categories of data (buy vs. not buy)
- Ex: logistic regression, decision trees, cluster analysis
- Ex: will someone buy or not buy products
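A sketch of classification with a categorical (buy = 1, not buy = 0) dependent variable, assuming a one-predictor logistic regression fitted by plain gradient descent. The learning rate, epoch count, and the store-visit data are illustrative assumptions, not values from the course.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, learning_rate=0.1, epochs=5000):
    """Fit P(y=1 | x) = sigmoid(b0 + b1 * x) by batch gradient descent."""
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad0 = grad1 = 0.0
        for x, y in zip(xs, ys):
            error = sigmoid(b0 + b1 * x) - y
            grad0 += error
            grad1 += error * x
        b0 -= learning_rate * grad0 / n
        b1 -= learning_rate * grad1 / n
    return b0, b1

# Hypothetical data: number of store visits and whether the customer bought.
visits = [1, 2, 3, 4, 5, 6]
bought = [0, 0, 0, 1, 1, 1]
b0, b1 = fit_logistic(visits, bought)

def predict_buy(x):
    """Classify as 'buy' when the estimated probability is at least 0.5."""
    return sigmoid(b0 + b1 * x) >= 0.5

print(predict_buy(1), predict_buy(6))
```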
Association
- Relationship among entities
- Ex: if you bought cornflakes did you also buy bananas
- Ex: market basket analysis (not in class)
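A small sketch of the cornflakes-and-bananas question, using support and confidence, the standard market basket measures (these terms are not on the cards; they are added here as an assumption about what such an analysis would compute). The baskets are made up.

```python
def support(baskets, items):
    """Share of baskets that contain every item in `items`."""
    items = set(items)
    return sum(1 for b in baskets if items <= set(b)) / len(baskets)

def confidence(baskets, antecedent, consequent):
    """Of the baskets with the antecedent, the share that also hold the consequent."""
    both = set(antecedent) | set(consequent)
    return support(baskets, both) / support(baskets, antecedent)

# Hypothetical shopping baskets.
baskets = [
    {"cornflakes", "bananas", "milk"},
    {"cornflakes", "bananas"},
    {"cornflakes", "milk"},
    {"bananas"},
]
sup = support(baskets, {"cornflakes", "bananas"})
conf = confidence(baskets, {"cornflakes"}, {"bananas"})
print(sup, conf)
```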
Business Intelligence
- Umbrella term that spans people, process and tools
- Organize data/information, enable access to it, analyze it, improve decisions, and manage performance
Business Analytics
- Process of “doing” analysis in a particular domain
- Uses analytical techniques (data mining)
Data Mining
- Process of discovering new patterns from large data sets involving artificial intelligence, statistics, and database systems
CRISP-DM
- Cross-Industry Standard Process for Data Mining
- Fits data mining into the general problem-solving strategy of a business or research unit
CRISP-DM Phases (stages of Data Mining Process)
- 1) Business Understanding
- 2) Data Understanding
- 3) Data Preparation
- 4) Modeling
- 5) Evaluation
- 6) Deployment
Business Understanding
- Determine business objectives (why do the study: a specific problem or knowledge discovery, e.g. increasing sales of a new shirt)
- Assess situation (set up a concise and clear description of the problem)
- Determine data mining goals (what to achieve in technical terms and what the success criteria are)
- Produce project plan (establish a budget)
Data Understanding
- Collect initial data
- Describe data
- Explore data
- Verify data quality
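A minimal sketch of the "describe" and "verify quality" steps for one variable, assuming missing values are stored as `None`. The `describe` helper and the ages are hypothetical.

```python
import statistics

def describe(values):
    """Summarize one variable and count its missing entries."""
    present = [v for v in values if v is not None]
    return {
        "count": len(present),
        "missing": len(values) - len(present),
        "min": min(present),
        "max": max(present),
        "mean": statistics.mean(present),
        "median": statistics.median(present),
    }

# Hypothetical customer ages with two missing entries.
ages = [34, None, 29, 41, 36, None]
summary = describe(ages)
print(summary)
```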
Data Preparation
- Select data
- Clean data (outliers, transform)
- Construct data
- Integrate data
- Format data
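A sketch of the "clean data" step, assuming missing values are `None`, outliers are cases more than 2 standard deviations from the mean, and the transform is a natural log. All three choices are illustrative assumptions; the cards only say "outliers, transform".

```python
import math
import statistics

def clean(values, z_threshold=2.0):
    """Drop missing values, then drop cases far from the mean (z-score rule)."""
    present = [v for v in values if v is not None]
    mean = statistics.mean(present)
    spread = statistics.pstdev(present)
    if spread == 0:
        return present
    return [v for v in present if abs(v - mean) / spread <= z_threshold]

def log_transform(values):
    """Compress a right-skewed variable with a natural log."""
    return [math.log(v) for v in values]

# Hypothetical order sizes: one missing entry and one extreme value.
raw = [10, 12, 11, 13, 12, 95, None]
cleaned = clean(raw)
transformed = log_transform(cleaned)
print(cleaned)
```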
Modeling
- Select modeling technique
- Generate test design
- Build model
- Assess model
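A sketch of the four modeling steps, assuming a 75/25 train/test split as the test design, a baseline that always predicts the training mean as the "model", and mean absolute error as the assessment measure. Every name and number here is hypothetical.

```python
import random

def train_test_split(rows, test_fraction=0.25, seed=0):
    """Test design: shuffle once, then hold out a share of rows for assessment."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    cut = int(len(rows) * (1 - test_fraction))
    return rows[:cut], rows[cut:]

def mean_absolute_error(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical (x, y) rows.
rows = [(x, 2 * x + 1) for x in range(20)]
train, test = train_test_split(rows)

# Build: a baseline model that always predicts the training mean of y.
baseline = sum(y for _, y in train) / len(train)

# Assess: measure error only on rows the model never saw.
mae = mean_absolute_error([y for _, y in test], [baseline] * len(test))
print(len(train), len(test), mae)
```

Any candidate model would be assessed against the same held-out rows, so that it can be compared fairly with the baseline.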
Modeling Techniques
- Classification
- Clustering
- Predictions
- Sequential patterns
Classification
- Map each item of data into one of a set of classes
Clustering
- Grouping data with no predefined classes
Predictions
- Predict the value of a variable (regression analysis)
Sequential Patterns
- Analyzing time series data, e.g. to find a seasonal pattern
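A minimal sketch of finding a seasonal pattern, assuming quarterly data and a simple average by position in the cycle (real seasonal analysis usually also removes trend). The sales figures and the `seasonal_averages` helper are made up.

```python
def seasonal_averages(series, period):
    """Average the observations that share the same position in each cycle."""
    buckets = [[] for _ in range(period)]
    for i, value in enumerate(series):
        buckets[i % period].append(value)
    return [sum(b) / len(b) for b in buckets]

# Hypothetical quarterly sales for two years (Q1..Q4, Q1..Q4).
sales = [10, 20, 30, 40, 12, 22, 32, 42]
by_quarter = seasonal_averages(sales, period=4)
peak_quarter = by_quarter.index(max(by_quarter)) + 1
print(by_quarter, peak_quarter)
```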
Evaluation
- Evaluate the results (interpret the results; are the business objectives met?)
- Review process
- Determine next steps