5.1 BUSINESS INTELLIGENCE & DATA ANALYTICS Flashcards
Q1: As said earlier, it is assumed that about 80% of all firm data is
a) structured.
b) unstructured.
c) semi-structured.
d) metadata.
b) unstructured.
Q2: Which statement is NOT CORRECT?
a) RFM analysis is sometimes referred to as a poor man’s approach to customer lifetime value (CLV) analysis.
b) The RFM framework is a well-known and well-developed measurement framework used in marketing across different industries such as banking, insurance, Telco, non-profit, travel, on-line retailers, and even government.
c) In the RFM framework, R stands for Recency, F for Frequency and M for Monetary.
d) The RFM framework focusses on prospective customers, instead of existing customers.
d) The RFM framework focusses on prospective customers, instead of existing customers.
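A minimal sketch of how the three RFM variables could be derived from the purchase history of existing customers. The function name `rfm`, the tuple layout of the transactions, and the sample data are illustrative assumptions; monetary value is taken here as total spend, although an average per purchase is also common.

```python
from datetime import date

def rfm(transactions, today):
    """Compute Recency, Frequency, Monetary per customer.

    transactions: iterable of (customer_id, purchase_date, amount) tuples
    (a hypothetical layout chosen for this sketch).
    """
    summary = {}
    for customer_id, purchase_date, amount in transactions:
        rec = summary.setdefault(
            customer_id, {"last": purchase_date, "frequency": 0, "monetary": 0.0}
        )
        rec["last"] = max(rec["last"], purchase_date)  # most recent purchase
        rec["frequency"] += 1                          # number of purchases
        rec["monetary"] += amount                      # total amount spent
    return {
        cid: {
            "recency_days": (today - r["last"]).days,
            "frequency": r["frequency"],
            "monetary": r["monetary"],
        }
        for cid, r in summary.items()
    }

# Made-up example data
transactions = [
    ("alice", date(2024, 1, 5), 40.0),
    ("alice", date(2024, 3, 1), 60.0),
    ("bob", date(2023, 11, 20), 200.0),
]
scores = rfm(transactions, today=date(2024, 3, 31))
```

Because every input is a past transaction, the sketch only works for customers who have already bought something, which is why RFM applies to existing rather than prospective customers.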
Q3: According to Van Vlasselaer and Baesens, which is a key characteristic of fraud?
a) uncommon
b) well-considered
c) imperceptibly concealed
d) time-evolving
e) often carefully organized
f) all of the above
f) all of the above
Q4: In response modeling, reading an advertisement email, clicking on a link, downloading a product description, configuring a product (such as a car), or contacting the customer service desk for a price quote could be considered examples of
a) implicit response.
b) explicit response.
a) implicit response.
Q5: Which of the following are component(s) of the Customer Lifetime Value (CLV)?
a) Costs
b) Revenues
c) Discount rate
d) Time horizon
e) All of the above.
e) All of the above.
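A simplified sketch of how the four components combine: the net cash flow (revenues minus costs) of each period within the time horizon is discounted back to today. The function name and the figures are illustrative assumptions; textbook formulas often also weight each period by a retention or survival probability, which is omitted here.

```python
def customer_lifetime_value(revenues, costs, discount_rate):
    """Discounted sum of (revenue - cost) per period.

    The time horizon is the number of periods supplied; period 1 is
    discounted once, period 2 twice, and so on.
    """
    if len(revenues) != len(costs):
        raise ValueError("revenues and costs must cover the same periods")
    return sum(
        (r - c) / (1 + discount_rate) ** t
        for t, (r, c) in enumerate(zip(revenues, costs), start=1)
    )

# Made-up figures: two periods, 100 revenue and 40 cost each, 10% discount rate
clv = customer_lifetime_value([100.0, 100.0], [40.0, 40.0], 0.10)
```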
Q6: Open data is data that
a) only a very specific set of people can access, use and share.
b) only the government can access, use and share.
c) anyone can access, use and share.
d) no one can access, use and share.
c) anyone can access, use and share.
Q7: Which are key skills of a data scientist?
a) Quantitative skills
b) Business skills
c) Creativity
d) Programming skills
e) All of the above.
e) All of the above.
Q8: What data relates to the core entities a company is working with such as customers, products, employees, suppliers and vendors?
a) Master data
b) Transactional data
c) Metadata
d) External data
a) Master data
Master data is the core data that is absolutely essential for running operations within a business enterprise or unit.
Most popular categories: customer, supplier, product, location.
Q9: Customer journey analysis can be used to
a) get a clear and comprehensive picture of the overall process.
b) highlight process deficiencies such as excessive processing times, indicate deadlock situations, circular references, and unwanted customer leakage, among others.
c) verify if the process is compliant with both internal and external regulations.
d) all of the above.
d) all of the above.
Q10: Which of the following is a characteristic of Big Data?
a) Volume
b) Velocity
c) Variety
d) Veracity
e) Value
f) All of the above
f) All of the above
Q11: Which statement is NOT CORRECT?
a) Active churn implies that the customer stops the relationship with the firm.
b) Passive churn occurs when the customer stays with the firm but decreases the intensity of the relationship.
c) Forced churn implies that the customer stops the relationship with the company because its products or services are too expensive.
d) Expected churn occurs when the customer no longer needs the product or service.
c) Forced churn implies that the customer stops the relationship with the company because its products or services are too expensive.
Forced churn = customers are involuntarily removed from a service or subscription, e.g., due to payment issues or a violation of the terms of service.
Q12: The goal of machine learning models is to
a) complement human expert-based insights.
b) replace human expert-based insights.
a) complement human expert-based insights.
Q13: In a recommender setting, recommending a product or service, which the user was not aware of and thus not looking for, but turns out to be very interesting to him/her is an example of
a) user interest.
b) serendipity.
c) simplicity.
d) item relevance.
b) serendipity.
a = the user is already aware of, and looking for, such recommendations.
b = discovery of something interesting or valuable that the user was not looking for.
c = ease of use or straightforwardness of a system.
d = the degree to which a recommended item is pertinent or suitable for the user, based on their preferences.
Q14: In fraud detection, the date and location of an accident picture is an example of:
a) master data.
b) transactional data.
c) metadata.
d) external data.
c) metadata.
Q15: A web page is an example of
a) structured data.
b) unstructured data.
c) semi-structured data.
d) metadata.
c) semi-structured data.
Q16: Which statement is NOT CORRECT?
a) In unsupervised machine learning, there is no target variable available. The idea is to find structure in the data. Popular examples are clustering, association rule mining and sequence rule mining. It is also referred to as descriptive analytics as the idea is to describe patterns in the data.
b) Usually, the machine learning or analytics step is the most complex and most time consuming. Estimates say that it takes about 80% of the total effort.
c) Once the data preprocessing step of the analytics process model is finished, the process proceeds with a data transformation step. Here, data can be aggregated, for example from zip code to city, state, or even country.
d) Supervised machine learning is characterized by the presence of a target variable. The idea is to relate or map predictor variables X to a target variable Y. Popular examples are churn prediction, fraud detection, response modeling, and credit risk modeling. It is also referred to as predictive analytics since the aim is to predict.
b) Usually, the machine learning or analytics step is the most complex and most time consuming. Estimates say that it takes about 80% of the total effort.
Data preprocessing, not the machine learning step, is usually the most time-consuming part.
Q17: The Pareto principle states
a) For many events, roughly 80% of the effects come from 20% of the causes.
b) For many events, roughly 90% of the effects come from 10% of the causes.
c) For many events, roughly 20% of the effects come from 80% of the causes.
d) For many events, roughly 10% of the effects come from 90% of the causes.
a) For many events, roughly 80% of the effects come from 20% of the causes.
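A small sketch of how the 80/20 pattern could be checked on data: sort the causes by size and measure the share of the total effect produced by the top 20%. The function name and the revenue figures are made up for illustration.

```python
def top_share(values, top_fraction=0.2):
    """Share of the total contributed by the largest `top_fraction` of items."""
    ordered = sorted(values, reverse=True)
    k = max(1, round(len(ordered) * top_fraction))  # number of top items
    return sum(ordered[:k]) / sum(ordered)

# Made-up revenue per customer for ten customers
revenue_per_customer = [500, 300, 50, 40, 30, 25, 20, 15, 10, 10]
share = top_share(revenue_per_customer)
```

In this constructed example the two largest customers (20% of the causes) account for 800 of the 1000 total revenue.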
Q18: A Tweet is an example of
a) structured data.
b) unstructured data.
c) semi-structured data.
d) metadata.
b) unstructured data.
a = e.g., ID, name, age
b = e.g., social media post, email, article
c = e.g., JSON, XML
d = e.g., context of a digital photo (such as date and location)
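The four categories can be illustrated with a short sketch. All values below are made up; the XML snippet stands in for semi-structured data, which carries its own tags but has no fixed tabular schema.

```python
import xml.etree.ElementTree as ET

# Structured: fixed fields, fits a table row
structured_row = {"id": 1, "name": "Ann", "age": 34}

# Semi-structured: self-describing tags, flexible nesting
semi_structured = (
    "<customer id='1'><name>Ann</name>"
    "<tags><tag>bi</tag><tag>analytics</tag></tags></customer>"
)

# Unstructured: free text, e.g. the body of a social media post
unstructured = "Loving the new dashboard, the churn report finally makes sense!"

# Metadata: data about other data, e.g. when and where a photo was taken
photo_metadata = {"taken_on": "2024-03-31", "location": "Brussels"}

# The tags make the semi-structured record directly queryable
root = ET.fromstring(semi_structured)
name = root.findtext("name")
tags = [t.text for t in root.iter("tag")]

# Free text has no fields; only generic operations such as word counts apply
word_count = len(unstructured.split())
```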
Q19: A churn prediction model essentially tries to predict which customers will
a) become fraudsters.
b) leave you or decrease their product/service usage.
c) turn into bad payers.
d) respond to your marketing campaign.
b) leave you or decrease their product/service usage.
Q20: Which statement is NOT CORRECT?
a) The goal of response modeling is to model whether customers will respond to a marketing campaign or not.
b) The focus of response modeling can be on either customer acquisition or on deepening customer relationships by selling additional products or services to your existing customer portfolio.
c) Customer acquisition is a lot easier than customer retention.
d) Just as with churn prediction, response modeling essentially boils down to a binary classification task so many of the ideas of churn prediction also apply here.
c) Customer acquisition is a lot easier than customer retention.
Q21: The ACFE, or Association of Certified Fraud Examiners, estimates that a typical organization loses
a) 1% of its revenues to fraud each year.
b) 5% of its revenues to fraud each year.
c) 10% of its revenues to fraud each year.
d) 20% of its revenues to fraud each year.
b) 5% of its revenues to fraud each year.
Q22: Credit bureaus are
a) data pooling organizations that gather default information from various financial institutions such as delinquency history, bureau checks, and bureau score.
b) governmental institutions that gather credit data at country level.
c) business units that develop credit scoring models.
d) consultancy firms that provide credit scoring solutions.
a) data pooling organizations that gather default information from various financial institutions such as delinquency history, bureau checks, and bureau score.
Q23: Search data such as Google Trends can be used for nowcasting where the aim is to
a) forecast the past.
b) forecast the future.
c) forecast the present or near future.
c) forecast the present or near future.
Q24: Customer journey analysis can be used to
a) get a clear and comprehensive picture of the overall process.
b) highlight process deficiencies such as excessive processing times, indicate deadlock situations, circular references, and unwanted customer leakage, among others.
c) verify if the process is compliant with both internal and external regulations.
d) all of the above.
d) all of the above.