Big Data Flashcards

(18 cards)

1
Q

Definition of Big Data

A

A data set whose volume, velocity, and variety (the 3 Vs) are great enough that it cannot be managed effectively with a relational DBMS

2
Q

Volume

A

Amount of data to be stored

3
Q

Two ways to handle growing Volume

A

Scaling up - upgrading existing systems with more powerful hardware
Scaling out - adding new servers to share the workload

4
Q

Velocity

A

Speed of incoming data and processing

5
Q

Velocity of processing can be broken into 2 categories:

A
  1. Stream processing - Processing real-time data through algorithms as it arrives
  2. Feedback loop - Analyzing data for actionable outcomes
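
As a rough illustration of stream processing, the sketch below handles each reading as soon as it arrives instead of waiting for the full data set. The function name `running_average` and the sample readings are assumptions made up for this example.

```python
from typing import Iterable, Iterator


def running_average(readings: Iterable[float]) -> Iterator[float]:
    """Yield the average of all readings seen so far, one result per input."""
    total = 0.0
    count = 0
    for value in readings:  # each value is processed as soon as it arrives
        total += value
        count += 1
        yield total / count


# Hypothetical sensor readings arriving one at a time.
averages = list(running_average([10.0, 20.0, 30.0]))
```

Because the function is a generator, it never needs the whole stream in memory, which is the property that matters when data arrives continuously.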
6
Q

Example of feedback loop

A
  1. User clicks on a book link
  2. Data about the customer and the book is captured
  3. The data is analyzed to find other items and books the user might be interested in
  4. A list of recommended items and books is generated
  5. Information on the book requested, plus other recommendations is returned to the user
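
The five steps above can be sketched as a tiny "customers who clicked this also clicked" recommender. This is only one simple way to do the analysis step; the click log, the customer names, and the `recommend` function are hypothetical.

```python
from collections import Counter

# Hypothetical click log: (customer, book) pairs captured in step 2.
clicks = [
    ("ann", "Dune"),
    ("ann", "Neuromancer"),
    ("bob", "Dune"),
    ("bob", "Foundation"),
    ("cat", "Dune"),
    ("cat", "Foundation"),
]


def recommend(book: str, log: list[tuple[str, str]], top_n: int = 2) -> list[str]:
    """Return the books most often clicked by customers who also clicked `book`."""
    # Step 3: find customers who clicked the requested book.
    readers = {customer for customer, title in log if title == book}
    # Step 4: count the other books those customers clicked.
    counts = Counter(
        title for customer, title in log if customer in readers and title != book
    )
    return [title for title, _ in counts.most_common(top_n)]


# Step 5: return the requested book together with the recommendations.
response = {"requested": "Dune", "recommended": recommend("Dune", clicks)}
```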
7
Q

Variety

A

The range of formats in which data is captured and stored

8
Q

Different types of variety

A
  1. Structured - Organized in a pre-defined format (relational DBMS)
  2. Unstructured - Data without a specific format (multimedia files)
  3. Semi-structured - A mix, with some structured elements (e.g. JSON, XML)
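
To make the three types concrete, the sketch below loads one small sample of each using only the Python standard library. All sample values are invented for illustration.

```python
import csv
import io
import json

# Structured: every row follows the same predefined columns.
table = io.StringIO("id,name\n1,Ann\n2,Bob\n")
rows = list(csv.DictReader(table))

# Semi-structured: keys describe the data, but fields can differ per record.
raw_records = ['{"id": 1, "name": "Ann"}', '{"id": 2, "tags": ["vip"]}']
records = [json.loads(text) for text in raw_records]

# Unstructured: raw bytes with no schema (here, the first bytes of a JPEG file).
blob = bytes([0xFF, 0xD8, 0xFF])
```

Note that the second JSON record has no `name` field at all, something a fixed table schema would not allow without a placeholder value.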
9
Q

Other characteristics

A
  1. Veracity - Trustworthiness of data
  2. Value - Usefulness for insights
  3. Variability - Meaning of data can change with context
  4. Visualization - Ability to present data clearly
  5. Polyglot Persistence - Using diverse data storage technologies
10
Q

Hadoop ecosystem

A

The de facto standard for big data processing, built on two main components:
1. Hadoop Distributed File System (HDFS) - Low-level distributed file system that stores and manages big data sets across multiple computers
2. MapReduce - Programming model used to process large data sets in parallel
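
The MapReduce model can be illustrated with a word count run in a single Python process. This is a sketch of the programming model only, not Hadoop's actual API; the function names `map_phase`, `shuffle`, and `reduce_phase` and the sample documents are assumptions for this example.

```python
from collections import defaultdict

documents = ["big data big", "data sets"]


def map_phase(document: str) -> list[tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word, 1) for word in document.split()]


def shuffle(pairs: list[tuple[str, int]]) -> dict[str, list[int]]:
    """Shuffle: group all emitted values by their key."""
    groups: dict[str, list[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(values: list[int]) -> int:
    """Reduce: combine all values for one key into a single result."""
    return sum(values)


# In a real cluster, each document could be mapped on a different machine.
mapped = [pair for doc in documents for pair in map_phase(doc)]
word_counts = {key: reduce_phase(values) for key, values in shuffle(mapped).items()}
```

The map and reduce steps only ever look at one document or one key at a time, which is what lets a framework spread them across many computers.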

11
Q

Big Data in Fintech

A
  1. Data Visualization - Custom dashboards for clear business insights
  2. Data Mining - Analyzing large volumes of unstructured data for insights
  3. Real-time processing - Tracking users’ activities for user analysis
12
Q

Data Analysis in Fintech

A
  1. Risk analysis - Using AI/ML (Machine Learning) for better risk assessment
  2. Decision making - Using AI/ML to decide the optimal course of action in complex situations
  3. Automation - Reduce delays, enhance customer service
13
Q

Data Security

A
  1. Customer profile - Creates a detailed customer profile for identity protection
  2. Fraud Detection - AI can help detect and prevent fraud and cyber threats
14
Q

Customer-Centric Services

A
  1. Personalization: AI analyzes behavior for tailored financial services.
  2. Financial Advice: AI offers financial guidance similar to human advisors.
15
Q

Financial Inclusion

A

Cost Reduction: Automation lowers operational costs, enabling affordable financial services and promoting inclusion.

16
Q

Money Laundering & Terrorism Prevention

A
  1. Real-Time Monitoring: AI analyzes transaction patterns to spot suspicious activities.
  2. Anomaly Detection: Machine learning identifies irregular transactions efficiently.
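
As a minimal sketch of anomaly detection, the code below flags transactions whose amount lies far from the average. It uses a simple z-score rule as a stand-in for a trained machine learning model; the amounts, the threshold of 2.0, and the function name `flag_anomalies` are assumptions for illustration.

```python
from statistics import mean, pstdev


def flag_anomalies(amounts: list[float], threshold: float = 2.0) -> list[float]:
    """Return amounts whose z-score (distance from the mean in standard
    deviations) exceeds the threshold."""
    if not amounts:
        return []
    average = mean(amounts)
    spread = pstdev(amounts)
    if spread == 0:  # all amounts identical, so nothing stands out
        return []
    return [a for a in amounts if abs(a - average) / spread > threshold]


# Hypothetical transaction amounts; one is far larger than the rest.
transactions = [20.0, 25.0, 22.0, 19.0, 24.0, 21.0, 500.0]
suspicious = flag_anomalies(transactions)
```

A production system would likely use richer features than the amount alone (time, location, counterparty), but the idea of scoring each transaction against normal behaviour is the same.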
17
Q

In-Branch Security

A

Incident Response: AI enhances security by analyzing breaches and providing preventative insights.