Udemy lecture 7: Databases & analytics Flashcards

1
Q

What is a relational database?

A

A relational database stores data in multiple tables that are linked to one another through shared columns (e.g. a students table with Student ID, Dept ID, Name, & Email, plus a second table keyed by Dept ID that gives further information about each department). Think of each table like an Excel sheet.

2
Q

Relational databases use the __________ language to perform queries or lookups

A

SQL (Whenever you hear SQL think of relational databases)
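
A minimal sketch of the two-table example from the first card, using Python's built-in sqlite3 module as a stand-in for a relational database. The table names, column names, and rows are made up for illustration.

```python
import sqlite3

# In-memory SQLite database: a stand-in for any relational engine.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE departments (
        dept_id INTEGER PRIMARY KEY,
        dept_name TEXT NOT NULL
    );
    CREATE TABLE students (
        student_id INTEGER PRIMARY KEY,
        dept_id INTEGER REFERENCES departments(dept_id),
        name TEXT NOT NULL,
        email TEXT NOT NULL
    );
    INSERT INTO departments VALUES (10, 'Physics'), (20, 'History');
    INSERT INTO students VALUES
        (1, 10, 'Ana', 'ana@example.com'),
        (2, 20, 'Ben', 'ben@example.com');
""")

# The shared dept_id column links the two tables in a SQL query.
rows = conn.execute("""
    SELECT s.name, d.dept_name
    FROM students AS s
    JOIN departments AS d ON d.dept_id = s.dept_id
    ORDER BY s.student_id
""").fetchall()
print(rows)
```

The JOIN on dept_id is what makes the data "relational": each student row points at a department row instead of repeating the department details.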

3
Q

______________ databases are nonrelational databases

A

NoSQL

4
Q

____________ databases are purpose-built for specific data models & have flexible schemas for building modern applications

A

NoSQL

5
Q

What are some benefits of NoSQL databases?

A
  • Flexible- easy to evolve data model
  • Scalability- designed to scale out by using distributed clusters
  • High-performance- optimized for a specific data model
  • Highly functional- types optimized for the data model
6
Q

What does JSON stand for?

A

JavaScript Object Notation

7
Q

NoSQL can have its data in _________ format

A

JSON

8
Q

Data can be _______ in the JSON format

A

Nested (data is stored in a structured way, but the fields can change over time, and there is support for new types such as arrays)
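
A small sketch of nesting and a flexible schema, using Python's standard json module. The record and its field names are hypothetical.

```python
import json

# Hypothetical user record: fields can nest objects and arrays.
user = {
    "user_id": 42,
    "name": "Ana",
    "address": {"city": "Lisbon", "country": "PT"},
    "orders": [{"sku": "A1", "qty": 2}],
}

# The schema is flexible: a later record can gain a new field
# without changing records that were stored earlier.
user["preferences"] = {"newsletter": True}

encoded = json.dumps(user)          # serialize to a JSON string
decoded = json.loads(encoded)       # parse it back into Python objects
print(decoded["address"]["city"])   # reach into the nested object
```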

9
Q

What is AWS's responsibility for managed databases?

A
  • Patching for the entire database
  • Automated backup & restore, operations, upgrades
  • Monitoring, alerting

-AWS offers managed services for many different types of databases

10
Q

______ is AWS's managed relational database service

A

RDS

11
Q

What does RDS stand for?

A

Relational Database Service

12
Q

What is the Relational Database Service?

A

A managed database service for databases that use SQL as a query language; it lets you create databases in the cloud that are managed by AWS

13
Q

_________ is a proprietary database from AWS

A

Aurora

14
Q

What are the advantages of using RDS over deploying a database on EC2?

A
  • Automated provisioning, OS patching
  • Continuous backups & restore to a specific timestamp (point-in-time restore)
  • Monitoring dashboards
  • Read replicas for improved read performance
  • Multi-AZ setup for DR (disaster recovery)
  • Maintenance windows for upgrades
  • Scaling capability (vertical & horizontal)
15
Q

With RDS databases, you can't use ________ to log in to the underlying instances

A

SSH

16
Q

What are the two kinds of database technologies that Aurora supports?

A
  1. PostgreSQL
  2. MySQL
17
Q

Aurora is _________-optimized to yield better performance

A

Cloud

18
Q

Aurora storage grows automatically from __________________

A

10 gigabytes to 128 terabytes

19
Q

__________ & ___________ are the two ways to create relational databases on AWS

A

RDS & Aurora (both are managed; Aurora is more cloud-native, whereas RDS runs the database technologies you already know as a managed service)

20
Q

The __________ option for Amazon Aurora is where database instantiation is automated

A

Serverless (also has auto scaling based on your usage)

21
Q

Both ________ & ___________ are supported as engines for Aurora Serverless

A

PostgreSQL & MySQL

22
Q

Aurora serverless is great for _____________ workloads

A

Infrequent/unpredictable workloads

23
Q

If you see Aurora with no management overhead, think of ______________

A

Aurora Serverless

24
Q

___________ can scale the read workload of your database

A

RDS read replicas (you can create up to 15 replicas, & data is only written to the main database)

25
Q

A __________ is useful to have in case of an AZ outage or problems with the main database (high availability)

A

Failover database (so it's basically Multi-AZ)

26
Q

In a ___________ setup, data is only read from & written to the main database, & only one other AZ can serve as the ________

A

Multi-AZ; failover

27
Q

You can use read replicas across multiple regions & use them for ___________ in case of a regional issue; local reads also get better performance & lower latency, but cross-region replication has a cost

A

Disaster recovery

28
Q

____________ is used to get managed Redis or Memcached databases

A

ElastiCache

29
Q

_________ are in-memory databases used as caches, with high performance & low latency

A

Redis or Memcached

30
Q

Whenever you see an “in-memory” database, think of ___________

A

ElastiCache

31
Q

________________ helps take load off databases for read-intensive workloads

A

ElastiCache
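
One common way a cache takes read load off a database is the cache-aside pattern, sketched below. This is an illustration only: plain Python dicts stand in for the in-memory cache and for the database, and the keys and the `get_user` helper are hypothetical.

```python
# A dict stands in for the in-memory cache (Redis or Memcached),
# and a second dict stands in for the slower database.
database = {"user:1": "Ana", "user:2": "Ben"}
cache = {}
stats = {"db_reads": 0}

def get_user(key):
    """Return the value for key, reading the database only on a cache miss."""
    if key in cache:
        return cache[key]
    stats["db_reads"] += 1
    value = database.get(key)
    if value is not None:
        cache[key] = value  # populate the cache for later reads
    return value

for _ in range(5):
    get_user("user:1")
print(stats["db_reads"])
```

Five reads of the same key reach the database only once; the other four are served from the cache.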

32
Q

____________ is fully managed and highly available with replication across 3 AZs

A

DynamoDB

33
Q

DynamoDB is a __________ database

A

NoSQL database (not a relational database)

34
Q

___________ has single-digit millisecond latency (low-latency retrieval) & scales to massive workloads as a distributed “serverless” database

A

DynamoDB

35
Q

In terms of its data model, DynamoDB is a __________ database

A

Key/value

36
Q

_______________ is a fully managed in-memory cache for DynamoDB (can give you a 10x performance improvement)

A

DynamoDB Accelerator (DAX)

37
Q

What is the difference between ElastiCache & DynamoDB DAX?

A

DAX is only used for & is integrated with DynamoDB, while ElastiCache can be used for other databases

38
Q

_______________ make DynamoDB tables accessible with low latency in multiple regions

A

DynamoDB global tables

39
Q

DynamoDB global tables use ___________ replication (read/write to any AWS region)

A

Active-Active

40
Q

______________ is based on PostgreSQL, but it's not used for OLTP

A

Redshift

41
Q

Redshift is used for ___________

A

OLAP (online analytical processing), which covers analytics & data warehousing

42
Q

Redshift stores data in __________ storage

A

Columnar (instead of row-based)
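
A toy comparison of row-based and columnar layouts, to show why analytics favors columns. This is an illustrative sketch with made-up sales records, not a description of Redshift's internal storage format.

```python
# Hypothetical sales records, first in row-based layout:
# each record keeps all of its fields together.
rows = [
    {"order_id": 1, "region": "EU", "amount": 120.0},
    {"order_id": 2, "region": "US", "amount": 80.0},
    {"order_id": 3, "region": "EU", "amount": 50.0},
]

# Columnar layout: each column's values are stored together.
columns = {name: [row[name] for row in rows] for name in rows[0]}

# An analytical query such as SUM(amount) touches only one column
# instead of reading every field of every row.
total = sum(columns["amount"])
print(total)
```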

43
Q

Redshift __________ is a feature of Redshift that lets you automatically provision & scale the underlying data warehouse capacity

A

Serverless (Redshift Serverless)

44
Q

With ________________ you run analytics workloads without managing data warehouse infrastructure

A

Redshift Serverless

45
Q

What does EMR stand for?

A

Elastic MapReduce

46
Q

________ helps create Hadoop clusters (big data) to analyze & process vast amounts of data

A

EMR

47
Q

With EMR, a Hadoop cluster can be made of hundreds of ______________

A

EC2 instances

48
Q

What are the different use cases for EMR?

A
  • Data processing
  • Machine learning
  • Web indexing
  • Big data
49
Q

__________ is a serverless query service to perform analytics against S3 objects

A

Amazon Athena

50
Q

Amazon Athena uses __________ language to query files

A

SQL

51
Q

___________ analyzes data in S3 using serverless SQL

A

Amazon Athena

52
Q

_____________ is a serverless machine learning-powered business intelligence service to create interactive dashboards

A

Amazon QuickSight

53
Q

___________ is AWS's MongoDB-compatible database, playing the role for MongoDB (which is a NoSQL database) that Aurora plays for PostgreSQL/MySQL

A

DocumentDB

54
Q

________________ is a fully managed graph database

A

Amazon Neptune

55
Q

A popular example of a graph dataset is a ____________

A

Social network
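
A small sketch of the kind of relationship query a graph dataset invites: "who do the people I follow also follow?". The adjacency list, the names, and the `friends_of_friends` helper are hypothetical; a graph database such as Neptune would express this in a graph query language rather than in Python.

```python
# Hypothetical social network stored as an adjacency list:
# each person maps to the set of people they follow.
follows = {
    "ana": {"ben", "cy"},
    "ben": {"cy", "dee"},
    "cy": {"dee"},
    "dee": set(),
}

def friends_of_friends(graph, person):
    """People reachable in two hops who are not already followed."""
    direct = graph[person]
    two_hops = set()
    for friend in direct:
        two_hops |= graph[friend]
    return two_hops - direct - {person}

print(sorted(friends_of_friends(follows, "ana")))
```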

56
Q

What does QLDB stand for?

A

Quantum Ledger Database

57
Q

A _________ is a book recording financial transactions

A

ledger

58
Q

_____________ is used to record financial transactions in AWS

A

Amazon QLDB

59
Q

Amazon QLDB is used to review the history of all changes made to your application data over time; it's an _________ system, which means no entry can be removed or modified, & it is cryptographically verifiable

A

Immutable
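
One general technique for making an append-only record cryptographically verifiable is a hash chain, sketched below with Python's hashlib. This illustrates the idea only and is not a description of how QLDB is implemented; the function names and the sample entries are hypothetical.

```python
import hashlib
import json

def entry_hash(prev_hash, data):
    """SHA-256 over the previous entry's hash plus this entry's data."""
    payload = json.dumps({"prev": prev_hash, "data": data}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

def append(journal, data):
    """Add an entry that commits to everything recorded before it."""
    prev_hash = journal[-1]["hash"] if journal else ""
    journal.append({"data": data, "hash": entry_hash(prev_hash, data)})

def verify(journal):
    """Recompute every hash; an edited entry breaks the chain."""
    prev_hash = ""
    for entry in journal:
        if entry["hash"] != entry_hash(prev_hash, entry["data"]):
            return False
        prev_hash = entry["hash"]
    return True

journal = []
append(journal, {"account": "A", "amount": 100})
append(journal, {"account": "A", "amount": -30})
print(verify(journal))              # chain is intact
journal[0]["data"]["amount"] = 999  # tamper with history
print(verify(journal))              # tampering is detected
```

Because each hash depends on the one before it, changing an old entry invalidates the stored hashes from that point on.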

60
Q

What is the difference between Amazon Managed Blockchain & Amazon QLDB?

A

Amazon QLDB has no concept of decentralization: there is just a central database owned by Amazon. Amazon Managed Blockchain, by contrast, has a decentralized component

61
Q

_______________ makes it possible to build applications where multiple parties can execute transactions without the need for a trusted, central authority

A

Amazon Managed Blockchain

62
Q

Amazon Managed Blockchain is compatible with the frameworks ___________ & __________

A

Hyperledger Fabric & Ethereum

63
Q

___________ is a managed extract, transform & load (ETL) service

A

AWS Glue
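
To make the three ETL stages concrete, here is a tiny local sketch using only the Python standard library. It does not use or imitate the AWS Glue API; the CSV content, the cleaning rules, and the `sales` table are made up for illustration.

```python
import csv
import io
import sqlite3

# Extract: read raw rows from a CSV source (an in-memory string here).
raw = "name,amount\n ana ,10\nben,n/a\n cy,5\n"
extracted = list(csv.DictReader(io.StringIO(raw)))

# Transform: trim names, drop rows whose amount is not a number.
transformed = []
for row in extracted:
    amount = row["amount"].strip()
    if amount.isdigit():
        transformed.append((row["name"].strip().title(), int(amount)))

# Load: write the cleaned rows into a table that analytics can query.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (name TEXT, amount INTEGER)")
conn.executemany("INSERT INTO sales VALUES (?, ?)", transformed)
total = conn.execute("SELECT SUM(amount) FROM sales").fetchone()[0]
print(total)
```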

64
Q

With __________ you can quickly & securely migrate databases to AWS; the service is resilient & self-healing

A

DMS (Database Migration Service)

65
Q

___________ is a fully managed, petabyte-scale data warehouse service in the cloud.

A

Amazon Redshift

66
Q

__________ is an interactive query service that makes it easy to analyze data in Amazon S3 using standard SQL. It is serverless, so there is no infrastructure to manage, and you pay only for the queries that you run.

A

Amazon Athena

67
Q

________ is a fully managed extract, transform, and load (ETL) service that makes it easy for customers to prepare and load their data for analytics.

A

AWS Glue

68
Q

_____________ helps you migrate databases to AWS quickly and securely. The source database remains fully operational during the migration, minimizing downtime to applications that rely on the database.

A

AWS Database Migration Service

69
Q

The ____________ is a central repository to store structural and operational metadata for all your data assets. For a given data set, you can store its table definition, physical location, add business relevant attributes, as well as track how this data has changed over time.

A

AWS Glue Data Catalog

70
Q

____________ is a managed SQL database service that makes it easy to set up, operate, and scale a relational database in the cloud. It is suited for OLTP workloads

A

Amazon Relational Database Service (Amazon RDS)