04_14 Big Data and NoSQL Flashcards

1
Q

A data model that organizes data around a central entity based on the way the data will be used.

A

aggregate aware

2
Q

A data model that does not organize data around a central entity based on the anticipated usage of the data.

A

aggregate ignorant

3
Q

A process or set of operations in a calculation.

A

algorithm

4
Q

A data processing method that runs data processing tasks from beginning to end without any user interaction.

A

batch processing

5
Q

In the HDFS…

A report sent every 6 hours by the data node to the name node informing the name node which blocks are on that data node.

A

block report

6
Q

A computer-readable format for data interchange that expands the JSON format to include additional data types including binary objects.

A

BSON (Binary JSON)

7
Q

In a key-value database…

A logical collection of related key-value pairs.

A

bucket

8
Q

In document databases…

A logical storage unit that contains similar documents, roughly analogous to a table in a relational database.

A

collection

9
Q

In a column family database…

A collection of columns or super columns related to a collection of rows.

A

column family

10
Q

A NoSQL database model that organizes data into key-value pairs, in which the value component is composed of a set of columns that vary by row.

A

column family database

11
Q

A physical data storage technique in which data is stored in blocks, which hold data from a single column across many rows.

A

column-centric storage

12
Q

A declarative query language used in Neo4j for querying a graph database.

A

Cypher

13
Q

A NoSQL database model that stores data in key-value pairs in which the value component is composed of a tag-encoded document.

A

document database

14
Q

In a graph database…

The representation of a relationship between nodes.

A

edge

15
Q

Analyzing stored data to produce actionable results.

A

feedback loop processing

16
Q

A MongoDB method to retrieve documents from a collection.

A

find()

17
Q

A NoSQL database model based on graph theory that stores relationship-rich data as a collection of nodes and edges.

A

graph database

18
Q

A highly distributed, fault-tolerant file storage system designed to manage large amounts of data at high speeds.

A

Hadoop Distributed File System (HDFS)

19
Q

In the HDFS…

A signal sent every 3 seconds from the data node to the name node to notify the name node that the data node is still available.

A

heartbeat

20
Q

In a Hadoop environment…

A central program used to accept, distribute, monitor, and report on MapReduce processing jobs.

A

job tracker

21
Q

A human-readable text format for data interchange that defines attributes and values in a document.

A

JSON (JavaScript Object Notation)
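
As a rough illustration of attributes and values in a JSON document, the sketch below serializes a small record with Python's standard `json` module and parses it back. The `customer` record and its field names are made up for this example.

```python
import json

# Hypothetical customer record; the field names are illustrative only.
customer = {
    "name": "Ada Lopez",
    "age": 36,
    "emails": ["ada@example.com"],
    "address": {"city": "Austin", "state": "TX"},
}

# Serialize to human-readable JSON text, indented for readability.
text = json.dumps(customer, indent=2)
print(text)

# Parse the text back into Python objects.
restored = json.loads(text)
```

Each attribute name is a string, and each value may be a string, number, array, or nested document.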

22
Q

A NoSQL database model that stores data as a collection of key-value pairs in which the value component is unintelligible to the DBMS.

A

key-value (KV) database
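
One way to picture a value that is unintelligible to the DBMS: the toy store below keeps values as raw bytes inside named buckets and never inspects them. The `KeyValueStore` class, the bucket name, and the key format are all hypothetical and do not reflect any particular product's API.

```python
import json


class KeyValueStore:
    """Toy in-memory key-value store with buckets (illustrative only)."""

    def __init__(self):
        # bucket name -> {key: opaque bytes}
        self._buckets = {}

    def put(self, bucket, key, value: bytes):
        # The store saves the bytes as-is; it never looks inside them.
        self._buckets.setdefault(bucket, {})[key] = value

    def get(self, bucket, key) -> bytes:
        return self._buckets[bucket][key]

    def delete(self, bucket, key):
        del self._buckets[bucket][key]


store = KeyValueStore()

# The application, not the store, decides how the value is encoded.
payload = json.dumps({"name": "Ada", "tier": "gold"}).encode("utf-8")
store.put("customers", "cust:1001", payload)

# Only the application can interpret what comes back.
raw = store.get("customers", "cust:1001")
profile = json.loads(raw.decode("utf-8"))
```

Because the store cannot read the value, lookups are possible only by key; any search on `tier` would have to be done in application code.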

23
Q

The function in a MapReduce job that sorts and filters data into a set of key-value pairs as a subtask within a larger job.

A

map

24
Q

An open-source API that provides fast data analytics services.

One of the main Big Data technologies that allows organizations to process massive data stores.

A

MapReduce

25
Q

A program that performs a map function.

A

mapper

26
Q

In the object-oriented data model…

A named set of instructions to perform an action.

A

method

Methods represent real-world actions and are invoked through messages. Also, a programmed function within an object used to manipulate the data in that same object.

27
Q

A database model that attempts to provide ACID-compliant transactions across a highly distributed infrastructure.

A

NewSQL

28
Q

In a graph database…

The representation of a single entity instance.

A

node

29
Q

A new generation of database management systems that is not based on the traditional relational database model.

A

NoSQL

30
Q

The coexistence of a variety of data storage and data management technologies within an organization’s infrastructure.

A

polyglot persistence

31
Q

In MongoDB…

A method that can be chained to the find() method to improve the readability of retrieved documents through the use of line breaks and indentation.

A

pretty()

32
Q

In a graph database…

The attributes or characteristics of a node or edge that are of interest to the users.

A

properties

33
Q

The function in a MapReduce job that collects and summarizes the results of map functions to produce a single result.

A

reduce

34
Q

A program that performs a reduce function.

A

reducer
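
The map and reduce roles can be sketched as a single-process word count. This is a simplified illustration in plain Python, not Hadoop's actual API; the `shuffle` helper stands in for the grouping step a real framework performs between the two phases, and all function names are assumptions made for this example.

```python
from collections import defaultdict


def mapper(line):
    """Map: emit a (word, 1) key-value pair for each word in one line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group values by key (done by the framework between map and reduce)."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reducer(key, values):
    """Reduce: summarize all values for one key into a single result."""
    return key, sum(values)


lines = ["big data big value", "data velocity"]

# Each line could be handled by a separate mapper on a separate node.
mapped = [pair for line in lines for pair in mapper(line)]

# One reducer call per distinct key produces the final counts.
counts = dict(reducer(key, values) for key, values in shuffle(mapped).items())
print(counts)
```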

35
Q

A physical data storage technique in which data is stored in blocks, which hold data from all columns of a given set of rows.

A

row-centric storage
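
To contrast row-centric and column-centric storage, the sketch below lays out the same three records both ways using plain Python lists. The sample data and the two-row block size are arbitrary assumptions; real storage engines work with disk blocks, not lists.

```python
# Three sample records: (id, name, age). Values are made up.
rows = [
    (1, "Ada", 36),
    (2, "Grace", 45),
    (3, "Alan", 41),
]

# Row-centric: each block holds all columns for a given set of rows.
row_blocks = [rows[0:2], rows[2:3]]

# Column-centric: each block holds one column across many rows.
column_blocks = {
    "id": [r[0] for r in rows],
    "name": [r[1] for r in rows],
    "age": [r[2] for r in rows],
}

# Retrieving one complete record touches a single row block.
record = row_blocks[0][1]

# Aggregating one column touches only that column's block.
average_age = sum(column_blocks["age"]) / len(column_blocks["age"])
```

The row layout tends to suit transactions that read or write whole records, while the column layout tends to suit analytics that scan a few columns over many rows.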

36
Q

A method for dealing with data growth that involves distributing data storage across a cluster of commodity servers.

A

scaling out

37
Q

A method for dealing with data growth that involves migrating the same structure to more powerful systems.

A

scaling up

38
Q

A method of text analysis that attempts to determine if a statement conveys a positive, negative, or neutral attitude.

A

sentiment analysis
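
A deliberately naive sketch of the idea: count words from small positive and negative word lists and compare the totals. The word lists and the scoring rule are assumptions made for illustration; practical sentiment analysis relies on far richer language models.

```python
# Tiny hand-picked word lists (illustrative assumptions only).
POSITIVE = {"good", "great", "love", "excellent"}
NEGATIVE = {"bad", "poor", "hate", "terrible"}


def sentiment(statement):
    """Classify a statement as positive, negative, or neutral."""
    words = [w.strip(".,!?").lower() for w in statement.split()]
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


print(sentiment("I love this great phone!"))
print(sentiment("Terrible battery, poor screen."))
print(sentiment("It arrived on Tuesday."))
```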

39
Q

The processing of data inputs in order to make decisions about which data to keep and which data to discard before storage.

A

stream processing
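
The keep-or-discard decision can be sketched with a generator that inspects each input as it arrives and passes along only what should be stored. The sensor readings and the temperature thresholds in `keep_reading` are hypothetical.

```python
def keep_reading(reading):
    """Hypothetical rule: keep only readings outside the normal range."""
    return reading["temp_c"] < 0 or reading["temp_c"] > 40


def process_stream(readings):
    """Yield only the inputs worth storing, one at a time as they arrive."""
    for reading in readings:
        if keep_reading(reading):
            yield reading


# A list stands in for an unbounded stream of incoming data.
incoming = [
    {"sensor": "a", "temp_c": 21.5},
    {"sensor": "b", "temp_c": 47.0},
    {"sensor": "a", "temp_c": -3.2},
]

# Only the filtered readings ever reach storage.
stored = list(process_stream(incoming))
```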

40
Q

Data that conforms to a predefined data model and has been formatted to facilitate storage, use, and information generation.

A

structured data

41
Q

In a column family database…

A column that is composed of a group of other related columns.

A

super column

42
Q

A program in the MapReduce framework responsible for running map and reduce tasks on a node.

A

task tracker

43
Q

A query in a graph database.

A

traversal
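
Nodes, edges, properties, and a traversal can be pictured together in a small in-memory sketch. The people, the `FRIEND_OF` relationship, and the `traverse` helper are invented for this example; a graph database would express the same query in its own language, such as Cypher in Neo4j.

```python
from collections import deque

# Nodes: one entry per entity instance, each with its properties.
nodes = {
    "alice": {"label": "Person", "age": 34},
    "bob": {"label": "Person", "age": 29},
    "carol": {"label": "Person", "age": 41},
}

# Edges: (source, relationship, target, properties). Edges are directed.
edges = [
    ("alice", "FRIEND_OF", "bob", {"since": 2015}),
    ("bob", "FRIEND_OF", "carol", {"since": 2019}),
]


def traverse(start, relationship):
    """Breadth-first traversal that follows edges of one relationship type."""
    seen, order, queue = {start}, [], deque([start])
    while queue:
        current = queue.popleft()
        order.append(current)
        for src, rel, dst, _props in edges:
            if src == current and rel == relationship and dst not in seen:
                seen.add(dst)
                queue.append(dst)
    return order


# Everyone reachable from alice through FRIEND_OF edges.
reachable = traverse("alice", "FRIEND_OF")
print(reachable)
```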

44
Q

Data that exists in its original, raw state.

That is, data that remains in the format in which it was collected and does not conform to a predefined data model.

A

unstructured data

45
Q

The degree to which data can be analyzed to provide meaningful insights.

A

value

46
Q

A characteristic of Big Data that describes the speed at which data enters the system and must be processed.

A

velocity

47
Q

The trustworthiness of a set of data.

A

veracity

48
Q

The ability to graphically present data in such a way as to make it understandable to users.

A

visualization

49
Q

A characteristic of Big Data that describes the quantity of data to be stored.

A

volume