Data Management Flashcards
What is a data model?
A simple representation that shows how data structures, together with their characteristics, relations, constraints, and transformations, form a database that solves a real-world business problem.
Business rule
A brief, precise description of a policy, procedure, or principle within a specific organization. Business rules are used to define entities, attributes, relationships, and constraints.
The basic features of the relational data model
Entities, attributes, relationships, and constraints.
Translate business rules into data models
A noun in a business rule translates into an entity in the model, and a verb (active or passive) that associates the nouns translates into a relationship among the entities.
Entity
An entity is a person, place, thing, concept, or event about which data will be collected and stored.
Attribute
An attribute is a characteristic of an entity. For example, a CUSTOMER entity would be described by attributes such as customer last name, customer first name, customer phone number, customer address, and customer credit limit.
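As a minimal sketch of the CUSTOMER example, an entity can be written as a Python dataclass whose fields are its attributes. The field names and sample values are illustrative assumptions, not part of the card.

```python
from dataclasses import dataclass, fields


@dataclass
class Customer:
    """One CUSTOMER entity occurrence; each field is an attribute."""
    last_name: str
    first_name: str
    phone: str
    address: str
    credit_limit: float


def attribute_names(entity_type):
    """Return the attribute names declared for an entity type."""
    return [f.name for f in fields(entity_type)]


# Hypothetical sample data, used only for illustration.
customer = Customer("Lastname", "Firstname", "555-0100", "1 Example St", 500.0)
```

Calling `attribute_names(Customer)` lists the five attributes in the order they are declared.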
Relationships
A relationship describes an association among entities. Data models use three types of relationships: one-to-many, many-to-many, and one-to-one, with the shorthand notations 1:M or 1..*, M:N or *..*, and 1:1 or 1..1, respectively.
Visualize One-to-many (1:M or 1..*) relationship.
An author may have many books, but each book has one author: “AUTHOR publishes BOOK” is labeled 1:M.
A customer may generate many invoices, but each invoice is generated by one customer: “CUSTOMER generates INVOICE” is labeled 1:M.
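One common way to realize a 1:M relationship in a relational database is a foreign key on the "many" side. The sketch below uses Python's built-in sqlite3 module with an in-memory database; the table names, column names, and sample rows are assumptions made for illustration.

```python
import sqlite3


def build_one_to_many():
    """Create CUSTOMER (the "1" side) and INVOICE (the "M" side)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
    conn.executescript("""
        CREATE TABLE customer (
            cus_id   INTEGER PRIMARY KEY,
            cus_name TEXT NOT NULL
        );
        CREATE TABLE invoice (
            inv_id    INTEGER PRIMARY KEY,
            cus_id    INTEGER NOT NULL REFERENCES customer(cus_id),
            inv_total REAL NOT NULL
        );
        INSERT INTO customer VALUES (1, 'Example Customer');
        INSERT INTO invoice VALUES (101, 1, 25.0), (102, 1, 40.0);
    """)
    return conn


def invoices_for(conn, cus_id):
    """Return the invoice ids generated by one customer."""
    rows = conn.execute(
        "SELECT inv_id FROM invoice WHERE cus_id = ? ORDER BY inv_id", (cus_id,)
    ).fetchall()
    return [row[0] for row in rows]


def can_add_invoice(conn, inv_id, cus_id):
    """Try to add an invoice; False if the customer does not exist."""
    try:
        conn.execute("INSERT INTO invoice VALUES (?, ?, 0.0)", (inv_id, cus_id))
        return True
    except sqlite3.IntegrityError:
        return False
```

Each invoice row carries exactly one `cus_id`, while one customer can appear on any number of invoice rows.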
Visualize Many-to-many (M:N or *..*) relationship.
An employee may learn many job skills, and each job skill may be learned by many employees. “EMPLOYEE learns SKILL” as M:N.
A student can take many classes, and each class can be taken by many students: “STUDENT takes CLASS” is labeled M:N.
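The cards do not say how an M:N relationship is stored. A usual relational approach, shown here as an assumption, is a linking table holding one row per student/class pair. All table names, column names, and rows below are hypothetical.

```python
import sqlite3


def build_many_to_many():
    """Create STUDENT, CLASS, and a linking table ENROLL between them."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE student (stu_id INTEGER PRIMARY KEY, stu_name TEXT NOT NULL);
        CREATE TABLE class (class_id INTEGER PRIMARY KEY, class_title TEXT NOT NULL);
        CREATE TABLE enroll (
            stu_id   INTEGER NOT NULL REFERENCES student(stu_id),
            class_id INTEGER NOT NULL REFERENCES class(class_id),
            PRIMARY KEY (stu_id, class_id)  -- one row per student/class pair
        );
        INSERT INTO student VALUES (1, 'Student A'), (2, 'Student B');
        INSERT INTO class VALUES (10, 'Databases'), (20, 'Statistics');
        INSERT INTO enroll VALUES (1, 10), (1, 20), (2, 10);
    """)
    return conn


def classes_for(conn, stu_id):
    """Titles of the classes taken by one student."""
    rows = conn.execute(
        "SELECT c.class_title FROM enroll e JOIN class c ON c.class_id = e.class_id "
        "WHERE e.stu_id = ? ORDER BY c.class_title", (stu_id,)
    ).fetchall()
    return [row[0] for row in rows]


def students_in(conn, class_id):
    """Names of the students taking one class."""
    rows = conn.execute(
        "SELECT s.stu_name FROM enroll e JOIN student s ON s.stu_id = e.stu_id "
        "WHERE e.class_id = ? ORDER BY s.stu_name", (class_id,)
    ).fetchall()
    return [row[0] for row in rows]
```

Reading the linking table in either direction gives "many": one student maps to several classes, and one class maps to several students.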
One-to-one (1:1 or 1..1) relationship.
A retail company’s management structure may require that each of its stores be managed by a single employee and, in turn, that each store manager manage only a single store. “EMPLOYEE manages STORE” is labeled 1:1.
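A 1:1 relationship can be sketched as a foreign key that is also declared UNIQUE, assuming an employee manages at most one store. The schema and sample rows are illustrative assumptions.

```python
import sqlite3


def build_one_to_one():
    """Create EMPLOYEE and STORE, where each store has exactly one manager."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, emp_name TEXT NOT NULL);
        CREATE TABLE store (
            store_id       INTEGER PRIMARY KEY,
            -- UNIQUE keeps one employee from managing two stores
            manager_emp_id INTEGER NOT NULL UNIQUE REFERENCES employee(emp_id)
        );
        INSERT INTO employee VALUES (1, 'Employee A'), (2, 'Employee B');
        INSERT INTO store VALUES (100, 1);
    """)
    return conn


def try_assign_manager(conn, store_id, emp_id):
    """Add a store with the given manager; False if the 1:1 rule is violated."""
    try:
        conn.execute("INSERT INTO store VALUES (?, ?)", (store_id, emp_id))
        return True
    except sqlite3.IntegrityError:
        return False
```

Employee 1 already manages store 100, so a second store with the same manager is rejected, while a store managed by employee 2 is accepted.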
Schema
The schema is the conceptual organization of the entire database as viewed by the database administrator.
Hierarchical model
The hierarchical model depicts a set of one-to-many (1:M) relationships between a parent and its child segments.
Network model
The user perceives the network database as a collection of records in 1:M relationships. However, unlike the hierarchical model, the network model allows a record to have more than one parent.
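The difference between the two models comes down to how many parents a record may have. A minimal sketch, with hypothetical record names, stores each child's parents in a dictionary and checks the single-parent rule.

```python
# Hypothetical structures: each key is a child, each value lists its parents.
hierarchical_parents = {
    "SEGMENT_B": ["SEGMENT_A"],
    "SEGMENT_C": ["SEGMENT_A"],
}
network_parents = {
    "INVOICE": ["CUSTOMER", "SALESREP"],  # two parents: allowed only in the network model
    "PAYMENT": ["CUSTOMER"],
}


def is_hierarchical(parents):
    """True when every child has at most one parent."""
    return all(len(parent_list) <= 1 for parent_list in parents.values())
```

Here `is_hierarchical(hierarchical_parents)` holds, while `is_hierarchical(network_parents)` does not, because INVOICE has two parents.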
Subschema
The subschema defines the portion of the database “seen” by the application programs that actually produce the desired information from the data within the database.
Relational model
A relation (sometimes called a table) is a two-dimensional structure composed of intersecting rows and columns. Each row in a relation is called a tuple. Each column represents an attribute.
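As a rough sketch, a relation can be modeled as a list of Python tuples (the rows) plus a tuple of attribute names (the columns). The STUDENT attributes and values below are made up for illustration.

```python
# Hypothetical STUDENT relation: attribute names, then one tuple per row.
attributes = ("stu_id", "stu_lname", "stu_gpa")
relation = [
    (1, "Smith", 3.2),
    (2, "Jones", 3.7),
]


def column(rows, attribute_names, name):
    """Return every value stored under one attribute (one column)."""
    position = attribute_names.index(name)
    return [row[position] for row in rows]


def is_well_formed(rows, attribute_names):
    """Every tuple must supply exactly one value per attribute."""
    return all(len(row) == len(attribute_names) for row in rows)
```

`column(relation, attributes, "stu_lname")` reads down one column, while each element of `relation` is one tuple.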
The Entity relationship model
The graphical representation of entities and their relationships in a database structure. ER models are normally represented in an entity relationship diagram (ERD).
Object-oriented data model (OODM)
Data and its relationships are contained in a single structure known as an object, forming the basis for the object-oriented database management system (OODBMS).
Why is an object said to have greater semantic content than an entity?
An object includes information about relationships between the facts within the object, as well as information about its relationships with other objects.
Class vs object
A class is a collection of similar objects with shared structure (attributes) and behavior (methods). An object is an abstraction of a real-world entity.
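The distinction can be sketched in Python, where a class declares the shared attributes and methods and each object is one instance with its own values. The `Student` class and its data are hypothetical.

```python
class Student:
    """Class: shared structure (name, credits) and behavior (add_credits)."""

    def __init__(self, name, credits):
        self.name = name
        self.credits = credits

    def add_credits(self, earned):
        """Method shared by every Student object."""
        self.credits += earned
        return self.credits


# Objects: two separate instances of the same class, each with its own state.
student_a = Student("Student A", 30)
student_b = Student("Student B", 12)
```

Calling `student_a.add_credits(3)` changes only `student_a`; `student_b` keeps its own credit total.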
Extensible Markup Language (XML)
A markup language designed to store, transport, and structure data in a way that is both human-readable and machine-readable. XML allows users to define their own tags and data structures based on their needs, for example <tag></tag>.
Table
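A short sketch of user-defined tags, parsed with Python's standard xml.etree.ElementTree module. The tag names and values are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: every tag name here was chosen by the author of the data.
document = """<customers>
  <customer id="1">
    <last_name>Example</last_name>
    <credit_limit>500</credit_limit>
  </customer>
</customers>"""

root = ET.fromstring(document)


def customer_last_names(customers_element):
    """Collect the text of each <last_name> tag under <customer>."""
    return [c.findtext("last_name") for c in customers_element.findall("customer")]
```

The same text is readable by a person and can be traversed by a program through `root`.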
A logical construct perceived to be a two-dimensional structure composed of intersecting rows (entities) and columns (attributes) that represents an entity set in the relational model.
The 3 Vs
VOLUME: the amount of data being stored
VELOCITY: the speed at which data grows, and also the need to process that data quickly
VARIETY: the fact that data is collected in multiple formats
Internet of Things (IoT)
A web of Internet-connected devices constantly exchanging and collecting data over the Internet. IoT devices can be remotely managed and configured to collect data and interact with other devices on the Internet.
What is Hadoop, and what are its basic components?
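Hadoop is a Java-based, open-source framework for the distributed storage and processing of very large data sets across clusters of computers. Its basic components are the Hadoop Distributed File System (HDFS) and MapReduce. HDFS is a file storage system that uses the write-once, read-many model, which means that once the data is written, it cannot be modified.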