Flashcards in Exam 2 Deck (23):
Understand the 5 components of IS-
1. Hardware- Computer side/ Actor. Refers to the machinery that includes the computer itself, and all the equipment that goes along with it.
2. Software- Computer side/ Instructions. Software refers to the computer programs and the manuals that support them.
3. Data- Bridge. Data are facts that are used by programs to produce useful information.
4. Procedures- Human side/ Instructions. Procedures are the policies that govern the operation of a computer system. “Procedures are to people what software is to hardware.”
5. People- Human side/ Actor. Every system needs people if it is to be useful. Includes not only the users, but also those who operate and service the computers.
What is the purpose of a database?
A database is a self-describing collection of integrated records. A database is built up as a hierarchy: bytes (a byte is a character of data) are grouped into fields, fields into records, and records into a table or file. The purpose of a database is to keep track of things. Lists of data involving a single theme can be stored in a spreadsheet; a list that involves data with multiple themes requires a database.
• What are records, tables, files, and fields?
Fields- A byte is a character of data. In a database, bytes are grouped into fields (also called columns), such as Student Name and Student Number. A database table has multiple columns that represent the attributes of an entity.
Records- Columns or fields, in turn, are grouped into rows (also called records). The collection of data for all columns (Student Name, Student Number, HW1, HW2, Test Scores, etc.) is called a row. A record is a group of columns in a database table.
Table/ File- A group of similar rows or records in a database.
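The field, record, and table hierarchy can be seen in a small sketch using Python's built-in sqlite3 module. The Student table, its columns, and the sample values are assumptions made up for illustration.

```python
import sqlite3

# In-memory database; table, column names, and values are illustrative only.
conn = sqlite3.connect(":memory:")

# A table (file) groups similar records; each column is a field.
conn.execute(
    "CREATE TABLE Student ("
    "StudentNumber INTEGER, StudentName TEXT, HW1 INTEGER, HW2 INTEGER)"
)

# Each inserted tuple is one record (row): one value for every field.
conn.executemany(
    "INSERT INTO Student VALUES (?, ?, ?, ?)",
    [(1001, "Ana Lopez", 90, 85), (1002, "Ben Chu", 78, 92)],
)

# Field names come from the cursor description; records come from the rows.
cursor = conn.execute("SELECT * FROM Student ORDER BY StudentNumber")
field_names = [column[0] for column in cursor.description]
records = cursor.fetchall()

print(field_names)
print(records)
```

Here field_names lists the columns, and each tuple in records is one row of the table.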
• Define what a database is and describe its key elements (forms, reports, queries and application programs).
Database- As previously stated, a database is a self-describing collection of integrated records.
A database application is a collection of forms, reports, queries, and application programs that serves as an intermediary between users and database data.
Forms- a form is a window or screen that contains numerous fields, or spaces to enter data. Each field holds a field label so that any user who views the form gets an idea of its contents. A form is more user friendly than generating queries to create tables and insert data into fields.
Forms: View data; insert new, update existing, and delete existing data.
Reports- A report is the formatted result of database queries and contains useful data for decision-making and analysis. Most good business applications contain a built-in reporting tool; this is simply a front-end interface that runs back-end database queries and formats the results for easy use.
Reports: Structured presentation of data using sorting, grouping, filtering, and other operations.
Queries- A query is a request for data or information from a database table or combination of tables. This data may be generated as results returned by Structured Query Language (SQL) or as pictorials, graphs or complex results, e.g., trend analyses from data-mining tools.
Queries: Search based on data values provided by the user.
Application Programs- A database application is a computer program whose primary purpose is entering and retrieving information from a computerized database. Early examples of database applications were accounting systems and airline reservations systems, such as SABRE, developed starting in 1957.
Application Programs: Provide security, data consistency, and special-purpose processing (e.g., handling out-of-stock situations).
• What are primary and foreign keys?
Primary Key- A key (or primary key) is a column or group of columns that identifies a unique row in a table. Every table must have a key. Sometimes more than one column is needed to form a unique identifier. For example: in a table called City, the key would consist of the combination of columns (City, State) because a given city name can appear in more than one state.
Further definition: A primary key is a special relational database table column (or combination of columns) designated to uniquely identify all table records. A primary key's main features are: it must contain a unique value for each row of data, and it cannot contain null values.
Foreign Key- A term used because such columns are keys, but they are keys of a different (foreign) table than the one in which they reside.
Further definition: A foreign key is a field (or collection of fields) in one table that uniquely identifies a row of another table or the same table. In simpler words, the foreign key is defined in a second table, but it refers to the primary key or a unique key in the first table.
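As a minimal sketch of both kinds of key, the example below uses Python's built-in sqlite3 module. The Office table, the try_insert helper, and the sample rows are assumptions added for illustration; only the (City, State) key idea comes from the notes above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this setting is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# Composite primary key: neither column is unique alone, the pair is.
conn.execute(
    "CREATE TABLE City ("
    "CityName TEXT NOT NULL, State TEXT NOT NULL, "
    "PRIMARY KEY (CityName, State))"
)

# Hypothetical second table whose foreign key points at City's primary key.
conn.execute(
    "CREATE TABLE Office ("
    "OfficeID INTEGER PRIMARY KEY, "
    "CityName TEXT NOT NULL, State TEXT NOT NULL, "
    "FOREIGN KEY (CityName, State) REFERENCES City (CityName, State))"
)

conn.execute("INSERT INTO City VALUES ('Springfield', 'IL')")
conn.execute("INSERT INTO City VALUES ('Springfield', 'MO')")  # same name, other state
conn.execute("INSERT INTO Office VALUES (1, 'Springfield', 'IL')")


def try_insert(sql):
    """Run one INSERT and report whether the key constraints allowed it."""
    try:
        conn.execute(sql)
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"


# Repeats an existing (CityName, State) pair, so the primary key rejects it.
duplicate_key = try_insert("INSERT INTO City VALUES ('Springfield', 'IL')")
# Refers to a city that is not in City, so the foreign key rejects it.
missing_parent = try_insert("INSERT INTO Office VALUES (2, 'Springfield', 'TX')")

print(duplicate_key, missing_parent)
```

Both extra inserts are refused: the first breaks uniqueness of the primary key, the second points at a row that does not exist in the referenced table.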
• What is a DBMS?
DBMS stands for Database Management System. It is a program used to create, process, and administer a database. As with operating systems, almost no organization develops its own DBMS. Instead, companies license DBMS products from vendors such as IBM, Microsoft, and Oracle.
Popular DBMS products are:
1. DB2 -IBM
2. Access and SQL Server- Microsoft
3. Oracle Database- Oracle
4. MySQL- an open-source DBMS product that is license-free for most applications.
Other DBMS products are available, but these five process the great bulk of databases today.
A DBMS and a database are two separate entities. A DBMS is a software program; a database is a collection of tables, relationships, and metadata.
Database developers use the DBMS to create tables, relationships, and other structures in the database.
• Understand what SQL is used for.
SQL- An international standard language for processing a database. All five of the DBMS products mentioned earlier accept and process SQL statements. You do not need to understand or remember SQL syntax; just realize that SQL is an international standard for processing a database. SQL can also be used to create databases and database structures.
Further definition: SQL is an abbreviation for Structured Query Language, and is pronounced either see-kwell or as separate letters. SQL is a standardized query language for requesting information from a database.
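For a feel of what SQL statements look like (not something to memorize), here is a small sketch run through Python's built-in sqlite3 module. The Student table, its rows, and the minimum_score variable are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# SQL used to create a database structure.
conn.execute(
    "CREATE TABLE Student ("
    "StudentNumber INTEGER PRIMARY KEY, StudentName TEXT, TestScore INTEGER)"
)

# SQL used to add data (sample rows are made up).
conn.executemany(
    "INSERT INTO Student VALUES (?, ?, ?)",
    [(1, "Ana", 91), (2, "Ben", 67), (3, "Cara", 84)],
)

# SQL used to request information: a query based on a value supplied by the user.
minimum_score = 80
rows = conn.execute(
    "SELECT StudentName, TestScore FROM Student "
    "WHERE TestScore >= ? ORDER BY TestScore DESC",
    (minimum_score,),
).fetchall()

print(rows)
```

The query returns only the students whose score meets the threshold, highest score first.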
• What is a data model?
Database structures can be complex, in some cases very complex. So, before building the database, developers construct a logical representation of database data called a “data model”. It describes the data and relationships that will be stored in the database. It is akin to a blueprint. Just as building architects create a blueprint before they start building, database developers create a data model before they start designing the database.
Interviews with users lead to database requirements, which are summarized in a data model. Once the users have approved the data model, it is transformed into a database design. That design is then implemented into database structures.
• What are entities and attributes?
Entities- An entity is something that the users want to track. Examples of entities are: Order, Customer, Salesperson, and Item. Entity names are always singular: “Order”, not “Orders”, and so on.
Attributes- Entities have attributes that describe characteristics of the entity. Example attributes of Order are OrderNumber, OrderDate, SubTotal, Tax, Total, and so forth.
Example attributes of Salesperson are: SalespersonName, Email, Phone, etc.
Entities have an identifier, which is an attribute (or group of attributes) whose value is associated with one and only one entity instance. For example, OrderNumber is an identifier of Order because only one Order instance has a given value of OrderNumber.
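One way to picture an entity, its attributes, and its identifier is a small Python class. This is only a sketch: the Order class mirrors the attribute names used in these notes, while index_by_identifier and the sample values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Order:
    """Entity: Order (singular). Each field below is an attribute."""
    order_number: int  # identifier: one value belongs to exactly one Order
    order_date: str
    sub_total: float
    tax: float
    total: float


def index_by_identifier(orders):
    """Map each OrderNumber to its Order, refusing duplicate identifiers."""
    by_number = {}
    for order in orders:
        if order.order_number in by_number:
            raise ValueError(f"duplicate OrderNumber {order.order_number}")
        by_number[order.order_number] = order
    return by_number


# Two entity instances with made-up values.
orders = [
    Order(1001, "2024-03-01", 100.0, 8.0, 108.0),
    Order(1002, "2024-03-02", 50.0, 4.0, 54.0),
]
order_index = index_by_identifier(orders)

print(order_index[1001])
```

Looking up order_index[1001] returns exactly one instance, which is what makes OrderNumber an identifier.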
• What is normalization?
Normalization- The process of converting a poorly structured table into two or more well-structured tables. A table is such a simple construct that you may wonder how one could possibly be poorly structured.
Normalization is the process of organizing data in a database. This includes creating tables and establishing relationships between those tables according to rules designed both to protect the data and to make the database more flexible by eliminating redundancy and inconsistent dependency.
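A minimal sketch of the idea in plain Python, assuming a made-up employee table in which department data is repeated on every row. The table, column names, and the normalize function are illustrative, not taken from the course material.

```python
# Poorly structured table: department data is repeated on every employee row,
# so changing a department phone number means editing several rows.
employee_rows = [
    {"EmployeeName": "Ana", "DeptName": "Accounting", "DeptPhone": "555-0100"},
    {"EmployeeName": "Ben", "DeptName": "Accounting", "DeptPhone": "555-0100"},
    {"EmployeeName": "Cara", "DeptName": "Sales", "DeptPhone": "555-0199"},
]


def normalize(rows):
    """Split one two-theme table into an Employee table and a Department table."""
    departments = {}
    employees = []
    for row in rows:
        # Each department is stored once, keyed by its name.
        departments[row["DeptName"]] = {
            "DeptName": row["DeptName"],
            "DeptPhone": row["DeptPhone"],
        }
        # The employee row keeps only DeptName, which acts as a foreign key.
        employees.append(
            {"EmployeeName": row["EmployeeName"], "DeptName": row["DeptName"]}
        )
    return employees, list(departments.values())


employees, departments = normalize(employee_rows)

print(employees)
print(departments)
```

After the split, each phone number is stored in one place only, which removes the redundancy that made inconsistent updates possible.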
• What are the fundamental categories of business intelligence?
Business Intelligence Systems are information systems that process operational, social, and other data to identify patterns, relationships, and trends for use by business professionals and other knowledge workers.
1. Reporting- Reporting Applications
2. Data Mining- Data mining applications. The application of statistical techniques to find patterns and relationships among data for classification and prediction.
3. Big Data- Big Data applications
4. Knowledge Management- Knowledge Management applications to produce business intelligence for knowledge workers.
What are the three primary activities of business intelligence? Be able to explain them.
1. Acquire Data- Data acquisition is the process of obtaining, cleaning, organizing, relating, and cataloging source data.
2. Perform Analysis- BI analysis is the process of creating business intelligence. The four fundamental categories of BI analysis are reporting, data mining, Big Data, and knowledge management.
3. Publish Results- Is the process of delivering business intelligence to the knowledge workers who need it. In some cases, this means placing BI results on servers for publication to knowledge workers over the internet or other networks. In other cases, it means making the results available via a Web service for use by other applications. In still other cases, it means creating PDFs or PowerPoint presentations for communicating to colleagues or management.
3A. Push Publishing- Delivers business intelligence to users without any request from the users; the BI results are delivered according to a schedule or as a result of an event or particular data condition.
3B. Pull Publishing- A type of BI delivery that requires users to request BI results.
• What is click stream data? What is metadata?
Click Stream Data- Data that records which pages a website visitor visits, and in what order. The path the visitor takes through a website is called the clickstream. Clickstream analysis is the process of collecting, analyzing, and reporting aggregate data about those paths.
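As a rough sketch, a clickstream can be rebuilt from raw page-view events by grouping them by visitor and ordering them in time. The event layout (visitor, timestamp, page) and the sample values are assumptions for illustration.

```python
from collections import defaultdict

# Hypothetical page-view events: (visitor id, timestamp, page visited).
events = [
    ("visitor_1", 3, "/checkout"),
    ("visitor_2", 1, "/home"),
    ("visitor_1", 1, "/home"),
    ("visitor_1", 2, "/cart"),
]


def clickstreams(page_views):
    """Return each visitor's pages in the order they were visited."""
    paths = defaultdict(list)
    # Sort by visitor, then by timestamp, so each path comes out in time order.
    for visitor, _timestamp, page in sorted(page_views, key=lambda e: (e[0], e[1])):
        paths[visitor].append(page)
    return dict(paths)


paths = clickstreams(events)

print(paths)
```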
Metadata- Data that describes data. A set of data that describes and gives information about other data. Metadata summarizes basic information about data, which can make finding and working with particular instances of data easier. For example, author, date created, date modified, and file size are very basic document metadata. The format of metadata depends on the software product that is processing the database.
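Database metadata can be inspected directly, which is what "self-describing" means in practice. The sketch below uses SQLite's PRAGMA table_info through Python's built-in sqlite3 module; the Student table is a made-up example, and other DBMS products expose their metadata in different formats.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Student ("
    "StudentNumber INTEGER PRIMARY KEY, StudentName TEXT NOT NULL)"
)

# PRAGMA table_info returns one row per column:
# (position, name, declared type, not-null flag, default value, primary-key flag)
columns = conn.execute("PRAGMA table_info(Student)").fetchall()

# Keep the parts most useful to a reader: name, type, and whether it is a key.
metadata = [
    (name, col_type, bool(pk))
    for _cid, name, col_type, _notnull, _default, pk in columns
]

print(metadata)
```

No student rows were inserted, yet the database can still report the structure of the table: that description is the metadata.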
• What are data warehouses and data marts?
Data warehouses- A facility for managing an organization’s BI data. Think of a data warehouse as the distributor in a supply chain. The functions of a data warehouse are:
1. Obtain data
2. Cleanse data
3. Organize and relate data
4. Catalog data.
Data marts- A data collection, smaller than a data warehouse, that addresses the needs of a particular department or functional area of business. If the data warehouse is the distributor in a supply chain, then a data mart is like a retail store in a supply chain. Users in the data mart obtain data that pertain to a particular business function from the data warehouse. Such users do not have the data management expertise that data warehouse employees have, but they are knowledgeable analysts for a given business function.
• Understand the concepts related to granularity of data.
Granularity- The level of detail represented by the data.
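A short sketch of why granularity matters: fine-grained data can always be summed into a coarser view, but the detail cannot be recovered from the totals. The sales records and field layout below are invented for illustration.

```python
from collections import defaultdict

# Fine granularity: one record per individual sale (date, store, amount).
sales = [
    ("2024-03-01", "Store A", 20),
    ("2024-03-01", "Store B", 30),
    ("2024-03-02", "Store A", 45),
]

# Coarser granularity: one total per day. The store-level detail is lost here.
daily_totals = defaultdict(int)
for date, _store, amount in sales:
    daily_totals[date] += amount

daily_totals = dict(daily_totals)
print(daily_totals)
```

From daily_totals alone there is no way to tell how much each store sold, so data that is too coarse cannot be made finer after the fact.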
• What are OLAP tables, and what are their dimensions and benefits?
OLAP- Online Analytical Processing. OLAP provides the ability to sum, count, average, and perform other simple arithmetic operations on groups of data. The defining characteristic of OLAP reports is that they are dynamic. The viewer of the report can change the report’s format, hence the term online. OLAP reports have measures and dimensions.
A measure is the data item of interest. It is the item that is summed or averaged or otherwise processed in the OLAP report. Ex: Total sales, average sales, average cost.
Dimension- a characteristic of a measure. Purchase date, customer type, customer location, and sales region are examples.
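The relationship between a measure and its dimensions can be sketched in plain Python. The sales rows and the olap_total function are hypothetical; real OLAP tools do the same kind of grouping interactively.

```python
from collections import defaultdict

# Each row holds one measure (amount) and two dimensions.
sales = [
    {"customer_type": "Retail", "region": "East", "amount": 100},
    {"customer_type": "Retail", "region": "West", "amount": 150},
    {"customer_type": "Wholesale", "region": "East", "amount": 400},
    {"customer_type": "Wholesale", "region": "West", "amount": 250},
]


def olap_total(rows, dimension, measure="amount"):
    """Sum the measure for every value of the chosen dimension."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[dimension]] += row[measure]
    return dict(totals)


# The same measure viewed along two different dimensions.
by_customer_type = olap_total(sales, "customer_type")
by_region = olap_total(sales, "region")

print(by_customer_type)
print(by_region)
```

Switching the dimension argument changes the shape of the report without touching the data, which is the "dynamic" quality described above.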
What is data mining?
The application of statistical techniques to find patterns and relationships among data for classification and prediction. Data mining resulted from a convergence of disciplines, including artificial intelligence and machine learning.
What is a cluster analysis?
A common unsupervised technique for identifying groups of entities that have similar characteristics.
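The notes do not name a specific clustering algorithm. As one illustration, the sketch below uses k-means on one-dimensional values; the function name, starting centers, and data are assumptions.

```python
def k_means_1d(values, centers, iterations=10):
    """Group numbers around the nearest center, then move each center to its group's mean."""
    centers = list(centers)
    groups = [[] for _ in centers]
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for value in values:
            # Assign the value to the closest current center.
            nearest = min(range(len(centers)), key=lambda i: abs(value - centers[i]))
            groups[nearest].append(value)
        # Recompute each center; keep the old one if its group is empty.
        centers = [
            sum(group) / len(group) if group else center
            for group, center in zip(groups, centers)
        ]
    return centers, groups


# Made-up data, for example customer ages or purchase counts.
values = [1, 2, 3, 10, 11, 12]
centers, groups = k_means_1d(values, centers=[1, 12])

print(centers)
print(groups)
```

No group labels are supplied in advance; the two groups emerge from the data, which is what makes the technique unsupervised.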
What is a decision tree?
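A hierarchical arrangement of criteria that predict a classification or a value. Decision tree analyses are treated here as an unsupervised data mining technique in the sense that the analyst does not build a model beforehand: the analyst sets up the computer program and provides the data to analyze, and the decision tree program produces the tree. (Machine learning texts usually classify decision trees as supervised, because the tree is learned from records whose outcome is already known.)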
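To show what "a hierarchical arrangement of criteria" looks like once a tree exists, here is a hand-written sketch. The loan attributes, thresholds, and outcomes are invented for illustration; a real decision tree program would derive them from data.

```python
# Each inner node asks one yes/no question; each leaf is a classification.
# Thresholds and attribute names are hypothetical.
loan_tree = {
    "test": lambda loan: loan["percent_past_due"] < 50,
    "yes": {
        "test": lambda loan: loan["credit_score"] > 570,
        "yes": "approve",
        "no": "reject",
    },
    "no": "reject",
}


def classify(node, record):
    """Walk from the root to a leaf, following the answer to each criterion."""
    while isinstance(node, dict):
        node = node["yes"] if node["test"](record) else node["no"]
    return node


good_loan = classify(loan_tree, {"percent_past_due": 10, "credit_score": 700})
weak_score = classify(loan_tree, {"percent_past_due": 10, "credit_score": 500})
late_loan = classify(loan_tree, {"percent_past_due": 80, "credit_score": 700})

print(good_loan, weak_score, late_loan)
```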
• What are the differences between supervised and unsupervised data mining?
Supervised Data Mining- data miners develop a model prior to analysis and apply statistical techniques to data to estimate parameters of the model.
Unsupervised Data Mining- Analysts do not create a model or hypothesis before running the analysis. Instead, it is where a data mining application is applied to the data and results are observed. With this method, analysts create hypotheses after the analyses, in order to explain the patterns found.
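A small sketch of the supervised case: the analyst chooses the model first (here a straight line, y = slope * x + intercept) and then uses the data to estimate its parameters with least squares. The data points and the function name are made up for illustration.

```python
def fit_line(xs, ys):
    """Estimate slope and intercept of y = slope * x + intercept by least squares."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Covariance of x and y divided by the variance of x gives the slope.
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical observations, e.g. advertising spend (x) and sales (y).
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]
slope, intercept = fit_line(xs, ys)

print(slope, intercept)
```

The model's form was fixed before looking at the data; only its two parameters were estimated. In the unsupervised case there is no such model to begin with.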
• What are the main components of Big Data?
Big Data- A term used to describe data collections that are characterized by huge volume, rapid velocity, and great variety.
Components:
Volume- Big Data data sets are at least a petabyte in size, and usually larger.
Velocity- Big Data is generated rapidly.
Variety- Big Data has structured data, free-form text, log files, and possibly graphics, audio, and video.
• Define knowledge management.
The process of creating value from intellectual capital and sharing that knowledge with employees, managers, suppliers, customers, and others who need the capital. Process quality is measured by effectiveness and efficiency, and knowledge management can improve both. KM enables employees to share knowledge with each other and with customers and other partners. By doing so, it enables the employees in the organization to better achieve the organization’s strategy. At the same time, sharing knowledge enables employees to solve problems more quickly and to otherwise accomplish work with less time and other resources, hence improving efficiency.