Exam 3 Flashcards

(77 cards)

1
Q

Rate of Data Growth

A

Doubles every 6 months
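
To get a feel for that rate, here is a rough arithmetic sketch. The 1 TB starting size and the function name `projected_size` are made up for illustration. Doubling every 6 months means two doublings per year, so the data grows 4x per year.

```python
def projected_size(start_tb: float, years: float, doubling_months: float = 6.0) -> float:
    """Size after `years` if the data doubles every `doubling_months` months."""
    doublings = years * 12.0 / doubling_months
    return start_tb * 2.0 ** doublings

# Hypothetical starting point: 1 TB of data today.
for years in (1, 2, 5):
    print(years, "year(s):", projected_size(1.0, years), "TB")
```

Under these assumptions, 1 TB becomes 4 TB after one year and 1,024 TB after five.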

2
Q

Information Abundance

A

The world has changed and jobs have changed: there is so much information that managers need to geek up

3
Q

Business Intelligence

A

Firms that are basing decisions on hunches aren’t managing, they are gambling.
Having good data gives the business the power to make an informed decision.

4
Q

Analytics

A

The extensive use of data, statistical and quantitative analysis, explanatory and predictive models, and fact-based management to drive decisions and actions

5
Q

Data

A

Raw facts and figures

tells you nothing

6
Q

Information

A

Data presented in a context so it can ‘answer a question’ or ‘support decision making’

7
Q

Knowledge

A

Insight derived from experience and expertise

what humans bring to the table

8
Q

Structured Data

A

Organized

Predefined Characteristics “Schema”

9
Q

Unstructured Data

A

Not Organized – No Schema

Text – email, Facebook pages, news stories, etc.
Binary – Images, audio, video

10
Q

Table

A

An organized collection of data

11
Q

Records

A

Rows in a database table

12
Q

Fields

A

Columns in a database table

13
Q

Relational database

A

Multiple tables that are related

Uses a Key Field (unique identifier) to link tables together
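
A minimal sketch of two related tables linked by a key field, using Python's built-in `sqlite3` module. The table names, column names, and sample rows are hypothetical. Each row is a record, each column is a field, and `customer_id` is the key field that links the tables.

```python
import sqlite3

# In-memory database so the example leaves nothing on disk.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, item TEXT);
INSERT INTO customers VALUES (1, 'Ana'), (2, 'Ben');
INSERT INTO orders VALUES (100, 1, 'laptop'), (101, 2, 'mouse'), (102, 1, 'monitor');
""")

# The key field customer_id relates each order to its customer.
rows = conn.execute("""
SELECT customers.name, orders.item
FROM orders JOIN customers ON orders.customer_id = customers.customer_id
ORDER BY orders.order_id
""").fetchall()
print(rows)
```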

14
Q

What is a “transaction”?

What are its two key characteristics?

A

Any business exchange

  • Standardized schema
  • Occurs repeatedly
15
Q

Point of Sale system

A

Retail sales transactions - a cash register

Tracks transactions when item is scanned at checkout and sold to a customer

16
Q

How do loyalty cards generate valuable data?

A

The company is paying you for data about you that you otherwise would not give them
(lets the company see who is buying which items, instead of anonymous cash sales)

17
Q

ERP

A

Enterprise Resource Planning

Every paycheck, invoice, and payment becomes a business transaction and data

18
Q

SCM

A

Supply Chain Management

Each order for finished goods and each order for raw materials is a transaction and data

19
Q

Sources of customer-provided data

A

Customer surveys
Product registration cards
Contests

20
Q

Data Aggregator

A

Firms that trawl the Internet and other sources for data, then package that data up for resale

21
Q

Business operations – examples

A

Healthcare Industry – patient data (pharmaceutical research)

Michigan – tags cows at birth

Transportation – engines on Boeing aircraft (new Airbus aircraft have over 100k sensors gathering data)

Switzerland – put sensors on 9k trains and 5k km of tracks

22
Q

Top CIOs say that data growth is the #1 challenge today. What two problems arise from that challenge?

A

Handling explosive growth with constrained budgets

and Exploiting all that data

23
Q

What is an SSD? How does it address the problems with data growth?

A
Solid State Drive 
Uses flash memory (faster)
Lower power consumption (less heat)
RAID
Prices dropping
24
Q

What is Automated Data Tiering?

A

Match storage performance to access frequency (automatically make data decisions)

Top Tier: Currently working data
Mid Tier: Recently used data
Bottom Tier: Historical data
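
One way to picture the tiering rule is a small function that picks a tier from how recently the data was accessed. This is only a sketch: the function name `choose_tier` and the 7-day and 90-day cutoffs are assumptions, not values from the course.

```python
def choose_tier(days_since_last_access: int) -> str:
    """Pick a storage tier from access recency (cutoffs are illustrative)."""
    if days_since_last_access <= 7:
        return "top"     # currently working data: fastest storage
    if days_since_last_access <= 90:
        return "mid"     # recently used data
    return "bottom"      # historical data: cheapest, slowest storage

for days in (0, 30, 400):
    print(days, "days ->", choose_tier(days))
```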

25
What is DeDupe?
Software that identifies duplicates and eliminates the extra copies in order to tame the growth of unstructured data
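
A minimal sketch of the idea, assuming duplicates are detected by comparing content hashes. That is one common approach, not necessarily what any particular product does. The function name `dedupe` and the sample files are hypothetical.

```python
import hashlib

def dedupe(blobs):
    """Store each distinct blob once; return (store, refs).

    store maps a content hash to the single kept copy;
    refs lists, in order, which stored copy each original points to.
    """
    store = {}
    refs = []
    for blob in blobs:
        digest = hashlib.sha256(blob).hexdigest()
        store.setdefault(digest, blob)  # keep only the first copy
        refs.append(digest)
    return store, refs

files = [b"quarterly report", b"logo image bytes", b"quarterly report"]
store, refs = dedupe(files)
print(len(files), "files ->", len(store), "stored copies")
```

Every original file can still be recovered through its reference, but the duplicate content takes up space only once.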
26
Data Silos
No sharing or communication is possible. Can be caused by data trapped in obsolete legacy systems or incompatible systems. Causes us to miss opportunities to see correlations, patterns, and trends.
27
Operational data
Data that is continually generated in the day-to-day operations of a business. When an order is entered, operational data is created and is used immediately by systems that pick inventory from the warehouse, print labels, and arrange for shipping. Things like customer, inventory, and purchase data fall into this category. This type of data is pretty straightforward and will generally look the same for most organizations.
28
How does the analysis of operational data compete with customers? What can a company do about this problem?
Analysis puts extra load on the operational system, which slows it down, so customers and sales can be lost. Solution: add separate data repositories.
29
Data Warehouse: what are its characteristics?
A collection of databases that supports decision making. Draws on many sources. Supports fast queries and exploration. Best way to let your managers do analytics without harming the performance of your operational system.
30
How is a Data Mart different from a Data Warehouse?
Similar, but the scale is different. Instead of looking at an enterprise, we are looking at a specific problem and a specific unit.
31
What three characteristics are necessary for something to be “Big Data”? Explain what each “V” means
Volume – “too big” to be analyzed (all credit card transactions in a day within Europe). Velocity – “too fast”: rapid arrival, feedback loop (Twitter messages or Facebook posts). Variety – “too all over the place”: too little consistency (text, images, sounds, human input, sensors).
32
Hadoop
An open-source system that is designed to be able to consume any kind of data
33
Data Mining
The process of using computer software to identify patterns in enormous data sets and build models from that data. The idea is that the models will help identify current and future trends.
34
Canned Report
Canned reports are preformatted reports distributed to a whole organization or to specifically defined user groups. They answer specific questions and need to be easy to use with almost zero training. Fortunately, each employee needs to see the same output, so there's no need for each user to be able to customize his/her report. Pro: easy for users. Con: inflexible, IT overhead.
35
Ad-Hoc Report
Tools that allow users to create custom reports based on the questions they need answered. Users define their own reports, and the tool is powerful and flexible. Pro: powerful, flexible. Con: demanding on the user; potentially steep learning curve; requires business knowledge and an understanding of the data schema.
36
OLAP (Online Analytical Processing)
Pros: handles huge data; data is pre-processed and summarized; user reports are fast. Con: no access to details (because of the summarization, you are never going to get all the way down to the detailed data).
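
The trade-off can be sketched with a toy pre-summarization step. The sales rows and the name `build_summary` are invented for illustration, and real OLAP tools are far more elaborate. Totals are computed once up front, so later lookups are fast, but the individual transactions are no longer in the summary.

```python
from collections import defaultdict

# Hypothetical detail rows: (region, month, sale amount).
sales = [
    ("East", "Jan", 120.0),
    ("East", "Jan", 80.0),
    ("West", "Jan", 50.0),
    ("East", "Feb", 200.0),
]

def build_summary(rows):
    """Pre-process detail rows into totals per (region, month)."""
    summary = defaultdict(float)
    for region, month, amount in rows:
        summary[(region, month)] += amount
    return dict(summary)

cube = build_summary(sales)
print(cube[("East", "Jan")])  # a fast lookup, but only the total is available
```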
37
Network
Collection of devices connected together via communications devices and transmission media
38
LAN
Network covering a limited geographical area, such as a single building
39
Protocol
Rules that govern networking communications
40
MAC Address
Uniquely identifies a node from every other one on the planet; however, it provides no information about the node's location. Similar to a social security number, each device on the planet has a unique MAC address. This number is permanent and never changes.
41
Router
A specialized computer with multiple network ports and specialized software that is used to interconnect networks
42
DNS
Domain Name System: the technology that converts hostnames like www.espn.com into IP addresses like 199.181.132.250
43
The single greatest cause of lost data? How do you protect against that risk?
User error. Back up everything to protect against that risk. You need a good backup program that backs up the data files and also makes an image backup. Back up to tape cartridges (cheap but slow) or an array of backup disks (faster than tapes).
44
Peer-to-Peer
A network where each computer has the ability to both share and use resources
45
Server
A computer that's attached to the network, whose primary purpose is to provide service to other nodes on that network
46
Node
Any device that is attached to a network
47
Packet
Software divides data into small units called packets so that it can be sent across the Internet more efficiently
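
A simplified sketch of that splitting step. The function names and the 4-byte packet size are assumptions chosen to keep the example small, and real packets also carry addresses and other header fields. Each packet gets a sequence number so the receiver can put the pieces back in order.

```python
def to_packets(data: bytes, size: int):
    """Split data into (sequence_number, chunk) packets of at most `size` bytes."""
    return [(seq, data[i:i + size])
            for seq, i in enumerate(range(0, len(data), size))]

def reassemble(packets):
    """Rebuild the original data, even if packets arrived out of order."""
    return b"".join(chunk for _, chunk in sorted(packets))

message = b"hello, network!"
packets = to_packets(message, 4)
arrived = list(reversed(packets))  # simulate out-of-order arrival
print(len(packets), "packets;", reassemble(arrived))
```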
48
WAN
Wide Area Network: links LANs together. Only works if a private data circuit goes to the area. Does not work for mobile users (needs a private data circuit). Works for remote office needs because a data circuit exists.
49
What are the key differences between copper UTP and fiber Ethernet cabling?
UTP = Unshielded Twisted Pair (copper). Fiber = long runs, high speed.
50
Fiber Optic cable is non-conducting – why is that good?
Avoids electrical interference (Florida summers/weather). Made of a type of glass.
51
Ethernet Switch
What you use to connect a group of nodes together. Some switches have management capabilities: you can look at them and see the status of the device.
52
PoE
Power over Ethernet. Utilizes unused wires in the Ethernet cable to deliver power, saves money, and can be used for devices such as security cameras.
53
Backhoe problem - How do you protect your network against this problem?
If you have only a single cable, it can get destroyed by a digger during construction. Protect against this by having multiple, separate pathways instead of just one.
54
What guidance did Mr. Olson offer about WiFi range?
WiFi signals don’t like things that get in the way, like walls. As for how far the radio waves can travel: maybe 100 ft indoors, more outdoors.
55
What common devices can interfere with WiFi networks? Which radio spectrum is affected?
The 2.4 GHz spectrum is affected. Microwave ovens, cordless phones, and baby monitors can interfere.
56
AP – Define the acronym. What does it do? How does it connect to rest of the corporate LAN?
Access Point. Allows Wi-Fi devices to connect to a wired network.
57
Site survey
Walking around the building to find weak and strong signals for best coverage
58
Why is WiFi security important?
Prevents hackers from potentially seeing your internet traffic
59
Rogue APs
Someone installs their own personal router at the office, which creates a security threat
60
Client-Server service model
Clear division of labor: IT provides the services, and the client/user just receives them. The client sends a request to the server, and the server sends back a response.
61
RAID
Redundant Array of Inexpensive Disks. Protects against data loss.
62
How does RAID 1 (mirroring) work?
2 hard drives; the RAID system copies the information onto both drives so they're mirror images of each other
63
How does RAID 5 work?
3+ drives with error correction. You use a bunch of hard drives, and the system divides your data across several of them (n+1). The extra drive stores error-correcting data, which is mathematically computed from the other drives. If one of the drives crashes, the data on the surviving drives plus the error-correcting data can be used with the same math formula to calculate the data on the dead drive.
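
The "same math formula" idea can be sketched with XOR parity, the usual basis for RAID 5 error-correcting data. The drive contents below are made up. Real RAID 5 also rotates where the parity lives across the drives; this sketch keeps it on one extra drive for simplicity.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Three hypothetical data drives plus one drive's worth of parity (n+1).
data_drives = [b"ACCT", b"2024", b"DATA"]
parity = xor_blocks(data_drives)

# Drive 1 crashes: XOR the survivors with the parity to recompute its contents.
survivors = [data_drives[0], data_drives[2], parity]
rebuilt = xor_blocks(survivors)
print(rebuilt)
```

The same `xor_blocks` function both creates the parity and rebuilds the lost drive, which is why only one extra drive is needed.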
64
How can a company protect itself against LAN hardware failure?
Put multiple NICs in the server; multiple switches
65
What technology was described that can protect against total server failure? Two basic versions of this technology were described. How do they work? How are they different?
Clustering: servers communicate to make sure the workload is shared and to protect against crashing. Active-Passive: the active server handles all requests; the passive server only takes over once the active one crashes. Active-Active: both servers are working; if one fails, the other just picks up the load.
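
A toy sketch of how requests might be routed in the two versions. The function names and the round-robin rule for active-active are assumptions for illustration; real cluster software also handles health checks and shared state.

```python
def pick_active_passive(active_up: bool, passive_up: bool):
    """Active-passive: the passive server is used only if the active one is down."""
    if active_up:
        return "active"
    if passive_up:
        return "passive"
    return None  # total outage

def pick_active_active(servers_up, request_id: int):
    """Active-active: spread requests over every live server (simple round robin)."""
    live = [index for index, up in enumerate(servers_up) if up]
    if not live:
        return None
    return live[request_id % len(live)]

print(pick_active_passive(False, True))                            # failover case
print([pick_active_active([True, True], r) for r in range(4)])     # load is shared
print([pick_active_active([False, True], r) for r in range(4)])    # survivor takes all
```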
66
URL – what are the component parts and what does each do for you?
Uniform Resource Locator. Application transfer protocol: defines how the data will be handled (http, https, ftp, itpc). Host name: (www.). Organization domain name: (youtube). Top-level domain name: (.com, .edu). Path: folder in the file system (/tech). File: the specific content you want; this part of the URL is case sensitive (/index.html, index.mp4).
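
The parts can be pulled out of a URL with Python's standard `urllib.parse` module. The URL below is a made-up example assembled from the pieces on this card, and splitting the host into exactly three labels is an assumption that holds only for simple names like this one.

```python
from urllib.parse import urlparse

url = "https://www.youtube.com/tech/index.html"  # hypothetical example URL
parts = urlparse(url)

protocol = parts.scheme                           # application transfer protocol
host, domain, tld = parts.hostname.split(".")     # host name, organization domain, top-level domain
folder, _, filename = parts.path.rpartition("/")  # path (folder) and file

print(protocol, host, domain, tld, folder, filename)
```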
67
Domain Registrar – what is it? why would you use one?
A company to which you pay an annual fee to reserve a domain name; names are first come, first served
68
What is meant by the term "last mile"? Why should an organization care about the last mile?
The core of the Internet is fast, but when you get to the edge of coverage, speed plummets
69
DAS
DAS = Distributed Antenna System. Cell phones often don’t work as well in buildings, so you can install multiple small antennas to boost the signal. You can also do this in fields and arenas where you expect large crowds, so there is higher signal strength to handle the load.
70
Satellite wireless - what is "latency"? What are the differences between MEO and GEO?
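Latency: the delay while the signal travels to the satellite and back. GEO (geostationary Earth orbit): a satellite that stays over the same spot, looking down from high above in space. MEO (medium Earth orbit): a lower orbit; the satellite has to go faster than the Earth is rotating, so multiple satellites are used.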
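
A back-of-the-envelope sketch of why orbit height matters for latency. It assumes the satellite is directly overhead and ignores processing delays. The GEO altitude is the standard figure of about 35,786 km; the 8,000 km MEO altitude is just one illustrative value, since MEO covers a wide range.

```python
SPEED_OF_LIGHT_KM_S = 299_792.458

def up_and_down_delay_ms(altitude_km: float) -> float:
    """Minimum delay for a signal to go up to the satellite and back down."""
    return 2 * altitude_km / SPEED_OF_LIGHT_KM_S * 1000

GEO_ALTITUDE_KM = 35_786  # geostationary orbit
MEO_ALTITUDE_KM = 8_000   # assumption: one example MEO altitude

print("GEO:", round(up_and_down_delay_ms(GEO_ALTITUDE_KM)), "ms")
print("MEO:", round(up_and_down_delay_ms(MEO_ALTITUDE_KM)), "ms")
```

Under these assumptions the GEO hop costs roughly 239 ms and the MEO hop roughly 53 ms, before any other network delay is added.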
71
Net Neutrality - what's the basic issue? Who is on each side of the issue and why?
The principle that all Internet traffic should be treated equally by ISPs. Consumers vs. providers: going to cost us/ISPs more money if it passes.
72
Last Mile
Users that have unacceptably slow links to the Internet
73
Each of the last mile technologies and describe it in very general terms
Analog modems: standard telephone lines (POTS). Broadband: digital connections. Cable broadband: uses the cables installed for cable TV. DSL: Digital Subscriber Line, using existing telephone wires. FTTH: Fiber to the Home; 100% fiber optic, super high performance but extremely expensive. Cellular wireless: you don’t need to wire individual premises, but it is still expensive because the wireless system is expensive to license and because people hate the idea of having more cell towers (NIMBY – not in my backyard).
74
CAT Ratings
The CAT number tells you specifically how that cable has been engineered and how fast it can safely transmit data. CAT 5 is the minimum remotely acceptable wiring; anything below that should be replaced. CAT 6 is the standard you find now.
75
What is “information overload”? What is its alleged impact?
Some allege there is so much information available that people cannot do their jobs. They allege there is a $900 billion cost to the economy.
76
TPS
Transaction Processing Systems
77
What are two things to worry about with Data mining?
CLEAN data -- your data needs to be clean. If you have inconsistent data, you could wind up with false results. REPRESENTATIVE data – if the past data is not representative of current or future events, you could wind up with bogus models.
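
A small sketch of the clean-data worry. The raw values, the `ALIASES` cleanup map, and the `clean` helper are invented for illustration. When the same thing is recorded several different ways, a simple count gives a misleading answer until the values are made consistent.

```python
from collections import Counter

# Hypothetical raw field values: the same state entered inconsistently.
raw_states = ["FL", "fl", "Florida", " FL ", "GA"]

# Hypothetical cleanup map from lowercase spellings to one standard code.
ALIASES = {"florida": "FL", "fl": "FL", "ga": "GA"}

def clean(value: str) -> str:
    """Trim whitespace, ignore case, and map known spellings to one code."""
    key = value.strip().lower()
    return ALIASES.get(key, value.strip().upper())

dirty_counts = Counter(raw_states)
clean_counts = Counter(clean(v) for v in raw_states)
print(len(dirty_counts), "distinct values before cleaning")
print(dict(clean_counts))
```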