Lesson 3 Flashcards

1
Q

You want to store the complete works of William Shakespeare (about 836,000
words) as a text file. How large will the file be?

A

836,000 words * 6 characters per word * 1 byte per character = 5,016,000 bytes ≈ 5 MB
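
The same estimate as a short Python sketch. The factors 6 and 1 come from the card; reading them as "about 6 characters per word (letters plus a space)" and "1 byte per character (plain ASCII)" is an assumption, and the variable names are only illustrative.

```python
# Assumptions: ~6 characters per word (letters plus one space)
# and 1 byte per character (plain ASCII text).
WORDS = 836_000
CHARS_PER_WORD = 6
BYTES_PER_CHAR = 1

size_bytes = WORDS * CHARS_PER_WORD * BYTES_PER_CHAR
size_mb = size_bytes / 1_000_000  # decimal megabytes

print(size_bytes, "bytes")   # 5016000 bytes
print(f"{size_mb:.1f} MB")   # 5.0 MB
```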

2
Q

Decode the “secret message” from standard ASCII.
87 69 76 76 32 68 79 78 69

A

WELL DONE
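
The decoding can be checked with Python's built-in `chr`, which maps a code point to its character (for values below 128 this matches standard ASCII). A minimal sketch:

```python
# The code sequence from the card, as decimal ASCII values.
codes = "87 69 76 76 32 68 79 78 69"

# Convert each number to its character and join the result.
message = "".join(chr(int(code)) for code in codes.split())

print(message)
```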

3
Q

You want to encode the variable DAY OF WEEK (Monday, Tuesday, . . . ,
Sunday) as efficiently as possible. How many Bits do you need for this variable?

A

7 days of the week -> log_2(7) = ln(7) / ln(2) ≈ 2.81 -> round up to 3 bits
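
A quick check of the rounding step, as a sketch: the number of bits is the base-2 logarithm of the number of distinct values, rounded up to the next whole bit.

```python
import math

n_values = 7  # Monday ... Sunday

bits_exact = math.log2(n_values)     # about 2.81
bits_needed = math.ceil(bits_exact)  # bits come in whole units, so round up

print(f"{bits_exact:.2f} -> {bits_needed} bits")
```

With 3 bits there are 2^3 = 8 possible codes, so one code stays unused.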

4
Q

You obtain a German text file. The first line reads like this:
What probably went wrong, and how can you fix it?

A

(1) The file was opened with the wrong character encoding (the wrong extended ASCII table).
(2) Reopen it using the correct table -> convert to Unicode.
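
The garbled line itself is not reproduced on this card, so the following is only an illustration with a hypothetical word: it shows how a German umlaut breaks when UTF-8 bytes are read with the Latin-1 table, and how decoding with the right encoding repairs it.

```python
# Hypothetical example word; the actual line from the file is not shown on the card.
original = "für"

raw_bytes = original.encode("utf-8")    # how the text is stored on disk
garbled = raw_bytes.decode("latin-1")   # opened with the wrong table
repaired = raw_bytes.decode("utf-8")    # reopened with the correct table

print(garbled)   # fÃ¼r
print(repaired)  # für
```

For a real file the same idea applies via `open(path, encoding=...)`, where the correct encoding name has to be found out first.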

5
Q

The Nikon D7500 digital camera has a sensor that captures approximately
6000 x 4000 pixels.
(a) What is the size in Kilobytes of an uncompressed photo of the Nikon D7500?
(b) In what efficient file format would you store the photo?
(c) What size can you expect the file to be in the efficient format?

A

a) 6000 * 4000 pixels * 3 bytes per pixel = 72,000,000 bytes = 72,000 KB (~ 72 MB)

b) JPEG

c) 72 MB / 10 = 7.2 MB (assuming roughly 10:1 compression)
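
The arithmetic as a sketch. Two assumptions are built in: 3 bytes per pixel (one byte each for red, green and blue) and a compression factor of about 10, which is the divisor used on the card rather than a fixed property of JPEG.

```python
WIDTH, HEIGHT = 6000, 4000
BYTES_PER_PIXEL = 3      # assumption: 8-bit red, green and blue channels
COMPRESSION_FACTOR = 10  # assumption taken from the card's "/ 10"

raw_bytes = WIDTH * HEIGHT * BYTES_PER_PIXEL
raw_kb = raw_bytes / 1_000                              # decimal kilobytes
jpeg_mb = raw_bytes / COMPRESSION_FACTOR / 1_000_000    # decimal megabytes

print(f"raw:  {raw_kb:,.0f} KB")
print(f"JPEG: {jpeg_mb:.1f} MB")
```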

6
Q

You design a database of the OHLC (open/high/low/close) prices of all stocks
that are traded in the US.
(a) Write down all fields and SQL data types.
(b) Calculate the size in Bytes of one record.
(c) Estimate the total size of your database, making reasonable assumptions.
Note: clearly state your assumptions. You don’t have to justify them.
(d) How can you make the database more efficient and/or use less storage? State two
possible measures.

A

a)
Field            SQL data type   Size
open             DOUBLE          8 bytes
high             DOUBLE          8 bytes
low              DOUBLE          8 bytes
close            DOUBLE          8 bytes
ISIN (symbols)   CHAR(12)        12 bytes
date             DATE            3 bytes

b)
Total: 8 + 8 + 8 + 8 + 12 + 3 = 47 bytes

c)
Assumptions:
number of stocks traded: 4,500
number of trading days in a year: 252
number of years in the database: 20

47 * 4,500 * 252 * 20 = 1,065,960,000 bytes ≈ 1 GB

d)
- DOUBLE -> FLOAT
- only store “close”
- transparent compression
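
The record and database sizes can be recomputed with a short sketch. The field sizes and the three assumptions (4,500 stocks, 252 days, 20 years) are the ones from the card; the 4-byte size for FLOAT is the usual single-precision size and is used here only to illustrate the first measure in d).

```python
# Field sizes in bytes, as listed in a).
field_bytes = {"open": 8, "high": 8, "low": 8, "close": 8, "isin": 12, "date": 3}

record_bytes = sum(field_bytes.values())

# Assumptions from c).
N_STOCKS, DAYS_PER_YEAR, N_YEARS = 4500, 252, 20
n_records = N_STOCKS * DAYS_PER_YEAR * N_YEARS
total_bytes = record_bytes * n_records

# Measure from d): store the four prices as 4-byte FLOAT instead of 8-byte DOUBLE.
float_record_bytes = 4 * 4 + field_bytes["isin"] + field_bytes["date"]

print(record_bytes, "bytes per record")
print(f"{total_bytes / 1e9:.2f} GB in total")
print(f"{float_record_bytes * n_records / 1e9:.2f} GB with FLOAT prices")
```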

7
Q

Examples of slowly changing data?

A

Stock variables, contracts, industry association

8
Q

Examples of fast-changing data?

A

Flow variables, prices, assets

9
Q

What are derived quantities? (+ examples)

A

Anything that is calculated from the above quantities.
Mostly (but not always) quotients.
Examples:
GDP per capita
GDP per capita in USD
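
To make "derived" concrete, here is a sketch with purely hypothetical numbers (no real country): GDP per capita is the quotient of a flow (GDP) and a stock (population), and the USD version additionally multiplies by a price (the exchange rate).

```python
# Hypothetical illustration values, not real data.
gdp_local = 500e9    # GDP per year in local currency (a flow)
population = 10e6    # number of inhabitants (a stock)
usd_per_local = 1.1  # exchange rate (a price)

gdp_per_capita = gdp_local / population               # a quotient
gdp_per_capita_usd = gdp_per_capita * usd_per_local   # quotient times a price

print(f"{gdp_per_capita:,.0f} local currency units per person")
print(f"{gdp_per_capita_usd:,.0f} USD per person")
```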

10
Q

What are Q-quantities? (+ examples)

A

Q-quantities are priced measures:
anything that prices future utility.
Examples:
- Bonds (time value of money + inflation premium)
- Stocks (time value of money + equity premium)
- Derivatives

11
Q

What are P-quantities?
Distinguish between stocks and flows (+ examples)

A

P-quantities: anything that is countable or (physically) measurable

STOCKS (storage of things):
number of objects or quantity of material
Examples: number of employees, number of clients, size of a plot of land, debt

FLOWS: number/quantity per unit (of time)
Examples: GDP, deficit, turnover, trade volume, volatility (as quantity of risk), energy consumption (per year), products (e.g. cars) produced (per year)

12
Q

Features of a database
(name at least 3)

A

– Everything in one place
– Obvious structure for most data
– Central pillar of data workflow
– Easily share data and collaborate
– Easily create subsets

13
Q

SQL commands

A

SELECT <fields>
FROM <table>
WHERE <conditions>
ORDER BY <field>

SELECT COUNT(*) FROM opt;  -- a function: number of records

SELECT MIN(date), MAX(date) FROM opt;  -- first and last date

SELECT DISTINCT date FROM opt;  -- list of all different dates

SELECT COUNT(DISTINCT date) FROM opt;  -- how many different trade dates?

INSERT

SHOW

USE
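
The queries can be tried out without a database server using Python's built-in `sqlite3` module. This is a sketch only: the table name `opt` comes from the card, but its columns (`date`, `price`) and the four rows are hypothetical. SHOW and USE are not covered because SQLite does not provide them.

```python
import sqlite3

# In-memory database with a small hypothetical "opt" table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE opt (date TEXT, price REAL)")
rows = [
    ("2020-01-02", 10.0),
    ("2020-01-02", 10.5),
    ("2020-01-03", 11.0),
    ("2020-01-06", 9.5),
]
conn.executemany("INSERT INTO opt VALUES (?, ?)", rows)

# Number of records.
n_records = conn.execute("SELECT COUNT(*) FROM opt").fetchone()[0]

# First and last date.
first_date, last_date = conn.execute(
    "SELECT MIN(date), MAX(date) FROM opt"
).fetchone()

# List of all different dates, and how many there are.
distinct_dates = [
    r[0] for r in conn.execute("SELECT DISTINCT date FROM opt ORDER BY date")
]
n_dates = conn.execute("SELECT COUNT(DISTINCT date) FROM opt").fetchone()[0]

# SELECT ... FROM ... WHERE ... ORDER BY in one statement.
filtered = conn.execute(
    "SELECT date, price FROM opt WHERE price > 10 ORDER BY price"
).fetchall()

print(n_records, first_date, last_date, n_dates)
print(filtered)
```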

14
Q

What is a database made of?

A

Database
– TABLE
Possibly several, can be linked
⊲ FIELD
– One variable → distinct type
– “Columns”
⊲ RECORD
– One observation (individual)
– “Row”

15
Q

Goals of a relational database

A

□ Avoid duplication
□ Avoid inconsistency
□ (Increase efficiency)

16
Q

What is the role of normalization in a database?

A

Normalization:
□ Move all possible duplicate entries to a new table

17
Q

How do you link the list(s) to the main database?

A

□ Requires a unique identifier (“KEY”)
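
A small sketch of normalization and key-based linking, again using Python's built-in `sqlite3`. The table names, the placeholder ISIN and the company name are hypothetical. The company name is stored once in a separate list and linked to the price records through the key `isin`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- The list: one row per stock, identified by a unique key.
    CREATE TABLE stocks (isin TEXT PRIMARY KEY, name TEXT);
    -- The main table: refers to the list via the key only.
    CREATE TABLE prices (isin TEXT REFERENCES stocks(isin), date TEXT, close REAL);
""")

# Hypothetical placeholder data.
conn.execute("INSERT INTO stocks VALUES ('XX0000000001', 'Example Corp')")
conn.executemany(
    "INSERT INTO prices VALUES (?, ?, ?)",
    [("XX0000000001", "2020-01-02", 10.0), ("XX0000000001", "2020-01-03", 10.5)],
)

# Link both tables through the key to get the name next to each price.
joined = conn.execute(
    """
    SELECT s.name, p.date, p.close
    FROM prices AS p
    JOIN stocks AS s ON p.isin = s.isin
    ORDER BY p.date
    """
).fetchall()

print(joined)
```

If the company is renamed, only the single row in `stocks` changes, which is how the design avoids duplication and inconsistency.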

18
Q

Name at least 3 alternative database models.

A

□ Key-Value pairs
□ Multivalue model
□ Wide-column store
□ Graph database model
□ Document model
□ Column oriented
□ Data Stream

19
Q

The 2 steps of encoding data

A

Logical encoding:
efficient representation of the data to be stored in a computer, e.g. alphanumeric

Technical encoding:
store the data as bits
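
One possible reading of the two steps, sketched with the day-of-week variable: the logical step maps each value to a compact code (here an integer index), and the technical step writes that code as a fixed-width bit pattern. The mapping Monday = 0 is an arbitrary choice for illustration.

```python
DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]

def logical_encode(day: str) -> int:
    """Logical step: represent the day by a small integer code."""
    return DAYS.index(day)

def technical_encode(code: int, width: int = 3) -> str:
    """Technical step: write the code as a fixed-width string of bits."""
    return format(code, f"0{width}b")

code = logical_encode("Wednesday")
bits = technical_encode(code)

print(code, bits)  # 2 010
```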

20
Q

Principles of economic quantities?

A

Starting point

VALIDITY
Make sure that your variable actually measures what you want to know
(P-Q)

RELIABILITY
Same result in repeated measurements over time with the same method.

REPRODUCIBILITY
Same result with a different measurement method

OBJECTIVITY
Same result in measurement, when irrelevant third factors change