W06 - Database Concepts and Data Sources Flashcards

(90 cards)

1
Q

how are spatial and attribute data used in GIS?

A

spatial data relate to the geometries of spatial features

attribute data describe the characteristics of the spatial features

2
Q

how does the georelational data model (eg. a coverage) store spatial and attribute data?

A

separately and links the two by the feature ID. the 2 datasets are synchronized so they can be queried, analyzed, and displayed in unison

3
Q

how does the object-based data model (eg. a geodatabase) store spatial and attribute data?

A

combines both geometries and attributes in a single system. each spatial feature has a unique object ID and an attribute to store its geometry
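
A minimal sketch of the difference between cards 2 and 3, using plain Python structures rather than any real GIS format. The feature IDs, coordinates, and street names are made up for illustration.

```python
# Georelational model: geometries and attributes are stored separately
# and linked by a shared feature ID (hypothetical IDs 101 and 102).
geometries = {101: [(0, 0), (4, 0), (4, 3)], 102: [(5, 5), (6, 8)]}
attributes = {101: {"name": "Main St"}, 102: {"name": "Oak Ave"}}

def georelational_lookup(feature_id):
    """Bring geometry and attributes together through the feature ID."""
    return {"geometry": geometries[feature_id], **attributes[feature_id]}

# Object-based model: a single record per feature, with a unique object ID
# and a field ("shape" here, an assumed name) that stores the geometry.
features = [
    {"object_id": 1, "shape": [(0, 0), (4, 0), (4, 3)], "name": "Main St"},
    {"object_id": 2, "shape": [(5, 5), (6, 8)], "name": "Oak Ave"},
]
```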

4
Q

how does the raster data model work?

A

cell value corresponds to the value of a continuous feature at the cell location

the value attribute table summarizes cell values and their frequencies in the raster.

5
Q

how is attribute data stored?

A

in tables, organized by rows (record) and columns (field).

6
Q

what are the 2 types of attribute tables in GIS?

A

feature attribute tables and tables of nonspatial data

7
Q

what is a feature attribute table?

A

an attribute table that has access to the geometries of features.

every vector data set must have a feature attribute table

in the georelational data model, the feature attribute table uses the feature ID to link to the feature’s geometry

in the object-based data model, the feature attribute table has a field that stores the feature’s geometry

have default fields that summarize the feature geometries (ex. length for line features and area & perimeter for polygon features)

8
Q

what are tables of non-spatial data?

A

these tables do not have direct access to the feature geometry but have a field linking the table to the feature attribute table.

ex. delimited text files, dBASE files, Excel files, Access files, other database files (SQL, Oracle, etc.)

9
Q

what is a database management system (DBMS)?

A

software package that lets us build and manipulate a database.

provides tools for data input, search, retrieval, manipulation and output

ArcGIS for Desktop uses Access for managing personal geodatabases

10
Q

how is the geodatabase implemented?

A

implemented in a relational database management system and stores both geometries and attributes in a single database

11
Q

what is a client-server distributed database system?

A

a client sends a request to the server, retrieves data from the server, and processes the data on the local computer

12
Q

what are methods of classifying attribute data?

A

by data type, by measurement scale

13
Q

what are the different data types?

A

determines how an attribute is stored, typically included in the metadata of geospatial data

ex. number, text (string), date, binary large object (BLOB)

14
Q

how can numbers (data type) be stored?

A

integers (no decimal digits), float/floating point

integers can be short or long.

float can be single precision or double precision

15
Q

what do BLOBs store?

A

store images, multimedia, and feature geometries as long sequences of binary numbers

16
Q

what are the ways to classify data by measurement scale?

A

nominal, ordinal, interval, and ratio data

17
Q

what is nominal data

A

different kinds / categories of data, such as land-use types or soil types

18
Q

what is ordinal data

A

differentiates data by a ranking relationship

19
Q

what is interval data

A

have known intervals between values (ex. 60F vs 70F differ by 10F)

20
Q

what is ratio data

A

same as interval data but ratio data are based on a meaningful zero value (ex. population densities)

21
Q

categorical data

A

includes nominal and ordinal scales

22
Q

numerical data

A

includes interval and ratio scales

23
Q

what are the types of database designs?

A
  • flat file
  • hierarchical
  • network
  • relational
24
Q

what is a flat file?

A

stores all data in a large table (ex. spreadsheet)

25
what is a hierarchical database?
organizes its data at different levels and uses only one-to-many associations between levels (ex. zoning > parcel > owner)
26
what is a network database?
builds connections across tables
27
what is a common problem with hierarchical and network databases?
the linkages between tables must be known in advance and built into the database at design time. could make the database complicated and inflexible
28
what is a relational database?
collection of tables (or relations) that can be connected to each other by keys
29
what is a primary key?
one or more attributes whose values can uniquely identify a record in a table. the key value cannot be null and should never change
30
what is a foreign key?
one or more attributes that refer to a primary key in another table
31
common field
primary and foreign key with the same name
32
what are the benefits of a relational database?
simple and flexible. each table in the database can be prepared, maintained, and edited separately from the other tables. tables can remain separate until a query or analysis requires attribute data from different tables to be linked together (efficient for data management and data processing)
33
what is the SSURGO and who produces it?
the Soil Survey Geographic database, produced by the Natural Resources Conservation Service (NRCS). SSURGO data are collected from field mapping and archived in 7.5-minute quadrangle units. the data are organized by soil survey area, which may consist of a county, multiple counties, or parts of multiple counties. the database consists of spatial and tabular data for each soil survey area. the spatial data contain a detailed soil map made of soil map units (each of which may be made of one or more noncontiguous polygons). a soil map unit represents a set of geographic areas for which a common land-use management strategy is suitable.
34
what is normalization?
process of decomposition, taking a table with all the attribute data and breaking it down into small tables while maintaining the links between them
35
what are the objectives of normalization?
  • avoid redundant data in tables that waste space and can cause data integrity problems
  • ensure attribute data in separate tables can be maintained and updated separately and linked when necessary
  • facilitate a distributed database
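
A rough sketch of the decomposition described on cards 34 and 35, using plain Python lists and dicts. The parcel and owner values are hypothetical, and this only illustrates the idea of splitting a table while keeping a link, not any formal normal form.

```python
# Hypothetical flat table: the owner name repeats for every parcel an owner holds.
flat_table = [
    {"pin": "P101", "zoning": "residential", "owner_id": 1, "owner_name": "Costello"},
    {"pin": "P102", "zoning": "commercial", "owner_id": 1, "owner_name": "Costello"},
    {"pin": "P103", "zoning": "residential", "owner_id": 2, "owner_name": "Wang"},
]

# Decompose into two smaller tables; owner_id maintains the link between them.
parcels = [
    {"pin": r["pin"], "zoning": r["zoning"], "owner_id": r["owner_id"]}
    for r in flat_table
]
owners = {r["owner_id"]: {"owner_name": r["owner_name"]} for r in flat_table}

def owner_of(pin):
    """Relink the two tables only when a query needs data from both."""
    for parcel in parcels:
        if parcel["pin"] == pin:
            return owners[parcel["owner_id"]]["owner_name"]
    return None
```

Each owner name is now stored once, so it can be updated in one place.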
36
normalization performance issues
normal forms higher than the third can slow down data access and create higher maintenance costs.
37
what are the different types of relationships between records in tables?
  • one to one
  • one to many
  • many to one
  • many to many
the two tables in a relationship are designated the origin and the destination
38
one to one
one record in a table is related to only one record in another table
39
one to many
one record in a table may be related to many records in another table
40
many to one relationship
many records in a table may be related to one record in another table (ex. several households may share the same street address)
41
many to many
many records in a table may be related to many records in another table
42
what is a join
brings together 2 tables by using a common field or a primary key + foreign key (ex. joining attribute data from a nonspatial data table to a feature attribute table). recommended for one-to-one or many-to-one relationships. doesn't work for one-to-many or many-to-many relationships because only the first matching record from the destination will be assigned to the origin record
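
A small sketch of the join behavior on card 42, written in plain Python rather than any GIS package. The table contents and the `join` helper are assumptions for illustration; the helper deliberately keeps only the first matching destination record to mirror the limitation the card describes.

```python
# Origin: a hypothetical feature attribute table of soil polygons.
feature_table = [
    {"fid": 1, "musym": "Aa"},
    {"fid": 2, "musym": "Bb"},
    {"fid": 3, "musym": "Aa"},
]
# Destination: a hypothetical nonspatial table keyed by map unit symbol.
drainage_table = [
    {"musym": "Aa", "drainage": "well drained"},
    {"musym": "Bb", "drainage": "poorly drained"},
]

def join(origin, destination, key):
    """Attach the first matching destination record to each origin record."""
    first_match = {}
    for row in destination:
        first_match.setdefault(row[key], row)  # later duplicates are ignored
    return [{**row, **first_match.get(row[key], {})} for row in origin]

# Many to one: several polygons share one drainage record, so nothing is lost.
joined = join(feature_table, drainage_table, "musym")

# One to many: map unit "Aa" has two component records, but only the first survives.
components = [
    {"musym": "Aa", "component": "loam"},
    {"musym": "Aa", "component": "sand"},
]
lossy = join(feature_table, components, "musym")
```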
43
what is a relate?
operation that temporarily connects 2 tables but keeps the tables physically separate. works for all types of relationships, but slows down data access
44
what is a relationship class?
relationships between objects, predefined and stored in a geodatabase (for the object-based data model). can be one to one, many to one, one to many, or many to many. for the first 3, records in the origin are directly linked to records in the destination. for many to many, an intermediate table sorts out the associations between records
45
field definition
defines each field in the table. usually includes:
  • field name
  • data width (# of spaces reserved for a field)
  • data type
  • number of decimal digits (part of the definition for the float type)
the field definition becomes a property of the field, so it is important to consider how the field will be used before defining it
46
methods of data entry
import existing attribute files; if they don't already exist, type the data in. for map unit symbols or feature IDs, best to enter them directly in a GIS. for nonspatial data, better to use word processing or spreadsheet packages (ex. Excel, Notepad)
47
what are the 2 steps to attribute data verification?
1) make sure that attribute data are properly linked to spatial data (feature ID should be unique and contain no null values)
2) verify the accuracy of attribute data
48
what is an effective method for preventing data entry errors?
use attribute domains in the geodatabase. an attribute domain allows the user to define a valid range of values or a valid set of values for an attribute
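
The idea behind attribute domains on card 48 can be imitated in a few lines of Python. This is only an analogy, not the geodatabase mechanism itself; the field names `slope_pct` and `land_use` and the allowed values are hypothetical.

```python
# A valid range of values (like a range domain) and a valid set of values
# (like a coded value domain). Both are assumed for illustration.
SLOPE_RANGE = (0, 100)
LAND_USE_CODES = {"residential", "commercial", "agricultural"}

def validate(record):
    """Return a list of domain violations for one attribute record."""
    errors = []
    low, high = SLOPE_RANGE
    if not (low <= record["slope_pct"] <= high):
        errors.append("slope_pct outside valid range")
    if record["land_use"] not in LAND_USE_CODES:
        errors.append("land_use not in valid set")
    return errors

good = validate({"slope_pct": 12, "land_use": "commercial"})
bad = validate({"slope_pct": 140, "land_use": "comercial"})  # typo caught at entry
```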
49
what does field management entail?
adding or deleting fields and creating new attributes through classification and computation of existing attribute data
50
why is it good to delete unnecessary fields after downloading data from the internet?
reduces confusion in using the data set and also saves computer time for data processing
51
creating new attribute data by classification
data classification reduces a data set to a small number of classes (ex. reclassifying elevations into groups)
1) define a new field for saving the classification result
2) select a data subset using a query
3) assign a value to the selected data subset
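
The three steps on card 51 can be sketched with Python's built-in `sqlite3` module. The table name, elevation values, and class breaks below are assumptions chosen for illustration; steps 2 and 3 are repeated once per class.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE elevation_points (id INTEGER PRIMARY KEY, elev_m REAL)")
conn.executemany(
    "INSERT INTO elevation_points VALUES (?, ?)",
    [(1, 120.0), (2, 480.0), (3, 950.0), (4, 1500.0)],
)

# 1) define a new field for saving the classification result
conn.execute("ALTER TABLE elevation_points ADD COLUMN elev_class INTEGER")

# 2) select a data subset using a query, and
# 3) assign a value to the selected subset (once per class)
class_breaks = [(1, 0, 500), (2, 500, 1000), (3, 1000, 9999)]  # (class, low, high)
for class_id, low, high in class_breaks:
    conn.execute(
        "UPDATE elevation_points SET elev_class = ? WHERE elev_m >= ? AND elev_m < ?",
        (class_id, low, high),
    )

classes = [row[0] for row in conn.execute(
    "SELECT elev_class FROM elevation_points ORDER BY id")]
```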
52
creating new attribute data by computation
1) define a new field
2) compute the new field values from the values of existing attributes
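
A tiny sketch of the two steps on card 52, assuming a hypothetical county table with population and area fields. The new field name `pop_density` is also an assumption.

```python
# Hypothetical attribute records.
counties = [
    {"name": "A", "population": 12000, "area_km2": 300.0},
    {"name": "B", "population": 4500, "area_km2": 90.0},
]

for county in counties:
    # 1) define the new field, and 2) compute its value from existing attributes
    county["pop_density"] = county["population"] / county["area_km2"]
```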
53
what is the purpose of data exploration?
allows you to examine the general trends in the data, take a look at subsets, and focus on possible relationships between data sets. the purpose is to better understand the data and provide a starting point for formulating research questions and hypotheses
54
data visualization
discipline that uses a variety of exploratory techniques and graphics to understand and gain insight into data
55
how does data exploration in GIS differ from data exploration in statistics?
1) data exploration in GIS involves both spatial and attribute data
2) it includes maps and map features. besides descriptive statistics and graphics, data exploration in GIS must also cover map-based data manipulation, attribute data query, and spatial data query
56
range
difference between the minimum and the maximum
57
median
the midpoint value (50th percentile)
58
first quartile
the 25th percentile
59
third quartile
the 75th percentile
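
Cards 56 to 59 can be checked with Python's `statistics` module. The data values are made up, and the `"inclusive"` method is an assumption: quartiles can be computed by several conventions, so other software may report slightly different values.

```python
import statistics

data = [2, 4, 4, 5, 7, 9, 10]  # hypothetical attribute values

data_range = max(data) - min(data)      # difference between max and min
median = statistics.median(data)        # 50th percentile

# quantiles(n=4) returns the three cut points: Q1, median, Q3.
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
```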
60
mean
average of data values
61
variance
measure of the spread of the data about the mean: sum of (value - mean)^2 divided by # of values
62
standard deviation
square root of the variance
63
z score
standardized score: (x - mean) / standard deviation
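
The formulas on cards 60 to 63 written out directly in Python, on a made-up data set. The variance here divides by the number of values, as on card 61 (the population form, not the sample form that divides by n - 1).

```python
import math

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical values

mean = sum(data) / len(data)
variance = sum((x - mean) ** 2 for x in data) / len(data)
std_dev = math.sqrt(variance)

# z score: how many standard deviations each value lies from the mean
z_scores = [(x - mean) / std_dev for x in data]
```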
64
cumulative distribution graph
line graph that plots the ordered data values against the cumulative distribution values. the cumulative distribution value for the ith ordered value is (i - 0.5)/n, where n is the number of values. the values fall between 0 and 1
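
A short sketch of the (i - 0.5)/n calculation on card 64, taking i as the 1-based rank of each sorted value and n as the number of values. The data are hypothetical.

```python
data = [30, 10, 40, 20]  # hypothetical values, unsorted

ordered = sorted(data)
n = len(ordered)

# Pair each ordered value with its cumulative distribution value (i - 0.5) / n.
cumulative = [(value, (i - 0.5) / n) for i, value in enumerate(ordered, start=1)]
```

Plotting the pairs in `cumulative` as a line graph gives the cumulative distribution graph.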
65
bubble plots
a variation of scatterplots that uses varying-sized bubbles that represent a third variable
66
boxplots
show min, first quartile, median, third quartile, and max. used to tell if the distribution is symmetric or skewed and if there are any outliers
67
QQ plots
quantile-quantile plots compare the cumulative distribution of a data set with that of some theoretical distribution (ex. a normal distribution). points in a QQ plot fall along a straight line if the data set follows the theoretical distribution
68
dynamic graphs
graphics displayed in multiple and dynamically linked windows where we can directly manipulate data points
69
brushing
allows the user to graphically select a subset of points from one chart and view related data points in other graphics
70
geovisualization
data visualization that focuses on geospatial data and the integration of cartography, GIS, image analysis, and exploratory data analysis
71
what are the different types of map-based data manipulations?
data classification, spatial aggregation, and map comparison
72
what are the different methods of doing map comparisons?
1) superimpose layers on top of one another and symbolize them differently, turn the layers on and off, or use transparency
2) use map symbols that can show two data sets (ex. a bivariate choropleth map; ex. a cartogram, where the unit areas are sized proportional to one variable, such as state population, and the area symbols represent the second variable)
3) use temporal animation if there is time-dependent data
73
attribute data query
process of retrieving data by working with attributes (ex. SQL commands)
74
SQL
data query language designed for manipulating relational databases, used in a GIS to communicate with a database. a basic query has three keywords: select, from, where
ex. select Parcel.Sale_date from Parcel where Parcel.PIN = 'P101'
ex. select Parcel.Sale_date from Parcel, Owner where Parcel.PIN = Owner.PIN AND Owner.Owner_name = 'Costello'
the second query joins the two tables on PIN and then applies the condition on the owner name
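
The two example queries on card 74 can be tried with Python's built-in `sqlite3` module. The table layouts, sale dates, and the second owner name are assumptions added so the queries have something to return; only the table names, field names, 'P101', and 'Costello' come from the card.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Parcel (PIN TEXT PRIMARY KEY, Sale_date TEXT);
CREATE TABLE Owner (PIN TEXT REFERENCES Parcel(PIN), Owner_name TEXT);
INSERT INTO Parcel VALUES ('P101', '1998-01-10'), ('P102', '2005-06-30');
INSERT INTO Owner VALUES ('P101', 'Wang'), ('P102', 'Costello');
""")

# Single-table query: sale date of parcel P101.
single = conn.execute(
    "SELECT Parcel.Sale_date FROM Parcel WHERE Parcel.PIN = 'P101'"
).fetchall()

# Two-table query: the tables are joined on PIN, then filtered by owner name.
joined = conn.execute(
    "SELECT Parcel.Sale_date FROM Parcel, Owner "
    "WHERE Parcel.PIN = Owner.PIN AND Owner.Owner_name = 'Costello'"
).fetchall()
```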
75
procedural differences when querying a local database in a GIS package
1) only the WHERE clause has to be entered in the query expression box, because the field and table have typically already been selected
2) an attribute query dialog is typically designed for a single table, so if the query involves attributes from two tables, they have to be joined first
76
query expressions
the where conditions with Boolean expressions and connectors
77
Boolean expression
contains 2 operands and a logical operator. operands can be a field, number, or text. logical operators can be =, >, <, >=, <=, or <> (not equal to). can also contain arithmetic operators
78
boolean connectors
AND, OR, XOR, NOT. XOR (exclusive OR) selects only records that satisfy one and only one of the expressions, unlike AND, which requires both
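
A quick illustration of the four connectors on card 78, applied to a hypothetical parcel table with two Boolean expressions. Python has no `xor` keyword for booleans, so the sketch uses `!=` on two truth values, which is equivalent.

```python
# Hypothetical records chosen to cover every combination of the two expressions.
parcels = [
    {"pin": "P101", "zoning": "residential", "area": 1.2},  # both true
    {"pin": "P102", "zoning": "residential", "area": 0.5},  # first only
    {"pin": "P103", "zoning": "commercial", "area": 2.0},   # second only
    {"pin": "P104", "zoning": "commercial", "area": 0.3},   # neither
]

def is_residential(r):
    return r["zoning"] == "residential"

def is_large(r):
    return r["area"] > 1.0

and_sel = [r["pin"] for r in parcels if is_residential(r) and is_large(r)]
or_sel = [r["pin"] for r in parcels if is_residential(r) or is_large(r)]
xor_sel = [r["pin"] for r in parcels if is_residential(r) != is_large(r)]
not_sel = [r["pin"] for r in parcels if not is_residential(r)]
```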
79
what are the types of operations that can act on a data set?
  • add more records to a subset
  • remove records from a subset
  • select a smaller subset
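
The three operations on card 79 correspond loosely to set operations. A sketch with Python sets of hypothetical record IDs:

```python
current_subset = {1, 2, 3, 4}  # records already selected
new_query_hits = {3, 4, 5}     # records matched by a new query

added = current_subset | new_query_hits     # add more records to the subset
removed = current_subset - new_query_hits   # remove records from the subset
narrowed = current_subset & new_query_hits  # select a smaller subset
```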
80
relational database query
works with a relational database, selects a data subset in the table and also selects records related to the subset in other tables
81
what is the difference between join operation and relate operation?
a join operation combines the attribute data from 2 or more tables into a single table. a relate dynamically links the tables but keeps the tables separate
82
spatial data query
process of retrieving a data subset from a layer by working directly with feature geometries. the results can be simultaneously inspected in the map, linked to the records in the table, and displayed in charts. features can be selected spatially using a cursor, a graphic, or the spatial relationship between features
83
feature selection by graphic
draw a shape (graphic) to select objects of interest (ex. restaurants within a 1 mile radius of a hotel)
84
feature selection by spatial relationship
selects features based on their spatial or topological relationships to other features (ex. roadside rest areas within a 50-mile radius of a selected rest area; rest areas within each county). spatial relationships used for querying include containment, intersect, and proximity
85
containment (spatial query)
selects features that fall completely within features for selection
86
intersect (spatial query)
selects features that intersect features for selection
87
proximity (spatial query)
selects features that are within a specified distance of features for selection
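
A minimal sketch of a proximity query like the one on card 87, assuming point features on a flat plane with coordinates in miles. The hotel and restaurant locations and the helper name are hypothetical; real GIS packages also handle lines, polygons, and map projections.

```python
import math

hotel = (0.0, 0.0)  # the feature for selection
restaurants = {"A": (0.3, 0.4), "B": (0.6, 0.6), "C": (2.0, 0.0)}

def select_by_proximity(target, candidates, max_distance):
    """Return names of candidate points within max_distance of the target."""
    selected = []
    for name, (x, y) in candidates.items():
        if math.hypot(x - target[0], y - target[1]) <= max_distance:
            selected.append(name)
    return sorted(selected)

nearby = select_by_proximity(hotel, restaurants, 1.0)  # 1 mile radius
```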
88
spatial adjacency
features to be selected and features for selection share common boundaries and the specified distance is 0
89
raster data query - query by cell value
the operand in the query is the raster itself rather than a field. can query multiple rasters, which may be integer, floating point, or a mix of both. querying multiple rasters directly is unique to raster data
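
A sketch of a query by cell value across two rasters, as on card 89, using nested lists as stand-ins for raster grids. The elevation and slope values and the thresholds are made up; one raster is integer and the other floating point.

```python
elevation = [[100, 250, 400],
             [150, 300, 450]]      # integer raster
slope = [[2.5, 10.0, 15.5],
         [1.0, 12.0, 20.0]]        # floating-point raster

# Query: elevation > 200 AND slope < 15, evaluated cell by cell.
result = [
    [(e > 200 and s < 15.0) for e, s in zip(elev_row, slope_row)]
    for elev_row, slope_row in zip(elevation, slope)
]
```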
90
raster data query - query by select features
features can be used to query a raster; the query returns an output raster with values for cells that correspond to the query and no data in the other cells
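
A sketch of card 90 under simplifying assumptions: the select feature is a box given in row and column indices, the raster is a nested list, and `None` stands in for no data. The names are hypothetical.

```python
raster = [[1, 2, 3, 4],
          [5, 6, 7, 8],
          [9, 10, 11, 12]]
NODATA = None
box = {"row_min": 0, "row_max": 1, "col_min": 1, "col_max": 2}  # inclusive bounds

def query_by_box(grid, box):
    """Keep cell values inside the box; set every other cell to no data."""
    return [
        [
            value
            if box["row_min"] <= r <= box["row_max"]
            and box["col_min"] <= c <= box["col_max"]
            else NODATA
            for c, value in enumerate(row)
        ]
        for r, row in enumerate(grid)
    ]

output = query_by_box(raster, box)
```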