Chapter 4, Data Management Patterns Flashcards
What are Data Sources?
Here, data sources are cloud native applications that feed data such as user inputs and sensor readings. They sometimes feed data into data-ingestion systems such as message brokers or, when possible, write directly to data stores. Data-ingestion systems can transfer data as events/messages to other applications or data stores.
160 Figure 4-1. Data architecture for cloud native applications
What do batch-processing systems do?
Batch-processing systems process data from data sources in batches, and write the processed output back to the data stores so it can be used for reporting or exposed via APIs.
161 Figure 4-1. Data architecture for cloud native applications
What are the three main types of data that influence Application behavior?
- Input data
Sent as part of the input message by the user or client. Most commonly, this data is either JSON or XML messages, though binary formats such as gRPC and Thrift are getting some traction.
- Configuration data
Provided by the environment as variables. XML has been used as the configuration language for a long time, and now YAML configs have become the de facto standard for cloud native applications.
- State data
The data stored by the application itself, regarding its status, based on all messages and events that occurred before the current time. By persisting the state data and loading it on startup, the application will be able to seamlessly resume its functionality upon restart.
162
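The point about state data can be illustrated with a minimal sketch. The `Counter` class, the `events_seen` field, and the JSON state file are all hypothetical names chosen for this example; a real application would more likely persist its state in a database.

```python
import json
import os
import tempfile


class Counter:
    """Toy application whose state is the number of events seen so far."""

    def __init__(self, state_path):
        self.state_path = state_path
        self.events_seen = 0
        # Load persisted state on startup, if a state file already exists.
        if os.path.exists(state_path):
            with open(state_path) as f:
                self.events_seen = json.load(f)["events_seen"]

    def handle_event(self):
        self.events_seen += 1
        # Persist after every event so a restart can resume from here.
        with open(self.state_path, "w") as f:
            json.dump({"events_seen": self.events_seen}, f)


state_file = os.path.join(tempfile.mkdtemp(), "state.json")
app = Counter(state_file)
app.handle_event()
app.handle_event()

# Creating a second instance simulates a process restart.
restarted = Counter(state_file)
print(restarted.events_seen)
```

The restarted instance picks up the count from the file instead of starting from zero, which is the resume-on-restart behavior the card describes.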
What are the three categories of data that Cloud native applications use?
- Structured data
Can fit a predefined schema. For example, the data on a typical user registration form can be comfortably stored in a relational database.
- Semi-structured data
Has some form of structure. For example, each field in a data entry may have a corresponding key or name that we can use to refer to it, but when we take all the entries, there is no guarantee that each entry will have the same number of fields or even common keys. This data can be easily represented through JSON, XML, and YAML formats.
- Unstructured data
Does not contain any meaningful fields. Images, videos, and raw text content are examples. Usually, this data is stored without any understanding of its content.
164
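As a small illustration of semi-structured data, the sketch below parses two JSON entries whose fields differ. The field names and values are invented for this example.

```python
import json

# Two entries of the same kind of record, but with different fields.
entries = [
    json.loads('{"name": "Ada", "email": "ada@example.com"}'),
    json.loads('{"name": "Lin", "phone": "555-0100", "tags": ["vip"]}'),
]

# Keys present in every entry versus keys that appear in only some entries.
common_keys = set.intersection(*(set(e) for e in entries))
all_keys = set.union(*(set(e) for e in entries))
optional_keys = all_keys - common_keys

print(sorted(common_keys), sorted(optional_keys))
```

Each field can still be addressed by its key, yet code that reads these entries cannot assume any particular key exists, which is why such data does not map cleanly onto a fixed relational schema.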
What are ACID properties?
- Atomicity
- Consistency
- Isolation
- Durability
165
Define Atomicity from ACID
Atomicity guarantees that all operations within a transaction are executed as a single unit: either all of them take effect or none of them do.
165
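Atomicity can be sketched with Python's built-in `sqlite3` module. The `accounts` table, the `transfer` helper, and the balances are assumptions made for this example. The `CHECK` constraint stands in for a consistency rule, and the failed transfer shows that a partially applied transaction is rolled back as a whole.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 0)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst; return False if the transfer is rejected."""
    try:
        # Using the connection as a context manager commits on success
        # and rolls back if an exception is raised inside the block.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


# The credit to bob succeeds, the debit from alice violates the CHECK
# constraint, so the whole transaction is rolled back.
ok = transfer(conn, "alice", "bob", 150)
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
print(ok, balances)
```

Even though the first `UPDATE` ran, neither balance changes after the failed transfer.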
Define Consistency from ACID
Consistency ensures that the data is consistent before and after the transaction.
165
Define Isolation from ACID
Isolation makes the intermediate state of a transaction invisible to other transactions.
165
Define Durability from ACID
Durability guarantees that after a successful transaction, the data is persistent even in the event of a system failure.
165
What does the CAP in CAP theorem stand for?
CAP stands for consistency, availability, and partition tolerance. This theorem states that a distributed application can provide either full availability or consistency; we cannot achieve both while providing network partition tolerance. Here, availability means that the system is fully functional when some of its nodes are down, consistency means an update/change in one node is immediately propagated to other nodes, and partition tolerance means that the system can continue to work even when some nodes cannot connect to each other.
169
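The trade-off in the CAP theorem can be made concrete with a toy model. Everything here is hypothetical: the `TwoNodeStore` class, the `"CP"` and `"AP"` mode labels, and the `partitioned` flag are invented for illustration and do not reflect how any real distributed store is implemented.

```python
class TwoNodeStore:
    """Toy store with two replicas, A and B. Writes always arrive at A."""

    def __init__(self, mode):
        self.mode = mode          # "CP" favors consistency, "AP" favors availability
        self.a = None             # value held by replica A
        self.b = None             # value held by replica B
        self.partitioned = False  # True when A and B cannot reach each other

    def write(self, value):
        if not self.partitioned:
            # Healthy network: the update is propagated to both replicas.
            self.a = self.b = value
            return
        if self.mode == "CP":
            # Cannot replicate, so refuse the write and give up availability.
            raise RuntimeError("write rejected during partition")
        # AP: accept the write locally; replica B is now stale.
        self.a = value


cp, ap = TwoNodeStore("CP"), TwoNodeStore("AP")
for store in (cp, ap):
    store.write(1)
    store.partitioned = True

try:
    cp.write(2)
    cp_accepted = True
except RuntimeError:
    cp_accepted = False

ap.write(2)
print(cp_accepted, (cp.a, cp.b), (ap.a, ap.b))
```

During the partition, the CP store stays consistent but rejects the write, while the AP store accepts it and its two replicas disagree until the partition heals.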
What are the three types of data stores?
- Relational
- NoSQL
- Filesystem
172
What are the three techniques by which data can be managed?
- Centralized
- Decentralized
- Hybrid
172
Describe the Data Service Pattern
The Data Service pattern exposes data in the database as a service, referred to as a data service. The data service becomes the owner, responsible for adding and removing data from the data store. The service may perform simple lookups or even encapsulate complex operations when constructing responses for data requests.
180
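A minimal sketch of the Data Service pattern follows, assuming an in-memory SQLite database as the underlying store. The `CustomerDataService` class, its method names, and the `customers` table are hypothetical. In practice the methods would typically be exposed over an API rather than called in-process.

```python
import sqlite3


class CustomerDataService:
    """Sole owner of the customers table; callers never touch the database."""

    def __init__(self):
        self._db = sqlite3.connect(":memory:")
        self._db.execute(
            "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, city TEXT)"
        )

    def add_customer(self, name, city):
        with self._db:
            cur = self._db.execute(
                "INSERT INTO customers (name, city) VALUES (?, ?)", (name, city)
            )
        return cur.lastrowid

    def remove_customer(self, customer_id):
        with self._db:
            self._db.execute("DELETE FROM customers WHERE id = ?", (customer_id,))

    def get_customer(self, customer_id):
        # Simple lookup.
        row = self._db.execute(
            "SELECT id, name, city FROM customers WHERE id = ?", (customer_id,)
        ).fetchone()
        return None if row is None else {"id": row[0], "name": row[1], "city": row[2]}

    def count_by_city(self):
        # A more complex operation encapsulated behind the service.
        return dict(
            self._db.execute("SELECT city, COUNT(*) FROM customers GROUP BY city")
        )


service = CustomerDataService()
ada = service.add_customer("Ada", "London")
lin = service.add_customer("Lin", "Paris")
sam = service.add_customer("Sam", "London")
service.remove_customer(lin)
print(service.get_customer(ada), service.count_by_city())
```

Because all reads and writes go through the service, the storage technology behind it can change without affecting its consumers.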
How is the Data Service Pattern used?
This pattern can be used when we need to allow access to data that does not belong to a single microservice, or when we need to abstract legacy/proprietary data stores to other cloud native applications.
181
What are some related patterns to the Data Service pattern?
- Caching pattern
Provides an opportunity to optimize the efficiency of data retrieval by using local or distributed caching when exposing data via a service.
- Performance optimization patterns
Apart from caching data, these execute complex queries such as table joins and running stored procedures directly in the database to improve performance.
- Materialized View pattern
Accessing data via an API can still be performance-intensive. For use cases that need joins to be performed with data that resides in stores belonging to other services, having that data replicated in its local store and building a materialized view can help improve query performance.
- Vault Key pattern
Along with API security, knowing who is accessing the data can help identify the caller and enforce adequate security and data protection.
183
Describe the Composite Data Services Pattern
The Composite Data Services pattern performs data composition by combining data from more than one data service and, when needed, performs fairly complex aggregation to provide a richer and more concise response. This pattern is also called the Server-Side Mashup pattern, as data composition happens at the service and not at the data consumer.
185
How does the Composite Data Services Pattern work?
The Composite Data Services Pattern combines data from various services and its own data store into one composite data service. This pattern not only eliminates the need for multiple microservices to perform data composition operations, but also allows the combined data to be cached for improving performance (Figure 4-11).
185 Figure 4-11. Composite Data Services pattern
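The composition and caching described above can be sketched as follows. The two backing services, the `CustomerOrderSummaryService` class, and the sample data are all hypothetical, and the backing services are plain in-process objects here rather than remote data services. The cache is a simple dictionary with no expiry or invalidation.

```python
class CustomerService:
    """Stand-in for a fine-grained data service that owns customer data."""

    def get(self, customer_id):
        return {1: {"name": "Ada"}}[customer_id]


class OrderService:
    """Stand-in for a fine-grained data service that owns order data."""

    def __init__(self):
        self.calls = 0  # counts backend calls so caching is observable

    def orders_for(self, customer_id):
        self.calls += 1
        return [{"amount": 30}, {"amount": 12}]


class CustomerOrderSummaryService:
    """Composite service: combines both sources and caches the result."""

    def __init__(self, customers, orders):
        self._customers = customers
        self._orders = orders
        self._cache = {}

    def summary(self, customer_id):
        if customer_id in self._cache:
            return self._cache[customer_id]
        customer = self._customers.get(customer_id)
        orders = self._orders.orders_for(customer_id)
        result = {
            "name": customer["name"],
            "order_count": len(orders),
            "total_spent": sum(o["amount"] for o in orders),
        }
        self._cache[customer_id] = result
        return result


orders = OrderService()
composite = CustomerOrderSummaryService(CustomerService(), orders)
first = composite.summary(1)
second = composite.summary(1)  # served from the cache
print(first, orders.calls)
```

Clients make one call and receive the aggregated view, and the second request never reaches the backing services.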
How is the Composite Data Services Pattern used in practice?
This pattern can be used when we need to eliminate multiple microservices repeating the same data composition. Data services that are fine-grained force clients to query multiple services to build their desired data. We can use this pattern to reduce duplicate work done by the clients and consolidate it into a common service.
187
What are some considerations when using the Composite Data Services Pattern?
Use this pattern only when the consolidation is generic enough and other microservices will be able to reuse the consolidated data. We do not recommend introducing unnecessary layers of services if they do not provide meaningful data compositions that can be reused. Weigh the benefits of reusability and simplicity of the clients against the additional latency and management complexity added by the service layers.
187
What are some patterns related to the Composite Data Services pattern?
- Caching pattern
Provides an opportunity to optimize the efficiency of data retrieval and helps achieve resiliency by serving data from the cache when backends are not available.
- Client-Side Mashup pattern
Allows the data mashup to happen at the client side, such as in the user's browser. This can be a good solution when asynchronous data loading is feasible and when meaningful data composition can be performed with partial data.
187
Describe the Client-Side Mashup Pattern
In the Client-Side Mashup pattern, data is retrieved from various services and consolidated at the client side. The client is usually a browser loading data via asynchronous Ajax calls.
188
How does the Client-Side Mashup Pattern work?
This pattern utilizes asynchronous data loading, as shown in Figure 4-12. For example, when a browser using this pattern is loading a web page, it loads and renders part of the web page first, while loading the rest of the web page. This pattern uses client-side scripts such as JavaScript to asynchronously load the content in the web browser.
Rather than letting the user wait for a longer time by loading all content on the website at once, this pattern uses multiple asynchronous calls to fetch different parts of the website and renders each fragment when it arrives. These applications are also referred to as rich internet applications (RIAs).
188 Figure 4-12. Client-Side Mashup at a web browser
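The card describes browser-side JavaScript. As a rough analog, the sketch below uses Python's `asyncio` to show the same idea of rendering each fragment as soon as it arrives. The fragment names, the delays, and the `fetch_fragment` helper are invented for this example, and `asyncio.sleep` stands in for a network call to a backing service.

```python
import asyncio


async def fetch_fragment(name, delay):
    """Simulate an asynchronous call that returns one part of the page."""
    await asyncio.sleep(delay)
    return name, f"<section>{name}</section>"


async def load_page():
    rendered = []
    # Start all requests at once instead of waiting for each in turn.
    tasks = [
        asyncio.create_task(fetch_fragment("reviews", 0.12)),
        asyncio.create_task(fetch_fragment("header", 0.01)),
        asyncio.create_task(fetch_fragment("recommendations", 0.06)),
    ]
    # Handle each fragment as soon as it is available, fastest first.
    for finished in asyncio.as_completed(tasks):
        name, html = await finished
        rendered.append(name)  # a real client would insert `html` into the page
    return rendered


render_order = asyncio.run(load_page())
print(render_order)
```

The fragments are rendered in arrival order rather than request order, so the fastest content appears first while slower content is still loading.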
How is the Client-Side Mashup Pattern used in practice?
This pattern can be used when we need to present available data as soon as possible, while providing more detail later, or when we want to give a perception that the web page is loading much faster.
190
What are some considerations for the Client-Side Mashup Pattern?
Use this pattern only when the partial data loaded first can be presented to the user or used in a meaningful way. We do not advise using this pattern when the retrieved data needs to be combined and transformed with later data via some sort of a join before it can be presented to the user.
191