Storage Flashcards

1
Q

Azure Storage Overview

A

Is a set of services that Microsoft provides to store highly available data and access it in the cloud.

-It’s massively scalable, so you can store petabytes of data
-Great accessibility: it’s built for worldwide public internet access
-It’s managed for you

Types of Storage:

-Blob Storage: Is an object store, typically used when you store things like text and binary data
-Files: Is a managed file server (if you need a folder hierarchy)
-Queue: Is a messaging service that lets decoupled components of your applications communicate with one another
-Table: For storing data from your application (schemaless structured data - NoSQL store)

Architecture

  1. Storage Account: Special container with important properties for the storage service
    -Name: Unique name used to create public DNS record for accessing storage
    -Performance: Standard (lowest cost, HDD backed) or Premium (higher cost, SSD backed)
    -Type: General Purpose v2, or Premium (Block Blobs, Page Blobs, File Shares). Additional legacy options
    -Redundancy: Protect data by replicating across hardware and datacenters
  2. Storage Services: Multiple storage services can exist within a single storage account
  3. Public Endpoints: Storage services are built for public accessibility by design
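
A minimal sketch of how the unique account name maps to the public DNS endpoints of each service; the account name mystorageacct is a placeholder, and the azure-storage-blob / azure-identity Python packages are assumed:

```python
# Sketch: the globally unique account name becomes part of each service's public endpoint.
from azure.identity import DefaultAzureCredential
from azure.storage.blob import BlobServiceClient

account = "mystorageacct"  # placeholder; must be globally unique (used in the public DNS records)
endpoints = {
    "blob":  f"https://{account}.blob.core.windows.net",
    "file":  f"https://{account}.file.core.windows.net",
    "queue": f"https://{account}.queue.core.windows.net",
    "table": f"https://{account}.table.core.windows.net",
}

# Connect to the Blob service over its public endpoint using an Azure AD identity (RBAC)
blob_service = BlobServiceClient(endpoints["blob"], credential=DefaultAzureCredential())
for container in blob_service.list_containers():
    print(container.name)
```
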
2
Q

Azure Blob Storage Overview

A

Is built for web access to binary objects that don’t require any structure.

Architecture

  1. Storage Account: Requires GPv2, BlockBlob, PageBlob (or legacy: GPv1, BlobStorage)
  2. Blob Container: Container for managing access to unstructured data (no hierarchy)
  3. Blobs: Are the actual objects/files that are stored

-You don’t get hierarchy unless you turn on a feature called “hierarchical namespace”

Blob Types
-Block Blobs: Most common type of blob, for storing binary/text data (standard files)
-Append Blobs: Like block blobs, but built for append operations (e.g. logging data)
-Page Blobs: Random access files. Used for VM disks and Azure SQL DB files

Blob Sub-Types
-Blob Version: Retain version history of blobs automatically when edited. (Version Control)
-Blob Snapshot: Read-only point-in-time copy of a blob (only stores differences)
-Soft-Deleted Blobs: Blobs that have been deleted but are kept for a specified retention period
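
A short sketch of the blob types and a snapshot using the azure-storage-blob Python SDK; the connection string, container name (demo) and blob names are placeholders:

```python
from azure.storage.blob import BlobServiceClient

service = BlobServiceClient.from_connection_string("<connection-string>")
container = service.get_container_client("demo")

# Block blob: the standard object/file type
block = container.get_blob_client("report.txt")
block.upload_blob(b"hello world", overwrite=True)

# Append blob: built for append-only workloads such as logging
log = container.get_blob_client("app.log")
log.create_append_blob()
log.append_block(b"2024-01-01 first log line\n")

# Page blob: fixed-size, random-access (used for VM disks); size must be a multiple of 512
page = container.get_blob_client("disk.vhd")
page.create_page_blob(size=1024 * 1024)

# Snapshot: read-only point-in-time copy (only differences are stored)
snapshot = block.create_snapshot()
print(snapshot["snapshot"])  # identifier of the snapshot
```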

3
Q

Azure Storage Access Control

A

Access Keys
1. Storage Account: Access Keys are managed at the SA level
2. Access Keys provide full unrestricted access to the SA.
-Two keys (primary and secondary) are created by default when you create the SA
-They provide full access via HTTP/S over the REST API to any data in your SA
-Can be used by all services for REST, and even Azure Files via SMB
3. Management: Keys can be rotated, or disabled entirely for a SA
-You can integrate Key Vault for the rotations
-You can disable the keys entirely

Azure AD Identities (RBAC)
1. Identity: A user, group, app, or managed identity to be provided data access
2. Scope: Supports standard RBAC hierarchy down to individual storage service
3. Role: Supports built-in and custom RBAC roles that target data operations

Shared Access Signatures (SAS)
-URLs with a token granting limited, time-bound access to specific resources.
-Fine-grained control over access without sharing keys.

  1. Service SAS: A token that can provide restricted access to an individual service
    -Stored Access Policy: Facilitates server-side control over service shared access signatures
  2. Account SAS: Provide access to one or more services, including service-level operations

We can create them in two ways:

  1. By default, you are creating it by using the account key of the storage account
    -If you rotate that access key, that would revoke any SAS that were created with it
  2. User-Delegation SAS: SAS associated with an Azure AD identity that only supports blob storage
    -Instead of using a key to sign the SAS token, it’s going to be created by an identity within Azure AD (much more secure)
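
A hedged sketch of both signing options using the azure-storage-blob SDK; the account name, container/blob names, and the account key are placeholders:

```python
from datetime import datetime, timedelta, timezone
from azure.identity import DefaultAzureCredential
from azure.storage.blob import BlobServiceClient, BlobSasPermissions, generate_blob_sas

account = "mystorageacct"
expiry = datetime.now(timezone.utc) + timedelta(hours=1)

# 1. Default: SAS signed with a storage account access key.
#    Rotating that key revokes every SAS it signed.
key_sas = generate_blob_sas(
    account_name=account,
    container_name="demo",
    blob_name="report.txt",
    account_key="<account-key>",
    permission=BlobSasPermissions(read=True),
    expiry=expiry,
)

# 2. User-delegation SAS: signed with a key obtained via an Azure AD identity (blob storage only)
service = BlobServiceClient(
    f"https://{account}.blob.core.windows.net", credential=DefaultAzureCredential()
)
udk = service.get_user_delegation_key(
    key_start_time=datetime.now(timezone.utc), key_expiry_time=expiry
)
ud_sas = generate_blob_sas(
    account_name=account,
    container_name="demo",
    blob_name="report.txt",
    user_delegation_key=udk,
    permission=BlobSasPermissions(read=True),
    expiry=expiry,
)

print(f"https://{account}.blob.core.windows.net/demo/report.txt?{key_sas}")
```
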
4
Q

Azure Storage Redundancy

A

Redundancy Types
-Locally Redundant Storage (LRS): Creates three copies synchronously within a single physical location
-Zone Redundant Storage (ZRS): Creates three copies synchronously across three AZs within the region
-Geo Redundant Storage (GRS): LRS is followed by one asynchronous copy to the secondary region (3:1)
-Geo Zone Redundant Storage (GZRS): ZRS is followed by one asynchronous copy to the secondary region ((1-1-1):1)

Secondary Read Access
-Supported by RA-GRS or RA-GZRS (without the need for a failover to be triggered)
-Can help ensure continuity of access in the event of any outages
-The copy in the secondary region is available via a public endpoint

Storage Account Failover
-Storage account failover is initiated by the customer, manually
-Data not yet replicated to the secondary can be lost, and the secondary will become the new primary
-After failover, the new secondary will be configured as locally redundant (LRS)
-Failover can result in data loss, because replication is asynchronous
-Microsoft will update the DNS when you trigger the failover so that applications point to the secondary

Important Considerations
-Not all types of redundancy are supported by all storage account types (especially Premium)
-Redundancy should not be relied on for data backup; it is for disaster recovery
-You can convert from/to many redundancy types. Some require a support ticket or special process
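
One way the conversion and failover could look with the azure-mgmt-storage management SDK (a sketch; the subscription, resource group, and account names are placeholders, and not every SKU conversion is possible with a simple update):

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.storage import StorageManagementClient

client = StorageManagementClient(DefaultAzureCredential(), "<subscription-id>")

# Convert the account's redundancy by changing its SKU (e.g. to RA-GZRS).
# Some conversions require a support ticket or special process instead.
client.storage_accounts.update(
    "my-rg", "mystorageacct", {"sku": {"name": "Standard_RAGZRS"}}
)

# Customer-initiated failover: promotes the secondary region to primary.
# Writes not yet replicated can be lost, and the account becomes LRS afterwards.
poller = client.storage_accounts.begin_failover("my-rg", "mystorageacct")
poller.result()
```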

5
Q

Blob Storage Access Tiers

A

Hot Tier
-An online tier optimized for storing data that is accessed or modified frequently
-Highest storage costs, but lowest access costs

Cool Tier:
-An online tier optimized for storing data that is infrequently accessed or modified.
-Lower storage costs, but higher access costs (Online)
-Should be stored for a minimum of 30 days (a fee applies if deleted or moved to another tier earlier)

Cold Tier:
-An online tier optimized for storing data that is rarely accessed or modified, but still requires fast retrieval.
-Should be stored for a minimum of 90 days.
-Lower storage costs and higher access costs compared to the cool tier.

Archive Tier:
-An offline tier optimized for storing data that is rarely accessed, and that has flexible latency requirements, on the order of hours.
-Lowest storage costs, but highest access costs (Offline) (Latency)
-Fee if deleted/moved to another tier earlier than 180 days
-You can’t use it with any ZRS-based redundancy (ZRS, GZRS, or RA-GZRS)

-Rehydration: The process of moving blobs from the archive tier back to an online tier

Architecture
1. Storage Account: Supports General Purpose V2. Not supported by Premium BlockBlob
2. Blobs: Supports block blobs only. Page/append blobs are not supported
3. Access Tier: Default is defined for a storage account. Can be assigned per blob
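
A minimal sketch of per-blob tiering and rehydration with azure-storage-blob; the connection string, container, and blob names are placeholders:

```python
from azure.storage.blob import BlobServiceClient

service = BlobServiceClient.from_connection_string("<connection-string>")
blob = service.get_blob_client("demo", "old-report.txt")

# Move a block blob to the Archive tier (offline, cheapest storage)
blob.set_standard_blob_tier("Archive")

# Rehydrate: move it back to an online tier; this can take hours
blob.set_standard_blob_tier("Hot", rehydrate_priority="Standard")

# Tier and rehydration status are visible on the blob properties
props = blob.get_blob_properties()
print(props.blob_tier, props.archive_status)
```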

6
Q

Blob Storage Lifecycle Management

A

Lifecycle Management Policies let us automate certain actions on blobs:
-Moving blobs between tiers
-Deleting blobs after a number of days

Configuration
1. Storage Account: General Purpose V2, Premium BlockBlob, BlobStorage (legacy)
2. Blobs: Supports block and append blobs (and sub-types: versions, snapshots)
3. Management Policy: Supports complex rules with filters, blob sub-types, and actions
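
A sketch of a management policy created through azure-mgmt-storage; the rule name, prefix filter, and day thresholds are illustrative, and the policy body mirrors the ARM JSON shape:

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.storage import StorageManagementClient

client = StorageManagementClient(DefaultAzureCredential(), "<subscription-id>")

policy = {
    "policy": {
        "rules": [
            {
                "name": "tier-and-expire-logs",  # illustrative rule name
                "enabled": True,
                "type": "Lifecycle",
                "definition": {
                    "filters": {"blobTypes": ["blockBlob"], "prefixMatch": ["logs/"]},
                    "actions": {
                        "baseBlob": {
                            "tierToCool": {"daysAfterModificationGreaterThan": 30},
                            "tierToArchive": {"daysAfterModificationGreaterThan": 90},
                            "delete": {"daysAfterModificationGreaterThan": 365},
                        }
                    },
                },
            }
        ]
    }
}

# A storage account has a single management policy, always named "default"
client.management_policies.create_or_update("my-rg", "mystorageacct", "default", policy)
```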

7
Q

Block Blob Object Replication

A

The ability to have readable secondary replicas in other areas of the world (replicas of your primary storage account).

-Only read access
-Ensure your users get low latency to read the data

Implementation

  1. Source Storage Account: General Purpose V2 or Premium Block Blob storage accounts
  2. Destination Storage Account(s): Up to two destination accounts (can be same/different region/sub/tenant)
  3. Replication Policy: Source & destination containers for replication, including rules/filters
    -What blobs do we want to replicate
    -Asynchronous replication (there can be a delay between data being written at the primary source SA and it reaching the secondary destination)
    -Access tiers are supported, but blobs cannot be in the Archive tier

Useful if you need read access for read-heavy applications across the globe, or if you want to manage the distribution of the data more efficiently.

To be able to do the replication, both SAs need the following enabled in the Data Protection section:
-Enable versioning for blobs
-Enable blob change feed
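
A rough sketch of an object replication policy defined through azure-mgmt-storage (created against the destination account); all account, container, and filter names are placeholders:

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.storage import StorageManagementClient
from azure.mgmt.storage.models import (
    ObjectReplicationPolicy,
    ObjectReplicationPolicyRule,
    ObjectReplicationPolicyFilter,
)

client = StorageManagementClient(DefaultAzureCredential(), "<subscription-id>")

# Replicate blobs with the "catalog/" prefix from the source container to the destination
policy = ObjectReplicationPolicy(
    source_account="sourcestorageacct",
    destination_account="deststorageacct",
    rules=[
        ObjectReplicationPolicyRule(
            source_container="products",
            destination_container="products",
            filters=ObjectReplicationPolicyFilter(prefix_match=["catalog/"]),
        )
    ],
)

# "default" lets the service generate the policy ID on first creation
client.object_replication_policies.create_or_update("my-rg", "deststorageacct", "default", policy)
```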

8
Q

Immutable Blob Storage

A

Immutable Blob Storage Access Policies (time-based retention) help us put data into a special state called “WORM” (Write Once, Read Many)

-It’s common when working with compliance, industry requirements, or legal issues where data must not be modified

Example: must keep data for 2 years!
+Create and read blobs
-Delete/Modify blobs (less than 2 years old)

-If you create a policy for a container that is time-based, you can lock it and say, “no one can modify this data for two years”

Legal-Hold
+Create and read blobs
-Delete/Modify blobs (irrespective of age)

Implementation

  1. Storage Account: Supports General Purpose V2 and Premium Block Blob
  2. (Access) Policy
    -Time: Retention period, which can optionally be locked
    -Legal: Legal hold switched on/off as needed (applied with one or more tags)
  3. Scope: Policies can be Container-level-scoped or Blob/Version-level-scoped
    -For blob/version-level scope, your SA must have “Version Level Immutability” turned on

-If you want the “Version Level Immutability” feature at the SA scope, you must enable it when creating the SA (it can’t be added afterwards)
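
A sketch of a time-based retention policy and a legal hold applied to a container via azure-mgmt-storage; the resource names and the 2-year (730-day) period are placeholders:

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.storage import StorageManagementClient
from azure.mgmt.storage.models import ImmutabilityPolicy, LegalHold

client = StorageManagementClient(DefaultAzureCredential(), "<subscription-id>")

# Time-based retention: blobs in the "records" container are WORM for 730 days (2 years)
client.blob_containers.create_or_update_immutability_policy(
    "my-rg", "mystorageacct", "records",
    parameters=ImmutabilityPolicy(immutability_period_since_creation_in_days=730),
)

# Legal hold: WORM state irrespective of age, identified by one or more tags
client.blob_containers.set_legal_hold(
    "my-rg", "mystorageacct", "records", legal_hold=LegalHold(tags=["case-2024-001"])
)

# Locking a time-based policy makes it irreversible; it requires the policy's current etag
# client.blob_containers.lock_immutability_policy("my-rg", "mystorageacct", "records", if_match="<etag>")
```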

9
Q

Azure Storage Encryption

A

We get encryption to protect both traffic and data at rest. For storage sitting at rest, we can use Azure Storage Service Encryption (SSE) (on by default)

-There’s also a feature called “Infrastructure Encryption” if you want to doubly encrypt: add another layer of encryption to your data as it’s processed through Microsoft’s infrastructure (must be enabled at SA creation)

-For the encryption of the traffic flowing from our SA to our users, we can use SSL/TLS (can be enforced)

Configuration

  1. Service-Side Encryption: Encryption of the Azure Storage data in Microsoft infrastructure at rest
  2. Encryption Key (Key Vault): Two main options for the key that encrypts our SA:
    -PMK: Platform Managed Key (encrypts the data)
    -CMK (Optional): Customer Managed Key (used to wrap the PMK)
    –When you are setting up the SA, you get the option to say whether only Blob and Files are protected by your CMK (you can’t change this after creation)
    –You have to configure an Identity: A managed identity is required for the SA to access Key Vault. This identity is granted permissions to access the KV.
  3. Encryption Scopes: Optionally define blobs/containers to be encrypted with a CMK or PMK
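
A hedged sketch of pointing a storage account at a customer-managed key in Key Vault via azure-mgmt-storage; the Key Vault URI, key name, and resource names are placeholders, and the account identity is assumed to already have access to the key:

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.storage import StorageManagementClient

client = StorageManagementClient(DefaultAzureCredential(), "<subscription-id>")

client.storage_accounts.update(
    "my-rg", "mystorageacct",
    {
        # The account needs a managed identity that has been granted access to the Key Vault key
        "identity": {"type": "SystemAssigned"},
        "properties": {
            "encryption": {
                "keySource": "Microsoft.Keyvault",
                "keyvaultproperties": {
                    "keyname": "storage-cmk",                                # placeholder key name
                    "keyvaulturi": "https://my-keyvault.vault.azure.net",    # placeholder vault URI
                },
            }
        },
    },
)
```
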
10
Q

Azure Files

A

Is a cloud-based file sharing service. It allows users to create highly available network file shares that can be accessed from multiple Azure virtual machines or from on-premises systems.

-Windows: SMB Share (GPv2 or Premium) - Win32
-Linux: NFS Share (Premium) - POSIX

Storage Tiers

Premium (SSD)
-Highest price for high performance, single-digit ms latency
-Supports both SMB and NFS shares
-Only supports provisioned billing (if you provision 100 GB, you pay for 100 GB)

Transaction Optimized (HDD)
-High price for storage with low costs for transactions
-Only supports SMB shares
-Use pay-as-you-go billing

Hot (HDD)
-Mid price for both storage and transactions
-Only supports SMB shares
-Use pay-as-you-go billing

Cool (HDD)
-Lowest price for storage, but high price for transactions
-Only supports SMB shares
-Use pay-as-you-go billing

Architecture

  1. Storage Account: Supports General Purpose V2 and Premium FileStorage
  2. File Share (configured at creation time)
    -SMB: Supports all tiers/redundancy
    -NFS: Premium and LRS/ZRS only
  3. Client Connectivity: Accessed using REST API for apps or mounted with SMB/NFS (for users)
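
A minimal sketch of creating and writing to an SMB file share through the azure-storage-file-share SDK (the REST path for apps); the connection string, share name, and file names are placeholders:

```python
from azure.storage.fileshare import ShareServiceClient

service = ShareServiceClient.from_connection_string("<connection-string>")

# Create an SMB file share with a 100 GiB quota
share = service.get_share_client("team-docs")
share.create_share(quota=100)

# Create a directory and upload a file into it
directory = share.get_directory_client("reports")
directory.create_directory()
with open("q1.xlsx", "rb") as data:
    directory.upload_file("q1.xlsx", data)
```
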
11
Q

Azure Files Connectivity and Access Control

A

Connectivity

SMB Shares
-Connection: SMB 3.0+ via internet (2.1 via VNet)
-Access Control Lists: Win32 ACLs, with Kerberos and NTLMv2

NFS Shares
-Connection: NFS 4.1 via VNet ONLY
-Access Control Lists: POSIX, with host-based auth
There is no user-based authentication; access is controlled through the network

Identity-Based Authentication (SMB) - Active Directory Sources

-Active Directory: Synchronized identities that authenticate against on-prem AD DS

-Azure AD Domain Services: Azure AD identities (cloud or hybrid) that authenticate against Azure AD DS
–We get access to a lot of the traditional capabilities, like “domain join”; for example, we could domain-join a VM to that managed Azure AD DS domain
–You can have an on-prem domain controller
–You can use cloud identities or also synchronized identities
–This solution provides the most support and flexibility

-Azure AD Kerberos: Hybrid identities that authenticate against Azure AD via (Hybrid) Azure AD-joined devices
–For employees working remotely without a connection to the on-premises domain controller, who otherwise wouldn’t be able to authenticate against the Azure file share using their identity
–You need the device they are accessing the file share from to be Azure AD joined
–Those users don’t need to be in the private VNet or on the private office or on-premises network, because they will authenticate against the Azure AD Tenant

Implementation Steps

Once you have selected your option, you go and configure it at the SA level:

  1. Configure the Identity Source: Enable the AD source. Only one can be selected for a storage account
  2. Assign Share Permissions: Assign permissions at the share level for identities to access Azure Files
    -Whether or not identities are allowed access
    -We have our identity, we assign it a role (e.g. Storage File Data SMB Share Reader), and we scope it to a file share
    -You can do this with cloud identities or synced identities, depending on which AD source you are using
    -You can’t configure these permissions for a computer (registered device)
  3. Assign Directory/File Permissions: Mount the share with the account key and configure permissions
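
One way the share-level permission (step 2) could be assigned programmatically, sketched with azure-mgmt-authorization; the scope path, object ID, and resource names are assumptions for illustration:

```python
import uuid
from azure.identity import DefaultAzureCredential
from azure.mgmt.authorization import AuthorizationManagementClient
from azure.mgmt.authorization.models import RoleAssignmentCreateParameters

sub = "<subscription-id>"
client = AuthorizationManagementClient(DefaultAzureCredential(), sub)

# Scope the assignment down to an individual file share inside the storage account
scope = (
    f"/subscriptions/{sub}/resourceGroups/my-rg"
    "/providers/Microsoft.Storage/storageAccounts/mystorageacct"
    "/fileServices/default/shares/team-docs"
)

# Look up the built-in role by name, then assign it to a user/group object ID
role = next(client.role_definitions.list(
    scope, filter="roleName eq 'Storage File Data SMB Share Reader'"
))
client.role_assignments.create(
    scope,
    str(uuid.uuid4()),
    RoleAssignmentCreateParameters(
        role_definition_id=role.id,
        principal_id="<user-or-group-object-id>",
    ),
)
```
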
12
Q

Azure Files Sync

A

Allows organizations to synchronize files between on-premises servers and Azure cloud storage. It helps in centralizing file services in Azure while maintaining local access to files. Useful for organizations with distributed offices or branch offices that need access to the same set of files and data.

-NFS is not supported
-SMB is supported, with it you could have users directly connect to it
-FTP access can be provided to your users or systems out at your remote sites, via the Windows file servers that are synchronizing back to your share
-You need a Windows Server OS to run the synchronization service
-If you create a file with the same name in two synchronized locations at the same time, the File Sync service chooses the first one created to be written to the share; the second is kept as a “conflict” version of the file, saved with the server name included in the file name
-If your users write data directly to the share, it can take up to 24 hrs before the data is synchronized

Cloud Tiering: Helps organizations optimize storage usage and reduce costs by intelligently managing file data across on-premises servers and Azure cloud storage.

“I’ve got one of these sites that maybe doesn’t have that much storage, So just synchronize some of the data, but not all of it. Provide access to it, but don’t actually have it sitting on the file server unless someone goes and requests it “

-The data is still visible to everyone, but it might not reside on that server until it is accessed

There are two ways we can use Cloud Tiering:

-Space Policy: Looks at the space available on our file server and says “I need to keep 100 GB free”; it will only synchronize an amount of data that keeps that free space available
-Date Policy: Looks at the access time of the data and synchronizes based on that; if data hasn’t been accessed in a long time, it won’t cache or synchronize a local copy

Architecture

  1. File Share: SMB share within a GPv2 or Premium FileStorage storage account
  2. Sync Service: Servers register to one only; they can then belong to many Sync Groups
    -We create Sync Groups so we can say, what servers will have access to what shares
    -Sync Groups are bound to one share
  3. Endpoints
    -Cloud Endpoint: Azure Files share
    –One Sync Group can only have one Cloud Endpoint
    -Server Endpoint: Local folder - for servers to synchronize the data, you need to create local endpoints (a folder, volume, or root directory). As long as the server is registered to the Sync Service that contains the Sync Group, you can add an endpoint to that Sync Group and the sync will take place
    –Servers can belong to multiple Sync Groups
    –One server can only be registered to one Sync Service
    –You can’t have multiple server endpoints that point to the same server for the same Sync Group
13
Q

Azure Storage Data Transfer Tools

A

Import/Export
Purpose: Move large volumes of data

  1. Customer Disk(s): Support one or more physical disks (2.5” or 3.5” SATA HDD or SSD)
  2. Supported Services: Supports Blobs (import/export) and Files (import only)
  3. Process: Disks are managed using a Windows tool (WAImportExport). Manage the job through the Portal
    -You are still sending physical disks to Microsoft
    -You specify what service you’re working with, where it’s geographically based, what region and what storage account you are using
    -If you are importing, you are going to provide the Journal File to Microsoft

Data Box
Purpose: Move large volumes of data

  1. Data Box: Data Box/Disk/Heavy (offline) and Gateway (online) appliances
  2. Supported Services: Blobs (block/page), Managed Disks, Azure Files, ADLS Gen2
  3. Process: Order the device (for import/export), connect and use it locally, then return it
    -It supports NFS and SMB

AzCopy
Purpose: Manage data across different platforms

  1. AzCopy Tool: Cross-platform (Windows/Mac/Linux) command line tool
  2. Supported Services: Blobs and Files (was also Tables, until Cosmos DB team took over)
  3. Process: Authenticate with azcopy then upload/download blobs/files
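
A small sketch of driving AzCopy from Python via subprocess (AzCopy must be installed separately); the local folder, account, container, and SAS token are placeholders:

```python
import subprocess

source = "./backups"  # placeholder local folder
destination = "https://mystorageacct.blob.core.windows.net/backups?<sas-token>"

# "azcopy copy <source> <destination> --recursive" uploads a local folder to Blob storage;
# here the destination URL carries a SAS token instead of using "azcopy login"
subprocess.run(
    ["azcopy", "copy", source, destination, "--recursive"],
    check=True,
)
```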