Implement and manage storage Flashcards

1
Q

How many subscriptions can a storage account belong to?

A

1 and only 1

2
Q

What are the requirements for a storage account name?

A

Globally unique, 3-24 characters long, lowercase letters and numbers only.

3
Q

What is the role of regions with storage accounts?

A

A storage account must be assigned to a single region. The consumer and the region should be as close as possible geographically to maximise performance. Regions are subject to local legal requirements, so European companies should choose an EU region to minimise the impact of GDPR legal restrictions.

4
Q

What is redundancy when referring to storage?

A

Redundancy refers to the duplication of data in multiple locations to ensure that data is not lost in case of a problem within the primary data center.

5
Q

What is LRS?

A

Locally redundant storage: data is duplicated (three copies) within the same data centre.

6
Q

What is GRS?

A

Geo-redundant storage: data is also replicated to a secondary, geographically distant region.

7
Q

What is ZRS?

A

Zone-redundant storage: data is duplicated across other availability zones of the same region.

8
Q

What is GZRS?

A

Geo-zone-redundant storage: combines zone and region redundancy. The safest option and recommended for all critical-data scenarios.

9
Q

How does pricing work when it comes to storage?

A

Premium (SSD) or Standard (magnetic); access tiers hot -> cool -> archive.

Pricing changes depending on usage, with different prices for 0-50 TB per month, 50-500 TB, and over 500 TB per month.
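The graduated bands above can be sketched as a small calculator. The per-TB rates here are hypothetical placeholders, not real Azure prices:

```python
# Sketch of tiered storage pricing with HYPOTHETICAL per-TB rates --
# real rates vary by region, access tier and redundancy choice.
def monthly_storage_cost(tb_stored: float) -> float:
    """Apply graduated rates to the 0-50 TB, 50-500 TB and 500+ TB bands."""
    # (band ceiling in TB, price per TB) -- illustrative numbers only
    bands = [(50, 20.0), (500, 19.0), (float("inf"), 18.0)]
    cost, prev_ceiling = 0.0, 0.0
    for ceiling, rate in bands:
        if tb_stored <= prev_ceiling:
            break
        billable = min(tb_stored, ceiling) - prev_ceiling
        cost += billable * rate
        prev_ceiling = ceiling
    return cost

print(monthly_storage_cost(60))  # 50*20 + 10*19 = 1190.0
```

Each band only charges for the terabytes that fall inside it, which is how graduated usage pricing normally works.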

10
Q

What does the checkbox ‘Make read access to data available in the event of regional unavailability’ do when creating a storage account?

A

It gives the user a read-only URL (a secondary endpoint) which can be used to read, and only read, data from one of the redundant data stores.

11
Q

What is the difference between an unmanaged storage account and a managed one?

A

Managed disks are managed by Microsoft Azure; you don’t need a storage account when creating a new disk. Since the storage account is managed by Azure, you do not have full control of the disks that are created.
Unmanaged disks require you to create a storage account before you create any new disk. Since the storage account is created and owned by you, you have full control over all the data on it; you also need to take care of encryption, data-recovery plans, etc.

12
Q

Why are storage account keys called claims-based security?

A

Because possession of the key is itself the claim: anyone holding a valid key is granted access.

13
Q

What is enable hierarchical namespace in the advanced tab when creating a new storage account and why would you activate/disable it?

A

To use Data Lake Storage Gen2 capabilities, you must create a storage account that has a hierarchical namespace.

14
Q

What is the default maximum size of an Azure file share?

A

5 TB. If you want larger shares (up to 100 TiB), you need to select ‘Enable large file shares’ in the advanced tab of the storage account creation process.

15
Q

What does it mean that public access is enabled from all networks?

A

The network has no limitations on where traffic can come from or go to; however, authentication and authorisation are still required to access the data. You would still need to authenticate with Azure Active Directory or use access keys. The door is there, but it is locked.

16
Q

What is a private endpoint/ private link?

A

Private endpoints provide direct network links between Azure resources. If public access from all networks is disabled, you can either use a network firewall or connect resources using private endpoints. Private endpoints are considered the most restricted, and therefore most secure, way of networking between Azure resources.

17
Q

What is network routing?

A

Where does your network data travel: over the public internet, or over the Microsoft network, avoiding the public internet? Internet routing generally carries more risk of data leaks, as the public internet is not under Microsoft’s control (therefore the Microsoft network is the more secure option).

18
Q

When creating a storage account, what is the data protection tab’s soft delete option?

A

Deleting marks a file for deletion rather than actually deleting it. The default number of days to wait before the marked files are actually deleted is 7, and there is an independent setting for blobs, containers and file shares.

19
Q

When creating a storage account, what is the data protection tab’s tracking section?

A

Enables version control of the data on the storage account. This does duplicate data and will incur an increased storage cost but you will be able to restore previous versions of your data.

20
Q

When creating a storage account, what is the data protection tab’s enable change feed in the tracking section?

A

It’s an ordered log of changes to the blob data over time.

21
Q

When creating a storage account, what is the data protection tab’s access control section?

A

It offers an ‘enable version-level immutability support’ option to lock your files in place (write once, read many). A good usage example would be log files, where you want to ensure that no one is tampering with the files.

22
Q

What are the two main encryption types as seen on the encryption tab of the create storage account options/ parameters?

A

Microsoft-managed keys (MMK) and customer-managed keys (CMK). When MMK is selected there are no other parameters to choose. If CMK is selected (perhaps due to company policy), you can either set up a new key vault or point to an existing one.

You can choose the scope of encryption to cover just the stored data (blobs and files) or everything, including queues and tables as well as blobs and files.

23
Q

What is infrastructure encryption ( as seen in the encryption tab of the create storage account wizard)?

A

Infrastructure encryption is hardware-level encryption of the data as a secondary layer. The data needs to be decrypted twice before it is usable, putting a second barrier in front of an attacker.

24
Q

How can you access the JSON representation of your storage account?

A

Via the download a template link on the review tab of the create storage account wizard.

25
Q

How long does it take to create a storage account after clicking the Create button on the final review tab of the create storage account wizard?

A

Around 30 seconds. You can check the duration time in the deployments section of the blade of the newly created storage account resource.

26
Q

In the data storage section of a storage account portal ui, there are four types of data storage. What are they and what is the primary unique purpose for each of them?

A

Containers (aka blob storage): an area in which to put files or any data. There is no native hierarchical structure like a directory tree in containers, but some people use naming conventions to mimic a directory-like structure. Each file in the container has a public URL, but if you do not have access to the URL with access keys or SAS tokens you will receive a 404 ‘does not exist’ response.

File shares: standard, traditional file shares, but hosted on Azure, which can be mounted on a computer. They use the most common file-share protocol (SMB). SMB uses port 445, which is blocked for the most part on the public internet, so these are best used on corporate and other networks designed for file shares.

Queues: shared storage for a messaging queue between applications.

Tables: store data in a semi-structured manner. Once a table is created you can browse and edit the table data from the storage browser link in the left-hand blade. It’s considered semi-structured because you can add columns to rows after already creating N rows in the table. Table storage does not offer the relational mechanisms of a traditional SQL database and is more similar to a hosted Excel reference table.
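The ‘semi-structured’ nature of table storage can be sketched with plain dicts standing in for table entities (the real SDK is azure-data-tables; the entity values here are made up):

```python
# Semi-structured rows: entities in one table can carry different columns.
# Plain dicts stand in for table entities; every entity still needs its
# PartitionKey and RowKey, but other properties are free-form.
table = []
table.append({"PartitionKey": "products", "RowKey": "001", "Name": "Widget"})
# A later row adds a column the first row never had:
table.append({"PartitionKey": "products", "RowKey": "002",
              "Name": "Gadget", "Colour": "red"})

# The union of all property names across rows -- no fixed schema exists.
columns = set().union(*(row.keys() for row in table))
print(sorted(columns))  # ['Colour', 'Name', 'PartitionKey', 'RowKey']
```

Contrast this with a SQL table, where adding the `Colour` column would require an explicit schema change affecting every row.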

27
Q

Why are there 2 access keys for every storage account?

A

Both keys are valid, and anyone with a key (and network access) can access your data. You can add rotation reminders to update keys as a best practice; while you are regenerating one key, your services can use the second key to avoid any service downtime.

28
Q

What is a shared access signature ( concretely in the context of a storage account)?

A

Generally it is a signed token describing a set of admin-defined permissions, for a specific duration, that anyone holding the SAS token can use. You can create one in the shared access tokens blade of the storage account portal. The generated SAS token is somewhat readable, but any modification to the human-readable parts of the token will no longer match the token’s signature and will therefore be invalid. Append the token to a file URL, like <url-to-file>?<sas-token>, and you will be able to access a previously inaccessible file, assuming the SAS token has been configured to give you at least read access.

Note that a shared access signature cannot be revoked: if you grant permissions for 2 days then change your mind, the token cannot be recalled. The SAS token is linked to an access key, however, so you can rotate that access key to invalidate the SAS token.
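Why tampering invalidates a SAS token can be sketched with a toy HMAC signature. This is a simplification (the real Azure string-to-sign has a service-defined field order, and the key would be your storage account key), but the principle is the same:

```python
import base64
import hashlib
import hmac

# Toy model of SAS signing -- NOT the real Azure string-to-sign format.
account_key = base64.b64encode(b"not-a-real-key")

def sign(params: str) -> str:
    """HMAC-SHA256 the readable parameters with the account key."""
    digest = hmac.new(account_key, params.encode(), hashlib.sha256).digest()
    return base64.b64encode(digest).decode()

params = "sp=r&st=2024-01-01&se=2024-01-03"   # read-only, 2-day window
token = f"{params}&sig={sign(params)}"

# Upgrading the readable permission field no longer matches the signature:
tampered = token.replace("sp=r", "sp=rw")
claimed_params, claimed_sig = tampered.rsplit("&sig=", 1)
print(hmac.compare_digest(sign(claimed_params), claimed_sig))  # False
```

This also shows why rotating the access key kills every SAS issued under it: re-signing with a new key can never reproduce the old signatures.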

29
Q

What is a stored access policy?

A

It’s like a condition on a shared access signature. Stored access policies can be revoked without rotating the key so it becomes easier to revoke given access without the inconvenience of rotating the associated access key.

They don’t replace the SAS token but they become a prerequisite to their validity.

From the storage account UI you can click the three dots to the right of a specific container and select ‘Access policy’. You can add separate stored access policies for immutable blob storage (versioned) and standard storage access. Click ‘Add policy’ and add the policy’s metadata (name, start and end date, and the read, list, delete etc. permissions to apply). After saving, when you click ‘Generate SAS’ in the three-dot menu of a particular file, you can select the stored access policy as well as a key. Note that the permissions are now defined by the stored access policy, and the permissions drop-down is greyed out.

Now you can invalidate this generated sas token by either deleting or modifying the access policy.

30
Q

How durable is locally redundant storage?

A

Eleven nines: 99.999999999% durability of objects over a given year.
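What eleven nines means in practice can be checked with a quick calculation (the object count is an arbitrary example):

```python
# "Eleven nines" durability: probability an object survives a year.
durability = 0.99999999999          # 99.999999999%
p_loss = 1 - durability             # ~1e-11 per object per year

# Expected annual losses if you stored ten billion objects:
expected_losses = 10_000_000_000 * p_loss
print(round(expected_losses, 2))    # roughly 0.1 objects per year
```

In other words, even at enormous scale you would expect to lose on the order of one object per decade.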

31
Q

How to upgrade to geo redundant storage?

A

In the settings section of the storage account’s menu blade, go to configuration and select replication.

You will see all the options available to the storage account (some regions offer zone-redundant storage using availability zones, but not all).

32
Q

What is the failover configuration when dealing with a storage account?

A

Failover refers to the hierarchy and direction of replacement data sources used when the main data source is not available. In geo-redundant storage, if the source data is unavailable for whatever reason, the failover describes where to recover a copy of the data from among the redundant storage copies.

Under data management, via the geo-replication option in the storage account menu blade (assuming redundancy has been configured and the initial copy of your now-redundant data has completed), you can click the ‘prepare for failover’ button to begin the process of turning your secondary storage into your primary data source. You can also reconfigure the primary and secondary roles.

This will change the storage endpoints, so further updates are likely necessary to get the new data source working in your infrastructure or application.

33
Q

What are access tiers in the context of a storage account?

A

Options which affect the performance and pricing you can expect from your storage account.

Performance options are standard and premium, referring to the physical storage medium used, i.e. standard is magnetic hard disk drives and premium is SSD.

The storage-type options change depending on your performance choice: block blobs, file shares and page blobs are available for premium, while table storage and queue storage are also available for standard.

Premium storage is SSD based and so latency is lower.

Access tiers represent the price structure and storage usage for a storage account.
The access tier can be hot, cool, cold or archive. Hot is for day-to-day usage: storage costs more, but access is cheap. Irregularly accessed files should be set to cool: storage is cheaper but access costs more, so rarely read data is economical in the cool tier.

In container storage there are further access tiers called cold and archive. Archive is the cheapest, but archived data is offline: its retrieval requires rehydration from archive into another access tier like cool, which may take several hours and incurs a premium.

34
Q

Storage accounts have a default backup policy. What is it?

A

Daily at 7:30am retaining the backup for 30 days.

A backup can be triggered manually at any time from the backup link on the operations menu option in a storage portal eg in a file share within a storage account.

35
Q

What is a snapshot within a storage account data store?

A

It is a point-in-time copy of the data within the data store, which can be retrieved later from the three-dot menu on the target file within the snapshot overview. Snapshots are retained permanently, i.e. until deleted. Each snapshot file has a modified URL which includes the name of the snapshot, so you can access previous versions of your data store’s files via the snapshots and a consistent URL naming convention: <share-url>?sharesnapshot=<snapshot-name>.
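The snapshot URL convention above can be sketched as simple string assembly; the account, share and snapshot names here are hypothetical (Azure assigns the real snapshot identifier, a timestamp, at creation):

```python
# Building a snapshot URL per the <share-url>?sharesnapshot=<name> convention.
# Names are hypothetical placeholders.
share_url = "https://myaccount.file.core.windows.net/myshare"
snapshot = "2024-01-01T00:00:00.0000000Z"

snapshot_url = f"{share_url}?sharesnapshot={snapshot}"
print(snapshot_url)
```

The base share URL stays constant, so any file path under the share can be pointed at a past version just by appending the query parameter.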

36
Q

How can you enable versioning on your data store after the data store creation process?

A

Via the data protection link on the data management section of the storage account portal ui.

37
Q

What are the versioning options for blob storage?

A
  • Enable versioning for blobs

If checked then:

  • keep all versions
  • delete versions after N days
38
Q

What happens if you delete a versioned file in blob storage?

A

The file is eventually deleted but the previous versions are still available for retrieval.

39
Q

Can you restore previous versions of versioned files in blob storage?

A

Of course. Navigate to the previous version, select it and click the button make current version.

40
Q

How to configure lifecycle management for blob storage?

A

From the lifecycle management link in the storage account left menu blade you can use the interface to add simple rules e.g. after N days move from access tier X to access tier Y.
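Under the hood, a rule like "after N days move from tier X to tier Y" is stored as a JSON management policy. This is a minimal sketch following the schema as I understand it; verify the field names against current Azure documentation before relying on them:

```python
import json

# A minimal lifecycle rule: move block blobs to the cool tier 30 days
# after last modification. Field names follow the Azure management-policy
# format as best I recall -- double-check against current Azure docs.
policy = {
    "rules": [{
        "name": "move-to-cool",
        "enabled": True,
        "type": "Lifecycle",
        "definition": {
            "filters": {"blobTypes": ["blockBlob"]},
            "actions": {
                "baseBlob": {
                    "tierToCool": {"daysAfterModificationGreaterThan": 30}
                }
            }
        }
    }]
}
print(json.dumps(policy, indent=2))
```

The portal interface described above is essentially a form editor over this document; you can also paste JSON like this directly in the lifecycle management blade’s code view.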

41
Q

How can you monitor the usage of a storage account or data store?

A

From the monitoring tab of the storage account, using the filters to drill down between individual data stores or the whole storage account.

You can also create alerts, metrics, diagnostics and workbooks to display in your dashboard and send to log analytics. Note that diagnostics requires that you create a paid storage space to house the diagnostic data.

42
Q

What do ingress and egress refer to?

A

Data writes are ingress and data reads are egress.

43
Q

How can you write custom queries on the data produced by your storage accounts?

A

Using the Kusto Query Language (KQL) in Azure Monitor, selecting the appropriate target storage accounts or data stores.


46
Q

What is the purpose of the storage explorer link in the blade of a storage account?

A

It’s for moving data in and out of storage with a familiar, web-based UI.

You can use the Azure Storage Explorer application (like Windows Explorer, but with the functionality to connect to multiple storage accounts).

You can use utilities like azcopy.

For bulk operations you can search for ‘Import/export jobs’ in the Azure services. This is designed to import or export large amounts of data in bulk, with a familiar style of UI in the Azure portal, to request a physical Data Box which is delivered to your location and then returned to an Azure data centre, where the data you put on it is uploaded. There are some prerequisites to using this service in terms of your subscription. The options are Data Box Disk for 35 TB, Data Box for 80 TB, Data Box Heavy for 800 TB (filing-cabinet sized), and a regular import/export job of up to 1 TB. Each variant of the service has some unique advantages apart from the total data-size limit, such as the disk connectivity (USB/SATA, for example) and the total number of storage accounts the data can be uploaded to.

The whole process also works in reverse: data currently stored on Azure can be brought back to your premises, either on your own disks or on Azure disks to be returned to Microsoft after you retrieve the data from them.


47
Q

What’s the purpose of the WAImportExport tool?

A

It is used to prepare data before transit for upload to Azure: it encrypts the data and creates .jrn journal files describing what data is on the drives.

48
Q

What is azcopy?

A

It’s a small executable used to interact with Azure Storage via command-line commands. It is optimised, has a full suite of features, and is the most performant way to pull data from and push data to Azure storage accounts.

49
Q

How to access the help of azcopy?

A

azcopy --help (or azcopy -h)

50
Q

Are access keys a supported method of authentication when using azcopy?

A

No, only shared access signatures (SAS) are supported.

51
Q

What is the syntax to use azcopy to copy local data to an azure storage account or vice versa using sas tokens?

A

The general form with SAS tokens is: azcopy copy '<local-path>' 'https://<account>.blob.core.windows.net/<container>?<sas-token>' (add --recursive for directories); swap the source and destination arguments to copy in the other direction.

Note that azcopy login is not needed, because the authentication and level of permission are part of the SAS token.

52
Q

What is the storage browser?

A

It’s a web UI within a storage account’s portal. You can browse the contents of the file shares, containers, tables, etc. In the case of table storage you can even manually add an entry from the storage browser.

53
Q

What is storage explorer?

A

It’s an application that can connect to your Azure storage accounts and gives a file-explorer-type experience with your cloud-stored data.

54
Q

What is object replication?

A

This is how you can automatically copy blobs to another data store or storage account. This can be powerful when combined with lifecycle management to control how long data is kept.

You can create replication rules to conditionally replicate objects ( this will enable blob change feed and blob versioning changing the costs incurred).

An example could be to replicate newly added objects to a backup container. Note this is an asynchronous process with no guaranteed time to finish the replication process.

55
Q

What happens if a file share with a maximum size of 200 GB reaches capacity inside a storage account with a maximum size of 5 petabytes?

A

The data store’s own maximum takes priority over the parent storage account’s maximum. The sum of all the data stores in a storage account cannot exceed the storage account’s maximum size, in this case 5 petabytes.

56
Q

Is an Azure file share a real directory structure?

A

Yes, it is a real hierarchical file store; it is not using technical tricks to appear as one.

57
Q

What is the purpose of azure file sync?

A

Use a local file share and have it synced to an Azure file share, for backup or for wider distribution.

58
Q

How to set up azure file sync service?

A

From the Azure portal, configure the name, subscription, tags, networking options and region; you will then be prompted to download a file sync agent to register your local file share with your Azure File Sync service.

Once the file share is registered, from the portal you can configure different sync groups mapping different directories of the local file share to different Azure file shares.

The sync can also work the other way, from an Azure file share to a local one.

59
Q

What is the difference between block blobs and page blobs?

A

Block blobs: the blob is divided into blocks of a given size, e.g. 32 MB. This is fast and performant for sequential uploads and streaming, and premium block blobs are essentially SSD storage.

Page blobs are more appropriate for data accessed or modified in a random way (like cached data; Azure VM disks are page blobs). The page is the unit of storage, and page blobs are designed for performant random read and write processing: they are divided into pages of contiguous data.

No relation to the pages referred to in certain HDFS processing when using distributed Hadoop storage.

60
Q

Can you have premium storage with geo-redundant storage?

A

No; the option is removed from the redundancy options when premium storage has been selected. The reason is probably that the longer replication times would not meet the commonly understood and accepted level of performance for a product labelled and priced as premium.

61
Q

What type of blobs can be used as a data lake?

A

Only block blobs.