Application Engineering Flashcards by Blake Wills

What are the components of a database server?

CPU
Memory - inserts and updates are written here first
Persistent Disk
Journal - log of every write the database performs, used to recover after a crash

Database writes to memory first and then writes to journal and disk. There is a lag between the write to memory and the write to disk.

How could data be lost when updating or inserting even though Mongo returns a message saying the insert/update succeeded?

The server can crash after the data is written to memory but before it is written to disk. It’s up to the developers to configure Mongo to wait for persistence to disk before it returns a successful save message.
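
The failure window described above can be illustrated with a toy model. This is only a sketch in plain Python: `ToyServer` and its methods are hypothetical names made up for illustration, not MongoDB internals.

```python
class ToyServer:
    """Toy model: writes land in memory first and reach disk only on flush."""

    def __init__(self):
        self.memory = {}  # volatile, lost on a crash
        self.disk = {}    # persistent, survives a crash

    def write(self, key, value, wait_for_disk=False):
        self.memory[key] = value
        if wait_for_disk:
            self.flush()
        return "ok"  # acknowledged either way

    def flush(self):
        # Copy everything currently in memory to disk.
        self.disk.update(self.memory)

    def crash_and_restart(self):
        # Memory is wiped; only what reached disk is recovered.
        self.memory = dict(self.disk)


server = ToyServer()
server.write("a", 1, wait_for_disk=True)
server.write("b", 2)       # acknowledged, but only in memory
server.crash_and_restart()
print(server.memory)       # {'a': 1}
```

Both writes returned "ok", yet only the one that waited for the disk survived the crash.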

What is the default write concern for MongoDB?

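w = 1, j = false –> The write is acknowledged once it has been applied in the primary's memory; there is no wait for the journal to sync to disk. This was the default in older MongoDB versions; since MongoDB 5.0 the implicit default is w = majority.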

How do you set a wait for the write to disk in Mongo?

w = 1, j = true –> The write is acknowledged only after it has been written to the journal on disk. This is much slower than the w = 1, j = false option, but less risky.

Provided you assume that the disk is persistent, what are the w and j settings required to guarantee that an insert or update has been written all the way to disk?

w=1, j=1
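
One place these settings can be supplied is the connection string, through the `w` and `journal` options. The sketch below only builds such a string; the `build_uri` helper and the `localhost:27017` host are assumptions for illustration.

```python
from urllib.parse import urlencode


def build_uri(host, w=1, journal=False):
    """Build a MongoDB connection string carrying write concern options."""
    options = urlencode({"w": w, "journal": str(journal).lower()})
    return f"mongodb://{host}/?{options}"


# Hypothetical host; w=1 with journaling corresponds to the w=1, j=1 answer.
print(build_uri("localhost:27017", w=1, journal=True))
# mongodb://localhost:27017/?w=1&journal=true
```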

What typically happens when you don’t receive an affirmative response?

A network error. The error does not prove the write failed: it may have been applied even though no acknowledgement arrived.

What are the reasons why an application may receive an error back even if the write was successful?

The network connection between the application and the server was reset after the server received the write but before the response was sent
MongoDB can also terminate between receiving the write and writing it to disk
MongoDB can fail between the time of the write and the time the application receives a response.

Using inserts instead of updates is the only way to guard against this: an insert can be retried safely, because a duplicate key error on the retry shows that the first attempt succeeded.
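
A minimal sketch of why a retried insert is safe, assuming the document carries an `_id` chosen by the application. `ToyCollection` and `DuplicateKey` are hypothetical stand-ins for a real collection and a driver's duplicate key error.

```python
class DuplicateKey(Exception):
    """Hypothetical stand-in for a driver's duplicate key error."""


class ToyCollection:
    """In-memory collection that enforces a unique _id."""

    def __init__(self):
        self.docs = {}

    def insert_one(self, doc):
        if doc["_id"] in self.docs:
            raise DuplicateKey(doc["_id"])
        self.docs[doc["_id"]] = doc


def insert_once(collection, doc):
    """Insert doc; a duplicate key on retry means an earlier attempt landed."""
    try:
        collection.insert_one(doc)
        return "inserted"
    except DuplicateKey:
        return "already there"


coll = ToyCollection()
doc = {"_id": "order-42", "total": 10}
print(insert_once(coll, doc))  # inserted
print(insert_once(coll, doc))  # already there
```

An update offers no such signal, which is why a retried update may be applied twice.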

How do we get availability and fault tolerance in MongoDB?

Using replication across multiple Mongo instances. Data written to the primary node is replicated asynchronously to the secondary nodes.

What is the minimum original number of nodes needed to assure the election of a new Primary if a node goes down?

3 –> When the primary goes down, the two remaining nodes form a majority of the original set and can elect a new primary, which then takes the writes. The old primary rejoins as a secondary when it comes back.
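
The arithmetic behind the answer can be sketched as follows, assuming every node has one vote and an election needs a strict majority of the original set. The function names are made up for illustration.

```python
def majority(total_nodes):
    """Smallest number of votes that is more than half of the set."""
    return total_nodes // 2 + 1


def can_elect(total_nodes, nodes_up):
    """True if the surviving nodes can still elect a primary."""
    return nodes_up >= majority(total_nodes)


print(can_elect(3, 2))  # True: 2 of 3 is a majority
print(can_elect(2, 1))  # False: 1 of 2 is not a majority
```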

What are the different types of replica set nodes?

Regular - has data; can be primary or secondary
Arbiter - only there for voting purposes in elections
Delayed/Regular - disaster recovery node that lags behind the primary (priority = 0, so it can never become primary; it can still vote unless its votes are also set to 0)
Hidden - can’t be a primary but can participate in elections

Which types of nodes can participate in elections of a new primary?

Regular
Hidden
Arbiters
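
As a rough illustration, the node types map onto per-member fields of a replica set configuration (`priority`, `hidden`, `arbiterOnly`, `votes`). The sketch below assumes hypothetical host names and a delayed member configured with `votes` set to 0; `secondaryDelaySecs` is the newer name of the field older versions call `slaveDelay`.

```python
# Hypothetical replica set configuration, written as a Python dict.
config = {
    "_id": "rs0",
    "members": [
        {"_id": 0, "host": "db0.example.net:27017"},                  # regular
        {"_id": 1, "host": "db1.example.net:27017",
         "priority": 0, "hidden": True},                              # hidden
        {"_id": 2, "host": "db2.example.net:27017",
         "arbiterOnly": True},                                        # arbiter
        {"_id": 3, "host": "db3.example.net:27017", "priority": 0,
         "hidden": True, "votes": 0, "secondaryDelaySecs": 3600},     # delayed
    ],
}


def voters(cfg):
    """Members that take part in elections (votes defaults to 1)."""
    return [m["_id"] for m in cfg["members"] if m.get("votes", 1) > 0]


def electable(cfg):
    """Members that can become primary (priority defaults to 1)."""
    return [m["_id"] for m in cfg["members"]
            if m.get("priority", 1) > 0 and not m.get("arbiterOnly", False)]


print(voters(config))     # [0, 1, 2]
print(electable(config))  # [0]
```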

Explain Write Consistency

Writes go only to the primary, but reads don’t have to go to the primary.
If reads go somewhere else (a secondary), you may read stale data.
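
A toy model of a stale read, assuming a single secondary that applies the primary's writes only when `replicate()` is called. `ToyReplicaSet` is a hypothetical name, not a driver class.

```python
class ToyReplicaSet:
    """One primary, one secondary, replication applied on demand."""

    def __init__(self):
        self.primary = {}
        self.secondary = {}
        self.pending = []  # writes not yet copied to the secondary

    def write(self, key, value):
        # All writes go to the primary.
        self.primary[key] = value
        self.pending.append((key, value))

    def replicate(self):
        # The secondary catches up with everything written so far.
        for key, value in self.pending:
            self.secondary[key] = value
        self.pending.clear()

    def read(self, key, from_secondary=False):
        node = self.secondary if from_secondary else self.primary
        return node.get(key)


rs = ToyReplicaSet()
rs.write("x", 1)
rs.replicate()
rs.write("x", 2)                          # not replicated yet
print(rs.read("x"))                       # 2
print(rs.read("x", from_secondary=True))  # 1 (stale)
```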

During the time when failover is occurring, can writes successfully complete?

What is the oplog?

It is a capped collection in the local database of a mongo instance (you can find it using ‘show collections’ after switching to local) that follows the activity of the primary. The primary records its writes in its own oplog, and the secondaries then copy and apply those entries asynchronously.

It doesn’t matter which storage engine (for example WiredTiger) or which server version each node runs; the oplog works the same way across them.

What does rs.slaveOk() do?

It allows queries to be run against a secondary from the shell; by default secondaries reject reads.

T or F –> A copy of the oplog is kept on both the primary and secondary servers

T or F –> Replication supports mixed-mode storage engines, for example MMAPv1 and WiredTiger

What happens if a node comes back up as a secondary after a period of being offline and the oplog has looped on the primary?

The entire dataset will be copied from the primary
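
The rollover case can be sketched with a fixed-size queue standing in for the capped oplog. This is a toy model: `ToyOplog` and the integer timestamps are assumptions for illustration, not the real oplog format.

```python
from collections import deque


class ToyOplog:
    """Capped log: once full, the oldest entries are overwritten."""

    def __init__(self, capacity):
        self.entries = deque(maxlen=capacity)
        self.next_ts = 1

    def record(self, op):
        self.entries.append((self.next_ts, op))
        self.next_ts += 1

    def entries_after(self, last_ts):
        """Return ops newer than last_ts, or None if a full resync is needed."""
        oldest_ts = self.entries[0][0] if self.entries else self.next_ts
        if last_ts + 1 < oldest_ts:
            return None  # the oplog has looped past this secondary
        return [op for ts, op in self.entries if ts > last_ts]


oplog = ToyOplog(capacity=3)
for i in range(1, 6):
    oplog.record(f"op{i}")

print(oplog.entries_after(3))  # ['op4', 'op5']
print(oplog.entries_after(1))  # None: copy the entire dataset instead
```

A secondary that last applied entry 3 can catch up from the log; one that stopped at entry 1 finds that entry 2 has already been overwritten.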

T or F - There is a scenario in which a write performed with w = majority gets rolled back

T - this can happen when a failover occurs after the write is applied on the primary but before the oplog entry is replicated to the secondaries. The client sees an exception rather than an acknowledgement.

If you leave a replica set node out of the seedlist within the driver, what will happen?

It will be discovered anyway, as long as you list at least one valid node

What kind of exception occurs when a primary fails and an election occurs during an insert?

AutoReconnect exception

What will happen if the following statement is executed in Python during a primary election?

db.test.insert_one({'x': 1})

How is failover detected?

Wrap writes in try/except blocks to catch the exceptions raised during a failover. Catching the exception alone does not guarantee that all data gets written, so retry the operation, for example up to 3 times or more.
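
A minimal sketch of such a retry loop. The `AutoReconnect` class here is a local stand-in for the exception PyMongo raises (`pymongo.errors.AutoReconnect`), and `with_retries` and `flaky_insert` are hypothetical helpers, so the example runs without a database.

```python
import time


class AutoReconnect(Exception):
    """Local stand-in for pymongo.errors.AutoReconnect."""


def with_retries(operation, attempts=3, delay=0.0):
    """Call operation, retrying on AutoReconnect up to `attempts` times."""
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except AutoReconnect:
            if attempt == attempts:
                raise  # give up after the last attempt
            time.sleep(delay)  # give the election time to finish


calls = {"n": 0}


def flaky_insert():
    # Fails twice, as if an election were in progress, then succeeds.
    calls["n"] += 1
    if calls["n"] < 3:
        raise AutoReconnect("primary stepped down")
    return "ok"


print(with_retries(flaky_insert))  # ok
```

Retrying is only safe when the operation can be repeated without harm, such as an insert with a known `_id`.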

Why don’t you need to handle a duplicate key exception for reads?

Duplicate key errors cannot occur on reads; only a write that violates a unique index can raise one

Give examples of idempotent updates and non-idempotent updates:

Idempotent --> $set
Non-idempotent --> $inc, $push --> these are riskier to run again because they could be applied twice (for example, a double increment) - not good for incrementing sensitive data like money but probably not a big deal for
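
The difference shows up when the same operation is applied twice, as a retry would do. The two helpers below are plain-Python imitations of `$set` and `$inc` on a dict, written only for illustration.

```python
def apply_set(doc, field, value):
    """Imitates $set: the result does not depend on how often it runs."""
    doc[field] = value


def apply_inc(doc, field, amount):
    """Imitates $inc: every run changes the result again."""
    doc[field] = doc.get(field, 0) + amount


a = {"balance": 100}
apply_set(a, "balance", 110)
apply_set(a, "balance", 110)  # retry
print(a["balance"])           # 110

b = {"balance": 100}
apply_inc(b, "balance", 10)
apply_inc(b, "balance", 10)   # retry
print(b["balance"])           # 120, one increment too many
```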

How do you handle non-idempotent updates?

- Convert them to idempotent updates
- Don't retry the update, and accept that it may not have been applied

If you want to be sure that an update with a $inc occurred exactly once in the face of failover, what's the best way to do it?

Transform the update into a statement that is idempotent. However, this risks losing one update in a multithreaded program.
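
One way to read this answer is: read the current value, compute the target, and then set that exact value. The sketch below assumes that approach on a plain dict (`inc_as_set` is a hypothetical helper) and shows both the safe retry and the lost update.

```python
def inc_as_set(doc, field, amount):
    """Read the current value now and return an idempotent 'set' operation."""
    target = doc.get(field, 0) + amount

    def apply():
        doc[field] = target

    return apply


# Retrying the transformed update is harmless.
counter = {"n": 0}
op = inc_as_set(counter, "n", 1)
op()
op()                 # retry after a suspected failure
print(counter["n"])  # 1

# Two workers that both read before either writes lose one update.
shared = {"n": 0}
op_a = inc_as_set(shared, "n", 1)
op_b = inc_as_set(shared, "n", 1)
op_a()
op_b()
print(shared["n"])   # 1, not 2
```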

Application Engineering Flashcards

(27 cards)