Application Engineering Flashcards
(27 cards)
What are the components of a database server?
- CPU
- Memory - inserts and updates are written here first
- Persistent Disk
- Journal - log of every write operation the database performs
Database writes to memory first and then writes to journal and disk. There is a lag between the write to memory and the write to disk.
How could data be lost during an update or insert even though Mongo returns a message saying the insert/update succeeded?
The server can crash after the data is written to memory but before it is written to disk. It’s up to the developers to configure Mongo to wait for persistence to disk before returning a success message.
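The crash window described above can be illustrated with a toy model. This is only a sketch in plain Python, not MongoDB code: `ToyServer`, `write`, `crash_and_recover`, and `survives_crash` are hypothetical names, and the journal is reduced to a list that survives a simulated crash.

```python
class ToyServer:
    """Toy model: writes land in memory first, the journal is the durable copy."""

    def __init__(self):
        self.memory = []   # volatile, lost on crash
        self.journal = []  # durable, survives a crash

    def write(self, doc, j=False):
        self.memory.append(doc)
        if j:
            # j=True: sync everything in memory to the journal before acknowledging
            self.journal = list(self.memory)
        return {"ok": 1}  # the client sees success in both cases

    def crash_and_recover(self):
        # After a crash, only journaled writes can be replayed
        self.memory = list(self.journal)


def survives_crash(j):
    """Write one document, crash right after the acknowledgement, check the data."""
    server = ToyServer()
    ack = server.write({"_id": 1}, j=j)
    assert ack == {"ok": 1}
    server.crash_and_recover()
    return {"_id": 1} in server.memory


print(survives_crash(j=False))  # False: acknowledged but lost
print(survives_crash(j=True))   # True: acknowledged after journaling
```

In both runs the client received a success message; only the journaled write is still there after recovery.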
What is the default write concern for MongoDB?
w=1, j=false –> The write is acknowledged once it reaches the database’s memory; there is no wait for the journal to sync. (Since MongoDB 5.0 the implicit default is w=majority.)
How do you set a wait for the write to disk in Mongo?
w=1, j=true –> The save waits for the journal to be written to disk, which is much slower than the w=1, j=false option but less risky.
Provided you assume that the disk is persistent, what are the w and j settings required to guarantee that an insert or update has been written all the way to disk?
w=1, j=1
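One place these settings can be supplied is the connection string, where the write concern options are named `w` and `journal`. The helper below is a minimal sketch using only the standard library; `build_uri` and the host `localhost:27017` are placeholders chosen for illustration.

```python
from urllib.parse import urlencode


def build_uri(host, w=1, journal=False):
    """Build a MongoDB connection string that carries write concern options."""
    options = urlencode({"w": w, "journal": str(journal).lower()})
    return f"mongodb://{host}/?{options}"


fast_uri = build_uri("localhost:27017")                        # w=1, j=false
durable_uri = build_uri("localhost:27017", w=1, journal=True)  # w=1, j=true
print(durable_uri)  # mongodb://localhost:27017/?w=1&journal=true
```

A driver such as PyMongo accepts a URI of this form when creating its client, so every write on that connection waits for the journal.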
What typically happens when you don’t receive an affirmative response?
A network error; the error itself can be a false positive, since the write may still have succeeded
What are the reasons why an application may receive an error back even if the write was successful?
The network connection between the application and the server was reset after the server received the write but before the response was sent
MongoDB can terminate after applying the write but before responding to it
The network can fail between the time of the write and the time the application receives a response
Using inserts instead of updates is the only way to guard against this: an insert with an explicit _id can be retried safely, because a duplicate key error on the retry means the first attempt succeeded
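Why an insert is safe to retry can be sketched with a toy collection that applies the write and then "loses" the response. Everything here is hypothetical stand-in code (`ToyCollection`, `LostResponse`, `DuplicateKey`, `safe_insert`), not a real driver API; a real driver would raise its own network and duplicate key exceptions.

```python
class DuplicateKey(Exception):
    """Stand-in for a driver's duplicate key error."""


class LostResponse(Exception):
    """Stand-in for a network error raised after the server applied the write."""


class ToyCollection:
    def __init__(self):
        self.docs = {}
        self.drop_next_response = True  # simulate one lost acknowledgement

    def insert_one(self, doc):
        if doc["_id"] in self.docs:
            raise DuplicateKey(doc["_id"])
        self.docs[doc["_id"]] = doc
        if self.drop_next_response:
            self.drop_next_response = False
            raise LostResponse("write applied, response lost")


def safe_insert(coll, doc):
    """Insert with an explicit _id; retry once if the outcome is unknown."""
    try:
        coll.insert_one(doc)
    except LostResponse:
        try:
            coll.insert_one(doc)  # same _id, so a repeat cannot create a second copy
        except DuplicateKey:
            pass  # the first attempt had already succeeded


coll = ToyCollection()
safe_insert(coll, {"_id": 42, "x": 1})
print(len(coll.docs))  # 1
```

Replaying an update such as an increment in the same situation would apply the change twice, which is why the insert form is preferred.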
How do we get availability and fault tolerance in MongoDB?
Use replication across multiple Mongo instances. Data written to the primary node is replicated asynchronously to the secondary nodes
What is the minimum original number of nodes needed to assure the election of a new Primary if a node goes down?
3 –> An election needs a majority of the original set, so when the primary goes down the two remaining nodes can still elect a new primary, and writes then go there. If the old primary comes back, it rejoins as a secondary.
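The arithmetic behind the answer can be checked with a short sketch. The function names are made up for illustration; the rule assumed is that an election needs a strict majority of the original voting members.

```python
def majority(voting_members):
    """Smallest number of votes that is more than half of the set."""
    return voting_members // 2 + 1


def can_elect_primary(voting_members, members_up):
    """True if the surviving members still form a majority of the original set."""
    return members_up >= majority(voting_members)


print(can_elect_primary(3, 2))  # True: 2 of 3 is a majority
print(can_elect_primary(2, 1))  # False: 1 of 2 is not
```

With only two original nodes, losing one leaves the survivor unable to elect itself, which is why three is the minimum.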
What are the different types of replica set nodes?
- Regular - has data; can be primary or secondary
- Arbiter - only there for voting purposes in elections
- Delayed/Regular - disaster recovery node kept a set interval behind the others (can vote but can’t become primary - priority = 0)
- Hidden - can’t be a primary but can participate in elections
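The node types above map onto fields of a replica set configuration. The sketch below writes such a member list as a plain Python structure; the host names are placeholders, and the helper functions are hypothetical. The field names (`priority`, `votes`, `hidden`, `arbiterOnly`, `secondaryDelaySecs`) follow the replica set configuration document; older server versions call the delay field `slaveDelay`. Priority 0 is what keeps a member from becoming primary, while voting is controlled separately by `votes`.

```python
members = [
    {"_id": 0, "host": "db0.example.net:27017", "priority": 1, "votes": 1},   # regular
    {"_id": 1, "host": "db1.example.net:27017", "priority": 1, "votes": 1},   # regular
    {"_id": 2, "host": "arb.example.net:27017", "arbiterOnly": True,
     "priority": 0, "votes": 1},                                              # arbiter
    {"_id": 3, "host": "dr.example.net:27017", "priority": 0, "hidden": True,
     "secondaryDelaySecs": 3600, "votes": 1},                                 # delayed
    {"_id": 4, "host": "rpt.example.net:27017", "priority": 0, "hidden": True,
     "votes": 1},                                                             # hidden
]


def can_become_primary(member):
    """Only data-bearing members with a non-zero priority are eligible."""
    return member.get("priority", 1) > 0 and not member.get("arbiterOnly", False)


def can_vote(member):
    return member.get("votes", 1) > 0


eligible = [m["_id"] for m in members if can_become_primary(m)]
voters = [m["_id"] for m in members if can_vote(m)]
print(eligible)  # [0, 1]
print(voters)    # [0, 1, 2, 3, 4]
```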
Which types of nodes can participate in elections of a new primary?
Regular
Hidden
Arbiters
Explain Write Consistency
- Writes always go to the primary, but reads don’t have to go to the primary
- If reads go somewhere else (to a secondary), you may read stale data
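A stale read can be reproduced with a toy model of asynchronous replication. This is plain Python with hypothetical names (`ToyReplicaSet`, `replicate`), assuming a single secondary that only catches up when `replicate()` is called.

```python
class ToyReplicaSet:
    """Toy model: writes hit the primary, the secondary catches up later."""

    def __init__(self):
        self.primary = {}
        self.secondary = {}
        self.pending = []  # operations not yet copied to the secondary

    def write(self, key, value):
        self.primary[key] = value
        self.pending.append((key, value))

    def replicate(self):
        # Asynchronous step: apply the queued operations on the secondary
        for key, value in self.pending:
            self.secondary[key] = value
        self.pending.clear()

    def read(self, key, from_secondary=False):
        node = self.secondary if from_secondary else self.primary
        return node.get(key)


rs = ToyReplicaSet()
rs.write("x", 1)
rs.replicate()
rs.write("x", 2)                            # not replicated yet
stale = rs.read("x", from_secondary=True)
fresh = rs.read("x")
print(stale, fresh)  # 1 2
```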
During the time when failover is occurring, can writes successfully complete?
No
What is the oplog?
It is a capped collection on a mongo instance (you can find it in the local database using ‘show collections’) that records write activity. The primary records its operations in its own oplog, and the secondaries copy these entries asynchronously and apply them
- The oplog format does not depend on the storage engine (e.g. WiredTiger) or, within limits, on the MongoDB version each node runs
What does rs.slaveOk() do?
It allows queries to be run against a secondary instance; by default, secondaries reject reads
T or F –> A copy of the oplog is kept on both the primary and secondary servers
T
T or F –> Replication supports mixed-mode storage engines, for example MMAPv1 and WiredTiger
T
What happens if a node comes back up as a secondary after a period of being offline and the oplog has looped on the primary?
The entire dataset will be copied from the primary
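The "oplog has looped" case can be sketched with a fixed-size queue standing in for the capped collection. `ToyOplog` and `can_catch_up` are hypothetical names, and the integer timestamps are a simplification of real oplog timestamps.

```python
from collections import deque


class ToyOplog:
    """Capped log: once full, the oldest entries are overwritten."""

    def __init__(self, max_entries):
        self.entries = deque(maxlen=max_entries)
        self.next_ts = 1

    def append(self, op):
        self.entries.append((self.next_ts, op))
        self.next_ts += 1

    def can_catch_up(self, last_applied_ts):
        """True if the entry right after last_applied_ts is still in the log."""
        if last_applied_ts >= self.next_ts - 1:
            return True  # already up to date
        oldest_ts = self.entries[0][0]
        return oldest_ts <= last_applied_ts + 1


oplog = ToyOplog(max_entries=3)
for i in range(5):  # timestamps 1..5; only 3, 4, 5 remain in the log
    oplog.append({"op": "i", "doc": {"_id": i}})

print(oplog.can_catch_up(2))  # True: entry 3 is still available
print(oplog.can_catch_up(1))  # False: entry 2 was overwritten, full resync needed
```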
T or F - There is a scenario in which a write performed with w = majority gets rolled back
T - this can happen when a failover on the primary occurs after the write is committed to the primary but before the replication of the oplog is committed to the secondaries. The client will see an exception.
If you leave a replica set node out of the seedlist within the driver, what will happen?
It will be discovered anyway, as long as you list at least one valid node
What kind of exception occurs when a primary fails and election occurs during an insert?
AutoReconnect exception
What will happen if the following statement is executed in Python during a primary election?
db.test.insert_one({'x': 1})
How is failover detected?
Use try/except blocks in your code to catch the exception raised during a failover. Catching it once does not guarantee that all data gets written, so retry the operation up to 3 times or more
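The retry advice above can be sketched as a small loop. This is plain Python, not driver code: `FailoverError` stands in for a driver exception such as PyMongo’s `AutoReconnect`, and `insert_with_retry` and `flaky_insert` are hypothetical helpers used to simulate two failed attempts during an election.

```python
import time


class FailoverError(Exception):
    """Stand-in for the driver exception raised while no primary is available."""


def insert_with_retry(insert, doc, attempts=3, delay=0.0):
    """Call insert(doc), retrying on failover; return the attempt that succeeded."""
    for attempt in range(1, attempts + 1):
        try:
            insert(doc)
            return attempt
        except FailoverError:
            if attempt == attempts:
                raise  # give up after the last attempt
            time.sleep(delay)  # give the election time to finish


stored = []
failures_left = [2]  # the first two calls fail, as if an election were running


def flaky_insert(doc):
    if failures_left[0] > 0:
        failures_left[0] -= 1
        raise FailoverError("no primary available")
    stored.append(doc)


attempts_used = insert_with_retry(flaky_insert, {"x": 1})
print(attempts_used)  # 3
```

In real code a non-zero `delay` would be sensible, since an election takes some time to complete.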
Why don’t you need to handle a duplicate key exception for reads?
Duplicate key errors are impossible when reading; only writes can raise them