This repository was archived by the owner on Dec 13, 2023. It is now read-only.

Commit 4a5da7b

DOC-295 | Clarifications - Global Write Transaction Lock (#1273)
* edits for clarity
* review
1 parent ac4c66c commit 4a5da7b

File tree

3.10/backup-restore.md
3.11/backup-restore.md

2 files changed: +42 -54 lines changed

3.10/backup-restore.md

Lines changed: 21 additions & 27 deletions
@@ -130,33 +130,27 @@ of that of the single server installation.
 
 - **The Global Write Transaction Lock**
 
-The global write transaction lock mentioned above is such a determining factor,
-that it needs a little detailed attention.
-
-It is obvious that in order to be able to create a consistent snapshot of the
-ArangoDB world on a specific single server or cluster deployment, one must
-stop all transactional write operations at the next possible time or else
-consistency would no longer be given.
-
-On the other hand it is also obvious, that there is no way for ArangoDB to
-known, when that time will come. It might be there with the next attempt a
-nanosecond away, but it could of course not come for the next 2 minutes.
-
-ArangoDB tries to obtain that lock over and over again. On the single server
-instances these consecutive tries will not be noticeable. At some point the
-lock is obtained and the hot backup is created then within a very short
-amount of time.
-
-In clusters things are a little more complicated and noticeable.
-A Coordinator, which is trying to obtain the global write transaction
-lock must try to get local locks
-on all _DB-Servers_ simultaneously; potentially succeeding on some and not
-succeeding on others, leading to apparent dead times in the cluster's write
-operations.
-
-This process can happen multiple times until success is achieved.
-One has control over the length of the time during which the lock is tried to
-be obtained each time prolonging the last wait time by 10%.
+To create a consistent snapshot of an ArangoDB single server or
+cluster deployment, all transactions need to be suspended in order for the
+state of a deployment to be consistent. However, there is no way for ArangoDB
+to know on its own when this time comes. This is why a hot backup needs to
+acquire a global write transaction lock in order to create the backup in a
+consistent state.
+
+On a single server instance, this lock is eventually obtained and the hot
+backup is then created within a very short amount of time.
+
+However, in a cluster, this process is more complex. One Coordinator tries to
+obtain the global write transaction lock on all _DB-Servers_ simultaneously.
+Depending on the activity in the cluster, it can take some time for the
+Coordinator to acquire all the locks the cluster needs. Grabbing all the
+necessary locks at once might not always be successful, leading to times
+when it seems like the cluster's write operations are suspended.
+
+This process can happen multiple times until all locks are obtained.
+The system administrator has control over the length of the time during which
+the lock is tried to be obtained each time, prolonging the last wait time by
+10% (which gives more time for the global write transaction lock to resolve).
 
 - **Agency Lock**
 
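The locking behaviour described in the rewritten paragraphs can be pictured with a short sketch. The following Python snippet is purely illustrative and is not ArangoDB's implementation; `try_write_lock` and `release_write_lock` are hypothetical stand-ins for the per-DB-Server locking calls, and the default wait values are arbitrary. It shows the pattern of grabbing all local locks at once, backing out on partial success, and retrying with the previous wait prolonged by 10%.

```python
import time

def acquire_global_write_lock(dbservers, first_wait=5.0, overall_timeout=300.0):
    """Repeatedly try to hold the write lock on every DB-Server at the same time."""
    wait = first_wait
    start = time.monotonic()
    while time.monotonic() - start < overall_timeout:
        locked = []
        for srv in dbservers:
            # try_write_lock is a hypothetical call: give this server up to
            # `wait` seconds to grant its local write lock.
            if srv.try_write_lock(timeout=wait):
                locked.append(srv)
        if len(locked) == len(dbservers):
            return True  # every server is locked: a consistent snapshot can be taken
        for srv in locked:
            srv.release_write_lock()  # partial success only: back out ...
        wait *= 1.1  # ... and retry, prolonging the last wait time by 10%
    return False  # gave up: the global lock could not be obtained in time
```

Growing the wait by 10% per round gives a busy cluster progressively longer windows in which all local locks can be held simultaneously, which is the behaviour the new text describes.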
3.11/backup-restore.md

Lines changed: 21 additions & 27 deletions
@@ -130,33 +130,27 @@ of that of the single server installation.
(The hunk is identical to the change shown above for 3.10/backup-restore.md.)
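For context on where the global write transaction lock surfaces for users, here is a minimal sketch of requesting a hot backup with a bounded wait for that lock. It assumes the hot backup HTTP endpoint `POST /_admin/backup/create` with a `timeout` field in seconds, as described in the surrounding backup-restore guide; the endpoint, field names, and credentials are assumptions to verify against the version-specific documentation.

```python
import requests  # third-party HTTP client

# Hedged example: request a hot backup and allow up to 120 seconds for the
# global write transaction lock to be obtained before the request fails.
resp = requests.post(
    "http://localhost:8529/_admin/backup/create",
    json={"label": "nightly", "timeout": 120},  # assumed field names
    auth=("root", ""),
)
resp.raise_for_status()
print(resp.json())
```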