git-maintenance(1)
==================

NAME
----
git-maintenance - Run tasks to optimize Git repository data


SYNOPSIS
--------
[verse]
'git maintenance' run [<options>]


DESCRIPTION
-----------
Run tasks to optimize Git repository data, speeding up other Git commands
and reducing storage requirements for the repository.

Git commands that add repository data, such as `git add` or `git fetch`,
are optimized for a responsive user experience. These commands do not take
time to optimize the Git data, since such optimizations scale with the full
size of the repository while these user commands each perform a relatively
small action.

The `git maintenance` command provides flexibility for how to optimize the
Git repository.

SUBCOMMANDS
-----------

register::
Initialize Git config values so any scheduled maintenance will
start running on this repository. This adds the repository to the
`maintenance.repo` config variable in the current user's global
config and enables some recommended configuration values for
`maintenance.<task>.schedule`. The tasks that are enabled are safe
for running in the background without disrupting foreground
processes.
+
The `register` subcommand will also set the `maintenance.strategy` config
value to `incremental`, if this value is not previously set. The
`incremental` strategy uses the following schedule for each maintenance
task:
+
--
* `gc`: disabled.
* `commit-graph`: hourly.
* `prefetch`: hourly.
* `loose-objects`: daily.
* `incremental-repack`: daily.
--
+
`git maintenance register` will also disable foreground maintenance by
setting `maintenance.auto = false` in the current repository. This config
setting will remain after a `git maintenance unregister` command.

run::
Run one or more maintenance tasks. If one or more `--task` options
are specified, then those tasks are run in that order. Otherwise,
the tasks are determined by which `maintenance.<task>.enabled`
config options are true. By default, only `maintenance.gc.enabled`
is true.

start::
Start running maintenance on the current repository. This performs
the same config updates as the `register` subcommand, then updates
the background scheduler to run `git maintenance run --schedule`
on an hourly basis.

stop::
Halt the background maintenance schedule. The current repository
is not removed from the list of maintained repositories, in case
the background maintenance is restarted later.

unregister::
Remove the current repository from background maintenance. This
only removes the repository from the configured list. It does not
stop the background maintenance processes from running.

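The config changes described for `register` can be approximated by hand.
The sketch below is an illustration based only on the description above,
not the actual implementation. It works in a throwaway `HOME` and
repository created with `mktemp` so that real configuration stays
untouched, and it assumes that `maintenance.strategy` and
`maintenance.auto` are written to the repository-local config.

```shell
# Work in a throwaway HOME and repository so real config is untouched.
export HOME="$(mktemp -d)"
repo="$(mktemp -d)"
git init --quiet "$repo"
cd "$repo"

# 1. Add this repository to the multi-valued global list.
git config --global --add maintenance.repo "$repo"

# 2. Select the incremental strategy only if no strategy is set yet.
git config --get maintenance.strategy >/dev/null ||
	git config maintenance.strategy incremental

# 3. Disable foreground (automatic) maintenance in this repository.
git config maintenance.auto false

# Inspect the result.
git config --global --get-all maintenance.repo
git config --get maintenance.strategy
git config --get maintenance.auto
```

The same three `git config` queries can be run in a real repository to
check what a previous `git maintenance register` left behind.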
TASKS
-----

commit-graph::
The `commit-graph` job updates the `commit-graph` files incrementally,
then verifies that the written data is correct. The incremental
write is safe to run alongside concurrent Git processes since it
will not expire `.graph` files that were in the previous
`commit-graph-chain` file. They will be deleted by a later run based
on the expiration delay.

prefetch::
The `prefetch` task updates the object directory with the latest
objects from all registered remotes. For each remote, a `git fetch`
command is run. The configured refspec is modified to place all
requested refs within `refs/prefetch/`. Also, tags are not updated.
+
This is done to avoid disrupting the remote-tracking branches. The end users
expect these refs to stay unmoved unless they initiate a fetch. With the
`prefetch` task, however, the objects necessary to complete a later real
fetch would already be obtained, so the real fetch would go faster. In the
ideal case, it will just become an update to a bunch of remote-tracking
branches without any object transfer.

gc::
Clean up unnecessary files and optimize the local repository. "GC"
stands for "garbage collection," but this task performs many
smaller tasks. This task can be expensive for large repositories,
as it repacks all Git objects into a single pack-file. It can also
be disruptive in some situations, as it deletes stale data. See
linkgit:git-gc[1] for more details on garbage collection in Git.

loose-objects::
The `loose-objects` job cleans up loose objects and places them into
pack-files. In order to prevent race conditions with concurrent Git
commands, it follows a two-step process. First, it deletes any loose
objects that already exist in a pack-file; concurrent Git processes
will examine the pack-file for the object data instead of the loose
object. Second, it creates a new pack-file (starting with "loose-")
containing a batch of loose objects. The batch size is limited to 50
thousand objects to prevent the job from taking too long on a
repository with many loose objects. The `gc` task writes unreachable
objects as loose objects to be cleaned up by a later step only if
they are not re-added to a pack-file; for this reason it is not
advisable to enable both the `loose-objects` and `gc` tasks at the
same time.

incremental-repack::
The `incremental-repack` job repacks the object directory
using the `multi-pack-index` feature. In order to prevent race
conditions with concurrent Git commands, it follows a two-step
process. First, it calls `git multi-pack-index expire` to delete
pack-files unreferenced by the `multi-pack-index` file. Second, it
calls `git multi-pack-index repack` to select several small
pack-files and repack them into a bigger one, and then update the
`multi-pack-index` entries that refer to the small pack-files to
refer to the new pack-file. This prepares those small pack-files
for deletion upon the next run of `git multi-pack-index expire`.
The selection of the small pack-files is such that the expected
size of the big pack-file is at least the batch size; see the
`--batch-size` option for the `repack` subcommand in
linkgit:git-multi-pack-index[1]. The default batch-size is zero,
which is a special case that attempts to repack all pack-files
into a single pack-file.

pack-refs::
The `pack-refs` task collects the loose reference files and
packs them into a single file. This speeds up operations that
need to iterate across many references. See linkgit:git-pack-refs[1]
for more information.

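Any of the task names above can be passed to `--task`. The following is
a minimal sketch, assuming a Git version that provides `git maintenance`.
It uses a throwaway repository, and the commit identity and message are
placeholders chosen for illustration.

```shell
# Create a throwaway repository with one commit to maintain.
repo="$(mktemp -d)"
git init --quiet "$repo"
cd "$repo"
git -c user.name="Example" -c user.email="example@example.com" \
	commit --quiet --allow-empty -m "initial commit"

# Run two specific tasks, in this order, without progress output.
git maintenance run --task=commit-graph --task=loose-objects --quiet
```

Because `--task` is given, the `maintenance.<task>.enabled` settings are
not consulted for this run.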
OPTIONS
-------
--auto::
When combined with the `run` subcommand, run maintenance tasks
only if certain thresholds are met. For example, the `gc` task
runs when the number of loose objects exceeds the number stored
in the `gc.auto` config setting, or when the number of pack-files
exceeds the `gc.autoPackLimit` config setting. Not compatible with
the `--schedule` option.

--schedule::
When combined with the `run` subcommand, run maintenance tasks
only if certain time conditions are met, as specified by the
`maintenance.<task>.schedule` config value for each `<task>`.
This config value names how often the task should run (`hourly`,
`daily`, or `weekly`) and is compared with the `<frequency>` given
as `--schedule=<frequency>`. The tasks that are tested are those
provided by the `--task=<task>` option(s) or those with
`maintenance.<task>.enabled` set to true.

--quiet::
Do not report progress or other information over `stderr`.

--task=<task>::
If this option is specified one or more times, then only run the
specified tasks in the specified order. If no `--task=<task>`
arguments are specified, then only the tasks with
`maintenance.<task>.enabled` configured as `true` are considered.
See the 'TASKS' section for the list of accepted `<task>` values.

--scheduler=auto|crontab|systemd-timer|launchctl|schtasks::
When combined with the `start` subcommand, specify the scheduler
for running the hourly, daily and weekly executions of
`git maintenance run`.
Possible values for `<scheduler>` are `auto`, `crontab`
(POSIX), `systemd-timer` (Linux), `launchctl` (macOS), and
`schtasks` (Windows). When `auto` is specified, the
appropriate platform-specific scheduler is used; on Linux,
`systemd-timer` is used if available, otherwise
`crontab`. Default is `auto`.


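The two conditional modes of `run` can be compared side by side. This is
a sketch in a throwaway repository. The threshold values (`100`, `10`)
and the chosen schedules are arbitrary examples, not recommendations.

```shell
# Throwaway repository for experimenting with the options.
repo="$(mktemp -d)"
git init --quiet "$repo"
cd "$repo"

# --auto: tasks run only when their thresholds are exceeded.
git config gc.auto 100
git config gc.autoPackLimit 10
git maintenance run --auto --quiet

# --schedule: an enabled task runs only if its configured schedule
# calls for it at the requested frequency.
git config maintenance.commit-graph.enabled true
git config maintenance.commit-graph.schedule hourly
git config maintenance.gc.schedule weekly
git maintenance run --schedule=hourly --quiet
```

In this setup the hourly run considers `commit-graph` but leaves `gc`
for the weekly run.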
TROUBLESHOOTING
---------------
The `git maintenance` command is designed to simplify the repository
maintenance patterns while minimizing user wait time during Git commands.
A variety of configuration options are available to allow customizing this
process. The default maintenance options focus on operations that complete
quickly, even on large repositories.

Users may find some cases where scheduled maintenance tasks do not run as
frequently as intended. Each `git maintenance run` command takes a lock on
the repository's object database, and this prevents other concurrent
`git maintenance run` commands from running on the same repository. Without
this safeguard, competing processes could leave the repository in an
unpredictable state.

The background maintenance schedule runs `git maintenance run` processes
on an hourly basis. Each run executes the "hourly" tasks. At midnight,
that process also executes the "daily" tasks. At midnight on the first day
of the week, that process also executes the "weekly" tasks. A single
process iterates over each registered repository, performing the scheduled
tasks for that frequency. Depending on the number of registered
repositories and their sizes, this process may take longer than an hour.
In this case, multiple `git maintenance run` commands may run on the same
repository at the same time, colliding on the object database lock. This
results in one of the two tasks not running.

If you find that some maintenance windows are taking longer than one hour
to complete, then consider reducing the complexity of your maintenance
tasks. For example, the `gc` task is much slower than the
`incremental-repack` task, so replacing `gc` with `incremental-repack`
shortens each run. However, this comes at the cost of a slightly
larger object database. Consider moving more expensive tasks to be run
less frequently.

Expert users may consider scheduling their own maintenance tasks using a
different schedule than is available through `git maintenance start` and
Git configuration options. These users should be aware of the object
database lock and how concurrent `git maintenance run` commands behave.
Further, the `git gc` command should not be combined with
`git maintenance run` commands. `git gc` modifies the object database
but does not take the lock in the same way as `git maintenance run`. If
possible, use `git maintenance run --task=gc` instead of `git gc`.

The following sections describe the mechanisms put in place to run
background maintenance by `git maintenance start` and how to customize
them.

BACKGROUND MAINTENANCE ON POSIX SYSTEMS
---------------------------------------

The standard mechanism for scheduling background tasks on POSIX systems
is cron(8). This tool executes commands based on a given schedule. The
current list of user-scheduled tasks can be found by running `crontab -l`.
The schedule written by `git maintenance start` is similar to this:

-----------------------------------------------------------------------
# BEGIN GIT MAINTENANCE SCHEDULE
# The following schedule was created by Git
# Any edits made in this region might be
# replaced in the future by a Git command.

0 1-23 * * * "/<path>/git" --exec-path="/<path>" for-each-repo --config=maintenance.repo maintenance run --schedule=hourly
0 0 * * 1-6 "/<path>/git" --exec-path="/<path>" for-each-repo --config=maintenance.repo maintenance run --schedule=daily
0 0 * * 0 "/<path>/git" --exec-path="/<path>" for-each-repo --config=maintenance.repo maintenance run --schedule=weekly

# END GIT MAINTENANCE SCHEDULE
-----------------------------------------------------------------------

The comments are used as a region to mark the schedule as written by Git.
Any modifications within this region will be completely deleted by
`git maintenance stop` or overwritten by `git maintenance start`.

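Because the Git-managed region is delimited by fixed comment lines, it
can be pulled out of a larger crontab with standard tools. The helper
below is a hypothetical convenience (the function name
`git_schedule_region` and the sample backup entry are not part of Git)
that prints only the lines between the two markers.

```shell
# Print the Git-managed region (markers included) from crontab text on stdin.
git_schedule_region() {
	sed -n '/^# BEGIN GIT MAINTENANCE SCHEDULE/,/^# END GIT MAINTENANCE SCHEDULE/p'
}

# Demonstration on sample text; with a real crontab, use:
#   crontab -l | git_schedule_region
sample='15 3 * * * /usr/local/bin/backup
# BEGIN GIT MAINTENANCE SCHEDULE
0 0 * * 0 "/<path>/git" --exec-path="/<path>" for-each-repo --config=maintenance.repo maintenance run --schedule=weekly
# END GIT MAINTENANCE SCHEDULE'
printf '%s\n' "$sample" | git_schedule_region
```

Anything the helper prints is subject to being replaced by Git, so custom
entries belong outside that region.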
The `crontab` entry specifies the full path of the `git` executable to
ensure that the executed `git` command is the same one with which
`git maintenance start` was issued independent of `PATH`. If the same user
runs `git maintenance start` with multiple Git executables, then only the
latest executable is used.

These commands use `git for-each-repo --config=maintenance.repo` to run
`git maintenance run --schedule=<frequency>` on each repository listed in
the multi-valued `maintenance.repo` config option. These are typically
loaded from the user-specific global config. The `git maintenance` process
then determines which maintenance tasks are configured to run on each
repository with each `<frequency>` using the `maintenance.<task>.schedule`
config options. These values are loaded from the global or repository
config values.

If the config values are insufficient to achieve your desired background
maintenance schedule, then you can create your own schedule. If you run
`crontab -e`, then an editor will load with your user-specific `cron`
schedule. In that editor, you can add your own schedule lines. You could
start by adapting the default schedule listed earlier, or you could read
the crontab(5) documentation for advanced scheduling techniques. Please
do use the full path and `--exec-path` techniques from the default
schedule to ensure you are executing the correct binaries in your
schedule.


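A custom entry can reuse the full-path and `--exec-path` technique. The
sketch below only prints a candidate line for pasting into `crontab -e`,
outside the Git-managed region. The six-hour interval and the choice of
the `loose-objects` task are arbitrary examples, and the `*/6` step
syntax is assumed to be supported by the local `cron` implementation.

```shell
# Resolve the same binary and exec path that the current shell would use.
git_bin="$(command -v git)"
exec_path="$(git --exec-path)"

# Build a hypothetical entry: run one task every six hours for all
# repositories listed in maintenance.repo.
entry="$(printf '0 */6 * * * "%s" --exec-path="%s" for-each-repo --config=maintenance.repo maintenance run --task=loose-objects' "$git_bin" "$exec_path")"
printf '%s\n' "$entry"
```

Printing the line instead of installing it keeps the decision to edit the
crontab with the user.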
BACKGROUND MAINTENANCE ON LINUX SYSTEMD SYSTEMS
-----------------------------------------------

While Linux supports `cron`, depending on the distribution, `cron` may
be an optional package not necessarily installed. On modern Linux
distributions, systemd timers are superseding it.

If user systemd timers are available, they will be used as a replacement
for `cron`.

In this case, `git maintenance start` will create user systemd timer units
and start the timers. The current list of user-scheduled tasks can be found
by running `systemctl --user list-timers`. The timers written by `git
maintenance start` are similar to this:

-----------------------------------------------------------------------
$ systemctl --user list-timers
NEXT                         LEFT          LAST                         PASSED     UNIT                         ACTIVATES
Thu 2021-04-29 19:00:00 CEST 42min left    Thu 2021-04-29 18:00:11 CEST 17min ago  git-maintenance@hourly.timer git-maintenance@hourly.service
Fri 2021-04-30 00:00:00 CEST 5h 42min left Thu 2021-04-29 00:00:11 CEST 18h ago    git-maintenance@daily.timer  git-maintenance@daily.service
Mon 2021-05-03 00:00:00 CEST 3 days left   Mon 2021-04-26 00:00:11 CEST 3 days ago git-maintenance@weekly.timer git-maintenance@weekly.service
-----------------------------------------------------------------------

One timer is registered for each `--schedule=<frequency>` option.

The definition of the systemd units can be inspected in the following files:

-----------------------------------------------------------------------
~/.config/systemd/user/git-maintenance@.timer
~/.config/systemd/user/git-maintenance@.service
~/.config/systemd/user/timers.target.wants/git-maintenance@hourly.timer
~/.config/systemd/user/timers.target.wants/git-maintenance@daily.timer
~/.config/systemd/user/timers.target.wants/git-maintenance@weekly.timer
-----------------------------------------------------------------------

`git maintenance start` will overwrite these files and start the timer
again with `systemctl --user`, so any customization should be done by
creating a drop-in file, i.e. a `.conf` suffixed file in the
`~/.config/systemd/user/git-maintenance@.service.d` directory.

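A drop-in might be created as follows. This is a sketch: the file name
`10-low-priority.conf` and the chosen `[Service]` settings are
illustrative assumptions, not something Git writes or requires.

```shell
# Directory for drop-ins that apply to every git-maintenance@<frequency>.service.
dropin_dir="$HOME/.config/systemd/user/git-maintenance@.service.d"
mkdir -p "$dropin_dir"

# Hypothetical drop-in: lower the CPU and I/O priority of maintenance runs.
cat >"$dropin_dir/10-low-priority.conf" <<'EOF'
[Service]
Nice=19
IOSchedulingClass=idle
EOF

# Reload the user manager so it notices the drop-in; the error is ignored
# on systems without a running systemd user manager.
systemctl --user daemon-reload 2>/dev/null || true
```

Since the drop-in lives in its own directory, it is not one of the unit
files listed above that `git maintenance start` rewrites.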
`git maintenance stop` will stop the user systemd timers and delete
the above-mentioned files.

For more details, see `systemd.timer(5)`.


BACKGROUND MAINTENANCE ON MACOS SYSTEMS
---------------------------------------

While macOS technically supports `cron`, using `crontab -e` requires
elevated privileges and the executed process does not have a full user
context. Without a full user context, Git and its credential helpers
cannot access stored credentials, so some maintenance tasks are not
functional.

Instead, `git maintenance start` interacts with the `launchctl` tool,
which is the recommended way to schedule timed jobs in macOS. Scheduling
maintenance through `git maintenance (start|stop)` requires some
`launchctl` features available only in macOS 10.11 or later.

Your user-specific scheduled tasks are stored as XML-formatted `.plist`
files in `~/Library/LaunchAgents/`. You can see the currently-registered
tasks using the following command:

-----------------------------------------------------------------------
$ ls ~/Library/LaunchAgents/org.git-scm.git*
org.git-scm.git.daily.plist
org.git-scm.git.hourly.plist
org.git-scm.git.weekly.plist
-----------------------------------------------------------------------

One task is registered for each `--schedule=<frequency>` option. To
inspect how the XML format describes each schedule, open one of these
`.plist` files in an editor and inspect the `<array>` element following
the `<key>StartCalendarInterval</key>` element.

`git maintenance start` will overwrite these files and register the
tasks again with `launchctl`, so any customizations should be done by
creating your own `.plist` files with distinct names. Similarly, the
`git maintenance stop` command will unregister the tasks with `launchctl`
and delete the `.plist` files.

To create more advanced customizations to your background tasks, see
launchd.plist(5) for more information.


BACKGROUND MAINTENANCE ON WINDOWS SYSTEMS
-----------------------------------------

Windows does not support `cron` and instead has its own system for
scheduling background tasks. The `git maintenance start` command uses
the `schtasks` command to submit tasks to this system. You can inspect
all background tasks using the Task Scheduler application. The tasks
added by Git have names of the form `Git Maintenance (<frequency>)`.
The Task Scheduler GUI has ways to inspect these tasks, but you can also
export the tasks to XML files and view the details there.

Note that since Git is a console application, these background tasks
create a console window visible to the current user. This can be changed
manually by selecting the "Run whether user is logged in or not" option
in Task Scheduler. This change requires a password input, which is why
`git maintenance start` does not select it by default.

If you want to customize the background tasks, please rename the tasks
so future calls to `git maintenance (start|stop)` do not overwrite your
custom tasks.


GIT
---
Part of the linkgit:git[1] suite