| <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" | |
| "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd"> | |
| <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en"> | |
| <head> | |
| <meta http-equiv="Content-Type" content="application/xhtml+xml; charset=UTF-8" /> | |
| <meta name="generator" content="AsciiDoc 8.6.9" /> | |
| <title>Use of index and Racy Git problem</title> | |
| <style type="text/css"> | |
| /* Shared CSS for AsciiDoc xhtml11 and html5 backends */ | |
| /* Default font. */ | |
| body { | |
| font-family: Georgia,serif; | |
| } | |
| /* Title font. */ | |
| h1, h2, h3, h4, h5, h6, | |
| div.title, caption.title, | |
| thead, p.table.header, | |
| #toctitle, | |
| #author, #revnumber, #revdate, #revremark, | |
| #footer { | |
| font-family: Arial,Helvetica,sans-serif; | |
| } | |
| body { | |
| margin: 1em 5% 1em 5%; | |
| } | |
| a { | |
| color: blue; | |
| text-decoration: underline; | |
| } | |
| a:visited { | |
| color: fuchsia; | |
| } | |
| em { | |
| font-style: italic; | |
| color: navy; | |
| } | |
| strong { | |
| font-weight: bold; | |
| color: #083194; | |
| } | |
| h1, h2, h3, h4, h5, h6 { | |
| color: #527bbd; | |
| margin-top: 1.2em; | |
| margin-bottom: 0.5em; | |
| line-height: 1.3; | |
| } | |
| h1, h2, h3 { | |
| border-bottom: 2px solid silver; | |
| } | |
| h2 { | |
| padding-top: 0.5em; | |
| } | |
| h3 { | |
| float: left; | |
| } | |
| h3 + * { | |
| clear: left; | |
| } | |
| h5 { | |
| font-size: 1.0em; | |
| } | |
| div.sectionbody { | |
| margin-left: 0; | |
| } | |
| hr { | |
| border: 1px solid silver; | |
| } | |
| p { | |
| margin-top: 0.5em; | |
| margin-bottom: 0.5em; | |
| } | |
| ul, ol, li > p { | |
| margin-top: 0; | |
| } | |
| ul > li { color: #aaa; } | |
| ul > li > * { color: black; } | |
| .monospaced, code, pre { | |
| font-family: "Courier New", Courier, monospace; | |
| font-size: inherit; | |
| color: navy; | |
| padding: 0; | |
| margin: 0; | |
| } | |
| pre { | |
| white-space: pre-wrap; | |
| } | |
| #author { | |
| color: #527bbd; | |
| font-weight: bold; | |
| font-size: 1.1em; | |
| } | |
| #email { | |
| } | |
| #revnumber, #revdate, #revremark { | |
| } | |
| #footer { | |
| font-size: small; | |
| border-top: 2px solid silver; | |
| padding-top: 0.5em; | |
| margin-top: 4.0em; | |
| } | |
| #footer-text { | |
| float: left; | |
| padding-bottom: 0.5em; | |
| } | |
| #footer-badges { | |
| float: right; | |
| padding-bottom: 0.5em; | |
| } | |
| #preamble { | |
| margin-top: 1.5em; | |
| margin-bottom: 1.5em; | |
| } | |
| div.imageblock, div.exampleblock, div.verseblock, | |
| div.quoteblock, div.literalblock, div.listingblock, div.sidebarblock, | |
| div.admonitionblock { | |
| margin-top: 1.0em; | |
| margin-bottom: 1.5em; | |
| } | |
| div.admonitionblock { | |
| margin-top: 2.0em; | |
| margin-bottom: 2.0em; | |
| margin-right: 10%; | |
| color: #606060; | |
| } | |
| div.content { /* Block element content. */ | |
| padding: 0; | |
| } | |
| /* Block element titles. */ | |
| div.title, caption.title { | |
| color: #527bbd; | |
| font-weight: bold; | |
| text-align: left; | |
| margin-top: 1.0em; | |
| margin-bottom: 0.5em; | |
| } | |
| div.title + * { | |
| margin-top: 0; | |
| } | |
| td div.title:first-child { | |
| margin-top: 0.0em; | |
| } | |
| div.content div.title:first-child { | |
| margin-top: 0.0em; | |
| } | |
| div.content + div.title { | |
| margin-top: 0.0em; | |
| } | |
| div.sidebarblock > div.content { | |
| background: #ffffee; | |
| border: 1px solid #dddddd; | |
| border-left: 4px solid #f0f0f0; | |
| padding: 0.5em; | |
| } | |
| div.listingblock > div.content { | |
| border: 1px solid #dddddd; | |
| border-left: 5px solid #f0f0f0; | |
| background: #f8f8f8; | |
| padding: 0.5em; | |
| } | |
| div.quoteblock, div.verseblock { | |
| padding-left: 1.0em; | |
| margin-left: 1.0em; | |
| margin-right: 10%; | |
| border-left: 5px solid #f0f0f0; | |
| color: #888; | |
| } | |
| div.quoteblock > div.attribution { | |
| padding-top: 0.5em; | |
| text-align: right; | |
| } | |
| div.verseblock > pre.content { | |
| font-family: inherit; | |
| font-size: inherit; | |
| } | |
| div.verseblock > div.attribution { | |
| padding-top: 0.75em; | |
| text-align: left; | |
| } | |
| /* DEPRECATED: Pre version 8.2.7 verse style literal block. */ | |
| div.verseblock + div.attribution { | |
| text-align: left; | |
| } | |
| div.admonitionblock .icon { | |
| vertical-align: top; | |
| font-size: 1.1em; | |
| font-weight: bold; | |
| text-decoration: underline; | |
| color: #527bbd; | |
| padding-right: 0.5em; | |
| } | |
| div.admonitionblock td.content { | |
| padding-left: 0.5em; | |
| border-left: 3px solid #dddddd; | |
| } | |
| div.exampleblock > div.content { | |
| border-left: 3px solid #dddddd; | |
| padding-left: 0.5em; | |
| } | |
| div.imageblock div.content { padding-left: 0; } | |
| span.image img { border-style: none; vertical-align: text-bottom; } | |
| a.image:visited { color: white; } | |
| dl { | |
| margin-top: 0.8em; | |
| margin-bottom: 0.8em; | |
| } | |
| dt { | |
| margin-top: 0.5em; | |
| margin-bottom: 0; | |
| font-style: normal; | |
| color: navy; | |
| } | |
| dd > *:first-child { | |
| margin-top: 0.1em; | |
| } | |
| ul, ol { | |
| list-style-position: outside; | |
| } | |
| ol.arabic { | |
| list-style-type: decimal; | |
| } | |
| ol.loweralpha { | |
| list-style-type: lower-alpha; | |
| } | |
| ol.upperalpha { | |
| list-style-type: upper-alpha; | |
| } | |
| ol.lowerroman { | |
| list-style-type: lower-roman; | |
| } | |
| ol.upperroman { | |
| list-style-type: upper-roman; | |
| } | |
| div.compact ul, div.compact ol, | |
| div.compact p, div.compact p, | |
| div.compact div, div.compact div { | |
| margin-top: 0.1em; | |
| margin-bottom: 0.1em; | |
| } | |
| tfoot { | |
| font-weight: bold; | |
| } | |
| td > div.verse { | |
| white-space: pre; | |
| } | |
| div.hdlist { | |
| margin-top: 0.8em; | |
| margin-bottom: 0.8em; | |
| } | |
| div.hdlist tr { | |
| padding-bottom: 15px; | |
| } | |
| dt.hdlist1.strong, td.hdlist1.strong { | |
| font-weight: bold; | |
| } | |
| td.hdlist1 { | |
| vertical-align: top; | |
| font-style: normal; | |
| padding-right: 0.8em; | |
| color: navy; | |
| } | |
| td.hdlist2 { | |
| vertical-align: top; | |
| } | |
| div.hdlist.compact tr { | |
| margin: 0; | |
| padding-bottom: 0; | |
| } | |
| .comment { | |
| background: yellow; | |
| } | |
| .footnote, .footnoteref { | |
| font-size: 0.8em; | |
| } | |
| span.footnote, span.footnoteref { | |
| vertical-align: super; | |
| } | |
| #footnotes { | |
| margin: 20px 0 20px 0; | |
| padding: 7px 0 0 0; | |
| } | |
| #footnotes div.footnote { | |
| margin: 0 0 5px 0; | |
| } | |
| #footnotes hr { | |
| border: none; | |
| border-top: 1px solid silver; | |
| height: 1px; | |
| text-align: left; | |
| margin-left: 0; | |
| width: 20%; | |
| min-width: 100px; | |
| } | |
| div.colist td { | |
| padding-right: 0.5em; | |
| padding-bottom: 0.3em; | |
| vertical-align: top; | |
| } | |
| div.colist td img { | |
| margin-top: 0.3em; | |
| } | |
| @media print { | |
| #footer-badges { display: none; } | |
| } | |
| #toc { | |
| margin-bottom: 2.5em; | |
| } | |
| #toctitle { | |
| color: #527bbd; | |
| font-size: 1.1em; | |
| font-weight: bold; | |
| margin-top: 1.0em; | |
| margin-bottom: 0.1em; | |
| } | |
| div.toclevel0, div.toclevel1, div.toclevel2, div.toclevel3, div.toclevel4 { | |
| margin-top: 0; | |
| margin-bottom: 0; | |
| } | |
| div.toclevel2 { | |
| margin-left: 2em; | |
| font-size: 0.9em; | |
| } | |
| div.toclevel3 { | |
| margin-left: 4em; | |
| font-size: 0.9em; | |
| } | |
| div.toclevel4 { | |
| margin-left: 6em; | |
| font-size: 0.9em; | |
| } | |
| span.aqua { color: aqua; } | |
| span.black { color: black; } | |
| span.blue { color: blue; } | |
| span.fuchsia { color: fuchsia; } | |
| span.gray { color: gray; } | |
| span.green { color: green; } | |
| span.lime { color: lime; } | |
| span.maroon { color: maroon; } | |
| span.navy { color: navy; } | |
| span.olive { color: olive; } | |
| span.purple { color: purple; } | |
| span.red { color: red; } | |
| span.silver { color: silver; } | |
| span.teal { color: teal; } | |
| span.white { color: white; } | |
| span.yellow { color: yellow; } | |
| span.aqua-background { background: aqua; } | |
| span.black-background { background: black; } | |
| span.blue-background { background: blue; } | |
| span.fuchsia-background { background: fuchsia; } | |
| span.gray-background { background: gray; } | |
| span.green-background { background: green; } | |
| span.lime-background { background: lime; } | |
| span.maroon-background { background: maroon; } | |
| span.navy-background { background: navy; } | |
| span.olive-background { background: olive; } | |
| span.purple-background { background: purple; } | |
| span.red-background { background: red; } | |
| span.silver-background { background: silver; } | |
| span.teal-background { background: teal; } | |
| span.white-background { background: white; } | |
| span.yellow-background { background: yellow; } | |
| span.big { font-size: 2em; } | |
| span.small { font-size: 0.6em; } | |
| span.underline { text-decoration: underline; } | |
| span.overline { text-decoration: overline; } | |
| span.line-through { text-decoration: line-through; } | |
| div.unbreakable { page-break-inside: avoid; } | |
| /* | |
| * xhtml11 specific | |
| * | |
| * */ | |
| div.tableblock { | |
| margin-top: 1.0em; | |
| margin-bottom: 1.5em; | |
| } | |
| div.tableblock > table { | |
| border: 3px solid #527bbd; | |
| } | |
| thead, p.table.header { | |
| font-weight: bold; | |
| color: #527bbd; | |
| } | |
| p.table { | |
| margin-top: 0; | |
| } | |
| /* Because the table frame attribute is overriden by CSS in most browsers. */ | |
| div.tableblock > table[frame="void"] { | |
| border-style: none; | |
| } | |
| div.tableblock > table[frame="hsides"] { | |
| border-left-style: none; | |
| border-right-style: none; | |
| } | |
| div.tableblock > table[frame="vsides"] { | |
| border-top-style: none; | |
| border-bottom-style: none; | |
| } | |
| /* | |
| * html5 specific | |
| * | |
| * */ | |
| table.tableblock { | |
| margin-top: 1.0em; | |
| margin-bottom: 1.5em; | |
| } | |
| thead, p.tableblock.header { | |
| font-weight: bold; | |
| color: #527bbd; | |
| } | |
| p.tableblock { | |
| margin-top: 0; | |
| } | |
| table.tableblock { | |
| border-width: 3px; | |
| border-spacing: 0px; | |
| border-style: solid; | |
| border-color: #527bbd; | |
| border-collapse: collapse; | |
| } | |
| th.tableblock, td.tableblock { | |
| border-width: 1px; | |
| padding: 4px; | |
| border-style: solid; | |
| border-color: #527bbd; | |
| } | |
| table.tableblock.frame-topbot { | |
| border-left-style: hidden; | |
| border-right-style: hidden; | |
| } | |
| table.tableblock.frame-sides { | |
| border-top-style: hidden; | |
| border-bottom-style: hidden; | |
| } | |
| table.tableblock.frame-none { | |
| border-style: hidden; | |
| } | |
| th.tableblock.halign-left, td.tableblock.halign-left { | |
| text-align: left; | |
| } | |
| th.tableblock.halign-center, td.tableblock.halign-center { | |
| text-align: center; | |
| } | |
| th.tableblock.halign-right, td.tableblock.halign-right { | |
| text-align: right; | |
| } | |
| th.tableblock.valign-top, td.tableblock.valign-top { | |
| vertical-align: top; | |
| } | |
| th.tableblock.valign-middle, td.tableblock.valign-middle { | |
| vertical-align: middle; | |
| } | |
| th.tableblock.valign-bottom, td.tableblock.valign-bottom { | |
| vertical-align: bottom; | |
| } | |
| /* | |
| * manpage specific | |
| * | |
| * */ | |
| body.manpage h1 { | |
| padding-top: 0.5em; | |
| padding-bottom: 0.5em; | |
| border-top: 2px solid silver; | |
| border-bottom: 2px solid silver; | |
| } | |
| body.manpage h2 { | |
| border-style: none; | |
| } | |
| body.manpage div.sectionbody { | |
| margin-left: 3em; | |
| } | |
| @media print { | |
| body.manpage div#toc { display: none; } | |
| } | |
| </style> | |
| </head> | |
<body class="article">
<div id="header">
<h1>Use of index and Racy Git problem</h1>
</div>
<div id="content">
<div class="sect1">
<h2 id="_background">Background</h2>
<div class="sectionbody">
<div class="paragraph"><p>The index is one of the most important data structures in Git.
It represents a virtual working tree state by recording a list of
paths and their object names and serves as a staging area to
write out the next tree object to be committed. The state is
"virtual" in the sense that it does not necessarily have to, and
often does not, match the files in the working tree.</p></div>
<div class="paragraph"><p>There are cases where Git needs to examine the differences between the
virtual working tree state in the index and the files in the
working tree. The most obvious case is when the user runs <code>git
diff</code> (or its low-level implementation, <code>git diff-files</code>) or
<code>git ls-files --modified</code>. In addition, Git internally checks
whether the files in the working tree differ from what is
recorded in the index, to avoid stomping on local changes in them
during patch application, switching branches, and merging.</p></div>
<div class="paragraph"><p>To speed up this comparison between the files in the
working tree and the index entries, the index entries record the
information obtained from the filesystem via the <code>lstat(2)</code> system
call when they were last updated. When checking whether they differ,
Git first runs <code>lstat(2)</code> on the files and compares the result
with this information (this is what was originally done by the
<code>ce_match_stat()</code> function, but the current code does it in the
<code>ce_match_stat_basic()</code> function). If any of these "cached
stat information" fields do not match, Git can tell that the
files are modified without even looking at their contents.</p></div>
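<div class="paragraph"><p>As a rough illustration, the fast path can be modeled in a few lines of
Python. This is only a sketch, not Git's C implementation: the names
<code>CachedStat</code>, <code>snapshot</code> and <code>stat_matches</code> are made up for the
example, and only a subset of the stat fields is tracked.</p></div>

```python
import os
import tempfile
from typing import NamedTuple


class CachedStat(NamedTuple):
    """Subset of lstat(2) fields remembered for one path (illustrative only)."""
    mode: int
    mtime_ns: int
    ctime_ns: int
    size: int


def snapshot(path: str) -> CachedStat:
    """Record stat information, as happens when an index entry is updated."""
    st = os.lstat(path)
    return CachedStat(st.st_mode, st.st_mtime_ns, st.st_ctime_ns, st.st_size)


def stat_matches(cached: CachedStat, path: str) -> bool:
    """Cheap check: True if the file still looks the same as when recorded."""
    return cached == snapshot(path)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "foo")
    with open(path, "w") as f:
        f.write("hello\n")
    cached = snapshot(path)
    unchanged = stat_matches(cached, path)

    with open(path, "a") as f:
        f.write("more\n")  # the size changes, so the stat data must differ
    changed_detected = not stat_matches(cached, path)

print(unchanged, changed_detected)
```

<div class="paragraph"><p>The point of the design is that the mismatch is detected without
reading the file contents at all.</p></div>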
<div class="paragraph"><p>Note: not all members in <code>struct stat</code> obtained via <code>lstat(2)</code>
are used for this comparison. For example, <code>st_atime</code> obviously
is not useful. Currently, Git compares the file type (regular
files vs symbolic links) and executable bits (only for regular
files) from the <code>st_mode</code> member, the <code>st_mtime</code> and <code>st_ctime</code>
timestamps, and the <code>st_uid</code>, <code>st_gid</code>, <code>st_ino</code>, and <code>st_size</code> members.
With the <code>USE_STDEV</code> compile-time option, <code>st_dev</code> is also
compared, but this is not enabled by default because this member
is not stable on network filesystems. With the <code>USE_NSEC</code>
compile-time option, the <code>st_mtim.tv_nsec</code> and <code>st_ctim.tv_nsec</code>
members are also compared. On Linux, this is not enabled by default
because in-core timestamps can have finer granularity than
on-disk timestamps, resulting in meaningless changes when an
inode is evicted from the inode cache. See commit 8ce13b0
of git://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git
([PATCH] Sync in core time granularity with filesystems,
2005-01-04). This patch is included in kernel 2.6.11 and newer, but
only fixes the issue for file systems with exactly 1 ns or 1 s
resolution. Other file systems are still broken in current Linux
kernels (e.g. CEPH, CIFS, NTFS, UDF); see
<a href="https://lkml.org/lkml/2015/6/9/714">https://lkml.org/lkml/2015/6/9/714</a>.</p></div>
</div>
</div>
<div class="sect1">
<h2 id="_racy_git">Racy Git</h2>
<div class="sectionbody">
<div class="paragraph"><p>There is one slight problem with the optimization based on the
cached stat information. Consider this sequence:</p></div>
<div class="literalblock">
<div class="content">
<pre><code>: modify 'foo'
$ git update-index 'foo'
: modify 'foo' again, in-place, without changing its size</code></pre>
</div></div>
<div class="paragraph"><p>The <code>update-index</code> step computes the object name of the
contents of file <code>foo</code> and updates the index entry for <code>foo</code>
along with the <code>struct stat</code> information. If the modification
that follows happens so quickly that the file’s <code>st_mtime</code>
timestamp does not change, then after this sequence the cached stat
information the index entry records still exactly matches what you
would see in the filesystem, even though the file <code>foo</code> is now
different.
This way, Git can incorrectly think files in the working tree
are unmodified even though they actually are modified. This is called
the "racy Git" problem (discovered by Pasky), and the entries
that appear clean when they may not be because of this problem
are called "racily clean".</p></div>
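<div class="paragraph"><p>The race can be imitated deterministically in Python. The sketch below
is an assumption-laden model, not what Git runs: it tracks only mtime and
size (it cannot reset <code>st_ctime</code> from user space), it uses a SHA-256 hash
as a stand-in for the object name, and it forces the old mtime back with
<code>os.utime()</code> to mimic an edit that lands within the same timestamp tick.</p></div>

```python
import hashlib
import os
import tempfile


def stat_key(path):
    """(mtime, size) pair standing in for the cached stat information."""
    st = os.lstat(path)
    return (st.st_mtime_ns, st.st_size)


def content_id(path):
    """Stand-in for the object name: a hash of the file contents."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()


with tempfile.TemporaryDirectory() as tmp:
    foo = os.path.join(tmp, "foo")
    with open(foo, "wb") as f:
        f.write(b"aaaa\n")

    # "git update-index foo": remember object name and stat information.
    entry = {"oid": content_id(foo), "stat": stat_key(foo)}

    # Modify in place with the same size, then put the old mtime back to
    # imitate a modification that is too quick to change the timestamp.
    with open(foo, "r+b") as f:
        f.write(b"bbbb\n")
    old_mtime_ns = entry["stat"][0]
    os.utime(foo, ns=(old_mtime_ns, old_mtime_ns))

    looks_clean = stat_key(foo) == entry["stat"]       # stat-only verdict
    really_clean = content_id(foo) == entry["oid"]     # content verdict

print(looks_clean, really_clean)
```

<div class="paragraph"><p>The stat-only verdict and the content verdict disagree here, which is
exactly the situation a racily clean entry describes.</p></div>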
<div class="paragraph"><p>To avoid this problem, Git does two things:</p></div>
<div class="olist arabic"><ol class="arabic">
<li>
<p>
When the cached stat information says the file has not been
modified, and the <code>st_mtime</code> is the same as (or newer than)
the timestamp of the index file itself (which is the time <code>git
update-index foo</code> finished running in the above example), it
also compares the contents with the object registered in the
index entry to make sure they match.
</p>
</li>
<li>
<p>
When an index file that contains racily clean entries is updated,
the cached <code>st_size</code> information is truncated to zero
before the new version of the index file is written.
</p>
</li>
</ol></div>
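<div class="paragraph"><p>The first of these checks can be sketched with a small in-memory model.
Everything here is hypothetical scaffolding for illustration (the <code>Entry</code>
and <code>WorkFile</code> classes, whole-second integer timestamps, contents kept
in memory); Git itself works on <code>struct stat</code> data and object names.</p></div>

```python
from dataclasses import dataclass


@dataclass
class Entry:
    """Toy index entry: cached mtime and size plus the recorded content."""
    mtime: int
    size: int
    content: bytes


@dataclass
class WorkFile:
    """Toy working tree file."""
    mtime: int
    content: bytes


def stat_clean(entry, wfile):
    """True if the cached stat data matches the file's current stat data."""
    return (entry.mtime, entry.size) == (wfile.mtime, len(wfile.content))


def is_modified(entry, wfile, index_mtime):
    """Distrust a stat match when the entry is as new as the index file."""
    if not stat_clean(entry, wfile):
        return True                       # stat data differs: modified
    if entry.mtime >= index_mtime:        # racily clean: compare contents
        return entry.content != wfile.content
    return False                          # old enough to trust the stat data


index_mtime = 100
entry = Entry(mtime=100, size=4, content=b"aaaa")    # recorded at t=100
racy_edit = WorkFile(mtime=100, content=b"bbbb")     # edited in the same second
old_entry = Entry(mtime=90, size=4, content=b"aaaa")
old_file = WorkFile(mtime=90, content=b"aaaa")

results = (
    stat_clean(entry, racy_edit),
    is_modified(entry, racy_edit, index_mtime),
    is_modified(old_entry, old_file, index_mtime),
)
print(results)
```

<div class="paragraph"><p>In this model the content comparison is paid only for entries whose
timestamp is not older than the index file; older entries are still
answered from the stat data alone.</p></div>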
<div class="paragraph"><p>Because the index file itself is written after collecting all
the stat information from updated paths, the <code>st_mtime</code> timestamp of
the index file is usually the same as or newer than that of any of the
paths the index contains. And no matter how quickly the modification that
follows <code>git update-index foo</code> finishes, the resulting
<code>st_mtime</code> timestamp on <code>foo</code> cannot get a value earlier
than that of the index file. Therefore, index entries that can be
racily clean are limited to the ones that have the same
timestamp as the index file itself.</p></div>
<div class="paragraph"><p>The callers that want to check if an index entry matches the
corresponding file in the working tree continue to call
<code>ce_match_stat()</code>, but with this change, <code>ce_match_stat()</code> uses
<code>ce_modified_check_fs()</code> to see if racily clean ones are
actually clean after comparing the cached stat information using
<code>ce_match_stat_basic()</code>.</p></div>
<div class="paragraph"><p>The problem the second measure solves is illustrated by this sequence:</p></div>
<div class="literalblock">
<div class="content">
<pre><code>$ git update-index 'foo'
: modify 'foo' in-place without changing its size
: wait for enough time
$ git update-index 'bar'</code></pre>
</div></div>
<div class="paragraph"><p>Without the second measure, the timestamp of the index file gets a newer
value, and the falsely clean entry <code>foo</code> would no longer be caught by the
timestamp comparison done by the first measure.
The second measure makes sure that the cached stat information for <code>foo</code>
never matches the file in the working tree, so later
checks by <code>ce_match_stat_basic()</code> report that the index entry
does not match the file, and Git does not have to fall back on the more
expensive <code>ce_modified_check_fs()</code>.</p></div>
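<div class="paragraph"><p>The effect of zeroing the cached size can be walked through with a toy
model of the sequence above. The sketch makes assumptions that are not
spelled out in this section: timestamps are whole seconds, contents are
kept in memory, and only racily clean entries whose contents no longer
match are zeroed, which is how I read the remark in the next section that
entries found to be actually clean are left intact. All names
(<code>Entry</code>, <code>WorkFile</code>, <code>write_index</code>, <code>looks_modified</code>) are
hypothetical.</p></div>

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Entry:
    """Toy index entry: cached mtime and size plus the recorded content."""
    mtime: int
    size: int
    content: bytes


@dataclass(frozen=True)
class WorkFile:
    """Toy working tree file."""
    mtime: int
    content: bytes


def stat_clean(entry, wfile):
    return (entry.mtime, entry.size) == (wfile.mtime, len(wfile.content))


def looks_modified(entry, wfile, index_mtime):
    """First measure: content check only for entries as new as the index."""
    if not stat_clean(entry, wfile):
        return True
    if entry.mtime >= index_mtime:
        return entry.content != wfile.content
    return False


def write_index(index, tree, old_index_mtime, smudge=True):
    """Second measure: zero the cached size of racily clean entries that
    turn out to be modified, before the new index is written."""
    out = {}
    for path, entry in index.items():
        wfile = tree[path]
        racy = stat_clean(entry, wfile) and entry.mtime >= old_index_mtime
        if smudge and racy and entry.content != wfile.content:
            entry = replace(entry, size=0)
        out[path] = entry
    return out


# t=100: "git update-index foo", then foo is edited in place at t=100.
index = {"foo": Entry(mtime=100, size=4, content=b"aaaa")}
tree = {"foo": WorkFile(mtime=100, content=b"bbbb")}

# t=200: "git update-index bar" rewrites the index; its timestamp becomes 200.
with_fix = write_index(index, tree, old_index_mtime=100)
without_fix = write_index(index, tree, old_index_mtime=100, smudge=False)

caught = looks_modified(with_fix["foo"], tree["foo"], index_mtime=200)
missed = not looks_modified(without_fix["foo"], tree["foo"], index_mtime=200)
print(caught, missed)
```

<div class="paragraph"><p>With the size zeroed, the plain stat comparison already reports a
mismatch after the index timestamp has moved on; without it, the modified
<code>foo</code> slips through in this model.</p></div>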
</div>
</div>
<div class="sect1">
<h2 id="_runtime_penalty">Runtime penalty</h2>
<div class="sectionbody">
<div class="paragraph"><p>Falling back to <code>ce_modified_check_fs()</code>
from <code>ce_match_stat()</code> can be very expensive when there are many
racily clean entries. An obvious way to artificially create
this situation is to give the same timestamp to all the files in
the working tree in a large project, run <code>git update-index</code> on
them, and give the same timestamp to the index file:</p></div>
<div class="literalblock">
<div class="content">
<pre><code>$ date >.datestamp
$ git ls-files | xargs touch -r .datestamp
$ git ls-files | git update-index --stdin
$ touch -r .datestamp .git/index</code></pre>
</div></div>
<div class="paragraph"><p>This will make all index entries racily clean. In the Linux kernel project, for
example, there are over 20,000 files in the working tree. On my
Athlon 64 X2 3800+, after the above:</p></div>
<div class="literalblock">
<div class="content">
<pre><code>$ /usr/bin/time git diff-files
1.68user 0.54system 0:02.22elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+67111minor)pagefaults 0swaps
$ git update-index MAINTAINERS
$ /usr/bin/time git diff-files
0.02user 0.12system 0:00.14elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+935minor)pagefaults 0swaps</code></pre>
</div></div>
<div class="paragraph"><p>Running <code>git update-index</code> in the middle checked the racily
clean entries, and left the cached <code>st_mtime</code> for all the paths
intact because they were actually clean (so this step took about
the same amount of time as the first <code>git diff-files</code>). After
that, they are not racily clean anymore but are truly clean, so
the second invocation of <code>git diff-files</code> fully took advantage
of the cached stat information.</p></div>
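<div class="paragraph"><p>The difference between the two runs can be pictured by counting how
many entries need a content comparison. This is a back-of-the-envelope
model rather than a measurement: the function name, the integer
timestamps and the round figure of 20,000 entries are assumptions taken
from the description above.</p></div>

```python
def content_checks_needed(entry_mtimes, index_mtime):
    """Count entries whose stat match cannot be trusted (mtime >= index)."""
    return sum(1 for mtime in entry_mtimes if mtime >= index_mtime)


stamp = 1_000                   # shared timestamp taken from ".datestamp"
entries = [stamp] * 20_000      # every path was touched to the same time

# First "git diff-files": the index carries the same timestamp as the files.
first_run = content_checks_needed(entries, index_mtime=stamp)

# "git update-index MAINTAINERS" rewrites the index later, so its timestamp
# moves past every cached st_mtime before the second "git diff-files".
second_run = content_checks_needed(entries, index_mtime=stamp + 60)

print(first_run, second_run)
```

<div class="paragraph"><p>Every entry needs a content comparison in the first run and none in
the second, which matches the drop in the timings shown above.</p></div>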
</div>
</div>
<div class="sect1">
<h2 id="_avoiding_runtime_penalty">Avoiding runtime penalty</h2>
<div class="sectionbody">
<div class="paragraph"><p>To avoid the above runtime penalty, Git after 1.4.2 used
to have code that waited before finishing writing the index file out,
so that the index file got a timestamp newer than the youngest files
in the index. It did so when many young files had the same timestamp as
the resulting index file would otherwise have had.</p></div>
<div class="paragraph"><p>I suspected that in practice the situation where many paths in the
index are all racily clean was quite rare. The only code paths
that can record a recent timestamp for a large number of paths are:</p></div>
<div class="olist arabic"><ol class="arabic">
<li>
<p>
Initial <code>git add .</code> of a large project.
</p>
</li>
<li>
<p>
<code>git checkout</code> of a large project from an empty index into an
unpopulated working tree.
</p>
</li>
</ol></div>
<div class="paragraph"><p>Note: switching branches with <code>git checkout</code> keeps the cached
stat information of existing working tree files that are the
same between the current branch and the new branch, which are
all older than the resulting index file, and they will not
become racily clean. Only the files that are actually checked
out can become racily clean.</p></div>
<div class="paragraph"><p>In a large project where raciness avoidance cost really matters,
however, the initial computation of all object names in the
index takes more than one second, and the index file is written
out after all that happens. Therefore the timestamp of the
index file will be more than one second later than that of the
youngest file in the working tree. This means that in these
cases there actually will not be any racily clean entries in
the resulting index.</p></div>
<div class="paragraph"><p>Based on this discussion, the current code no longer uses the
"workaround" to avoid a runtime penalty that does not exist in
practice. The workaround was removed with commit 0fc82cff on Aug 15,
2006.</p></div>
</div>
</div>
</div>
<div id="footnotes"><hr /></div>
<div id="footer">
<div id="footer-text">
Last updated 2015-07-13 14:44:41 PDT
</div>
</div>
</body>
</html>