<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
 "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
<head>
<meta http-equiv="Content-Type" content="application/xhtml+xml; charset=UTF-8" />
<meta name="generator" content="AsciiDoc 10.2.0" />
<title>Git hash function transition</title>
9<style type="text/css">
10/* Shared CSS for AsciiDoc xhtml11 and html5 backends */
11
12/* Default font. */
13body {
14 font-family: Georgia,serif;
15}
16
17/* Title font. */
18h1, h2, h3, h4, h5, h6,
19div.title, caption.title,
20thead, p.table.header,
21#toctitle,
22#author, #revnumber, #revdate, #revremark,
23#footer {
24 font-family: Arial,Helvetica,sans-serif;
25}
26
27body {
28 margin: 1em 5% 1em 5%;
29}
30
31a {
32 color: blue;
33 text-decoration: underline;
34}
35a:visited {
36 color: fuchsia;
37}
38
39em {
40 font-style: italic;
41 color: navy;
42}
43
44strong {
45 font-weight: bold;
46 color: #083194;
47}
48
49h1, h2, h3, h4, h5, h6 {
50 color: #527bbd;
51 margin-top: 1.2em;
52 margin-bottom: 0.5em;
53 line-height: 1.3;
54}
55
56h1, h2, h3 {
57 border-bottom: 2px solid silver;
58}
59h2 {
60 padding-top: 0.5em;
61}
62h3 {
63 float: left;
64}
65h3 + * {
66 clear: left;
67}
68h5 {
69 font-size: 1.0em;
70}
71
72div.sectionbody {
73 margin-left: 0;
74}
75
76hr {
77 border: 1px solid silver;
78}
79
80p {
81 margin-top: 0.5em;
82 margin-bottom: 0.5em;
83}
84
85ul, ol, li > p {
86 margin-top: 0;
87}
88ul > li { color: #aaa; }
89ul > li > * { color: black; }
90
91.monospaced, code, pre {
92 font-family: "Courier New", Courier, monospace;
93 font-size: inherit;
94 color: navy;
95 padding: 0;
96 margin: 0;
97}
98pre {
99 white-space: pre-wrap;
100}
101
102#author {
103 color: #527bbd;
104 font-weight: bold;
105 font-size: 1.1em;
106}
107#email {
108}
109#revnumber, #revdate, #revremark {
110}
111
112#footer {
113 font-size: small;
114 border-top: 2px solid silver;
115 padding-top: 0.5em;
116 margin-top: 4.0em;
117}
118#footer-text {
119 float: left;
120 padding-bottom: 0.5em;
121}
122#footer-badges {
123 float: right;
124 padding-bottom: 0.5em;
125}
126
127#preamble {
128 margin-top: 1.5em;
129 margin-bottom: 1.5em;
130}
131div.imageblock, div.exampleblock, div.verseblock,
132div.quoteblock, div.literalblock, div.listingblock, div.sidebarblock,
133div.admonitionblock {
134 margin-top: 1.0em;
135 margin-bottom: 1.5em;
136}
137div.admonitionblock {
138 margin-top: 2.0em;
139 margin-bottom: 2.0em;
140 margin-right: 10%;
141 color: #606060;
142}
143
144div.content { /* Block element content. */
145 padding: 0;
146}
147
148/* Block element titles. */
149div.title, caption.title {
150 color: #527bbd;
151 font-weight: bold;
152 text-align: left;
153 margin-top: 1.0em;
154 margin-bottom: 0.5em;
155}
156div.title + * {
157 margin-top: 0;
158}
159
160td div.title:first-child {
161 margin-top: 0.0em;
162}
163div.content div.title:first-child {
164 margin-top: 0.0em;
165}
166div.content + div.title {
167 margin-top: 0.0em;
168}
169
170div.sidebarblock > div.content {
171 background: #ffffee;
172 border: 1px solid #dddddd;
173 border-left: 4px solid #f0f0f0;
174 padding: 0.5em;
175}
176
177div.listingblock > div.content {
178 border: 1px solid #dddddd;
179 border-left: 5px solid #f0f0f0;
180 background: #f8f8f8;
181 padding: 0.5em;
182}
183
184div.quoteblock, div.verseblock {
185 padding-left: 1.0em;
186 margin-left: 1.0em;
187 margin-right: 10%;
188 border-left: 5px solid #f0f0f0;
189 color: #888;
190}
191
192div.quoteblock > div.attribution {
193 padding-top: 0.5em;
194 text-align: right;
195}
196
197div.verseblock > pre.content {
198 font-family: inherit;
199 font-size: inherit;
200}
201div.verseblock > div.attribution {
202 padding-top: 0.75em;
203 text-align: left;
204}
205/* DEPRECATED: Pre version 8.2.7 verse style literal block. */
206div.verseblock + div.attribution {
207 text-align: left;
208}
209
210div.admonitionblock .icon {
211 vertical-align: top;
212 font-size: 1.1em;
213 font-weight: bold;
214 text-decoration: underline;
215 color: #527bbd;
216 padding-right: 0.5em;
217}
218div.admonitionblock td.content {
219 padding-left: 0.5em;
220 border-left: 3px solid #dddddd;
221}
222
223div.exampleblock > div.content {
224 border-left: 3px solid #dddddd;
225 padding-left: 0.5em;
226}
227
228div.imageblock div.content { padding-left: 0; }
229span.image img { border-style: none; vertical-align: text-bottom; }
230a.image:visited { color: white; }
231
232dl {
233 margin-top: 0.8em;
234 margin-bottom: 0.8em;
235}
236dt {
237 margin-top: 0.5em;
238 margin-bottom: 0;
239 font-style: normal;
240 color: navy;
241}
242dd > *:first-child {
243 margin-top: 0.1em;
244}
245
246ul, ol {
247 list-style-position: outside;
248}
249ol.arabic {
250 list-style-type: decimal;
251}
252ol.loweralpha {
253 list-style-type: lower-alpha;
254}
255ol.upperalpha {
256 list-style-type: upper-alpha;
257}
258ol.lowerroman {
259 list-style-type: lower-roman;
260}
261ol.upperroman {
262 list-style-type: upper-roman;
263}
264
265div.compact ul, div.compact ol,
266div.compact p, div.compact p,
267div.compact div, div.compact div {
268 margin-top: 0.1em;
269 margin-bottom: 0.1em;
270}
271
272tfoot {
273 font-weight: bold;
274}
275td > div.verse {
276 white-space: pre;
277}
278
279div.hdlist {
280 margin-top: 0.8em;
281 margin-bottom: 0.8em;
282}
283div.hdlist tr {
284 padding-bottom: 15px;
285}
286dt.hdlist1.strong, td.hdlist1.strong {
287 font-weight: bold;
288}
289td.hdlist1 {
290 vertical-align: top;
291 font-style: normal;
292 padding-right: 0.8em;
293 color: navy;
294}
295td.hdlist2 {
296 vertical-align: top;
297}
298div.hdlist.compact tr {
299 margin: 0;
300 padding-bottom: 0;
301}
302
303.comment {
304 background: yellow;
305}
306
307.footnote, .footnoteref {
308 font-size: 0.8em;
309}
310
311span.footnote, span.footnoteref {
312 vertical-align: super;
313}
314
315#footnotes {
316 margin: 20px 0 20px 0;
317 padding: 7px 0 0 0;
318}
319
320#footnotes div.footnote {
321 margin: 0 0 5px 0;
322}
323
324#footnotes hr {
325 border: none;
326 border-top: 1px solid silver;
327 height: 1px;
328 text-align: left;
329 margin-left: 0;
330 width: 20%;
331 min-width: 100px;
332}
333
334div.colist td {
335 padding-right: 0.5em;
336 padding-bottom: 0.3em;
337 vertical-align: top;
338}
339div.colist td img {
340 margin-top: 0.3em;
341}
342
343@media print {
344 #footer-badges { display: none; }
345}
346
347#toc {
348 margin-bottom: 2.5em;
349}
350
351#toctitle {
352 color: #527bbd;
353 font-size: 1.1em;
354 font-weight: bold;
355 margin-top: 1.0em;
356 margin-bottom: 0.1em;
357}
358
359div.toclevel0, div.toclevel1, div.toclevel2, div.toclevel3, div.toclevel4 {
360 margin-top: 0;
361 margin-bottom: 0;
362}
363div.toclevel2 {
364 margin-left: 2em;
365 font-size: 0.9em;
366}
367div.toclevel3 {
368 margin-left: 4em;
369 font-size: 0.9em;
370}
371div.toclevel4 {
372 margin-left: 6em;
373 font-size: 0.9em;
374}
375
376span.aqua { color: aqua; }
377span.black { color: black; }
378span.blue { color: blue; }
379span.fuchsia { color: fuchsia; }
380span.gray { color: gray; }
381span.green { color: green; }
382span.lime { color: lime; }
383span.maroon { color: maroon; }
384span.navy { color: navy; }
385span.olive { color: olive; }
386span.purple { color: purple; }
387span.red { color: red; }
388span.silver { color: silver; }
389span.teal { color: teal; }
390span.white { color: white; }
391span.yellow { color: yellow; }
392
393span.aqua-background { background: aqua; }
394span.black-background { background: black; }
395span.blue-background { background: blue; }
396span.fuchsia-background { background: fuchsia; }
397span.gray-background { background: gray; }
398span.green-background { background: green; }
399span.lime-background { background: lime; }
400span.maroon-background { background: maroon; }
401span.navy-background { background: navy; }
402span.olive-background { background: olive; }
403span.purple-background { background: purple; }
404span.red-background { background: red; }
405span.silver-background { background: silver; }
406span.teal-background { background: teal; }
407span.white-background { background: white; }
408span.yellow-background { background: yellow; }
409
410span.big { font-size: 2em; }
411span.small { font-size: 0.6em; }
412
413span.underline { text-decoration: underline; }
414span.overline { text-decoration: overline; }
415span.line-through { text-decoration: line-through; }
416
417div.unbreakable { page-break-inside: avoid; }
418
419
420/*
421 * xhtml11 specific
422 *
423 * */
424
425div.tableblock {
426 margin-top: 1.0em;
427 margin-bottom: 1.5em;
428}
429div.tableblock > table {
430 border: 3px solid #527bbd;
431}
432thead, p.table.header {
433 font-weight: bold;
434 color: #527bbd;
435}
436p.table {
437 margin-top: 0;
438}
Junio C Hamano725b0da2020-01-22 22:02:40439/* Because the table frame attribute is overridden by CSS in most browsers. */
Junio C Hamano1171ab42017-10-11 06:33:37440div.tableblock > table[frame="void"] {
441 border-style: none;
442}
443div.tableblock > table[frame="hsides"] {
444 border-left-style: none;
445 border-right-style: none;
446}
447div.tableblock > table[frame="vsides"] {
448 border-top-style: none;
449 border-bottom-style: none;
450}
451
452
453/*
454 * html5 specific
455 *
456 * */
457
458table.tableblock {
459 margin-top: 1.0em;
460 margin-bottom: 1.5em;
461}
462thead, p.tableblock.header {
463 font-weight: bold;
464 color: #527bbd;
465}
466p.tableblock {
467 margin-top: 0;
468}
469table.tableblock {
470 border-width: 3px;
471 border-spacing: 0px;
472 border-style: solid;
473 border-color: #527bbd;
474 border-collapse: collapse;
475}
476th.tableblock, td.tableblock {
477 border-width: 1px;
478 padding: 4px;
479 border-style: solid;
480 border-color: #527bbd;
481}
482
483table.tableblock.frame-topbot {
484 border-left-style: hidden;
485 border-right-style: hidden;
486}
487table.tableblock.frame-sides {
488 border-top-style: hidden;
489 border-bottom-style: hidden;
490}
491table.tableblock.frame-none {
492 border-style: hidden;
493}
494
495th.tableblock.halign-left, td.tableblock.halign-left {
496 text-align: left;
497}
498th.tableblock.halign-center, td.tableblock.halign-center {
499 text-align: center;
500}
501th.tableblock.halign-right, td.tableblock.halign-right {
502 text-align: right;
503}
504
505th.tableblock.valign-top, td.tableblock.valign-top {
506 vertical-align: top;
507}
508th.tableblock.valign-middle, td.tableblock.valign-middle {
509 vertical-align: middle;
510}
511th.tableblock.valign-bottom, td.tableblock.valign-bottom {
512 vertical-align: bottom;
513}
514
515
516/*
517 * manpage specific
518 *
519 * */
520
521body.manpage h1 {
522 padding-top: 0.5em;
523 padding-bottom: 0.5em;
524 border-top: 2px solid silver;
525 border-bottom: 2px solid silver;
526}
527body.manpage h2 {
528 border-style: none;
529}
530body.manpage div.sectionbody {
531 margin-left: 3em;
532}
533
534@media print {
535 body.manpage div#toc { display: none; }
536}
537
538
539</style>
540<script type="text/javascript">
541/*<![CDATA[*/
Junio C Hamano2b153182021-12-15 21:00:31542var asciidoc = { // Namespace.
543
544/////////////////////////////////////////////////////////////////////
545// Table Of Contents generator
546/////////////////////////////////////////////////////////////////////
547
548/* Author: Mihai Bazon, September 2002
549 * http://students.infoiasi.ro/~mishoo
550 *
551 * Table Of Content generator
552 * Version: 0.4
553 *
554 * Feel free to use this script under the terms of the GNU General Public
555 * License, as long as you do not remove or alter this notice.
556 */
557
558 /* modified by Troy D. Hanson, September 2006. License: GPL */
559 /* modified by Stuart Rackham, 2006, 2009. License: GPL */
560
561// toclevels = 1..4.
562toc: function (toclevels) {
563
564 function getText(el) {
565 var text = "";
566 for (var i = el.firstChild; i != null; i = i.nextSibling) {
567 if (i.nodeType == 3 /* Node.TEXT_NODE */) // IE doesn't speak constants.
568 text += i.data;
569 else if (i.firstChild != null)
570 text += getText(i);
571 }
572 return text;
573 }
574
575 function TocEntry(el, text, toclevel) {
576 this.element = el;
577 this.text = text;
578 this.toclevel = toclevel;
579 }
580
581 function tocEntries(el, toclevels) {
582 var result = new Array;
583 var re = new RegExp('[hH]([1-'+(toclevels+1)+'])');
584 // Function that scans the DOM tree for header elements (the DOM2
585 // nodeIterator API would be a better technique but not supported by all
586 // browsers).
587 var iterate = function (el) {
588 for (var i = el.firstChild; i != null; i = i.nextSibling) {
589 if (i.nodeType == 1 /* Node.ELEMENT_NODE */) {
590 var mo = re.exec(i.tagName);
591 if (mo && (i.getAttribute("class") || i.getAttribute("className")) != "float") {
592 result[result.length] = new TocEntry(i, getText(i), mo[1]-1);
593 }
594 iterate(i);
595 }
596 }
597 }
598 iterate(el);
599 return result;
600 }
601
602 var toc = document.getElementById("toc");
603 if (!toc) {
604 return;
605 }
606
607 // Delete existing TOC entries in case we're reloading the TOC.
608 var tocEntriesToRemove = [];
609 var i;
610 for (i = 0; i < toc.childNodes.length; i++) {
611 var entry = toc.childNodes[i];
612 if (entry.nodeName.toLowerCase() == 'div'
613 && entry.getAttribute("class")
614 && entry.getAttribute("class").match(/^toclevel/))
615 tocEntriesToRemove.push(entry);
616 }
617 for (i = 0; i < tocEntriesToRemove.length; i++) {
618 toc.removeChild(tocEntriesToRemove[i]);
619 }
620
621 // Rebuild TOC entries.
622 var entries = tocEntries(document.getElementById("content"), toclevels);
623 for (var i = 0; i < entries.length; ++i) {
624 var entry = entries[i];
625 if (entry.element.id == "")
626 entry.element.id = "_toc_" + i;
627 var a = document.createElement("a");
628 a.href = "#" + entry.element.id;
629 a.appendChild(document.createTextNode(entry.text));
630 var div = document.createElement("div");
631 div.appendChild(a);
632 div.className = "toclevel" + entry.toclevel;
633 toc.appendChild(div);
634 }
635 if (entries.length == 0)
636 toc.parentNode.removeChild(toc);
637},
638
639
640/////////////////////////////////////////////////////////////////////
641// Footnotes generator
642/////////////////////////////////////////////////////////////////////
643
644/* Based on footnote generation code from:
645 * http://www.brandspankingnew.net/archive/2005/07/format_footnote.html
646 */
647
648footnotes: function () {
 // Delete existing footnote entries in case we're reloading the footnotes.
650 var i;
651 var noteholder = document.getElementById("footnotes");
652 if (!noteholder) {
653 return;
654 }
655 var entriesToRemove = [];
656 for (i = 0; i < noteholder.childNodes.length; i++) {
657 var entry = noteholder.childNodes[i];
658 if (entry.nodeName.toLowerCase() == 'div' && entry.getAttribute("class") == "footnote")
659 entriesToRemove.push(entry);
660 }
661 for (i = 0; i < entriesToRemove.length; i++) {
662 noteholder.removeChild(entriesToRemove[i]);
663 }
664
665 // Rebuild footnote entries.
666 var cont = document.getElementById("content");
667 var spans = cont.getElementsByTagName("span");
668 var refs = {};
669 var n = 0;
670 for (i=0; i<spans.length; i++) {
671 if (spans[i].className == "footnote") {
672 n++;
673 var note = spans[i].getAttribute("data-note");
674 if (!note) {
675 // Use [\s\S] in place of . so multi-line matches work.
676 // Because JavaScript has no s (dotall) regex flag.
677 note = spans[i].innerHTML.match(/\s*\[([\s\S]*)]\s*/)[1];
678 spans[i].innerHTML =
679 "[<a id='_footnoteref_" + n + "' href='#_footnote_" + n +
680 "' title='View footnote' class='footnote'>" + n + "</a>]";
681 spans[i].setAttribute("data-note", note);
682 }
683 noteholder.innerHTML +=
684 "<div class='footnote' id='_footnote_" + n + "'>" +
685 "<a href='#_footnoteref_" + n + "' title='Return to text'>" +
686 n + "</a>. " + note + "</div>";
687 var id =spans[i].getAttribute("id");
688 if (id != null) refs["#"+id] = n;
689 }
690 }
691 if (n == 0)
692 noteholder.parentNode.removeChild(noteholder);
693 else {
694 // Process footnoterefs.
695 for (i=0; i<spans.length; i++) {
696 if (spans[i].className == "footnoteref") {
697 var href = spans[i].getElementsByTagName("a")[0].getAttribute("href");
698 href = href.match(/#.*/)[0]; // Because IE return full URL.
699 n = refs[href];
700 spans[i].innerHTML =
701 "[<a href='#_footnote_" + n +
702 "' title='View footnote' class='footnote'>" + n + "</a>]";
703 }
704 }
705 }
706},
707
708install: function(toclevels) {
709 var timerId;
710
711 function reinstall() {
712 asciidoc.footnotes();
713 if (toclevels) {
714 asciidoc.toc(toclevels);
715 }
716 }
717
718 function reinstallAndRemoveTimer() {
719 clearInterval(timerId);
720 reinstall();
721 }
722
723 timerId = setInterval(reinstall, 500);
724 if (document.addEventListener)
725 document.addEventListener("DOMContentLoaded", reinstallAndRemoveTimer, false);
726 else
727 window.onload = reinstallAndRemoveTimer;
728}
729
730}
Junio C Hamano1171ab42017-10-11 06:33:37731asciidoc.install();
732/*]]>*/
733</script>
734</head>
735<body class="article">
736<div id="header">
737<h1>Git hash function transition</h1>
Junio C Hamanoc2015e32024-02-06 23:02:59738<span id="revdate">2024-02-06</span>
Junio C Hamano1171ab42017-10-11 06:33:37739</div>
740<div id="content">
741<div class="sect1">
742<h2 id="_objective">Objective</h2>
743<div class="sectionbody">
744<div class="paragraph"><p>Migrate Git from SHA-1 to a stronger hash function.</p></div>
745</div>
746</div>
747<div class="sect1">
748<h2 id="_background">Background</h2>
749<div class="sectionbody">
<div class="paragraph"><p>At its core, the Git version control system is a content-addressable
filesystem. It uses the SHA-1 hash function to name content. For
example, files, directories, and revisions are referred to by hash
values, unlike in traditional version control systems, where files
or versions are referred to via sequential numbers. The use of a hash
function to address its content delivers a few advantages:</p></div>
756<div class="ulist"><ul>
757<li>
758<p>
759Integrity checking is easy. Bit flips, for example, are easily
760 detected, as the hash of corrupted content does not match its name.
761</p>
762</li>
763<li>
764<p>
765Lookup of objects is fast.
766</p>
767</li>
768</ul></div>
769<div class="paragraph"><p>Using a cryptographically secure hash function brings additional
770advantages:</p></div>
771<div class="ulist"><ul>
772<li>
773<p>
774Object names can be signed and third parties can trust the hash to
775 address the signed object and all objects it references.
776</p>
777</li>
778<li>
779<p>
Communication using the Git protocol and out-of-band communication
 methods have a short, reliable string that can be used to
 address stored content.
783</p>
784</li>
785</ul></div>
<div class="paragraph"><p>Over time, security researchers have discovered some flaws in SHA-1.
On 23 February 2017 the SHAttered attack
(<a href="https://shattered.io">https://shattered.io</a>) demonstrated a practical SHA-1 hash collision.</p></div>
<div class="paragraph"><p>Git v2.13.0 and later subsequently moved to a hardened SHA-1
implementation by default, which isn&#8217;t vulnerable to the SHAttered
attack, but SHA-1 is still weak.</p></div>
<div class="paragraph"><p>Thus it&#8217;s considered prudent to move past any variant of SHA-1
to a new hash. There&#8217;s no guarantee that further attacks on SHA-1 won&#8217;t
be published, and those attacks may not have viable
mitigations.</p></div>
796<div class="paragraph"><p>If SHA-1 and its variants were to be truly broken, Git&#8217;s hash function
797could not be considered cryptographically secure any more. This would
798impact the communication of hash values because we could not trust
799that a given hash value represented the known good version of content
800that the speaker intended.</p></div>
<div class="paragraph"><p>SHA-1 still possesses the other properties such as fast object lookup
and safe error checking, but other hash functions that are believed to be
cryptographically secure are equally suitable.</p></div>
804</div>
805</div>
806<div class="sect1">
Junio C Hamanoa70c9882021-02-23 00:57:12807<h2 id="_choice_of_hash">Choice of Hash</h2>
808<div class="sectionbody">
809<div class="paragraph"><p>The hash to replace the hardened SHA-1 should be stronger than SHA-1
810was: we would like it to be trustworthy and useful in practice for at
811least 10 years.</p></div>
812<div class="paragraph"><p>Some other relevant properties:</p></div>
813<div class="olist arabic"><ol class="arabic">
814<li>
815<p>
A 256-bit hash (long enough to match common security practice; not
 so long as to hurt performance and disk usage).
818</p>
819</li>
820<li>
821<p>
822High quality implementations should be widely available (e.g., in
823 OpenSSL and Apple CommonCrypto).
824</p>
825</li>
826<li>
827<p>
828The hash function&#8217;s properties should match Git&#8217;s needs (e.g. Git
829 requires collision and 2nd preimage resistance and does not require
830 length extension resistance).
831</p>
832</li>
833<li>
834<p>
835As a tiebreaker, the hash should be fast to compute (fortunately
836 many contenders are faster than SHA-1).
837</p>
838</li>
839</ol></div>
840<div class="paragraph"><p>There were several contenders for a successor hash to SHA-1, including
841SHA-256, SHA-512/256, SHA-256x16, K12, and BLAKE2bp-256.</p></div>
842<div class="paragraph"><p>In late 2018 the project picked SHA-256 as its successor hash.</p></div>
843<div class="paragraph"><p>See 0ed8d8da374 (doc hash-function-transition: pick SHA-256 as
844NewHash, 2018-08-04) and numerous mailing list threads at the time,
845particularly the one starting at
846<a href="https://lore.kernel.org/git/20180609224913.GC38834@genre.crustytoothpaste.net/">https://lore.kernel.org/git/20180609224913.GC38834@genre.crustytoothpaste.net/</a>
847for more information.</p></div>
848</div>
849</div>
850<div class="sect1">
Junio C Hamano1171ab42017-10-11 06:33:37851<h2 id="_goals">Goals</h2>
852<div class="sectionbody">
Junio C Hamano1171ab42017-10-11 06:33:37853<div class="olist arabic"><ol class="arabic">
854<li>
855<p>
Junio C Hamano11ae3202018-08-20 20:15:42856The transition to SHA-256 can be done one local repository at a time.
Junio C Hamano1171ab42017-10-11 06:33:37857</p>
858<div class="olist loweralpha"><ol class="loweralpha">
859<li>
860<p>
861Requiring no action by any other party.
862</p>
863</li>
864<li>
865<p>
Junio C Hamano11ae3202018-08-20 20:15:42866A SHA-256 repository can communicate with SHA-1 Git servers
Junio C Hamano1171ab42017-10-11 06:33:37867 (push/fetch).
868</p>
869</li>
870<li>
871<p>
Junio C Hamano11ae3202018-08-20 20:15:42872Users can use SHA-1 and SHA-256 identifiers for objects
Junio C Hamano1171ab42017-10-11 06:33:37873 interchangeably (see "Object names on the command line", below).
874</p>
875</li>
876<li>
877<p>
878New signed objects make use of a stronger hash function than
879 SHA-1 for their security guarantees.
880</p>
881</li>
882</ol></div>
883</li>
884<li>
885<p>
886Allow a complete transition away from SHA-1.
887</p>
888<div class="olist loweralpha"><ol class="loweralpha">
889<li>
890<p>
891Local metadata for SHA-1 compatibility can be removed from a
892 repository if compatibility with SHA-1 is no longer needed.
893</p>
894</li>
895</ol></div>
896</li>
897<li>
898<p>
899Maintainability throughout the process.
900</p>
901<div class="olist loweralpha"><ol class="loweralpha">
902<li>
903<p>
904The object format is kept simple and consistent.
905</p>
906</li>
907<li>
908<p>
909Creation of a generalized repository conversion tool.
910</p>
911</li>
912</ol></div>
913</li>
914</ol></div>
915</div>
916</div>
917<div class="sect1">
918<h2 id="_non_goals">Non-Goals</h2>
919<div class="sectionbody">
920<div class="olist arabic"><ol class="arabic">
921<li>
922<p>
Adding SHA-256 support to the Git protocol. This is valuable and the
 logical next step, but it is out of scope for this initial design.
925</p>
926</li>
927<li>
928<p>
929Transparently improving the security of existing SHA-1 signed
930 objects.
931</p>
932</li>
933<li>
934<p>
935Intermixing objects using multiple hash functions in a single
936 repository.
937</p>
938</li>
939<li>
940<p>
941Taking the opportunity to fix other bugs in Git&#8217;s formats and
942 protocols.
943</p>
944</li>
945<li>
946<p>
Shallow clones and fetches into a SHA-256 repository. (This will
 change when we add SHA-256 support to the Git protocol.)
Junio C Hamano1171ab42017-10-11 06:33:37949</p>
950</li>
951<li>
952<p>
Skipping the fetch of some submodules of a project into a SHA-256
 repository. (This also depends on SHA-256 support in the Git
 protocol.)
956</p>
957</li>
958</ol></div>
959</div>
960</div>
961<div class="sect1">
962<h2 id="_overview">Overview</h2>
963<div class="sectionbody">
<div class="paragraph"><p>We introduce a new repository format extension. Repositories with this
extension enabled use SHA-256 instead of SHA-1 to name their objects.
This affects both object names and object content: the names
of objects and all references to other objects within an object are
switched to the new hash function.</p></div>
Junio C Hamano11ae3202018-08-20 20:15:42969<div class="paragraph"><p>SHA-256 repositories cannot be read by older versions of Git.</p></div>
<div class="paragraph"><p>Alongside the packfile, a SHA-256 repository stores a bidirectional
mapping between SHA-256 and SHA-1 object names. The mapping is generated
locally and can be verified using "git fsck". Object lookups use this
mapping to allow naming objects using either their SHA-1 or SHA-256 names
interchangeably.</p></div>
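<div class="paragraph"><p>As a rough illustration of such a lookup, the following Python sketch
keeps the mapping in two dictionaries. The class <code>ObjectNameMap</code> and its
methods are hypothetical names chosen for this example; they only model the
idea of resolving either kind of name and do not reflect how Git stores the
mapping on disk.</p></div>

```python
class ObjectNameMap:
    """Toy in-memory stand-in for the bidirectional SHA-1/SHA-256 name mapping."""

    def __init__(self):
        self._sha1_to_sha256 = {}
        self._sha256_to_sha1 = {}

    def add(self, sha1_name, sha256_name):
        # SHA-1 names are 40 hex digits, SHA-256 names are 64 hex digits.
        if len(sha1_name) != 40 or len(sha256_name) != 64:
            raise ValueError("unexpected object name length")
        self._sha1_to_sha256[sha1_name] = sha256_name
        self._sha256_to_sha1[sha256_name] = sha1_name

    def to_sha256(self, name):
        """Return the SHA-256 name for an object given either of its names."""
        if len(name) == 64:
            return name
        return self._sha1_to_sha256[name]

    def to_sha1(self, name):
        """Return the SHA-1 name for an object given either of its names."""
        if len(name) == 40:
            return name
        return self._sha256_to_sha1[name]
```

<div class="paragraph"><p>Because the two name lengths differ, the sketch can tell from the
length alone which direction of the mapping to consult. Abbreviated names
would need extra handling that is left out here.</p></div>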
975<div class="paragraph"><p>"git cat-file" and "git hash-object" gain options to display an object
Junio C Hamanoa70c9882021-02-23 00:57:12976in its SHA-1 form and write an object given its SHA-1 form. This
Junio C Hamano1171ab42017-10-11 06:33:37977requires all objects referenced by that object to be present in the
978object database so that they can be named using the appropriate name
979(using the bidirectional hash mapping).</p></div>
980<div class="paragraph"><p>Fetches from a SHA-1 based server convert the fetched objects into
Junio C Hamano11ae3202018-08-20 20:15:42981SHA-256 form and record the mapping in the bidirectional mapping table
Junio C Hamano1171ab42017-10-11 06:33:37982(see below for details). Pushes to a SHA-1 based server convert the
Junio C Hamanoa70c9882021-02-23 00:57:12983objects being pushed into SHA-1 form so the server does not have to be
Junio C Hamano1171ab42017-10-11 06:33:37984aware of the hash function the client is using.</p></div>
985</div>
986</div>
987<div class="sect1">
988<h2 id="_detailed_design">Detailed Design</h2>
989<div class="sectionbody">
990<div class="sect2">
991<h3 id="_repository_format_extension">Repository format extension</h3>
Junio C Hamano11ae3202018-08-20 20:15:42992<div class="paragraph"><p>A SHA-256 repository uses repository format version <code>1</code> (see
Junio C Hamano1171ab42017-10-11 06:33:37993Documentation/technical/repository-version.txt) with extensions
994<code>objectFormat</code> and <code>compatObjectFormat</code>:</p></div>
995<div class="literalblock">
996<div class="content">
<pre><code>[core]
 repositoryFormatVersion = 1
[extensions]
 objectFormat = sha256
 compatObjectFormat = sha1</code></pre>
1002</div></div>
<div class="paragraph"><p>The combination of setting <code>core.repositoryFormatVersion=1</code> and
populating <code>extensions.*</code> ensures that all versions of Git later than
<code>v0.99.9l</code> will die with an error message instead of trying to operate
on the SHA-256 repository.</p></div>
Junio C Hamano1171ab42017-10-11 06:33:371007<div class="literalblock">
1008<div class="content">
<pre><code># Between v0.99.9l and v2.7.0
$ git status
fatal: Expected git repo version &lt;= 0, found 1
# After v2.7.0
$ git status
fatal: unknown repository extensions found:
 objectformat
 compatobjectformat</code></pre>
1017</div></div>
1018<div class="paragraph"><p>See the "Transition plan" section below for more details on these
1019repository extensions.</p></div>
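<div class="paragraph"><p>The two failure modes shown above can be modelled with a small
Python sketch. The function <code>check_repository_format</code>, its parameters, and
the reason strings it returns are all assumptions made for illustration; this
is not Git's actual implementation, only the general shape of a
version-then-extensions check.</p></div>

```python
def check_repository_format(version, extensions, max_version, known_extensions):
    """Return None if this toy reader may use the repository, else a reason.

    version          -- value of core.repositoryFormatVersion
    extensions       -- names of the keys found under [extensions]
    max_version      -- highest format version the reader understands
    known_extensions -- lowercase extension names the reader understands
    """
    if version > max_version:
        return "unsupported repository format version"
    # Configuration keys are compared case-insensitively, so lowercase them.
    unknown = sorted(e.lower() for e in extensions if e.lower() not in known_extensions)
    if unknown:
        return "unknown repository extensions: " + ", ".join(unknown)
    return None
```

<div class="paragraph"><p>A reader that only understands version <code>0</code> stops at the first check,
while a reader that understands version <code>1</code> but not the two new extensions
stops at the second. Either way, it never reaches the object store.</p></div>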
1020</div>
1021<div class="sect2">
1022<h3 id="_object_names">Object names</h3>
<div class="paragraph"><p>Objects can be named by their 40 hexadecimal digit SHA-1 name or 64
hexadecimal digit SHA-256 name, plus names derived from those (see
gitrevisions(7)).</p></div>
<div class="paragraph"><p>The SHA-1 name of an object is the SHA-1 of the concatenation of its
type, length, a NUL byte, and the object&#8217;s SHA-1 content. This is the
traditional &lt;sha1&gt; used in Git to name objects.</p></div>
<div class="paragraph"><p>The SHA-256 name of an object is the SHA-256 of the concatenation of its
type, length, a NUL byte, and the object&#8217;s SHA-256 content.</p></div>
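<div class="paragraph"><p>The naming rule can be sketched in Python with the standard
<code>hashlib</code> module. The helper <code>object_name</code> is a hypothetical name for
this example. It assumes the header layout Git uses for objects today: the
type, a space, the length in decimal, and a NUL byte. A blob is used because
its content is the same under both hash functions.</p></div>

```python
import hashlib


def object_name(obj_type, content, algo):
    """Hash "type SP length NUL content" with the named hashlib algorithm."""
    header = obj_type.encode("ascii") + b" " + str(len(content)).encode("ascii") + b"\x00"
    return hashlib.new(algo, header + content).hexdigest()


# The same blob content yields two different names, one per hash function.
empty_blob_sha1 = object_name("blob", b"", "sha1")
empty_blob_sha256 = object_name("blob", b"", "sha256")
```

<div class="paragraph"><p>For the empty blob, the SHA-1 name computed this way is
<code>e69de29bb2d1d6434b8b29ae775ad8c2e48c5391</code>, and the SHA-256 name is a
64 hexadecimal digit string.</p></div>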
Junio C Hamano1171ab42017-10-11 06:33:371031</div>
1032<div class="sect2">
1033<h3 id="_object_format">Object format</h3>
<div class="paragraph"><p>The content, as a byte sequence, of a tag, commit, or tree object differs
depending on whether the object is named by SHA-1 or by SHA-256: an object
named by its SHA-256 name refers to other objects by their SHA-256 names, and
an object named by its SHA-1 name refers to other objects by their SHA-1 names.</p></div>
1038<div class="paragraph"><p>The SHA-256 content of an object is the same as its SHA-1 content, except
1039that objects referenced by the object are named using their SHA-256 names
1040instead of SHA-1 names. Because a blob object does not refer to any
1041other object, its SHA-1 content and SHA-256 content are the same.</p></div>
1042<div class="paragraph"><p>The format allows round-trip conversion between SHA-256 content and
1043SHA-1 content.</p></div>
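<div class="paragraph"><p>A minimal sketch of this conversion for a commit object follows.
The function <code>convert_commit_content</code> and the dictionary passed as
<code>name_map</code> are assumptions made for this example. It rewrites only the
<code>tree</code> and <code>parent</code> header lines and ignores signatures, tags, and
trees, all of which a real converter would also have to handle.</p></div>

```python
def convert_commit_content(content, name_map):
    """Rewrite the tree and parent names in a commit's header via name_map.

    content  -- raw commit content as bytes (header, blank line, message)
    name_map -- dict from hex object names in one hash to names in the other
    """
    header, sep, message = content.partition(b"\n\n")
    out = []
    for line in header.split(b"\n"):
        key, space, value = line.partition(b" ")
        if key in (b"tree", b"parent"):
            value = name_map[value.decode("ascii")].encode("ascii")
        # Other header lines, including continuation lines, pass through as-is.
        out.append(key + space + value)
    return b"\n".join(out) + sep + message
```

<div class="paragraph"><p>Applying the function with the SHA-1 to SHA-256 mapping and then
again with the inverse mapping returns the original bytes, which is the
round-trip property described above.</p></div>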
Junio C Hamano1171ab42017-10-11 06:33:371044</div>
1045<div class="sect2">
1046<h3 id="_object_storage">Object storage</h3>
<div class="paragraph"><p>Loose objects use zlib compression and packed objects use the packed
format described in <a href="../gitformat-pack.html">gitformat-pack(5)</a>, just like
today. The content that is compressed and stored is the SHA-256 content
instead of the SHA-1 content.</p></div>
Junio C Hamano1171ab42017-10-11 06:33:371051</div>
1052<div class="sect2">
1053<h3 id="_pack_index">Pack index</h3>
1054<div class="paragraph"><p>Pack index (.idx) files use a new v3 format that supports multiple
1055hash functions. They have the following format (all integers are in
1056network byte order):</p></div>
1057<div class="ulist"><ul>
1058<li>
1059<p>
1060A header appears at the beginning and consists of the following:
1061</p>
Junio C Hamanoa70c9882021-02-23 00:57:121062<div class="ulist"><ul>
Junio C Hamano1171ab42017-10-11 06:33:371063<li>
1064<p>
The 4-byte pack index signature: <em>\377tOc</em>
1066</p>
1067</li>
1068<li>
1069<p>
10704-byte version number: 3
1071</p>
1072</li>
1073<li>
1074<p>
10754-byte length of the header section, including the signature and
1076 version number
1077</p>
1078</li>
1079<li>
1080<p>
10814-byte number of objects contained in the pack
1082</p>
1083</li>
1084<li>
1085<p>
10864-byte number of object formats in this pack index: 2
1087</p>
1088</li>
1089<li>
1090<p>
1091For each object format:
1092</p>
Junio C Hamanoa70c9882021-02-23 00:57:121093<div class="ulist"><ul>
Junio C Hamano1171ab42017-10-11 06:33:371094<li>
1095<p>
10964-byte format identifier (e.g., <em>sha1</em> for SHA-1)
1097</p>
1098</li>
1099<li>
1100<p>
11014-byte length in bytes of shortened object names. This is the
1102 shortest possible length needed to make names in the shortened
1103 object name table unambiguous.
1104</p>
1105</li>
1106<li>
1107<p>
11084-byte integer, recording where tables relating to this format
1109 are stored in this index file, as an offset from the beginning.
1110</p>
1111</li>
Junio C Hamanoa70c9882021-02-23 00:57:121112</ul></div>
1113</li>
Junio C Hamano1171ab42017-10-11 06:33:371114<li>
1115<p>
11164-byte offset to the trailer from the beginning of this file.
1117</p>
1118</li>
1119<li>
1120<p>
1121Zero or more additional key/value pairs (4-byte key, 4-byte
1122 value). Only one key is supported: <em>PSRC</em>. See the "Loose objects
1123 and unreachable objects" section for supported values and how this
1124 is used. All other keys are reserved. Readers must ignore
1125 unrecognized keys.
1126</p>
1127</li>
Junio C Hamanoa70c9882021-02-23 00:57:121128</ul></div>
1129</li>
Junio C Hamano1171ab42017-10-11 06:33:371130<li>
1131<p>
1132Zero or more NUL bytes. This can optionally be used to improve the
1133 alignment of the full object name table below.
1134</p>
1135</li>
1136<li>
1137<p>
1138Tables for the first object format:
1139</p>
Junio C Hamanoa70c9882021-02-23 00:57:121140<div class="ulist"><ul>
Junio C Hamano1171ab42017-10-11 06:33:371141<li>
1142<p>
1143A sorted table of shortened object names. These are prefixes of
1144 the names of all objects in this pack file, packed together
1145 without offset values to reduce the cache footprint of the binary
1146 search for a specific object name.
1147</p>
1148</li>
1149<li>
1150<p>
1151A table of full object names in pack order. This allows resolving
1152 a reference to "the nth object in the pack file" (from a
1153 reachability bitmap or from the next table of another object
1154 format) to its object name.
1155</p>
1156</li>
1157<li>
1158<p>
1159A table of 4-byte values mapping object name order to pack order.
1160 For an object in the table of sorted shortened object names, the
1161 value at the corresponding index in this table is the index in the
1162 previous table for that same object.
Junio C Hamanoa70c9882021-02-23 00:57:121163 This can be used to look up the object in reachability bitmaps or
1164 to look up its name in another object format.
Junio C Hamano1171ab42017-10-11 06:33:371165</p>
Junio C Hamano1171ab42017-10-11 06:33:371166</li>
1167<li>
1168<p>
1169A table of 4-byte CRC32 values of the packed object data, in the
1170 order that the objects appear in the pack file. This is to allow
1171 compressed data to be copied directly from pack to pack during
1172 repacking without undetected data corruption.
1173</p>
1174</li>
1175<li>
1176<p>
1177A table of 4-byte offset values. For an object in the table of
1178 sorted shortened object names, the value at the corresponding
1179 index in this table indicates where that object can be found in
1180 the pack file. These are usually 31-bit pack file offsets, but
1181 large offsets are encoded as an index into the next table with the
1182 most significant bit set.
1183</p>
1184</li>
1185<li>
1186<p>
1187A table of 8-byte offset entries (empty for pack files less than
1188 2 GiB). Pack files are organized with heavily used objects toward
1189 the front, so most object references should not need to refer to
1190 this table.
1191</p>
1192</li>
Junio C Hamanoa70c9882021-02-23 00:57:121193</ul></div>
1194</li>
Junio C Hamano1171ab42017-10-11 06:33:371195<li>
1196<p>
1197Zero or more NUL bytes.
1198</p>
1199</li>
1200<li>
1201<p>
1202Tables for the second object format, with the same layout as above,
1203 up to and not including the table of CRC32 values.
1204</p>
1205</li>
1206<li>
1207<p>
1208Zero or more NUL bytes.
1209</p>
1210</li>
1211<li>
1212<p>
1213The trailer consists of the following:
1214</p>
Junio C Hamanoa70c9882021-02-23 00:57:121215<div class="ulist"><ul>
Junio C Hamano1171ab42017-10-11 06:33:371216<li>
1217<p>
Junio C Hamano11ae3202018-08-20 20:15:421218A copy of the 20-byte SHA-256 checksum at the end of the
Junio C Hamano1171ab42017-10-11 06:33:371219 corresponding packfile.
1220</p>
1221</li>
1222<li>
1223<p>
Junio C Hamano11ae3202018-08-20 20:15:42122420-byte SHA-256 checksum of all of the above.
Junio C Hamano1171ab42017-10-11 06:33:371225</p>
1226</li>
1227</ul></div>
Junio C Hamanoa70c9882021-02-23 00:57:121228</li>
1229</ul></div>
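<div class="paragraph"><p>The header layout can be sketched with Python's struct module; "!" selects network byte order. The function names, the returned dictionary keys, and the "s256" identifier for SHA-256 are assumptions made for illustration, and the sample header bytes are built by hand rather than taken from a real index file.</p></div>

```python
import struct

# 0xff 't' 'O' 'c': the signature used by existing pack index files.
IDX_SIGNATURE = b"\xfftOc"


def parse_idx_v3_header(data: bytes) -> dict:
    """Parse the v3 pack index header fields in the order listed above."""
    signature, version, header_len, num_objects, num_formats = struct.unpack_from(
        "!4sIIII", data, 0)
    if signature != IDX_SIGNATURE or version != 3:
        raise ValueError("not a v3 pack index")
    pos = 20
    formats = []
    for _ in range(num_formats):
        format_id, short_len, tables_offset = struct.unpack_from("!4sII", data, pos)
        formats.append((format_id, short_len, tables_offset))
        pos += 12
    (trailer_offset,) = struct.unpack_from("!I", data, pos)
    pos += 4
    extras = {}
    for kv_pos in range(pos, header_len, 8):  # optional 4-byte key, 4-byte value pairs
        key, value = struct.unpack_from("!4sI", data, kv_pos)
        extras[key] = value
    return {"num_objects": num_objects, "formats": formats,
            "trailer_offset": trailer_offset, "extras": extras}


def resolve_offset(entry: int, large_offsets: list) -> int:
    """Decode a 4-byte offset entry; a set top bit means "index into the 8-byte table"."""
    if entry & 0x80000000:
        return large_offsets[entry & 0x7FFFFFFF]
    return entry


# Hand-built sample header: 2 objects, 2 formats, one PSRC pair (56 bytes total).
sample = (struct.pack("!4sIIII", IDX_SIGNATURE, 3, 56, 2, 2)
          + struct.pack("!4sII", b"sha1", 2, 64)
          + struct.pack("!4sII", b"s256", 2, 4096)
          + struct.pack("!I", 8192)
          + struct.pack("!4sI", b"PSRC", 1))
parsed = parse_idx_v3_header(sample)
```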
Junio C Hamano1171ab42017-10-11 06:33:371230</div>
1231<div class="sect2">
1232<h3 id="_loose_object_index">Loose object index</h3>
1233<div class="paragraph"><p>A new file $GIT_OBJECT_DIR/loose-object-idx contains information about
1234all loose objects. Its format is</p></div>
1235<div class="literalblock">
1236<div class="content">
1237<pre><code># loose-object-idx
Junio C Hamano11ae3202018-08-20 20:15:421238(sha256-name SP sha1-name LF)*</code></pre>
Junio C Hamano1171ab42017-10-11 06:33:371239</div></div>
1240<div class="paragraph"><p>where the object names are in hexadecimal format. The file is not
1241sorted.</p></div>
1242<div class="paragraph"><p>The loose object index is protected against concurrent writes by a
1243lock file $GIT_OBJECT_DIR/loose-object-idx.lock. To add a new loose
1244object:</p></div>
1245<div class="olist arabic"><ol class="arabic">
1246<li>
1247<p>
1248Write the loose object to a temporary file, like today.
1249</p>
1250</li>
1251<li>
1252<p>
1253Open loose-object-idx.lock with O_CREAT | O_EXCL to acquire the lock.
1254</p>
1255</li>
1256<li>
1257<p>
1258Rename the loose object into place.
1259</p>
1260</li>
1261<li>
1262<p>
Open loose-object-idx with O_APPEND and append the new object&#8217;s entry.
1264</p>
1265</li>
1266<li>
1267<p>
1268Unlink loose-object-idx.lock to release the lock.
1269</p>
1270</li>
1271</ol></div>
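<div class="paragraph"><p>The five steps above might look like the following on a POSIX system. This is a sketch under stated assumptions: the function name add_loose_object is hypothetical, the object path is flattened into a single directory for brevity, and writing the "# loose-object-idx" first line when the file does not yet exist is a guess at how the header line gets there.</p></div>

```python
import os
import tempfile


def add_loose_object(object_dir, tmp_path, final_path, sha256_name, sha1_name):
    """Steps 2-5: lock, rename into place, append the index entry, unlock."""
    idx_path = os.path.join(object_dir, "loose-object-idx")
    lock_path = idx_path + ".lock"
    # Step 2: O_CREAT | O_EXCL raises FileExistsError if the lock is already held.
    lock_fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o666)
    try:
        os.rename(tmp_path, final_path)                # step 3
        is_new = not os.path.exists(idx_path)
        with open(idx_path, "ab") as idx:              # step 4: mode "a" uses O_APPEND
            if is_new:
                idx.write(b"# loose-object-idx\n")
            idx.write(f"{sha256_name} {sha1_name}\n".encode("ascii"))
    finally:
        os.close(lock_fd)
        os.unlink(lock_path)                           # step 5


with tempfile.TemporaryDirectory() as object_dir:
    tmp_path = os.path.join(object_dir, "tmp_obj")
    with open(tmp_path, "wb") as tmp:                  # step 1 (placeholder payload)
        tmp.write(b"placeholder for zlib-compressed object data")
    final_path = os.path.join(object_dir, "loose-object")
    add_loose_object(object_dir, tmp_path, final_path, "1" * 64, "a" * 40)
    with open(os.path.join(object_dir, "loose-object-idx"), "rb") as idx:
        idx_lines = idx.read().splitlines()
    object_in_place = os.path.exists(final_path)
    lock_released = not os.path.exists(
        os.path.join(object_dir, "loose-object-idx.lock"))
```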
<div class="paragraph"><p>To remove entries (e.g. in "git pack-refs" or "git prune"):</p></div>
1273<div class="olist arabic"><ol class="arabic">
1274<li>
1275<p>
1276Open loose-object-idx.lock with O_CREAT | O_EXCL to acquire the
1277 lock.
1278</p>
1279</li>
1280<li>
1281<p>
1282Write the new content to loose-object-idx.lock.
1283</p>
1284</li>
1285<li>
1286<p>
1287Unlink any loose objects being removed.
1288</p>
1289</li>
1290<li>
1291<p>
1292Rename to replace loose-object-idx, releasing the lock.
1293</p>
1294</li>
1295</ol></div>
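<div class="paragraph"><p>Removal can be sketched the same way: the lock file doubles as the replacement index, so the final rename both publishes the new content and releases the lock. The function name remove_loose_objects and the shape of its doomed argument are hypothetical.</p></div>

```python
import os
import tempfile


def remove_loose_objects(object_dir, doomed):
    """doomed maps a SHA-256 hex name to the path of its loose object file."""
    idx_path = os.path.join(object_dir, "loose-object-idx")
    lock_path = idx_path + ".lock"
    lock_fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o666)  # step 1
    try:
        with os.fdopen(lock_fd, "wb") as lock, open(idx_path, "rb") as idx:
            for line in idx:                                   # step 2
                sha256_name = line.split(b" ", 1)[0].decode("ascii")
                if sha256_name not in doomed:
                    lock.write(line)
        for path in doomed.values():                           # step 3
            os.unlink(path)
        os.replace(lock_path, idx_path)                        # step 4
    except BaseException:
        if os.path.exists(lock_path):
            os.unlink(lock_path)                               # give up the lock on failure
        raise


with tempfile.TemporaryDirectory() as object_dir:
    keep_name, drop_name = "1" * 64, "2" * 64
    drop_path = os.path.join(object_dir, "dropped-object")
    with open(drop_path, "wb") as f:
        f.write(b"placeholder")
    with open(os.path.join(object_dir, "loose-object-idx"), "wb") as f:
        f.write(b"# loose-object-idx\n")
        f.write(f"{keep_name} {'a' * 40}\n".encode("ascii"))
        f.write(f"{drop_name} {'b' * 40}\n".encode("ascii"))
    remove_loose_objects(object_dir, {drop_name: drop_path})
    with open(os.path.join(object_dir, "loose-object-idx"), "rb") as f:
        remaining = f.read().splitlines()
    dropped_gone = not os.path.exists(drop_path)
    lock_gone = not os.path.exists(os.path.join(object_dir, "loose-object-idx.lock"))
```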
1296</div>
1297<div class="sect2">
1298<h3 id="_translation_table">Translation table</h3>
Junio C Hamanoa70c9882021-02-23 00:57:121299<div class="paragraph"><p>The index files support a bidirectional mapping between SHA-1 names
1300and SHA-256 names. The lookup proceeds similarly to ordinary object
1301lookups. For example, to convert a SHA-1 name to a SHA-256 name:</p></div>
Junio C Hamano1171ab42017-10-11 06:33:371302<div class="olist arabic"><ol class="arabic">
1303<li>
1304<p>
1305Look for the object in idx files. If a match is present in the
Junio C Hamanoa70c9882021-02-23 00:57:121306 idx&#8217;s sorted list of truncated SHA-1 names, then:
Junio C Hamano1171ab42017-10-11 06:33:371307</p>
1308<div class="olist loweralpha"><ol class="loweralpha">
1309<li>
1310<p>
Read the corresponding entry in the SHA-1 name order to pack
 order mapping.
1313</p>
1314</li>
1315<li>
1316<p>
Read the corresponding entry in the full SHA-1 name table to
 verify that we found the right object. If so, then
1319</p>
1320</li>
1321<li>
1322<p>
Junio C Hamanoa70c9882021-02-23 00:57:121323Read the corresponding entry in the full SHA-256 name table.
1324 That is the object&#8217;s SHA-256 name.
Junio C Hamano1171ab42017-10-11 06:33:371325</p>
1326</li>
1327</ol></div>
1328</li>
1329<li>
1330<p>
1331Check for a loose object. Read lines from loose-object-idx until
1332 we find a match.
1333</p>
1334</li>
1335</ol></div>
1336<div class="paragraph"><p>Step (1) takes the same amount of time as an ordinary object lookup:
1337O(number of packs * log(objects per pack)). Step (2) takes O(number of
1338loose objects) time. To maintain good performance it will be necessary
1339to keep the number of loose objects low. See the "Loose objects and
1340unreachable objects" section below for more details.</p></div>
1341<div class="paragraph"><p>Since all operations that make new objects (e.g., "git commit") add
1342the new objects to the corresponding index, this mapping is possible
1343for all objects in the object store.</p></div>
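<div class="paragraph"><p>The lookup can be sketched for a single pack as follows. The class FormatTables and the function sha1_to_sha256 are hypothetical; names are kept as hexadecimal strings for readability, whereas the real tables would hold binary names, and a real lookup would try each pack in turn before falling back to the loose object index.</p></div>

```python
import bisect


class FormatTables:
    """In-memory stand-in for one object format's tables in a v3 idx."""

    def __init__(self, names_in_pack_order, short_len):
        self.full_names = list(names_in_pack_order)      # full names, pack order
        self.short_len = short_len
        # name order -> pack order mapping
        self.to_pack_order = sorted(range(len(self.full_names)),
                                    key=self.full_names.__getitem__)
        # sorted table of shortened names
        self.short_names = [self.full_names[i][:short_len]
                            for i in self.to_pack_order]


def sha1_to_sha256(sha1_name, sha1_tables, sha256_tables, loose_idx_lines):
    # Step 1: binary search in the sorted shortened SHA-1 names.
    prefix = sha1_name[:sha1_tables.short_len]
    i = bisect.bisect_left(sha1_tables.short_names, prefix)
    if i != len(sha1_tables.short_names) and sha1_tables.short_names[i] == prefix:
        pack_pos = sha1_tables.to_pack_order[i]              # step 1a
        if sha1_tables.full_names[pack_pos] == sha1_name:    # step 1b
            return sha256_tables.full_names[pack_pos]        # step 1c
    # Step 2: linear scan of loose-object-idx ("sha256-name SP sha1-name").
    for line in loose_idx_lines:
        if line.startswith("#"):
            continue
        sha256_name, _, candidate = line.strip().partition(" ")
        if candidate == sha1_name:
            return sha256_name
    return None


# Placeholder names; both tables list the same three objects in pack order.
sha1_tables = FormatTables(["c" * 40, "a" * 40, "b" * 40], short_len=2)
sha256_tables = FormatTables(["3" * 64, "1" * 64, "2" * 64], short_len=2)
loose = ["# loose-object-idx", "9" * 64 + " " + "d" * 40]

packed_hit = sha1_to_sha256("a" * 40, sha1_tables, sha256_tables, loose)
loose_hit = sha1_to_sha256("d" * 40, sha1_tables, sha256_tables, loose)
miss = sha1_to_sha256("e" * 40, sha1_tables, sha256_tables, loose)
```

<div class="paragraph"><p>The pack position is the link between the two formats: both full name tables are in pack order, so the same index selects the same object in each.</p></div>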
1344</div>
1345<div class="sect2">
Junio C Hamanoa70c9882021-02-23 00:57:121346<h3 id="_reading_an_object_8217_s_sha_1_content">Reading an object&#8217;s SHA-1 content</h3>
<div class="paragraph"><p>The SHA-1 content of an object can be read by taking its SHA-256 content
and converting every SHA-256 name that it references to a SHA-1 name using
the translation table.</p></div>
Junio C Hamano1171ab42017-10-11 06:33:371349</div>
1350<div class="sect2">
1351<h3 id="_fetch">Fetch</h3>
1352<div class="paragraph"><p>Fetching from a SHA-1 based server requires translating between SHA-1
Junio C Hamano11ae3202018-08-20 20:15:421353and SHA-256 based representations on the fly.</p></div>
Junio C Hamano1171ab42017-10-11 06:33:371354<div class="paragraph"><p>SHA-1s named in the ref advertisement that are present on the client
Junio C Hamano11ae3202018-08-20 20:15:421355can be translated to SHA-256 and looked up as local objects using the
Junio C Hamano1171ab42017-10-11 06:33:371356translation table.</p></div>
1357<div class="paragraph"><p>Negotiation proceeds as today. Any "have"s generated locally are
1358converted to SHA-1 before being sent to the server, and SHA-1s
Junio C Hamano11ae3202018-08-20 20:15:421359mentioned by the server are converted to SHA-256 when looking them up
Junio C Hamano1171ab42017-10-11 06:33:371360locally.</p></div>
1361<div class="paragraph"><p>After negotiation, the server sends a packfile containing the
Junio C Hamano11ae3202018-08-20 20:15:421362requested objects. We convert the packfile to SHA-256 format using
Junio C Hamano1171ab42017-10-11 06:33:371363the following steps:</p></div>
1364<div class="olist arabic"><ol class="arabic">
1365<li>
1366<p>
1367index-pack: inflate each object in the packfile and compute its
1368 SHA-1. Objects can contain deltas in OBJ_REF_DELTA format against
1369 objects the client has locally. These objects can be looked up
Junio C Hamanoa70c9882021-02-23 00:57:121370 using the translation table and their SHA-1 content read as
Junio C Hamano1171ab42017-10-11 06:33:371371 described above to resolve the deltas.
1372</p>
1373</li>
1374<li>
1375<p>
1376topological sort: starting at the "want"s from the negotiation
1377 phase, walk through objects in the pack and emit a list of them,
1378 excluding blobs, in reverse topologically sorted order, with each
1379 object coming later in the list than all objects it references.
1380 (This list only contains objects reachable from the "wants". If the
1381 pack from the server contained additional extraneous objects, then
1382 they will be discarded.)
1383</p>
1384</li>
1385<li>
1386<p>
Junio C Hamanoa70c9882021-02-23 00:57:121387convert to SHA-256: open a new SHA-256 packfile. Read the topologically
Junio C Hamano1171ab42017-10-11 06:33:371388 sorted list just generated. For each object, inflate its
Junio C Hamanoa70c9882021-02-23 00:57:121389 SHA-1 content, convert to SHA-256 content, and write it to the SHA-256
 pack. Record the new SHA-1 &#8596; SHA-256 mapping entry for use in the idx.
Junio C Hamano1171ab42017-10-11 06:33:371391</p>
1392</li>
1393<li>
1394<p>
sort: reorder entries in the new pack to match the order of objects
 in the pack the server generated and include blobs. Write a SHA-256 idx
 file.
1398</p>
1399</li>
1400<li>
1401<p>
1402clean up: remove the SHA-1 based pack file, index, and
1403 topologically sorted list obtained from the server in steps 1
1404 and 2.
1405</p>
1406</li>
1407</ol></div>
1408<div class="paragraph"><p>Step 3 requires every object referenced by the new object to be in the
1409translation table. This is why the topological sort step is necessary.</p></div>
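<div class="paragraph"><p>The ordering requirement of step 2 amounts to a post-order depth-first walk from the "want"s. In this sketch the function name, the references dictionary (each object in the pack mapped to the names it refers to), and the blobs set are all hypothetical inputs that step 1 would have to produce.</p></div>

```python
def sorted_for_conversion(wants, references, blobs):
    """Return non-blob objects reachable from wants, each after all it references.

    references maps every object in the pack to the names it refers to;
    names absent from references are assumed to be local objects that are
    already in the translation table.
    """
    order, visited = [], set()
    for want in wants:
        if want in visited:
            continue
        visited.add(want)
        stack = [(want, iter(references.get(want, ())))]
        while stack:                      # iterative, so deep histories do not recurse
            name, children = stack[-1]
            child = next(children, None)
            if child is None:
                stack.pop()
                if name not in blobs:
                    order.append(name)    # emitted only after everything it references
            elif child not in visited and child in references:
                visited.add(child)
                stack.append((child, iter(references[child])))
    return order


# Toy pack: two commits, two trees, two blobs, and one unreachable object.
references = {
    "commit2": ["tree2", "commit1"],
    "commit1": ["tree1"],
    "tree2": ["blob1", "blob2"],
    "tree1": ["blob1"],
    "blob1": [],
    "blob2": [],
    "junk": [],
}
order = sorted_for_conversion(["commit2"], references, blobs={"blob1", "blob2"})
print(order)  # ['tree2', 'tree1', 'commit1', 'commit2']
```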
1410<div class="paragraph"><p>As an optimization, step 1 could write a file describing what non-blob
1411objects each object it has inflated from the packfile references. This
1412makes the topological sort in step 2 possible without inflating the
1413objects in the packfile for a second time. The objects need to be
1414inflated again in step 3, for a total of two inflations.</p></div>
1415<div class="paragraph"><p>Step 4 is probably necessary for good read-time performance. "git
1416pack-objects" on the server optimizes the pack file for good data
1417locality (see Documentation/technical/pack-heuristics.txt).</p></div>
1418<div class="paragraph"><p>Details of this process are likely to change. It will take some
1419experimenting to get this to perform well.</p></div>
1420</div>
1421<div class="sect2">
1422<h3 id="_push">Push</h3>
1423<div class="paragraph"><p>Push is simpler than fetch because the objects referenced by the
Junio C Hamanoa70c9882021-02-23 00:57:121424pushed objects are already in the translation table. The SHA-1 content
Junio C Hamano1171ab42017-10-11 06:33:371425of each object being pushed can be read as described in the "Reading
Junio C Hamanoa70c9882021-02-23 00:57:121426an object&#8217;s SHA-1 content" section to generate the pack written by git
Junio C Hamano1171ab42017-10-11 06:33:371427send-pack.</p></div>
1428</div>
1429<div class="sect2">
1430<h3 id="_signed_commits">Signed Commits</h3>
Junio C Hamano11ae3202018-08-20 20:15:421431<div class="paragraph"><p>We add a new field "gpgsig-sha256" to the commit object format to allow
Junio C Hamano1171ab42017-10-11 06:33:371432signing commits without relying on SHA-1. It is similar to the
Junio C Hamanoa70c9882021-02-23 00:57:121433existing "gpgsig" field. Its signed payload is the SHA-256 content of the
Junio C Hamano11ae3202018-08-20 20:15:421434commit object with any "gpgsig" and "gpgsig-sha256" fields removed.</p></div>
Junio C Hamanoa70c9882021-02-23 00:57:121435<div class="paragraph"><p>This means commits can be signed</p></div>
1436<div class="olist arabic"><ol class="arabic">
1437<li>
1438<p>
1439using SHA-1 only, as in existing signed commit objects
1440</p>
1441</li>
1442<li>
1443<p>
1444using both SHA-1 and SHA-256, by using both gpgsig-sha256 and gpgsig
Junio C Hamano1171ab42017-10-11 06:33:371445 fields.
Junio C Hamanoa70c9882021-02-23 00:57:121446</p>
1447</li>
1448<li>
1449<p>
1450using only SHA-256, by only using the gpgsig-sha256 field.
1451</p>
1452</li>
1453</ol></div>
Junio C Hamano1171ab42017-10-11 06:33:371454<div class="paragraph"><p>Old versions of "git verify-commit" can verify the gpgsig signature in
1455cases (1) and (2) without modifications and view case (3) as an
1456ordinary unsigned commit.</p></div>
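<div class="paragraph"><p>A minimal sketch of deriving the signed payload by dropping both signature fields from a commit's header. The function name signed_payload is hypothetical, the signature bodies are placeholders, and the sketch assumes that continuation lines of a multi-line header field begin with a space, as in existing commit objects.</p></div>

```python
SIGNATURE_FIELDS = (b"gpgsig", b"gpgsig-sha256")


def signed_payload(commit_content: bytes) -> bytes:
    """Return commit content with any gpgsig and gpgsig-sha256 fields removed."""
    header, sep, body = commit_content.partition(b"\n\n")
    kept, dropping = [], False
    for line in header.split(b"\n"):
        if not line.startswith(b" "):
            # A new header field starts here; decide whether to drop it.
            dropping = line.split(b" ", 1)[0] in SIGNATURE_FIELDS
        # Continuation lines inherit the decision made for their field.
        if not dropping:
            kept.append(line)
    return b"\n".join(kept) + sep + body


common = [
    b"tree " + b"1" * 64,
    b"author A U Thor <author@example.com> 1487962205 -0800",
    b"committer A U Thor <author@example.com> 1487962205 -0800",
]
signatures = [
    b"gpgsig -----BEGIN PGP SIGNATURE-----",
    b" ",
    b" placeholder-sha1-signature",
    b" -----END PGP SIGNATURE-----",
    b"gpgsig-sha256 -----BEGIN PGP SIGNATURE-----",
    b" ",
    b" placeholder-sha256-signature",
    b" -----END PGP SIGNATURE-----",
]
message = [b"", b"Example commit", b""]

signed_commit = b"\n".join(common + signatures + message)
unsigned_commit = b"\n".join(common + message)
payload = signed_payload(signed_commit)
```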
1457</div>
1458<div class="sect2">
1459<h3 id="_signed_tags">Signed Tags</h3>
Junio C Hamano11ae3202018-08-20 20:15:421460<div class="paragraph"><p>We add a new field "gpgsig-sha256" to the tag object format to allow
Junio C Hamano1171ab42017-10-11 06:33:371461signing tags without relying on SHA-1. Its signed payload is the
Junio C Hamanoa70c9882021-02-23 00:57:121462SHA-256 content of the tag with its gpgsig-sha256 field and "-----BEGIN PGP
Junio C Hamano1171ab42017-10-11 06:33:371463SIGNATURE-----" delimited in-body signature removed.</p></div>
Junio C Hamanoa70c9882021-02-23 00:57:121464<div class="paragraph"><p>This means tags can be signed</p></div>
1465<div class="olist arabic"><ol class="arabic">
1466<li>
1467<p>
1468using SHA-1 only, as in existing signed tag objects
1469</p>
1470</li>
1471<li>
1472<p>
1473using both SHA-1 and SHA-256, by using gpgsig-sha256 and an in-body
Junio C Hamano1171ab42017-10-11 06:33:371474 signature.
Junio C Hamanoa70c9882021-02-23 00:57:121475</p>
1476</li>
1477<li>
1478<p>
1479using only SHA-256, by only using the gpgsig-sha256 field.
1480</p>
1481</li>
1482</ol></div>
Junio C Hamano1171ab42017-10-11 06:33:371483</div>
1484<div class="sect2">
1485<h3 id="_mergetag_embedding">Mergetag embedding</h3>
Junio C Hamanoa70c9882021-02-23 00:57:121486<div class="paragraph"><p>The mergetag field in the SHA-1 content of a commit contains the
1487SHA-1 content of a tag that was merged by that commit.</p></div>
1488<div class="paragraph"><p>The mergetag field in the SHA-256 content of the same commit contains the
1489SHA-256 content of the same tag.</p></div>
Junio C Hamano1171ab42017-10-11 06:33:371490</div>
1491<div class="sect2">
1492<h3 id="_submodules">Submodules</h3>
1493<div class="paragraph"><p>To convert recorded submodule pointers, you need to have the converted
1494submodule repository in place. The translation table of the submodule
1495can be used to look up the new hash.</p></div>
1496</div>
1497<div class="sect2">
1498<h3 id="_loose_objects_and_unreachable_objects">Loose objects and unreachable objects</h3>
1499<div class="paragraph"><p>Fast lookups in the loose-object-idx require that the number of loose
1500objects not grow too high.</p></div>
1501<div class="paragraph"><p>"git gc --auto" currently waits for there to be 6700 loose objects
1502present before consolidating them into a packfile. We will need to
1503measure to find a more appropriate threshold for it to use.</p></div>
1504<div class="paragraph"><p>"git gc --auto" currently waits for there to be 50 packs present
1505before combining packfiles. Packing loose objects more aggressively
1506may cause the number of pack files to grow too quickly. This can be
1507mitigated by using a strategy similar to Martin Fick&#8217;s exponential
1508rolling garbage collection script:
1509<a href="https://gerrit-review.googlesource.com/c/gerrit/+/35215">https://gerrit-review.googlesource.com/c/gerrit/+/35215</a></p></div>
1510<div class="paragraph"><p>"git gc" currently expels any unreachable objects it encounters in
1511pack files to loose objects in an attempt to prevent a race when
1512pruning them (in case another process is simultaneously writing a new
1513object that refers to the about-to-be-deleted object). This leads to
1514an explosion in the number of loose objects present and disk space
1515usage due to the objects in delta form being replaced with independent
1516loose objects. Worse, the race is still present for loose objects.</p></div>
1517<div class="paragraph"><p>Instead, "git gc" will need to move unreachable objects to a new
1518packfile marked as UNREACHABLE_GARBAGE (using the PSRC field; see
1519below). To avoid the race when writing new objects referring to an
1520about-to-be-deleted object, code paths that write new objects will
need to copy any objects that they refer to from UNREACHABLE_GARBAGE
packs into new, non-UNREACHABLE_GARBAGE packs (or loose objects).
UNREACHABLE_GARBAGE packs are then safe to delete if their creation time (as
indicated by the file&#8217;s mtime) is long enough ago.</p></div>
1525<div class="paragraph"><p>To avoid a proliferation of UNREACHABLE_GARBAGE packs, they can be
1526combined under certain circumstances. If "gc.garbageTtl" is set to
1527greater than one day, then packs created within a single calendar day,
1528UTC, can be coalesced together. The resulting packfile would have an
1529mtime before midnight on that day, so this makes the effective maximum
1530ttl the garbageTtl + 1 day. If "gc.garbageTtl" is less than one day,
1531then we divide the calendar day into intervals one-third of that ttl
1532in duration. Packs created within the same interval can be coalesced
1533together. The resulting packfile would have an mtime before the end of
1534the interval, so this makes the effective maximum ttl equal to the
1535garbageTtl * 4/3.</p></div>
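<div class="paragraph"><p>The arithmetic can be worked through in a short sketch. The function names are hypothetical, and treating a ttl of exactly one day like the "greater than one day" case is an assumption, since the text above leaves that boundary unspecified.</p></div>

```python
from datetime import datetime, timedelta, timezone

DAY = timedelta(days=1)


def coalesce_window(created, garbage_ttl):
    """Return (start, end) of the window whose packs may be coalesced together."""
    created = created.astimezone(timezone.utc)
    day_start = created.replace(hour=0, minute=0, second=0, microsecond=0)
    if garbage_ttl >= DAY:                     # exactly one day: assumption
        return day_start, day_start + DAY      # one window per UTC calendar day
    interval = garbage_ttl / 3                 # intervals one-third of the ttl
    index = (created - day_start) // interval
    start = day_start + index * interval
    return start, min(start + interval, day_start + DAY)


def effective_max_ttl(garbage_ttl):
    """Worst case: an object written at the window start, mtime at the window end."""
    if garbage_ttl >= DAY:
        return garbage_ttl + DAY
    return garbage_ttl * 4 / 3


created = datetime(2024, 1, 1, 5, 30, tzinfo=timezone.utc)
short_window = coalesce_window(created, timedelta(hours=6))   # 2-hour intervals
long_window = coalesce_window(created, timedelta(days=14))    # whole calendar day
```

<div class="paragraph"><p>With a six-hour ttl the intervals are two hours long, so a pack created at 05:30 UTC falls in the 04:00 to 06:00 window and the effective maximum ttl is eight hours.</p></div>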
1536<div class="paragraph"><p>This rule comes from Thirumala Reddy Mutchukota&#8217;s JGit change
1537<a href="https://git.eclipse.org/r/90465">https://git.eclipse.org/r/90465</a>.</p></div>
1538<div class="paragraph"><p>The UNREACHABLE_GARBAGE setting goes in the PSRC field of the pack
1539index. More generally, that field indicates where a pack came from:</p></div>
1540<div class="ulist"><ul>
1541<li>
1542<p>
15431 (PACK_SOURCE_RECEIVE) for a pack received over the network
1544</p>
1545</li>
1546<li>
1547<p>
15482 (PACK_SOURCE_AUTO) for a pack created by a lightweight
1549 "gc --auto" operation
1550</p>
1551</li>
1552<li>
1553<p>
15543 (PACK_SOURCE_GC) for a pack created by a full gc
1555</p>
1556</li>
1557<li>
1558<p>
15594 (PACK_SOURCE_UNREACHABLE_GARBAGE) for potential garbage
1560 discovered by gc
1561</p>
1562</li>
1563<li>
1564<p>
15655 (PACK_SOURCE_INSERT) for locally created objects that were
1566 written directly to a pack file, e.g. from "git add ."
1567</p>
1568</li>
1569</ul></div>
1570<div class="paragraph"><p>This information can be useful for debugging and for "gc --auto" to
1571make appropriate choices about which packs to coalesce.</p></div>
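<div class="paragraph"><p>For reference, the PSRC values listed above can be written as an enumeration. The class name PackSource and the helper may_prune are hypothetical; the helper only restates the rule that UNREACHABLE_GARBAGE packs become deletable once they are older than the configured ttl.</p></div>

```python
from enum import IntEnum


class PackSource(IntEnum):
    """Values of the PSRC key in the pack index header."""
    RECEIVE = 1              # received over the network
    AUTO = 2                 # created by a lightweight "gc --auto"
    GC = 3                   # created by a full gc
    UNREACHABLE_GARBAGE = 4  # potential garbage discovered by gc
    INSERT = 5               # locally created objects written directly to a pack


def may_prune(source, age_seconds, garbage_ttl_seconds):
    """Only garbage packs are candidates, and only once older than the ttl."""
    return source == PackSource.UNREACHABLE_GARBAGE and age_seconds > garbage_ttl_seconds


garbage_old = may_prune(PackSource(4), age_seconds=90000, garbage_ttl_seconds=86400)
gc_old = may_prune(PackSource.GC, age_seconds=90000, garbage_ttl_seconds=86400)
```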
1572</div>
1573</div>
1574</div>
1575<div class="sect1">
1576<h2 id="_caveats">Caveats</h2>
1577<div class="sectionbody">
1578<div class="sect2">
1579<h3 id="_invalid_objects">Invalid objects</h3>
Junio C Hamanoa70c9882021-02-23 00:57:121580<div class="paragraph"><p>The conversion from SHA-1 content to SHA-256 content retains any
Junio C Hamano1171ab42017-10-11 06:33:371581brokenness in the original object (e.g., tree entry modes encoded with
1582leading 0, tree objects whose paths are not sorted correctly, and
1583commit objects without an author or committer). This is a deliberate
1584feature of the design to allow the conversion to round-trip.</p></div>
1585<div class="paragraph"><p>More profoundly broken objects (e.g., a commit with a truncated "tree"
1586header line) cannot be converted but were not usable by current Git
1587anyway.</p></div>
1588</div>
1589<div class="sect2">
1590<h3 id="_shallow_clone_and_submodules">Shallow clone and submodules</h3>
1591<div class="paragraph"><p>Because it requires all referenced objects to be available in the
1592locally generated translation table, this design does not support
1593shallow clone or unfetched submodules. Protocol improvements might
1594allow lifting this restriction.</p></div>
1595</div>
1596<div class="sect2">
1597<h3 id="_alternates">Alternates</h3>
Junio C Hamanoa70c9882021-02-23 00:57:121598<div class="paragraph"><p>For the same reason, a SHA-256 repository cannot borrow objects from a
1599SHA-1 repository using objects/info/alternates or
Junio C Hamano1171ab42017-10-11 06:33:371600$GIT_ALTERNATE_OBJECT_REPOSITORIES.</p></div>
1601</div>
1602<div class="sect2">
1603<h3 id="_git_notes">git notes</h3>
Junio C Hamanoa70c9882021-02-23 00:57:121604<div class="paragraph"><p>The "git notes" tool annotates objects using their SHA-1 name as key.
Junio C Hamano1171ab42017-10-11 06:33:371605This design does not describe a way to migrate notes trees to use
Junio C Hamanoa70c9882021-02-23 00:57:121606SHA-256 names. That migration is expected to happen separately (for
Junio C Hamano1171ab42017-10-11 06:33:371607example using a file at the root of the notes tree to describe which
1608hash it uses).</p></div>
1609</div>
1610<div class="sect2">
1611<h3 id="_server_side_cost">Server-side cost</h3>
Junio C Hamano11ae3202018-08-20 20:15:421612<div class="paragraph"><p>Until Git protocol gains SHA-256 support, using SHA-256 based storage
Junio C Hamano1171ab42017-10-11 06:33:371613on public-facing Git servers is strongly discouraged. Once Git
Junio C Hamano11ae3202018-08-20 20:15:421614protocol gains SHA-256 support, SHA-256 based servers are likely not
Junio C Hamano1171ab42017-10-11 06:33:371615to support SHA-1 compatibility, to avoid what may be a very expensive
Junio C Hamano8ef91f32019-12-01 22:58:271616hash re-encode during clone and to encourage peers to modernize.</p></div>
Junio C Hamano1171ab42017-10-11 06:33:371617<div class="paragraph"><p>The design described here allows fetches by SHA-1 clients of a
Junio C Hamano11ae3202018-08-20 20:15:421618personal SHA-256 repository because it&#8217;s not much more difficult than
Junio C Hamano1171ab42017-10-11 06:33:371619allowing pushes from that repository. This support needs to be guarded
Junio C Hamano11f1df12023-01-30 22:48:081620by a configuration option&#8201;&#8212;&#8201;servers like git.kernel.org that serve a
Junio C Hamano1171ab42017-10-11 06:33:371621large number of clients would not be expected to bear that cost.</p></div>
1622</div>
1623<div class="sect2">
1624<h3 id="_meaning_of_signatures">Meaning of signatures</h3>
1625<div class="paragraph"><p>The signed payload for signed commits and tags does not explicitly
1626name the hash used to identify objects. If some day Git adopts a new
hash function with the same length as the current SHA-1 (40
hexadecimal digits) or SHA-256 (64 hexadecimal digits) object names, then the
intent behind the PGP signed payload in an object signature is
unclear:</p></div>
1631<div class="literalblock">
1632<div class="content">
1633<pre><code>object e7e07d5a4fcc2a203d9873968ad3e6bd4d7419d7
1634type commit
1635tag v2.12.0
1636tagger Junio C Hamano &lt;gitster@pobox.com&gt; 1487962205 -0800</code></pre>
1637</div></div>
1638<div class="literalblock">
1639<div class="content">
1640<pre><code>Git 2.12</code></pre>
1641</div></div>
Junio C Hamanoa70c9882021-02-23 00:57:121642<div class="paragraph"><p>Does this mean Git v2.12.0 is the commit with SHA-1 name
Junio C Hamano1171ab42017-10-11 06:33:371643e7e07d5a4fcc2a203d9873968ad3e6bd4d7419d7 or the commit with
1644new-40-digit-hash-name e7e07d5a4fcc2a203d9873968ad3e6bd4d7419d7?</p></div>
Junio C Hamano11ae3202018-08-20 20:15:421645<div class="paragraph"><p>Fortunately SHA-256 and SHA-1 have different lengths. If Git starts
Junio C Hamano1171ab42017-10-11 06:33:371646using another hash with the same length to name objects, then it will
1647need to change the format of signed payloads using that hash to
1648address this issue.</p></div>
1649</div>
1650<div class="sect2">
1651<h3 id="_object_names_on_the_command_line">Object names on the command line</h3>
1652<div class="paragraph"><p>To support the transition (see Transition plan below), this design
1653supports four different modes of operation:</p></div>
1654<div class="olist arabic"><ol class="arabic">
1655<li>
1656<p>
1657("dark launch") Treat object names input by the user as SHA-1 and
1658 convert any object names written to output to SHA-1, but store
Junio C Hamano11ae3202018-08-20 20:15:421659 objects using SHA-256. This allows users to test the code with no
Junio C Hamano1171ab42017-10-11 06:33:371660 visible behavior change except for performance. This allows
Junio C Hamano91a411f2021-07-14 00:40:501661 running even tests that assume the SHA-1 hash function, to
Junio C Hamano1171ab42017-10-11 06:33:371662 sanity-check the behavior of the new mode.
1663</p>
1664</li>
1665<li>
1666<p>
Junio C Hamano11ae3202018-08-20 20:15:421667("early transition") Allow both SHA-1 and SHA-256 object names in
Junio C Hamano1171ab42017-10-11 06:33:371668 input. Any object names written to output use SHA-1. This allows
1669 users to continue to make use of SHA-1 to communicate with peers
1670 (e.g. by email) that have not migrated yet and prepares for mode 3.
1671</p>
1672</li>
1673<li>
1674<p>
Junio C Hamano11ae3202018-08-20 20:15:421675("late transition") Allow both SHA-1 and SHA-256 object names in
1676 input. Any object names written to output use SHA-256. In this
Junio C Hamano1171ab42017-10-11 06:33:371677 mode, users are using a more secure object naming method by
1678 default. The disruption is minimal as long as most of their peers
1679 are in mode 2 or mode 3.
1680</p>
1681</li>
1682<li>
1683<p>
1684("post-transition") Treat object names input by the user as
Junio C Hamano11ae3202018-08-20 20:15:421685 SHA-256 and write output using SHA-256. This is safer than mode 3
Junio C Hamano1171ab42017-10-11 06:33:371686 because there is less risk that input is incorrectly interpreted
1687 using the wrong hash function.
1688</p>
1689</li>
1690</ol></div>
1691<div class="paragraph"><p>The mode is specified in configuration.</p></div>
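<div class="paragraph"><p>The four modes can be summarized as data. This sketch is only a restatement of the list above; the names Mode, MODES, and accepts_input are hypothetical and do not correspond to any existing configuration key.</p></div>

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mode:
    label: str
    input_formats: tuple   # object name formats accepted from the user
    output_format: str     # object name format written to output


MODES = {
    1: Mode("dark launch", ("sha1",), "sha1"),
    2: Mode("early transition", ("sha1", "sha256"), "sha1"),
    3: Mode("late transition", ("sha1", "sha256"), "sha256"),
    4: Mode("post-transition", ("sha256",), "sha256"),
}


def accepts_input(mode_number: int, name_format: str) -> bool:
    """True if the mode accepts object names of the given format as input."""
    return name_format in MODES[mode_number].input_formats


for number, mode in MODES.items():
    print(number, mode.label, mode.input_formats, mode.output_format)
```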
1692<div class="paragraph"><p>The user can also explicitly specify which format to use for a
1693particular revision specifier and for output, overriding the mode. For
1694example:</p></div>
Junio C Hamanoa70c9882021-02-23 00:57:121695<div class="literalblock">
1696<div class="content">
1697<pre><code>git --output-format=sha1 log abac87a^{sha1}..f787cac^{sha256}</code></pre>
1698</div></div>
Junio C Hamano1171ab42017-10-11 06:33:371699</div>
1700</div>
1701</div>
1702<div class="sect1">
Junio C Hamano1171ab42017-10-11 06:33:371703<h2 id="_transition_plan">Transition plan</h2>
1704<div class="sectionbody">
Junio C Hamanoa70c9882021-02-23 00:57:121705<div class="paragraph"><p>Some initial steps can be implemented independently of one another:</p></div>
1706<div class="ulist"><ul>
1707<li>
1708<p>
1709adding a hash function API (vtable)
1710</p>
1711</li>
1712<li>
1713<p>
1714teaching fsck to tolerate the gpgsig-sha256 field
1715</p>
1716</li>
1717<li>
1718<p>
1719excluding gpgsig-* from the fields copied by "git commit --amend"
1720</p>
1721</li>
1722<li>
1723<p>
1724annotating tests that depend on SHA-1 values with a SHA1 test
Junio C Hamano1171ab42017-10-11 06:33:371725 prerequisite
Junio C Hamanoa70c9882021-02-23 00:57:121726</p>
1727</li>
1728<li>
1729<p>
1730using "struct object_id", GIT_MAX_RAWSZ, and GIT_MAX_HEXSZ
Junio C Hamano1171ab42017-10-11 06:33:371731 consistently instead of "unsigned char *" and the hardcoded
1732 constants 20 and 40.
Junio C Hamanoa70c9882021-02-23 00:57:121733</p>
1734</li>
1735<li>
1736<p>
1737introducing index v3
1738</p>
1739</li>
1740<li>
1741<p>
1742adding support for the PSRC field and safer object pruning
1743</p>
1744</li>
1745</ul></div>
<div class="paragraph"><p>The first user-visible change is the introduction of the objectFormat
extension (without compatObjectFormat). This requires:</p></div>
<div class="ulist"><ul>
<li>
<p>
teaching fsck about this mode of operation
</p>
</li>
<li>
<p>
using the hash function API (vtable) when computing object names
</p>
</li>
<li>
<p>
signing objects and verifying signatures
</p>
</li>
<li>
<p>
rejecting attempts to fetch from or push to an incompatible
 repository
</p>
</li>
</ul></div>
<div class="paragraph"><p>Next comes the introduction of compatObjectFormat:</p></div>
<div class="ulist"><ul>
<li>
<p>
implementing the loose-object-idx
</p>
</li>
<li>
<p>
translating object names between object formats
</p>
</li>
<li>
<p>
translating object content between object formats
</p>
</li>
<li>
<p>
generating and verifying signatures in the compat format
</p>
</li>
<li>
<p>
adding appropriate index entries when adding a new object to the
 object store
</p>
</li>
<li>
<p>
--output-format option
</p>
</li>
<li>
<p>
^{sha1} and ^{sha256} revision notation
</p>
</li>
<li>
<p>
configuration to specify default input and output format (see
 "Object names on the command line" above)
</p>
</li>
</ul></div>
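<div class="paragraph"><p>As an illustration of "translating object content between object
formats", the sketch below rewrites the entries of a tree object. It assumes
the usual tree entry layout of "&lt;mode&gt; &lt;name&gt;", a NUL byte, and a raw
object name, and it assumes a lookup table from raw SHA-1 names to raw
SHA-256 names is already available. The function name <code>translate_tree</code> and
the sample data are hypothetical.</p></div>

```python
from typing import Mapping

SHA1_RAWSZ = 20  # raw SHA-1 object names are 20 bytes long


def translate_tree(content: bytes, sha1_to_sha256: Mapping[bytes, bytes]) -> bytes:
    """Rewrite a SHA-1 tree body so that it refers to SHA-256 names.

    Assumes every referenced object is already present in the table.
    """
    out = bytearray()
    pos = 0
    while pos < len(content):
        nul = content.index(b"\0", pos)          # end of "<mode> <name>"
        start = nul + 1
        raw_sha1 = content[start:start + SHA1_RAWSZ]
        out += content[pos:start]                # copy "<mode> <name>\0" unchanged
        out += sha1_to_sha256[raw_sha1]          # substitute the 32-byte name
        pos = start + SHA1_RAWSZ                 # skip by length, not by delimiter
    return bytes(out)


# Made-up raw names, used only to exercise the function.
old_a, new_a = bytes(range(20)), bytes(range(32))
old_b, new_b = bytes([7] * 20), bytes([9] * 32)
sha1_tree = b"100644 a.txt\0" + old_a + b"40000 dir\0" + old_b
sha256_tree = translate_tree(sha1_tree, {old_a: new_a, old_b: new_b})
```

<div class="paragraph"><p>Raw names may themselves contain NUL bytes, which is why the sketch advances
by a fixed length after each entry rather than searching for the next
delimiter.</p></div>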
<div class="paragraph"><p>The next step is supporting fetches and pushes to SHA-1 repositories:</p></div>
<div class="ulist"><ul>
<li>
<p>
allow pushes to a repository using the compat format
</p>
</li>
<li>
<p>
generate a topologically sorted list of the SHA-1 names of fetched
 objects
</p>
</li>
<li>
<p>
convert the fetched packfile to SHA-256 format and generate an idx
 file
</p>
</li>
<li>
<p>
re-sort to match the order of objects in the fetched packfile
</p>
</li>
</ul></div>
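<div class="paragraph"><p>The ordering steps in the fetch list above can be sketched as follows. This
is only an illustration using Python's standard <code>graphlib</code> module, not the
algorithm Git implements. The helper names <code>conversion_order</code> and
<code>restore_pack_order</code> and the short object names are hypothetical. The idea
is that an object can only be converted once the new names of everything it
references are known.</p></div>

```python
from graphlib import TopologicalSorter


def conversion_order(references):
    """Order object names so each object follows the objects it references.

    `references` maps an object name to the names it points at
    (a commit to its tree and parents, a tree to its subtrees).
    """
    return list(TopologicalSorter(references).static_order())


def restore_pack_order(names, pack_position):
    """Re-sort names to match their position in the fetched packfile."""
    return sorted(names, key=lambda name: pack_position[name])


# Hypothetical fetched objects, with short made-up names for readability.
references = {
    "commit2": ["tree2", "commit1"],
    "commit1": ["tree1"],
    "tree2": ["tree1"],
    "tree1": [],
}
order = conversion_order(references)

# Hypothetical positions of the same objects in the received pack.
pack_position = {"commit2": 0, "commit1": 1, "tree2": 2, "tree1": 3}
pack_order = restore_pack_order(order, pack_position)
```

<div class="paragraph"><p>Converting in <code>order</code> guarantees that each lookup of a referenced name
succeeds, and <code>pack_order</code> then restores the layout of the fetched
packfile.</p></div>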
<div class="paragraph"><p>The infrastructure supporting fetch also allows converting an existing
repository. In converted repositories and new clones, end users can
gain support for the new hash function without any visible change in
behavior (see "dark launch" in the "Object names on the command line"
section). In particular, this allows users to verify SHA-256 signatures
on objects in the repository, and it should ensure the transition code
is stable in production in preparation for using it more widely.</p></div>
<div class="paragraph"><p>Over time, projects would encourage their users to adopt the "early
transition" and then "late transition" modes to take advantage of the
new, more future-proof SHA-256 object names.</p></div>
<div class="paragraph"><p>When objectFormat and compatObjectFormat are both set, commands
generating signatures would generate both SHA-1 and SHA-256 signatures
by default to support both new and old users.</p></div>
<div class="paragraph"><p>In projects using SHA-256 heavily, users could be encouraged to adopt
the "post-transition" mode to avoid accidentally making implicit use
of SHA-1 object names.</p></div>
<div class="paragraph"><p>Once a critical mass of users have upgraded to a version of Git that
can verify SHA-256 signatures and have converted their existing
repositories to support verifying them, we can add support for a
setting to generate only SHA-256 signatures. This is expected to be at
least a year later.</p></div>
<div class="paragraph"><p>That is also a good moment to advertise the ability to convert
repositories to use SHA-256 only, stripping out all SHA-1 related
metadata. This improves performance by eliminating translation
overhead and security by avoiding the possibility of accidentally
relying on the safety of SHA-1.</p></div>
<div class="paragraph"><p>Updating Git&#8217;s protocols to allow a server to specify which hash
functions it supports is also an important part of this transition. It
is not discussed in detail in this document, but this transition plan
assumes it happens. :)</p></div>
</div>
</div>
<div class="sect1">
<h2 id="_alternatives_considered">Alternatives considered</h2>
<div class="sectionbody">
<div class="sect2">
<h3 id="_upgrading_everyone_working_on_a_particular_project_on_a_flag_day">Upgrading everyone working on a particular project on a flag day</h3>
<div class="paragraph"><p>Projects like the Linux kernel are large and complex enough that
flipping the switch for all projects based on the repository at once
is infeasible.</p></div>
<div class="paragraph"><p>Not only would all developers and server operators supporting
developers have to switch on the same flag day, but supporting tooling
(continuous integration, code review, bug trackers, etc.) would have to
be adapted as well. This also makes it difficult to get early feedback
from some project participants testing before it is time for mass
adoption.</p></div>
</div>
<div class="sect2">
<h3 id="_using_hash_functions_in_parallel">Using hash functions in parallel</h3>
<div class="paragraph"><p>(e.g. <a href="https://lore.kernel.org/git/22708.8913.864049.452252@chiark.greenend.org.uk/">https://lore.kernel.org/git/22708.8913.864049.452252@chiark.greenend.org.uk/</a>)
Objects newly created would be addressed by the new hash, but inside
such an object (e.g. commit) it is still possible to address objects
using the old hash function.</p></div>
<div class="ulist"><ul>
<li>
<p>
You cannot trust its history (needed for bisectability) in the
 future without further work
</p>
</li>
<li>
<p>
Maintenance burden as the number of supported hash functions grows
 (they will never go away, so they accumulate). In this proposal, by
 comparison, converted objects lose all references to SHA-1.
</p>
</li>
</ul></div>
</div>
<div class="sect2">
<h3 id="_signed_objects_with_multiple_hashes">Signed objects with multiple hashes</h3>
<div class="paragraph"><p>Instead of introducing the gpgsig-sha256 field in commit and tag objects
for SHA-256 content based signatures, an earlier version of this design
added "hash sha256 &lt;SHA-256 name&gt;" fields to strengthen the existing
SHA-1 content based signatures.</p></div>
<div class="paragraph"><p>In other words, a single signature was used to attest to the object
content using both hash functions. This had some advantages:</p></div>
<div class="ulist"><ul>
<li>
<p>
Using one signature instead of two speeds up the signing process.
</p>
</li>
<li>
<p>
Having one signed payload with both hashes allows the signer to
 attest to the SHA-1 name and SHA-256 name referring to the same object.
</p>
</li>
<li>
<p>
All users consume the same signature. Broken signatures are likely
 to be detected quickly using current versions of Git.
</p>
</li>
</ul></div>
<div class="paragraph"><p>However, it also came with disadvantages:</p></div>
<div class="ulist"><ul>
<li>
<p>
Verifying a signed object requires access to the SHA-1 names of all
 objects it references, even after the transition is complete and
 the translation table is no longer needed for anything else. To support
 this, the design added fields such as "hash sha1 tree &lt;SHA-1 name&gt;"
 and "hash sha1 parent &lt;SHA-1 name&gt;" to the SHA-256 content of a signed
 commit, complicating the conversion process.
</p>
</li>
<li>
<p>
Allowing signed objects without a SHA-1 (for after the transition is
 complete) complicated the design further, requiring a "nohash sha1"
 field to suppress including "hash sha1" fields in the SHA-256 content
 and signed payload.
</p>
</li>
</ul></div>
</div>
<div class="sect2">
<h3 id="_lazily_populated_translation_table">Lazily populated translation table</h3>
<div class="paragraph"><p>Some of the work of building the translation table could be deferred to
push time, but that would significantly complicate and slow down pushes.
Calculating the SHA-1 name at object creation time, while the object is
being streamed to disk and its SHA-256 name is being calculated, should be
an acceptable cost.</p></div>
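<div class="paragraph"><p>The eager approach described above can be sketched as a single pass that
feeds each chunk to both hash contexts while writing it out. This is a
simplified illustration that only covers blobs, whose content is the same in
both formats. It omits compression, and the function name
<code>hash_while_streaming</code> is hypothetical.</p></div>

```python
import hashlib
import io


def hash_while_streaming(obj_type, size, chunks, out):
    """Write chunks to `out` while computing both object names in one pass.

    Assumes the content is identical under both formats (true for blobs);
    other object types would need their content translated first.
    """
    header = f"{obj_type} {size}".encode() + b"\0"
    sha1 = hashlib.sha1(header)
    sha256 = hashlib.sha256(header)
    for chunk in chunks:
        sha1.update(chunk)      # compat (SHA-1) name
        sha256.update(chunk)    # main (SHA-256) name
        out.write(chunk)        # stand-in for streaming to disk
    return sha1.hexdigest(), sha256.hexdigest()


disk = io.BytesIO()
data = b"hello\n"
sha1_name, sha256_name = hash_while_streaming(
    "blob", len(data), [data[:3], data[3:]], disk
)
```

<div class="paragraph"><p>The extra cost per object is one additional hash update per chunk, and no
second read of the data is needed at push time.</p></div>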
</div>
</div>
</div>
<div class="sect1">
<h2 id="_document_history">Document History</h2>
<div class="sectionbody">
<div class="paragraph"><p>2017-03-03
<a href="mailto:bmwill@google.com">bmwill@google.com</a>, <a href="mailto:jonathantanmy@google.com">jonathantanmy@google.com</a>, <a href="mailto:jrnieder@gmail.com">jrnieder@gmail.com</a>,
<a href="mailto:sbeller@google.com">sbeller@google.com</a></p></div>
<div class="ulist"><ul>
<li>
<p>
Initial version sent to <a href="https://lore.kernel.org/git/20170304011251.GA26789@aiede.mtv.corp.google.com">https://lore.kernel.org/git/20170304011251.GA26789@aiede.mtv.corp.google.com</a>
</p>
</li>
</ul></div>
<div class="paragraph"><p>2017-03-03 <a href="mailto:jrnieder@gmail.com">jrnieder@gmail.com</a>
Incorporated suggestions from jonathantanmy and sbeller:</p></div>
<div class="ulist"><ul>
<li>
<p>
Describe purpose of signed objects with each hash type
</p>
</li>
<li>
<p>
Redefine signed object verification using object content under the
 first hash function
</p>
</li>
</ul></div>
<div class="paragraph"><p>2017-03-06 <a href="mailto:jrnieder@gmail.com">jrnieder@gmail.com</a></p></div>
<div class="ulist"><ul>
<li>
<p>
Use SHA3-256 instead of SHA2 (thanks, Linus and brian m. carlson).[1][2]
</p>
</li>
<li>
<p>
Make SHA3-based signatures a separate field, avoiding the need for
 "hash" and "nohash" fields (thanks to peff[3]).
</p>
</li>
<li>
<p>
Add a sorting phase to fetch (thanks to Junio for noticing the need
 for this).
</p>
</li>
<li>
<p>
Omit blobs from the topological sort during fetch (thanks to peff).
</p>
</li>
<li>
<p>
Discuss alternates, git notes, and git servers in the caveats
 section (thanks to Junio Hamano, brian m. carlson[4], and Shawn
 Pearce).
</p>
</li>
<li>
<p>
Clarify language throughout (thanks to various commenters,
 especially Junio).
</p>
</li>
</ul></div>
<div class="paragraph"><p>2017-09-27 <a href="mailto:jrnieder@gmail.com">jrnieder@gmail.com</a>, <a href="mailto:sbeller@google.com">sbeller@google.com</a></p></div>
<div class="ulist"><ul>
<li>
<p>
Use placeholder NewHash instead of SHA3-256
</p>
</li>
<li>
<p>
Describe criteria for picking a hash function.
</p>
</li>
<li>
<p>
Include a transition plan (thanks especially to Brandon Williams
 for fleshing these ideas out)
</p>
</li>
<li>
<p>
Define the translation table (thanks, Shawn Pearce[5], Jonathan
 Tan, and Masaya Suzuki)
</p>
</li>
<li>
<p>
Avoid loose object overhead by packing more aggressively in
 "git gc --auto"
</p>
</li>
</ul></div>
<div class="paragraph"><p>Later history:</p></div>
<div class="ulist"><ul>
<li>
<p>
See the history of this file in git.git for the history of subsequent
 edits. This document history is no longer being maintained, as it
 would now be superfluous to the commit log.
</p>
</li>
</ul></div>
<div class="paragraph"><p>References:</p></div>
<div class="literalblock">
<div class="content">
<pre><code>[1] https://lore.kernel.org/git/CA+55aFzJtejiCjV0e43+9oR3QuJK2PiFiLQemytoLpyJWe6P9w@mail.gmail.com/
[2] https://lore.kernel.org/git/CA+55aFz+gkAsDZ24zmePQuEs1XPS9BP_s8O7Q4wQ7LV7X5-oDA@mail.gmail.com/
[3] https://lore.kernel.org/git/20170306084353.nrns455dvkdsfgo5@sigill.intra.peff.net/
[4] https://lore.kernel.org/git/20170304224936.rqqtkdvfjgyezsht@genre.crustytoothpaste.net
[5] https://lore.kernel.org/git/CAJo=hJtoX9=AyLHHpUJS7fueV9ciZ_MNpnEPHUz8Whui6g9F0A@mail.gmail.com/</code></pre>
</div></div>
</div>
</div>
</div>
<div id="footnotes"><hr /></div>
<div id="footer">
<div id="footer-text">
Last updated
 2023-01-30 14:44:53 PST
</div>
</div>
</body>
</html>