Q1: Enforcing consistent line endings
Q2: Enforcing at commit as well as checkout (comment)
I’ll divide this into 2 parts: Practice and Principle.
Practice
Expansion of code-apprentice’s suggestion
- Strictly avoid autocrlf. See why autocrlf is always wrong. See also the core git devs arguing about how ill-thought-out autocrlf is. Note particularly that the implementor is annoyed at the critic but doesn’t deny the criticism.
- Religiously use .gitattributes instead.
- Use safecrlf=true to enforce commit-cleanliness. safecrlf is the answer to your Q2: a file that would change on a check-in/check-out round trip errors out at the check-in stage itself.
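A minimal sketch of the three points above, run in a throwaway repository so nothing real is touched. The file patterns are illustrative assumptions, not a recommendation for any particular project; the settings are written with their full config names, core.autocrlf and core.safecrlf.

```shell
#!/bin/sh
# Sketch only: work in a scratch repository.
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q .

# Keep autocrlf off and refuse check-ins that would not round-trip.
git config core.autocrlf false
git config core.safecrlf true

# State the policy per path in .gitattributes (patterns are assumptions).
cat > .gitattributes <<'EOF'
*.txt  text eol=lf
*.bat  text eol=crlf
*.png  binary
EOF

# A CRLF file under an eol=lf rule would change on a round trip,
# so git add is expected to refuse it at check-in.
printf 'one\r\ntwo\r\n' > crlf.txt
if git add crlf.txt 2>/dev/null; then
  result=accepted
else
  result=rejected
fi
echo "crlf.txt: $result"
```

The config lines are per clone, whereas .gitattributes is committed, so every clone sees the same per-path policy.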
When a new repo is init-ed:
Go through the output of ls -lR and, for each file, choose its type: text, binary, or ignore (i.e. put it in .gitignore).
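One way to make that pass less tedious is to group the files by extension first and decide per group. A rough sketch using only find, sed, sort and uniq; list_extensions is a hypothetical helper name, and files without an extension are simply left out of the count, so they still need a look by hand.

```shell
#!/bin/sh
# Count files per extension under a directory, skipping git's own metadata.
# Hypothetical helper: the name and output format are not part of git.
list_extensions() {
  find "${1:-.}" -type f ! -path '*/.git/*' \
    | sed -n 's/.*\.\([^./]*\)$/\1/p' \
    | sort | uniq -c | sort -rn
}

# Usage: survey the current directory.
list_extensions .
```

Each extension in the output then gets one decision: a text line or a binary line in .gitattributes, or an entry in .gitignore.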
Debugging:
Use git-check-attr to check that attribute matching and computation are as desired
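A small sketch of that check, again in a scratch repository with made-up patterns. The paths passed to git check-attr do not need to exist; the command only evaluates the attribute rules against the names.

```shell
#!/bin/sh
# Sketch only: scratch repository with an assumed .gitattributes.
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q .
printf '*.txt text eol=lf\n*.png binary\n' > .gitattributes

# Ask for specific attributes of one path.
git check-attr text eol -- notes.txt

# Or list every attribute that applies, including what a macro like
# binary expands to.
git check-attr -a -- logo.png
```

Each output line has the form path: attribute: value, so a surprising value points straight at the rule that needs fixing.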
Principle
Data Store
We may treat git as a data-store loosely analogous to how a USB drive is one.
We say the drive is working if the stuff we put in comes out the same. Else it’s corrupted. Likewise, if the file we commit comes out the same on checkout, the repo is fine; otherwise something is borked. The key question is:
What does “same” mean?
It’s non-trivial because we implicitly apply different standards of “sameness” in different contexts!
Binary Files
- A binary file is a sequence of bytes
- Preserving that sequence faithfully amounts to reproducing the file
Text Files
…are different
- A text file consists of a sequence of lines, and a line is a sequence of «printable characters». Let’s leave the printable char notion unspecified other than to say no cr, no lf!
- How these lines are separated (or terminated) is again unspecified.
- Symbolically:

    type Line = [Char]
    type File = [Line]

- Expanding on the 1st unspecified gives us ASCII, Latins, Unicode, etc. Not relevant to this question.
- Expanding on the 2nd is what distinguishes Windows, *nix, etc. JFTR this kind of file may be little known by the younger generation but also exists. It is particularly useful to remember that the notion “sequence of lines” can be imposed at many different levels.
For a text file, we don’t care how the unspecified parts are realized, as long as the lines and their characters come out the same.
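To make the two standards of sameness concrete, here is a small sketch that compares the same two lines stored with LF and with CRLF terminators. same_text is a hypothetical helper, not a git command; it assumes the model above, where cr never occurs inside a line, and it ignores trailing terminators.

```shell
#!/bin/sh
# Hypothetical helper: two files are the "same text" if they agree once
# every CR is dropped, i.e. once the line terminator is abstracted away.
same_text() {
  [ "$(tr -d '\r' < "$1")" = "$(tr -d '\r' < "$2")" ]
}

tmp=$(mktemp -d)
printf 'alpha\nbeta\n'     > "$tmp/unix.txt"
printf 'alpha\r\nbeta\r\n' > "$tmp/dos.txt"

# Binary standard: compare the byte sequences.
if cmp -s "$tmp/unix.txt" "$tmp/dos.txt"; then
  echo "bytes: same"
else
  echo "bytes: different"
fi

# Text standard: compare the sequences of lines.
if same_text "$tmp/unix.txt" "$tmp/dos.txt"; then
  echo "lines: same"
else
  echo "lines: different"
fi
```

The two files differ as binary files yet are equal as text files, which is exactly the distinction git has to be told about per path.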
To return to our USB drive analogy
When I copy foo.txt from Windows to Linux I expect the contents to be invariant. However, I’m quite satisfied if H:foo.txt changes to /media/name/Transcend/foo.txt. In fact it would be more than a bit annoying if the windowsisms came through untranslated or vice versa.
Far-fetched?? ¡¡Think again!!
IOW thanks to splendid folks like Theodore Ts’o we take it for granted that Linux can read a Windows file (system). This happens because a non-trivial amount of
- abstraction matching
- abstraction hiding
happens under the hood.
Back to Git
We therefore expect that a file checked in to git is the same as the one that’s checked out… at a different time… and on a different OS!
The catch is that the notion of same is sufficiently non-trivial that git needs some help from us in achieving that “sameness” to our satisfaction… That help is called .gitattributes!