How to force consistent line endings in Git commits with cross-platform compatibility

Q1 Enforcing consistent line endings

Q2 Enforcing at commit as well as checkout (comment)

I’ll divide this into 2 parts: Practice and Principle

Practice

Expansion of code-apprentice’s suggestion

  1. Strictly avoid autocrlf. See why autocrlf is always wrong.
    And see here for the core git devs arguing about how ill-thought-out autocrlf is. Note particularly that the implementor is annoyed at the critic but doesn’t deny the criticism.
  2. Religiously use .gitattributes instead
  3. Set core.safecrlf=true to enforce commit-cleanliness. safecrlf is the answer to your Q2: a file that would change on a check-in/check-out round trip errors out at the check-in stage itself.

When a new repo is init-ed:
Go through the output of ls -lR and, for each file, choose its type: text, binary or ignore (i.e. put it in .gitignore).
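
The three rules and the init-time pass above might look like this in a fresh repository. This is only a minimal sketch: the scratch directory, the file patterns and the eol choices are assumptions for illustration, not something the original answer prescribes.

```shell
set -eu

# Scratch repository so the sketch does not touch a real project (assumption).
repo=$(mktemp -d)
cd "$repo"
git init -q .

# Rule 2: declare each file type explicitly. The patterns are illustrative only.
cat > .gitattributes <<'EOF'
*.txt  text
*.sh   text eol=lf
*.bat  text eol=crlf
*.png  binary
EOF

# Rule 1: keep autocrlf out of the picture.
git config core.autocrlf false

# Rule 3: refuse conversions that would not survive a round trip.
git config core.safecrlf true

# Show what ended up in the repository-local config.
git config --get core.safecrlf
```

Committing .gitattributes makes the typing travel with the repo. The two git config calls are per-clone, so every contributor would have to run them (or set them globally).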

Debugging:
Use git check-attr to check that attribute matching and computation are as desired.
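
A small sketch of that debugging step, assuming a scratch repo and two made-up file names (notes.txt and logo.png). The paths need not exist; git only matches them against the patterns.

```shell
set -eu

# Scratch repository with a tiny .gitattributes (both are assumptions).
repo=$(mktemp -d)
cd "$repo"
git init -q .
printf '*.txt text\n*.png binary\n' > .gitattributes

# Ask how the "text" attribute resolves for each hypothetical path.
git check-attr text -- notes.txt logo.png

# List every attribute that applies to one path.
git check-attr -a -- logo.png
```

For notes.txt the first command reports the text attribute as set. For logo.png it reports it as unset, because binary is a built-in macro that turns text off.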

Principle

Data Store

We may treat git as a data-store loosely analogous to how a USB drive is one.

We say the drive is working if the stuff we put in comes out the same. Else it’s corrupted. Likewise, if the file we commit comes out the same on checkout, the repo is fine; else (something) is borked. The key question is:

What does “same” mean?

It’s non-trivial because we implicitly apply different standards of “sameness” in different contexts!

Binary Files

  • A binary file is a sequence of bytes
  • Preserving that sequence faithfully amounts to reproducing the file

Text Files

…are different

  • A text file consists of a sequence of lines, each a sequence of «printable characters»; let’s leave the printable char notion unspecified other than to say no cr, no lf!

  • How these lines are separated (or terminated) is again unspecified

  • Symbolically:
    type Line = [Char]
    type File = [Line]

  • Expanding on the 1st unspecified gives us ASCII, the Latin encodings, Unicode, etc. Not relevant to this question

  • Expanding on the 2nd is what distinguishes Windows, *nix, etc. JFTR, this kind of file may be little known to the younger generation, but it also exists. It is particularly useful for remembering that the notion “sequence of lines” can be imposed at many different levels.

    For text files, “same” means the same sequence of lines; we don’t care how the unspecified parts are realized
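
To make the two standards of “sameness” concrete, here is a rough sketch. same_text is a hypothetical helper, not a git command. It assumes, as in the model above, that a cr never appears inside a line, so it simply strips every cr before comparing.

```shell
set -eu

# Hypothetical helper: succeeds when both files hold the same sequence of
# lines, regardless of whether lines end in LF or CRLF.
same_text() {
  sum_a=$(tr -d '\r' < "$1" | cksum)
  sum_b=$(tr -d '\r' < "$2" | cksum)
  [ "$sum_a" = "$sum_b" ]
}

dir=$(mktemp -d)
printf 'one\ntwo\n'     > "$dir/unix.txt"   # *nix-style line endings
printf 'one\r\ntwo\r\n' > "$dir/dos.txt"    # Windows-style line endings

# Binary standard: compare byte for byte.
if cmp -s "$dir/unix.txt" "$dir/dos.txt"; then
  echo "bytes: same"
else
  echo "bytes: different"
fi

# Text standard: compare as sequences of lines.
if same_text "$dir/unix.txt" "$dir/dos.txt"; then
  echo "lines: same"
else
  echo "lines: different"
fi
```

This prints "bytes: different" and then "lines: same": the two files are different binaries but the same text.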

To return to our

USB drive analogy

When I copy foo.txt from Windows to Linux I expect the contents to be invariant. However I’m quite satisfied if H:foo.txt changes to /media/name/Transcend/foo.txt. In fact it would be more than a bit annoying if the windowsisms came through untranslated or vice versa.

Far-fetched?? ¡¡Think again!!

IOW thanks to splendid folks like Theodore Ts’o we take it for granted that Linux can read a Windows file (system). This happens because a non-trivial amount of

  • abstraction matching
  • abstraction hiding

happens under the hood.

Back to Git

We therefore expect that a file checked in to git is the same as the one that’s checked out… at a different time… and on a different OS!

The catch is that the notion of same is sufficiently non-trivial that git needs some help from us in achieving that “sameness” to our satisfaction… That help is called .gitattributes!
