awk is handy for comparing records and fields between two or more files. The solution below relies on these key features:

  • For two files as input, NR==FNR will be true only when the first file is being processed
  • next will skip the rest of the script and fetch the next record
  • a[$0] by itself is a valid statement. It will create an uninitialized element in array a with $0 as the key (assuming the key doesn't exist yet)
  • $0 in a checks if the given string ($0 here) exists as a key in the array a
    $ cat colors_1.txt
    teal
    light blue
    green
    yellow
    $ cat colors_2.txt
    light blue
    black
    dark green
    yellow

    # common lines
    $ awk 'NR==FNR{a[$0]; next} $0 in a' colors_1.txt colors_2.txt
    light blue
    yellow

    # lines from colors_2.txt not present in colors_1.txt
    $ awk 'NR==FNR{a[$0]; next} !($0 in a)' colors_1.txt colors_2.txt
    black
    dark green
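
To see why NR==FNR singles out the first file, it can help to print both counters for every record. The following is a minimal sketch: it recreates the two sample files in a scratch directory (the `dir` variable and the `mktemp -d` location are assumptions for illustration, not part of the original example) and prints NR, FNR and the record itself.

```shell
# Recreate the sample files in a scratch directory (hypothetical location)
dir=$(mktemp -d)
printf '%s\n' teal 'light blue' green yellow > "$dir/colors_1.txt"
printf '%s\n' 'light blue' black 'dark green' yellow > "$dir/colors_2.txt"

# NR counts records across all input files, FNR restarts at 1 for each file
awk '{print NR, FNR, $0}' "$dir/colors_1.txt" "$dir/colors_2.txt"
```

The two counters match on the four records of the first file and diverge from the fifth record onwards (NR is 5 while FNR is back to 1), which is why the NR==FNR block only collects keys from colors_1.txt.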

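Warning: The NR==FNR logic fails if the first file is empty. Since no records are read from the first file, NR and FNR are still equal while the second file is being processed, so every record of the second file is treated as if it belonged to the first. You can set a flag after the first file has been processed to avoid this issue. See this unix.stackexchange thread for more workarounds.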

    # no output
    $ awk 'NR==FNR{a[$0]; next} !($0 in a)' /dev/null <(seq 2)

    # gives the expected output
    $ awk '!f{a[$0]; next} !($0 in a)' /dev/null f=1 <(seq 2)
    1
    2
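
Another possible workaround, not taken from the example above, is to compare FILENAME against ARGV[1] instead of relying on the record counters. The sketch below assumes that the first command-line argument is a file name (not a variable assignment) and that the two file names differ. The scratch directory and the names empty.txt and nums.txt are made up for illustration.

```shell
# Scratch files (hypothetical names): an empty first file and a two-line second file
dir=$(mktemp -d)
: > "$dir/empty.txt"
printf '%s\n' 1 2 > "$dir/nums.txt"

# NR==FNR stays true for the second file as well, so nothing is printed
awk 'NR==FNR{a[$0]; next} !($0 in a)' "$dir/empty.txt" "$dir/nums.txt"

# FILENAME==ARGV[1] is true only while the first file is being read
awk 'FILENAME==ARGV[1]{a[$0]; next} !($0 in a)' "$dir/empty.txt" "$dir/nums.txt"
```

The first command prints nothing, while the second prints both lines of nums.txt. Passing the same file name twice would defeat the FILENAME test, so the flag approach remains the more general option.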

Here's an example of comparing specific fields instead of whole lines. When you use a comma (,) between strings to construct the array key, awk joins them with the value of SUBSEP. This special variable defaults to the non-printing character \034, which rarely appears in text files.

    $ cat marks.txt
    Dept    Name    Marks
    ECE     Raj     53
    ECE     Joel    72
    EEE     Moi     68
    CSE     Surya   81
    EEE     Tia     59
    ECE     Om      92
    CSE     Amy     67

    $ cat dept_name.txt
    EEE Moi
    CSE Amy
    ECE Raj

    $ awk 'NR==FNR{a[$1,$2]; next} ($1,$2) in a' dept_name.txt marks.txt
    ECE     Raj     53
    EEE     Moi     68
    CSE     Amy     67
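
To make the role of SUBSEP more concrete, here is a small sketch that builds one comma-style key and inspects it. The function name subsep_demo and the variables joined, tuple, spaced and parts are made up for illustration.

```shell
subsep_demo() {
    awk 'BEGIN{
        a["EEE", "Moi"]                              # stored under the key "EEE" SUBSEP "Moi"
        joined = (("EEE" SUBSEP "Moi") in a)         # explicit concatenation finds the same key
        tuple  = (("EEE", "Moi") in a)               # multi-subscript membership test
        spaced = (("EEE Moi") in a)                  # a space-joined string is a different key
        print joined; print tuple; print spaced
        for (k in a) {                               # split the stored key back into its parts
            n = split(k, parts, SUBSEP)
            print n ": " parts[1] " " parts[2]
        }
    }'
}
subsep_demo
```

This prints 1, 1 and 0 for the three membership tests, followed by `2: EEE Moi`. It shows that a[$1,$2] is an ordinary one-dimensional array whose key happens to contain the SUBSEP character.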

Info: See also my CLI text processing with GNU awk ebook.