CLI tip 32: text processing between two files with GNU awk

awk is handy for comparing records and fields between two or more files. The solution below relies on these key features:

* For two files as input, NR==FNR will be true only while the first file is being processed
* next will skip the rest of the script and fetch the next record
* a[$0] by itself is a valid statement. It will create an uninitialized element in array a with $0 as the key (assuming the key doesn't exist yet)
* $0 in a checks if the given string ($0 here) exists as a key in the array a
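To see why NR==FNR singles out the first file, it can help to print both counters side by side. The following is a minimal sketch, not part of the original tip: the file names first.txt and second.txt are made up for illustration and are created in a temporary directory.

```shell
# Create two small sample files in a temporary directory (hypothetical names)
tmp=$(mktemp -d)
printf 'a\nb\n' > "$tmp/first.txt"
printf 'x\ny\nz\n' > "$tmp/second.txt"

# NR counts records across all input files, FNR restarts at 1 for each file
counters=$(cd "$tmp" && awk '{print FILENAME, NR, FNR}' first.txt second.txt)
printf '%s\n' "$counters"

# Clean up the temporary directory
rm -r "$tmp"
```

The two counters agree only on the first.txt lines (1 1 and 2 2). For second.txt, NR continues with 3, 4, 5 while FNR starts again from 1.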

$ cat colors_1.txt
teal
light blue
green
yellow
$ cat colors_2.txt
light blue
black
dark green
yellow

# common lines
$ awk 'NR==FNR{a[$0]; next} $0 in a' colors_1.txt colors_2.txt
light blue
yellow

# lines from colors_2.txt not present in colors_1.txt
$ awk 'NR==FNR{a[$0]; next} !($0 in a)' colors_1.txt colors_2.txt
black
dark green
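The same command works in the other direction if the file order is swapped, since the array is always built from whichever file is listed first. As a small sketch, the script below recreates the two sample files in a temporary directory (an assumption made only so that it is self-contained) and extracts the lines from colors_1.txt that are not present in colors_2.txt.

```shell
# Recreate the sample files in a temporary directory
tmp=$(mktemp -d)
printf '%s\n' 'teal' 'light blue' 'green' 'yellow' > "$tmp/colors_1.txt"
printf '%s\n' 'light blue' 'black' 'dark green' 'yellow' > "$tmp/colors_2.txt"

# colors_2.txt is now read first, so its lines become the array keys
# lines from colors_1.txt not present in colors_2.txt
only_in_first=$(awk 'NR==FNR{a[$0]; next} !($0 in a)' "$tmp/colors_2.txt" "$tmp/colors_1.txt")
printf '%s\n' "$only_in_first"

# Clean up the temporary directory
rm -r "$tmp"
```

With the sample data, this prints teal and green.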

Note that the NR==FNR logic will fail if the first file is empty. NR doesn't get a chance to increment while reading it, so NR==FNR stays true for every record of the second file as well. You can set a flag after the first file has been processed to avoid this issue. See this unix.stackexchange thread for more workarounds.
# no output
$ awk 'NR==FNR{a[$0]; next} !($0 in a)' /dev/null <(seq 2)

# gives the expected output
$ awk '!f{a[$0]; next} !($0 in a)' /dev/null f=1 <(seq 2)
1
2
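Another possibility, shown here only as a hedged sketch, is to compare FILENAME against ARGV[1] instead of relying on the record counters. The file names empty.txt and nums.txt are hypothetical. Note that this variation assumes the two file arguments are different strings, so it will not behave as intended if the same file is passed twice.

```shell
# Create an empty first file and a two-line second file (hypothetical names)
tmp=$(mktemp -d)
: > "$tmp/empty.txt"
printf '1\n2\n' > "$tmp/nums.txt"

# FILENAME==ARGV[1] is true only for records read from the first file argument,
# so an empty first file simply means the first rule never runs
not_in_first=$(awk 'FILENAME==ARGV[1]{a[$0]; next} !($0 in a)' "$tmp/empty.txt" "$tmp/nums.txt")
printf '%s\n' "$not_in_first"

# Clean up the temporary directory
rm -r "$tmp"
```

Both lines of nums.txt are printed here, which matches the result of the flag-based approach.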

Here's an example of comparing specific fields instead of whole lines. When you use a comma (,) between strings to construct the array key, the value of SUBSEP is inserted between them. This special variable has a default value of the non-printing character \034, which is unlikely to appear in text files.

$ cat marks.txt
Dept    Name    Marks
ECE     Raj     53
ECE     Joel    72
EEE     Moi     68
CSE     Surya   81
EEE     Tia     59
ECE     Om      92
CSE     Amy     67

$ cat dept_name.txt
EEE Moi
CSE Amy
ECE Raj

$ awk 'NR==FNR{a[$1,$2]; next} ($1,$2) in a' dept_name.txt marks.txt
ECE     Raj     53
EEE     Moi     68
CSE     Amy     67
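To make the role of SUBSEP more concrete, the sketch below (an illustration added here, with variable names chosen arbitrarily) builds a key with the comma syntax, checks that it equals the manual concatenation using SUBSEP, and then splits the key back into its parts.

```shell
# a["ECE","Raj"] is stored under the single string key "ECE" SUBSEP "Raj"
subsep_demo=$(awk 'BEGIN{
    a["ECE","Raj"]
    for (k in a) {
        # 1 if the key matches manual concatenation with SUBSEP, 0 otherwise
        same = (k == "ECE" SUBSEP "Raj")
        print same
        # split the combined key back into its two parts
        n = split(k, parts, SUBSEP)
        print n, parts[1], parts[2]
    }
}')
printf '%s\n' "$subsep_demo"
```

The first line of output is 1, confirming that the two forms produce the same key. The second line is 2 ECE Raj, which shows the number of parts followed by the recovered strings.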


See also my CLI text processing with GNU awk ebook.