wc

The wc utility is very handy for counting the number of lines, words, and bytes in a file. For plain ASCII text, the byte count is the same as the character count.

We can run wc on an individual file. For instance, we can see that the file /anvil/projects/tdm/data/consumer_complaints/complaints.csv has 7078918 lines, 374350052 words, and 2655317486 bytes.

wc /anvil/projects/tdm/data/consumer_complaints/complaints.csv

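We can get just the number of lines using wc -l, the number of words using wc -w, or the number of bytes using wc -c. To count characters rather than bytes, use wc -m; the two counts differ only when the file contains multibyte characters.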
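As a minimal sketch of these options, the commands below build a small throwaway file and count it four ways. The sample text and the variable names (tmpfile, lines, words, bytes, chars) are made up for illustration and are not part of the data directory.

```shell
# Create a small sample file (hypothetical contents, for illustration only).
tmpfile=$(mktemp)
printf 'one two three\nfour five\n' > "$tmpfile"

# Redirecting the file into wc keeps the filename out of the output,
# so each command prints only the count.
lines=$(wc -l < "$tmpfile")   # number of newline characters
words=$(wc -w < "$tmpfile")   # number of whitespace-separated words
bytes=$(wc -c < "$tmpfile")   # number of bytes
chars=$(wc -m < "$tmpfile")   # number of characters

echo "lines: $lines, words: $words, bytes: $bytes, characters: $chars"

# Remove the sample file.
rm -f "$tmpfile"
```

The sample file has 2 lines, 5 words, and 24 bytes. Because it is plain ASCII, the character count matches the byte count.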

Sometimes we want the combined counts for a collection of files. For instance, we can get the totals for all of the itcont files in the directory /anvil/projects/tdm/data/election, like this:

cat /anvil/projects/tdm/data/election/itcont*.txt | wc

We see that, altogether, the itcont*.txt files contain 229169299 lines, 1385963208 words, and 42790681570 bytes.
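Piping cat into wc reports only the combined counts. As an alternative sketch, giving wc several filenames directly prints one row per file followed by a total row. The directory and the part1.txt and part2.txt files below are hypothetical stand-ins, created only for this illustration.

```shell
# Build two small sample files in a temporary directory (hypothetical data).
tmpdir=$(mktemp -d)
printf 'a b\nc d\n' > "$tmpdir/part1.txt"
printf 'e f g\n'    > "$tmpdir/part2.txt"

# Passing the files as arguments: one row of counts per file, then a "total" row.
wc "$tmpdir"/part*.txt

# Concatenating first: a single combined line count, with no per-file breakdown.
total_lines=$(cat "$tmpdir"/part*.txt | wc -l)
echo "combined line count: $total_lines"

# Remove the sample files.
rm -rf "$tmpdir"
```

The per-file form is useful when we want to see which file contributes most; the cat form is convenient when only the grand total matters.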