Date: Sep 27, 2021

Linux commands to make you a CLI Wizard

🚧
Who is this for? People who do frequent shell scripting, and lazy people like me who absolutely hate typing long commands repeatedly and like to have an alias for every goddamn task possible. This is no beginner guide to Linux CLI commands; intermediate experience is recommended.

Let me say this once: I am not here to tell you about the basic (yet indispensable) commands like ls, mv, cp, rm, etc. In this post, we'll check out some lesser-known commands which, when used together, give extraordinary utility and a boost to your productivity. I am not talking about commands that you'll use once a day (like nslookup, top, or ifconfig), but rather commands that help with complex manipulation of stdout in Linux. These are critical when you are writing a shell script, or even when you just want to reduce manual labor on the terminal.

So let's begin!

What is Piping?

Piping is the process of redirecting (transferring) stdout to another destination, usually another command. It is predominantly used on Unix-based systems. A pipe connects the stdout of one command to the stdin of the next using the | symbol.

Syntax: command_1 | command_2 | command_3 | ... | command_N

You can see multiple instances of piping in the following text.
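As a quick, self-contained illustration (the input is inlined with printf just for the demo), here is a pipeline that chains three commands:

```shell
# Each stage's stdout becomes the next stage's stdin:
# printf emits numbers, sort -n orders them numerically,
# and tail keeps only the last three.
printf '5\n1\n9\n3\n7\n' | sort -n | tail -n 3
```

This prints 5, 7 and 9, one per line, without any intermediate files.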

grep command

The grep command searches its input for lines matching a string or pattern and displays them, optionally with a configurable number of surrounding lines. Since displaying on the terminal is writing to stdout, you can again pipe the output into a chain of further commands.

Useful Options

There are many options supported by grep. Below, I am highlighting the most used ones.

  1. -e <pattern> or --regexp=PATTERNS - Pass a regular expression to be used while searching for strings in text. If this option is used multiple times, or is combined with the -f (--file) option, grep searches for all the patterns given.
  2. -f FILE or --file=FILE - Instead of passing patterns as options on the command line, you can store them all in a single file (one per line) and pass the filename with -f to search for all of them.
  3. -v or --invert-match - As the name suggests, this option selects every line that does not match the given pattern.
  4. -c or --count - Print the number of lines that match the given patterns. This suppresses the normal output and only prints the count.
  5. -C NUM or --context=NUM - Print NUM lines of context around each matching line.
  6. -m NUM or --max-count=NUM - Stop reading a file after NUM matching lines.

Examples

  • Basic: Checking for a particular process. Let's say you wish to check whether a Kafka process is running on a particular machine.
ps -ef | grep kafka
  • Intermediate: Find groups of exactly four digits, i.e. digits not embedded in a longer run of digits.
# Using Extended regular expressions (by -E flag)
grep -E '(^|[^0-9])[0-9]{4}($|[^0-9])' file
  • Advanced: Match Valid Email Addresses
$ grep -E -o "\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,6}\b" data.txt
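The options above compose nicely. A small sketch (input inlined with printf just for the demo) that counts non-comment lines by combining -v and -c:

```shell
# -v keeps lines that do NOT start with '#'; -c prints only the count.
printf '# comment\nkey=1\n# another\nkey2=2\n' | grep -vc '^#'
# prints 2
```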

find command

The find command is used to search for files/directories under a given starting path. Similar to the grep command, find also supports regular-expression pattern matching. You can even pass arguments to perform subsequent operations on the results. It supports searching by name, type, creation date, modification date, owner, and permissions.

Useful Options

There are numerous flags and options supported by find command. Below I'll list down the most used ones.

  1. -exec <command> \; - This option runs a subsequent command on each file/directory in the search results. Everything following -exec is taken as the command until a ; is encountered. The string {} is replaced by the current file name being processed. Make sure you escape the semicolon (\;) to prevent your shell from interpreting it as a command separator.
  2. -regex <expression> - I don't think I need to explain this much. But one important thing to note is that this regex matches against the whole file path, not just the filename. By default, the regular expressions understood by find are Emacs regular expressions, but this can be changed with the -regextype option.
  3. -name <pattern> - If you want to apply a pattern only to the filename, use this option. You can see the Intermediate example below. Because the leading directories are removed, the file names considered for a match with -name never include a slash, so -name a/b will never match anything.
  4. -empty - This is a test option: true if the file is empty and is either a regular file or a directory.
  5. -maxdepth N - Limits how far find descends into the directory tree below the starting point. There is also an opposite, -mindepth N, which only reports results at or below the given depth.
  6. -type c - Search for a particular type of entry, which could be a regular file (f), directory (d), character device (c), symbolic link (l), socket (s), etc.
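To see -type, -name and -maxdepth interact, here is a self-contained sketch against a throwaway directory (the paths are created purely for the demo):

```shell
tmp=$(mktemp -d)                 # scratch directory for the demo
mkdir -p "$tmp/sub/deep"
touch "$tmp/a.txt" "$tmp/sub/b.txt" "$tmp/sub/deep/c.txt"

# Only a.txt and sub/b.txt are printed: deep/c.txt sits at depth 3,
# beyond -maxdepth 2.
find "$tmp" -maxdepth 2 -type f -name '*.txt'

rm -rf "$tmp"
```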

Examples

  • Basic: Get all hidden files in the home directory
find /home -name ".*"
  • Intermediate: Find and delete all files except those with a particular extension
find . -type f ! -name '*.yaml' -delete
  • Advanced: Clean out temporary files
find . \( -name a.out -o -name '*.o' -o -name 'core' \) -exec rm {} \;

sed command

Although mostly used for string substitution, sed (stream editor) can perform many operations on text, such as searching, find-and-replace, insertion, and deletion. There are alternatives to sed, like ed, but it is sed's ability to filter text in a pipeline that particularly distinguishes it from other editors.

The usage of the sed command differs from other, similar commands. Its format is -

sed OPTIONS... [SCRIPT] [INPUTFILE...]

There are not many options available. The main magic happens inside the script structure.

Useful Options

Let's take a look at a couple of options available to use with sed command -

  1. -e script - add the script to the commands to be executed
  2. -f script-file - add the contents of a script file to the commands to be executed
  3. -i[SUFFIX] - edit files in place (making a backup if SUFFIX is supplied; the original file is saved with the suffix appended to its filename)

Script Synopsis

Most of the work is done via the script string passed. Within commands like s, the fields are separated by /.

s/regexp/replacement/[flags] - (substitute) Match the regular expression against the content of the pattern space. If found, replace the matched string with replacement. Some of the flags -

  1. g - replace all occurrences (globally)
  2. Ng (e.g. 3g) - replace from the Nth occurrence onwards

d - Delete the pattern space; immediately start the next cycle.

i text - insert text before the current line.
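A quick sketch of the s flags and the d command described above (note that the numeric Ng flag is a GNU sed extension):

```shell
# Ng: substitute from the Nth occurrence onwards (GNU sed).
echo "a a a a a" | sed 's/a/b/3g'              # a a b b b

# d: delete every line matching the address /skip/.
printf 'keep\nskip me\nkeep too\n' | sed '/skip/d'
```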

Examples

  • Basic: Consider a file containing text - Apples are good for health. Replace Apples with Mangoes.
sed 's/Apples/Mangoes/g' file.txt
  • Intermediate: Enclose the first letter of every word in a sentence inside round brackets.
echo "Apples are good for health" | sed 's/\b\([A-Za-z]\)/(\1)/g'

# Output
(A)pples (a)re (g)ood (f)or (h)ealth
  • Advanced: Removing all the commented & empty lines from a file.
sed -e 's/#.*//;/^$/d' testfile.txt
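Option 3 from the list above (-i) in action, sketched on a scratch file. The -i.bak spelling is GNU sed; BSD/macOS sed wants a space, as in -i .bak:

```shell
f=$(mktemp)
printf 'Apples are good for health\n' > "$f"

# Edit the file in place, keeping the original as $f.bak
sed -i.bak 's/Apples/Mangoes/' "$f"
cat "$f"          # Mangoes are good for health

rm -f "$f" "$f.bak"
```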

awk command

💡
This utility is particularly useful for people working with Docker and Kubernetes. awk can easily parse the output of native docker and kubectl commands, which is basically tabular.

The awk utility executes programs written in the awk programming language, which is specialized for textual data manipulation. An awk program is a sequence of patterns and corresponding actions: when an input line matches a pattern, the associated action is carried out.

Input is always taken as a line unless otherwise specified.

AWK operations:
(a) Scans a file line by line
(b) Splits each input line into fields
(c) Compares input lines/fields to patterns
(d) Performs action(s) on matched lines
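The pattern/action structure looks like this in practice (the input is inlined just for illustration):

```shell
# An awk rule has the form: pattern { action }. Lines matching /ERROR/
# bump a counter; the END block runs once after all input is consumed.
printf 'INFO ok\nERROR disk\nERROR net\n' | awk '/ERROR/ {n++} END {print n}'
# prints 2
```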

Useful Options

Since awk is a full programming language, the awk command itself provides only a few options. You are still welcome to check out the documentation to get the full gist of this command. But first, let's take a look at the options -

  1. -F - Defines the input field separator. The default is whitespace.
  2. -f - If you don't wish to pass the whole awk program on the command line, you can store it in an external file and pass the filename with -f.
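A sketch of -F on a passwd-style line (the record is made up for the demo):

```shell
# -F: sets the field separator to ':'; print the user name (field 1)
# and the login shell (field 7).
echo 'alice:x:1000:1000::/home/alice:/bin/bash' | awk -F: '{print $1, $7}'
# prints: alice /bin/bash
```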

Examples

  • Basic: Print the first word of every line in a file
awk '{print $1}' myfile
  • Intermediate: Let's say you wish to remove inactive replica sets in a Kubernetes cluster.

grep replicaset.apps prints this to the terminal -

$ kubectl get all | grep replicaset.apps
NAME                                           DESIRED   CURRENT   READY   AGE
replicaset.apps/limosyn-com-ghost-6d47f4c9b9   1         1         1       7d9h
replicaset.apps/limosyn-com-mysql-dffbd7cdb    1         1         1       7d9h
replicaset.apps/nuxt-frontend-546dff5b99       1         1         1       7d2h
replicaset.apps/nuxt-frontend-67f5659d95       0         0         0       7d8h


# Delete the inactive replica set (the 4th row above)
$ kubectl delete $(kubectl get all | grep replicaset.apps | awk '{if ($2 + $3 + $4 == 0) print $1}')

Now if you look closely, $2 corresponds to the DESIRED column, $3 to CURRENT, and so on. Since replicaset.apps/nuxt-frontend-67f5659d95 has $2 + $3 + $4 == 0, that replica set gets deleted.

If not for awk, you'd have to select replica sets individually and delete them manually, or write a script to do so.

  • Advanced: Print the length of the longest line present in the file.
awk '{ if (length($0) > max) max = length($0) } END { print max }' sample.txt
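The column logic from the Intermediate example can be dry-run without a cluster by feeding awk a table shaped like the kubectl output (the rows here are made up for the demo):

```shell
# Keep only rows whose three numeric columns sum to zero,
# mimicking the "inactive replica set" filter above.
printf 'rs/active 1 1 1\nrs/stale 0 0 0\n' \
  | awk '{if ($2 + $3 + $4 == 0) print $1}'
# prints rs/stale
```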

comm command

The comm command is used to compare two sorted files (or streams) line by line. It displays three types of comparisons -

  1. Lines unique to the first file
  2. Lines unique to the second file
  3. Lines common to both files

Syntax: comm <file1> <file2>

Useful Options

You can combine the options. With no options, comm produces three-column output.

  1. -1/-2/-3 - Suppress column 1/2/3 respectively.
  2. -i - Case-insensitive comparison of lines (a BSD extension; not available in GNU comm)
  3. --check-order - Check that the input is sorted, even if all input lines are pairable.

Examples

$ cat words.txt
Apple
Banana
Orange
India
US
Canada

$ cat countries.txt
India
US
Canada

$ comm -23 <(sort words.txt | uniq) <(sort countries.txt | uniq)
Apple
Banana
Orange
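Conversely, combining -1 and -2 keeps only the lines common to both files. A self-contained sketch on scratch files (both inputs must already be sorted):

```shell
f1=$(mktemp); f2=$(mktemp)
printf 'Apple\nIndia\nUS\n' > "$f1"
printf 'Canada\nIndia\nUS\n' > "$f2"

# Suppress columns 1 and 2, leaving only the common lines.
comm -12 "$f1" "$f2"      # prints India and US

rm -f "$f1" "$f2"
```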

Last Words

So here we are at the end of this article. I have purposefully not included a lot of information for each command; you are better off referring to the documentation for each of them. My goal is to make you aware of such commands so that you don't always need to resort to scripting languages such as Python or Perl.

If you are a DevOps Engineer like me, you'll surely be required to set up event-triggered automation or cron jobs, which are mostly written in shell. Knowing some advanced commands like the ones above is certainly priceless.

Once you become comfortable with individual commands, you can combine them via piping to achieve unimaginable tasks!

This is it for this post and remember -

Be Lazy, do crazy

...Adios!