Yao Lirong's Blog

CS2043 Unix Tools and Scripting

2020/01/24

About

The goal of CS2043 is to introduce you to the UNIX/Linux “command line” and its accompanying tools. When done with this class you should feel comfortable navigating any UNIX shell prompt, installing UNIX/Linux systems and understanding any shell script that you may encounter down the road. We’ll cover basic commands through script writing and visit some of the more common tools used today!

LEC02 File System (01/24)

Root Directory

Unlike Windows, UNIX has a single global “root” directory (instead of a root directory for each disk or volume). The root directory is just /. Absolute paths start with a / and are always resolved starting from the root directory.

  • cat <filename>: concatenate files and print them to the terminal
  • cat >> <filename>: append whatever you type next in the shell to the specified file (press Ctrl+D to finish)
  • wc -l <filename>: count the number of lines in a file
  • wc -w <filename>: count the number of words in a file
  • touch <filename>: create an empty file if it does not exist yet
  • mkdir -p test/a/b: make the directory and any parent directories that do not exist
  • cp -r <src> <dest>: copy a complete directory
  • cp -f <src> <dest>: overwrite existing destination files more aggressively
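A minimal sketch that strings the commands above together in a throwaway directory. The directory and file names (test/a/b, notes.txt, backup) are made up for illustration.

```shell
#!/bin/bash
# Work in a temporary directory so nothing outside it is touched.
workdir="$(mktemp -d)"
cd "$workdir" || exit 1

mkdir -p test/a/b                 # creates test, test/a and test/a/b in one go
touch test/a/b/notes.txt          # creates an empty file
printf 'first line\nsecond line\n' >> test/a/b/notes.txt   # append two lines

# Reading the file through stdin keeps the file name out of wc's output.
lines="$(wc -l < test/a/b/notes.txt)"
words="$(wc -w < test/a/b/notes.txt)"
echo "lines: $lines, words: $words"

cp -r test backup                 # copy the whole directory tree
cat backup/a/b/notes.txt          # print the copied file
```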

LEC03 Permission (01/27)

Reading permission:

Linux representation    Permission by user type
-rwx------              User permissions
----rwx---              Group permissions
-------rwx              Other permissions

r - read, w - write, x - execute

  • groups <username>: list the groups this user belongs to; permissions can then be managed per group

  • chmod <mode> <filename>: change permissions:

    <mode>:

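    • +774: add rwx permissions for the user and group, and r permission for others
    • -222: remove the write permission from user, group, and others
    • =111: set the permissions of user, group, and others to execute only; any read or write permissions they previously had are lost

    The +, - and = prefixes on octal modes are accepted by GNU chmod but are not portable. The usual form is a plain octal mode such as chmod 754 <filename>, which sets the permissions exactly (r = 4, w = 2, x = 1, summed per user type).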
  • su: switch to another user, by default the super user (root)

  • sudo <command>: run a single command with super user privileges
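As a rough illustration of how octal and symbolic modes change the permission string, here is a small sketch. It assumes a temporary file called hello.sh (a made-up name) and reads the first ten characters of ls -l to see the result.

```shell
#!/bin/bash
workdir="$(mktemp -d)"
script="$workdir/hello.sh"        # hypothetical file used only for this demo
printf '#!/bin/sh\necho hello\n' > "$script"

chmod 644 "$script"               # rw-r--r--: no execute bit anywhere
before="$(ls -l "$script" | cut -c1-10)"

chmod 754 "$script"               # rwxr-xr--: user rwx, group r-x, others r--
after="$(ls -l "$script" | cut -c1-10)"

chmod o-r "$script"               # symbolic form: take read away from others
final="$(ls -l "$script" | cut -c1-10)"

echo "$before -> $after -> $final"
if [ -x "$script" ]; then echo "the file is executable"; fi
```

Each octal digit is the sum of r = 4, w = 2 and x = 1, so 7 means rwx, 5 means r-x and 4 means r--.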

LEC04 More Commands (01/29)

  • more / less: view a file one screen at a time

  • man <command>: read the manual page; type /something to search for “something”, n to go to its next occurrence

  • find <directory> -<criteria> <specification>:

    Search is recursive (will search all subdirectories too), so sometimes you may need to limit the depth with -maxdepth <int>

    Modifiers for find are evaluated in conjunction (a.k.a. AND), but you can combine them with OR using the -o flag.
    find can also run a command on the files / directories it finds through the -exec modifier.

    • The placeholder for the found path is {}
    • You have to end the command with either:
      • Semicolon (;, usually written \; so the shell does not interpret it): execute the command once for each result as it is found
      • Plus (+): collect the results first, then execute the command with as many of them as possible at once

    arguments for <criteria>:

    • -name: the file’s name

    • -amin n: file last access was n minutes ago

    • -atime n: file last access was n days ago

    examples:

    • find ./ -name "*.sh": find all files under the current directory whose names end in “.sh” (quote the pattern so the shell does not expand it first)
    • find . -amin -10 -exec cat {} +: display the contents of all files accessed in the last 10 minutes
    • find . -type f -readable -executable: all files that are readable and executable
    • find . -type f \( -readable -o -executable \): all files that are readable or executable (without the parentheses, -type f would apply only to -readable, because AND binds more tightly than -o)
    • find . -amin +10: find all files last accessed more than 10 minutes ago
    • find . -amin -10: find all files last accessed less than 10 minutes ago
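The sketch below tries these ideas on a small made-up directory tree (scripts/, docs/ and the file names are assumptions for the demo). It uses only -name, -type, -maxdepth, -o and -exec.

```shell
#!/bin/bash
workdir="$(mktemp -d)"
cd "$workdir" || exit 1
mkdir -p scripts/old docs
touch scripts/run.sh scripts/old/legacy.sh docs/readme.txt docs/notes.md

# Quote the pattern so the shell passes *.sh to find instead of expanding it.
all_sh="$(find . -name '*.sh' | sort)"

# Limit the recursion: stop two levels below the starting directory.
shallow_sh="$(find . -maxdepth 2 -name '*.sh')"

# Group OR-ed tests with escaped parentheses so -type f applies to both.
docs_count="$(find . -type f \( -name '*.txt' -o -name '*.md' \) | wc -l)"

# Run one wc for all results (+) instead of one wc per file (\;).
find . -type f -name '*.sh' -exec wc -c {} +

echo "$all_sh"
```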

LEC05 Zipping (01/31)

Zipping

  • tar -c -v -f <zipped_filename> <files_to_zip>:

    A tar file only bundles files together; it doesn’t compress them

    Remember to put -f as the last one, or at least -f must come right before <zipped_filename>

    • -c : create a new bundle
    • -v : verbose (output information about what’s going on)
    • -f : the name of the archive file to use
  • tar -xvf <archived_filename> [files_to_extract]:

    • -x: extract files
  • zip archive.zip file1 file2 ...: zip files

  • zip -r archive.zip folder/: zip folder recursively

  • unzip -Z <zip_filename>: show what’s inside the zip file

  • unzip <zip_filename>: unzip the file
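A small sketch of a tar round trip, assuming a made-up project/ directory. The -z (gzip compression), -t (list contents) and -C (extract into another directory) flags are not covered in the notes above, but they are standard tar options.

```shell
#!/bin/bash
workdir="$(mktemp -d)"
cd "$workdir" || exit 1
mkdir project
echo "alpha" > project/a.txt
echo "beta" > project/b.txt

tar -c -f bundle.tar project          # bundle only, no compression
tar -c -z -f bundle.tar.gz project    # -z additionally compresses with gzip

listing="$(tar -t -f bundle.tar | sort)"    # -t lists contents without extracting

mkdir restored
tar -x -z -f bundle.tar.gz -C restored      # -C extracts into another directory
restored_text="$(cat restored/project/a.txt)"
echo "$listing"
```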

Piping

<command1> | <command2>
  • ls -al /bin | less: show everything in directory /bin as scrollable
  • history | tail -20 | head -10: the 11th through 20th most recent commands (tail keeps the last 20 entries, head keeps the first 10 of those)
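To check which entries a tail-then-head pipeline keeps, this sketch substitutes seq 1 30 for a 30-entry history (an assumption made so the result is reproducible; 1 is the oldest entry, 30 the newest).

```shell
#!/bin/bash
# tail keeps entries 11..30, head then keeps the first ten of those.
window="$(seq 1 30 | tail -n 20 | head -n 10)"
first="$(printf '%s\n' "$window" | head -n 1)"
last="$(printf '%s\n' "$window" | tail -n 1)"
echo "entries $first through $last"    # prints: entries 11 through 20
```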

Redirection

Unless you redirect them, a command reads its input from the terminal and writes its output to the terminal

  • command > file: write the output of the command into file (overwrite)
  • command < file: use the file as the standard input of the command
  • command 2> file: write the error messages to a file (stderr is file descriptor 2)
  • command >> file: append the output of the command into file (doesn’t overwrite)
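A short sketch of the four redirections, using made-up file names (out.txt, listing.txt, errors.txt) inside a temporary directory.

```shell
#!/bin/bash
workdir="$(mktemp -d)"
cd "$workdir" || exit 1

echo "first"  > out.txt           # creates or overwrites out.txt
echo "second" >> out.txt          # appends a second line
line_count="$(wc -l < out.txt)"   # wc reads the file through stdin

# ls fails on a missing path; its message goes to stderr (descriptor 2).
status=0
ls ./does-not-exist > listing.txt 2> errors.txt || status=$?

echo "lines: $line_count, ls exit status: $status"
```

After this runs, listing.txt is empty and errors.txt holds the error message, because the two streams were sent to different files.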

LEC06 Loops and Variables (02/03)

Environment and Variables

  • environment variables: exported, so programs started from the shell inherit them
  • local variables: exist only in the current shell

Shebang

  • #!/bin/sh: execute the file using the Bourne shell (sh). sh names the standard shell language; /bin/sh is usually a link to an implementation such as bash or dash
  • #!/bin/bash: execute the file using the bash shell (bash), an sh-compatible implementation with modern extensions

exit code

The exit code of a program (the value returned from main in a C program) is not printed when it runs; the shell stores it in $?, so echo $? prints it.

exit N: exit with status N

executing multiple commands in a row

  • cmd1; cmd2: execute cmd 1 first, then cmd 2
  • cmd1 && cmd2: execute cmd2 only if cmd 1 returns 0 (exited normally)
  • cmd1 || cmd2: execute cmd2 only if cmd 1 doesn’t return 0 (failed)
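The following sketch shows exit codes driving && and ||. It relies only on the standard true and false commands and on a subshell that exits with status 3.

```shell
#!/bin/bash
true; ok_status=$?                # true always exits with 0

fail_status=0
false || fail_status=$?           # false always exits with a non-zero status

and_result="skipped"
false && and_result="ran"         # right side skipped: false did not return 0

or_result="skipped"
false || or_result="ran"          # right side runs: false failed

sub_status=0
( exit 3 ) || sub_status=$?       # a subshell that exits with status 3
echo "$ok_status $fail_status $and_result $or_result $sub_status"
```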

Scripting

We mostly use bash in our scripting, so remember to include #!/bin/bash at the top of each script.

Variables

storing command output: var="$(echo hello world)" (no spaces around =)

if statement

if [ CONDITION_1 ] 
then
# statements
elif [ CONDITION_2 ]
then
# statements
else
# statements
fi

The if...then...fi part is required. elif and else are optional.

Statements can be joined with ; to fit on one line, like if [[ 0 -eq 0 ]]; then echo "Hiya"; fi

for loop

for (( i = 0; i <= 11; ++i )); do
  echo "i: $i"
done

while loop

s="s"
while [[ "$s" != "ssss" ]]; do
echo "$s"
s="s$s"
done

x=0
while (( x <= 11 )); do
echo "x: $x"
(( ++x ))
done

# Loop through lines in a file
file="filename.txt"
while read -r line; do
echo "Line: $line"
done < "$file"
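Two small, self-contained uses of the loop forms above: a C-style for loop that sums 1 to 10, and a while read loop that counts non-empty lines. The temporary file stands in for filename.txt and its contents are made up.

```shell
#!/bin/bash
# Sum the integers 1..10 with a C-style for loop.
sum=0
for (( i = 1; i <= 10; ++i )); do
  sum=$(( sum + i ))
done
echo "sum: $sum"                  # prints: sum: 55

# Count the non-empty lines of a file with a while-read loop.
file="$(mktemp)"                  # temporary stand-in for filename.txt
printf 'one\n\nthree\n' > "$file"
non_empty=0
while read -r line; do
  if [[ -n "$line" ]]; then
    non_empty=$(( non_empty + 1 ))
  fi
done < "$file"
echo "non-empty lines: $non_empty"
rm -f "$file"
```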

Comparing Values

Numbers
  • $n1 -eq $n2 tests if n1 == n2
  • $n1 -ne $n2 tests if n1 != n2
  • $n1 -lt $n2 tests if n1 < n2
  • $n1 -le $n2 tests if n1 <= n2
  • $n1 -gt $n2 tests if n1 > n2
  • $n1 -ge $n2 tests if n1 >= n2
Strings
  • "$s1" == "$s2" tests if s1 and s2 are identical
  • "$s1" != "$s2" tests if s1 and s2 are different

Path Testing

  • Test if /some/path exists: -e /some/path
  • Test if /some/path is a file: -f /some/path
  • Test if /some/path is a directory: -d /some/path
  • Test if /some/path can be read/written/executed: -r/-w/-x /some/path
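The number, string and path tests can be combined inside if statements. In this sketch, describe is a hypothetical helper name and the paths are created in a temporary directory just for the demo.

```shell
#!/bin/bash
# describe is a hypothetical helper: it reports what kind of path it was given.
describe() {
  local path="$1"
  if [[ ! -e "$path" ]]; then
    echo "missing"
  elif [[ -d "$path" ]]; then
    echo "directory"
  elif [[ -f "$path" ]]; then
    echo "file"
  else
    echo "other"
  fi
}

dir="$(mktemp -d)"
touch "$dir/data.txt"
kind_dir="$(describe "$dir")"
kind_file="$(describe "$dir/data.txt")"
kind_none="$(describe "$dir/nope")"

n1=3; n2=7
# A numeric test and a string test joined with && inside [[ ]].
if [[ "$n1" -lt "$n2" && "$kind_file" == "file" ]]; then
  verdict="smaller and a file"
else
  verdict="something else"
fi
echo "$kind_dir $kind_file $kind_none: $verdict"
```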

Arithmetic Expression

Put arithmetic expressions inside (( )). To use the result as a value, for example in an assignment or as an argument, put $ in front: $(( )). Below are some examples:

echo $(( 2 + 3 )) #5
x=10; sum=$(( $x+10 ))
echo $sum #20
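A few more operators, as a sketch. Bash arithmetic works on integers only, so division truncates. The variable names are arbitrary.

```shell
#!/bin/bash
x=17; y=5
quotient=$(( x / y ))             # integer division: 3
remainder=$(( x % y ))            # modulo: 2
power=$(( 2 ** 10 ))              # exponentiation: 1024
(( x += 3 ))                      # (( )) without $ runs for its side effect; x is now 20
echo "$quotient $remainder $power $x"
```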

Passing Arguments

  • $1, $2, …, $9, ${10}: values of the first, second, etc. arguments (from the tenth on, braces are required: $10 would be read as $1 followed by 0)

    If 3 arguments are given, $4, $5, … higher are empty

  • $0 is the name of the script

  • $# is the number of arguments (argc in C)

  • $? is the exit code of the last program executed

    • You can have your script set this with exit <number> (read help exit)
    • A script that ends without calling exit returns the exit code of its last command (0, a.k.a. success, if that command succeeded)
  • "$*" expands $1 .. $n into one string, the same as "$1 $2 … $n"

  • "$@" expands $1 .. $n into individual strings, the same as "$1" "$2" .. "$n" (n strings)
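To see the difference between the two expansions without writing a separate script, this sketch uses set -- to fill in the positional parameters. count_args is a hypothetical helper that prints how many arguments it received.

```shell
#!/bin/bash
# count_args is a hypothetical helper: it reports how many arguments it received.
count_args() { echo "$#"; }

# set -- replaces the positional parameters, as if the script had been
# called with these three arguments (the second one contains a space).
set -- "one" "two words" "three"

argc=$#                            # 3
second="$2"                        # two words
fourth="${4:-<empty>}"             # $4 is unset, so the default text is used
as_one="$(count_args "$*")"        # "$*" joins everything into 1 string
as_many="$(count_args "$@")"       # "$@" keeps 3 separate strings
unquoted="$(count_args $@)"        # unquoted: word splitting gives 4

echo "$argc | $second | $fourth | $as_one | $as_many | $unquoted"
```

Quoting "$@" is usually what you want when forwarding arguments, because it preserves arguments that contain spaces.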

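Be careful with spacing when comparing two variables: [ and [[ are separate words, so write [ "$a" == "$b" ] with a space after [, before ], and around the operator. Assignments are the opposite: a=1 must have no spaces around =.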

Lec07 Your Shell, Job, and Processes (02/05)

Resource Monitoring

Commands

  • ps <PID> (process snapshot): report the current running processes, including PID
  • ps -C <command_name>: report the current process using its corresponding shell command
  • top: displays CPU usage of current processes
  • htop: better version of top, though not pre-installed in many Linux distributions
  • free -h: display available memory in human-readable format
  • nvidia-smi: display Nvidia GPU information

Examples

ps -C firefox #find firefox's pid through its command name
61860 ... firefox

htop -p 61860 #display usage of this specific process

Modifying Processes

  • nice -n <priority:int> <command name> : initialize command with non-default priority
  • renice -n <priority> -p <PID>: readjust the priority of a running process
  • kill <PID>: ask this process to terminate (sends SIGTERM by default)
  • killall <command name>: kill processes by name; all processes running this program are killed

Jobs

When we are executing ping or installing big packages, we may lose control of our command line temporarily. And we may want to run these commands in the background.

  • <command> &: run the command in background, but will still print output in the terminal
  • jobs: report jobs working in background
  • bg <job_id>: resumes the job in background (note: job id should come after %, like %1, or the command will take it as the PID)
  • fg <job_id>: resume job in the foreground
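fg and bg need an interactive shell, but the background-job idea can be sketched in a script. The example below uses $! (PID of the last background command), kill -0 (existence check) and wait, which are not covered in the notes above but are standard shell features. The exit status 143 assumes SIGTERM is signal 15, as it is on Linux and macOS.

```shell
#!/bin/bash
sleep 30 &                         # start a long-running command in the background
pid=$!                             # $! holds the PID of the most recent background job

if kill -0 "$pid" 2>/dev/null; then   # signal 0 only checks that the process exists
  state_before="running"
else
  state_before="gone"
fi

kill "$pid"                        # send the default signal, SIGTERM
status=0
wait "$pid" 2>/dev/null || status=$?   # wait reports 128 + signal number
echo "$state_before, exit status after kill: $status"
```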

Lec08 Your Shell (02/10)

  • source <script_name>: run the script in the current shell instead of in a spawned shell, as is usual
  • exec $SHELL: replace the current shell with a fresh instance of your shell
  • alias <new_name>=<old_name>: e.g. alias x='cd /Desktop' (no spaces around =)
  • ssh -X: enables X11 forwarding (lets graphical programs on the remote server display locally)
  • scp [flags] <from> <to> (secure copy): copy files to or from a remote host. You must specify the user on the remote host. Syntax for a remote path: user@host:/path (note: you need the : before the path)
  • ctrl + r reverse search your history for the most recent command that has the string you just typed in.
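The difference between running and sourcing a script can be seen with a one-line script that sets a variable. settings.sh and GREETING are hypothetical names used only in this sketch.

```shell
#!/bin/bash
workdir="$(mktemp -d)"
# settings.sh and GREETING are hypothetical names for this demo.
printf 'GREETING="hello from settings"\n' > "$workdir/settings.sh"

unset GREETING
bash "$workdir/settings.sh"        # runs in a child shell; the variable dies with it
after_run="${GREETING:-unset}"

source "$workdir/settings.sh"      # runs in the current shell; the variable stays
after_source="${GREETING:-unset}"

echo "after running: $after_run, after sourcing: $after_source"
```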

Lec10 Shell Expansions and Search (02/14)

Grammar of Shell Expansions

  • *: multiple character wildcard: matches any string, including the empty string
  • ?: single character wildcard: matches exactly one character, whatever it is
  • [brackets]: [a-zA-Z] matches one character in the given ranges
  • [^...]: not; [^abc] matches any one character that is not a, b, or c
  • {...,...}: expands to each of the comma-separated strings inside the braces (no spaces around the commas). {Hello,World} expands to “Hello” and “World”
  • \: escape the next character, for example a space:
    • {Hello,Goodbye} World = Hello Goodbye World
    • {Hello,Goodbye}\ World = Hello World Goodbye World (the space is escaped, so “World” is now part of each expanded word)
  • $: to read values (echo $PWD reads the PWD variable and then echo its value)
  • <: create instream from file
  • > >>: direct output to a file (overwrite or append)

GREP

grep <pattern> [input] Globally search a Regular Expression and Print. GREP can be used to search or filter large amounts of data.

  • grep -r <pattern> ./ search current directory and all its subdirectories for the pattern specified
  • grep -i <pattern> [input] ignore upper/lower case distinctions
  • -v display those lines that do NOT match
  • -n precede each matching line with the line number
  • -c print only the total count of matched lines

Regular Expressions

  • a?: matches a zero or one time
  • a*: matches a zero or more times
  • a+: matches a one or more times
  • .: wildcard, matches any single character

With grep, ? and + need the -E flag (extended regular expressions).
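A sketch of the grep flags and regex operators on five made-up lines. Each command counts or prints matches so the effect of the flag is visible.

```shell
#!/bin/bash
file="$(mktemp)"                   # temporary sample input
printf 'color\ncolour\nColor wheel\ncolr\ncooler\n' > "$file"

exact="$(grep -c 'color' "$file")"             # only "color": 1
ignore_case="$(grep -i -c 'color' "$file")"    # also "Color wheel": 2
optional_u="$(grep -E -c 'colou?r' "$file")"   # u is optional: color, colour: 2
one_or_more="$(grep -E -c 'co+l' "$file")"     # one or more o before l: 4
inverted="$(grep -v -c 'u' "$file")"           # lines without a "u": 4
numbered="$(grep -n 'colr' "$file")"           # 4:colr

echo "$exact $ignore_case $optional_u $one_or_more $inverted $numbered"
rm -f "$file"
```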

Lec11 Sed, Cut, and Paste

cut

  • cut -c M-N file: print out the Mth to Nth characters (-c) in file
  • cut -d " " -f 1 t.txt: print out the first field of each line in file t.txt, field is determined by delimiter space

sed

sed is the Stream EDitor. It goes line by line searching for the regular expression

  • Print sed -n '/pattern/p': print only the lines that match pattern (without -n, sed also prints every line once by default, so matching lines would appear twice)
  • Replace sed 's/pattern1/pattern2/': e.g. sed 's/no spoon/a fork/g' no_spoon.txt replaces every occurrence on every line (globally: /g) of “no spoon” with “a fork”; without g, only the first occurrence on each line is replaced
  • Delete sed '/pattern/d': e.g. sed '/[Dd]avid/d' david.txt deletes (d) every line that matches
  • Extended regular expressions sed -E '...': lets sed use the more usual flavor of regex, where + ? () have special meanings
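The cut and sed commands above, run on three made-up lines. The last sed call is an extra illustration of -E with capture groups, which the notes mention but do not demonstrate.

```shell
#!/bin/bash
file="$(mktemp)"                   # temporary sample input
printf 'David Smith 42\ndavid jones 17\nAlice Young 30\n' > "$file"

first_field="$(cut -d ' ' -f 1 "$file")"          # first space-delimited field per line
first_chars="$(cut -c 1-3 "$file" | head -n 1)"   # characters 1 to 3 of line 1: Dav

matches="$(sed -n '/[Dd]avid/p' "$file")"         # -n: print only matching lines
replaced="$(sed 's/[Dd]avid/Dave/' "$file" | head -n 1)"   # Dave Smith 42
remaining="$(sed '/[Dd]avid/d' "$file")"          # Alice Young 30
# -E enables + and ( ); \1 and \2 refer to the two captured words.
swapped="$(sed -E 's/^([A-Za-z]+) ([A-Za-z]+)/\2, \1/' "$file" | tail -n 1)"

printf '%s\n' "$first_chars" "$replaced" "$remaining" "$swapped"
rm -f "$file"
```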

xargs

xargs reads items from stdin and passes them as arguments to another command, so it can feed the output of one command to programs that only take arguments and do not read stdin.

  • xargs -n 1 <command>: pass at most 1 argument to each invocation of <command>

  • xargs -I '{}' <command> '{}': -I specifies where to use the arguments read in by xargs. Each input line is given the name {}, and every occurrence of {} in <command> is replaced with that line; <command> runs once per line.

    e.g. ls | grep 688 | xargs -I '{}' mv '{}' ./results/ moves each file in the current directory whose name contains 688 to results/

    we can also use a token other than {}, for example cat directories.txt | xargs -I % sh -c 'echo %; mkdir %'
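A runnable sketch of -n 1 and -I, assuming a temporary directory with made-up file names. It lists the directory with ls and filters the names with grep before handing them to xargs.

```shell
#!/bin/bash
workdir="$(mktemp -d)"
cd "$workdir" || exit 1
mkdir results
touch report_688.txt data_688.csv other.txt

# -n 1: echo is called once per argument, so each word lands on its own line.
per_arg="$(printf 'a b c\n' | xargs -n 1 echo | wc -l)"

# -I '{}': every {} in the command is replaced by one input line.
ls | grep 688 | xargs -I '{}' mv '{}' ./results/

moved="$(ls results | wc -l)"
# grep -c prints 0 but exits non-zero when nothing matches, hence || true.
left="$(ls | grep -c 688 || true)"
echo "calls: $per_arg, moved: $moved, left behind: $left"
```

Parsing ls output like this breaks on unusual file names; for those, find with -print0 and xargs -0 is the safer combination.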

shift & paste

shift <number>: drop the first <number> arguments

paste: merge multiple files

  • paste -d , names.txt phones.txt > result.csv: merge names and phones line by line, delimiting (-d) them with ‘,’
  • paste -d , -s names.txt phones.txt > result.csv: merge the files serially (-s) instead of in parallel, so each file becomes one line of output
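The difference between parallel and serial merging is easiest to see on tiny files. The names and phone numbers below are made-up sample data.

```shell
#!/bin/bash
workdir="$(mktemp -d)"
cd "$workdir" || exit 1
printf 'Ann\nBob\n' > names.txt            # made-up sample data
printf '555-0100\n555-0101\n' > phones.txt

paste -d , names.txt phones.txt > result.csv      # line N of each file side by side
parallel="$(cat result.csv)"
serial="$(paste -d , -s names.txt phones.txt)"    # each file becomes one output line
printf '%s\n---\n%s\n' "$parallel" "$serial"
```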

LEC12 awk, gawk, and Process Substitution

awk

Use awk on delimited fields on a per-line basis

The basic structure of an awk program is:

BEGIN {commands}
pattern1 { commands1 }
pattern2 { commands2 }
END {commands}
  • awk '/[Mm]onster/ {print}' frankenstein.txt: find lines matching the regex [Mm]onster and print them out.
    • awk '/[Mm]onster/' frankenstein.txt: if no action is specified, the default is to print the whole line
    • awk '/[Mm]onster/ {print $0}' frankenstein.txt: $0 refers to the whole line
  • awk '/[Mm]onster/ {print $1}' frankenstein.txt: prints the first word of each line that contains our pattern
  • awk '/Ron/{print $3}' marks.txt: prints the third column of each line containing ‘Ron’ in the file ‘marks.txt’

You can also use awk without specifying a pattern:

  • awk 'BEGIN{x=5; y=10; z=x+y; print z}': arithmetic in awk
  • awk '{print $1 ","}' t.txt: print out the first field of each line in file t.txt followed by a comma; fields are separated by whitespace by default

&& || a?b:c !(a&&b): also work in awk.

If you want to do regular expression, you need to enclose them with /regular expression/

  • awk '/s/?/8./:/9./ {print}' marks.txt: If there’s an ‘s’, look for grade in 80s, otherwise grade in 90s
  • awk '!/s/ {print}' marks.txt: Look for all lines that do not contain an ‘s’
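A sketch of the BEGIN / pattern / END structure on a stand-in for marks.txt. The three rows are invented; the real file's layout is not shown in these notes. NR is awk's built-in count of records read so far.

```shell
#!/bin/bash
file="$(mktemp)"                   # stand-in for marks.txt with made-up rows
printf '1 Ron 87\n2 Hermione 98\n3 Harry 91\n' > "$file"

ron_mark="$(awk '/Ron/ {print $3}' "$file")"      # third field of matching lines
high="$(awk '$3 >= 90 {print $2}' "$file")"       # a comparison works as a pattern too
# BEGIN runs before the first line, END after the last one.
average="$(awk 'BEGIN {sum = 0} {sum += $3} END {print sum / NR}' "$file")"
comma="$(awk '{print $1 ","}' "$file" | head -n 1)"

echo "$ron_mark | $average | $comma"
rm -f "$file"
```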

Process Substitution

We can treat a command or series of commands as if it were a file

  • <(list): the output of the commands can be read like an input file, e.g. while read x; do echo $x; done < <(git log)
  • >(list): whatever is written to it becomes the input of the commands, like an output file, e.g. echo "This is a test" > >(wc -w)
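Two common uses of input process substitution, sketched with printf standing in for real commands: comparing the output of two pipelines with diff, and feeding a while loop without a temporary file.

```shell
#!/bin/bash
# <(list): the output of each pipeline is read by diff as if it were a file.
differences="$(diff <(printf 'b\na\nc\n' | sort) <(printf 'a\nb\nd\n' | sort) | grep -c '^[<>]')"

# A while loop fed by <(...) runs in the current shell, so count survives the loop.
count=0
while read -r word; do
  count=$(( count + 1 ))
done < <(printf 'x\ny\nz\n')

echo "differing lines: $differences, words read: $count"
```

Piping into the loop instead (printf ... | while read ...) would run the loop in a subshell, and count would still be 0 afterwards.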

LEC13 Advanced Bash Scripting

Condition Statements: case

case employs a pattern match, using the same patterns as shell (filename) expansion

case "$var" in
"A" )
#commands to execute if [[ $var == "A" ]]
;;
2 )
#commands to execute if [[ $var -eq 2 ]]
;;
[2-4] )
#commands to execute if [[ $var -ge 2 ]] && [[ $var -le 4 ]]
;;
* )
#default commands
;;
esac
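A filled-in sketch of a case statement. classify is a hypothetical helper that maps a file name to a category using glob patterns; the | operator, which separates alternative patterns, is standard case syntax not shown in the notes above.

```shell
#!/bin/bash
# classify is a hypothetical helper that maps a file name to a category.
classify() {
  case "$1" in
    *.sh )          echo "script" ;;
    *.txt | *.md )  echo "text" ;;      # | separates alternative patterns
    [0-9]* )        echo "starts with a digit" ;;
    * )             echo "unknown" ;;   # default branch
  esac
}

a="$(classify run.sh)"
b="$(classify notes.md)"
c="$(classify 2020-report.pdf)"
d="$(classify Makefile)"
echo "$a, $b, $c, $d"
```

Only the first matching branch runs, so more specific patterns should come before more general ones.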

Arrays

arr=( use parentheses and separate items by space ) # no spaces around =
my_arr=( "a string" 1 ) # can be of multiple types

# You can also customize the indexes inside the array (so it's more like a dictionary than a traditional array)
my_arr[44]="string"

# perform an array operation by ${expr}
echo "Index 51: ${arr[51]}"

#iterate through the array as individual items
for x in "${arr[@]}"; do echo "$x"; done

#iterate through the array as one long joined string
for x in "${arr[*]}"; do echo "$x"; done

#iterate through the list of indexes
for idx in "${!my_arr[@]}"; do echo "$idx"; done
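A compact sketch of the array operations with made-up data. ${#fruits[@]}, the element count, is standard bash but not mentioned in the notes above.

```shell
#!/bin/bash
fruits=( "red apple" banana cherry )   # made-up sample data
fruits[10]="kiwi"                       # indexes do not have to be contiguous

count="${#fruits[@]}"                   # number of elements: 4
first="${fruits[0]}"                    # red apple
indexes="${!fruits[*]}"                 # 0 1 2 10

separate=0
for item in "${fruits[@]}"; do separate=$(( separate + 1 )); done   # 4 iterations
joined=0
for item in "${fruits[*]}"; do joined=$(( joined + 1 )); done       # 1 iteration

echo "$count | $first | $indexes | $separate | $joined"
```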

Lec99 Practical Tasks

  • find . -name "*.png" -type f -print0 | xargs -0 rm -v -f
    • find . -name "*.png" -type f: search for regular files (-type f) with this name
    • -print0: names are terminated by a null character, so spaces and other unusual characters in file names are handled correctly.
    • xargs -0: xargs also treats its input as null-terminated file names.
    • rm -v -f: xargs appends the file names to the end of this command. A “{}” placeholder is only replaced when xargs is given -I '{}'.