Aunty's Basic Unix Tutorial
With Portions "borrowed" from Sun Solaris Tutorials

BRIEF HISTORY OF UNIX

Difference Between DOS and UNIX

The FileSystem

Types of Files

File Permissions

INODES

 

 

BRIEF HISTORY OF UNIX

UNIX is a multiuser, multitasking operating system originally developed by AT&T. UNIX is written in C, a language also developed by AT&T. Because C can be compiled into many different machine languages, UNIX runs on a wider variety of hardware than any other operating system. UNIX has thus become synonymous with "open systems."

UNIX is made up of the kernel (fundamental tasks), the file system (the hierarchical directory for organizing the disk) and the shell (the interface that processes user commands). The major command-line interfaces are the Bourne shell, C shell and Korn shell. The UNIX vocabulary is extensive, with over 600 commands that manipulate data and text in every way conceivable. Many commands are cryptic (see comparison below), but just as Windows hides the DOS prompt, graphical user interfaces, such as OSF/Motif and Open Look, are presenting a friendlier image to UNIX users.

Command            UNIX    DOS
List directory     ls      dir
Copy a file        cp      copy
Delete a file      rm      del
Rename a file      mv      rename
Display contents   cat     type
Print a file       lpr     print
Check disk space   df      chkdsk

UNIX was developed in 1969 by Ken Thompson at AT&T, who scaled down the sophisticated MULTICS operating system for the PDP-7. The name was coined as a play on MULTICS: "un" for a single-user version, with the "ICS" ending respelled as "ix". More work was done by Dennis Ritchie, and, by 1974, UNIX had matured into a state-of-the-art operating system, primarily on PDPs. UNIX became very popular in scientific and academic environments.

Considerable enhancements were made to UNIX at the University of California at Berkeley, and versions of UNIX with the Berkeley extensions became widely used. By the late 1970s, commercial versions of UNIX, such as IS/1 and XENIX, became available.

In the early 1980s, AT&T began to consolidate the many UNIX versions into standards which evolved into System III and eventually System V. Before divestiture (1984), AT&T licensed UNIX to universities and other organizations, but was prohibited from outright marketing of the product. After divestiture, it began to market UNIX aggressively.

In 1989, UNIX Software Operation (USO) was formed as an AT&T division. USO introduced System V Release 4.0 (SVR4), which incorporated XENIX, SunOS, Berkeley 4.3BSD and System V into one UNIX standard. The System V Interface Definition (SVID) was introduced, which defined UNIX compatibility. In 1990, USO was turned into UNIX System Laboratories, Inc. (USL), an AT&T subsidiary. In 1993, USL was acquired by Novell and merged into Novell's UNIX Systems Group.

Although every major hardware vendor has its own version of UNIX, the X/Open consortium and the IEEE POSIX standards define common UNIX interfaces, an approach commonly referred to as "open systems." The Open Software Foundation (OSF) also promotes software for universal adoption.

More attempts at unifying UNIX into one standard have been made than for any other operating system. Over the years various industry consortia have tried to make UNIX a shrink-wrapped standard like DOS, Windows and the Mac. However, since UNIX runs on so many different hardware platforms, the only way the same UNIX software package can ever run on all of them is by the use of a pseudo language, such as proposed by the OSF (see ANDF in the OSF definition). While possible in theory, this is highly unlikely in the near future.

What UNIX application developers hope for is a single UNIX Application Programming Interface (API) so that they only have to recompile the source code for each platform, rather than maintain different versions of the source code. See Spec 1170.

Nevertheless, with all of its versions, UNIX has evolved into the archetype operating system for industrial-strength processing in a distributed environment. TCP/IP communications protocols are used in the Internet, the world's largest network of networks. SMTP provides e-mail, NFS allows files to be distributed across the network, NIS provides a "Yellow Pages" directory, Kerberos provides network security, and X Window allows a user to run applications on other machines in the network simultaneously.

Difference Between DOS and UNIX

Case Sensitivity - UNIX is case sensitive; DOS is not. Nearly all UNIX commands are lower-case; typing a command in upper case will usually result in a message that the command is not found.
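
Case sensitivity applies to file names as well as commands. Here is a small sketch of that, assuming a case-sensitive filesystem (the norm on UNIX) and a scratch directory made with mktemp -d; the file names report and Report are made up for the illustration.

```shell
# Work in a scratch directory so nothing else is touched.
demo=$(mktemp -d) || exit 1
cd "$demo" || exit 1

# Two names that differ only in case are two separate files.
echo "lower" > report
echo "upper" > Report

count=$(ls | wc -l | tr -d ' ')
echo "files created: $count"

# Clean up the scratch directory.
cd / && rm -rf "$demo"
```

Under DOS the second echo would simply have overwritten the first file.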

Command Interpreters - There is only one DOS command interpreter: the command.com file. UNIX has three primary ones, known as shells. Most vendors offer more than one with their software; users can pick the one they like the best from Bourne, Korn, and C.

Consistency - DOS is more consistent throughout the command process; for example, options are always preceded with a slash (/). In UNIX, sometimes letters are used for options; sometimes not.

Deletions - When DOS deletes a file, the first character of the file name is overwritten in the file's directory entry and the space the file occupied is marked in the FAT (file allocation table) as available for other files. This is why a deleted file can often be recovered, provided its space has not been reused; all you have to do is respecify the first character. In UNIX, a deleted file is really removed from the system, and it cannot be recovered. The only recourse is to retrieve it from a backup.

Extensions - Under UNIX, extensions have no special meaning to the system and an executable (run) file can have any kind of name. The period (.) has no special meaning unless it is the first character in the file name, in which case the file is "hidden" from directory listings. Additionally, directory listings are traditionally shown in ASCII order (numbers first, upper case next, lower case last).
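
The hidden-file rule and the ASCII ordering can be seen together in a short sketch. It assumes a scratch directory from mktemp -d and forces the traditional ordering with LC_ALL=C (many modern systems sort by locale otherwise); the four file names are invented for the example.

```shell
demo=$(mktemp -d) || exit 1
cd "$demo" || exit 1
touch .hidden Zebra apple 9lives

# Plain ls skips names that begin with a period.
plain=$(LC_ALL=C ls | tr '\n' ' ')

# ls -a includes them, along with the . and .. directory entries.
all=$(LC_ALL=C ls -a | tr '\n' ' ')

echo "ls    : $plain"
echo "ls -a : $all"

cd / && rm -rf "$demo"
```

The digit sorts first, then the upper-case name, then the lower-case one.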

File Names - In DOS, file names can only be 8 characters long. If an extension is used, the first 8 characters are followed by a "." then up to 3 additional characters. Additionally, extensions have meaning in DOS. Executable files must end in one of the following: ".EXE", ".COM", ".BAT".

File Separators - DOS uses the backslash (\) to separate directories and subdirectories; UNIX uses the forward slash (/).

Multi-user - DOS is based on a single-user scenario. UNIX is based on a multiple-user scenario; therefore, each user must log in before beginning to work. Once logged in, a UNIX user can only access files to which he or she has permission; DOS users can access all files on a single machine.

Multi-tasking - Only one command at a time can be specified on the DOS command line. Multiple commands can be given in UNIX, separating each with a semicolon if they are completely separate commands (clear ; pwd), or placing multiple requests on the same line if the same command is performed on each of them (ls -l *.exe *.com *.bat).

Options - DOS uses the forward slash (/) to signal a command option (eg: dir /w). UNIX uses the hyphen (-) in place of this (eg: ls -l).

Prompts - The standard DOS prompt reflects the drive and/or subdirectory ($p$g). The standard UNIX prompt is based on the shell used: a $ for most shells, a % for the C shell.

The FileSystem

The file tree is composed of chunks called filesystems, each of which consists of one directory and its subdirectories and files.

Filesystems are attached to the file tree with the mount command, which maps a directory within the existing file tree, called the mount point, to the root of the new filesystem.

The previous contents of the mount point become inaccessible as long as a filesystem is mounted there. Mount points are usually empty directories, however. For example, mount /dev/sd0a /users would install the filesystem stored on the device /dev/sd0a under the path /users. It is then possible to use ls /users to see what the filesystem contains.

Filesystems are detached with the umount command. Usually, a filesystem that is busy cannot be unmounted; that is, there mustn't be any open files or processes cd'd there. If the filesystem contains any executable files, they must not be running.

Note: There is a program called lsof that catalogs open file descriptors by process and filename. This program can be downloaded from vic.cc.purdue.edu via anonymous ftp.

The filesystem is used by the operating system to organize the system's storage resources, which can include different kinds and sizes of media (hard disks, floppy disks, CD-ROM, etc.).

UNIX uses a hierarchical file system, which means it stores data in a top-to-bottom, one-to-many organization structure. All access to the data starts at the top and proceeds down through the levels of the hierarchy.

In DOS, OS/2, and UNIX, the root directory is the starting point of the file system. Files can be stored in the root directory, or directories can be created off the root that hold files and subdirectories.

Throughout UNIX, root is used to mean beginning or superior. A root user (superuser) has the ability to change anything related to the file system without question: bring the system up, shut it down and anything in between.

The root filesystem includes the root directory and a minimal set of files and subdirectories. The kernel is stored in the root filesystem (usually called /unix, /vmunix, or /kernel/unix).

In addition to the kernel file, the root filesystem contains a number of subdirectories. The UNIX file tree can be quite deep, but on most systems no more than 1,023 characters can be used in a single path, nor more than 255 characters in a single file or directory name (the exact limits vary between UNIX versions). Pathnames are either absolute (starting from the root, eg: /bin/cp) or relative (starting from the current directory, eg: bin/cp). To access a file whose absolute pathname is longer than 1,023 characters, you must cd to an intermediate directory and then use a relative pathname.
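
The difference between absolute and relative pathnames can be sketched as follows. The scratch directory comes from mktemp -d (assumed to return an absolute path, as it normally does), and the bin/note file is made up for the example.

```shell
demo=$(mktemp -d) || exit 1
mkdir -p "$demo/bin"
echo "hello" > "$demo/bin/note"

# Absolute pathname: starts at the root, so it works from any directory.
abs=$(cat "$demo/bin/note")

# Relative pathname: resolved from the current directory.
cd "$demo" || exit 1
rel=$(cat bin/note)

echo "absolute: $abs"
echo "relative: $rel"

cd / && rm -rf "$demo"
```

Both commands read the same file; only the starting point of the lookup differs.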

The default subdirectories created when a new operating system is installed are as follows:

  • bin (or sbin) for important utilities
  • dev for device files for terminals, disks, modems, etc.
  • etc for critical startup and config files
  • lib for C compiler libraries
  • lost+found is used by fsck in emergencies; do not delete it (corrupted files are put here)
  • shlib for shared C compiler libraries
  • tcb Trusted computing base (for additional security implementation)
  • tmp for temporary file storage (files may disappear between reboots)
  • usr for most standard programs, on-line manuals, header files, games, etc.
  • var for spool directories, log files, accounting information, etc (generally host-specific files)

The bin Directory - contains binary files and executables. Many of the commands and utilities reside in the bin directory, including login, passwd, and others such as:

  • cat - list a file to screen
  • chgrp - change the group affiliation of a file
  • chmod - change the permissions on a file
  • chown - change the owner of the file
  • clear - clear the screen
  • cp - copy a file
  • csh - start a C shell (see also sh, ksh)
  • date - display current date and time
  • df - display file system information
  • diff - report the difference between two files
  • du - summarize disk usage
  • echo - display argument to screen
  • env - display current environment shell variable bindings
  • file - describe file type
  • find - find a file
  • grep - locate a string within a file (filter)
  • head - display the first lines in a file
  • kill - terminate a process
  • ksh - start a Korn shell
  • ls - list file names
  • mail - send and receive mail
  • mkdir - create a directory
  • mv - rename (move) a file or directory
  • pr - format a file for printing
  • ps - print the status of processes
  • pwd - print the current working directory path
  • rm - delete a file
  • rmdir - delete a directory
  • sh - start a Bourne shell
  • sort - sort output (filter)
  • tail - display the last lines in a file
  • tar - archive/restore files
  • vi - edit a file
  • who - identify other system users
  • wc - count the lines, words and characters in a file

The dev Directory - contains all the device information for the system. Every piece of hardware constitutes a device that must be defined to the system: terminals, tape drives, modems, etc. Key among these are:

  • clock - the system clock
  • dsk - hard drive
  • null - dumping ground where output and errors can be redirected, preventing them from writing to the screen or filling a file.

In a long listing (ls -l) of dev, each entry's permission string traditionally begins with b or c, depending on the type of device in question. If it begins with a "c", it is a character device, which reads and writes one character at a time - such as a terminal. If it begins with a "b", it is a block device, which reads and writes one or more blocks at a time - such as a backup tape device. The size of the file (usually "0") tips off the fact that the entries in the directory are not really files, but merely pointers and links.
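
As a quick check of the b and c markers, the sketch below looks at /dev/null, which is a character device on the systems we know of. The -L flag is there in case /dev/null is a symbolic link (as on Solaris). Which block devices show up, if any, depends entirely on the machine.

```shell
# The first character of the ls -l mode string gives the file type.
null_type=$(ls -lL /dev/null | cut -c1)
echo "/dev/null type: $null_type"    # c means character device

# Anything written to /dev/null is discarded.
echo "this text goes nowhere" > /dev/null

# Show up to five block devices (type b) visible in /dev, if there are any.
ls -l /dev | grep '^b' | head -5
```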

It is usually easy to tell what the device is by the name: tty=terminal; pty=pseudo terminal; mt=magnetic tape; mem=main memory; fd=floppy disk controller; ar=archive tape drive; alm=asynchronous line multiplexor.

The etc Directory - contains all the information specific to the machine on which it resides. Users must have read and execute permissions to this directory. Some of the more important files that reside here are as follows:

  • inittab - information about what devices are enabled per each run state
  • issue - text that appears on each login screen
  • magic - a database of what type files are recognized by the system
  • motd - a "Message of the Day" that appears each time a user logs in
  • profile - actions automatically executed with each login - usually includes checking for mail and setting the time zone
  • shadow - the encrypted passwords from the passwd file
  • TIMEZONE - information about the local time zone

The lib Directory - contains libraries of data for the various compilers on the system. Usually these are used by C language routines.

The lost+found Directory - corrupted files are placed here by the system. These files are rarely readable any longer; their presence here is simply to inform you that a problem has occurred. In a perfect system, this directory is empty.

The shlib Directory - contains shared library versions of programs used by compilers (generally C compilers).

The tcb Directory - Trusted Computing Base (not available on all versions of UNIX) is a system security implementation that goes beyond passwords and logins. The files that tcb needs are kept in this directory.

The tmp Directory - copies of open or in-use files are stored here as they are being modified or used by users. Files you intend to keep should never be stored in the tmp directory, as the system routinely empties this directory to keep it from filling up the system.

The usr Directory - User Specific Resources directory holds data (such as home directories) on individual users. Like etc, information in this directory is specific to the machine. Besides home directories and user info, the usr directory also has several subdirectories:

  • adm - main administrative directory; used for monitoring operations, process accounting, error reporting, and usage files. The adm directory contains sulog, which keeps track of each time a user invokes the su (superuser) utility; install scripts; and ctlog, which monitors ct usage.
  • bin - further utilities not in the standard bin directory such as at, awk, banner, cal, cancel, compress, cut, diff3, disable, egrep, fgrep, join, logname, lpr, man, more, news, nl, paste and vi.
  • include - uncompiled routines used by UNIX when compiling the kernel for the system.
  • lib - UNIX tables needed to keep the system running (eg. keyboard, terminals, uucp, cron and individual user account files or dirs.)
  • man - online manual entries describing each UNIX command.
  • spool - temporary data locations (eg: print, mail, uucp, cron spools)

The var Directory - contains only information that varies from one system to another. Traditionally, vendors add their enhancements to var enabling administrators to spot differences between UNIX products.

Types of Files

There are eight kinds of UNIX files:

Regular files - Most common file type (executable program, book chapter, GIF image, C source code, etc.)

Directories - Can contain any kind of file and can be created with mkdir and deleted with rmdir (if empty). If not empty can be deleted with rm -r.

Character device files - Peripheral devices that transfer data one byte at a time, such as a parallel or serial port.

Block device files - Peripheral devices that transfer a group of bytes (block, sector, etc.) of data at a time such as a disk.

UNIX domain sockets (BSD) - A UNIX socket is a communications interface that lets an application access a network protocol by "opening a connection (or socket)" and declaring a destination. UNIX domain sockets are local to a particular host and are referenced through a filesystem object rather than a network port. UNIX domain sockets are created with the socket system call and can be removed with the rm command or the unlink system call when the socket no longer has any users.

Named pipes (FIFOs) - Named pipes allow communication between two unrelated processes, usually running on the same host. They can be created with mknod (or, on many systems, mkfifo) and removed with rm.

Hard links - These allow a single file to have multiple names (or aliases). Since a link is a direct connection to a file, it must be part of the same filesystem as the file, and it doesn't work for directories. Hard links are created with ln and deleted with rm.

Soft links - A soft link or "symbolic" link points to a file by name rather than directly referencing the file as a hard link does. Soft links often point to directories and can span filesystems using either absolute or relative paths. They are created with ln -s and removed with rm.
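
Several of these file types can be made and inspected without special privileges. The sketch below assumes a scratch directory from mktemp -d and a system that provides mkfifo. All the file names are invented for the example. It also shows one practical difference between the two kinds of link.

```shell
demo=$(mktemp -d) || exit 1
cd "$demo" || exit 1

echo "data" > regular        # regular file
mkdir dir                    # directory
mkfifo pipe                  # named pipe (FIFO)
ln regular hard              # hard link: a second name for the same file
ln -s regular soft           # soft (symbolic) link: points to the name

# First character of each mode string: - regular, d directory, p pipe, l link.
types=""
for name in regular dir pipe hard soft; do
    types="$types$(ls -ld "$name" | cut -c1)"
done
echo "type characters: $types"

# Removing the original name breaks the soft link but not the hard link.
rm regular
after_hard=$(cat hard)
if [ -e soft ]; then soft_state=ok; else soft_state=dangling; fi
echo "hard link still reads: $after_hard; soft link is: $soft_state"

cd / && rm -rf "$demo"
```

Note that the hard link shows up as an ordinary file (-): once made, it cannot be told apart from the original name.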

File Permissions

Every file has a set of nine permission bits that control who can read, write and execute the contents of the file. Together with three other bits that affect the operation of executable programs, these bits constitute the file's mode. The 12 mode bits are stored together with 4 bits of file-type information in a 16-bit word.

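The 12 mode bits can be modified by the file's owner or the superuser using the chmod command. ls -l is used to inspect the values of these bits.

Setuid and Setgid Bits - the bits with the octal values 4000 and 2000. When set on an executable file, they let the program run with the identity of the file's owner or group, allowing it access to files and processes that the user running it could not otherwise reach (only important on executable files).

Sticky Bit - the bit with the octal value of 1000. On regular files it is usually ignored by modern kernels. When it is set on a directory, you cannot delete or rename a file in that directory unless you own the file or the directory, or are the superuser.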

The Permission Bits - 9 bits used to determine what operations may be performed on a file and by whom. Read, Write, Execute bits for: file owner, group owners, and everyone else.
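
To see how the extra bits show up in a listing, here is a small sketch using a scratch directory from mktemp -d. The names prog and shared are made up, and prog is just an empty file, so the setuid bit is only displayed here, never exercised.

```shell
demo=$(mktemp -d) || exit 1
cd "$demo" || exit 1
touch prog
mkdir shared

chmod 4755 prog      # setuid (4000) plus rwxr-xr-x
chmod 1777 shared    # sticky (1000) plus rwxrwxrwx

prog_mode=$(ls -ld prog | cut -c1-10)
shared_mode=$(ls -ld shared | cut -c1-10)
echo "$prog_mode  prog"       # the owner's x is shown as s
echo "$shared_mode  shared"   # the final x is shown as t

cd / && rm -rf "$demo"
```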

Changing Permissions: chmod changes the permissions on a file, usually specified in octal notation. Only the file's owner and the superuser can change a file's permissions.

Example: chmod 711 myprog (all permissions granted to owner and execute-only permission to everyone else.)
(r=read, w=write, x=execute)
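
The octal digits are easier to read once you know that each one is a sum: r counts 4, w counts 2 and x counts 1. The sketch below applies the 711 example to an empty stand-in file called myprog in a scratch directory (mktemp -d is assumed to be available).

```shell
demo=$(mktemp -d) || exit 1
cd "$demo" || exit 1
touch myprog

# Each octal digit is the sum of r=4, w=2, x=1.
# 7 = 4+2+1 (rwx) for the owner; 1 = x only for the group and for everyone else.
chmod 711 myprog
mode=$(ls -l myprog | cut -c1-10)
echo "$mode"

cd / && rm -rf "$demo"
```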

Assigning Default Permissions: The built-in shell command umask can set default permissions on the files you create. umask is specified as a three-digit octal value and when a file is created, its permissions are set to whatever the creating program asks for minus whatever the umask forbids.

Although users can't be forced to use a particular umask value, the system administrator can provide a default in the sample .cshrc and .profile files given to new users. The typical default umask value is 022, which removes write permission from the group and everyone else, leaving it only with the owner.
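
The effect of a 022 umask can be checked with a short sketch. It assumes a scratch directory from mktemp -d and relies on the usual behavior that touch asks for mode 666 and mkdir asks for mode 777; newfile and newdir are made-up names.

```shell
demo=$(mktemp -d) || exit 1
cd "$demo" || exit 1

umask 022            # forbid write for the group and everyone else
touch newfile        # touch asks for 666 (rw-rw-rw-)
mkdir newdir         # mkdir asks for 777 (rwxrwxrwx)

file_mode=$(ls -ld newfile | cut -c1-10)
dir_mode=$(ls -ld newdir | cut -c1-10)
echo "$file_mode  newfile"   # 666 with the 022 bits removed = 644
echo "$dir_mode  newdir"     # 777 with the 022 bits removed = 755

cd / && rm -rf "$demo"
```

Strictly speaking the umask bits are masked off rather than subtracted, but for common values like 022 the result is the same.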

INODES

An inode is like an index card, holding specific information about a file or directory, including: file type and mode, link count, file owner, group, size in bytes and date of last modification. The name of the file is not kept in the inode itself; names are stored in directory entries, each of which points to an inode.
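
A quick way to see inodes at work is ls -i, which prints the inode number in front of each name. In the sketch below (scratch directory from mktemp -d, made-up names original and alias), two hard-linked names report the same inode number, and the link count in ls -l goes up to 2.

```shell
demo=$(mktemp -d) || exit 1
cd "$demo" || exit 1
echo "data" > original
ln original alias

# ls -i prints the inode number in front of each name.
ino_original=$(ls -i original | awk '{print $1}')
ino_alias=$(ls -i alias | awk '{print $1}')
echo "original: inode $ino_original"
echo "alias:    inode $ino_alias"

# The second field of ls -l is the link count stored in the inode.
links=$(ls -l original | awk '{print $2}')
echo "link count: $links"

cd / && rm -rf "$demo"
```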

The chown command is used to change the ownership of a file. The chgrp command changes the group ownership. Most versions of chown and chgrp offer the recursive -R flag which changes the owner or group of a directory and all the files underneath it.

For example:

chmod 755 ~violet
chown -R violet ~violet
chgrp -R violet ~violet

might be used to set up the home directory of a new user (violet) after the system administrator has copied in the default startup files. (~violet is shorthand for "the home directory of violet" in most shells.)

Copyright © 1998 • Violet Weed, Inc. • Microsoft Certified