
pax - the POSIX archiver

pax can do a lot of fancy stuff. Feel free to contribute more awesome pax tricks!

The POSIX archiver, pax, is an attempt to have a standardized archiver that has the best features of the classic big two (tar and cpio) combined, while being able to work on all common archive types.

However, this is not a manpage: it will not list every possible option, and it will not describe every corner of pax in detail. It's just an introduction.

This article is based on the debianized Berkeley implementation of pax, but implementation-specific things should be tagged as such. Unfortunately, the Debian package doesn't seem to be maintained anymore.

Operation modes

There are four basic operation modes: list, read, write and copy. They are selected by combining the -r and -w commandline options:

Mode   RW-Options
List   no RW-options
Read   -r
Write  -w
Copy   -r -w


In list mode, pax writes the list of archive members (a table of contents) to the standard output. If patterns are specified on the commandline, only the filenames matching them are printed.


Read an archive. pax will read archive data and extract the members to the current directory. If patterns are specified on the commandline, only the filenames matching them are extracted.

When reading an archive, the archive type is guessed from the archive data.


Write an archive, which means to create a new one or to append to an existing one. All files and directories specified on the commandline are put into the archive. The archive is written to standard output by default.

If no files are given on the commandline, the names of the files to pack into the archive are read from STDIN.

The write mode is the only mode where you need to give the desired archive type with -x <TYPE>, e.g. -x ustar.


Copy-mode is similar to the passthrough mode of cpio. It provides a way to replicate a complete or partial file hierarchy (with all the possibilities pax gives you, e.g. rewriting groups etc.) to another location.
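
The four modes can be seen side by side in a small round trip. This is only a sketch under a few assumptions: the scratch directory, the src/ tree and the names demo.tar, extracted and copied are made up for illustration, and pax is assumed to be installed.

```shell
# Work in a scratch directory so nothing outside it is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

mkdir -p src/docs
printf 'hello\n' > src/docs/readme.txt

# Write mode: pack src/ into an archive file.
pax -w -x ustar -f demo.tar src

# List mode (neither -r nor -w): print the table of contents.
listing=$(pax -f demo.tar)
printf '%s\n' "$listing"

# Read mode: extract the members into the current directory.
mkdir extracted
(cd extracted && pax -r -f ../demo.tar)

# Copy mode: replicate src/ below copied/ without an intermediate archive.
mkdir copied
pax -r -w src copied
```

Afterwards both extracted/src/docs/readme.txt and copied/src/docs/readme.txt should exist.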

Archive data

When you don't specify anything special, pax will attempt to read archive data from the standard input (read/list modes) and write archive data to the standard output (write mode). This ensures that pax can be easily used as a member of a typical shell pipe construct, e.g. to read a compressed archive that's decompressed in the pipe.

The option to specify a pathname of a file to be used as archive is the -f option. This file will be used as in- or output, depending on what you do (read/write/list).

Whenever pax reads an archive, no matter from where, it tries to guess the type of the archive. In write mode, however, you tell it which type of archive to create, using the -x <TYPE> switch. When you omit this switch, an archive of the default type will be created (POSIX says it's implementation-defined; my Berkeley pax creates ustar if no option is given).
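
A minimal sketch of pax inside a pipe, using only standard input and output. The scratch directory and the names data/ and data.tar.gz are made up for illustration; gzip is assumed to be available.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir data
printf 'a,b\n' > data/table.csv

# Write mode without -f: the archive goes to standard output.
pax -w -x ustar data | gzip > data.tar.gz

# List mode without -f: the archive comes from standard input.
# No -x is needed here, pax works out the archive type on its own.
members=$(gzip -dc data.tar.gz | pax)
printf '%s\n' "$members"
```

The same pipe works for extraction: replace the final pax with pax -r.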

The following archive formats are supported (Berkeley implementation):

ustar POSIX TAR format (default)
cpio POSIX CPIO format
tar classic BSD TAR format
bcpio old binary CPIO format
sv4cpio SVR4 CPIO format
sv4crc SVR4 CPIO format with CRC

Additionally, the Berkeley pax supports the options -z and -j, similar to GNU tar, to filter archive files through GZIP/BZIP2.

Matching archive members

In read and list modes, you can specify patterns that pax matches against the archive members to decide which files to list or extract.

  • the pattern notation is the one known by a POSIX-shell, i.e. the one known by Bash without extglob
  • if the specified pattern matches a complete directory, then it affects all files rooted at this directory
  • if you specify the -c option, pax will negate the matches, i.e. it will match all filenames but the ones matched by the specified patterns
  • if no patterns are given, pax will "match" (list or extract) all files from the archive
  • To avoid conflicts with shell's pathname expansion, it's wise to quote those patterns!
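
Why the quoting matters can be shown without pax at all. The sketch below only counts the arguments a command would receive; the scratch directory and file names are made up for illustration.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir -p data/sales
touch data/sales/jan.txt data/sales/feb.txt

# Unquoted: the shell expands the glob first, so pax would receive
# two literal filenames instead of one pattern.
set -- data/sales/*.txt
unquoted_count=$#

# Quoted: the pattern would reach pax untouched and be matched against
# the archive members.
set -- 'data/sales/*.txt'
quoted_count=$#

printf 'unquoted: %s argument(s), quoted: %s argument(s)\n' \
    "$unquoted_count" "$quoted_count"
```

With an unquoted pattern, the result depends on which files happen to exist in the current directory, not on what is in the archive.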

Some assorted examples of patterns

pax -r <myarchive.tar 'data/sales/*.txt' 'data/products/*.png'

pax -r <myarchive.tar 'data/sales/year_200[135].txt'
# should be equivalent to
pax -r <myarchive.tar 'data/sales/year_2001.txt' 'data/sales/year_2003.txt' 'data/sales/year_2005.txt'

This is a brief description of using pax as a normal archiver system, like you would use tar.

Creating an archive

This task is done by the basic syntax

# archive contents to stdout
pax -w >archive.tar README.txt *.png data/

# equivalent, archive contents directly to a file
pax -w -x ustar -f archive.tar README.txt *.png data/

pax is in write mode, and the given filenames are packed into the archive:

  • README.txt is a normal file, it will be packed
  • *.png is a pathname glob for your shell, the shell will expand it to all matching filenames before pax is executed. The result is a list of filenames that will be packed like the README.txt above
  • data/ is a directory. Everything that's rooted in this directory will be packed into the archive, not only the empty directory itself

When you specify the -v option, pax will write the pathnames of the files it puts into the archive to STDERR.

When, and only when, no filename arguments are specified, pax attempts to read the names of the files to archive from STDIN, separated by newlines. This way you can easily combine find with pax:

find . -name '*.txt' | pax -wf textfiles.tar -x ustar

Listing archive contents

The standard output format for listing archive members is simply to print each filename on a separate line. But the output format can be customized to include permissions, timestamps, etc., using the -o listopt=<FORMAT> specification. The syntax of the format specification is strongly derived from the printf(3) format specification syntax.

Unfortunately the pax utility delivered with Debian doesn't seem to support these extended listing formats.

However, pax lists archive members in an ls -l-like format when you give the -v option:

pax -v <myarchive.tar
# or, of course
pax -vf myarchive.tar

Extracting from an archive

You can extract all files from an archive, or only the files that match (or do not match) specific patterns, using constructs like

# "normal" extraction
pax -rf myarchive.tar '*.txt'

# with negated pattern
pax -rf myarchive.tar -c '*.txt'

Copying files

To simply copy directory contents over to another directory, similar to a cp -a command, just do:

mkdir destdir
pax -rw dir destdir   # creates a copy of dir inside destdir/, i.e. destdir/dir

Copying files over ssh

To simply copy directory contents over to another directory on a distant machine, just do:

pax -w localdir | ssh user@host "cd distantdir && pax -r -v"
pax -w localdir | gzip | ssh user@host "cd distantdir && gunzip | pax -r -v"   # send the data compressed
These commands create a copy of localdir in distantdir (distantdir/localdir) on the distant machine.

Backup your daily work

Note: -T is an extension and is not defined by POSIX.

Say, you have write-access to some fileserver mounted into your system. In copy mode, you can tell pax to only copy the files that were modified today:

mkdir /n/mybackups/$(date +%A)/
pax -rw -T 0000 data/ /n/mybackups/$(date +%A)/
This is done using the -T switch, which normally lets you give a time range. In this case only the start point is given, which means "today, midnight".

When you execute this (of course very simple!) backup after your daily work, you will always have a copy of the files you modify.

Note: The %A format from date expands to the name of the current day, localized, e.g. "Friday" (en) or "Mittwoch" (de).
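
Because %A is localized, the directory names depend on the locale of the shell that runs the backup. A small sketch, assuming you prefer stable English day names regardless of the locale: force the POSIX locale for the date call. The variable names are made up for illustration.

```shell
# Backup location taken from the example above; adjust to your own mount point.
backup_root=/n/mybackups

# LC_ALL=C forces the POSIX locale, so %A always yields an English day name.
day=$(LC_ALL=C date +%A)
target="$backup_root/$day"
printf '%s\n' "$target"
```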

The very same, but using an archive, can of course be achieved by:

pax -w -T 0000 -f /n/mybackups/$(date +%A) data/
In this case the day name is an archive file (you don't need a filename extension like .tar, but if it makes you feel better, add one).

Changing filenames while archiving

pax is able to rewrite the filenames while archiving or while extracting from the archive. This simple example will create a tar archive containing the holiday_2007/ directory, but the directory name inside the archive will be holiday_pics/:

pax -x ustar -w -f holiday_pictures.tar -s '/^holiday_2007/holiday_pics/' holiday_2007/

The option responsible for the string manipulation is -s <REWRITE-SPECIFICATION>. It takes the rewrite specification as its argument, in the form /OLD/NEW/[gp], where OLD is an ed(1)-like basic regular expression (BRE). In general it can be used like the popular s/from/to/ command of ed or sed. Every non-null character can be used as delimiter, so to mangle pathnames (containing slashes), you could use #/old/path#/new/path#.

The optional g and p flags are used to apply the substitution (g)lobally to the line or to (p)rint the original and rewritten strings to STDERR.

Multiple -s options can be specified on the commandline. They are tried against the pathname strings of the files or archive members in the order they are specified. According to POSIX, processing of a pathname stops at the first substitution that succeeds.
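
Since a rewrite specification has the same old/new form as a sed substitution, you can try one on a few sample pathnames before handing it to pax. This is only a sketch: sed stands in for pax here, preview_rewrite is a made-up helper name, the specification is passed without the g and p flags, and corner cases of a particular pax implementation may differ.

```shell
# preview_rewrite SPEC: apply a pax-style /OLD/NEW/ specification to the
# pathnames on standard input, using sed as a stand-in for pax -s.
preview_rewrite() {
    sed -e "s$1"
}

rewritten=$(printf '%s\n' holiday_2007/ holiday_2007/beach.jpg notes.txt |
    preview_rewrite '/^holiday_2007/holiday_pics/')
printf '%s\n' "$rewritten"

# Any other delimiter works the same way, which helps with slashes.
moved=$(printf '%s\n' /old/path/file.txt |
    preview_rewrite '#/old/path#/new/path#')
printf '%s\n' "$moved"
```

The first call should print holiday_pics/, holiday_pics/beach.jpg and notes.txt, one per line. The second should print /new/path/file.txt.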

Excluding files from an archive

The -s option seen above can also be used to exclude a file completely: the substitution must result in a null string. For example, say you want to leave out all the CVS directories when creating an archive of your source code (you should really use cvs export for this, but…). To do so, replace every name containing /CVS/ with nothing. Note the .* on both sides: they are needed because the whole pathname has to be matched.

  pax -w -x ustar -f release.tar -s ',.*/CVS/.*,,' myapplication
You can use several -s options. For instance, suppose you also want to remove the Emacs backup files ending in ~:
  pax -w -x ustar -f release.tar -s ',.*/CVS/.*,,' -s '/.*~//' myapplication

This can also be done while reading an archive. For instance, suppose you have an archive containing a "usr" and an "etc" directory, but you only want to extract the "usr" directory:

pax -r -f archive.tar -s ',^etc/.*,,'   # the contents of etc/ are not extracted
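
To check what an exclusion really does, list the archive after creating it. The sketch below builds a tiny made-up source tree in a scratch directory (all file names are illustrative) and assumes pax is installed.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir -p myapplication/src/CVS
printf 'int main(void) { return 0; }\n' > myapplication/src/main.c
printf 'D\n' > myapplication/src/CVS/Entries
printf 'old\n' > 'myapplication/src/main.c~'

# Names that are rewritten to the empty string are left out of the archive.
pax -w -x ustar -f release.tar \
    -s ',.*/CVS/.*,,' -s '/.*~//' myapplication

# List the result to confirm what was actually packed.
packed=$(pax -f release.tar)
printf '%s\n' "$packed"
```

The listing should contain myapplication/src/main.c, but neither CVS/Entries nor main.c~.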

Getting filenames to archive from STDIN

Like cpio, pax can read filenames from standard input (stdin). This provides great flexibility - for example, a find(1) command may select files/directories in ways that pax can't do by itself. In write mode (creating an archive) or copy mode, when no filenames to archive are given, pax expects to read filenames from standard input. For example:

# Back up config files changed less than 3 days ago
find /etc -type f -mtime -3 | pax -x ustar -w -f /backups/etc.tar

# Copy only the directories, not the files
mkdir /target
find . -type d -print | pax -r -w -d /target

# Back up anything that changed since the last backup
find . -newer /var/run/mylastbackup -print0 |
    pax -0 -x ustar -w -d -f /backups/mybackup.tar
touch /var/run/mylastbackup

The -d option tells pax not to recurse into any directories that it reads (cpio-style). Without -d, pax recurses into all directories (tar-style).

Note: the -0 option is not standard, but is present in some implementations.

pax can handle the tar archive format. If you want to switch to the standard tool, an alias like:

alias tar='echo USE PAX, you idiot. pax is the standard archiver!; # '
in your ~/.bashrc can be useful :-D.

Here is a quick table comparing (GNU) tar and pax to help you to make the switch:

tar command                      pax equivalent                          Notes
tar xzvf file.tar.gz             pax -rvz -f file.tar.gz                 -z is an extension, POSIXly: gunzip <file.tar.gz | pax -rv
tar czvf archive.tar.gz path …   pax -wvz -f archive.tar.gz path …       -z is an extension, POSIXly: pax -wv path | gzip > archive.tar.gz
tar xjvf file.tar.bz2            bunzip2 <file.tar.bz2 | pax -rv
tar cjvf archive.tar.bz2 path …  pax -wv path | bzip2 > archive.tar.bz2
tar tzvf file.tar.gz             pax -vz -f file.tar.gz                  -z is an extension, POSIXly: gunzip <file.tar.gz | pax -v

pax might not create ustar (tar) archives by default, but its own pax format. Add -x ustar if you want to be sure to create tar archives!
