10 Digit Numeric Wordlist

On Unix-like operating systems, the csh command launches the C shell, a command interpreter with a syntax inspired by the C programming language.

Description

csh is a command language interpreter with many powerful features, including a history mechanism (see History substitutions), job control facilities (see Jobs), interactive file name and username completion (see File Name Completion), and a C-like syntax. It is used both as an interactive login shell and a shell script command processor.

If the first argument (argument 0) to the shell is a dash ('-'), then csh is run as a login shell. A login shell also can be specified by invoking the shell with the -l flag as the only argument.

Syntax

    csh [-bcefimnstVvXx] [argument ...]
    csh [-l]

Options

-b    This flag forces a 'break' from option processing, causing any further shell arguments to be treated as non-option arguments. The remaining arguments will not be interpreted as shell options. This may be used to pass options to a shell script without confusion or possible subterfuge. The shell will not run a set-user-ID script without this option.
-c    Commands are read from the single argument following -c, which must be present. Any remaining arguments are placed in argv.
-e    The shell exits if any invoked command terminates abnormally or yields a non-zero exit status.
-f    The shell will start faster, because it will neither search for nor execute commands from the file .cshrc in the invoker's home directory. Note that if the environment variable HOME is not set, fast startup is the default.
-i    The shell is interactive and prompts for its top-level input, even if it appears not to be a terminal. Shells are interactive without this option if their inputs and outputs are terminals.
-l    The shell is a login shell (but only if -l is the only flag specified).
-m    Read .cshrc, regardless of its owner and group. This option is dangerous and should only be used by the superuser.
-n    Commands are parsed, but not executed. This is useful for checking the syntax of shell scripts. When used interactively, the shell can be terminated by pressing control-D (end-of-file character), since exit will not work.
-s    Command input is taken from the standard input.
-t    A single line of input is read and executed. A backslash (\) may be used to escape the newline at the end of this line and continue onto another line.
-V    Causes the verbose variable to be set even before .cshrc is executed.
-v    Causes the verbose variable to be set, with the effect that command input is echoed after history substitution.
-X    Causes the echo variable to be set even before .cshrc is executed.
-x    Causes the echo variable to be set, so that commands are echoed immediately before execution.

After flag arguments are processed, if arguments remain but none of the -c, -i, -s, or -t options were given, the first argument is taken as the name of a file of commands to be executed. The shell opens this file, and saves its name for possible resubstitution by $0. Since many systems use either the standard version 6 or version 7 shells whose shell scripts are not compatible with this shell, the shell will execute such a 'standard' shell if the first character of a script is not a hash mark (#); i.e., if the script does not start with a comment. Remaining arguments initialize the variable argv.

An instance of csh begins by executing commands from the file /etc/csh.cshrc and, if this is a login shell, /etc/csh.login. It then executes commands from .cshrc in the home directory of the invoker, and, if this is a login shell, the file .login in the same location. It is typical for users on CRT monitors to put the command 'stty crt' in their .login file, and to also invoke tset there.

In the normal case, the shell will begin reading commands from the terminal, prompting with % . Processing of arguments and the use of the shell to process files containing command scripts are described below.

The shell repeatedly performs the following actions: a line of command input is read and broken into 'words'. This sequence of words is placed on the command history list and parsed. Finally each command in the current line is executed.

When a login shell terminates it executes commands from the files .logout in the user's home directory and /etc/csh.logout.

Lexical Structure

The shell splits input lines into words at blanks and tabs with the following exceptions. The characters &, |, ;, <, >, (, and ) form separate words. If doubled in &&, ||, <<, or >>, these pairs form single words. These parser metacharacters may be made part of other words, or have their special meaning prevented, by preceding them with a backslash (\). A newline preceded by a \ is equivalent to a blank.
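The splitting rules above can be illustrated with a small Python model. This is only a sketch of the rules as stated in this section, not csh's actual lexer: the names `META`, `DOUBLED`, and `split_words` are made up for the example, and quoting and comments are deliberately left out.

```python
# Illustrative model of the word-splitting rules; quotes and comments are NOT handled.
META = "&|;<>()"
DOUBLED = ("&&", "||", "<<", ">>")


def split_words(line):
    """Split one input line into words the way the paragraph above describes."""
    words, current, i = [], "", 0
    while i < len(line):
        ch = line[i]
        if ch == "\\" and i + 1 < len(line):
            nxt = line[i + 1]
            if nxt == "\n":
                # backslash-newline is equivalent to a blank: end the current word
                if current:
                    words.append(current)
                    current = ""
            else:
                # an escaped metacharacter becomes part of the current word
                current += nxt
            i += 2
        elif ch in " \t":
            if current:
                words.append(current)
                current = ""
            i += 1
        elif ch in META:
            if current:
                words.append(current)
                current = ""
            pair = line[i:i + 2]
            if pair in DOUBLED:
                words.append(pair)  # &&, ||, << and >> are single words
                i += 2
            else:
                words.append(ch)
                i += 1
        else:
            current += ch
            i += 1
    if current:
        words.append(current)
    return words
```

For instance, `split_words("ls -l|wc>>out&")` separates the pipe, the append redirection, and the trailing & into their own words even though no blanks surround them.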

Strings enclosed in matched pairs of quotations, ', `, or ", form parts of a word; metacharacters in these strings, including blanks and tabs, do not form separate words. These quotations have semantics to be described later. Within pairs of ' or " characters, a newline preceded by a \ gives a true newline character.

When the shell's input is not a terminal, the character # introduces a comment that continues to the end of the input line. This special meaning is prevented when preceded by \ and in quotations using `, ', and ".

Commands

A simple command is a sequence of words, the first of which specifies the command to be executed. A simple command or a sequence of simple commands separated by | characters forms a pipeline. The output of each command in a pipeline is connected to the input of the next. Sequences of pipelines may be separated by ;, and are then executed sequentially. A sequence of pipelines may be executed without immediately waiting for it to terminate by following it with a &.

Any of the above may be placed in () to form a simple command (that may be a component of a pipeline, for example). It is also possible to separate pipelines with || or && showing, as in the C language, that the second is to be executed only if the first fails or succeeds, respectively. See Expressions.

Jobs

The shell associates a job with each pipeline. It keeps a table of current jobs, printed by the jobs command, and assigns them small integer numbers. When a job is started asynchronously with &, the shell prints a line that looks like:

    [1] 1234

showing that the job which was started asynchronously was job number 1 and had one (top-level) process, whose process ID was 1234.

If you are running a job and wish to do something else you may hit ^Z (control-Z), which sends a SIGTSTP (terminal stop) signal to the current job. The shell will then normally show that the job has been 'Stopped', and print another prompt. You can then manipulate the state of this job, putting it in the background with the bg command, or run some other commands and eventually bring the job back into the foreground with the fg command. A ^Z takes effect immediately and is like an interrupt in that pending output and unread input are discarded when it is typed. There is another special key ^Y that does not generate a stop signal until a program attempts to read it. This request can usefully be typed ahead when you have prepared some commands for a job that you want to stop after it has read them.

A job being run in the background will stop if it tries to read from the terminal. Background jobs are normally allowed to produce output, but this can be disabled by giving the command stty tostop. If you set this tty option, then background jobs will stop when they try to produce output like they do when they try to read input.

There are several ways to refer to jobs in the shell. The character % introduces a job name. If you want to refer to job number 1, you can name it as %1. Just naming a job brings it to the foreground; so, %1 is a synonym for fg %1, bringing job number 1 back into the foreground. Similarly, saying %1 & resumes job number 1 in the background. Jobs can also be named by prefixes of the string typed in to start them, if these prefixes are unambiguous; thus %ex would normally restart a suspended ex job, if there were only one suspended job whose name began with the string 'ex'. It is also possible to say %?string, which specifies a job whose text contains string, if there is only one such job.
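As a rough illustration of the naming rules above, the following Python sketch resolves a job name against a table of jobs. The function name `resolve_job` and the sample job table are hypothetical, and the %+, %-, and %% forms are not modeled.

```python
def resolve_job(spec, jobs):
    """Map a job name such as %1, %ex or %?string to a job number.

    jobs is a dict of job number -> command text (an assumed representation).
    """
    name = spec[1:]  # drop the leading '%'
    if name.isdigit():
        if int(name) in jobs:
            return int(name)
        raise LookupError("no such job: " + spec)
    if name.startswith("?"):
        # %?string: the job whose text contains the string
        matches = [n for n, text in jobs.items() if name[1:] in text]
    else:
        # %prefix: the job whose text begins with the prefix
        matches = [n for n, text in jobs.items() if text.startswith(name)]
    if len(matches) != 1:
        # zero matches or an ambiguous name is an error
        raise LookupError("no unique job for: " + spec)
    return matches[0]
```

With the assumed table `{1: "ex write.c", 2: "make all"}`, both `%1` and `%ex` name job 1, and `%?all` names job 2.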

The shell maintains a notion of the current and previous jobs. In output about jobs, the current job is marked with a + and the previous job with a -. The abbreviation %+ refers to the current job and %- refers to the previous job. For close analogy with the syntax of the history mechanism (described below), %% is also a synonym for the current job.

The job control mechanism requires that the stty option new be set. It is an artifact from a new implementation of the tty driver that allows generation of interrupt characters from the keyboard to tell jobs to stop. See stty for details on setting options in the new tty driver.

Status Reporting

The shell learns immediately whenever a process changes state. It normally informs you whenever a job becomes blocked so that no further progress is possible, but only just before it prints a prompt. This is done so that it does not otherwise disturb your work. If, however, you set the shell variable notify, the shell will notify you immediately of changes of status in background jobs. There is also a shell command notify that marks a single process so that its status changes will be immediately reported. By default notify marks the current process; say notify after starting a background job to mark it.

When you try to leave the shell while jobs are stopped, you will be warned that 'You have stopped jobs'. You may use the jobs command to see what they are. If you try to exit again immediately, the shell will not warn you a second time, and the suspended jobs will be terminated.

File Name Completion

When the file name completion feature is enabled by setting the shell variable filec (see set), csh will interactively complete file names and usernames from unique prefixes when they are input from the terminal followed by the escape character (the escape key, or control-[). For example, if the current directory looks like

    DSC.OLD  bin       cmd     lib   xmpl.c
    DSC.NEW  chaosnet  cmtest  mail  xmpl.o
    bench    class     dev     mbox  xmpl.out

and the input is

    % vi ch<escape>

csh will complete the prefix 'ch' to the only matching file name 'chaosnet', changing the input line to

    % vi chaosnet

However, given

    % vi D<escape>

csh will only expand the input to

    % vi DSC.

and will sound the terminal bell to indicate that the expansion is incomplete, since there are two file names matching the prefix D.

If a partial file name is followed by the end-of-file character (usually control-D), then, instead of completing the name, csh will list all file names matching the prefix. For example, the input

    % vi D<control-D>

causes all files beginning with D to be listed:

    DSC.NEW  DSC.OLD

while the input line remains unchanged.

The same system of escape and end-of-file can also be used to expand partial usernames, if the word to be completed (or listed) begins with the tilde character (~). For example, typing

    cd ~ro<escape>

may produce the expansion

    cd ~root

The use of the terminal bell to signal errors or multiple matches can be inhibited by setting the variable nobeep.

Normally, all files in the particular directory are candidates for name completion. Files with certain suffixes can be excluded from consideration by setting the variable fignore to the list of suffixes to be ignored. Thus, if fignore is set by the command

    % set fignore = (.o .out)

then typing

    % vi x<escape>

would result in the completion to

    % vi xmpl.c

ignoring the files 'xmpl.o' and 'xmpl.out'. However, if the only completion possible requires not ignoring these suffixes, then they are not ignored. In addition, fignore does not affect the listing of file names by control-D. All files are listed regardless of their suffixes.

History Substitutions

History substitutions place words from previous command input as portions of new commands, making it easy to repeat commands, repeat arguments of a previous command in the current command, or fix spelling mistakes in the previous command with little typing and a high degree of confidence. History substitutions begin with the character ! and may begin anywhere in the input stream (provided that they do not nest). This ! may be preceded by a \ to prevent its special meaning; for convenience, a ! character is passed unchanged when it is followed by a blank, tab, newline, = or (. History substitutions also occur when an input line begins with ^. This special abbreviation is described below. Any input line that contains history substitution is echoed on the terminal before it is executed as it would have been typed without history substitution.

Commands input from the terminal that consist of one or more words are saved on the history list. The history substitutions reintroduce sequences of words from these saved commands into the input stream. The size of the history list is controlled by the history variable; the previous command is always retained, regardless of the value of the history variable. Commands are numbered sequentially from 1.

Consider the following output from the history command:

     9  write michael
    10  ex write.c
    11  cat oldwrite.c
    12  diff *write.c

The commands are shown with their event numbers. It is not usually necessary to use event numbers, but the current event number can be made part of the prompt by placing a ! in the prompt string.

With the current event 13 we can refer to previous events by event number !11, relatively as in !-2 (referring to the same event), by a prefix of a command word as in !d for event 12 or !wri for event 9, or by a string contained in a word in the command as in !?mic? also referring to event 9. These forms, without further change, reintroduce the words of the specified events, each separated by a single blank. As a special case, !! refers to the previous command; thus !! alone is a re-do.
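The event references in the paragraph above can be modeled with a short Python sketch. The function `resolve_event` is hypothetical, and the history list used in the usage note is an assumed one, chosen only to be consistent with the event numbers the paragraph mentions. Where several events match a prefix or search string, the sketch picks the most recent one.

```python
def resolve_event(ref, history, current):
    """Return the event number selected by a reference such as !11, !-2, !d or !?mic?.

    history maps event numbers to command text; current is the current event number.
    """
    body = ref[1:]  # drop the leading '!'
    if body == "!":
        return current - 1  # !! is the previous command
    if body.lstrip("-").isdigit():
        n = int(body)
        # negative numbers are relative to the current event
        return current + n if n < 0 else n
    if body.startswith("?"):
        needle = body[1:].rstrip("?")  # the trailing ? is optional
        candidates = [n for n, cmd in history.items() if needle in cmd]
    else:
        candidates = [n for n, cmd in history.items() if cmd.startswith(body)]
    if not candidates:
        raise LookupError("event not found: " + ref)
    return max(candidates)  # assumption: the most recent matching event wins
```

Given the assumed history `{9: "write michael", 10: "ex write.c", 11: "cat oldwrite.c", 12: "diff *write.c"}` and current event 13, `!11` and `!-2` both select event 11, `!d` selects 12, and `!wri` and `!?mic?` both select 9.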

To select words from an event we can follow the event specification by a : and a designator for the desired words. The words of an input line are numbered from 0, the first (usually command) word being 0, the second word (first argument) being 1, etc. The basic word designators are:

0      first (command) word
n      nth argument
^      first argument; i.e., ‘1’
$      last argument
%      word matched by (immediately preceding) ?s? search
x-y    range of words
-y     abbreviates ‘0-y’
*      abbreviates ‘^-$’, or nothing if only 1 word in event
x*     abbreviates ‘x-$’
x-     like ‘x*’ but omitting word ‘$’
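The designators listed above can be tried out with a small Python model. It is a sketch under simplifying assumptions: `select_words` is a made-up name, the event is split on whitespace only, and the % designator is not handled.

```python
def select_words(event, designator):
    """Return the words of an event chosen by a word designator."""
    words = event.split()
    last = len(words) - 1

    def index(token):
        # ^ is word 1, $ is the last word, anything else is a plain number
        if token == "^":
            return 1
        if token == "$":
            return last
        return int(token)

    if designator == "*":
        return words[1:]  # all arguments, or nothing if only the command word
    if designator.endswith("*"):
        return words[index(designator[:-1]):]  # x* abbreviates x-$
    if "-" in designator:
        low, high = designator.split("-", 1)
        start = index(low) if low else 0        # -y abbreviates 0-y
        end = index(high) if high else last - 1  # x- omits the last word
        return words[start:end + 1]
    return [words[index(designator)]]
```

For the event `cc -o prog main.c util.c`, the designator `2-3` selects `prog main.c`, `2*` selects `prog main.c util.c`, and `2-` selects `prog main.c` again because it drops the last word.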

The : separating the event specification from the word designator can be omitted if the argument selector begins with a ^, $, *, -, or %. After the optional word designator, a sequence of modifiers can be placed, each preceded by a :. The following modifiers are defined:

h        Remove a trailing pathname component, leaving the head.
r        Remove a trailing ‘.xxx’ component, leaving the root name.
e        Remove all but the extension ‘.xxx’ part.
s/l/r/   Replace l with r.
t        Remove all leading pathname components, leaving the tail.
&        Repeat the previous substitution.
g        Apply the change once on each word, prefixing the above; e.g., ‘g&’.
a        Apply the change as many times as possible on a single word, prefixing the above. It can be used together with ‘g’ to apply a substitution globally.
p        Print the new command line but do not execute it.
q        Quote the substituted words, preventing further substitutions.
x        Like ‘q’, but break into words at blanks, tabs, and newlines.

Unless preceded by a g the change is applied only to the first modifiable word. With substitutions, it is an error for no word to be applicable.
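The four pathname modifiers are easy to confuse, so here is a hedged Python sketch of what h, t, r, and e do to a single word. The function `apply_modifier` is hypothetical. It assumes the e modifier yields the extension without its leading dot, and it ignores edge cases such as a path whose only slash is the leading one.

```python
def apply_modifier(word, mod):
    """Apply one of the h, t, r or e modifiers to a single word."""
    head, slash, tail = word.rpartition("/")
    root, dot, ext = tail.rpartition(".")
    if mod == "h":
        return head if slash else word  # drop the trailing pathname component
    if mod == "t":
        return tail  # drop all leading pathname components
    if mod == "r":
        return head + slash + root if dot else word  # drop a trailing .xxx
    if mod == "e":
        return ext if dot else ""  # assumption: extension returned without the dot
    raise ValueError("modifier not modeled: " + mod)
```

Applied to `/usr/src/ls.c`, the modifiers give `/usr/src` for h, `ls.c` for t, `/usr/src/ls` for r, and `c` for e.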

The left side of a substitution is not a regular expression in the sense of the editors, but instead a string. Any character may be used as the delimiter in place of /; a \ quotes the delimiter into the l and r strings. The character & in the right side is replaced by the text from the left. A \ also quotes &. A NULL l (//) uses the previous string either from an l or from a contextual scan string s in !?s?. The trailing delimiter in the substitution may be omitted if a newline follows immediately as may the trailing ? in a contextual scan.

A history reference may be given without an event specification; e.g., !$. Here, the reference is to the previous command unless a previous history reference occurred on the same line in which case this form repeats the previous reference. Thus '!?foo?^ !$' gives the first and last arguments from the command matching '?foo?'.

A special abbreviation of a history reference occurs when the first non-blank character of an input line is a ^. This is equivalent to '!:s^' providing a convenient shorthand for substitutions on the text of the previous line. Thus ^lb^lib fixes the spelling of 'lib' in the previous command. Finally, a history substitution may be surrounded with { and } if necessary to insulate it from the characters that follow. Thus, after ls -ld ~paul we might do !{l}a to do ls -ld ~paula, while !la would look for a command starting with 'la'.

Quotations with ' and "

The quotation of strings by ' and " can be used to prevent all or some of the remaining substitutions. Strings enclosed in ' are prevented from any further interpretation. Strings enclosed in " may be expanded as described below.

In both cases the resulting text becomes (all or part of) a single word; only in one special case (see Command Substitution, below) does a " quoted string yield parts of more than one word; ' quoted strings never do.

Alias substitution

The shell maintains a list of aliases that can be established, displayed and modified by the alias and unalias commands. After a command line is scanned, it is parsed into distinct commands and the first word of each command, left-to-right, is checked to see if it has an alias. If it does, then the text that is the alias for that command is reread with the history mechanism available as though that command were the previous input line. The resulting words replace the command and argument list. If no reference is made to the history list, then the argument list is left unchanged.

Thus if the alias for 'ls' is 'ls -l', the command ls /usr would map to ls -l /usr, the argument list here being undisturbed. Similarly, if the alias for 'lookup' was 'grep !^ /etc/passwd' then lookup bill would map to grep bill /etc/passwd.

If an alias is found, the word transformation of the input text is performed and the aliasing process begins again on the reformed input line. Looping is prevented if the first word of the new text is the same as the old by flagging it to prevent further aliasing. Other loops are detected and cause an error.

Note that the mechanism allows aliases to introduce parser 'metasyntax.' Thus, we can alias print 'pr \!* | lpr' to make a command that pr's its arguments to the line printer.
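The alias examples in this section can be reproduced with a simplified Python model. It is a sketch only: `expand_alias` is a made-up name, only the !^ and !* history references are recognized, and the re-aliasing loop and loop detection described above are not modeled.

```python
def expand_alias(command, aliases):
    """Expand the first word of a command through a table of aliases (single pass)."""
    words = command.split()
    template = aliases.get(words[0])
    if template is None:
        return command  # first word has no alias
    args = words[1:]
    if "!" not in template:
        # no history reference: the argument list is left unchanged
        return " ".join([template] + args)
    expanded = template.replace("!*", " ".join(args))  # all arguments
    expanded = expanded.replace("!^", args[0] if args else "")  # first argument
    return expanded
```

With the aliases from the text, `ls /usr` becomes `ls -l /usr`, `lookup bill` becomes `grep bill /etc/passwd`, and `print a b` becomes `pr a b | lpr`.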

Variable substitution

The shell maintains a set of variables, each of which has as value a list of zero or more words. Some of these variables are set by the shell or referred to by it. For instance, the argv variable holds the shell's argument list, and words of this variable's value are referred to in special ways.

The values of variables may be displayed and changed using the set and unset commands. Of the variables referred to by the shell a number are toggles; the shell does not care what their value is, only whether they are set or not. For instance, the verbose variable is a toggle that causes command input to be echoed. The setting of this variable results from the -v command-line option.

Other operations treat variables numerically. The @ command permits numeric calculations to be performed and the result assigned to a variable. Variable values are, however, always represented as (zero or more) strings. For the purposes of numeric operations, the null string is considered to be zero, and the second and additional words of multiword values are ignored.

After the input line is aliased and parsed, and before each command is executed, variable substitution is performed, keyed by $ characters. This expansion can be prevented by preceding the $ with a \ except within double quotes ("), where substitution always occurs, and within single quotes ('), where it never occurs. Strings quoted by backticks (` `) are interpreted later (see Command Substitution, below), so $ substitution does not occur there until later, if at all. A $ is passed unchanged if followed by a blank, tab, or end-of-line.

Input/output redirections are recognized before variable expansion, and are variable expanded separately. Otherwise, the command name and entire argument list are expanded together. It is thus possible for the first (command) word (to this point) to generate more than one word, the first of which becomes the command name, and the rest of which become arguments.

Unless enclosed in " or given the :q modifier, the results of variable substitution may eventually be command and filename substituted. Within ", a variable whose value consists of multiple words expands to (a portion of) a single word, with the words of the variable's value separated by blanks. When the :q modifier is applied to a substitution the variable will expand to multiple words with each word separated by a blank and quoted to prevent later command or filename substitution.

The following metasequences are provided for introducing variable values into the shell input. Except as noted, it is an error to reference a variable that is not set.

$name
${name}
    Are replaced by the words of the value of variable name, each separated by a blank. Braces insulate name from following characters that would otherwise be part of it. Shell variables have names consisting of up to 20 letters and digits starting with a letter. The underscore character is considered a letter. If name is not a shell variable, but is set in the environment, then that value is returned (but : modifiers and the other forms given below are not available here).
$name[selector]
${name[selector]}
    May be used to select only some of the words from the value of name. The selector is subjected to $ substitution and may consist of a single number or two numbers separated by a -. The first word of a variable's value is numbered 1. If the first number of a range is omitted it defaults to 1. If the last number of a range is omitted it defaults to $#name. The selector * selects all words. It is not an error for a range to be empty if the second argument is omitted or in range.
$#name
${#name}
    Gives the number of words in the variable. This is useful for later use in a '$argv[selector]'.
$0
    Substitutes the name of the file from which command input is being read. An error occurs if the name is not known.
$number
${number}
    Equivalent to '$argv[number]'.
$*
    Equivalent to '$argv[*]'.
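The selector rules for $name[selector] can be sketched in Python as follows. The function `select` is hypothetical and models only the numbering and defaulting rules stated above; the $ substitution applied to the selector itself is not modeled.

```python
def select(value, selector):
    """Pick words from a variable's value (a list) using a csh-style selector."""
    count = len(value)  # this is what $#name would give
    if selector == "*":
        return list(value)
    if "-" in selector:
        low, high = selector.split("-", 1)
        start = int(low) if low else 1       # omitted first number defaults to 1
        end = int(high) if high else count   # omitted last number defaults to $#name
        if high and end > count:
            raise IndexError("subscript out of range")
    else:
        start = end = int(selector)
        if not 1 <= start <= count:
            raise IndexError("subscript out of range")
    # words are numbered from 1, so shift to Python's 0-based slice
    return value[start - 1:end]
```

For a four-word value such as `["a", "b", "c", "d"]`, the selector `2-3` gives `b c`, `-2` gives `a b`, `3-` gives `c d`, and `5-` gives an empty list rather than an error.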

The modifiers :e, :h, :t, :r, :q, and :x may be applied to the substitutions above as may :gh, :gt, and :gr. If braces {} appear in the command form then the modifiers must appear within the braces. The current implementation allows only one : modifier on each $ expansion.

The following substitutions may not be modified with : modifiers.

$?name
${?name}
    Substitutes the string '1' if name is set, '0' if it is not.
$?0
    Substitutes 1 if the current input filename is known, 0 if it is not.
$$
    Substitute the (decimal) process number of the (parent) shell. Do NOT use this mechanism for generating temporary file names; see mktemp instead.
$!
    Substitute the (decimal) process number of the last background process started by this shell.
$<
    Substitutes a line from the standard input, with no further interpretation. It can be used to read from the keyboard in a shell script.

Command and filename substitution

The remaining substitutions, command and filename substitution, are applied selectively to the arguments of built-in commands. Here, 'selectively' means that portions of expressions that are not evaluated are not subjected to these expansions. For commands that are not internal to the shell, the command name is substituted separately from the argument list. This occurs very late, after input-output redirection is performed, and in a child of the main shell.

Command substitution

Command substitution is shown by a command enclosed in `. The output from such a command is normally broken into separate words at blanks, tabs, and newlines, with null words being discarded; this text then replaces the original string. Within double quotes ("), only newlines force new words; blanks and tabs are preserved.

In any case, the single final newline does not force a new word. Note that it is therefore possible for a command substitution to yield only part of a word, even if the command outputs a complete line.

Filename substitution

If a word contains any of the characters *, ?, [, or {, or begins with the character ~, then that word is a candidate for filename substitution, also known as 'globbing'. This word is then regarded as a pattern, and replaced with an alphabetically sorted list of file names that match the pattern. In a list of words specifying filename substitution it is an error for no pattern to match an existing file name, but it is not required for each pattern to match. Only the metacharacters *, ?, and [ imply pattern matching, the characters ~ and { being more akin to abbreviations.

In matching filenames, the character . at the beginning of a filename or immediately following a /, as well as the character / must be matched explicitly. The character * matches any string of characters, including the null string. The character ? matches any single character.
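The leading-dot rule is the part of pattern matching that most often surprises people, so here is a small Python sketch of it. It leans on the standard `fnmatch` module for *, ?, and [...] matching; the wrapper `glob_match` is a made-up name, works on a flat list of names in one directory, and does not model the / rule or character classes.

```python
import fnmatch


def glob_match(pattern, names):
    """Return the sorted names matching pattern, requiring an explicit leading dot."""
    hits = []
    for name in names:
        if name.startswith(".") and not pattern.startswith("."):
            continue  # a leading . must be matched explicitly
        if fnmatch.fnmatchcase(name, pattern):
            hits.append(name)
    return sorted(hits)  # results come back alphabetically sorted
```

With the names `.cshrc`, `mbox`, `memo`, and `box`, the pattern `*` matches everything except `.cshrc`, while `.*` matches only `.cshrc`.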

The sequence '[...]' matches any one of the characters enclosed. Within '[...]', a pair of characters separated by - matches any character lexically between the two (inclusive). Within '[...]', the name of a character class enclosed in [: and :] stands for the list of all characters belonging to that class. Supported character classes:

  • alnum
  • alpha
  • blank
  • cntrl
  • digit
  • graph
  • lower
  • print
  • space
  • upper
  • punct
  • xdigit

These match characters using the macros specified in ctype. A character class may not be used as an endpoint of a range.

The character ~ at the beginning of a filename refers to home directories. Standing alone, i.e., ~, it expands to the invoker's home directory as reflected in the value of the variable home. When followed by a name consisting of letters, digits, and - characters, the shell searches for a user with that name and substitutes their home directory; thus '~ken' might expand to '/usr/ken' and '~ken/chmach' to '/usr/ken/chmach'. If the character ~ is followed by a character other than a letter or /, or does not appear at the beginning of a word, it is left undisturbed.

The metanotation 'a{b,c,d}e' is a shorthand for 'abe ace ade'. Left to right order is preserved, with results of matches being sorted separately at a low level to preserve this order. This construct can be nested. Thus, '~source/s1/{oldls,ls}.c' expands to '/usr/source/s1/oldls.c /usr/source/s1/ls.c' without chance of error if the home directory for 'source' is '/usr/source'. Similarly '../{memo,*box}' might expand to '../memo ../box ../mbox'. Note that 'memo' was not sorted with the results of the match to '*box'. As a special case {, }, and {} are passed undisturbed.
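The brace shorthand can be modeled with a short recursive function. This Python sketch, with the hypothetical name `expand_braces`, covers only the textual expansion and its left-to-right ordering. It does not perform the later filename matching or ~ expansion, and it treats only a whole-word {} as the undisturbed special case.

```python
def expand_braces(word):
    """Expand a{b,c,d}e style shorthand, preserving left-to-right order."""
    if word == "{}":
        return [word]  # special case: passed undisturbed
    start = word.find("{")
    if start == -1:
        return [word]
    # find the brace that closes the first opening brace
    depth = 0
    for i in range(start, len(word)):
        if word[i] == "{":
            depth += 1
        elif word[i] == "}":
            depth -= 1
            if depth == 0:
                end = i
                break
    else:
        return [word]  # unmatched brace: leave the word alone
    # split the inside on commas that are not nested in inner braces
    parts, current, depth = [], "", 0
    for ch in word[start + 1:end]:
        if ch == "," and depth == 0:
            parts.append(current)
            current = ""
        else:
            if ch == "{":
                depth += 1
            elif ch == "}":
                depth -= 1
            current += ch
    parts.append(current)
    prefix, suffix = word[:start], word[end + 1:]
    results = []
    for part in parts:
        # recursion handles nested braces and any braces later in the word
        for tail in expand_braces(part + suffix):
            results.append(prefix + tail)
    return results
```

`expand_braces("a{b,c,d}e")` yields `abe`, `ace`, `ade` in that order, matching the example in the text.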

Input/output

The standard input and the standard output of a command may be redirected with the following syntax:

<name
    Open file name (which is first variable, command, and filename expanded) as the standard input.
<<word
    Read the shell input up to a line that is identical to word. The word is not subjected to variable, command, or filename substitution, and each input line is compared to word before any substitutions are done on the input line. Unless a quoting \, ", ' or ` appears in word, variable and command substitution is performed on the intervening lines, allowing \ to quote $, \ and `. Commands that are substituted have all blanks, tabs, and newlines preserved, except for the final newline that is dropped. The resultant text is placed in an anonymous temporary file that is given to the command as its standard input.
>name
>!name
>&name
>&!name
    The file name is used as the standard output. If the file does not exist then it is created; if the file exists, it is truncated; its previous contents are lost.
    If the variable noclobber is set, then the file must not exist or be a character special file (e.g., a terminal or /dev/null) or an error results. This helps prevent accidental destruction of files. Here, the ! forms can be used to suppress this check.
    The forms involving & route the standard error output into the specified file as well as the standard output. The name is expanded in the same way as < input filenames are.
>>name
>>&name
>>!name
>>&!name
    Uses file name as the standard output; like > but places output at the end of the file. If the variable noclobber is set, then it is an error for the file not to exist unless one of the ! forms is given. Otherwise, similar to >.
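The noclobber checks differ between the truncating and appending forms, which the following Python sketch tries to make explicit. The function `redirect_allowed` and its parameters are hypothetical; it encodes only the rules stated in the list above and ignores the & variants, which follow the same checks.

```python
def redirect_allowed(op, noclobber, exists, char_special=False):
    """Decide whether an output redirection passes the noclobber check.

    op is one of ">", ">!", ">>" or ">>!".
    """
    force = op.endswith("!")
    if not noclobber or force:
        return True  # no protection requested, or the ! form suppresses the check
    if op.startswith(">>"):
        return exists  # appending: it is an error for the file not to exist
    # truncating: the file must not exist, unless it is a character special file
    return (not exists) or char_special
```

For example, with noclobber set, `>` onto an existing regular file is refused but `>!` is allowed, and `>>` onto a missing file is refused but `>>!` is allowed.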

A command receives the environment in which the shell was invoked as modified by the input-output parameters and the presence of the command in a pipeline. Thus, unlike some previous shells, commands run from a file of shell commands have no access to the text of the commands by default; instead they receive the original standard input of the shell. The << mechanism should be used to present inline data. This permits shell command scripts to function as components of pipelines and allows the shell to block read its input. Note that the default standard input for a command run detached is not modified to be the empty file /dev/null; instead the standard input remains as the original standard input of the shell. If this is a terminal and if the process attempts to read from the terminal, then the process will block and the user will be notified (see Jobs above).

The standard error output may be directed through a pipe with the standard output. Use the form |& instead of just |.

Expressions

Several of the built-in commands (to be described later) take expressions, in which the operators are similar to those of C, with the same precedence, but with the opposite grouping: right to left. These expressions appear in the @, exit, if, and while commands. The following operators are available:

  • ||
  • &&
  • |
  • ^
  • &
  • ==
  • !=
  • =~
  • !~
  • <=
  • >=
  • <
  • >
  • <<
  • >>
  • +
  • -
  • *
  • /
  • %
  • !
  • ~
  • (
  • )

Here the precedence increases down the list, with == != =~ and !~, <= >= < and >, << and >>, + and -, * / and % being, in groups, at the same level. The ==, !=, =~, and !~ operators compare their arguments as strings; all others operate on numbers. The operators =~ and !~ are like == and != except that the right hand side is a pattern (containing, e.g., *'s, ?'s, and instances of '[...]') against which the left operand is matched. This reduces the need for use of the switch statement in shell scripts when all that is really needed is pattern matching.

Strings that begin with 0 are considered octal numbers. Null or missing arguments are considered 0. The results of all expressions are strings, which represent decimal numbers. It is important to note that no two components of an expression can appear in the same word; except when adjacent to components of expressions that are syntactically significant to the parser (&, |, <, >, (, and )), they should be surrounded by spaces.
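The octal rule is easy to trip over, so here is a small Python sketch of how an operand would be read under the rules just stated. The function name csh_number is hypothetical, and the sketch only models the two rules in the paragraph above (leading 0 means octal, null means 0); it is not taken from the shell's source.

```python
def csh_number(word: str) -> int:
    """Interpret an expression operand following the rules described above."""
    if word == "":
        return 0  # null or missing arguments are considered 0
    sign = -1 if word.startswith("-") else 1
    digits = word.lstrip("+-")
    # A leading 0 selects octal; int() raises ValueError for the digits 8 and 9.
    base = 8 if digits.startswith("0") and len(digits) > 1 else 10
    return sign * int(digits, base)


print(csh_number("010"))  # octal 10, i.e. decimal 8
print(csh_number("10"))   # plain decimal 10
print(csh_number(""))     # null operand, 0
```

The practical consequence is that zero-padded values such as 08 or 09 are not valid numbers under this rule.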

Also available in expressions as primitive operands are command executions enclosed in { and } and file enquiries of the form -l name where l is one of:

r    read access
w    write access
x    execute access
e    existence
o    ownership
z    zero size
f    plain file
d    directory

The specified name is command and filename expanded and then tested to see if it has the specified relationship to the real user. If the file does not exist or is inaccessible then all enquiries return false, i.e., 0. Command executions succeed, returning true, i.e., 1, if the command exits with status 0, otherwise they fail, returning false, i.e., 0. If more detailed status information is required then the command should be executed outside an expression and the variable status examined.
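For readers who want to see what each enquiry letter checks, the following Python sketch maps them onto standard library calls. It is an illustration under assumptions, not the shell's implementation: the function name file_enquiry is hypothetical, os.access is used because it tests against the real user and group IDs, and os.getuid makes the sketch Unix-only.

```python
import os
import stat
import tempfile


def file_enquiry(flag: str, name: str) -> int:
    """Return 1 or 0 for an enquiry of the form '-<flag> name'."""
    try:
        info = os.stat(name)
    except OSError:
        return 0  # missing or inaccessible: every enquiry is false
    checks = {
        "r": lambda: os.access(name, os.R_OK),
        "w": lambda: os.access(name, os.W_OK),
        "x": lambda: os.access(name, os.X_OK),
        "e": lambda: True,
        "o": lambda: info.st_uid == os.getuid(),
        "z": lambda: info.st_size == 0,
        "f": lambda: stat.S_ISREG(info.st_mode),
        "d": lambda: stat.S_ISDIR(info.st_mode),
    }
    return int(checks[flag]())


with tempfile.TemporaryDirectory() as tmp:
    empty = os.path.join(tmp, "empty.txt")
    open(empty, "w").close()
    results = {flag: file_enquiry(flag, empty) for flag in "ezfd"}
    missing = file_enquiry("e", os.path.join(tmp, "nope"))

print(results)
print(missing)
```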

Control flow

The shell contains several commands that can be used to regulate the flow of control in command files (shell scripts) and (in limited but useful ways) from terminal input. These commands all operate by forcing the shell to reread or skip in its input and, because of the implementation, restrict the placement of some of the commands.

The foreach, switch, and while statements, as well as the if-then-else form of the if statement require that the major keywords appear in a single simple command on an input line as shown below.

If the shell's input is not seekable, the shell buffers up input whenever a loop is being read and performs seeks in this internal buffer to accomplish the rereading implied by the loop. To the extent that this allows, backward goto's will succeed on non-seekable inputs.

Built-In commands

Built-in commands are executed within the shell. If a built-in command occurs as any component of a pipeline except the last then it is executed in a sub-shell.

alias
alias name
alias name wordlist
The first form prints all aliases. The second form prints the alias for name. The final form assigns the specified wordlist as the alias of name; wordlist is command and filename substituted. The name is not allowed to be 'alias' or 'unalias'.
alloc
Shows the amount of dynamic memory acquired, broken down into used and free memory. With an argument shows the number of free and used blocks in each size category. The categories start at size 8 and double at each step. This command's output may vary across system types, since systems other than the VAX may use a different memory allocator.
bg
bg %job ...
Puts the current or specified jobs into the background, continuing them if they were stopped.
break
Causes execution to resume after the end of the nearest enclosing foreach or while. The remaining commands on the current line are executed. Multi-level breaks are thus possible by writing them all on one line.
breaksw
Causes a break from a switch, resuming after the endsw.
case label:
A label in a switch statement as discussed below.
cd
cd name
chdir
chdir name
Change the shell's working directory to directory name. If no argument is given then change to the home directory of the user. If name is not found as a subdirectory of the current directory (and does not begin with /, ./ or ../), then each component of the variable cdpath is checked to see if it has a subdirectory name. Finally, if all else fails but name is a shell variable whose value begins with /, then this is tried to see if it is a directory.
continue
Continue execution of the nearest enclosing while or foreach. The rest of the commands on the current line are executed.
default:
Labels the default case in a switch statement. The default should come after all case labels.
dirs
Prints the directory stack; the top of the stack is at the left, the first directory in the stack being the current directory.
echo wordlist
echo -n wordlist
The specified words are written to the shell's standard output, separated by spaces, and terminated with a newline unless the -n option is specified.
else
end
endif
endsw
See the description of the foreach, if, switch, and while statements below.
eval arg ...
Mimics eval in sh: the arguments are read as input to the shell and the resulting command(s) executed in the context of the current shell. This is usually used to execute commands generated as the result of command or variable substitution, since parsing occurs before these substitutions. See tset's manual for an example of using eval.
exec command
The specified command is executed in place of the current shell.
exit
exit(expr)
The shell exits either with the value of the status variable (first form) or with the value of the specified expr (second form).
fg
fg %job ...
Brings the current or specified jobs into the foreground, continuing them if they were stopped.
foreach name (wordlist)
...
end
The variable name is successively set to each member of wordlist and the sequence of commands between this command and the matching end are executed. Both foreach and end must appear alone on separate lines. The built-in command continue may be used to continue the loop prematurely and the built-in command break to terminate it prematurely. When this command is read from the terminal, the loop is read once prompting with ? before any statements in the loop are executed. If you make a mistake typing in a loop at the terminal you can rub it out.
glob wordlist
Like echo but no escapes are recognized and words are delimited by NUL characters in the output. Useful for programs that wish to use the shell to filename expand a list of words.
goto word
The specified word is filename and command expanded to yield a string of the form label. The shell rewinds its input as much as possible and searches for a line of the form 'label:', possibly preceded by blanks or tabs. Execution continues after the specified line.
hashstat
Print a statistics line showing how effective the internal hashtable has been at locating commands (and avoiding exec's). An exec is attempted for each component of the path where the hash function indicates a possible hit, and in each component that does not begin with a /.
history
history n
history -h n
history -r n
Displays the history event list; if n is given, only the n most recent events are printed. The -h option causes the history list to be printed without leading numbers. This format produces files suitable for sourcing using the -h option to source. The -r option reverses the order of printout to be most recent first instead of oldest first.
if (expr) command
If the specified expression evaluates to true, then the single command with arguments is executed. Variable substitution on command happens early, at the same time it does for the rest of the if command. command must be a simple command, not a pipeline, a command list, or a parenthesized command list. Input/output redirection occurs even if expr is false, i.e., when command is not executed (this is actually a bug).
if (expr) then
...
else if (expr2) then
...
else
...
endif
If the specified expr is true then the commands up to the first else are executed; otherwise if expr2 is true then the commands up to the second else are executed, etc. Any number of else-if pairs are possible; only one endif is needed. The else part is likewise optional. The words else and endif must appear at the beginning of input lines; the if must appear alone on its input line or after an else.
jobs
jobs -l
Lists the active jobs; the -l option lists process IDs in addition to the normal information.
kill %job
kill [-s signal_name] pid
kill -sig pid ...
kill -l [exit_status]
Sends either the SIGTERM (terminate) signal or the specified signal to the specified jobs or processes. Signals are either given by number or by names (as given in <signal.h>, stripped of the prefix 'SIG'). The signal names are listed by 'kill -l'; if an exit_status is specified, only the corresponding signal name will be written. There is no default; just saying 'kill' does not send a signal to the current job. If the signal being sent is SIGTERM (terminate) or SIGHUP (hangup), then the job or process will be sent a SIGCONT (continue) signal as well.
limit
limit resource
limit resource maximum-use
limit -h
limit -h resource
limit -h resource maximum-use
Limits the consumption by the current process and each process it creates to not individually exceed maximum-use on the specified resource. If no maximum-use is given, then the current limit is printed; if no resource is given, then all limitations are given. If the -h flag is given, the hard limits are used instead of the current limits. The hard limits impose a ceiling on the values of the current limits. Only the superuser may raise the hard limits, but a user may lower or raise the current limits within the legal range.
Resources controllable currently include:
cputime: the maximum number of CPU-seconds to be used by each process.
filesize: the largest single file (in bytes) that can be created.
datasize: the maximum growth of the data+stack region via sbrk beyond the end of the program text.
stacksize: the maximum size of the automatically-extended stack region.
coredumpsize: the size of the largest core dump (in bytes) that will be created.
memoryuse: the maximum size (in bytes) to which a process's resident set size (RSS) may grow.
memorylocked: the maximum size (in bytes) which a process may lock into memory using the mlock function.
maxproc: the maximum number of simultaneous processes for this user ID.
openfiles: the maximum number of simultaneous open files for this user ID.
vmemoryuse: the maximum size (in bytes) to which a process's total size may grow.
The maximum-use may be given as a (floating point or integer) number followed by a scale factor. For all limits other than cputime the default scale is k or 'kilobytes' (1024 bytes); a scale factor of m or 'megabytes' may also be used. For cputime the default scale is 'seconds'; a scale factor of m for minutes or h for hours, or a time of the form 'mm:ss' giving minutes and seconds also may be used.
For both resource names and scale factors, unambiguous prefixes of the names suffice.
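The scale-factor rules for maximum-use can be made concrete with a short Python sketch. It is a simplified model under stated assumptions: the function name parse_limit is hypothetical, only single-letter scale factors are handled (the prefix matching of longer names such as 'megabytes' is left out), and results are returned in seconds for cputime and in bytes for everything else.

```python
def parse_limit(resource: str, value: str) -> int:
    """Convert a maximum-use string to seconds (cputime) or bytes (others)."""
    if resource == "cputime":
        if ":" in value:
            # 'mm:ss' form: minutes and seconds.
            minutes, seconds = value.split(":")
            return int(minutes) * 60 + int(seconds)
        scales, default = {"m": 60, "h": 3600}, 1
    else:
        scales, default = {"k": 1024, "m": 1024 * 1024}, 1024
    if value[-1].isalpha():
        number, factor = value[:-1], scales[value[-1]]
    else:
        number, factor = value, default  # no suffix: use the default scale
    return int(float(number) * factor)


print(parse_limit("filesize", "10m"))  # 10 megabytes in bytes
print(parse_limit("datasize", "64"))   # default scale is kilobytes
print(parse_limit("cputime", "1:30"))  # one and a half minutes in seconds
print(parse_limit("cputime", "2h"))    # two hours in seconds
```

Note how the same suffix m means megabytes for size limits but minutes for cputime.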
login
Terminate a login shell, replacing it with an instance of /usr/bin/login. This is one way to log off, included for compatibility with sh.
logout
Terminate a login shell. Especially useful if ignoreeof is set.
nice
nice +number
nice command
nice +number command
The first form sets the scheduling priority for this shell to 4. The second form sets the priority to the given number. The final two forms run command at priority 4 and number respectively. The greater the number, the less CPU the process gets. The superuser may specify negative priority using 'nice -number ...'. command is always executed in a sub-shell, and the restrictions placed on commands in simple if statements apply.
nohup
nohup command
The first form can be used in shell scripts to cause hangups to be ignored for the remainder of the script. The second form causes the specified command to be run with hangups ignored. All processes detached with & are effectively nohup'ed.
notify
notify %job ...
Causes the shell to notify the user asynchronously when the status of the current or specified jobs change; normally notification is presented before a prompt. This is automatic if the shell variable notify is set.
onintr
onintr -
onintr label
Control the action of the shell on interrupts. The first form restores the default action of the shell on interrupts, which is to terminate shell scripts or to return to the terminal command input level. The second form onintr - causes all interrupts to be ignored. The final form causes the shell to execute a goto label when an interrupt is received or a child process terminates because it was interrupted.
In any case, if the shell is running detached and interrupts are being ignored, all forms of onintr have no meaning and interrupts continue to be ignored by the shell and all invoked commands. Finally, onintr statements are ignored in the system startup files where interrupts are disabled (/etc/csh.cshrc, /etc/csh.login).
popd
popd +n
Pops the directory stack, returning to the new top directory. With an argument '+n' discards the nth entry in the stack. The members of the directory stack are numbered from the top starting at 0.
pushd
pushd name
pushd +n
With no arguments, pushd exchanges the top two elements of the directory stack. Given a name argument, pushd changes to the new directory (ala cd) and pushes the old current working directory (as in pwd) onto the directory stack. With a numeric argument, pushd rotates the nth argument of the directory stack around to be the top element and changes to it. The members of the directory stack are numbered from the top starting at 0.
rehash
Causes the internal hash table of the contents of the directories in the path variable to be recomputed. This is needed if new commands are added to directories in the path while you are logged in. This should only be necessary if you add commands to one of your directories, or if a systems programmer changes the contents of a system directory.
repeat count command
The specified command, which is subject to the same restrictions as the command in the one line if statement above, is executed count times. I/O redirections occur exactly once, even if count is 0.
set
set name
set name=word
set name[index]=word
set name=(wordlist)

The first form of the command shows the value of all shell variables. Variables that have other than a single word as their value print as a parenthesized word list. The second form sets name to the null string. The third form sets name to the single word. The fourth form sets the indexth component of name to word; this component must already exist. The final form sets name to the list of words in wordlist. The value is always command and filename expanded.

These arguments may be repeated to set multiple values in a single set command. Note however, that variable expansion happens for all arguments before any setting occurs.

setenv
setenv name
setenv name value
The first form lists all current environment variables. It is equivalent to printenv. The last form sets the value of environment variable name to be value, a single string. The second form sets name to an empty string. The most commonly used environment variables USER, TERM, and PATH are automatically imported to and exported from the csh variables user, term, and path; there is no need to use setenv for these.
shift
shift variable
The members of argv are shifted to the left, discarding argv[1]. It is an error for argv not to be set or to have less than one word as value. The second form performs the same function on the specified variable.
source name
source -h name
The shell reads commands from name. source commands may be nested; if they are nested too deeply the shell may run out of file descriptors. An error in a source at any level terminates all nested source commands. Normally input during source commands is not placed on the history list; the -h option causes the commands to be placed on the history list without being executed.
stop
stop %job ...
Stops the current or specified jobs that are executing in the background.
suspend
Causes the shell to stop in its tracks, much as if it had been sent a stop signal with ^Z. This is most often used to stop shells started by su.
switch (string)
case str1:
...
breaksw
...
default:
...
breaksw
endsw
Each case label is successively matched against the specified string, which is first command and filename expanded. The file metacharacters *, ? and '[...]' may be used in the case labels, which are variable expanded. If none of the labels match before the 'default' label is found, then the execution begins after the default label. Each case label and the default label must appear at the beginning of a line. The command breaksw causes execution to continue after the endsw. Otherwise, control may fall through case labels and the default label as in C. If no label matches and there is no default, execution continues after the endsw.
time
time command
With no argument, a summary of time used by this shell and its children is printed. If arguments are given the specified simple command is timed and a time summary as described under the time variable is printed. If necessary, an extra shell is created to print the time statistic when the command completes.
umask
umask value
The file creation mask is displayed (first form) or set to the specified value (second form). The mask is given in octal. Common values for the mask are 002 giving all access to the group and read and execute access to others or 022 giving all access except write access for users in the group or others.
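The effect of the two common masks can be checked with a little octal arithmetic. The Python sketch below assumes the usual convention that programs request mode 666 for new files and 777 for new directories; that convention and the function name permissions_after_umask are assumptions for illustration, not something stated above.

```python
def permissions_after_umask(requested: int, mask: int) -> int:
    """Bits set in the mask are cleared from the requested mode."""
    return requested & ~mask


for mask in (0o002, 0o022):
    file_mode = permissions_after_umask(0o666, mask)  # assumed request for files
    dir_mode = permissions_after_umask(0o777, mask)   # assumed request for directories
    print(f"umask {mask:03o}: files {file_mode:03o}, directories {dir_mode:03o}")
```

With 022 the write bit is removed for group and others; with 002 only others lose write access.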
unalias pattern
All aliases whose names match the specified pattern are discarded. Thus all aliases are removed by unalias *. It is not an error for nothing to be unaliased.
unhash
Use of the internal hash table to speed location of executed programs is disabled.
unlimit
unlimit resource
unlimit -h
unlimit -h resource
Removes the limitation on resource. If no resource is specified, then all resource limitations are removed. If -h is given, the corresponding hard limits are removed. Only the superuser may do this.
unset pattern
All variables whose names match the specified pattern are removed. Thus all variables are removed by unset *; this has noticeably distasteful side-effects. It is not an error for nothing to be unset.
unsetenv pattern
Removes all variables whose names match the specified pattern from the environment. See also the setenv command above and printenv.
wait
Wait for all background jobs. If the shell is interactive, then an interrupt can disrupt the wait. After the interrupt, the shell prints names and job numbers of all jobs known to be outstanding.
which command
Displays the resolved command that will be executed by the shell.
while (expr)
...
end
While the specified expression evaluates to non-zero, the commands between the while and the matching end are evaluated. break and continue may be used to terminate or continue the loop prematurely. The while and end must appear alone on their input lines. Prompting occurs here the first time through the loop as for the foreach statement if the input is a terminal.
%job
Brings the specified job into the foreground.
%job &
Continues the specified job in the background.

@

@ name = expr

@ name[index] = expr

The first form prints the values of all the shell variables. The second form sets the specified name to the value of expr. If the expression contains <, >, & or | then at least this part of the expression must be placed within (). The third form assigns the value of expr to the indexth argument of name. Both name and its indexth component must already exist.
The operators *=, +=, etc. are available as in C. The space separating the name from the assignment operator is optional. Spaces are, however, mandatory in separating components of expr, which would otherwise be single words.
Special postfix ++ and -- operators increment and decrement name respectively; i.e., '@ i++'.
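The pushd and popd entries in the list above describe a small stack discipline that is easier to follow when modelled. The Python sketch below tracks only the stack contents and never changes directory; the class name DirStack is hypothetical, and an integer argument stands in for the '+n' form.

```python
class DirStack:
    """Directory stack with index 0 as the top (the current directory)."""

    def __init__(self, cwd: str) -> None:
        self.dirs = [cwd]

    def pushd(self, arg=None) -> None:
        if arg is None:
            # No argument: exchange the top two elements.
            self.dirs[0], self.dirs[1] = self.dirs[1], self.dirs[0]
        elif isinstance(arg, int):
            # pushd +n: rotate so that entry n becomes the top.
            self.dirs = self.dirs[arg:] + self.dirs[:arg]
        else:
            # pushd name: the new directory goes on top of the old one.
            self.dirs.insert(0, arg)

    def popd(self, n: int = 0) -> None:
        # popd drops the top entry; popd +n discards entry n instead.
        del self.dirs[n]


stack = DirStack("/home/user")
stack.pushd("/tmp")      # ['/tmp', '/home/user']
stack.pushd("/usr/lib")  # ['/usr/lib', '/tmp', '/home/user']
stack.pushd(2)           # ['/home/user', '/usr/lib', '/tmp']
stack.pushd()            # ['/usr/lib', '/home/user', '/tmp']
stack.popd(1)            # ['/usr/lib', '/tmp']
print(stack.dirs)
```

Entries are numbered from the top starting at 0, which is why plain list indexing matches the '+n' arguments directly.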

Pre-defined and environment variables

The following variables have special meaning to the shell. Of these, argv, cwd, home, path, prompt, shell and status are always set by the shell. Except for cwd and status, this setting occurs only at initialization; these variables will not then be modified unless done explicitly by the user.

The shell copies the environment variable USER into the variable user, TERM into term, and HOME into home, and copies these back into the environment whenever the normal shell variables are reset. The environment variable PATH is likewise handled; it is not necessary to worry about its setting other than in the file .cshrc as inferior csh processes will import the definition of path from the environment, and re-export it if you then change it.

argv
Set to the arguments to the shell, it is from this variable that positional parameters are substituted; i.e., '$1' is replaced by '$argv[1]', etc.
cdpath
Gives a list of alternate directories searched to find subdirectories in chdir commands.
cwd
The full pathname of the current directory.
echo
Set when the -x command-line option is given. Causes each command and its arguments to be echoed just before it is executed. For non-built-in commands all expansions occur before echoing. Built-in commands are echoed before command and filename substitution, since these substitutions are then done selectively.
filec
Enable file name completion.
histchars
Can be given a string value to change the characters used in history substitution. The first character of its value is used as the history substitution character, replacing the default character !. The second character of its value replaces the character ^ in quick substitutions.
histfile
Can be set to the pathname where history is going to be saved/restored.
history
Can be given a numeric value to control the size of the history list. Any command that has been referenced in this many events will not be discarded. Too large values of history may run the shell out of memory. The last executed command is always saved on the history list.
home
The home directory of the invoker, initialized from the environment. The filename expansion of '~' refers to this variable.
ignoreeof
If set the shell ignores end-of-file from input devices that are terminals. This prevents shells from accidentally being killed by control-Ds.
mail
The files where the shell checks for mail. This checking is done after each command completion that will result in a prompt, if a specified interval has elapsed. The shell says 'You have new mail.' if the file exists with an access time not greater than its modify time.
If the first word of the value of mail is numeric it specifies a different mail checking interval, in seconds, than the default, which is 10 minutes.
If multiple mail files are specified, then the shell says 'New mail in name' when there is mail in the file name.
noclobber
As described in the section on Input/output, restrictions are placed on output redirection to ensure that files are not accidentally destroyed, and that >> redirections refer to existing files.
noglob
If set, filename expansion is inhibited. This inhibition is most useful in shell scripts that are not dealing with filenames, or after a list of filenames has been obtained and further expansions are not desirable.
nonomatch
If set, it is not an error for a filename expansion to not match any existing files; instead the primitive pattern is returned. It is still an error for the primitive pattern to be malformed; i.e., 'echo [' still gives an error.
notify
If set, the shell notifies asynchronously of job completions; the default is to present job completions just before printing a prompt.
path
Each word of the path variable specifies a directory in which commands are to be sought for execution. A null word specifies the current directory. If there is no path variable then only full path names will execute. The usual search path is '.', '/bin', '/usr/bin', '/sbin' and '/usr/sbin', but this may vary from system to system. For the superuser the default search path is '/bin', '/usr/bin', '/sbin', and '/usr/sbin'. A shell that is given neither the -c nor the -t option will normally hash the contents of the directories in the path variable after reading .cshrc, and each time the path variable is reset. If new commands are added to these directories while the shell is active, it may be necessary to do a rehash or the commands may not be found.
prompt
The string that is printed before each command is read from an interactive terminal input. If a ! appears in the string it will be replaced by the current event number unless a preceding \ is given. Default is '%', or '#' for the superuser.
savehist
Is given a numeric value to control the number of entries of the history list that are saved in ~/.history when the user logs out. Any command that has been referenced in this many events will be saved. During start up the shell sources ~/.history into the history list enabling history to be saved across logins. Too large values of savehist will slow down the shell during start up. If savehist is just set, the shell will use the value of history.
shell
The file in which the shell resides. This variable is used in forking shells to interpret files that have execute bits set, but that are not executable by the system. See the description of Non-built-in command execution below. Initialized to the (system-dependent) home of the shell.
status
The status returned by the last command. If it terminated abnormally, then 0200 is added to the status. Built-in commands that fail return exit status 1, all other built-in commands set status to 0.
time
Controls automatic timing of commands. If set, then any command that takes more than this many CPU seconds will cause a line giving user, system, and real times, and a utilization percentage that is the ratio of user plus system times to real time to be printed when it terminates.
verbose
Set by the -v command-line option, causes the words of each command to be printed after history substitution.
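The 0200 mentioned in the status entry above is an octal constant, which is worth spelling out. The Python sketch below assumes that, for a command killed by a signal, the value to which 0200 is added is the signal number; that reading and the function name csh_status are assumptions rather than statements from the text.

```python
def csh_status(exit_code: int = 0, signal_number=None) -> int:
    """Model the value left in the status variable after a command finishes."""
    if signal_number is not None:
        # Abnormal termination: octal 0200 (decimal 128) is added.
        return 0o200 + signal_number
    return exit_code


print(csh_status(exit_code=1))      # ordinary failure
print(csh_status(signal_number=9))  # terminated by signal 9
```

Under this assumption, a status of 128 or more indicates that the command did not exit on its own.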

Non-built-in command execution

When a command to be executed is found to not be a built-in command the shell attempts to execute the command via execve. Each word in the variable path names a directory from which the shell will attempt to execute the command. If it is given neither a -c nor a -t option, the shell will hash the names in these directories into an internal table so that it will only try an exec in a directory if there is a possibility that the command resides there. This shortcut greatly speeds command location when many directories are present in the search path. If this mechanism has been turned off (via unhash), or if the shell was given a -c or -t argument, and in any case for each directory component of path that does not begin with a /, the shell concatenates with the given command name to form a path name of a file which it then attempts to execute.
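The benefit of hashing the path can be illustrated with a simplified Python model. This is not how the shell stores its table: the sketch uses an ordinary dictionary from command name to candidate directories, and the function names build_hash and locate are hypothetical. The point it shows is that only directories known to contain the name are ever tried.

```python
import os
import tempfile


def build_hash(path_dirs):
    """Map each file name to the path directories that contain it (like rehash)."""
    table = {}
    for directory in path_dirs:
        try:
            names = os.listdir(directory)
        except OSError:
            continue  # unreadable or missing path component
        for name in names:
            table.setdefault(name, []).append(directory)
    return table


def locate(command, table):
    """Try only the directories where the table says the command may reside."""
    for directory in table.get(command, []):
        candidate = os.path.join(directory, command)
        if os.access(candidate, os.X_OK) and not os.path.isdir(candidate):
            return candidate
    return None


with tempfile.TemporaryDirectory() as bindir:
    tool = os.path.join(bindir, "hello")
    with open(tool, "w") as handle:
        handle.write("#!/bin/sh\necho hello\n")
    os.chmod(tool, 0o755)
    table = build_hash([bindir])
    found_ok = locate("hello", table) == tool
    missing = locate("goodbye", table)

print(found_ok, missing)
```

A command added to a directory after build_hash has run is invisible to locate until the table is rebuilt, which mirrors the need for rehash.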

Parenthesized commands are always executed in a sub-shell. Thus

(cd ; pwd) ; pwd

prints the home directory; leaving you where you were (printing this after the home directory), while

cd ; pwd

leaves you in the home directory. Parenthesized commands are most often used to prevent chdir from affecting the current shell.

If the file has execute permissions but is not an executable binary to the system, then it is assumed to be a file containing shell commands and a new shell is spawned to read it.

If there is an alias for shell then the words of the alias will be prepended to the argument list to form the shell command. The first word of the alias should be the full path name of the shell (e.g., '$shell'). Note that this is a special, late occurring, case of alias substitution, and only allows words to be prepended to the argument list without change.

Signal Handling

The shell normally ignores SIGQUIT signals. Jobs running detached (either by & or the bg or %... & commands) are immune to signals generated from the keyboard, including hangups. Other signals have the values which the shell inherited from its parent. The shell's handling of interrupts and terminate signals in shell scripts can be controlled by onintr. Login shells catch the SIGTERM (terminate) signal; otherwise this signal is passed on to children from the state in the shell's parent. Interrupts are not allowed when a login shell is reading the file .logout.

Limitations

Word lengths: Words can be no longer than 1024 characters. The number of arguments to a command that involves filename expansion is limited to 1/6th the number of characters allowed in an argument list. Command substitutions may substitute no more characters than are allowed in an argument list. To detect looping, the shell restricts the number of alias substitutions on a single line to 20.

Files

~/.cshrc       read at beginning of execution by each shell
~/.login       read by login shell, after .cshrc at login
~/.logout      read by login shell, at logout
/bin/sh        standard shell, for shell scripts not starting with a #
/tmp/sh.*      temporary file for <<
/etc/passwd    source of home directories for '~name'

Examples

csh

Executes and runs the C Shell (if present).

See the .cshrc guide for an example of the .cshrc file and additional information about this file.

Related commands

bash — The Bourne Again shell command interpreter.
bc — A calculator.
echo — Output text.
login — Begin a session on a system.
ls — List the contents of a directory or directories.
more — Display text one screen at a time.
ps — Report the status of a process or processes.
sh — The Bourne shell command interpreter.

Regular Expression Examples is a list, roughly sorted by complexity, of regular expression examples. It also serves as a library of useful expressions to include in your own code.

For advanced examples, see Advanced Regular Expression Examples. You can also find some regular expressions on the Regular Expressions and Bag of algorithms pages.

See Also

Example Regexes to Match Common Programming Language Constructs
Extracting numbers from text strings, removing unwanted characters, comp.lang.tcl, 2002-06-23
a delightful explication by Michael Cleverly
re_syntax
URI detector for arbitrary text as a regular expression
Arts and crafts of Tcl-Tk programming
Regular Expressions
Visual Regexp
A terrific way to learn about REs.
Redet
Another tool for learning about and working with REs.
Regular Expression Debugging Tips
More tools.

Simple regexp Examples

regexp has syntax:

regexp ?switches? exp string ?matchVar? ?subMatchVar subMatchVar ...?

If matchVar is specified, its value will be only the part of the string that was matched by the exp. As an example:

If any subMatchVars are specified, their values will be the part of the string that were matched by parenthesized bits in the exp, counting open parentheses from left to right. For example:

Many times, people only care about the subMatchVars and want to ignore matchVar. They use a 'dummy' variable as a placeholder in the command for the matchVar. You will often see things like

where ${->} holds the matched part. It is a sneaky but legal Tcl variable name.

PYK 2015-10-29: As a matter of fact, every string is a legal Tcl variable name.

Splitting a String Into Words

'How do I split an arbitrary string into words?' is a frequently asked question. If you use split $string { }, then multiple spaces will produce a list with empty elements. If you try to use foreach or lindex or some other list operation, then you must be sure that the string is a well-formed list. (Braces could cause problems.) So use a regular expression like this very simple shorthand for non-space characters:

You can even split a string of text with arbitrary spaces and special characters into a list of words by using the -inline and -all switches to regexp:
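The same idea carries over to other regular expression engines. As a rough illustration in Python rather than Tcl, re.findall returns every match as a list, much as the -all and -inline switches do; the sample text is made up, and the original Tcl snippets are not reproduced here.

```python
import re

text = "  The quick   {brown fox\tjumps!  "

# Splitting on a single space keeps an empty element for every extra space.
naive = text.split(" ")

# Runs of non-space characters, the same idea as the \S+ shorthand.
words = re.findall(r"\S+", text)

# Runs of word characters only, which also drops punctuation and braces.
plain = re.findall(r"\w+", text)

print(naive)
print(words)
print(plain)
```

Because the matches are collected by the regular expression engine, an unbalanced brace in the input causes no trouble.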

Split into Words, Respecting Acronyms

from Tcl Chatroom, 2013-10-09

Floating Point Number

This expression includes options for leading +/- character, digits, decimal points, and a trailing exponent. Note the use of nearly duplicate expressions joined with the or operator | to permit the decimal point to lead or follow digits.

Expression to find if a string has any substring matching a floating point number (this was posted to comp.lang.tcl by Roland B. Roberts):

More information (http://www.regular-expressions.info/floatingpoint.html)
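As an illustration of the structure described in this section, here is a sketch in Python rather than Tcl. It is not the expression posted to comp.lang.tcl; it is one possible pattern with an optional sign, two near-duplicate alternatives so the decimal point may follow or lead the digits, and an optional exponent. The names FLOAT, is_float, and find_floats are hypothetical.

```python
import re

# Optional sign, then digits with an optional trailing point and fraction,
# or a leading point followed by digits, then an optional exponent.
FLOAT = re.compile(r"[-+]?(?:[0-9]+\.?[0-9]*|\.[0-9]+)(?:[eE][-+]?[0-9]+)?")


def is_float(text: str) -> bool:
    """True if the whole string is a floating point number."""
    return FLOAT.fullmatch(text) is not None


def find_floats(text: str) -> list:
    """Return every substring that matches a floating point number."""
    return FLOAT.findall(text)


for sample in ("3.14", "-.5e-3", "10.", "1e5", ".", "abc"):
    print(repr(sample), is_float(sample))

print(find_floats("x=1.5, y=-2e3"))
```

A lone decimal point is rejected because each alternative requires at least one digit.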

Letters

Thanks to Brent Welch for these examples, showing the difference between a traditional character matching and 'the Unicode way.'

Only letters:

Only letters, the Unicode way:
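The contrast between the two approaches can be sketched in Python as well. This is an analogy, not the Tcl code credited above: an explicit A-Z range only accepts ASCII letters, while str.isalpha consults the Unicode character database. Both function names are hypothetical.

```python
import re


def only_ascii_letters(text: str) -> bool:
    """Traditional character matching: an explicit A-Z and a-z range."""
    return re.fullmatch(r"[A-Za-z]+", text) is not None


def only_letters_unicode(text: str) -> bool:
    """The Unicode way: any character classified as a letter."""
    return text.isalpha()


for sample in ("Tcl", "naïve", "Ελλάδα", "abc123"):
    print(sample, only_ascii_letters(sample), only_letters_unicode(sample))
```

Words with accented or non-Latin letters pass only the Unicode-aware test, while strings containing digits fail both.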

Special Characters

Thanks again to Brent Welch for these two examples.

The set of Tcl special characters: ] [ $ { } \

The set of regular expression special characters: ] [ $ ^ ? + * ( ) | \

CHARACTER   DESCRIPTION
*           The sub-pattern before '*' can occur zero or more times
+           The sub-pattern before '+' can occur one or more times
?           The sub-pattern before '?' can only occur zero or one time
|           (Alternation) Matches any one sub-pattern separated by '|'s. Similar to logical 'OR'.
()          Groups a pattern
[]          Defines a set of characters, or range of characters, e.g. [a-zA-Z0-9]

I don't understand these examples. Why have [, ], and then the rest of the characters inside a [] - that just makes the string have [ and ] there twice, right?

LV: the first regular expression should be seen like this:

{ ... }
Protect the 9 inner characters.
[ ... ]
Define a set of characters to process.
]
If your set of characters is going to include the right bracket character ] as a specific matching character, then it needs to be first in the set/class definition.
[${}
More individual characters.
\\
Doubled because when regexp goes to evaluate the characters, it would otherwise treat a single backslash as a request to quote the next character, the ending right bracket of the set/class.

The second regular expression is interpreted in a similar fashion. There are more characters because there are more metacharacters.
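A sketch consistent with the breakdown above (braces around a 9-character bracket expression). The sample string and the use of regsub to backslash-escape each special character are assumptions made for illustration:

```tcl
# Hypothetical input containing Tcl special characters.
set string {a[b]$c}

# The class lists ] first, then [ $ { }, then a doubled backslash.
# The replacement \\& emits a literal backslash followed by the match.
regsub -all {[][${}\\]} $string {\\&} escaped
puts $escaped
```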

Also, not all characters are there - where are the period, equals, bang (exclamation sign), dash, colon, alphas that are a part of character entry escapes or classes, 0, hash/pound sign, and angle brackets (< and >)? These special characters all have meta meanings within regular expressions...

LV: Apparently no one has come along and updated the above expression to cover these.

Example posted by KC:

A set containing both angle brackets:

newline/carriage return

Could someone replace this line with some verbiage regarding the way one uses regular expressions for specific newline-carriage return handling (as opposed to the use of the $ metacharacter)?

Janos Holanyi: I would really need to build up an RE that would match one line and only one line - that is, excluding carriage-return-newlines (\r\n) from matching... What would such an RE look like?

LV: how about something like this?

If you want to keep carriage returns or newlines by themselves, but not when they are together, you need something like:

This allows plain carriage return or plain newline.

Thanks to bbh and Donal Fellows for this regular expression.

Back References

From comp.lang.tcl:

I did some experimenting with other strings, like 'just a HHHHEEEEAAAADDDDEEEERRRR'. The regular expression (.)\1\1\1 does the job I would have wanted, whereas (.){4} will return the last of each four characters - as posted as well.

That surprised me too -- being able to place backreferences within the regex is an extremely powerful technique.

for exactly 4 char repeats, and (.)\1+ for arbitrary repeats.
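A short sketch of both forms, using the string from the discussion above. The variable names letters and runs, and the second sample string, are illustrative:

```tcl
set text "just a HHHHEEEEAAAADDDDEEEERRRR"

# (.)\1\1\1 matches a character followed by three copies of itself.
# With -all -inline the result alternates: full match, captured character.
set letters ""
foreach {run char} [regexp -all -inline {(.)\1\1\1} $text] {
    append letters $char
}
puts $letters

# (.)\1+ matches runs of two or more of the same character.
set runs [regexp -all -inline {(.)\1+} "aabbbc"]
puts $runs
```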

Whitespace After a Newline

PYK 2019-02-21: How does one capture any whitespace followed by a newline, except for newlines? The key is to use a negative lookahead to match empty space not followed by a newline. That bears repeating: Parentheses are used to isolate the negative lookahead so that what matches immediately prior is the empty string:

This mechanism is effectively an and not operator.

bll 2019-02-21: I find:

much easier.

PYK 2019-02-21: That picks up much more than whitespace, so not quite the same thing.

IP Numbers

You can create a regular expression to check an IP address for correct syntax. Note that this regular expression only checks for groups of 1-3 digits separated by periods. If you want to ensure that the digit groups are from 0-255, or that you have a valid IP address, you'll have to do additional (non regexp) work. This code posted to comp.lang.tcl by George Peter Staplin

The above regular expression matches any string where there are four groups of 1-3 digits separated by periods. Since it's not anchored to the start and end of the string (with ^ and $) it will match any string that contains four groups of 1-3 digits separated by periods, such as: '66.70.7.154.9'.
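The effect of anchoring can be sketched as follows. The expression is a reconstruction of the pattern as described (four groups of 1-3 digits separated by periods), and the variable names loose and strict are assumptions:

```tcl
# Four groups of 1-3 digits separated by literal periods, unanchored.
set loose {\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}}

# The same expression anchored to the start and end of the string.
set strict "^$loose\$"

set r1 [regexp -- $loose 66.70.7.154.9]    ;# a substring matches
set r2 [regexp -- $strict 66.70.7.154.9]   ;# the whole string does not
set r3 [regexp -- $strict 66.70.7.154]
puts "$r1 $r2 $r3"
```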

If you don't mind a longer regexp, there is no reason you can't ensure that each group of 1-3 digits is in the range of 0-255. For example (broken up a bit to make it more readable):
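One way to write such an expression is sketched here. It is not the original example from this page, it rejects leading zeroes as a design choice, and the variable names octet and ip_re are assumptions:

```tcl
# One octet: 250-255, 200-249, 100-199, or 0-99 without a leading zero.
set octet {(?:25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9])}

# Four octets joined by literal periods, anchored at both ends.
set ip_re "^[join [lrepeat 4 $octet] {\.}]\$"

set results {}
foreach candidate {245.254.253.2 265.254.243.2 99a99b99c99 0.0.0.0 01.2.3.4} {
    lappend results [regexp -- $ip_re $candidate]
}
puts $results
```

Building the expression from one octet pattern keeps it readable and avoids repeating the alternation four times by hand.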

Recently on comp.lang.tcl, someone mentioned that http://www.oreilly.com/catalog/regex/chapter/ch04.html#Be_Specific talks about matching IP addresses.

Gururajesh: A Perfect regular expression to validate ip address with a single expression.

For 245.254.253.2, output is 245.254.253.2

For 265.254.243.2, there is no output, as an octet of an IP address can't be greater than 255.

Lars H: Perfect? No, it looks like it would accept 99a99b99c99, since . will match any character. Also, it can be shortened significantly by making use of {4} and the like (see Regular expressions).

Better is

Tcllib should be useful

freethomas: I think this regexp is much simpler and easier for IP numbers:

AMG: This expression allows any character to separate the octets, not just period. I sincerely doubt this is what you want. Use \. instead of \D. Also it's not anchored with ^ and $, so it works on substrings rather than requiring that the whole string match. Though maybe this is what you want since you explicitly capture the matching substring.

I already fixed the syntax issue of saying { at the beginning but leaving out the closing }, also of leaving out the first (.

I see no reason to use ( and ) grouping. You don't give variables into which the subexpressions would be captured, and it's pointless to capture the dots between the octets. (See what I did there?) Try this:

AMG: Here's a very similar script (to Lars H's contribution) that uses scan instead of regexp. It's much more readable, in my opinion.

There are a few differences. One, the trailing dot is omitted from the first three output variables (which I call a, b, c, d instead of v1, v2, v3, v4). Two, leading zeroes are permitted and discarded. Three, -0 is accepted as 0. Four, garbage at the end of $string is silently discarded. Five, each octet can have a leading +, e.g. +255.+255.+255.+255. Six, it's OVER FIVE TIMES FASTER! On this machine, my version using scan takes 15 microseconds, whereas your version using regexp takes 78 microseconds. Use time to measure performance. (I replaced puts with return when testing.)

Now, here's a hybrid version that uses regexp.

This version takes 46 microseconds to execute. It doesn't accept leading + or -. It rejects garbage at the end of the string. It treats the octets as octal if they are given leading zeroes, and invalid octal is always accepted. The reason for the last point is that if treats strings containing invalid octal as nonnumeric text, so the <= operator compares them as text rather than as numbers. Corrected version:

This version takes 47 microseconds and it rejects invalid octal. However, it still interprets numbers as octal if leading zeroes are given, so 0377.255.255.255 is accepted (but 0400.255.255.255 is rejected). To fix this, it would be necessary to make a pattern that rejects leading zeroes unless the octet is exactly zero, something like: (0|[1-9]\d*). But this is getting clumsy and slow; I prefer the scan solution. regexp: not always the right tool!

Gururajesh:

This will be ok... for above mentioned issue.

AMG: Why call scan four times? A single invocation can do the job:

I don't see any drawbacks to this approach. The regular expression is simple and is used only to reject + and - signs and garbage at the end, scan does the job of splitting and converting to integers, and math expressions check ranges. Three tools, each doing what they're designed for.
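The three-tool combination described above can be sketched like this. It is an illustration under assumptions, not AMG's original script. The proc name parse_ip and its return convention (a list of four integers, or an empty list on failure) are hypothetical:

```tcl
# Returns the four octets as a list, or an empty list if the input is invalid.
proc parse_ip {string} {
    # regexp rejects signs and trailing garbage; scan splits and converts
    # (%d reads decimal, so leading zeroes are discarded); expr checks ranges.
    if {[regexp {^\d+\.\d+\.\d+\.\d+$} $string]
        && [scan $string %d.%d.%d.%d a b c d] == 4
        && $a <= 255 && $b <= 255 && $c <= 255 && $d <= 255} {
        return [list $a $b $c $d]
    }
    return {}
}

puts [parse_ip 192.168.0.1]
puts [parse_ip 010.0.0.1]
```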

CJB: Here is a pure regexp version with comparable performance. It matches any valid ip, rejecting octals. However it does not split the integers and is therefore only useful for validation. The timings on my computer were about 22 microseconds for this version compared to 28 microseconds for the regexp/scan combo (I removed the puts statements for the comparison because they are slow and tend to vary).

Note that the pure scan version is still fastest (about 20 microseconds), splits, and has the same rejections (%d stores integers and ignores extra leading 0 characters).

fh 2012-02-13 11:54:30:

To search IP ADDRESS using Regular Expression

Domain names

(First shot)

This code does NOT attempt, obviously, to ensure that the last level of the regular expression matches a known domain...

Regular Expression for parsing http string

the above author should remember this is a Tcl wiki, and not an aolserver one, but thanks for the submission ;)

PYK 2016-02-28: In the previous edit, a - character was added to the regular expression, prohibiting the occurrence of - in scheme component of a URL. As far as I can tell, - is allowed in the scheme component, so I've reverted that change in the expression above.

E-mail addresses

RS: No warranty, just a first shot:

Understand that this expression is an attempt to see if a string has a format that is compatible with normal RFC SMTP email address formats. It does not attempt to see whether the email address is correct. Also, it does not account for comments embedded within email addresses, which are defined even though seldom used.

bll 2017-6-30 E-mail addresses are quite complicated. You must be careful not to reject valid e-mail addresses. For example, % and + characters are valid. Nobody uses the % sign any more as it is not secure. The + character is very useful, but unfortunately, there are a lot of incorrect e-mail validation routines that reject it.

The following pattern will still reject an e-mail address whose domain part is a bracketed IP address ([ip-address]). No lengths are checked. It does not check that the top-level domain (e.g. .org, .com, .solutions) is valid.

Reference: https://en.wikipedia.org/wiki/Email_address#Examples

XML-like data

To match something similar to XML-tags you can use regular-expressions, too. Let's assume we have this text:

We can match the body of bo with this regexp:
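A minimal sketch with a made-up text, since the original sample is not shown here. It contrasts a non-greedy and a greedy capture of the body. The variable names lazy and greedy are illustrative:

```tcl
# Hypothetical XML-like text with two bo elements.
set text {<bo>first</bo> and <bo>second</bo>}

# Non-greedy: stops at the first closing tag.
regexp {<bo>(.*?)</bo>} $text -> lazy

# Greedy: runs on to the last closing tag.
regexp {<bo>(.*)</bo>} $text -> greedy

puts $lazy
puts $greedy
```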

Now we extend our XML-text with some attributes for the tags, say:

If we try to match this with:

it won't work anymore. This is because \s+ is greedy (in contrast to the non-greedy (.+?) and (.*?)), and that one greedy quantifier makes the whole expression greedy.

See Henry Spencer's reply in tcl 8.2 regexp not doing non-greedy matching correctly , comp.lang.tcl, 1999-09-20.

The correct way is:

Now we can write a more general XML-to-whatever translator like this:

  1. Substitute [ and ] with their escaped forms \[ and \] to avoid confusion with subst in 3.
  2. Substitute the tags and attributes with commands
  3. Do a subst on the whole text, thereby calling the inserted commands

Call the parser with:

You have to be careful, though. Don't do this for large texts or texts with many nested XML tags, because the regular expression engine is not the right tool to parse large, nested files efficiently. (Stefan Vogel)

DKF: I agree with that last point. If you are really dealing with XML, it is better to use a proper tool like TclDOM or tDOM.

PYK 2015-10-30: I patched the regular expression to fix an issue where the attributes group could pick up part of the tag in documents containing tags with similar prefixes. The fix is to use whitespace followed by non-whitespace other than > to detect the beginning of attributes. There are other things

Negated string

Bruce Hartweg wrote in comp.lang.tcl: You can't negate a regular expression, but you CAN negate a regular expression that is only a simple string. Logically, it's the following:

  • match any single char except first letter in the string.
  • match the first char in string if followed by any letter except the 2nd
  • match the first two if followed by any but the third, et cetera

Then the only thing more is to allow a partial match of the string at end of line. So for a regexp that matches

The following proc will build the expression for any given string

Donal Fellows followed up with:

That's just set me thinking; you can do this by specifying that the whole string must be either not the character of the antimatch*, or the first character of the antimatch so long as it is not followed by the rest of the antimatch. This leads to a fairly simply expressed pattern.

In fact, this allows us to strengthen what you say above to allow the matching of any negated regular expression directly so long as the first component of the antimatch is a literal, and the rest of the antimatch is expressible in an ERE lookahead constraint (which imposes a number of restrictions, but still allows for some fairly sophisticated patterns.)

* Anything's better than overloading 'string' here!

JMN 2005-12-22: Could someone please explain what is meant by a 'negated string' here? Specifically - what do the above achieve that isn't satisfied by the simpler:

Doesn't the following snippet from the regexp manpage indicate that a regexp can be negated? Where does (or did?) the 'simple string' requirement come in? Is this info no longer current?

Lars H: It indeed seems the entire problem is rather trivial. In Tcl 7 (before AREs) one sometimes had to do funny tricks like the ones Bruce Hartweg performs above, but his use of {0,2} means he must be assuming AREs. Perhaps there was a transitory period where one was available but not the other.

Oleg 2009-12-11: If one needs to match any string but 'foo', then the following will do the work:

And in the general case, when one needs to match any string that is neither 'foo' nor 'bar', the following will do the work:

CRML 2013-11-06: In the general case, matching any string that is neither 'foo' nor 'bar' might be done using:

AMG: Oleg's regexps confuse me. Translated literally, I read them as 'match any string that does not begin with foo (or bar) unless that string has more characters after the foo (or bar).' Very indirect, I must say. CRML's suggestion I like better, though I would drop the extra parentheses to obtain: ^(?!(foo|bar)$). This says, 'match any string that does not begin with either foo or bar when immediately followed by end of string.' In other words, 'match any string that is not exactly foo or bar.'
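The simplified expression from the comment above can be tried out as follows. The list of test strings and the variable names are made up for illustration:

```tcl
# Negative lookahead: fail only when the whole string is exactly foo or bar.
set re {^(?!(foo|bar)$)}

set results {}
foreach s {foo bar foobar food {}} {
    lappend results [regexp -- $re $s]
}
puts $results
```

Note that the empty string is accepted, since it is neither foo nor bar.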

Turn a string into %hex-escaped (url encoded) characters:

e.g. Csan -> %43%73%61%6E

This demonstrates the power of using regsub together with subst, which is regarded as one of the most powerful ways to use regular expressions in Tcl.

Turn a string into %hex-escaped (url encoded) characters (part 2)

This one makes the result more readable and still quite safe to use in URLs, e.g. http://wiki.tcl.tk -> http%3A%2F%2Fwiki%2Etcl%2Etk

The inverse of the above (not optimized):

Caveats about using regsub with subst

glennj 2008-12-16: It can be dangerous to blindly apply subst to the results of regsub, particularly if you have not validated the input string. Here's an example that's not too contrived:

This results in invalid command name 'Some'. What if $string was [exec format c:]?

See DKF's 'proc regsub-eval' contribution in regsub to properly prepare the input string for substitution. Paraphrased:

which results in what you'd expect: the string '[Some Malicious Command]'
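A sketch of the escape-then-substitute pattern, in the spirit of the approach described but not DKF's actual code. The input string, the doubling transformation, and the variable names safe, script, and result are all assumptions made for illustration:

```tcl
# Hypothetical untrusted input containing a command and a dollar sign.
set string {[exit] costs $5}

# Step 1: backslash-escape everything subst would act on.
set safe [string map {\[ \\[ \] \\] \$ \\$ \\ \\\\} $string]

# Step 2: insert the command we do want evaluated (double each number).
regsub -all {\d+} $safe {[expr {& * 2}]} script

# Step 3: subst now runs only the inserted command.
set result [subst $script]
puts $result
```

Because the input's own brackets and dollar signs are escaped before regsub adds new ones, the embedded exit is never executed.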

APN: I don't follow why all the extra backslashes are needed in the string map. The following should work just as well?

PYK 2016-05-28: Indeed:

Maintain proper spacing when formatting for HTML

DG got this from Kevin Kenny on c.l.t.

And the output is:

Tabs require replacement, too:

glennj: Taken from comp.lang.perl.misc, transform variable names into StudlyCapsNames:

When using ASED's syntax checker you get an error if you don't use the -- option to regexp. Instead of regexp {([^A-Za-z0-9_-])} $string you have to write regexp -- {([^A-Za-z0-9_-])} $string

LV: A user recently asked:

I have a string that I'm trying to parse. Why doesn't this seem to work?

It looks to me like the *? causes the subsequent \d+ to also be non-greedy and only match the first hit. Did I figure that out correctly? I presume that we currently don't have a way to turn off the greediness item?

Of course, in this simplified problem, one could just drop the greediness and code

I'll let the user decide if that suffices.

PYK 2019-08-15: See 'greediness' at Regular Expressions. In short the greediness of every quantifier is the greediness of its branch, regardless of the default preference of the quantifier. A branch, in turn, picks up its greediness from the first quantifier.
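The rule can be seen in a small sketch. The input string is made up, and the variable names lazy_branch and greedy_branch are illustrative:

```tcl
set text "abc123def"

# The first quantifier (*?) is non-greedy, so the whole branch is
# non-greedy and \d+ captures as little as possible.
regexp {.*?(\d+)} $text -> lazy_branch

# With only greedy quantifiers, \d+ captures the full run of digits.
regexp {\D*(\d+)} $text -> greedy_branch

puts "$lazy_branch $greedy_branch"
```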

How do you select from two words?

LES: You got the regexp syntax wrong and tried to match the regular expression with the string 'match'. There is no 'zzz' variable (the actual match variable in your code) because your regular expression does not match the string 'match'. Try this:

Note that I could have dropped the 'zzz' variable, but left it there as a second match variable, as an exercise to you. You should understand why and what it does if you read the regexp page and assimilate the syntax.

Infinite spaces at start and end

RUJ: Could you match the following pattern in the following string: infinite spaces at start and end.

LV: try

which should have a value of 1 (in other words, it matched). Of course, if those leading and trailing spaces are optional, then change the + to a *.

CRML: non-greedy and greedy matching do not give the same result. In the previous example, the .* matches the whole string up to the last-but-one character.

URL Parser

See URL Parser.

Match a 'quoted string'

AMG: Adapted from Wibble:

This recognizes strings starting and ending with double quote characters. Any character can be embedded in the string, even double quotes, when preceded by an odd number of backslashes.
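An expression with the behavior described can be sketched as below. This is a reconstruction, not necessarily the exact expression from Wibble, and the input string is made up:

```tcl
# Hypothetical input: a quoted string containing backslash-escaped quotes.
set input {say "a \"quoted\" word" now}

# Between the quotes: either any character other than backslash or quote,
# or a backslash followed by any single character.
regexp {"(?:[^\\"]|\\.)*"} $input quoted
puts $quoted
```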

Word Splitting, Respecting Quoted Strings

given some text, e.g.

how to parse it into

see KBK, #tcl irc channel, 2012-12-02

split a string into n-length substrings

evilotto, #tcl, 2013-02-07
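A minimal sketch of one way to do this with regexp, for n = 3. It is an illustration and may differ from the snippet originally posted:

```tcl
# .{1,3} takes up to three characters at a time; the final chunk may be shorter.
set chunks [regexp -all -inline {.{1,3}} abcdefgh]
puts $chunks
```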

At Least 1 Alpha Character Interspersed with 0 or More Digits

Matching a group of strings

We can match a group of strings or subjects in a single regular expression

Sqlite Numeric Literal

ak - 2017-08-08 03:32:33

Regarding negation of regular expressions.

While the regular expression syntax does not allow for simple negation, the underlying formalism of (non)deterministic finite automata does. Simply swap final and non-final states to negate, i.e. complement it.

See for example the grammar::fa package in Tcllib, which provides a complement method. It is implemented in the operations package. As are methods to convert from and to regular expressions.

Category Tutorial | Category String Processing