Day: January 28, 2008

A Little about the Regular Expressions

A regular expression (regex or regexp for short) is a special text string for describing a search pattern. You can think of regular expressions as wildcards on steroids. You are probably familiar with wildcard notations such as *.txt to find all text files in a file manager. The regex equivalent is .*\.txt$.
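To make the comparison concrete, here is a minimal Python sketch that filters a list of file names once with the wildcard and once with the regex. The file names are made up for illustration, and Python's standard fnmatch and re modules stand in for whatever file manager or tool you actually use.

```python
import fnmatch
import re

# Hypothetical file names, used only for illustration.
names = ["notes.txt", "report.pdf", "todo.txt", "txt_backup.zip"]

# Wildcard match, roughly what a file manager does with *.txt
by_wildcard = [n for n in names if fnmatch.fnmatchcase(n, "*.txt")]

# The regex from the text: any characters, a literal dot, then "txt" at the end.
pattern = re.compile(r".*\.txt$")
by_regex = [n for n in names if pattern.match(n)]

print(by_wildcard)
print(by_regex)
```

Both lists should contain the same two names. The backslash before the dot matters: an unescaped dot in a regex matches any character, not just a literal dot.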

But you can do much more with regular expressions. In a text editor like EditPad Pro or a specialized text processing tool like PowerGREP, you could use the regular expression \b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b to search for an email address. It matches most addresses in common use, provided the search is case insensitive, since the character classes list only uppercase letters; the {2,4} at the end limits the top-level domain to two to four letters. A very similar regular expression (replace the first \b with ^ and the last one with $) can be used by a programmer to check whether the user entered a properly formatted email address, in just one line of code, whether that code is written in Perl, PHP, Java, a .NET language or a multitude of other languages.
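As a rough illustration, the sketch below uses the pattern from the text in Python, once with \b on both sides to search inside a longer string and once with ^ and $ to check a whole string. The sample text, the addresses and the helper name is_valid are invented for this example, and the IGNORECASE flag is an assumption needed because the pattern only lists uppercase letters.

```python
import re

# The pattern from the text. The character classes list only uppercase
# letters, so it is compiled with IGNORECASE to accept lowercase input too.
EMAIL = r"[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}"
search_email = re.compile(r"\b" + EMAIL + r"\b", re.IGNORECASE)
check_email = re.compile(r"^" + EMAIL + r"$", re.IGNORECASE)

# Hypothetical sample text.
text = "Contact alice@example.com or bob.smith@mail.example.org for details."
found = search_email.findall(text)


def is_valid(address):
    """Return True if the whole string looks like an email address."""
    return check_email.match(address) is not None


print(found)
print(is_valid("alice@example.com"))
print(is_valid("not an address"))
```

Note that this is a format check only. It says nothing about whether the address exists, and because of the {2,4} at the end it rejects addresses whose top-level domain is longer than four letters.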

If you are not a programmer, you can use regular expressions in many situations just as well. They will make finding information a lot easier. You can use them in powerful search and replace operations to quickly make changes across large numbers of files. A simple example is gr[ae]y which will find both spellings of the word grey in one operation, instead of two. There are many text editors and search and replace tools with decent regex support.
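The gr[ae]y example can be tried out in a few lines of Python. This is only a sketch with an invented sample sentence; a text editor with regex support would do the same through its search and replace dialog.

```python
import re

# Hypothetical sample text containing both spellings.
text = "The grey cat sat on a gray mat under a greying sky."

# One pattern finds both spellings in a single pass.
spellings = re.findall(r"gr[ae]y", text)

# Search and replace: normalise every occurrence to one spelling.
normalised = re.sub(r"gr[ae]y", "grey", text)

print(spellings)
print(normalised)
```

The pattern also matches inside longer words such as "greying". Wrapping it in \b on both sides would restrict it to the standalone word.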

If you’re hungry for more information on regular expressions after reading this website, there are a variety of books on the subject.

RegExLib.com, the Internet’s first Regular Expression Library, has indexed 1935 expressions from 1188 contributors around the world.

A large number of tools incorporate regular expressions as part of their functionality. Unix-oriented command line tools like grep, sed, and awk are mostly wrappers for regular expression processing. Many text editors allow search and/or replacement based on regular expressions. Many programming languages, especially scripting languages such as Perl, Python, and Tcl, build regular expressions into the heart of the language. Even most command-line shells, such as Bash or the Windows console, allow restricted regular expressions as part of their command syntax.

There are a few variations in regular expression syntax between different tools that use them. Some tools add enhanced capabilities that are not available everywhere. In general, for the simplest cases, this tutorial will use examples based around grep or sed. For a few more exotic capabilities, Perl or Python examples will be chosen. For the most part, examples will work anywhere; but check the documentation on your own tool for syntax variations and capabilities.

A good tutorial on regular expressions can be found at http://gnosis.cx/publish/programming/regular_expressions.html
The tutorial was written by David Mertz (mertz@gnosis.cx).

A good amount of information and other tutorials on regular expressions can be found on regular-expressions.info (recommended if you would like more information on regular expressions).

And there is RegExLib.com, the Internet’s first Regular Expression Library. Many thanks to all the contributors who have built up the library and helped people like us.

The Tcl programming Language

Tcl (Tool Command Language) is a very powerful but easy to learn dynamic programming language, suitable for a very wide range of uses, including web and desktop applications, networking, administration, testing and many more. Open source and business-friendly, Tcl is a mature yet evolving language that is truly cross platform, easily deployed and highly extensible.

Tk is a graphical user interface toolkit that takes developing desktop applications to a higher level than conventional approaches. Tk is the standard GUI not only for Tcl, but for many other dynamic languages, and can produce rich, native applications that run unchanged across Windows, Mac OS X, Linux and more.

Here in Library Systems at The University of Chicago, Tcl is already in use:

* as a general-purpose scripting language to write Unix applications;
* to automate login scripts for telnet connections;
* as the programming language for the Reserve System.

In addition, a number of our Unix applications are written in Tcl (see below), and Tcl is the programming language of the University of Chicago BSDAC Phoenix Project.

The Tcl/Tk developer community now numbers in the tens of thousands and there are thousands of Tcl applications in existence or under development. The application areas for Tcl and Tk cover virtually the entire spectrum of graphical and engineering applications, including computer-aided design, software development, testing, instrument control, scientific visualization, and multimedia. […] Tcl and Tk are being used by hundreds of companies, large and small, as well as universities and research laboratories.[1]

Tcl provides all the usual high-level programming features that we’ve come to expect from languages like the Unix shell, Awk, Perl, or Rexx, such as:

* Variable-length strings
* Associative arrays
* Lists
* Keyed lists (aka structs, structures or records)
* Pattern matching with regular expressions
* Ability to define or redefine procedures at run-time
* Full file access
* Error handling

Tcl is a small language designed to be embedded in other applications (C programs for example) as a configuration and extension language. This minimizes the number of languages that users need to learn in order to configure their applications, and makes these applications programmable with no extra effort. In addition, Tcl is a complete and well-designed programming language, whereas many existing configuration languages were designed (to be kind) in an ad hoc manner.

Syntax

An instruction is the name of a command (not a keyword) followed by its arguments, a list of words separated by whitespace.
Statements end with the end of the line. Several statements may be placed on the same line, separated by semicolons.
Square brackets replace an argument with the result of a command; they are thus substitution symbols.
The = sign is never used for assignment; the set command assigns a value to a variable:

set varname value

The { } serve for grouping, without substitution.

# introduces a comment.

Control structures

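The if command uses two groups { }, the first for the condition, the second for the actions. The opening brace of the second group must sit on the same line as the condition, because a newline ends the command.

if { $x < 10 } {
    puts "x less than 10"
}

The while command has the same syntax.

Procedures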

The definition starts with the proc command, plus the name and two groups for the arguments and the statements.

proc procname { arguments } {
    …statements…
}

More information and tutorials on the Tcl programming language can be found on the Tcl/Tk website, http://www.tcl.tk