"CookBook" matrix

Samples of code (Just started 2007.Feb.10: Web-accessible demos of examples of how to accomplish various tasks in all six languages, after HTML FORM contents have already been decoded. Currently contains only one task: validating that a particular form field contains a (string) representation of an integer, and if so then converting it to an actual integer value, and if successful then counting past that value just to show we have an actual integer there.)
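Here's a minimal sketch in C of that one task, assuming the form field has already been decoded into an ordinary NUL-terminated string. The function name parse_int_field is made up for illustration (it isn't taken from the demos); it relies only on the standard library function strtol.

```c
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: returns 1 and stores the value in *out if field
   holds exactly one decimal integer that fits in an int, else returns 0. */
int parse_int_field(const char *field, int *out)
{
    char *end;
    long val;

    if (field == NULL || *field == '\0')
        return 0;                       /* missing or empty field */
    errno = 0;
    val = strtol(field, &end, 10);      /* skips leading whitespace itself */
    if (end == field)
        return 0;                       /* no digits were found */
    while (isspace((unsigned char)*end))
        end++;                          /* tolerate trailing blanks */
    if (*end != '\0')
        return 0;                       /* junk after the number */
    if (errno == ERANGE || val < INT_MIN || val > INT_MAX)
        return 0;                       /* doesn't fit in an int */
    *out = (int)val;
    return 1;
}
```

Once the call succeeds, the caller has a real int and can count past it, for example with a loop such as for (i = n; i < n + 3; i++).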

The primary purpose of this document is to provide a set of answers for questions of the general form "in language L, how can a program specify conversion of data-type T1 to data-type T2" for all reasonable combinations of two data types. This idea was inspired by a question of that form which appeared in a newsgroup, specifically how to find the ASCII character code for a given character. (original question: <1hslw8p.x4kuq71t0cwqoN%wrf3@stablecross.com> / complaint that Google search didn't find the needed info, which actually triggered my realization of a need for this 'matrix': <1hslyqe.188ewmp1wjd3gqN%wrf3@stablecross.com>) I've chosen to provide this "matrix" (as if the two data-types were row- and column-headers in a rectangular table) for six (at present when I wrote this in 2007) programming languages, namely the six (c, c++, java, Common Lisp, perl, PHP) for which I know how to implement demonstrations via CGI and have such facilities available on this Unix shell ISP, so that I can in fact eventually provide sample source code and runnable demonstrations of each transformation that I describe. (Update 2008.March: Flying Thunder was added to the CGI-capable languages, so eventually I should include it here too.) At each cell in this matrix, all six languages are compared as to how to program that one transformation in each. For the near future, I'm going to restrict my coverage to those such data-type transformations which are available as a built-in primitive (operator or function or method etc.) in at least one of these languages. I'm starting with those which are available in C (about 80% done as of Feb.25), then when that's done I plan to extend my coverage to C++, then Common Lisp, then java (up to J2SE version 1.4 only), then to perl and PHP.

The secondary purpose of this document more generally is to document every commonly-available and generally-useful data-processing task which is available as a built-in primitive (operator or function or method etc.) in at least one of these languages. Somebody who doesn't yet write computer software might thereby get a preliminary "feel" for which tasks are already programmed and immediately accessible by a single line of code compared to which tasks require loading a third-party library or writing an algorithm from scratch yourself. This information could serve to entice somebody into starting to write software, or allow somebody to assess whether a proposed project will be easy (just paste together a few existing functions or methods) or hard (write ten thousand new lines of code from scratch), and hence whether it's worth the novice's energy to even start such a project. This information could also help somebody choose an appropriate programming language for a new project, if such a choice is allowed (by the funding agency or supervisor/boss etc.)

To do (rough draft here): Add disclaimer that for C I want first to describe everything available from the very start (K&R), and then try to include new features that have been added or changed via the C89 and C99 standards which haven't necessarily all gotten incorporated into the particular version of C that you may be using.

If you know the two primary data types you want to work with or convert between, find one of them as a primary heading in the matrix, then the other as a sub-heading after "and". If there's just a stub for the sub-heading, tell me and I'll expedite filling in that section. If there's already a link, click on it, and you'll get to a section that details functions/operators/methods that coordinate those two data types somehow.

If you know the name for a function in one language, which performs some task of interest, and you want to learn how to perform the equivalent task in another language, use the local-search feature of your Web browser to directly search for the name you already know. For a binary operator, surround the search term with spaces. For a unary operator, I'm not sure how to find it, maybe just browse the section that deals with the two datatypes, searching for that operator character within that section.

For the more basic info introducing these various data types in the various languages (chapter 1), and how to write programs in these languages (chapter 2), backtrack to the main file.

Most languages don't have an explicit boolean data type, so they fake boolean by mapping true/false values into a subset of values of some other data type.

lisp -- NIL represents false, while any other value whatsoever is treated as true. Functions that return pure boolean values return T to represent true, and this is the recommended value to use when writing code that requires a true value as a constant. Both NIL and T are constants, each of which is usable as if it were a literal. Many functions return "extra-boolean" results, i.e. NIL for failure but some other value with extra meaning for success. For example, position looks for a particular element within a sequence, returning NIL if no such element was found, otherwise the index of the first such element found.
c, c++ -- The number 0 (of any integer type) is treated as false, while any nonzero integer is treated as true. The built-in relational and logical operators (<, ==, !, &&, etc.) always generate exactly 1 for true and 0 for false (in c++ the result has type bool, which converts to 1 or 0), but library predicates such as isdigit are only guaranteed to return some nonzero value for true, so never test their result by comparing it against 1. As a constant/literal I tend to prefer 1, which matches what the operators generate and is shortest to type. Warning: Some library functions, where a return value of 0 would represent success, use a totally different convention of returning -1 to represent failure.
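A small C sketch of the integer-as-boolean convention. The function names as_flag and is_less are hypothetical, chosen just for this illustration. In standard C the built-in comparison and logical operators produce an int that is exactly 0 or 1, and any nonzero value counts as true when tested.

```c
/* Normalizes any C truth value to exactly 0 or 1 by negating twice. */
int as_flag(int x)
{
    return !!x;
}

/* A relational operator itself yields an int that is 0 or 1. */
int is_less(int a, int b)
{
    return a < b;
}
```

So as_flag(-5) and as_flag(42) both give 1, while as_flag(0) gives 0.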

Logical negation: given boolean b1, compute its logical inverse:

c, c++, java -- ! b1
lisp -- (not b1)

Logical AND: Given boolean b1,b2, compute their logical conjunction:

c, c++, java -- b1 && b2
lisp -- (and b1 b2)

Logical OR: Given boolean b1,b2, compute their logical disjunction:

c, c++, java -- b1 || b2
lisp -- (or b1 b2)

Unary minus: Given number n1, compute its additive inverse:

c, c++, java -- - n1
lisp -- (- n1)

Binary plus: Given numbers n1,n2, compute their sum:

c, c++, java -- n1 + n2
lisp -- (+ n1 n2) (Note: more than two arguments can be given)

Assignment plus: Given numeric variable n1, and number n2, replace the value of n1 by the sum of the old value of n1 and n2:

c, c++, java -- n1 += n2; (n1 can be only a simple variable, an array element, or an element of a struct)
lisp -- (incf n1 n2) (n1 can be any of a wide variety of "places")

Binary minus: Given numbers n1,n2, compute n2 subtracted from n1:

c, c++, java -- n1 - n2
lisp -- (- n1 n2)

Assignment minus: Given numeric variable n1, and number n2, replace the value of n1 by the difference of the old value of n1 minus n2:

c, c++, java -- n1 -= n2; (n1 can be only a simple variable, an array element, or an element of a struct)
lisp -- (decf n1 n2) (n1 can be any of a wide variety of "places")

Multiply: Given numbers n1,n2, compute their product:

c, c++, java -- n1 * n2
lisp -- (* n1 n2)

Assignment multiply: Given numeric variable n1, and number n2, replace the value of n1 by the product of the old value of n1 times n2:

c, c++, java -- n1 *= n2; (n1 can be only a simple variable, an array element, or an element of a struct)
lisp -- (setf n1 (* n1 n2)) (n1 can be any of a wide variety of "places")
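A short C sketch of the three assignment operators above applied to something other than a simple variable. The struct account and the function apply_ops are hypothetical names used only for this example; the point is that the left side may be an array element, a struct member, or an object reached through a pointer.

```c
struct account { long balance; };   /* hypothetical record type */

/* Applies one compound assignment to each kind of place. */
void apply_ops(int *arr, struct account *acct, double *scale)
{
    arr[1] += 10;          /* array element */
    acct->balance -= 25;   /* struct member reached via a pointer */
    *scale *= 2.0;         /* object reached by dereferencing a pointer */
}
```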

Divide: Given numbers n1,n2, compute quotient of n1 divided by n2:

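c, c++, java -- n1 / n2 (result depends on data type: see Floats for floating-point operands; with integer operands an exact quotient is not possible, see divide-truncate under Integers)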
lisp -- (/ n1 n2)

Divide-floor, divide-ceiling, divide-truncate, divide-round: (depends on type and sign of numbers, see Floats or Integers)

Remainder (modulus): (depends on type and sign of numbers, see Floats or Integers)

Pre-increment: Given an integer variable n1, increase its value by one, and also return the result:

c, c++, java -- ++n1 (works only for simple variables and array elements)
lisp -- (incf n1) (works for a wide range of generalized places)

Post-increment: Given an integer variable, save the old value, increase the actual value by one, return the saved old value:

c, c++, java -- n1++ (works only for simple variables and array elements)
lisp -- (prog1 n1 (incf n1)) (see notes for prog1 and post-increment) If n1 is an expression for a place, such as a deep search into a tree, or an array reference where computation of the index is expensive, you probably don't want to evaluate it twice, so the proper code is: (let ((tmp n1)) (setf n1 (+ 1 tmp)) tmp)

Pre-decrement: Given an integer variable n1, decrease its value by one, and also return the result:

c, c++, java -- --n1 (works only for simple variables and array elements)
lisp -- (decf n1) (works for a wide range of generalized places)

Post-decrement: Given an integer variable, save the old value, decrease the actual value by one, return the saved old value:

c, c++, java -- n1-- (works only for simple variables and array elements)
lisp -- (prog1 n1 (decf n1)) or (let ((tmp n1)) (setf n1 (+ -1 tmp)) tmp) (see notes for prog1 and post-increment)
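To make the difference between the pre- and post- forms concrete, here's a tiny C sketch. The wrapper names pre_inc and post_inc are made up for this example; each takes a pointer so the caller can see both the returned value and the updated variable.

```c
/* Pre-increment: the variable is increased first, and the new value is returned. */
int pre_inc(int *n)
{
    return ++*n;
}

/* Post-increment: the old value is returned, and the variable is increased. */
int post_inc(int *n)
{
    return (*n)++;
}
```

Starting from 5 in both cases, pre_inc returns 6 and post_inc returns 5, but the variable ends up holding 6 either way. The decrement forms behave the same way in the other direction.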

Divide (exact result): Given integers n1,n2, divide n1 by n2, producing the exact integer or rational result depending on whether n1 is or is not an exact multiple of n2:

lisp -- (/ n1 n2)
c, c++, java -- (not possible, no rational type in these languages, see divide-truncate if that's an acceptable result)

Divide-floor: Given integers n1,n2, in effect compute rational number which is the exact quotient of n1 divided by n2, but if it's not an integer then take instead the nearest integer that is smaller (closer to negative infinity), but actually produce that effect more efficiently without generating an intermediate rational result:

lisp -- (floor n1 n2)

Divide-ceiling: Given integers n1,n2, in effect compute rational number which is the exact quotient of n1 divided by n2, but if it's not an integer then take instead the nearest integer that is larger (closer to positive infinity), but actually produce that effect more efficiently without generating an intermediate rational result:

lisp -- (ceiling n1 n2)
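C has no built-in operator for floor or ceiling division of integers, but both can be built on top of the truncating / and % operators. The following is a sketch under the assumption of C99-or-later semantics (quotient truncated toward zero); the names floor_div and ceil_div are hypothetical, and the overflow case INT_MIN divided by -1 is not handled.

```c
/* Quotient rounded toward negative infinity; d must be nonzero. */
int floor_div(int n, int d)
{
    int q = n / d;                        /* truncated toward zero */
    if ((n % d != 0) && ((n < 0) != (d < 0)))
        q--;                              /* inexact and negative: step down */
    return q;
}

/* Quotient rounded toward positive infinity; d must be nonzero. */
int ceil_div(int n, int d)
{
    int q = n / d;                        /* truncated toward zero */
    if ((n % d != 0) && ((n < 0) == (d < 0)))
        q++;                              /* inexact and positive: step up */
    return q;
}
```

For example floor_div(-7, 2) is -4 and ceil_div(7, 2) is 4, matching what (floor -7 2) and (ceiling 7 2) return as their first value in lisp.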

Divide-truncate: Given integers n1,n2, in effect compute rational number which is the exact quotient of n1 divided by n2, but if it's not an integer then take instead the nearest integer that is smaller in absolute value (closer to zero), but actually produce that effect more efficiently without generating an intermediate rational result:

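c, c++, java -- n1 / n2 (java, C99, and c++ since the 2011 standard always truncate toward zero; in C89 and older c++ the direction of rounding is implementation-defined when either operand is negative)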
lisp -- (truncate n1 n2)

Divide-truncate assignment: Given integer variable n1, integer n2, in effect compute rational number which is the exact quotient of n1 divided by n2, but if it's not an integer then take instead the nearest integer that is smaller in absolute value (closer to zero), but actually produce that effect more efficiently without generating an intermediate rational result, then assign that as the new value of n1:

c, c++, java -- n1 /= n2;
lisp -- (setf n1 (truncate n1 n2))

Divide-round: Given integers n1,n2, in effect compute rational number which is the exact quotient of n1 divided by n2, but if it's not an integer then take instead the nearest integer in either direction, but actually produce that effect more efficiently without generating an intermediate rational result:

lisp -- (round n1 n2)

Modulus: Given two integers n1,n2, divide n1 by n2, using the floor of the quotient, and return only the remainder, thus the return value is in the interval [0, n2) if n2 is positive, or in the interval (n2, 0] if n2 is negative:

lisp -- (mod n1 n2)
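C's % operator computes the remainder, not this modulus, so a floor-style modulus has to be written by hand. A possible sketch, assuming C99-or-later semantics where the result of % takes the sign of the dividend; the name floor_mod is hypothetical.

```c
/* Remainder paired with floor division: the result has the sign of d
   (or is zero); d must be nonzero. */
int floor_mod(int n, int d)
{
    int r = n % d;                        /* same sign as n, or zero */
    if (r != 0 && ((r < 0) != (d < 0)))
        r += d;                           /* move into the interval next to d */
    return r;
}
```

For example floor_mod(-7, 3) is 2 whereas -7 % 3 is -1.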

Remainder: Given two integers n1,n2, divide n1 by n2, using the truncate of the quotient, and return only the remainder, thus the return value is in the interval (-abs(n2), 0] if n1 is negative, in the interval [0, abs(n2)) if n1 is positive:

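c, c++, java -- n1 % n2 (in C89 and older c++ the sign of the result is implementation-defined when either operand is negative; java, C99, and c++ since the 2011 standard behave as described)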
lisp -- (rem n1 n2)

Remainder assignment: Given integer variable n1, integer n2, compute remainder using old value of n1 and n2, store result back in variable n1:

c, c++, java -- n1 %= n2
lisp -- (setf n1 (rem n1 n2))

Quotient and remainder: Given integers n1,n2, divide n1 by n2, returning both quotient and remainder:

c (#include <stdlib.h>) -- div(n1,n2) (Return value is of type div_t, a struct with int members quot and rem; used for type int only) (The quotient is rounded toward zero, as with divide-truncate)
c (#include <stdlib.h>) -- ldiv(n1,n2) (Return value is of type ldiv_t, a struct with long int members quot and rem; used for type long int only) (The quotient is rounded toward zero, as with divide-truncate)
lisp -- (floor n1 n2) or (truncate n1 n2) (depending on whether you want quotient rounded toward negative infinity or toward zero) (multiple values are returned)
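A brief C sketch of calling div and unpacking its result. The wrapper name quot_rem is hypothetical; div and the div_t members quot and rem are from the standard library.

```c
#include <stdlib.h>

/* Stores the quotient and remainder of n divided by d through the
   two pointers; d must be nonzero. */
void quot_rem(int n, int d, int *quot, int *rem)
{
    div_t qr = div(n, d);   /* quotient is truncated toward zero */
    *quot = qr.quot;
    *rem = qr.rem;
}
```

With n = -7 and d = 2 this stores quotient -3 and remainder -1, the same pair that (truncate -7 2) returns in lisp.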

Pseudo-random numbers:

c (#include <stdlib.h>) -- see GNU C documentation: general discussion / ANSI / BSD
lisp -- see CL HyperSpec: RANDOM / MAKE-RANDOM-STATE
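As a starting point for C, here's a minimal sketch using only the ANSI functions srand and rand. The names seed_dice and roll_die are made up for this example, and the simple % 6 reduction has a slight bias toward low values, so treat it as a demonstration rather than a recommendation.

```c
#include <stdlib.h>

/* Seeds the generator; the same seed reproduces the same sequence. */
void seed_dice(unsigned int seed)
{
    srand(seed);
}

/* Returns a pseudo-random integer from 1 to 6 inclusive. */
int roll_die(void)
{
    return rand() % 6 + 1;
}
```

A program that wants a different sequence on each run typically seeds from the clock, for example seed_dice((unsigned int)time(NULL)) with #include <time.h>.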

A complete table-of-contents for Common Lisp data types, constants, and functions dealing with Numbers (not just Integers) is here.

A complete table-of-contents for Common Lisp data types, constants, and functions dealing with Numbers (not just Floats) is here.

In lisp and java, just about every object is handled via a pointer, but these pointers are invisible to the user/programmer. A very few data types are handled immediately, and which these are is implementation dependent in lisp. Furthermore, in lisp, even within a single formal data type, such as integer, some values are handled immediately whereas others are full-fledged objects in memory handled indirectly via pointers. For example, in most lisp systems, small integers are immediate while all other integers are "bignum" (similar to java "BigInteger" class) objects, and conversion between the two types is automatic as circumstances warrant (for example, a small integer calculation that "overflows" automatically builds a "bignum" to hold the complete result, while any "bignum" calculation whose result happens to be within the small integer range produces an immediate instead of indirect-pointer value). In java by comparison, immediate values and full-fledged objects are kept quite distinct, and any value that can exist as an immediate value can in fact exist in either form, and the programmer must explicitly call a method to convert between immediate numbers and number objects. For details about the conversion (in java) between immediate values and "wrapper" object classes, see the specific type, such as [integer] or [float].

In the other languages (c, c++, etc.), by comparison, the user is given raw pointers to play with, and the slightest mistake in the program can corrupt unrelated places in memory that happen to be near the intended place, which can either corrupt data or totally destroy a functioning program. The rest of this section, and other sections relating pointers to other data types, deal with this topic of working with raw pointers.

Indirect-fetch (dereference): Given a pointer p1, get whatever data is located at the memory location pointed at by p1:

c, c++ -- *p1 (note that the declaration of the type of the pointer determines how much data will be fetched at that location and successive locations and how that data will be typed upon return)

Indirect-store: Given a value val, and a pointer p1, store that value into the memory location pointed at by p1:

c, c++ -- *p1 = val; (note that the declaration of the type of the pointer determines how much data will be stored at that location and successive locations, and the presumed type of data at that location hence will determine what conversion if any from val is to be made before actual storing)
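A small C sketch of indirect fetch and indirect store, including the conversion that the pointer's declared type forces. The function names fetch and store are hypothetical.

```c
/* Indirect fetch: returns the int stored where p points. */
int fetch(const int *p)
{
    return *p;
}

/* Indirect store: writes val where p points. Because p is declared
   int *, the double val is converted to int (fraction discarded)
   before it is stored. */
void store(int *p, double val)
{
    *p = val;
}
```

Storing 3.9 through an int pointer therefore leaves 3 in the target variable.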

Pre-increment: Given a pointer p1, increase the pointer by one unit of data, and return the new pointer value:

c, c++ -- ++p1 (note that the actual machine numerical value of the increment will depend on the size of the data which p1 was defined to point at when it was declared; this isn't at all trivial if p1 was declared to point at an element of an array of structs, where p1 will be advanced by the entire size of one struct, or where p1 was declared to point at an entire row of a multi-dimensional array, where p1 will therefore be advanced by one entire row, or even more complicated combinations thereof)

Pre-decrement: Given a pointer p1, decrease the pointer by one unit of data, and return the new pointer value:

c, c++ -- --p1 (see note for pre-increment)

Post-increment: Given a pointer p1, save the old value, increase the pointer by one unit of data, and return the saved old value:

c, c++ -- p1++ (see note for pre-increment)

Post-decrement: Given a pointer p1, save the old value, decrease the pointer by one unit of data, and return the saved old value:

c, c++ -- p1-- (see note for pre-increment)
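The note about the size of the step can be checked directly. In this C sketch the struct point type and the function names step_in_bytes and sum_x are hypothetical; the first function measures how far the machine address moves for one increment, and the second walks an array with post-increment.

```c
#include <stddef.h>

struct point { double x, y; };   /* hypothetical element type */

/* Number of bytes the address grows when a struct point * is incremented. */
size_t step_in_bytes(struct point *p)
{
    struct point *next = p;
    ++next;                                   /* advance by one element */
    return (size_t)((char *)next - (char *)p);
}

/* Sums the x members by walking a pointer across count elements. */
double sum_x(struct point *p, size_t count)
{
    double total = 0.0;
    while (count-- > 0)
        total += (p++)->x;    /* use the old pointer, then advance it */
    return total;
}
```

The step always equals sizeof(struct point), whatever that happens to be on your machine.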

Inside the machine, in several of these languages, integers are implemented in two's complement notation. The individual bits of this representation can be manipulated as if they formed a bit vector (1-d array) in which 1 bits represent values of true and 0 bits represent values of false. The rightmost bit corresponds to the numeric value 1 and successively leftward bits correspond to successively larger powers of 2, except the leftmost bit (of signed integers only), which is the 2's-complement sign bit. The bits are manipulated by these operations:

Bitwise (1's) complement: Given integer bitmask n1, compute bitwise complement, which means every 1 bit becomes 0 bit and vice versa:

c, c++, java -- ~ n1
lisp -- (lognot n1)

Bitwise AND: Given integer bitmasks n1,n2, compute bitwise logical conjunction:

c, c++, java -- n1 & n2
lisp -- (logand n1 n2)

Bitwise AND assignment: Given integer bitmask variable n1, integer bitmask n2, compute bitwise logical conjunction, store result back into n1:

c, c++, java -- n1 &= n2;
lisp -- (setf n1 (logand n1 n2))

Bitwise inclusive OR: Given integer bitmasks n1,n2, compute bitwise logical disjunction:

c, c++, java -- n1 | n2
lisp -- (logior n1 n2)

Bitwise inclusive OR assignment: Given integer bitmask variable n1, integer bitmask n2, compute bitwise logical disjunction, store back into n1:

c, c++, java -- n1 |= n2
lisp -- (setf n1 (logior n1 n2))

Bitwise exclusive OR (XOR): Given integer bitmasks n1,n2, compute bitwise logical XOR:

c, c++, java -- n1 ^ n2 (note especially, this is not taking a number to a power, as you might guess from mathematical notation!)
lisp -- (logxor n1 n2)

Bitwise exclusive OR (XOR) assignment: Given integer bitmask variable n1, integer bitmask n2, compute bitwise logical XOR, store result back into n1:

c, c++, java -- n1 ^= n2
lisp -- (setf n1 (logxor n1 n2))
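A common use of these bitwise operators is keeping several yes/no flags in one integer. Here's a minimal C sketch; the flag names FLAG_READ and FLAG_WRITE and the four helper functions are hypothetical, and unsigned is used to keep the sign bit out of the picture.

```c
#define FLAG_READ  0x1u   /* hypothetical flag bits */
#define FLAG_WRITE 0x2u

/* OR turns the flag's bit on. */
unsigned set_flag(unsigned mask, unsigned flag)    { return mask | flag; }

/* AND with the complement turns the flag's bit off. */
unsigned clear_flag(unsigned mask, unsigned flag)  { return mask & ~flag; }

/* XOR flips the flag's bit. */
unsigned toggle_flag(unsigned mask, unsigned flag) { return mask ^ flag; }

/* AND isolates the flag's bit; the result is 1 if it is on, else 0. */
int has_flag(unsigned mask, unsigned flag)         { return (mask & flag) != 0; }
```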

Shift-left: Given integer n1 treated as bitmask, and small positive integer n2 treated as integer, compute new value obtained by shifting n1 leftward by n2 bits, using zero bits to fill the n2 vacated positions at the rightmost end:

c, c++, java -- n1 << n2 (bits that overrun the leftmost bit of the fixed-precision integer data type are discarded, which can cause the sign of a number to change if the bitmask is later used in arithmetic)
lisp -- (ash n1 n2) (the integer is extended in length to the left just enough that no bit is ever discarded)

Shift-left assignment: Given integer variable n1 treated as bitmask, and small positive integer n2 treated as integer, compute new value obtained by shifting n1 leftward by n2 bits, using zero bits to fill the n2 vacated positions at the rightmost end, store result back into n1:

c, c++, java -- n1 <<= n2 (see note for shift-left)
lisp -- (setf n1 (ash n1 n2))

Shift-right: Given integer n1 treated as bitmask, and small positive integer n2 treated as integer, compute new value obtained by shifting n1 rightward by n2 bits, discarding the n2 bits that run off the right end:

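c, c++, java -- n1 >> n2 (for unsigned types in c and c++, the n2 vacated positions at the leftmost end of n1 are filled with zero bits; for signed types, java always fills them with copies of the sign bit, so a negative value stays negative, and c/c++ compilers usually do the same although the standards leave the result for negative values up to the implementation; java also has n1 >>> n2, which always fills with zero bits, causing any previously negative value to become positive (or zero) if the bitmask is later used in arithmetic)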
lisp -- (ash n1 (- n2)) (the integer is contracted in length as old bits are shifted out the right and nothing new is inserted at the left; if all bits are shifted out, what's left is a minimal number of bits, representing 0 if the leftmost bit was originally 0, or -1 if the leftmost bit was originally 1)

Shift-right assignment: Given integer variable n1 treated as bitmask, and small positive integer n2 treated as integer, compute new value obtained by shifting n1 rightward by n2 bits, discarding the n2 bits that run off the right end, store result back into n1:

c, c++, java -- n1 >>= n2 (see note for shift-right)
lisp -- (setf n1 (ash n1 (- n2)))
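To see the fixed-width behaviour in C without running into sign-bit questions, this sketch sticks to a 32-bit unsigned type from the C99 header stdint.h. The function names shl32 and shr32 are hypothetical, and the shift count is assumed to be in the range 0 to 31.

```c
#include <stdint.h>

/* Left shift: bits pushed past bit 31 are discarded, zeros enter at the right. */
uint32_t shl32(uint32_t x, unsigned n)
{
    return (uint32_t)(x << n);
}

/* Right shift on an unsigned value: bits pushed past bit 0 are
   discarded, zeros enter at the left. */
uint32_t shr32(uint32_t x, unsigned n)
{
    return x >> n;
}
```

For example shl32(0x80000001, 1) gives 2 because the top bit falls off the left end, whereas the lisp form (ash #x80000001 1) gives #x100000002.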

lisp -- Numbers treated as bit vectors are treated as if there were an infinite number of copies of the leftmost bit (the sign bit) extending forever to the left. This is important to understand when combining bitwise operations earlier above and shifting operations immediately above. Non-negative numbers as bitmasks can represent the characteristic function of a finite set. Negative numbers as bitmasks then represent the characteristic function of an infinite set whose complement is finite. Shifting then represents Hilbert's Hotel room-shifting operations.

Less-than: Given two numbers n1,n2, compare them, return true iff n1 is less than n2:

c, c++, java -- n1 < n2
lisp -- (< n1 n2)

Less-than chain: Given three or more numbers n1,n2,...,ny,nz, compare adjacent pairs, return true iff every pair satisfies the less-than relation:

c, c++, java -- (n1 < n2) && (n2 < n3) && ... && (ny < nz) (note: saying n1 < n2 < n3 < ... < ny < nz will absolutely not do what you wanted!!)
lisp -- (< n1 n2 ... ny nz)
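To show why the chained form goes wrong in C, here's a small sketch with two hypothetical functions. The second one spells out, with explicit parentheses, what the compiler actually does with n1 < n2 < n3: the first comparison yields 0 or 1, and that 0 or 1 is then compared with the third number.

```c
/* What was intended: both adjacent pairs must satisfy the relation. */
int increasing3(int a, int b, int c)
{
    return (a < b) && (b < c);
}

/* What a < b < c really means in C. */
int naive_chain3(int a, int b, int c)
{
    return (a < b) < c;
}
```

With the arguments 3, 2, 5 the naive version claims true even though 3 is not less than 2, and with -3, -2, -1 it claims false even though the numbers really are increasing.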

Less-than or equal: Given two numbers n1,n2, compare them, return true iff n1 is less than or equal to n2:

c, c++, java -- n1 <= n2 (to chain, see less-than chain)
lisp -- (<= n1 n2) (can have more than two args, just like with less-than-chain)

Greater-than: Given two numbers n1,n2, compare them, return true iff n1 is greater than n2:

c, c++, java -- n1 > n2 (to chain, see less-than chain)
lisp -- (> n1 n2) (can have more than two args, just like with less-than-chain)

Greater-than or equal: Given two numbers n1,n2, compare them, return true iff n1 is greater than or equal to n2:

c, c++, java -- n1 >= n2 (to chain, see less-than chain)
lisp -- (>= n1 n2) (can have more than two args, just like with less-than-chain)

Equal: Given two numbers n1,n2, compare them, return true iff n1 is equal to n2:

c, c++, java -- n1 == n2 (Note: Be very careful, = by itself is assignment, not equality-test.) (to chain, see less-than chain)
lisp -- (= n1 n2) (can have more than two args, just like with less-than-chain)

Not-equal: Given two numbers n1,n2, compare them, return true iff n1 is not equal to n2:

c, c++, java -- n1 != n2
lisp -- (/= n1 n2)

No pair equal: Given three or more numbers n1,n2,...,ny,nz, return true iff they are all distinct, i.e. no equal pairs anywhere in the set:

lisp -- (/= n1 n2 ... ny nz)
c, c++, java -- (you must explicitly code every possible comparison, with && between adjacent individual comparisons; for more than three it's probably better to copy them all to an array and use a nested loop)
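The array-and-nested-loop approach might look like this in C. The function name all_distinct is hypothetical, and the sketch assumes the numbers have already been copied into an int array.

```c
#include <stddef.h>

/* Returns 1 if no two of the count values are equal, else 0.
   Compares every pair once, so it takes time proportional to count squared. */
int all_distinct(const int *vals, size_t count)
{
    size_t i, j;

    for (i = 0; i < count; i++)
        for (j = i + 1; j < count; j++)
            if (vals[i] == vals[j])
                return 0;       /* found an equal pair */
    return 1;
}
```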

Less-than: Given pointers p1,p2, compare them, return true iff p1 points to memory location which is before memory location pointed at by p2:

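c, c++ -- p1 < p2 (Note: The language standards define the result only when both pointers point into the same array or object; comparing pointers to unrelated variables is not portable. For instance, some compilers allocate local variables downward in memory, so earlier-defined variables will appear later in memory, hence such pointer comparisons and my terms "before"/"after" may seem backwards. If you're suspicious, use the %p directive in printf along with the &varname address-value.)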

Less-than or equal: Given pointers p1,p2, compare them, return true iff p1 points to memory location which is before or same as memory location pointed at by p2:

c, c++ -- p1 <= p2

Greater-than: Given pointers p1,p2, compare them, return true iff p1 points to memory location which is after memory location pointed at by p2:

c, c++ -- p1 > p2

Greater-than or equal: Given pointers p1,p2, compare them, return true iff p1 points to memory location which is after or same as memory location pointed at by p2:

c, c++ -- p1 >= p2

Equal: Given pointers p1,p2, compare them, return true iff p1 points to exactly the same memory location as does p2:

c, c++ -- p1 == p2 (Note: Be very careful, = by itself is assignment, not equality-test.)

Not-equal: Given pointers p1,p2, compare them, return true iff p1 points to any different memory location from where p2 points:

c, c++ -- p1 != p2
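The most common legitimate use of pointer comparison in C is walking through one array until a pointer reaches the end. A minimal sketch, with the hypothetical function name count_zeros, assuming both pointers refer to the same array (last may point one element past its end):

```c
#include <stddef.h>

/* Counts the zero elements from first up to, but not including, last. */
size_t count_zeros(const int *first, const int *last)
{
    size_t n = 0;
    const int *p;

    for (p = first; p < last; p++)   /* stop when p catches up with last */
        if (*p == 0)
            n++;
    return n;
}
```

Given an array a of five ints, count_zeros(a, a + 5) scans the whole array.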

Store value in global and/or already-declared variable: Given name of variable var, and expression exp which computes a value val, store that val into var, replacing whatever value might have been there previously:

c, c++, java -- var = exp (var can be: simple variable, element of array or struct, or indirection through pointer) (Note: If the declared type of var does not match the actual type of exp, then either a type-conversion will occur or a compile-time error is reported. Java reports an error for any conversion that isn't "safe", such as double to int, whereas c and c++ silently perform most numeric conversions, including lossy ones.) (Note: var absolutely must be declared, either in the current block, or in some enclosing block, or globally, or a compile-time error is reported.)
perl, PHP -- $var = exp (var can be a simple variable that is used to hold a scalar value, or an element of an array whose elements are scalar.) (Note: In perl, if var hasn't been declared within the current block or any enclosing block, then it's implicitly global, without any warning, unless use strict is in effect, in which case it's a compile-time error. In PHP, a variable first assigned inside a function is local to that function unless it has been declared global there.)
lisp -- (setq var exp) (var can be only a simple variable) (Note: If var is a simple variable which hasn't been declared, it's implicitly declared as global, but a warning is issued by the compiler. Important note: In CGI applications, if your code is running "interpreted" (just-in-time incremental compiler in effect), the warning appears in the system error log every time somebody runs your CGI application, so it's important to declare var as global if that's what you intended, so you won't be polluting the system error log with such warnings.)
lisp -- (setf place exp) (place can be a simple variable, or any kind of place for which a setf method has been defined, such as an element of an array or struct or sequence, chain-of-standard-pair accessors such as CADR, value associated with key in hash table, and a whole lot more) (Same note about undeclared variable as for setq.)

Parallel binding: Given simple variable names var1 var2 ... varz, and corresponding expressions for their initial values exp1 exp2 ... expz, temporarily bind those values to those variables, all in parallel (compute all the values first, then do all the bindings in one batch), then execute forms form1 form2 ... formz in that context, then return the value from formz:

lisp -- (let ((var1 exp1) (var2 exp2) ... (varz expz)) form1 form2 ... formz)

Sequential nested binding: Given simple variable names var1 var2 ... varz, and corresponding expressions for their initial values exp1 exp2 ... expz, each of which (except the first) may include references to any of the earlier variable names, temporarily bind those values to those variables, sequentially: First evaluate exp1 in the original context, then bind var1 to that value. Next in the enhanced context with that new binding, evaluate exp2, then bind var2 to that value, etc. sequentially until varz has been bound to the value from expz. Then execute forms form1 form2 ... formz in that context, then return the value from formz:

lisp -- (let* ((var1 exp1) (var2 exp2) ... (varz expz)) form1 form2 ... formz)

Namespaces are handled so very differently in these languages (Common Lisp, c++, java) that there's virtually no commonality that can be collected together into a single kind of action, so I'll simply describe each mechanism separately. By comparison, c and PHP have only a single namespace, no scope resolution required. (perl does have packages, with names qualified as packageName::symbolName, but they aren't described here yet.)

Common lisp has packages, in which symbols are located. Each symbol in turn may have a function and/or a variable and/or other properties attached to it. Each symbol also has a link back to the package in which it is primarily interned, or a null link if it's non-interned. The package name is a simple name, generally following the syntax of the local name of a symbol. These symbols and packages are actual objects accessible to your program at runtime, not just symbols used by the compiler and loader, so you can explore them interactively from the read-eval-print loop if you wish. The following cases apply:

packageName:symbolName -- Fully qualified, the symbol symbolName within the package packageName, which works only if that symbol is exported from that package (similar to being "public" in C++/java jargon), hence doesn't work if there is no such symbol already in that package.
packageName::symbolName -- Fully qualified, as above, except this syntax works for all symbols, even those not exported, and if there's no such symbol in that package then one will be created just as symbols in the current package can be created at any time. As a matter of style, it's bad practice to use this syntax for symbols already exported by that package.
:symbolName -- Symbol in keyword package, which aren't allowed to carry any properties, not even a value or function. All keyword symbols are literals, i.e. they evaluate to themselves.
symbolName -- Symbol in the current default package or symbol imported from another package into the current default package.
#:symbolName -- When read, creates a new symbol that isn't interned in any package whatsoever, hence there's no way to find it again by its name, hence repeated instances of this exact syntax always create new non-interned symbols with the same name but all distinct symbols each time. Prints whenever prin1 or equivalent printing is performed on a symbol that isn't interned in any package. If several such non-interned symbols have the same name, there's no way to tell from their print form which is which. Non-interned symbols are typically used to create private persistent variables which nevertheless print out meaningfully so you have some idea which you're seeing. See functions gensym and make-symbol for how to create non-interned symbols under program control.

The following packages are standard in every ANSI Common Lisp implementation, except system as noted:

common-lisp (nickname cl; called lisp in older pre-ANSI implementations) -- Contains all the symbols (except keywords) defined by the Common Lisp standard, hence symbols for all the standard's functions and global variables.
common-lisp-user (nickname cl-user; called user in older pre-ANSI implementations) -- The current package at the time Common Lisp starts up. It uses the common-lisp package, so that all symbols defined in the Common Lisp standard are available for use without needing a package qualifier.
keyword -- The package containing all keyword symbols, all of which evaluate to themselves, and cannot have any assigned properties nor be used as the name of a function or variable. On input, you can say keyword:name or just :name, but there's virtually never any reason to do the former.
system (nickname sys) -- Not required by the ANSI standard, but provided by some implementations to hold the user-accessible system-specific or implementation-specific functions. For example, in CMUCL, the function system:beep beeps the console (for example, transmits ctrl-G out standard output if that works on your particular terminal type).

C++ has namespaces which are somewhat analogous to Common Lisp's packages. But they are purely a compiler feature which is communicated to the loader (to provide correct linkage): there's no such thing as a symbol at runtime, hence no package-like namespace at runtime, and there's no such thing as a keyword package and no such thing as a non-interned symbol. However there is such a thing as a "global" namespace. Also a sub-namespace is created by each struct, creating a hierarchy of namespaces if the struct is itself inside a namespace. The following cases apply:

symbolName -- Function or variable within the current default namespace or any using namespace.
nameSpace::symbolName -- Function or variable directly within a named namespace, or field within a struct which is in the current default namespace or any using namespace.
::symbolName -- Function or variable directly within the global namespace
nameSpace::structName::fieldName -- Field within a struct which is within a named namespace

The following namespace is standard in most c++ implementations:

std -- The standard library, including for example cin and cout

Java has a whole hierarchy (tree) of namespaces, where the leaf nodes are called classes and the next layer above them are called packages. One major branch of this tree has the name java at the top and contains all packages and classes directly defined by Sun Microsystems for inclusion in the java language as they define it. There's another branch of the tree for each major vendor or industry group that provides add-ons to what Sun provides. Finally there's a private branch for users to define their own local classes. There's a system variable called CLASSPATH which defines where all the roots of all the trees can be found on the local filesystem. When you specify the fully qualified name of a class, it'll match any directory path from any of these starting points, so it's a good idea not to make your own private sub-directory from your personal starting point that exactly matches one of the toplevel names defined by Sun or other public sources. Within each directory in the trees, jar files may substitute for actual filesystem sub-directories, so you need a class browser, not just shell commands, to explore deep into these hierarchies.

Java uses a period (British "full stop") to separate naming levels within the package hierarchy, and to separate the package name from the class name, and to separate the class name from either a static method name or a static member name, and to separate a reference to an instance from a method name (static or instance). The following combinations are possible:

packageSpec for specifying a package:
- (nothing at all) -- The current (lexical) package, or the user's toplevel package, or java.lang, or any package which has been wholly imported
- pathDownToPackage -- Any other package, for example java.math or java.util.jar
classSpec for specifying a class within a package:
- (nothing at all) -- The current (lexical) class
- ClassName -- Any other class within any package listed as "nothing at all" in the previous menu, for example the class System which is within the package java.lang, or any class that has been individually imported
- packageSpec.ClassName -- Any other class within any package requiring pathDownToPackage in the previous menu, for example the class java.math.BigInteger or java.util.jar.Manifest
varSpec for specifying a variable or field, value thereof:
- variableName -- Local variable, or static (global) variable of the class being defined here
- classSpec.variableName -- Static (global) variable in that class, for example System.out or java.math.BigInteger.ONE
- varSpec.variableName -- Field within the object which is the value of varSpec (Note: This pattern can be repeated to go as deep as you want, following a chain of pointers through as many objects as have them in effectively a linked-list.)
methodSpec for specifying a method:
- methodName -- Static method of the class being defined here.
- classSpec.methodName -- Static method within any other class, for example Integer.decode (in the package java.lang), or java.awt.Cursor.getDefaultCursor
- varSpec.methodName -- Instance method, within the class of the value of varSpec, applied to the value of varSpec, for example System.out.println; or static method within the class of the value of varSpec, whereupon the specific value of varSpec is ignored

The following public packages are most important to know about:

java.lang -- All the core classes of the language. These are always available to every java implementation, even to the applet environment in Web browsers (although which edition and which sub-version you get may vary from browser to browser and between versions of a given browser). Also this is the only package, other than your own toplevel default package, whose classes don't require you to explicitly import the package or class or use a fully-qualified class name. For example, you can say just System instead of needing to say java.lang.System.
java.util -- Lots of useful utilities, including the Collections Framework, but sadly not available to most applet environments.
java.io -- I/O including several kinds of buffered streams

There's pretty good online documentation for public java packages, and the classes and interfaces within each such package, and the constants and static variables and methods within each such class (and virtual methods within each such interface), a multi-level document organized in exactly that way, for example: version 1.3 / version 1.4.2

Safe conversion: Silently convert from a low-precision or small-range datatype to a higher-precision or wider-range datatype as needed:

c, c++, java -- The following chain, or any part thereof, happens silently whenever needed to combine two different types as parameters to an arithmetic operation, or when needed to assign a value to a variable declared of a different type, or when needed to pass a value as argument to a function whose corresponding formal parameter was declared as a different data type: char -> short -> int -> unsigned int -> long int -> unsigned long int -> float -> double -> long double (Note: Java has no unsigned integer types and no long double, so its chain is byte -> short -> int -> long -> float -> double, with char widening to int. In java, this happens only with primitive data types, not with fullfledged objects that are wrappers for primitive data types.)
lisp -- A similar chain, but not including character values, is effected only when combining different types as parameters to an arithmetic function or when passing arguments to just a few functions that require a specific numerical data type, such as trigonometric functions. This does not happen when storing values in a variable, because a variable can hold any type of data, so no coercion is necessary. (Caveat: It's possible to declare a variable, or a slot in a structure/array, to hold only a narrow type of data, similar to primitive types in the other languages. In that case something like c happens.)
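As a minimal sketch of the silent widening in c, here are two helpers that rely on it when passing arguments, when doing arithmetic on mixed types, and when assigning. The names half, widen_sum and check_widening are made up for this illustration, they are not library functions.

```c
#include <assert.h>

/* Hypothetical helper: the formal parameter is double, so an int or
   float argument is silently widened at the call site. */
double half(double x)
{
    return x / 2.0;
}

/* Hypothetical helper: c and s are promoted to int before the
   additions, and the int result is widened to long on assignment. */
long widen_sum(char c, short s, int i)
{
    long total = c + s + i;
    return total;
}

/* Exercises both helpers; call it from main() to run the checks. */
void check_widening(void)
{
    float f = 1.5f;
    assert(half(7) == 3.5);    /* int 7 was converted to double 7.0 */
    assert(half(f) == 0.75);   /* float 1.5 was widened to double exactly */
    assert(widen_sum('0', 1, 2) == '0' + 3);
}
```

No cast appears anywhere in this sketch, because every conversion goes in the widening direction of the chain.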

Unsafe conversion (casting): Convert in the reverse direction of that chain, discarding low-order or high-order bits to cram the wide data into a narrow spot if it doesn't exactly fit, or discarding the fractional part of a non-integer, to obtain an "equivalent" (cough cough) value of the new data type. The expression produces a value of the wider type, while you need to generate a value of newtype which is narrower:

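c, java -- (newtype)expression (Note: For some conversions, high-order bits are simply discarded; for others, the value is converted to a nearby legal value of newtype. In c, converting to an unsigned integer type reduces the value modulo 2^N, where N is the number of bits in newtype, which amounts to discarding high-order bits; converting an out-of-range value to a signed integer type gives an implementation-defined result; and converting a floating-point value to an integer type discards the fractional part, with undefined behaviour if the integral part doesn't fit. In java, narrowing between integer types always discards high-order bits, and converting floating point to int or long discards the fractional part and clamps out-of-range values to the nearest representable value.)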
c++ -- static_cast<newtype>(expression) (Note: This guarantees, if it's acceptable to the compiler, that an equivalent value, usually the nearest value representable in newtype, will be generated.)
c++ -- reinterpret_cast<newtype>(expression) (Note: This guarantees, if it's acceptable to the compiler, that the bitpattern will be carried directly across, with no attempt to make the result mathematically "reasonable" with respect to the old value, rather like what could be done with EQUIVALENCE in Fortran. The compiler accepts it only for pointer and reference types and for conversions between pointers and integers, not directly between a floating-point value and an integer value. One plausible use for this kind of trick is to pass floating-point and other non-integers through an I/O stream that accepts only integers. The data is garbage en route, but all the bits are intact, and upon conversion back to the original type by the recipient all the data is meaningful again.)
c++ -- const_cast<newtype>(expression) (To remove, or to add, the const or volatile property to/from the datatype.)
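Here's a small sketch of the c-style cast in the two narrowing cases where the c result is fully defined: floating point to integer (fraction discarded) and wide unsigned to narrow unsigned (high-order bits discarded). The function names truncate_to_int and low_byte are hypothetical, chosen just for this example.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: discards the fractional part (rounds toward
   zero).  The assert guards the precondition that the integral part
   fits in an int, since otherwise the behaviour is undefined. */
int truncate_to_int(double d)
{
    assert(d > (double)INT_MIN - 1.0 && d < (double)INT_MAX + 1.0);
    return (int)d;
}

/* Hypothetical helper: keeps only the low-order CHAR_BIT bits,
   i.e. reduces the value modulo UCHAR_MAX + 1. */
unsigned char low_byte(unsigned int u)
{
    return (unsigned char)u;
}
```

With the usual 8-bit char, low_byte(300) yields 44 because 300 = 256 + 44, and truncate_to_int(-3.9) yields -3 rather than -4.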

Allocating static arrays: At the global level, the array is created at the time the program is started, and disposed at the time the program exits. Within any block, such as the body of a function definition, the array is created anew each time the block is entered (such as when the function is called), and disposed each time control passes out of the block (such as when the function returns). Given the desired type of elements, the desired name of the symbol used to reference the array, and the desired numbers n1 n2 ... nz of elements along each dimension, allocate the array:

c, c++ -- type name[n1][n2]...[nz];

Allocating dynamic arrays: You can make a new array any time you want, and get rid of it any time you want. It's a "first-class citizen", just like numbers, able to be passed around by reference, attached to multiple places such as variables or fields in containers. Given the desired numbers n1 n2 ... nz of elements along each dimension, allocate the array:

lisp -- (make-array dims &key :element-type :initial-element :initial-contents :adjustable :fill-pointer :displaced-to :displaced-index-offset) where dims is a list of the numbers n1 n2 ... nz already computed. For example if you always want a 3-by-9 array you can create that list by saying either (list 3 9) or '(3 9) as dims. But if you have a value n which changes from time to time, and you want a 3-by-n array depending on that value of n, say (list 3 n). Like any first-class citizen, after the last reference to this object is removed, the garbage collector will eventually recycle its memory. (full documentation, including examples)

Array indexing: Given an array already allocated, of dimension k, and integer indexes i1, i2, ... ik, reference the element at that position within the array:

c, c++ -- arrName[i1][i2]...[ik] where arrName is the static name of the array as it was declared (This syntax can appear anywhere an expression can occur, and also on the left side of an assignment.)
lisp -- (aref array i1 i2 ... ik) where array is any expression that evaluates to the array object (This syntax can appear anywhere an expression can occur, and also as first argument to setf.)
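A minimal c sketch of declaring a two-dimensional array at the global level and indexing it, both as an ordinary expression and on the left side of an assignment. The names ROWS, COLS, table, fill_and_sum and element_at are invented for this illustration.

```c
#include <assert.h>

#define ROWS 3
#define COLS 9

/* Global-level array: created when the program starts, disposed at exit. */
int table[ROWS][COLS];

/* Stores i*COLS + j at position [i][j] and returns the sum of all elements. */
int fill_and_sum(void)
{
    int i, j, sum = 0;
    for (i = 0; i < ROWS; i++) {
        for (j = 0; j < COLS; j++) {
            table[i][j] = i * COLS + j;   /* indexing on the left of an assignment */
            sum += table[i][j];           /* indexing as an ordinary expression */
        }
    }
    return sum;
}

/* Bounds-checked read; c itself performs no such check. */
int element_at(int i, int j)
{
    assert(i >= 0 && i < ROWS && j >= 0 && j < COLS);
    return table[i][j];
}
```

After fill_and_sum has run, the array holds 0 through 26 in row-major order, so the returned sum is 351.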

Allocate a block of n bytes: malloc(n) (Return value is pointer to first byte of the block.) (Note: In practice, instead of giving n as an explicit integer, the sizeof operator is used to determine the size of whatever data type is to be stored, just one-of, in the allocated memory.) (Warning: The allocated memory is not cleared, so you get garbage in the memory, whatever happened to be in there before you allocated it, which could be leftovers from just about anything that occurred earlier on this computer. You should immediately copy some of your own data to overwrite the garbage.)
Change size of a block, moving it if it gets too large to fit where it was until now: realloc(ptr,newSize) where ptr is a pointer to the first byte of the old block and newSize is the desired new size of the block in bytes (Return value is pointer to first byte of the resized block, which may differ from ptr, so always use the returned pointer afterward.) (Note: In practice, instead of giving newSize as an explicit integer, it's computed by multiplying the desired number of elements by the sizeof the element type.) (Warning: When enlarging a block, the newly-allocated memory at the end of the old block is not cleared, so you get garbage in that portion of your overall block as above.) (Warning: When shrinking a block, all data past the end of the new smaller block is lost.)
Allocate an array of k consecutive blocks, each of size n bytes (any padding needed for alignment is already included in n when n is obtained from sizeof): calloc(k,n) (Return value is pointer to first byte of first block in the array.) (Note: In practice, instead of giving n as an explicit integer, the sizeof operator is used to determine the size of whatever data type is to be stored in each block of the allocated memory.) (Note: All the allocated memory is cleared to zero bytes before the pointer is returned to caller.)
Free a block that was allocated by any of the above: free(ptr) where ptr is a pointer to the first byte of the block as returned earlier.
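The following sketch shows one hedged way to use malloc and realloc so that the garbage warnings above are dealt with immediately. The helper names make_filled and grow are hypothetical; the caller is assumed to release the block with free when done.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: allocates room for n ints with malloc and
   immediately overwrites the garbage with the value fill.
   Returns NULL if the allocation fails. */
int *make_filled(size_t n, int fill)
{
    int *p = malloc(n * sizeof *p);
    size_t i;
    if (p == NULL)
        return NULL;
    for (i = 0; i < n; i++)
        p[i] = fill;
    return p;
}

/* Hypothetical helper: enlarges a block from old_n to new_n ints and
   zeroes the newly added tail, which realloc leaves uninitialised.
   On failure returns NULL and the old block is still valid. */
int *grow(int *p, size_t old_n, size_t new_n)
{
    int *q;
    size_t i;
    assert(new_n >= old_n);
    q = realloc(p, new_n * sizeof *q);
    if (q == NULL)
        return NULL;
    for (i = old_n; i < new_n; i++)
        q[i] = 0;
    return q;   /* may differ from p if the block was moved */
}
```

Writing sizeof *p rather than sizeof(int) keeps the size computation correct even if the element type is changed later.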

c++:

Allocate memory to hold a type of data (which may be an array of a particular size containing elements of a smaller type): new type where type is the desired type. Return value is a pointer to the first byte of the allocated memory, with the type of the pointer matching the desired type of the data you specified.
Free the memory that was allocated earlier: delete ptr where ptr is a pointer to that block as returned by new earlier (or delete[] ptr if the block was allocated as an array via new type[n]).

In the other languages we don't allocate a block of memory and later copy data into it to build our desired object. Instead we create a new object occupying newly-allocated memory all in one operation:

java -- new ClassName(constructorParametersIfAny) (to construct an object of that class, see specifications of constructors in documentation for that class) or new ElementClassName[dimension] (to construct an array whose elements can each hold an object of that class)
lisp -- For built-in types, there's a different function for building each different type of object, for example see sections on [Arrays] [Lists] [HashTables] [Symbols] [Packages] [Strings] etc. For user-defined structures: (make-NameOfStructureType constructorParameters) (Caveat: That's the default constructor name, but it can be overridden at the time the structure-type is defined via defstruct.)

A structure, or record, is a compact organization of different types of data, as opposed to an array which is a regular repetition of exactly the same type of data (possibly a generic pointer) without variation. To specify the organization of a structure/record, instead of specifying a single type and then saying how many copies are required along each axis, the type of each component must be explicitly stated individually. Furthermore the organization of a structure/record can be nested, whereby sub-structures are part of the overall structure being defined. These sub-structures may be copies of previously defined structures, or layers of structure being defined at the same time as the overall structural organization. The way that structures are defined is sufficiently different in the various languages, and sufficiently complicated in each, with not much the same between language families, that I'm treating them in separate paragraphs here:

c has several variant syntaxes:

One-time definition and static allocation of single object: struct { decl1; decl2; ... decln; } name; where each decl is of the same form as an ordinary variable declaration (type followed by name optionally followed by array-subscript dimensions), and name is the name being defined to refer to the object which is being described and allocated simultaneously.
Re-usable definition of structure organization, no allocation at this time: struct tagName { decl1; decl2; ... decln; }; where tagName is the tag that will be used later to denote the entire organization when declaring a variable or parameter or return type to have this organization. For example, later to define (and allocate) a variable whose value has this structure, you say struct tagName varName;
Definition of new type at same time as definition of structure organization, again no allocation at this time: typedef struct tagName { decl1; decl2; ... decln; } TYPE_NAME; where tagName is as before, and TYPE_NAME (convention says use upper case here) is the new datatype being defined. Later, to declare variables, parameters and return values with this type, you only need to say the TYPE_NAME instead of both struct and the tagName, which saves you a little typing.
To nest one structural organization within another all at once, simply nest the {...blockOfDeclarations...} syntax as deeply as you want. This is not recommended because the namespace for fields within the structure is flat, and the sub-blocks within the struct aren't individually usable such as for passing to functions, and you can't repeat the same sub-blocks within the same struct, so nesting gives you virtually no extra value.
To nest one structural organization within another via chaining of definitions, define the innermost structs first and then work outward referring to inner structs when defining larger structs that include them. This gives a non-flat namespace so that you can refer to an entire inner block of fields by a single name, such as to copy it into or out from a large struct, and you can repeat the same sub-unit, where the repeated components are distinguished by the higher-level names which are different, analogously to a fully-qualified path. This is the recommended way.

c++ similar to c:

struct { decl1; decl2; ... decln; } name; (same syntax and semantics as in c)
struct tagName { decl1; decl2; ... decln; }; (same syntax as in c, but different semantics: In effect this is a typedef where the tagName and the name of the newly-defined type are the same)
Nesting as with c, with same recommendation.

Note that when directly nesting structures, the large structure is actually a very large structure which physically contains the inner structures within it. There is no way that two different instances of the same large structure can share any part of their insides, simply because memory in a computer is contiguous and it's impossible for the rest of the larger structure before or after the shared component to be in two different places at the same time. If you want shared structures between different large records, you use a pointer instead of direct physical nesting. Then you can have two instances of the large structure (not so large as before), each containing a field which points to exactly the same instance of the smaller structure virtually inside both of the larger structures. Of course the way you reference a separate object only pointed-at from inside the main object is different from how you'd reference a sub-object truly inside the larger object. Also since you are allocating the main object and the sub-object separately, you have a little more work to do there.
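To illustrate sharing via a pointer, here is a sketch in c where two larger structures point at the same separately-allocated smaller structure, so a change made through one is visible through the other. The struct and function names (address, person, new_address, relocate) are assumptions made for this example only.

```c
#include <assert.h>
#include <stdlib.h>

struct address {
    int zip;
};

struct person {
    int id;
    struct address *home;   /* pointer to a separate object, not physical nesting */
};

/* Allocates the sub-object separately; returns NULL on failure.
   The caller is responsible for calling free on it exactly once. */
struct address *new_address(int zip)
{
    struct address *a = malloc(sizeof *a);
    if (a != NULL)
        a->zip = zip;
    return a;
}

/* Modifies the shared sub-object in place. */
void relocate(struct address *a, int zip)
{
    assert(a != NULL);
    a->zip = zip;
}
```

If alice.home and bob.home both hold the pointer returned by one call to new_address, then relocate(alice.home, 10001) also changes what bob.home->zip reads. The shared object must be freed only once, after both larger structures are finished with it.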

In java there aren't structures formally defined in the language. Instead, to emulate such, you define classes which have only instance variables (fields), no static (global) variables, and no methods whatsoever. This can get messy, because each class generates a separate object file, and to make compilation automatic as needed each class should be defined in a separate source file, so your directory can get huge if you do this a lot.

Instantiation/allocation of structures is simple enough that I'll do all the languages together as usual in this matrix document:

c, c++ -- struct tagName varName; (static allocation) (varName can later be used directly without indirection)
c, c++ -- typeName varName; (static allocation) (later use as above)
c, c++ -- (arrays of structs, simply include array dimensions after varName with either of the above forms) (array indexing can later be done directly without indirection)
c -- malloc(sizeof(typeName)) (dynamic allocation, return value is pointer to first byte of structure) (will later need indirection to get from pointer to actual object before fields can be referenced)
c -- calloc(numberOfElements, sizeof(typeName)); (dynamic allocation of 1-d array, return value is pointer to first byte of structure which is first element of array) (to avoid confusion later when referencing individual elements of array, it's probably best to use pointer arithmetic and indirection instead of explicit array indexing)
c++ -- new typeName (dynamic allocation, return value is pointer to first byte of object) (will later need indirection to get from pointer to actual object before fields can be referenced)
c++ -- new typeName[numberOfElements] (dynamic allocation, return value is pointer to first byte of first object in array) (to avoid confusion later when referencing individual elements of array, it's probably best to use pointer arithmetic and indirection instead of explicit array indexing)
lisp -- (constructorName :slotName1 slotValue1 :slotName2 slotValue2 ... :slotNamez slotValuez) (return value is pointer to object as usual in lisp) (constructorName is simply make-typeName, unless that was overridden via an option in defstruct)

Referring to an entire structure:

c, c++ -- varName (if structure was static-allocated as singleton, not array)
c, c++ -- varName[index] (if array of structures was static-allocated)
c, c++ -- *ptr (if structure was dynamic-allocated as singleton, not array)
c, c++ -- *(ptr+index) (if array of structures was dynamic-allocated)
lisp, java -- exp (if the value of exp points to a single structure)
lisp -- (aref exp i1 i2 ... ik) (if the value of exp points to an array of structures)

Referring to a slot (field) within a structure:

c, c++ -- varName.slotName (if structure was static-allocated as singleton, not array)
c, c++ -- varName[index].slotName (if array of structures was static-allocated)
c, c++ -- (*ptr).slotName (if structure was dynamic-allocated as singleton, not array)
c, c++ -- ptr->slotName (shorthand for above)
c, c++ -- (*(ptr+index)).slotName (if array of structures was dynamic-allocated) (Note: This works only if ptr is declared of the appropriate type for this kind of structure, so that sizeof can be implicitly used to compute how many addressing units change in ptr is equivalent to the index changing by 1, so that ptr+index performs the appropriate calculation)
c, c++ -- (ptr+index)->slotName (shorthand for above)
lisp -- (typeName-slotName ptr) (unless the accessor function name was changed by an option in the defstruct)
lisp (to get slot within structure which is element of array, compound the usual array-access with the usual slot-access in the obvious way)
c, c++ -- (if field within a struct is a pointer, to get the object it points to instead of the pointer itself, simply wrap *() around the entire expression)
c, c++ -- (to get field within a contained or pointed-at sub-structure, simply use the syntax to get the entire sub-structure, as above, then wrap parens around it and suffix with .slotName)
lisp -- (to chain through multiple levels of sub-structure, always via pointers, simply compound (nest) the syntax for each level of slot-access in the usual way)
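The c forms above can be seen side by side in the following sketch, which allocates an array of structures dynamically and then reaches the slots through a pointer. The type ITEM and the helpers make_items and total_price are hypothetical names for illustration.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct item {
    int code;
    double price;
} ITEM;

/* Allocates an array of n ITEMs and numbers them 0..n-1.
   Returns NULL on failure; the caller frees the array with free. */
ITEM *make_items(size_t n)
{
    ITEM *ptr = calloc(n, sizeof *ptr);
    size_t i;
    if (ptr == NULL)
        return NULL;
    for (i = 0; i < n; i++) {
        (ptr + i)->code = (int)i;    /* shorthand form */
        (*(ptr + i)).price = 0.0;    /* longhand form, same meaning */
    }
    return ptr;
}

/* Sums the price slot over all n elements. */
double total_price(const ITEM *ptr, size_t n)
{
    double sum = 0.0;
    size_t i;
    assert(ptr != NULL || n == 0);
    for (i = 0; i < n; i++)
        sum += ptr[i].price;         /* ptr[i] means the same as *(ptr+i) */
    return sum;
}
```

All three spellings, (*(ptr+i)).slot, (ptr+i)->slot and ptr[i].slot, name the same slot, because ptr is declared as ITEM * and so ptr+i advances by i whole structures.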

String-character? Given character ch, return a true value iff the character is of the special type that can be included as an element in a string:

lisp -- (string-char-p ch)

Is character lower case? Given character ch, return true value if it's a lower-case letter, false value otherwise:

c (#include <ctype.h>) -- islower(ch)
lisp -- (lower-case-p ch)

Is character upper case? Given character ch, return true value if it's an upper-case letter, false value otherwise:

c (#include <ctype.h>) -- isupper(ch)
lisp -- (upper-case-p ch)

Is character upper/lower convertible? Given character ch, return true value if it's an upper-case letter convertible to lower case, or vice versa, false value otherwise:

lisp -- (both-case-p ch)

Is character alphabetic? Given character ch, return true value if it's alphabetic, false value otherwise:

c (#include <ctype.h>) -- isalpha(ch)
lisp -- (alpha-char-p ch)

Is character a decimal digit? Given character ch, return true value if that character represents a digit in the decimal number system, false value otherwise:

c (#include <ctype.h>) -- isdigit(ch)
lisp -- (digit-char-p ch)

Is character alphanumeric? Given character ch, return true value if that character is alphanumeric (letter or digit), false value otherwise:

c (#include <ctype.h>) -- isalnum(ch)
lisp -- (alphanumericp ch)

Is character a hexadecimal digit? Given character ch, return true value if that character is a digit in the standard hexadecimal system (0..9, then A..F or a..f), false value otherwise:

c (#include <ctype.h>) -- isxdigit(ch)
lisp -- (digit-char-p ch 16)

Is character a digit per some base? Given character ch, and integer base in the range 2 thru 36, return true value if that character is a digit in that particular base (an extension of the standard hexadecimal system: 0..9, then A..F..Z or a..f..z as needed), false value otherwise:

lisp -- (digit-char-p ch base)

Is character punctuation? Given character ch, return true value if that character is punctuation (printing, but neither alphanumeric nor space), false value otherwise:

c (#include <ctype.h>) -- ispunct(ch)

Is character whitespace? Given character ch, return true value if that character is whitespace, false value otherwise:

c (#include <ctype.h>) -- isspace(ch)

Is character blank? Given character ch, return true value if that character is blank (space or tab), false value otherwise:

c (#include <ctype.h>) -- isblank(ch)

Is character graphic? Given character ch, return true value if that character is graphic (has glyph associated with it), false value otherwise:

c (#include <ctype.h>) -- isgraph(ch)
lisp -- (graphic-char-p ch)

Is character printing? Given character ch, return true value if that character is printing (graphic or the space character, but not tab), false value otherwise:

c (#include <ctype.h>) -- isprint(ch)

Is character a control? Given character ch, return true value if that character is a control (anything not printing, including tab), false value otherwise:

c (#include <ctype.h>) -- iscntrl(ch)

Is character a standard US/UK ASCII character? Given character ch, return true value if that character is in the standard US/UK 7-bit ASCII characterset, false value otherwise:

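c (#include <ctype.h>) -- isascii(ch) (Note: This is a POSIX/BSD extension, not part of standard c.)
lisp -- (standard-char-p ch)
(The c and lisp charactersets don't exactly match: isascii accepts all 128 seven-bit codes, including control characters, whereas the Common Lisp standard characters are just the 95 printing ASCII characters, space included, plus newline.)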

Character equality: Given two characters ch1 and ch2, return a true value if they are exactly the same character, otherwise a false value:

c -- ch1 == ch2 (built-in operator, no #include needed)
lisp -- (char= ch1 ch2)

Character inequality: Given two characters ch1 and ch2, return a false value if they are exactly the same character, otherwise a true value:

c -- ch1 != ch2
lisp -- (char/= ch1 ch2)

Character less-than: Given two characters ch1 and ch2, return a true value if ch1 comes before ch2, false otherwise:

c -- ch1 < ch2
lisp -- (char< ch1 ch2)

Character greater-than: Given two characters ch1 and ch2, return a true value if ch1 comes after ch2, false otherwise:

c -- ch1 > ch2
lisp -- (char> ch1 ch2)

Character less-than-or-equal: Given two characters ch1 and ch2, return a true value if ch1 comes before ch2, or if ch1 is the same character as ch2, false otherwise (if ch1 comes after ch2):

c -- ch1 <= ch2
lisp -- (char<= ch1 ch2)

Character greater-than-or-equal: Given two characters ch1 and ch2, return a true value if ch1 comes after ch2, or if ch1 is the same character as ch2, false otherwise (if ch1 comes before ch2):

c -- ch1 >= ch2
lisp -- (char>= ch1 ch2)

In lisp, case-insensitive comparisons of characters are performed using a special character ordering where the corresponding upper-case and lower-case letters are equated pairwise. Whether this is done by simply mapping all upper-case characters to lower-case before comparing, or vice versa, or using a totally different scheme for ordering characters, is implementation dependent. These comparisons are used mostly to compare alphabetic words, where all implementations produce the same results.

Case-insensitive character equality: Given two characters ch1 and ch2, return a true value if they are the same character, ignoring distinctions between case (upper/lower), otherwise a false value:

lisp -- (char-equal ch1 ch2)

Case-insensitive character inequality: Given two characters ch1 and ch2, return a false value if they are the same character, ignoring distinctions between case (upper/lower), otherwise a true value:

lisp -- (char-not-equal ch1 ch2)

Case-insensitive character less-than: Given two characters ch1 and ch2, return a true value if ch1 comes before ch2, ignoring distinctions between case (upper/lower), false otherwise:

lisp -- (char-lessp ch1 ch2)

Case-insensitive character greater-than: Given two characters ch1 and ch2, return a true value if ch1 comes after ch2, ignoring distinctions between case (upper/lower), false otherwise:

lisp -- (char-greaterp ch1 ch2)

Case-insensitive character less-than-or-equal: Given two characters ch1 and ch2, return a true value if ch1 comes before ch2, or if ch1 is the same character as ch2, ignoring distinctions between case (upper/lower), false otherwise (if ch1 comes after ch2, ignoring case):

lisp -- (char-not-greaterp ch1 ch2)

Case-insensitive character greater-than-or-equal: Given two characters ch1 and ch2, return a true value if ch1 comes after ch2, or if ch1 is the same character as ch2, ignoring distinctions between case (upper/lower), false otherwise (if ch1 comes before ch2, ignoring case):

lisp -- (char-not-lessp ch1 ch2)
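Standard c has no built-in case-insensitive character comparison, but the "map both to lower-case before comparing" scheme mentioned above can be sketched with tolower from ctype.h. This is only one possible emulation, not how any lisp implementation is known to do it, and the names char_compare_nocase and check_nocase are made up for this example.

```c
#include <assert.h>
#include <ctype.h>

/* Hypothetical helper: returns a negative value, zero, or a positive
   value according to whether ch1 sorts before, the same as, or after
   ch2 once both have been folded to lower case. */
int char_compare_nocase(int ch1, int ch2)
{
    int a = tolower((unsigned char)ch1);
    int b = tolower((unsigned char)ch2);
    return (a > b) - (a < b);
}

/* Exercises the helper; call it from main() to run the checks. */
void check_nocase(void)
{
    assert(char_compare_nocase('a', 'A') == 0);   /* like char-equal */
    assert(char_compare_nocase('a', 'B') < 0);    /* like char-lessp */
    assert(char_compare_nocase('Z', 'y') > 0);    /* like char-greaterp */
}
```

The six lisp predicates then correspond to testing the result against zero with ==, !=, <, >, <= and >= respectively. How non-letters sort relative to letters depends on which case is folded to which, so this matches lisp only for letter-to-letter comparisons.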

In c, characters (declared as type char) are simply very short integers (8 or 9 bits) considered as if characters, whereas in Common Lisp, characters are a whole separate kind of data runtime-distinguishable from any integer (although deep inside the character object there is of course the ASCII code for the character). Accordingly in Common Lisp you can intermix characters and numbers in containers such as linked-lists and arrays, and later when you retrieve such an object the type system will tell you which kind of object it is. In c, by comparison, you can't ever intermix (*) characters (as type char) and integers because they occupy different amounts of storage which must be known at compile time. If you expand characters to more bits to occupy the same amount of storage as some type of integer, and intermix them with true integers, there's no way to later tell them apart. (*) (By "intermix" I mean like in an array or linked-list or other uniform-contents container. Of course you can define structures that have special slots occupied by characters and other slots occupied by integers, but that's not what I'm talking about. I'm referring only to containers where at compile time it isn't yet known which elements will contain characters and which will contain integers.) (Some older versions of lisp, such as MacLisp and Emacs-lisp, didn't have characters either, and used integers instead, much like c. But the only dialect of lisp covered in this "cookbook/matrix" document is Common Lisp, so not to worry.) In java, you can have either a character or an integer as a fullfledged object, whose type can be checked at run time, but also you can have primitive types where you must declare the type at compile time and can't intermix different types at all in a container. (See the [Characters and Integers] section for how to convert from one type to the other in lisp or java.)
Perl doesn't have characters either, but instead of using integers it uses single-character strings as stand-ins for character objects.

In lisp, a string is internally a vector (one-dimensional array), allowing all the usual array operations on it (see the sections [Arrays] and [Arrays and Integers] for details), but is also considered to be a sequence, allowing all the usual sequence operations on it (see the sections [Sequences] and [Sequences and Integers] for details). This section deals only with functions that work only with strings in regard to their character elements, except where a more general sequence function serves as an equivalent of a string-specific function in another language.

In c, there's no string type in the first place. One-dimensional arrays of characters, containing non-zero bytes terminated by a zero (NUL) byte, serve as "strings". See the sections [Arrays and Integers] and [Strings] for relevant information.

The rest of this section deals with advanced relationships between strings and characters, beyond simply indexing elements within character arrays which is covered in [Strings and Integers] and [Vectors and Integers].

In c, there's no such data type as character truly distinct from integers. Instead, character literals are really integers, and any sufficiently small non-negative integer can be treated as a character when printing (see printf for details of how to achieve that effect), so conversion happens automatically whenever the other type is needed, such as across an assignment, or when a character is fed into an arithmetic function/operator. Consequently no special functions are needed to convert back and forth.
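To make the point above concrete, here is a minimal sketch of the two "conversions" in c. The helper names char_to_code and code_to_char are made up for this illustration (they aren't library functions), and the sketch assumes an ASCII system and restricts itself to 7-bit codes.

```c
#include <assert.h>

/* Hypothetical helper: a character's code is just its integer value.
   The cast avoids negative results on systems where char is signed. */
int char_to_code(char ch)
{
    return (unsigned char)ch;
}

/* Hypothetical helper: going the other way is a plain narrowing conversion.
   Assumption of this sketch: only 7-bit ASCII codes are accepted. */
char code_to_char(int code)
{
    assert(code >= 0 && code <= 127);
    return (char)code;
}
```

On an ASCII system char_to_code('A') gives 65 and code_to_char(97) gives 'a'. In ordinary code you wouldn't bother with such wrappers at all, since int n = ch; and char ch = n; do the same thing.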

In lisp, integers and characters are two completely different data types, so conversion functions are necessary. See the section [Characters] for why this design decision is sometimes better.

Parse integer: Given a string str containing the representation of an integer in some base, return the integer numeric value:

c (#include <stdlib.h>) -- char *tailptr; ... strtoll(str,&tailptr,base) (return value is of type long long int) (pointer to the first character after the parsed integer is stored in tailptr)
c (#include <stdlib.h>) -- char *tailptr; ... strtol(str,&tailptr,base) (return value is of type long int) (pointer to the first character after the parsed integer is stored in tailptr)
c (#include <stdlib.h>) -- atol(str) (return value is of type long int) (isn't required to check for overflow; behaviour is undefined if the value doesn't fit, hence not recommended)
c (#include <stdlib.h>) -- char *tailptr; ... strtoul(str,&tailptr,base) (return value is of type unsigned long int) (pointer to the first character after the parsed integer is stored in tailptr)
c (#include <stdlib.h>) -- atoi(str) (return value is of type int) (isn't required to check for overflow, see above, hence not recommended)
lisp -- (parse-integer str &key :start :end :radix :junk-allowed) (second return value is index after end of parsed integer)
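Since strtol reports problems in three different ways (no progress of the tail pointer, leftover characters, and errno), a small wrapper is a common way to use it. The following is only a sketch: parse_long is a hypothetical name, and rejecting trailing junk is a policy I've assumed here, roughly what parse-integer does in lisp when :junk-allowed is not given.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: returns 1 and stores the value in *out if the whole
   of str is a valid integer in the given base that fits in a long;
   returns 0 (leaving *out alone) otherwise. */
int parse_long(const char *str, int base, long *out)
{
    char *tailptr;
    long value;

    assert(str != NULL && out != NULL);
    errno = 0;                           /* strtol sets errno only on failure */
    value = strtol(str, &tailptr, base);
    if (tailptr == str)   return 0;      /* no digits were consumed */
    if (*tailptr != '\0') return 0;      /* junk after the number */
    if (errno == ERANGE)  return 0;      /* value doesn't fit in a long */
    *out = value;
    return 1;
}
```

With this, parse_long("ff", 16, &n) succeeds and sets n to 255, whereas parse_long("12abc", 10, &n) and parse_long("", 10, &n) both fail. Note that strtol itself skips leading whitespace and accepts a sign, so " -7" is accepted too.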

Safe parsing of integer, validation of input from user or other unsafe source in six languages.

Length of string: Given string str, return its effective length:

c (#include <string.h>) -- strlen(str) (return value is of type size_t)
lisp -- (length str)

Copy string (overwriting, exact length): Given string strfrom, mutable byte vector strto, and size telling how many characters to copy, copy exactly that number of characters from strfrom overwriting the initial segment of strto:

c (#include <string.h>) -- strncpy(strto,strfrom,size) (size must be of type size_t) (if size is more than the effective size of strfrom, the NUL byte at the end is copied as many times as necessary to fill out the total size of overwritten bytes) (Warning: If strfrom has size or more characters before its NUL byte, no NUL byte is written at all, so strto is left unterminated.) (Warning: No check against overwriting end of allocation for strto, but normally you supply exactly sizeof strto as size so that copying stops at exactly the end of the total allocation.)
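Because strncpy doesn't write a terminating NUL byte when the source is too long, it's usually wrapped so the result is always a valid string. A minimal sketch follows; copy_bounded is a hypothetical name, and silently truncating an over-long source is a choice I've assumed for the illustration.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: copy strfrom into a buffer of bufsize bytes,
   truncating if necessary, and always leave a NUL-terminated result. */
char *copy_bounded(char *strto, size_t bufsize, const char *strfrom)
{
    assert(strto != NULL && strfrom != NULL && bufsize > 0);
    strncpy(strto, strfrom, bufsize - 1);  /* may leave strto unterminated */
    strto[bufsize - 1] = '\0';             /* so terminate it explicitly */
    return strto;
}
```

Typical use is char buf[4]; copy_bounded(buf, sizeof buf, "abcdef"); which leaves "abc" in buf.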

Concatenate strings (overwriting, limited length): Given string strfrom, mutable byte vector strto, and size telling the maximum number of characters to copy, copy at most that number of characters from strfrom, overwriting starting at the effective end of strto:

c (#include <string.h>) -- strncat(strto,strfrom,size) (size must be of type size_t) (copying stops early if the end of strfrom is reached) (a single NUL byte is always appended after the copy, so the total allocated size of strto must be at least size + 1 bytes longer than its initial effective length) (Warning: No check against overwriting end of allocation for strto, but normally you supply exactly (sizeof strto) - (strlen(strto)+1) as size, so that copying stops at exactly the end of the total allocation.)
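The size arithmetic for strncat is easy to get wrong by one, so here is a sketch that computes it in one place. The name append_bounded and its return convention (1 if everything fit, 0 if the appended text was truncated) are assumptions of mine, not anything standard.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: append strfrom to the string already in strto,
   where strto is a buffer of bufsize bytes in total.
   Returns 1 if all of strfrom fit, 0 if it had to be truncated. */
int append_bounded(char *strto, size_t bufsize, const char *strfrom)
{
    size_t used, room;

    assert(strto != NULL && strfrom != NULL);
    used = strlen(strto);
    assert(used < bufsize);          /* strto must already be terminated */
    room = bufsize - (used + 1);     /* leave one byte for the NUL */
    strncat(strto, strfrom, room);   /* always writes a terminating NUL */
    return strlen(strfrom) <= room;
}
```

Starting from char buf[8] = "abc";, appending "de" gives "abcde" and returns 1, and then appending "fghij" fills the buffer with "abcdefg" and returns 0.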

Compare strings lexicographically: Given strings str1,str2, return a negative or positive value depending on the direction of the first difference (or, if one string is a prefix of the other, on which one is longer), or 0 if all corresponding characters are the same and the lengths are the same. (Only the sign of the result is meaningful, it isn't necessarily -1 or +1.):

c (#include <string.h>) -- strcmp(str1,str2)
c (#include <strings.h>) -- strcasecmp(str1,str2) (case-insensitive) (POSIX rather than standard c)
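The c standard only promises that strcmp returns some value less than, equal to, or greater than zero. If you actually need exactly -1, 0 or +1 (for example to compare results across languages), the result can be normalized. A small sketch, with compare_sign being a hypothetical name:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: reduce strcmp's result to exactly -1, 0 or +1. */
int compare_sign(const char *str1, const char *str2)
{
    int r;

    assert(str1 != NULL && str2 != NULL);
    r = strcmp(str1, str2);
    return (r > 0) - (r < 0);   /* each comparison yields 0 or 1 */
}
```

So compare_sign("apple", "banana") is -1, and compare_sign("abc", "abcd") is also -1 because the shorter string is a prefix of the longer one.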

Compare prefixes of strings lexicographically: Given strings str1,str2, and size of portion to compare, return a negative or positive value depending on the direction of the first difference within that portion, or 0 if all corresponding characters are the same:

c (#include <string.h>) -- strncmp(str1,str2,size)
c (#include <strings.h>) -- strncasecmp(str1,str2,size) (case-insensitive) (POSIX rather than standard c)

Find character in string: Given string str, and character ch, find first instance of ch as element of string:

c (#include <string.h>) -- strchr(str,ch) or index(str,ch) (return value is pointer to the found character, or null pointer if no matching character was found)
lisp -- (position ch str) (return value is index where character was found, or NIL if no matching character was found) (position takes keyword arguments :from-end :test :test-not :start :end :key, and works with all types of sequences, see [Sequences] for details.)

Find character in string, searching backwards: Given string str, and character ch, find last instance of ch as element of string:

c (#include <string.h>) -- strrchr(str,ch) or rindex(str,ch) (return value is pointer to the found character, or null pointer if no matching character was found)
lisp -- (position ch str :FROM-END T) (return value is index where character was found, or NIL if no matching character was found) (see earlier note about keywords and other types of sequences)
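The c functions return a pointer where lisp returns an index, but the two are easy to convert between by pointer subtraction. The sketch below uses made-up helper names (index_of_char, last_index_of_char) and an assumed convention of returning -1 where lisp would return NIL.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: index of first ch in str, or -1 if there is none. */
long index_of_char(const char *str, int ch)
{
    const char *p;

    assert(str != NULL);
    p = strchr(str, ch);
    return p != NULL ? (long)(p - str) : -1;
}

/* Hypothetical helper: index of last ch in str, or -1 if there is none. */
long last_index_of_char(const char *str, int ch)
{
    const char *p;

    assert(str != NULL);
    p = strrchr(str, ch);
    return p != NULL ? (long)(p - str) : -1;
}
```

For the string "hello" and the character 'l' these give 2 and 3 respectively, matching what (position #\l "hello") and (position #\l "hello" :from-end t) return in lisp.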

Find substring in string: Given strings needle,haystack, find first location where needle exactly matches a substring of haystack:

c (#include <string.h>) -- strstr(haystack,needle) (return value is pointer to the first character of the found substring, or null pointer if no matching substring was found)
lisp -- (search needle haystack) (return value is index of first character of first matching substring, or NIL if no matching substring was found) (see [Sequences] for keywords and additional sequence data types)

Skip over particular characters in string: Given strings str,bag, find first character of str which is not any one of the characters in bag:

c (#include <string.h>) -- strspn(str,bag) (return value is count of characters of str skipped before the first character not a member of bag was found, or length of str if no such character was found)
lisp -- (position-if-not #'(lambda (ch) (position ch bag)) str) (return value is index of first non-bag character, or NIL if no such character was found) (see [Sequences] for keywords and additional sequence data types)

Skip until first particular character in string: Given strings str,bag, find first character of str which is any of the characters in bag:

c (#include <string.h>) -- strcspn(str,bag) (return value is count of characters of str skipped before the first bag character was found, or length of str if no such character was found)
c (#include <string.h>) -- strpbrk(str,bag) (return value is pointer to first bag character that was found in str, or null pointer if no such character was found)
lisp -- (position-if #'(lambda (ch) (position ch bag)) str) (return value is index of first bag character found in str, or NIL if no such character was found) (see [Sequences] for keywords and additional sequence data types)
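The two "skip" functions are typically used as a pair: strspn to skip over separators, then strcspn to measure the token that follows. Here is a sketch under that assumption; first_token is a hypothetical name, and reporting the token as an offset plus a length (rather than modifying the string) is my own choice for the example.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: find the first run of characters in str that are
   not in bag (the separators). On success stores the run's starting
   offset and length and returns 1; returns 0 if str holds only separators. */
int first_token(const char *str, const char *bag, size_t *start, size_t *len)
{
    size_t skip, n;

    assert(str != NULL && bag != NULL && start != NULL && len != NULL);
    skip = strspn(str, bag);         /* count of leading separators */
    n = strcspn(str + skip, bag);    /* count of non-separators after them */
    if (n == 0)
        return 0;
    *start = skip;
    *len = n;
    return 1;
}
```

Given "  hello world" with a bag of " ", this reports offset 2 and length 5.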

Strings in c are nothing more than byte vectors (1-d arrays) whose individual elements contain character codes (ASCII on most systems, EBCDIC on some) for non-NUL characters, followed by a single NUL byte just after the end of the string, followed by junk from there to the end of the allocated vector. These can exist as string literals (which aren't allowed to be modified in either contents or length), or static declared allocations (which can be modified per both contents and length but only so long as the effective length plus the extra NUL byte don't exceed the total allocation size, but warning: no runtime check for such out-of-bounds overwrite happens, and that's a common cause of trashing memory and/or violating security), or dynamic allocation (where realloc might be able to change the total allocation size, or might make a copy elsewhere, but otherwise these are similar to static declared allocations).

Strings in Common Lisp are all allocated objects. These can exist as string literals (which aren't allowed to be modified in either contents or length), or as runtime-constructed objects (which can be modified by per-character overwriting but the length is constant, unless the string is created with a fill pointer, and even then the length can't exceed the allocated size unless the string was also created adjustable). See [Vectors] for details about fill pointer and adjustable. All array indexing, including indexing within strings, is fully checked against index out of bounds, signalling an exception when that happens, thereby preventing trashing of memory.

Strings in java are all allocated objects, and are all totally immutable regardless of whether they are string literals or constructed at runtime. But StringBuffers act like strings that can be expanded, and otherwise modified, after creation. Despite the commonality of primitive data types between c and java, arrays of characters are not used as strings in java, although arrays of integers might at times be used to hold buffers of raw/numeric data which might contain character codes.

The functions defined below, and in other sections relating Strings to other data types, reflect these implementation differences between the several languages.

Copy string (overwriting): Given string strfrom, and mutable byte vector strto, copy strfrom to overwrite the initial segment of strto:

c (#include <string.h>) -- strcpy(strto,strfrom) (return value is strto) (Note: Because strings are NUL-terminated, the effective length of strto after overwriting is now the same as the effective length of strfrom.) (Warning: No check is made against writing past the end of the buffer allocated for strto and thereby overwriting whatever happens to be allocated next in memory.)
c (#include <string.h>) -- stpcpy(strto,strfrom) (return value is pointer to NUL byte terminating the copy within strto) (otherwise same as strcpy, including warning)
lisp -- (replace strto strfrom) (If the strto is too short to hold all the characters from strfrom, then copying stops without writing past the end of strto.) (replace also supports keyword parameters :start1 :end1 :start2 :end2 for limiting strto or strfrom to just a sub-sequence instead of the whole string. See sections on [Sequences] for details.)

Copy string (allocate new): Given string strfrom, allocate a new copy of it, return pointer to the copy:

c (#include <string.h>) -- strdup(strfrom) (returns null pointer if there's not enough memory available to hold the copy)
lisp -- (copy-seq strfrom)

Concatenate strings (overwriting): Given string strfrom, and mutable byte vector strto, copy strfrom to overwrite strto starting past the effective end of strto:

c (#include <string.h>) -- strcat(strto,strfrom) (return value is strto, which effectively is now the concatenation of strto followed by strfrom) (Warning: No check is made against writing past the end of the buffer allocated for strto and thereby overwriting whatever happens to be allocated next in memory.)
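lisp -- (replace strto strfrom :start1 (length strto)) (This works only if strto has a fill pointer, and stops when the allocated size of strto is reached, but if it was created adjustable then you can enlarge it before calling replace to get the whole string copied. Note that replace writes only within the active length, so the fill pointer has to be advanced first, e.g. with (setf (fill-pointer strto) ...), with the old length remembered for use as :start1. See [Vectors] for details of fill pointer and adjustable.)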

Concatenate strings (allocating new): Given strings str1,str2, allocate a new string which is the concatenation of them:

lisp -- (concatenate 'string str1 str2) (More than two strings can be supplied.) (See [Sequences] for more flexible use of concatenate with all kinds of sequences, even converting from one type to another.)
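There's no entry for c above because the standard library has no single function for this, but it's easy to build from malloc and memcpy. The following is a sketch only: concat_new is a hypothetical name, it handles exactly two strings, and the caller is assumed to free the result.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: allocate a new string holding str1 followed by str2.
   Returns a null pointer if there isn't enough memory.
   The caller owns the result and should release it with free(). */
char *concat_new(const char *str1, const char *str2)
{
    size_t len1, len2;
    char *result;

    assert(str1 != NULL && str2 != NULL);
    len1 = strlen(str1);
    len2 = strlen(str2);
    result = malloc(len1 + len2 + 1);        /* +1 for the NUL byte */
    if (result == NULL)
        return NULL;
    memcpy(result, str1, len1);
    memcpy(result + len1, str2, len2 + 1);   /* copies str2's NUL as well */
    return result;
}
```

Neither argument is modified, so unlike strcat this is safe to use on string literals: concat_new("foo", "bar") yields a fresh "foobar".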

Copy block of bytes (overwriting, exact length): Given pointers ptrfrom and ptrto, and size telling how many bytes to copy, copy exactly that number of bytes starting from wherever ptrfrom points, overwriting starting wherever ptrto points:

c (#include <string.h>) -- memcpy(ptrto,ptrfrom,size) (size must be of type size_t) (Warning: No check against overwriting memory you shouldn't be overwriting!!) (Warning: No check against overlap between source and overwrite-area, unpredictable results if that happens.)
c (#include <string.h>) -- memmove(ptrto,ptrfrom,size) (size must be of type size_t) (Specially handles case where source and overwritten data overlap so that exactly one copy of the original source appears where you think it should.) (Warning: No check against overwriting memory you shouldn't be overwriting!!)
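c (#include <strings.h>) -- bcopy(ptrfrom,ptrto,size) (like memmove, including correct handling of overlap, except for the order of args) (legacy BSD function, memmove is preferred in new code)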
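A typical case where source and destination overlap is closing up a gap within a single buffer, which is why memmove rather than memcpy is needed there. A minimal sketch, where delete_chars is a hypothetical helper of my own:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: delete count characters starting at offset pos
   from the NUL-terminated string str, in place. */
void delete_chars(char *str, size_t pos, size_t count)
{
    size_t len;

    assert(str != NULL);
    len = strlen(str);
    assert(pos <= len && count <= len - pos);
    /* Source and destination overlap whenever the tail is longer than the
       gap, so memcpy would be unsafe here. The +1 moves the NUL byte too. */
    memmove(str + pos, str + pos + count, len - pos - count + 1);
}
```

For example, with char buf[] = "abcdef"; the call delete_chars(buf, 1, 2) leaves "adef" in buf.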

Copy block of bytes (overwriting, exact length, or to delimiter): Given pointers ptrfrom and ptrto, integer c, and size telling how many bytes to copy, copy exactly that number of bytes starting from wherever ptrfrom points, overwriting starting wherever ptrto points, except stop early, just after copying it, if a byte matching c is encountered:

c (#include <string.h>) -- memccpy(ptrto,ptrfrom,c,size) (size must be of type size_t) (Warning: No check against overwriting memory you shouldn't be overwriting!!) (return value is a pointer into overwrite area one byte past where c was copied, or a null pointer if no byte matching c appeared in the first size bytes of ptrfrom)

Fill block of bytes (overwriting, exact length): Given pointer ptrto, integer c, and size telling how many bytes to fill, write copies of c (converted to a byte) repeatedly, filling the block starting wherever ptrto points, for a total number of size copies:

c (#include <string.h>) -- memset(ptrto,c,size) (size must be of type size_t) (Warning: No check against overwriting memory you shouldn't be overwriting!!)

Fill block of bytes with zero (overwriting, exact length): Given pointer ptrto, and size telling how many bytes to fill, write zero bytes repeatedly, filling the block starting wherever ptrto points, for a total number of size bytes:

c (#include <string.h>) -- bzero(ptrto,size) (size must be of type size_t) (Warning: No check against overwriting memory you shouldn't be overwriting!!)

Compare blocks of bytes lexicographically: Given pointers ptr1,ptr2, and size telling how many bytes to compare, return a negative or positive value depending on the direction of the first difference, or 0 if all corresponding byte-pairs are the same:

c (#include <string.h>) -- memcmp(ptr1,ptr2,size) (size must be of type size_t)
c (#include <strings.h>) -- bcmp(ptr1,ptr2,size) (size must be of type size_t) (return value is 0 if the blocks are identical, non-zero otherwise, with no indication of the direction of the difference) (legacy BSD function)

Find byte in block: Given pointer ptr, byte by, and size telling how many bytes to search, find the first byte matching by within memory starting at ptr:

c (#include <string.h>) -- memchr(ptr,by,size) (return value is pointer to first matching byte found, or null pointer if no match was found)

Find sub-block in block: Given pointer haystack, haystack_len telling how many bytes total to search within, pointer needle, and needle_len telling how many bytes to try to match, find first sub-block of haystack exactly matching needle:

c (#include <string.h>) -- memmem(haystack,haystack_len,needle,needle_len) (return value is pointer to first byte of first matching sub-block, or null pointer if no match was found) (not part of standard c, available as an extension in GNU libc and some other libraries)
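Where memmem isn't available, the same search can be put together from memchr and memcmp. The sketch below is a simple, unoptimized stand-in under that assumption: find_block is a hypothetical name, and the argument order (haystack first, then needle) is the one I've chosen for the illustration.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: find the first occurrence of the needle_len bytes at
   needle within the haystack_len bytes at haystack. Returns a pointer to
   the match, or a null pointer if there is none. */
void *find_block(const void *haystack, size_t haystack_len,
                 const void *needle, size_t needle_len)
{
    const unsigned char *h = haystack;
    const unsigned char *n = needle;
    const unsigned char *end, *p;

    assert(haystack != NULL && needle != NULL);
    if (needle_len == 0)
        return (void *)h;                    /* empty needle matches at start */
    if (needle_len > haystack_len)
        return NULL;
    end = h + (haystack_len - needle_len);   /* last possible starting point */
    p = h;
    while (p <= end) {
        /* Jump to the next byte equal to the needle's first byte. */
        p = memchr(p, n[0], (size_t)(end - p) + 1);
        if (p == NULL)
            return NULL;
        if (memcmp(p, n, needle_len) == 0)
            return (void *)p;
        p++;
    }
    return NULL;
}
```

Unlike strstr, this works on data containing NUL bytes, since the lengths are given explicitly instead of being found by scanning for a terminator.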

Multi-programming-language "cookbook", matrix (Chapter 3)