Built-In Datatypes

The previous chapter introduced some of Racket's built-in datatypes: numbers, booleans, strings, lists, and procedures. This section provides more complete coverage of the built-in datatypes for simple forms of data.

1 Booleans              
2 Numbers               
3 Characters            
4 Strings (Unicode)
5 Bytes and Byte Strings
6 Symbols               
7 Keywords              
8 Pairs and Lists       
9 Vectors               
10 Hash Tables          
11 Boxes                
12 Void and Undefined   

1. Booleans

Racket has two distinguished constants to represent boolean values: #t for true and #f for false. Uppercase #T and #F are parsed as the same values, but the lowercase forms are preferred.

The boolean? procedure recognizes the two boolean constants. In the result of a test expression for if, cond, and, or, etc., however, any value other than #f counts as true.


> (= 2 (+ 1 1))
#t
> (boolean? #t)
#t
> (boolean? #f)
#t
> (boolean? "no")
#f
> (if "no" 1 0)
1
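To make the rule about test expressions concrete, here is a small sketch showing that values such as 0, the empty string, and the empty list all count as true; only #f is false. The name truthy? is a hypothetical helper introduced for illustration and is not part of Racket.

```racket
#lang racket

;; truthy? is a hypothetical helper: it reports how `if` treats a value.
(define (truthy? v)
  (if v #t #f))

(truthy? 0)     ; => #t, 0 is not #f
(truthy? "")    ; => #t, the empty string is not #f
(truthy? '())   ; => #t, the empty list is not #f
(truthy? #f)    ; => #f, the only false value

;; `and` and `or` return the deciding value, not necessarily a boolean
(and 1 2)       ; => 2
(or #f "no")    ; => "no"
```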

2. Numbers

A Racket number is either exact or inexact:

  • An exact number is either

    • an arbitrarily large or small integer, such as 5, 99999999999999999, or -17;

    • a rational that is exactly the ratio of two arbitrarily small or large integers, such as 1/2, 99999999999999999/2, or -3/4; or

    • a complex number with exact real and imaginary parts (where the imaginary part is not zero), such as 1+2i or 1/2+3/4i.

  • An inexact number is either

    • an IEEE floating-point representation of a number, such as 2.0 or 3.14e+87, where the IEEE infinities and not-a-number are written +inf.0, -inf.0, and +nan.0 (or -nan.0); or

    • a complex number with real and imaginary parts that are IEEE floating-point representations, such as 2.0+3.0i or -inf.0+nan.0i; as a special case, an inexact complex number can have an exact zero real part with an inexact imaginary part.

Inexact numbers print with a decimal point or exponent specifier, and exact numbers print as integers and fractions. The same conventions apply for reading number constants, but #e or #i can prefix a number to force its parsing as an exact or inexact number. The prefixes #b, #o, and #x specify binary, octal, and hexadecimal interpretation of digits.

+[missing] in [missing] documents the fine points of the syntax of numbers.


> 0.5
0.5
> #e0.5
1/2
> #x03BB
955
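As a further illustration of the reader prefixes described above, the following sketch writes several constants with radix and exactness prefixes. The particular values are arbitrary choices for the example.

```racket
#lang racket

#b1010     ; binary digits        => 10
#o17       ; octal digits         => 15
#xFF       ; hexadecimal digits   => 255
#e1.5      ; forced exact         => 3/2
#i1/2      ; forced inexact       => 0.5
#e#x10     ; prefixes can combine => 16
```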

Computations that involve an inexact number produce inexact results, so that inexactness acts as a kind of taint on numbers. Beware, however, that Racket offers no “inexact booleans,” so computations that branch on the comparison of inexact numbers can nevertheless produce exact results. The procedures exact->inexact and inexact->exact convert between the two types of numbers.


> (/ 1 2)
1/2
> (/ 1 2.0)
0.5
> (if (= 3.0 2.999) 1 2)
2
> (inexact->exact 0.1)
3602879701896397/36028797018963968
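The way inexactness spreads through a computation can be checked with the exact? and inexact? predicates. This is only a sketch; the operands are arbitrary.

```racket
#lang racket

(exact? (+ 1/2 1/3))           ; => #t, all operands are exact
(inexact? (+ 1/2 0.5))         ; => #t, one inexact operand taints the sum
(exact->inexact 1/4)           ; => 0.25
(inexact->exact 0.25)          ; => 1/4, since 0.25 is exactly representable in base 2
(exact? (if (< 0.1 0.2) 1 2))  ; => #t, the branch result is exact although the test was inexact
```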

Inexact results are also produced by procedures such as sqrt, log, and sin when an exact result would require representing real numbers that are not rational. Racket can represent only rational numbers and complex numbers with rational parts.


> (sin 0)   ; rational...
0
> (sin 1/2) ; not rational...
0.479425538604203

In terms of performance, computations with small integers are typically the fastest, where “small” means that the number fits into one bit less than the machine's word-sized representation for signed numbers. Computation with very large exact integers or with non-integer exact numbers can be much more expensive than computation with inexact numbers.

(define (sigma f a b)
  (if (= a b)
      0
      (+ (f a) (sigma f (+ a 1) b))))
> (time (round (sigma (lambda (x) (/ 1 x)) 1 2000)))
cpu time: 63 real time: 64 gc time: 0
8
> (time (round (sigma (lambda (x) (/ 1.0 x)) 1 2000)))
cpu time: 1 real time: 0 gc time: 0
8.0

The number categories integer, rational, real (always rational), and complex are defined in the usual way, and are recognized by the procedures integer?, rational?, real?, and complex?, in addition to the generic number?. A few mathematical procedures accept only real numbers, but most implement standard extensions to complex numbers.


> (integer? 5)
#t
> (complex? 5)
#t
> (integer? 5.0)
#t
> (integer? 1+2i)
#f
> (complex? 1+2i)
#t
> (complex? 1.0+2.0i)
#t
> (abs -5)
5
> (abs -5+2i)
abs: contract violation               
  expected: real?                     
  given: -5+2i                        
> (sin -5+2i)                         

The = procedure compares numbers for numerical equality. If it is given both inexact and exact numbers to compare, it essentially converts the inexact numbers to exact before comparing. The eqv? (and therefore equal?) procedure, in contrast, compares numbers considering both exactness and numerical equality.


> (= 1 1.0)
#t
> (eqv? 1 1.0)
#f

Beware of comparisons involving inexact numbers, which by their nature can have surprising behavior. Even apparently simple inexact numbers may not mean what you think they mean; for example, while a base-2 IEEE floating-point number can represent 1/2 exactly, it can only approximate 1/10:


> (= 1/2 0.5)
#t
> (= 1/10 0.1)
#f
> (inexact->exact 0.1)
3602879701896397/36028797018963968
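The same approximation effect shows up in arithmetic. The sketch below contrasts an inexact sum with its exact counterpart and shows one common workaround, comparing within a tolerance. The helper close-enough? and its default tolerance are assumptions made for this example, not a Racket built-in.

```racket
#lang racket

(= (+ 0.1 0.2) 0.3)       ; => #f, rounding errors accumulate
(= (+ 1/10 2/10) 3/10)    ; => #t, exact rationals have no rounding

;; close-enough? is a hypothetical helper; the tolerance is an arbitrary choice.
(define (close-enough? a b [tolerance 1e-9])
  (<= (abs (- a b)) tolerance))

(close-enough? (+ 0.1 0.2) 0.3) ; => #t
```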

+[missing] in [missing] provides more on numbers and number procedures.

3. Characters

A Racket character corresponds to a Unicode scalar value. Roughly, a scalar value is an unsigned integer whose representation fits into 21 bits, and that maps to some notion of a natural-language character or piece of a character. Technically, a scalar value is a simpler notion than the concept called a “character” in the Unicode standard, but it's an approximation that works well for many purposes. For example, any accented Roman letter can be represented as a scalar value, as can any common Chinese character.

Although each Racket character corresponds to an integer, the character datatype is separate from numbers. The char->integer and integer->char procedures convert between scalar-value numbers and the corresponding character.

A printable character normally prints as #\ followed by the represented character. An unprintable character normally prints as #\u followed by the scalar value as a hexadecimal number. A few characters are printed specially; for example, the space and linefeed characters print as #\space and #\newline, respectively.

+[missing] in [missing] documents the fine points of the syntax of characters.


> (integer->char 65)
#\A
> (char->integer #\A)
65
> #\λ
#\λ
> #\u03BB
#\λ
> (integer->char 17)
#\u0011
> (char->integer #\space)
32

The display procedure directly writes a character to the current output port (see [missing]), in contrast to the character-constant syntax used to print a character result.


> #\A
#\A
> (display #\A)
A

Racket provides several classification and conversion procedures on characters. Beware, however, that conversions on some Unicode characters work as a human would expect only when they are in a string (e.g., upcasing “ß” or downcasing “Σ”).


> (char-alphabetic? #\A)
#t
> (char-numeric? #\0)
#t
> (char-whitespace? #\newline)
#t
> (char-downcase #\A)
#\a
> (char-upcase #\ß)
#\ß
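The difference between character-level and string-level conversion can be seen by upcasing “ß” both ways. This sketch uses only char-upcase and string-upcase:

```racket
#lang racket

(char-upcase #\ß)                         ; => #\ß, no single-character uppercase form
(string-upcase "ß")                       ; => "SS", a string can grow when upcased
(string-length "Straße")                  ; => 6
(string-length (string-upcase "Straße"))  ; => 7, one character longer
```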

The char=? procedure compares two or more characters, and char-ci=? compares characters ignoring case. The eqv? and equal? procedures behave the same as char=? on characters; use char=? when you want to more specifically declare that the values being compared are characters.


> (char=? #\a #\A)
#f
> (char-ci=? #\a #\A)
#t
> (eqv? #\a #\A)
#f

+[missing] in [missing] provides more on characters and character procedures.

4. Strings (Unicode)

A string is a fixed-length array of characters. It prints using doublequotes, where doublequote and backslash characters within the string are escaped with backslashes. Other common string escapes are supported, including \n for a linefeed, \r for a carriage return, octal escapes using \ followed by up to three octal digits, and hexadecimal escapes with \u (up to four digits). Unprintable characters in a string are normally shown with \u when the string is printed.

+[missing] in [missing] documents the fine points of the syntax of strings.

The display procedure directly writes the characters of a string to the current output port (see [missing]), in contrast to the string-constant syntax used to print a string result.


> "Apple"
"Apple"
> "\u03BB"
"λ"
> (display "Apple")
Apple
> (display "a \"quoted\" thing")
a "quoted" thing
> (display "two\nlines")
two
lines
> (display "\u03BB")
λ

A string can be mutable or immutable; strings written directly as expressions are immutable, but most other strings are mutable. The make-string procedure creates a mutable string given a length and optional fill character. The string-ref procedure accesses a character from a string with 0-based indexing; the string-set! procedure changes a character in a mutable string.


> (string-ref "Apple" 0)
#\A
> (define s (make-string 5 #\.))
> s
"....."
> (string-set! s 2 #\λ)
> s
"..λ.."
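The mutable/immutable distinction can be checked with immutable?. The sketch below also uses string-copy to obtain a mutable copy of a literal; the variable name copy is arbitrary.

```racket
#lang racket

(immutable? "Apple")                 ; => #t, a literal string
(immutable? (make-string 3 #\a))     ; => #f

(define copy (string-copy "Apple"))  ; string-copy returns a fresh mutable string
(string-set! copy 0 #\a)
copy                                 ; => "apple"

(immutable? (string->immutable-string copy)) ; => #t
```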

String ordering and case operations are generally locale-independent; that is, they work the same for all users. A few locale-dependent operations are provided that allow the way that strings are case-folded and sorted to depend on the end user's locale. If you're sorting strings, for example, use string<? or string-ci<? if the sort result should be consistent across machines and users, but use string-locale<? or string-locale-ci<? if the sort is purely to order strings for an end user.


> (string<? "apple" "Banana")
#f
> (string-ci<? "apple" "Banana")
#t
> (string-upcase "Straße")
"STRASSE"
> (parameterize ([current-locale "C"])
    (string-locale-upcase "Straße"))
"STRAßE"

For working with plain ASCII, working with raw bytes, or encoding/decoding Unicode strings as bytes, use byte strings.

+[missing] in [missing] provides more on strings and string procedures.

5. Bytes and Byte Strings

A byte is an exact integer between 0 and 255, inclusive. The byte? predicate recognizes numbers that represent bytes.


> (byte? 0)
#t
> (byte? 256)
#f

A byte string is similar to a string (see Strings (Unicode)), but its content is a sequence of bytes instead of characters. Byte strings can be used in applications that process pure ASCII instead of Unicode text. The printed form of a byte string supports such uses in particular, because a byte string prints like the ASCII decoding of the byte string, but prefixed with a #. Unprintable ASCII characters or non-ASCII bytes in the byte string are written with octal notation.

+[missing] in [missing] documents the fine points of the syntax of byte strings.


> #"Apple"
#"Apple"
> (bytes-ref #"Apple" 0)
65
> (make-bytes 3 65)
#"AAA"
> (define b (make-bytes 2 0))
> b
#"\0\0"
> (bytes-set! b 0 1)
> (bytes-set! b 1 255)
> b
#"\1\377"

The display form of a byte string writes its raw bytes to the current output port (see [missing]). Technically, display of a normal (i.e., character) string prints the UTF-8 encoding of the string to the current output port, since output is ultimately defined in terms of bytes; display of a byte string, however, writes the raw bytes with no encoding. Along the same lines, when this documentation shows output, it technically shows the UTF-8-decoded form of the output.


> (display #"Apple")
Apple
> (display "\316\273")  ; same as "Î»"
Î»
> (display #"\316\273") ; UTF-8 encoding of λ
λ

For explicitly converting between strings and byte strings, Racket supports three kinds of encodings directly: UTF-8, Latin-1, and the current locale's encoding. General facilities for byte-to-byte conversions (especially to and from UTF-8) fill the gap to support arbitrary string encodings.


> (bytes->string/utf-8 #"\316\273")
"λ"
> (bytes->string/latin-1 #"\316\273")
"Î»"
> (parameterize ([current-locale "C"])  ; C locale supports ASCII,
    (bytes->string/locale #"\316\273")) ; only, so...             
bytes->string/locale: byte string is not a valid encoding         
for the current locale                                            
  byte string: #"\316\273"                                        
> (let ([cvt (bytes-open-converter "cp1253" ; Greek code page
                                   "UTF-8")]
        [dest (make-bytes 2)])
    (bytes-convert cvt #"\353" 0 1 dest)
    (bytes-close-converter cvt)
    (bytes->string/utf-8 dest))
"λ"
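Going in the other direction, string->bytes/utf-8 encodes a string as bytes. The following sketch shows the round trip for “λ” and makes the difference between character count and byte count visible; the name encoded is arbitrary.

```racket
#lang racket

(define encoded (string->bytes/utf-8 "λ"))

encoded                        ; => #"\316\273"
(string-length "λ")            ; => 1, one character
(bytes-length encoded)         ; => 2, two bytes in UTF-8
(bytes->list encoded)          ; => '(206 187)
(bytes->string/utf-8 encoded)  ; => "λ"
```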

+[missing] in [missing] provides more on byte strings and byte-string procedures.

6. Symbols

A symbol is an atomic value that prints like an identifier preceded with '. An expression that starts with ' and continues with an identifier produces a symbol value.


> 'a
'a
> (symbol? 'a)
#t

For any sequence of characters, exactly one corresponding symbol is interned; calling the string->symbol procedure, or reading a syntactic identifier, produces an interned symbol. Since interned symbols can be cheaply compared with eq? (and thus eqv? or equal?), they serve as convenient values to use for tags and enumerations.

Symbols are case-sensitive. By using a #ci prefix or in other ways, the reader can be made to case-fold character sequences to arrive at a symbol, but the reader preserves case by default.


> (eq? 'a 'a)
#t
> (eq? 'a (string->symbol "a"))
#t
> (eq? 'a 'b)
#f
> (eq? 'a 'A)
#f
> #ci'A
'a
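As an illustration of symbols used as tags, the sketch below dispatches on a symbol with case. The function shape-sides and its tags are hypothetical names chosen for this example.

```racket
#lang racket

;; shape-sides is a hypothetical function that dispatches on symbol tags.
(define (shape-sides shape)
  (case shape
    [(triangle) 3]
    [(square rectangle) 4]
    [else 'unknown]))

(shape-sides 'square)                      ; => 4
(shape-sides (string->symbol "triangle"))  ; => 3, the same interned symbol
(shape-sides 'circle)                      ; => 'unknown
```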

Any string (i.e., any character sequence) can be supplied to string->symbol to obtain the corresponding symbol. For reader input, any character can appear directly in an identifier, except for whitespace and the following special characters:

   ( ) [ ] { } " , ' ; # | `

Actually, # is disallowed only at the beginning of a symbol, and then only if not followed by %; otherwise, # is allowed, too. Also, . by itself is not a symbol.

Whitespace or special characters can be included in an identifier by quoting them with | or \. These quoting mechanisms are used in the printed form of identifiers that contain special characters or that might otherwise look like numbers.


> (string->symbol "one, two")
'|one, two|                  
> (string->symbol "6")
'|6|

+[missing] in [missing] documents the fine points of the syntax of symbols.

The write function prints a symbol without a ' prefix. The display form of a symbol is the same as the corresponding string.


> (write 'Apple)
Apple
> (display 'Apple)
Apple
> (write '|6|)
|6|
> (display '|6|)
6

The gensym and string->uninterned-symbol procedures generate fresh uninterned symbols that are not equal (according to eq?) to any previously interned or uninterned symbol. Uninterned symbols are useful as fresh tags that cannot be confused with any other value.


> (define s (gensym))
> s
'g42
> (eq? s 'g42)
#f
> (eq? 'a (string->uninterned-symbol "a"))
#f

+[missing] in [missing] provides more on symbols.

7. Keywords

A keyword value is similar to a symbol (see Symbols), but its printed form is prefixed with #:.

+[missing] in [missing] documents the fine points of the syntax of keywords.


> (string->keyword "apple")
'#:apple
> '#:apple
'#:apple
> (eq? '#:apple (string->keyword "apple"))
#t

More precisely, a keyword is analogous to an identifier; in the same way that an identifier can be quoted to produce a symbol, a keyword can be quoted to produce a value. The same term “keyword” is used in both cases, but we sometimes use keyword value to refer more specifically to the result of a quote-keyword expression or of string->keyword. An unquoted keyword is not an expression, just as an unquoted identifier does not produce a symbol:


> not-a-symbol-expression                            
not-a-symbol-expression: undefined;                  
 cannot reference an identifier before its definition
  in module: top-level                               
> #:not-a-keyword-expression                         
eval:2:0: #%datum: keyword misused as an expression  
  at: #:not-a-keyword-expression                     

Despite their similarities, keywords are used in a different way than identifiers or symbols. Keywords are intended for use unquoted as special markers in argument lists and in certain syntactic forms. For run-time flags and enumerations, use symbols instead of keywords. The example below illustrates the distinct roles of keywords and symbols.


> (define dir (find-system-path 'temp-dir)) ; not '#:temp-dir   
> (with-output-to-file (build-path dir "stuff.txt")             
    (lambda () (printf "example\n"))                            
    ; optional #:mode argument can be 'text or 'binary          
    #:mode 'text                                                
    ; optional #:exists argument can be 'replace, 'truncate, ...
    #:exists 'replace)                                          
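Keywords play the same marker role when defining your own procedures. In this sketch, greet and its #:greeting argument are hypothetical names used only for illustration.

```racket
#lang racket

;; greet is a hypothetical function; #:greeting is an optional keyword argument.
(define (greet name #:greeting [greeting "Hello"])
  (string-append greeting ", " name))

(greet "Alice")                  ; => "Hello, Alice"
(greet "Alice" #:greeting "Hi")  ; => "Hi, Alice"
(greet #:greeting "Hi" "Alice")  ; => "Hi, Alice", keyword arguments can appear in any position
```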

8. Pairs and Lists

A pair joins two arbitrary values. The cons procedure constructs pairs, and the car and cdr procedures extract the first and second elements of the pair, respectively. The pair? predicate recognizes pairs.

Some pairs print by wrapping parentheses around the printed forms of the two pair elements, putting a ' at the beginning and a . between the elements.


> (cons 1 2)         
'(1 . 2)             
> (cons (cons 1 2) 3)
'((1 . 2) . 3)       
> (car (cons 1 2))
1
> (cdr (cons 1 2))
2
> (pair? (cons 1 2))
#t

A list is a combination of pairs that creates a linked list. More precisely, a list is either the empty list null, or it is a pair whose first element is a list element and whose second element is a list. The list? predicate recognizes lists. The null? predicate recognizes the empty list.

A list normally prints as a ' followed by a pair of parentheses wrapped around the list elements.


> null
'()
> (cons 0 (cons 1 (cons 2 null)))
'(0 1 2)
> (list? null)
#t
> (list? (cons 1 (cons 2 null)))
#t
> (list? (cons 1 2))
#f
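The recursive definition of a list translates directly into recursive procedures. As a sketch, the hypothetical helper my-length below has one case for the empty list and one for a pair whose second element is a list; in practice the built-in length does the same job.

```racket
#lang racket

;; my-length is a hypothetical helper that mirrors the definition of a list.
(define (my-length lst)
  (cond
    [(null? lst) 0]                       ; the empty list
    [else (+ 1 (my-length (cdr lst)))]))  ; a pair whose cdr is a list

(my-length null)                             ; => 0
(my-length (cons 0 (cons 1 (cons 2 null))))  ; => 3
(my-length '("a" "b"))                       ; => 2
```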

A list or pair prints using list or cons when one of its elements cannot be written as a quoted value. For example, a value constructed with srcloc cannot be written using quote, and it prints using srcloc:

> (srcloc "file.rkt" 1 0 1 (+ 4 4))              
(srcloc "file.rkt" 1 0 1 8)                      
> (list 'here (srcloc "file.rkt" 1 0 1 8) 'there)
(list 'here (srcloc "file.rkt" 1 0 1 8) 'there)  
> (cons 1 (srcloc "file.rkt" 1 0 1 8))           
(cons 1 (srcloc "file.rkt" 1 0 1 8))             
> (cons 1 (cons 2 (srcloc "file.rkt" 1 0 1 8)))  
(list* 1 2 (srcloc "file.rkt" 1 0 1 8))          

See also list*.

As shown in the last example, list* is used to abbreviate a series of conses that cannot be abbreviated using list.

The write and display functions print a pair or list without a leading ', cons, list, or list*. There is no difference between write and display for a pair or list, except as they apply to elements of the list:


> (write (cons 1 2))      
(1 . 2)                   
> (display (cons 1 2))    
(1 . 2)                   
> (write null)
()
> (display null)
()
> (write (list 1 2 "3"))  
(1 2 "3")                 
> (display (list 1 2 "3"))
(1 2 3)                   

Among the most important predefined procedures on lists are those that iterate through the list's elements:

> (map (lambda (i) (/ 1 i))                                
       '(1 2 3))                                           
'(1 1/2 1/3)                                               
> (andmap (lambda (i) (i . < . 3))
         '(1 2 3))
#f
> (ormap (lambda (i) (i . < . 3))
         '(1 2 3))
#t
> (filter (lambda (i) (i . < . 3))
          '(1 2 3))
'(1 2)
> (foldl (lambda (v i) (+ v i))
         10
         '(1 2 3))
16
> (for-each (lambda (i) (display i))
            '(1 2 3))
123
> (member "Keys"                                           
          '("Florida" "Keys" "U.S.A."))                    
'("Keys" "U.S.A.")                                         
> (assoc 'where                                            
         '((when "3:30") (where "Florida") (who "Mickey")))
'(where "Florida")                                         
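One detail worth seeing in isolation is the order in which foldl visits elements compared with foldr. This sketch folds with cons so that the traversal order is visible in the result:

```racket
#lang racket

(foldl + 0 '(1 2 3))       ; => 6
(foldl cons '() '(1 2 3))  ; => '(3 2 1), elements are consumed left to right
(foldr cons '() '(1 2 3))  ; => '(1 2 3), elements are consumed right to left
```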

+[missing] in [missing] provides more on pairs and lists.

Pairs are immutable (contrary to Lisp tradition), and pair? and list? recognize immutable pairs and lists, only. The mcons procedure creates a mutable pair, which works with set-mcar! and set-mcdr!, as well as mcar and mcdr. A mutable pair prints using mcons, while write and display print mutable pairs with { and }:


> (define p (mcons 1 2))
> p                     
(mcons 1 2)             
> (pair? p)
#f
> (mpair? p)
#t
> (set-mcar! p 0)       
> p                     
(mcons 0 2)             
> (write p)             
{0 . 2}                 

+[missing] in [missing] provides more on mutable pairs.

9. Vectors

A vector is a fixed-length array of arbitrary values. Unlike a list, a vector supports constant-time access and update of its elements.

A vector prints similarly to a list (as a parenthesized sequence of its elements), but a vector is prefixed with # after ', or it uses vector if one of its elements cannot be expressed with quote.

For a vector as an expression, an optional length can be supplied. Also, a vector as an expression implicitly quotes the forms for its content, which means that identifiers and parenthesized forms in a vector constant represent symbols and lists.

+[missing] in [missing] documents the fine points of the syntax of vectors.


> #("a" "b" "c")                    
'#("a" "b" "c")                     
> #(name (that tune))               
'#(name (that tune))                
> #4(baldwin bruce)                 
'#(baldwin bruce bruce bruce)       
> (vector-ref #("a" "b" "c") 1)
"b"
> (vector-ref #(name (that tune)) 1)
'(that tune)                        

Like strings, a vector is either mutable or immutable, and vectors written directly as expressions are immutable.

Vectors can be converted to lists and vice versa via vector->list and list->vector; such conversions are particularly useful in combination with predefined procedures on lists. When allocating extra lists seems too expensive, consider using looping forms like for/fold, which recognize vectors as well as lists.


> (list->vector (map string-titlecase                          
                     (vector->list #("three" "blind" "mice"))))
'#("Three" "Blind" "Mice")                                     
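The sketch below illustrates two points from this section: updating a mutable vector in place, and iterating over a vector with for/fold without first converting it to a list. The variable name v is arbitrary.

```racket
#lang racket

(define v (make-vector 3 0))  ; a mutable vector of three zeros
(vector-set! v 1 'middle)     ; constant-time update
(vector-ref v 1)              ; => 'middle
(immutable? #(1 2 3))         ; => #t, a literal vector

;; for/fold iterates over the vector directly, with no intermediate list
(for/fold ([sum 0]) ([x (in-vector #(1 2 3))])
  (+ sum x))                  ; => 6
```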

+[missing] in [missing] provides more on vectors and vector procedures.

10. Hash Tables

A hash table implements a mapping from keys to values, where both keys and values can be arbitrary Racket values, and access and update to the table are normally constant-time operations. Keys are compared using equal?, eqv?, or eq?, depending on whether the hash table is created with make-hash, make-hasheqv, or make-hasheq.
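The effect of the key-comparison choice can be seen with two strings that have the same content but are distinct objects. This is a sketch; the table and key names are arbitrary.

```racket
#lang racket

(define by-equal (make-hash))    ; keys compared with equal?
(define by-eq (make-hasheq))     ; keys compared with eq?

;; string-copy returns a fresh string each time, so k1 and k2 are not eq?
(define k1 (string-copy "apple"))
(define k2 (string-copy "apple"))

(hash-set! by-equal k1 'red)
(hash-set! by-eq k1 'red)

(hash-ref by-equal k2 #f)  ; => 'red, same content
(hash-ref by-eq k2 #f)     ; => #f, different object
(hash-ref by-eq k1 #f)     ; => 'red, same object
```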


> (define ht (make-hash))               
> (hash-set! ht "apple" '(red round))   
> (hash-set! ht "banana" '(yellow long))
> (hash-ref ht "apple")                 
'(red round)                            
> (hash-ref ht "coconut")               
hash-ref: no value found for key        
  key: "coconut"                        
> (hash-ref ht "coconut" "not there")   
"not there"                             

The hash, hasheqv, and hasheq functions create immutable hash tables from an initial set of keys and values, in which each value is provided as an argument after its key. Immutable hash tables can be extended with hash-set, which produces a new immutable hash table in constant time.


> (define ht (hash "apple" 'red "banana" 'yellow))
> (hash-ref ht "apple")
'red
> (define ht2 (hash-set ht "coconut" 'brown))
> (hash-ref ht "coconut")
hash-ref: no value found for key
  key: "coconut"
> (hash-ref ht2 "coconut")
'brown
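Because hash-set and related functions return a new table instead of mutating the old one, immutable hash tables combine naturally with accumulating loops. As a sketch, the hypothetical helper count-words builds a frequency table with for/fold and hash-update:

```racket
#lang racket

;; count-words is a hypothetical helper built on immutable hash tables.
(define (count-words words)
  (for/fold ([counts (hash)]) ([w (in-list words)])
    ;; add1 is applied to the current count, or to 0 if the key is absent
    (hash-update counts w add1 0)))

(define counts (count-words '("apple" "banana" "apple")))
(hash-ref counts "apple")   ; => 2
(hash-ref counts "banana")  ; => 1
(hash-count counts)         ; => 2
```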

A literal immutable hash table can be written as an expression by using #hash (for an equal?-based table), #hasheqv (for an eqv?-based table), or #hasheq (for an eq?-based table). A parenthesized sequence must immediately follow #hash, #hasheq, or #hasheqv, where each element is a dotted key-value pair. The #hash, etc. forms implicitly quote their key and value sub-forms.


> (define ht #hash(("apple" . red)      
                   ("banana" . yellow)))
> (hash-ref ht "apple")
'red

+[missing] in [missing] documents the fine points of the syntax of hash table literals.

Both mutable and immutable hash tables print like immutable hash tables, using a quoted #hash, #hasheqv, or #hasheq form if all keys and values can be expressed with quote or using hash, hasheq, or hasheqv otherwise:


> #hash(("apple" . red)                     
        ("banana" . yellow))                
'#hash(("banana" . yellow) ("apple" . red)) 
> (hash 1 (srcloc "file.rkt" 1 0 1 (+ 4 4)))
(hash 1 (srcloc "file.rkt" 1 0 1 8))        

A mutable hash table can optionally retain its keys weakly, so each mapping is retained only so long as the key is retained elsewhere.


> (define ht (make-weak-hasheq))
> (hash-set! ht (gensym) "can you see me?")
> (collect-garbage)
> (hash-count ht)
0

Beware that even a weak hash table retains its values strongly, as long as the corresponding key is accessible. This creates a catch-22 dependency when a value refers back to its key, so that the mapping is retained permanently. To break the cycle, map the key to an ephemeron that pairs the value with its key (in addition to the implicit pairing of the hash table).

+[missing] in [missing] documents the fine points of using ephemerons.


> (define ht (make-weak-hasheq))
> (let ([g (gensym)])
    (hash-set! ht g (list g)))
> (collect-garbage)
> (hash-count ht)
1

> (define ht (make-weak-hasheq))
> (let ([g (gensym)])
    (hash-set! ht g (make-ephemeron g (list g))))
> (collect-garbage)
> (hash-count ht)
0

+[missing] in [missing] provides more on hash tables and hash-table procedures.

11. Boxes

A box is like a single-element vector. It can print as a quoted #& followed by the printed form of the boxed value. A #& form can also be used as an expression, but since the resulting box is constant, it has practically no use.


> (define b (box "apple"))
> b
'#&"apple"
> (unbox b)
"apple"
> (set-box! b '(banana boat))
> b
'#&(banana boat)
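A typical use of a box is as a small piece of shared mutable state that can be passed to a procedure. In this sketch, increment! and counter are hypothetical names.

```racket
#lang racket

;; increment! is a hypothetical helper that mutates the box passed to it.
(define (increment! b)
  (set-box! b (add1 (unbox b))))

(define counter (box 0))
(increment! counter)
(increment! counter)
(unbox counter)   ; => 2
(box? counter)    ; => #t
```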

+[missing] in [missing] provides more on boxes and box procedures.

12. Void and Undefined

Some procedures or expression forms have no need for a result value. For example, the display procedure is called only for the side-effect of writing output. In such cases the result value is normally a special constant that prints as #<void>. When the result of an expression is simply #<void>, the REPL does not print anything.

The void procedure takes any number of arguments and returns #<void>. (That is, the identifier void is bound to a procedure that returns #<void>, instead of being bound directly to #<void>.)


> (void)
> (void 1 2 3)
> (list (void))
'(#<void>)
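The void? predicate recognizes this constant, which makes it possible to confirm that side-effecting operations return it. A brief sketch, with b as an arbitrary box:

```racket
#lang racket

(void? (void))            ; => #t
(void? (display ""))      ; => #t, display returns #<void>

(define b (box 0))
(void? (set-box! b 1))    ; => #t, set-box! also returns #<void>
(void? (when #f 'never))  ; => #t, `when` with a false test returns #<void>
(void? 0)                 ; => #f
```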

The undefined constant, which prints as #<undefined>, is sometimes used as the result of a reference whose value is not yet available. In previous versions of Racket (before version 6.1), referencing a local binding too early produced #<undefined>; too-early references now raise an exception, instead.

The undefined result can still be produced in some cases by the shared form.

(define (fails)
  (define x x)
  x)
> (fails)
x: undefined;
 cannot use before initialization