Strings (part 1)

String Literals

Strings are groups of characters, put end to end like people waiting in a queue. String literals are strings which are contained within quotation marks. Unlike characters, which are enclosed within single quotation marks, strings are enclosed within double quotation marks. Here are some examples of strings:

"To be or not to be, that is the question."
"1 2 3 4 5 6 7 8"
"!\"£$%^&*()"
"A rolling stone gathers much speed."

Strings are not just letters, spaces and punctuation symbols. They can also contain other characters.

The ASCII Code

The entire range of characters that can be included in a string is specified in a universal code called the ASCII code (The American Standard Code for Information Interchange) - pronounced "Ass-kee". Most programming languages adhere to this list, which assigns a number to every printable character. I have alluded to it several times in the past when mentioning characters. It's about time that you actually got to see this list:

Number   Character            Number  Character  Number  Character
0 to 31  Control characters   64      @          96      `
32       space                65      A          97      a
33       !                    66      B          98      b
34       "                    67      C          99      c
35       #                    68      D          100     d
36       $                    69      E          101     e
37       %                    70      F          102     f
38       &                    71      G          103     g
39       '                    72      H          104     h
40       (                    73      I          105     i
41       )                    74      J          106     j
42       *                    75      K          107     k
43       +                    76      L          108     l
44       ,                    77      M          109     m
45       -                    78      N          110     n
46       .                    79      O          111     o
47       /                    80      P          112     p
48       0                    81      Q          113     q
49       1                    82      R          114     r
50       2                    83      S          115     s
51       3                    84      T          116     t
52       4                    85      U          117     u
53       5                    86      V          118     v
54       6                    87      W          119     w
55       7                    88      X          120     x
56       8                    89      Y          121     y
57       9                    90      Z          122     z
58       :                    91      [          123     {
59       ;                    92      \          124     |
60       <                    93      ]          125     }
61       =                    94      ^          126     ~
62       >                    95      _          127     Delete
63       ?

You will see that all the symbols you are likely to need are here. Even the digits are represented as symbols. However, although character 48 represents the digit "0", it wouldn't take the value 0 in any calculations. Here's what I mean:

#include <iostream>
using namespace std;

int main ()
  {  char x = '0';  /* Stored as 48 in memory */
     char y = '1';  /* Stored as 49 in memory */
     int z;

     z = x + y;     /* This is not the same as 0 + 1 */
     cout << z << endl;  /* Displays 97 */
     return 0;
  }

The values x and y are added to give the value stored in z. Because z is an int variable, rather than a char, the number 97 is displayed. If z had also been a char variable, then the character a (the ASCII character 97) would have been displayed. It is, perhaps, wrong to think of the characters from 48 to 57 as digits. They should really be thought of as symbols that happen to look like the digits.

The character codes from 128 up to 255 aren't specified by the ASCII code, which stops at 127. They can be used for different things on different computers. Typically they would be used to store letters from other alphabets (such as é) or mathematical symbols.

There are other character codes. The runner-up is the EBCDIC code, which was developed by IBM and is used mainly on IBM mainframe computers.

Escape Sequences

Some characters can only be included in a string literal with the help of an extra character, namely \. This introduces an escape sequence. Suppose you want to include the character " in a string. You can't do this directly, for example:

cout << "They called the man "Honest" Ron." << endl;

As soon as C++ comes across the " symbol before the letter H, it assumes that this is the end of the string. It then can't interpret the Honest properly. To include the " symbol, put \ in front of it. This is not displayed as a \ character itself, but does tell the compiler that a special symbol is about to follow. In this case, the example above should be written as follows:

cout << "They called the man \"Honest\" Ron." << endl;

This is called an Escape Sequence, because special characters being sent to a printer were always preceded by the Escape character (ASCII code 27). You have already met another of these escape sequences in one of the first sections, namely \n, which moves to a new line. There are others, listed here:

\n    New line
\b    Backspace
\t    Horizontal tab
\v    Vertical tab
\"    Double quotation marks
\'    Single quotation marks
\a    Beep!
\\    The \ character itself

The backspace character backs up one character on the screen. Here's an example:

#include <iostream>
using namespace std;

int main ()
  { cout << "ABCD\bEFGH" << endl;
     /* This displays ABCEFGH */
    return 0;
  }

The program displays ABCD, then backs up over the D and over-prints it with EFGH. The \\ is necessary as otherwise you could not include the \ character in a string literal - the compiler would think it were the start of an escape sequence. \a is called Beep! as it causes the computer to make a beeping noise.

The old way of implementing strings in C++

There are two ways of implementing string variables in C++. The first is to store them as arrays of characters. This was the way in which the original C programming language stored strings, so strings stored this way are sometimes called C-strings. If you wanted to store the phrase "To be or not to be" (a total of 18 characters including spaces), then you would need an array of 19 elements. The last one stores the special character \0 which indicates that you have reached the end of the string. Here's how a string would be defined:

char s[20] = "hello";
cout << s[0];   /* This would display "h" */
cout << s[1];   /* This would display "e" */
cout << s[2];   /* This would display "l" etc. */

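You will see that this string is defined during declaration, just as any other variable could be. Unlike other variables, however, an array cannot be given a new value with = once it has been declared, so the compiler would reject s = "hello";. Instead, you copy the characters in using the strcpy() function:

char s[20];
strcpy(s, "hello");   /* Needs #include <cstring> */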

In either case, one letter of the string is stored in each array element starting at 0, like all arrays. Element s[3] contains "l", s[4] contains "o", and s[5] contains the character \0. If the array was given its value in the declaration, the remaining elements (s[6] to s[19]) are also filled with \0; if it was declared without a value and filled in afterwards, they store undefined values.

The segment that displayed s[0], s[1] and s[2] shows the string letter by letter, with the letters side by side, since we haven't specified that we want a newline in between the characters. However, we don't need to do it quite so clumsily:

char s[20] = "hello";
cout << s;   /* This displays the entire string */

You can alter the string letter by letter, just as you would alter the elements of any other array:

char s[20] = "hello";
s[3] = 'p';
s[4] = '!';
cout << s;   /* This displays help! */

You can also use cin to read the entire string in one go:

char person[40];
cout << "Please type your name : ";
cin >> person;
cout << "Hello, " << person << ", how are you?" << endl;

Please type your name : Richard
Hello, Richard, how are you?

Warning! However, I must mention one word of caution. If you use cin to read a string, it stops reading your input as soon as it reaches a space! Any part of the string after (and including) the space is left unread! For example, if you re-ran the example program above and typed a string with a space in it, you would find that it chops the rest of the string off:

Please type your name : Richard Bowles
Hello, Richard, how are you?

getline()

You can get round the problem of not being able to read beyond a space using cin by using a special function which is linked to cin called getline(). This is used as follows:

cin.getline(destination_string,max_length);

In this case, the destination string is a string variable (a char array), declared the normal way. The second parameter is the maximum length, which can be either a variable or a constant number like 100. Because one element is needed for the \0 character, getline() stores at most one fewer characters than this number, so it is usual to pass the size of the array. It reads the whole of the line that you type, all the way up to the point where you press the Enter key or until the limit is reached, whichever comes sooner. The name-reading program from earlier would be rewritten as:

char person[40];
cout << "Please type your name : ";
cin.getline(person,40);
cout << "Hello, " << person << ", how are you?" << endl;

I didn't tell you about cin.getline() when we were using cin to read numeric variables as they don't contain spaces. The problem only arises when you read in strings.

Arrays of strings

Using this old C-string notation, a string is implemented as an array of char. If you want to put the strings themselves in an array, then the char array will have to be multi-dimensional. Here, for instance, is an array of names:

char dwarves[7][10] =
  { "Happy",     /* dwarves[0] */
    "Grumpy",    /* dwarves[1] */
    "Dopey",     /* dwarves[2] */
    "Sleepy",    /* dwarves[3] */
    "Sneezy",    /* dwarves[4] */
    "Doc",       /* dwarves[5] */
    "Bashful"    /* dwarves[6] */
  };

You can access this array as you would any other multi-dimensional array, for instance, dwarves[3][0] would be the first letter of Sleepy, i.e. S (Remember, count from 0 - it should be becoming second nature by now!) and dwarves[6][4] would be f. Again, each of these strings will have the \0 character, so dwarves[5][3] will be \0 and so will dwarves[2][5].

Here is a program which takes the bubblesort that you met in a previous section and uses it to sort the dwarves into alphabetical order.

int i,j;        /* These are used as loop variables */
char temp[10];  /* Used in swopping the values */

cout << "Before the sort, the values are " << endl;
for (i = 0; i < 7; i++)
  cout << dwarves[i] << " ";
cout << endl;

for (i = 0; i < 6; i++)  /* Go through 6 times */
 for (j = 0; j < 6; j++)  /* Check all elements except
                                         last */
  if (dwarves[j][0] > dwarves[j+1][0] || (dwarves[j][0] ==
      dwarves[j+1][0] && dwarves[j][1] > dwarves[j+1][1]))
    { strcpy(temp, dwarves[j]);   /* Arrays can't be swopped with =, */
      strcpy(dwarves[j], dwarves[j+1]);  /* so copy them with strcpy() */
      strcpy(dwarves[j+1], temp);        /* from <cstring> */
     }

cout << "After the sort, the values are " << endl;
for (i = 0; i < 7; i++)
  cout << dwarves[i] << " ";
cout << endl;

I have made a slight change to the comparison routine to take into account the fact that we are comparing strings. With numbers, you only have to do one comparison, but strings consist of several letters. There are several dwarves whose names start with the same letter. The comparison says "If the first letters are out of order, then do the swop. Alternatively, if the first letters are equal, then look at the second letters, and do the swop if they are out of order."

By the first letter, we mean element [0] of the string. By the second letter, we mean element [1] of the string. Go through the word description of the comparison and match it bit-by-bit to the comparison in the if statement. Even so, there is a slight flaw, since two of the vertically challenged individuals, namely Dopey and Doc, would still slip through the net, as their names start with the same two letters. There is a better way of comparing strings, namely the compare() function, but we haven't come on to that yet.

