Lesson 16: Recursion
 

Recursion is a programming technique that allows the programmer to express operations in terms of themselves. In C++, this takes the form of a function that calls itself. A useful way to think of recursive functions is to imagine them as a process being performed where one of the instructions is to "repeat the process". This makes it sound very similar to a loop because it repeats the same code, and in some ways it is similar to looping. On the other hand, recursion makes it easier to express ideas in which the result of the recursive call is necessary to complete the task. Of course, it must be possible for the "process" to sometimes be completed without the recursive call. One simple example is the idea of building a wall that is ten feet high; if I want to build a ten-foot-high wall, then I will first build a nine-foot-high wall, and then add an extra foot of bricks. Conceptually, this is like saying the "build wall" function takes a height and, if that height is greater than one, first calls itself to build a lower wall, and then adds one foot of bricks. 


A simple example of recursion would be:
 
void recurse()
{
  recurse(); //Function calls itself
}

int main()
{
  recurse(); //Sets off the recursion
}
This program will not continue forever, however. The computer keeps function calls on a stack and once too many are called without ending, the program will crash. Why not write a program to see how many times the function is called before the program terminates?
 
#include <iostream>

using namespace std;

void recurse ( int count ) // Each call gets its own count
{
  cout<< count <<"\n";
  // It is not necessary to increment count since each function's
  //  variables are separate (so each count will be initialized one greater)
  recurse ( count + 1 );
}

int main()
{
  recurse ( 1 ); //First function call, so it starts at one        
}
This simple program will show the number of times the recurse function has been called by initializing each individual function call's count variable to one greater than the previous call's, by passing in count + 1. Keep in mind that it is not one function restarting itself; it is hundreds of function calls that are each unfinished, with the last one calling a new recurse function. 

It can be thought of like the Russian dolls that always have a smaller doll inside. Each doll calls another doll, and you can think of the size being a counter variable that is being decremented by one. 

Think of a really tiny doll, the size of a few atoms. You can't get any smaller than that, so there are no more dolls. Normally, a recursive function will have a variable that performs a similar action; one that controls when the function will finally exit. The condition where the function will not call itself is termed the base case of the function. Basically, it is an if-statement that checks some variable for a condition (such as a number being less than zero, or greater than some other number) and if that condition is true, it will not allow the function to call itself again. (Or, it could check if a certain condition is true and only then allow the function to call itself). 

A quick example:
 
void doll ( int size )
{
  if ( size == 0 )   // No doll can be smaller than 1 atom (10^0==1) so doesn't call itself
    return;          // Return does not have to return something, it can be used
                     //  to exit a function
  doll ( size - 1 ); // Decrements the size variable so the next doll will be smaller.
}
int main()
{
  doll ( 10 ); //Starts off with a large doll (it's a logarithmic scale)
}
This program ends when size equals zero. This is a good base case, but if it is not properly set up, it is possible to have a base case that is always true (or always false). 

Once a function has called itself, it will be ready to go to the next line after the call. It can still perform operations. One function you could write could print out the numbers 123456789987654321. How can you use recursion to write a function to do this? Simply have it keep incrementing a variable passed in, and then output the variable...twice, once before the function recurses, and once after...
 
void printnum ( int begin )
{
  cout<< begin;
  if ( begin < 9 )         // The base case is when begin reaches 9,
  {                        //  because then the function does not recurse
      printnum ( begin + 1 ); 
  }
  cout<< begin;            // Outputs begin a second time, after all of the
                           //  deeper calls have finished their output
}
This function works because it will go through and print the numbers begin to 9, and then as each printnum function terminates it will continue printing the value of begin in each function from 9 to begin. 

This is just the beginning of the usefulness of recursion. Here's a little challenge: use recursion to write a program that returns the factorial of any number greater than 0. (Factorial is number*(number-1)*(number-2)*...*1). 

Hint: Recursively find the factorial of the smaller numbers first, i.e., it takes a number, finds the factorial of the previous number, and multiplies the number times that factorial...have fun. :-) 





Lesson 15: Singly linked lists


Linked lists are a way to store data with structures so that the programmer can automatically create a new place to store data whenever necessary. Specifically, the programmer writes a struct or class definition that contains variables holding information about something, and then has a pointer to a struct of its own type. Each of these individual structs or classes in the list is commonly known as a node. 

Think of it like a train. The programmer always stores the first node of the list. This would be the engine of the train. The pointer is the connector between cars of the train. Every time the train adds a car, it uses the connectors to add a new car. This is like a programmer using the keyword new to create a pointer to a new struct or class. 

In memory it is often described as looking like this:
 
----------        ----------
- Data   -        - Data   -    
----------        ----------   
- Pointer- - - -> - Pointer-  
----------        ----------
The representation isn't completely accurate, but it will suffice for our purposes. Each of the big blocks is a struct (or class) that has a pointer to another one. Remember that the pointer only stores the memory location of something; it is not that thing, so the arrow goes to the next one. At the end, there is nothing for the pointer to point to, so it should be a null pointer (or point to a dummy node) to prevent it from accidentally pointing to an arbitrary location in memory (which is very bad). 

So far we know what the node struct should look like:
 
struct node {
  int x;
  node *next;
};

int main()
{
  node *root;      // This will be the unchanging first node

  root = new node; // Now root points to a node struct
  root->next = 0;  // The node root points to has its next pointer
                   //  set equal to a null pointer
  root->x = 5;     // By using the -> operator, you can modify the node
                   //  a pointer (root in this case) points to.
}
This so far is not very useful for doing anything. It is necessary to understand how to traverse (go through) the linked list before going further. 

Think back to the train. Let's imagine a conductor who can only enter the train through the engine, and can walk through the train down the line as long as the connector connects to another car. This is how the program will traverse the linked list. The conductor will be a pointer to node, and it will first point to root, and then, if the root's pointer to the next node is pointing to something, the "conductor" (not a technical term) will be set to point to the next node. In this fashion, the list can be traversed. Now, as long as there is a pointer to something, the traversal will continue. Once it reaches a null pointer (or dummy node), meaning there are no more nodes (train cars), it will be at the end of the list, and a new node can subsequently be added if so desired. 

Here's what that looks like:
 
struct node {
  int x;
  node *next;
};

int main()
{
  node *root;       // This won't change, or we would lose the list in memory
  node *conductor;  // This will point to each node as it traverses the list

  root = new node;  // Sets it to actually point to something
  root->next = 0;   //  Otherwise it would not work well
  root->x = 12;
  conductor = root; // The conductor points to the first node
  if ( conductor != 0 ) {
    while ( conductor->next != 0)
      conductor = conductor->next;
  }
  conductor->next = new node;  // Creates a node at the end of the list
  conductor = conductor->next; // Points to that node
  conductor->next = 0;         // Prevents it from going any further
  conductor->x = 42;
}
That is the basic code for traversing a list. The if statement ensures that there is something to begin with (a first node). In the example it will always be so, but if the code were changed, it might not be true. If the if statement is true, then it is okay to try to access the node pointed to by conductor. The while loop will continue as long as the current node's next pointer points to another node. The conductor simply moves along. It changes what it points to by taking the address stored in conductor->next. 

Finally, the code at the end can be used to add a new node to the end. Once the while loop has finished, the conductor will point to the last node in the list. (Remember the conductor of the train will move on until there is nothing to move on to? It works the same way in the while loop.) At that point conductor->next is null, so it is okay to allocate a new area of memory for it to point to. Then the conductor traverses one more element (like a train conductor moving on to the newly added car) and makes sure that it has its pointer to next set to 0 so that the list has an end. The 0 functions like a period; it means there is nothing more beyond. Finally, the new node has its x value set. (It can be set through user input. I simply wrote in the '=42' as an example.) 

To print a linked list, the traversal function is almost the same. It is necessary to ensure that the last element is printed after the while loop terminates. 

For example:
 
conductor = root;
if ( conductor != 0 ) { //Makes sure there is a place to start
  while ( conductor->next != 0 ) {
    cout<< conductor->x;
    conductor = conductor->next;
  }
  cout<< conductor->x;
}
The final output is necessary because the while loop stops as soon as the conductor reaches the last node, without printing it, so the contents of that last node still need to be output. The last output statement deals with this. Because we have a pointer to the beginning of the list (root), we can avoid this redundancy by allowing the conductor to walk off of the back of the train. Bad for the conductor (if it were a real person), but the code is simpler as it also allows us to remove the initial check for null (if root is null, then conductor will be immediately set to null and the loop will never begin):
 
conductor = root;
while ( conductor != 0 ) {
  cout<< conductor->x;
  conductor = conductor->next;
}




Lesson 14: Accepting command line arguments


In C++ it is possible to accept command line arguments. Command-line arguments are given after the name of a program in command-line operating systems like DOS or Linux, and are passed in to the program from the operating system. To use command line arguments in your program, you must first understand the full declaration of the main function, which until now has accepted no arguments. In fact, main can accept two arguments: one is the number of command line arguments, and the other is a full list of all of the command line arguments. 


The full declaration of main looks like this:
 int main ( int argc, char *argv[] )
The integer, argc is the ARGument Count (hence argc). It is the number of arguments passed into the program from the command line, including the name of the program. 

The array of character pointers is the listing of all the arguments. argv[0] is the name of the program, or an empty string if the name is not available. After that, every element number less than argc is a command line argument. You can use each argv element just like a string, or use argv as a two dimensional array. argv[argc] is a null pointer. 

How could this be used? Almost any program that wants its parameters to be set when it is executed would use this. One common use is to write a function that takes the name of a file and outputs the entire text of it onto the screen.
#include <fstream>
#include <iostream>

using namespace std;

int main ( int argc, char *argv[] )
{
  if ( argc != 2 ) // argc should be 2 for correct execution
    // We print argv[0] assuming it is the program name
    cout<<"usage: "<< argv[0] <<" <filename>\n";
  else {
    // We assume argv[1] is a filename to open
    ifstream the_file ( argv[1] );
    // Always check to see if file opening succeeded
    if ( !the_file.is_open() )
      cout<<"Could not open file\n";
    else {
      char x;
      // the_file.get ( x ) returns false if the end of the file
      //  is reached or an error occurs
      while ( the_file.get ( x ) )
        cout<< x;
    }
    // the_file is closed implicitly here
  }
}
This program is fairly simple. It incorporates the full version of main. It first checks to ensure the user added the second argument, theoretically a file name. The program then checks to see if the file is valid by trying to open it. This is a standard operation that is effective and easy. If the file is valid, it gets opened in the process. The code is largely self-explanatory and is littered with comments, so you should have no trouble understanding its operation this far into the tutorial. :-) 


Lesson 13: More on Functions

In lesson 4 you were given the basic information on functions. However, I left out one item of interest. That item is the inline function. Inline functions are not very important, but it is good to understand them. The basic idea is to save time at a cost in space. Inline functions are a lot like a placeholder. Once you define an inline function, using the 'inline' keyword, whenever you call that function the compiler will replace the function call with the actual code from the function. 

How does this make the program go faster? Simple: function calls are more time consuming than writing all of the code without functions. To go through your program and replace a function you have used 100 times with the code from the function would be time consuming and not too bright. Of course, by using the inline function to replace the function calls with code you may also greatly increase the size of your program. 

Using the inline keyword is simple: just put it before the function's return type. Then, when you use that function, call it as if it were a non-inline function. 

For example:
 
#include <iostream>

using namespace std;

inline void hello()
{ 
  cout<<"hello";
}
int main()
{
  hello(); //Call it like a normal function...
  cin.get();
}
However, once the program is compiled, the call to hello(); will be replaced by the code making up the function. 

A WORD OF WARNING: Inline functions are very good for saving time, but if you use them too often or with large functions you will have a tremendously large program. Sometimes large programs are actually less efficient, and therefore they will run more slowly than before. Inline functions are best for small functions that are called often. 

Finally, note that the compiler may choose, in its infinite wisdom, to ignore your attempt to inline a function. So if you do make a mistake and inline a monster fifty-line function that gets called thousands of times, the compiler may ignore you. 

In the future, we will discuss inline functions in terms of C++ classes. Now that you understand the concept I will feel more comfortable using inline functions in later tutorials. 




Lesson 12: Introduction to Classes 

C++ is a bunch of small additions to C, with a few major additions. One major addition is the object-oriented approach (another is support for generic programming, which we'll cover later). As the name object-oriented programming suggests, this approach deals with objects. Of course, these are not real-life objects themselves. Instead, these objects are the essential definitions of real world objects. Classes are collections of data related to a single object type. Classes not only include information regarding the real world object, but also functions to access the data, and classes possess the ability to inherit from other classes. (Inheritance is covered in a later lesson.) 

If a class is a house, then the functions will be the doors and the variables will be the items inside the house. The functions usually will be the only way to modify the variables in this structure, and they are usually the only way even to access the variables in this structure. This might seem silly at first, but the idea is to make programs more modular; the principle itself is called "encapsulation". The key idea is that the outside world doesn't need to know exactly what data is stored inside the class--it just needs to know which functions it can use to access that data. This allows the implementation to change more easily because nobody should have to rely on it except the class itself. 

The syntax for these classes is simple. First, you put the keyword 'class' and then the name of the class. Our example will use the name Computer. Then you put an opening brace. Before putting down the different variables, it is necessary to put the degree of restriction on the variable. There are three levels of restriction. The first is public, the second protected, and the third private. For now, all you need to know is that the public restriction allows any part of the program, including parts outside the class, to access the functions and variables specified as public. The protected restriction prevents functions outside the class from accessing the variable. The private restriction is similar to protected (we'll see the difference later when we look at inheritance). The syntax for declaring these access restrictions is merely the restriction keyword (public, private, protected) and then a colon. Finally, you put the different variables and functions you want to be part of the class (you will usually put only the function prototypes). Then you put a closing brace and a semicolon. Keep in mind that you still must end each function prototype with a semicolon. 

Let's look at these different access restrictions for a moment. Why would you want to declare something private instead of public? The idea is that some parts of the class are intended to be internal to the class--only for the purpose of implementing features. On the other hand, some parts of the class are supposed to be available to anyone using the class--these are the public class functions. Think of a class as though it were an appliance like a microwave: the public parts of the class correspond to the parts of the microwave that you can use on an everyday basis--the keypad, the start button, and so forth. On the other hand, some parts of the microwave are not easily accessible, but they are no less important--it would be hard to get at the microwave generator. These would correspond to the protected or private parts of the class--the things that are necessary for the class to function, but that nobody who uses the class should need to know about. The great thing about this separation is that it makes the class easier to use (who would want to use a microwave where you had to know exactly how it works in order to use it?) The key idea is to separate the interface you use from the way the interface is supported and implemented. 

Classes must always contain two functions: a constructor and a destructor. The syntax for them is simple: the class name denotes a constructor, a ~ before the class name is a destructor. The basic idea is to have the constructor initialize variables, and to have the destructor clean up after the class, which includes freeing any memory allocated. If it turns out that you don't need to actually perform any initialization, then you can allow the compiler to create a "default constructor" for you. Similarly, if you don't need to do anything special in the destructor, the compiler can write it for you too! 

When the programmer declares an instance of the class, the constructor will be automatically called. The only time the destructor is called is when the instance of the class is no longer needed--either when the program ends, the instance reaches the end of its scope, or when its memory is deallocated using delete (if you don't understand all of that, don't worry; the key idea is that destructors are always called when the instance is no longer usable). Keep in mind that neither constructors nor destructors return values! This means you do not want to (and cannot) return a value in them. 

Note that you generally want your constructor and destructor to be made public so that your class can be created! The constructor is called when an object is created, but if the constructor is private, it cannot be called so the object cannot be constructed. This will cause the compiler to complain. 

The syntax for defining a function that is a member of a class outside of the actual class definition is to put the return type, then put the class name, two colons, and then the function name. This tells the compiler that the function is a member of that class. 

For example:
 
#include <iostream>

using namespace std;

class Computer // Standard way of defining the class
{
public:
  // This means that all of the functions below this(and any variables)
  //  are accessible to the rest of the program.
  //  NOTE: That is a colon, NOT a semicolon...
  Computer();
  // Constructor
  ~Computer();
  // Destructor
  void setspeed ( int p );
  int readspeed();
protected:
  // This means that all the variables under this, until a new type of
  //  restriction is placed, will only be accessible to other functions in the
  //  class.  NOTE: That is a colon, NOT a semicolon...
  int processorspeed;
};
// Do Not forget the trailing semi-colon

Computer::Computer()
{
  //Constructors can accept arguments, but this one does not
  processorspeed = 0;
}

Computer::~Computer()
{
  //Destructors do not accept arguments
}

void Computer::setspeed ( int p )
{
  // To define a function outside put the name of the class
  //  after the return type and then two colons, and then the name
  //  of the function.
  processorspeed = p;
}
int Computer::readspeed()  
{
  // The two colons simply tell the compiler that the function is part
  //  of the class
  return processorspeed;
}

int main()
{
  Computer compute;  
  // To create an 'instance' of the class, simply treat it like you would
  //  a structure.  (An instance is simply when you create an actual object
  //  from the class, as opposed to having the definition of the class)
  compute.setspeed ( 100 ); 
  // To call functions in the class, you put the name of the instance,
  //  a period, and then the function name.
  cout<< compute.readspeed();
  // See above note.
}
This introduction is far from exhaustive and, for the sake of simplicity, recommends practices that are not always the best option. For more detail, I suggest asking questions on our forums and getting a book recommended by our book reviews.







Lesson 11: Typecasting


Typecasting is making a variable of one type, such as an int, act like another type, a char, for one single operation. To typecast something, simply put the type of variable you want the actual variable to act as inside parentheses in front of the actual variable. (char)a will make 'a' function as a char. 


For example:
 
#include <iostream> 

using namespace std;

int main()       
{
  cout<< (char)65 <<"\n"; 
  // The (char) is a typecast, telling the computer to interpret the 65 as a
  //  character, not as a number.  It is going to give the character output of 
  //  the equivalent of the number 65 (It should be the letter A for ASCII).
  cin.get();
}
One use for typecasting is when you want to use the ASCII characters. For example, what if you want to create your own chart of all 256 character codes (standard ASCII defines the first 128; the rest depend on the system's character set)? To do this, you will need to use a typecast to print out the integer as its character equivalent.
 
#include <iostream>

using namespace std;

int main()
{
  for ( int x = 0; x < 256; x++ ) {
    cout<< x <<". "<< (char)x <<" "; 
    //Note the use of the int version of x to 
    // output a number and the use of (char) to 
    // typecast the x into a character 	
    // which outputs the ASCII character that 
    // corresponds to the current number
  }
  cin.get();
}
The typecast described above is a C-style cast. C++ supports two other kinds. First is the function-style cast:
 
int main()       
{
  cout<< char ( 65 ) <<"\n"; 
  cin.get();
}
This is more like a function call than a cast as the type to be cast to is like the name of the function and the value to be cast is like the argument to the function. Next is the named cast, of which there are four:
 
int main()       
{
  cout<< static_cast<char> ( 65 ) <<"\n"; 
  cin.get();
}
static_cast is similar in function to the other casts described above, but the name makes it easier to spot and less tempting to use since it tends to be ugly. Typecasting should be avoided whenever possible. The other three types of named casts are const_cast, reinterpret_cast, and dynamic_cast. They are of no use to us at this time.

Typecasts in practice

So when exactly would a typecast come in handy? One use of typecasts is to force the correct type of mathematical operation to take place. It turns out that in C and C++ (and other programming languages), the result of the division of integers is itself treated as an integer: for instance, 3/5 becomes 0! Why? Well, 3/5 is less than 1, and integer division ignores the remainder. 

On the other hand, it turns out that division between floating point numbers, or even between one floating point number and an integer, is sufficient to keep the result as a floating point number. So if we were performing some kind of fancy division where we didn't want truncated values, we'd have to cast one of the variables to a floating point type. For instance, static_cast<float>(3)/5 comes out to .6, as you would expect! 

When might this come up? It's often reasonable to store two values in integers. For instance, if you were tracking heart patients, you might have functions to compute their age in years and the number of times they'd come in for heart pain. One operation you might conceivably want to perform is to compute the number of times per year of life someone has come in to see their physician about heart pain. What would this look like?
 
/* magical function returns the age in years */
int age = getAge();  
/* magical function returns the number of visits */
int pain_visits = getVisits(); 

float visits_per_year = pain_visits / age;
The problem is that when this program is run, visits_per_year will be zero unless the patient had an awful lot of visits to the doc. The way to get around this problem is to cast one of the values being divided so it gets treated as a floating point number, which will cause the compiler to treat the expression as if it were to result in a floating point number:
 
float visits_per_year = pain_visits / static_cast<float>(age);
/* or */
float visits_per_year = static_cast<float>(pain_visits) / age;
This would cause the correct values to be stored in visits_per_year. Can you think of another solution to this problem (in this case)? 





Lesson 10: C++ File I/O


This is a slightly more advanced topic than what I have covered so far, but I think that it is useful. File I/O is reading from and writing to files. This lesson will only cover text files, that is, files that are composed only of ASCII text. 

C++ has two basic classes to handle files, ifstream and ofstream. To use them, include the header file fstream. The ifstream class handles file input (reading from files), and ofstream handles file output (writing to files). The way to declare an instance of the ifstream or ofstream class is:
ifstream a_file;
or
ifstream a_file ( "filename" );
The constructor for both classes will actually open the file if you pass the name as an argument. As well, both classes have an open command (a_file.open()) and a close command (a_file.close()). You aren't required to call close yourself, as the file is closed automatically when the stream object is destroyed (for example, when it goes out of scope or the program terminates), but if you need to close the file long before then, it is useful. 

The beauty of the C++ method of handling files rests in the simplicity of the actual functions used in basic input and output operations. Because C++ supports overloading operators, it is possible to use << and >> in front of the instance of the class as if it were cout or cin. In fact, file streams can be used exactly the same as cout and cin after they are opened. 

For example:
 
#include <fstream>
#include <iostream>

using namespace std;

int main()
{
  char str[10];

  //Creates an instance of ofstream, and opens example.txt
  ofstream a_file ( "example.txt" );
  // Outputs to example.txt through a_file
  a_file<<"This text will now be inside of example.txt";
  // Close the file stream explicitly
  a_file.close();
  //Opens for reading the file
  ifstream b_file ( "example.txt" );
  //Reads one string from the file
  b_file>> str;
  //Should output 'This'
  cout<< str <<"\n";
  cin.get();    // wait for a keypress
  // b_file is closed implicitly here
}
The default mode for opening a file with ofstream's constructor is to create it if it does not exist, or delete everything in it if something does exist in it. If necessary, you can give a second argument that specifies how the file should be handled. They are listed below:
 
ios::app   -- Append to the file
ios::ate   -- Set the current position to the end
ios::trunc -- Delete everything in the file
For example:
 
ofstream a_file ( "test.txt", ios::app );
This will open the file without destroying the current contents and allow you to append new data. When opening files, be very careful not to use them if the file could not be opened. This can be tested for very easily:
 
ifstream a_file ( "example.txt" );

if ( !a_file.is_open() ) {
  // The file could not be opened
}
else {
  // Safely use the file stream
}







Lesson 9: C Strings



In C++ there are two types of strings: C-style strings and C++-style strings. This lesson will discuss C-style strings. C-style strings are really arrays of characters, but there are special functions for working with them, such as adding to strings, finding the length of strings, and checking to see if strings match. A string is a sequence of characters strung together. For example, "This" is a string. A single char such as 'a' is not a string, although a string may contain just one character. 


Strings are arrays of chars. String literals are text surrounded by double quotation marks.
 
"This is a static string"
To declare a string of 49 letters, you would want to say:
 
char string[50];
This declares an array of 50 characters. Do not forget that array indices begin at zero, not one. In addition, a string ends with a null character, literally a '\0' character, so there is always one extra character on the end of a string. It is like a period at the end of a sentence: it is not counted as a letter, but it still takes up a space. That is why a fifty char array can hold at most 49 letters plus the one null character that terminates the string. 

TAKE NOTE: A pointer declared as char *arry; can also be used as a string. If you have read the tutorial on pointers, you can do something such as:
 
arry = new char[256];
which allows you to access arry just as if it were an array. Keep in mind that to use delete you must put [] between delete and arry to tell it to free all 256 bytes of memory allocated. 

For example:
 
delete [] arry;
Strings are useful for holding all types of long input. If you want the user to input his or her name, you must use a string. Using cin>> to input a string works, but it will terminate the string after it reads the first space. The best way to handle this situation is to use the function cin.getline. Technically cin is an object of a class (a beast similar to a structure), and you are calling one of its member functions. The most important thing, however, is to understand how to use the function. 

The prototype for that function is:
 
istream& getline(char *buffer, int length, char terminal_char);
The char *buffer is a pointer to the first element of the character array, so that it can actually be used to access the array. The int length is the size of the array: getline stores at most length - 1 characters and then adds the terminating '\0'. The char terminal_char means that input will stop if the user inputs whatever that character is. Keep in mind that it will discard whatever the terminal character is. 

It is possible to make a function call of cin.getline(arry, 50); without the terminal character, in which case '\n' is used. Note that '\n' is the way of actually telling the compiler you mean a new line, i.e. someone hitting the enter key. 

For example:
 
#include <iostream>

using namespace std;

int main()
{
  char string[256];                               // A nice long string

  cout<<"Please enter a long string: ";
  cin.getline ( string, 256, '\n' );              // Input goes into string
  cout<<"Your long string was: "<< string <<endl;
  cin.get();
}
Remember that you are actually passing the address of the array when you pass string because arrays do not require an address operator (&) to be used to pass their address. Other than that, you could make '\n' any character you want (make sure to enclose it with single quotes to inform the compiler of its character status) to have the getline terminate on that character. 

cstring is a header file that contains many functions for manipulating strings. One of these is the string comparison function.
 
int strcmp ( const char *s1, const char *s2 );
strcmp will accept two strings. It will return an integer. This integer will either be:
 
Negative if s1 is less than s2.
Zero if s1 and s2 are equal.
Positive if s1 is greater than s2.
strcmp is case sensitive. Like the other functions described here, it is passed the addresses of the character arrays so that it can access them.
 
char *strcat ( char *dest, const char *src );
strcat is short for string concatenate, which means to add to the end, or append. It adds the second string to the first string. It returns a pointer to the concatenated string. Beware of this function: it assumes that dest is large enough to hold its own contents, the entire contents of src, and the terminating '\0'.
 
char *strcpy ( char *dest, const char *src );
strcpy is short for string copy, which means it copies the entire contents of src into dest. The contents of dest after strcpy will be exactly the same as src such that strcmp ( dest, src ) will return 0.
 
size_t strlen ( const char *s );
strlen will return the length of a string, not counting the terminating character ('\0'). The size_t is nothing to worry about. Just treat it as an integer that cannot be negative, which it is. 

Here is a small program using many of the previously described functions:
 
#include <iostream> //For cout
#include <cstring>  //For the string functions

using namespace std;

int main()
{
  char name[50];
  char lastname[50];
  char fullname[100]; // Big enough to hold both name and lastname
  
  cout<<"Please enter your name: ";
  cin.getline ( name, 50 );
  if ( strcmp ( name, "Julienne" ) == 0 ) // Equal strings
    cout<<"That's my name too.\n";
  else                                    // Not equal
    cout<<"That's not my name.\n";
  // Find the length of your name
  cout<<"Your name is "<< strlen ( name ) <<" letters long\n";
  cout<<"Enter your last name: ";
  cin.getline ( lastname, 50 );
  fullname[0] = '\0';            // strcat searches for '\0' to cat after
  strcat ( fullname, name );     // Copy name into full name
  strcat ( fullname, " " );      // We want to separate the names by a space
  strcat ( fullname, lastname ); // Copy lastname onto the end of fullname
  cout<<"Your full name is "<< fullname <<"\n";
  cin.get();
}

Safe Programming

The above string functions all rely on the existence of a null terminator at the end of a string. This isn't always a safe bet. Moreover, some of them, notably strcat, rely on the destination array being able to hold the entire string being appended onto the end. Although it might seem like you'll never make that sort of mistake, accidentally writing off the end of an array in a function like strcat has historically been a major source of problems.

Fortunately, in their infinite wisdom, the designers of C have included functions designed to help you avoid these issues. Similar to the way that fgets takes the maximum number of characters that fit into the buffer, there are string functions that take an additional argument to indicate the length of the destination buffer. For instance, the strcpy function has an analogous strncpy function:
 
char *strncpy ( char *dest, const char *src, size_t len );
which will copy at most len characters from src to dest (len should be no greater than the size of dest, or the write could still go beyond the bounds of the array). Unfortunately, strncpy can lead to one niggling issue: it doesn't guarantee that dest will have a null terminator attached to it (this happens when src holds len or more characters before its own terminator). You can avoid this problem by using strlen to get the length of src and make sure it will fit in dest. Of course, if you were going to do that, then you probably don't need strncpy in the first place, right? Wrong. Now it forces you to pay attention to this issue, which is a big part of the battle. 
