Assignment 3
Assignment 3 is worth 4% of your final mark, and is due by Sunday March 11 at 11:59 pm (extended from Saturday March 10). Late submissions will not be accepted, barring exceptional circumstances.
There are two Practical Lab sessions for this assignment (Tuesday February 28 and Tuesday March 6).
This assignment should be submitted via MarkUs.
Your assignments must follow the style guidelines discussed in the course style guide. Failure to do so may result in deductions. There are new requirements for this assignment.
This assignment covers topics up through Carter chapter 6, and the first part of Chapter 7 (parts of 7.1 and 7.2). You should not make use of concepts from subsequent chapters (e.g., string functions).
Announcements and Updates
This section contains a summary of the changes that have been made to this page since it was first posted. You should also keep an eye on the Assignment 3 forum on the Discussion Board for additional announcements and clarifications.
(Mar 7): Due date extended to Sunday March 11.
(Mar 4): The tester program is available. The tester's output is identical to the previous testers; details about the format are available on the Assignment 1 tester page. The tester can be run on ECF by using the Makefile; for more information, please see the section Compiling and Makefiles in this handout, and the Makefile information page. Note that the tester does not check the length of your histograms; it replaces the sequence of histogram symbols with the string "[histogram]" prior to comparing your output with the expected output. (This is the same process that was used on the calculator tester from Assignment 1.)
Important Notes
Important Note 1: The starter code contains several elements that are intended for use by the auto-marker (e.g., they are there to make it easier for us to mark your program). You should not make any changes to the starter code, except where directed. Any changes outside of the designated areas of the starter code may result in a substantial deduction. For more information, please see the section Starter Code.
Important Note 2: One of the goals of this assignment is to give you practice using pointers and pointer arithmetic. While some of the array manipulations in this assignment would probably be expressed using subscripting in most "normal" programs, we are requiring you to use pointer arithmetic so that you get a chance to practice it. On this assignment, all array manipulations must take place using pointers. In other words, the characters [ or ] may not appear anywhere in your code, except in array declarations. Any submission that uses subscripting may receive a mark of 0. For example, the following use of [] is not only fine, but is required (since you will need to use arrays to solve the assignment):
int positions[] = {1, 2, 3}; /* OK */
This is not:
positions[0] = 5; /* Not OK */
Instead, you need to do something like:
*positions = 5; /* OK */
One tip if you're having trouble thinking about pointer arithmetic is to write the array access using a subscript, and then convert that to pointer arithmetic. While you would never bother to do that in the real world (if you've already written the code using array indexing, you'd just use that version of it), it can be a lot of help while you're still getting comfortable with pointer arithmetic.
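As an illustration of that conversion, here is a minimal sketch (the function names are hypothetical and are not part of the starter code). The first function is a direct rewrite of a subscript loop, replacing each subscript access with the equivalent *(values + i); the second walks a pointer across the array instead of using an index at all.

```c
/* Sums 'count' ints starting at 'values'.
 * Direct conversion: each subscript access becomes *(values + i). */
int sumValues(const int *values, int count)
{
    int total = 0;
    for (int i = 0; i < count; i++)
    {
        total += *(values + i);
    }
    return total;
}

/* The same sum, walking a pointer from the first element
 * up to (but not including) one past the last element. */
int sumValuesWalking(const int *values, int count)
{
    int total = 0;
    for (const int *p = values; p < values + count; p++)
    {
        total += *p;
    }
    return total;
}
```

Both versions visit the same elements in the same order; which one reads better usually depends on whether the loop body also needs the index.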
Overview
The goal of this assignment is to practice using pointers, arrays, and functions, and to get some experience writing programs that consist of more than one file. Your job is to write a function that helps generate histograms. You will then make use of that function in two different programs that generate two different styles of histograms. The first program simulates the results of dealing hands of cards, and the second program plots the frequency of letters that occur in a passage of text. Both histograms represent proportions of values in each category, not raw values. We have provided starter code for each of the programs (and for your histogram function); you should not make any changes outside of the specified areas of the code. We have also provided a Makefile to help you compile this assignment; for more information about it, see the section Compiling and Makefiles.
Part 1: Dealing Cards
cardGame
is a program that simulates dealing a hand of cards, and calculates the point value for that hand. Each card is assigned a point value as follows:
Card:    A   2   3   4   5   6   7   8   9   10   J    Q    K
Points:  1   2   3   4   5   6   7   8   9   10   10   10   10
The point value (or score) for a hand is the sum of the point values of all of the cards in it. Notice that numeric cards are each worth their specified value, Aces are worth 1, and face cards (Jack, Queen, and King) are all worth 10. Suits are ignored. Our deck is a standard deck of 52 playing cards (four of each type of card, and no jokers).
cardGame
prompts the user for the number of hands to deal, and the number of cards per hand. It then deals each of those hands, records the results, and prints out a summary (user input is in bold):
Enter the number of hands: 1
Enter the number of cards per hand: 2
Here are the results in numeric form:
0: 0
1: 0
2: 0
3: 0
4: 0
5: 0
6: 0
7: 0
8: 0
9: 0
10: 0
11: 1
12: 0
13: 0
14: 0
15: 0
16: 0
17: 0
18: 0
19: 0
20: 0
Here are the results as a histogram:
0:
1:
2:
3:
4:
5:
6:
7:
8:
9:
10:
11:**********************************************************************
12:
13:
14:
15:
16:
17:
18:
19:
20:
In this example, our single hand of 2 cards had a total point value of 11. This is not a very exciting example, since we only dealt a single hand. Let's try a larger sample:
Enter the number of hands: 10000000
Enter the number of cards per hand: 2
Here are the results in numeric form:
0: 0
1: 0
2: 58974
3: 118347
4: 177615
5: 236901
6: 296308
7: 355506
8: 413246
9: 474351
10: 532135
11: 947406
12: 887677
13: 824964
14: 768836
15: 709174
16: 651542
17: 592861
18: 532916
19: 473776
20: 947465
Here are the results as a histogram:
0:
1:
2:
3:*
4:*
5:**
6:**
7:**
8:***
9:***
10:****
11:*******
12:******
13:******
14:*****
15:*****
16:*****
17:****
18:****
19:***
20:*******
This time we dealt 10,000,000 hands, so we have a much larger sample size. At this point, we can examine the histogram, and draw some conclusions. First, we note that there aren't any hands with a score of 0 or 1, which makes sense since we are dealing two cards per hand (i.e., since the minimum card value is 1, if we deal two of them, our minimum score is 1 + 1 or 2). Second, we see that 11 and 20 are the most common scores. This also makes sense, since both can be generated by 16 possible permutations of cards (for 11: A/10, A/J, A/Q, A/K, 2/9, 3/8, 4/7, 5/6, 6/5, 7/4, 8/3, 9/2, 10/A, J/A, Q/A, and K/A; for 20: 10/10, 10/J, 10/Q, 10/K, J/10, J/J, J/Q, J/K, Q/10, Q/J, Q/Q, Q/K, K/10, K/J, K/Q, and K/K), which is the most of any possible score. Contrast this with a score of 3, which can only be generated by two possible permutations: A/2 and 2/A.
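The permutation counts above can be checked mechanically. The following is a small sketch (NUM_RANKS, rankValue(), and countPairsForScore() are hypothetical names, not part of the starter code) that enumerates every ordered pair of the 13 card types and counts the pairs that add up to a given score.

```c
#define NUM_RANKS 13   /* hypothetical name: A, 2 through 10, J, Q, K */

/* Point value of rank 0..12 (0 is the Ace, 9 is the 10, 10..12 are J, Q, K). */
int rankValue(int rank)
{
    return (rank < 10) ? rank + 1 : 10;
}

/* Counts the ordered pairs (first card, second card) of card types
 * whose point values add up to 'score'. */
int countPairsForScore(int score)
{
    int count = 0;
    for (int first = 0; first < NUM_RANKS; first++)
    {
        for (int second = 0; second < NUM_RANKS; second++)
        {
            if (rankValue(first) + rankValue(second) == score)
            {
                count++;
            }
        }
    }
    return count;
}
```

There are 13 * 13 = 169 ordered pairs in total, so a score with 16 permutations should turn up in roughly 16/169 (about 9.5%) of two-card hands, which is in line with the counts for 11 and 20 in the transcript.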
Input
You can assume that the user will enter a valid int for both of the prompts, but you must validate and prompt for another value if the user enters an int that is not greater than 0:
Enter the number of hands: -10
Number of hands must be greater than 0.
Enter the number of hands: 10
Enter the number of cards per hand: -2
Number of cards must be greater than 0.
Enter the number of cards per hand: 2
[results would be printed here]
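One possible shape for this kind of validation is sketched below. The helper readPositiveInt() and its parameters are assumptions made for illustration (the starter code may organize the prompts differently); it reads from a FILE * so that it can be exercised without typing at the keyboard, and you would pass stdin to read from the user.

```c
#include <stdio.h>

/* Prints 'prompt', then reads ints from 'in' until one is greater than 0.
 * After each rejected value, prints 'complaint' and prompts again.
 * Returns the accepted value, or -1 if the input runs out first. */
int readPositiveInt(FILE *in, const char *prompt, const char *complaint)
{
    int value = 0;
    printf("%s", prompt);
    while (fscanf(in, "%d", &value) == 1)
    {
        if (value > 0)
        {
            return value;
        }
        printf("%s\n", complaint);
        printf("%s", prompt);
    }
    return -1;
}
```

For example, readPositiveInt(stdin, "Enter the number of hands: ", "Number of hands must be greater than 0.") would produce the first three lines of the transcript above if the user typed -10 and then 10.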
Output
Your program should print the appropriate number of rows for the numeric and histogram data. In other words, if the user specifies 1 card per hand, then the maximum value shown in the output would be 10:
Enter the number of hands: 1
Enter the number of cards per hand: 1
Here are the results in numeric form:
0: 0
1: 1
2: 0
3: 0
4: 0
5: 0
6: 0
7: 0
8: 0
9: 0
10: 0
Here are the results as a histogram:
0:
1:**********************************************************************
2:
3:
4:
5:
6:
7:
8:
9:
10:
As another example, if the user specified 3 cards per hand, the maximum value would be 30.
You should still print rows for scores that are too small to be possible given the number of cards per hand. For example, with 1 card per hand, a score of 0 is impossible; with 2 cards per hand, scores of 0 and 1 are impossible. Printing these "impossible" rows is a simplification designed to make the assignment easier (i.e., you can just print out the entire list of possible scores each time, without having to worry about where to start).
Details on the histogram bars are discussed in the section Histograms. For this part of the assignment, your histogram bars should be 70 characters wide, and should use the '*' symbol. Row labels should be right-aligned, and should be followed by a colon. You can assume that the labels will never be more than 2 digits wide (i.e., there will never be a score greater than 99). (Hint: printf() can do the formatting for you.) For the numeric data, there is a space between the colon and the number; for the histograms, there is not.
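To illustrate the hint, a field width in the conversion specification right-aligns a number: %2d pads a one-digit label with a leading space. The sketch below uses snprintf() (which accepts the same format strings as printf()) so that the result can be inspected; the two function names are hypothetical.

```c
#include <stdio.h>

/* Writes a numeric-table row such as " 2: 58974" into 'out'.
 * Note the space between the colon and the number. */
int formatNumericRow(char *out, size_t size, int label, int count)
{
    return snprintf(out, size, "%2d: %d", label, count);
}

/* Writes a histogram row such as "11:*******" into 'out', where 'bar'
 * is a '\0'-terminated string of symbols. No space follows the colon. */
int formatHistogramRow(char *out, size_t size, int label, const char *bar)
{
    return snprintf(out, size, "%2d:%s", label, bar);
}
```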
A transcript of cardGame
is available here.
The Deck
The deck of cards is a 52-element array of ints containing all of the possible card values (i.e., four copies each of the card values). To deal a card, we pseudorandomly generate a valid index, and then select the card with that index. In other words, if we wanted to deal two cards, we would pseudorandomly generate a number between 0 and 51, pick the value at that index in the card array, generate another pseudorandom index, and pick the value at that index.
Each of these pseudorandom selections is independent of all previous ones. This means that our simulation has a slight problem from a statistics point of view -- we can end up dealing the same card twice in a row! In other words, whenever we generate a pseudorandom index, there is a 1/52 chance that we will generate the same number again when we generate a second pseudorandom number, since each of the pseudorandom draws is statistically independent from the other. As a concrete example, if we are generating a pair of pseudorandom indices and our first pseudorandom index is 42, we would generate 42 as our second pseudorandom index (on average) one out of every 52 times we generated a pair of numbers.
Since we're not removing a card from the deck array when we draw it (compared to, say, having a list of cards and deleting a card from that list when we dealt it), picking the same index twice in a row means that we deal two copies of the same card. (Not "the same card" as in two different Jacks, but "the same card" as in "two copies of the Jack of Spades".) For our purposes, this limitation is fine, since fixing it would require additional complexity to keep track of which cards had been dealt. Conceptually, we can think of it as if we were replacing each card in the deck prior to dealing the next one.
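Dealing "with replacement" in this way can be sketched as follows. The function dealHand() and its parameters are hypothetical, and using rand() % deckSize to pick the index is an assumption about how the index might be generated; the starter code has its own helper for that.

```c
#include <stdlib.h>

/* Returns the score of one hand of 'numCards' cards drawn from the
 * 'deckSize'-element array 'deck'. Nothing is removed from the deck,
 * so the same index can be drawn more than once in a hand. */
int dealHand(const int *deck, int deckSize, int numCards)
{
    int total = 0;
    for (int i = 0; i < numCards; i++)
    {
        int index = rand() % deckSize;   /* somewhere in 0 .. deckSize - 1 */
        total += *(deck + index);
    }
    return total;
}
```

Because every draw is independent, a hand of numCards cards always scores between numCards (all Aces) and 10 * numCards, no matter how many cards are requested.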
Part 2: Letter Frequency
letterFrequency
also draws histograms, but it plots information about letter frequency counts in a passage of text rather than hands of cards. The program first prompts the user to enter some text, and processes that text character-by-character to count the number of occurrences of each letter, before printing out the results in a similar format to that used in Part 1:
Enter some text (characters other than lowercase letters will be ignored): toronto
Here are the results in numeric form:
a: 0
b: 0
c: 0
d: 0
e: 0
f: 0
g: 0
h: 0
i: 0
j: 0
k: 0
l: 0
m: 0
n: 1
o: 3
p: 0
q: 0
r: 1
s: 0
t: 2
u: 0
v: 0
w: 0
x: 0
y: 0
z: 0
Here are the results as a histogram:
a:||
b:||
c:||
d:||
e:||
f:||
g:||
h:||
i:||
j:||
k:||
l:||
m:||
n:|--------|
o:|------------------------|
p:||
q:||
r:|--------|
s:||
t:|----------------|
u:||
v:||
w:||
x:||
y:||
z:||
The program only counts lowercase letters. Any other character (including uppercase letters and spaces) is skipped, and results in an error message. Skipped characters do not count towards the sequence length.
Enter some text (characters other than lowercase letters will be ignored): this is a LONGER chunk of TEXT! 123.
Skipping non-lowercase letter: ' '.
Skipping non-lowercase letter: ' '.
Skipping non-lowercase letter: ' '.
Skipping non-lowercase letter: 'L'.
Skipping non-lowercase letter: 'O'.
Skipping non-lowercase letter: 'N'.
Skipping non-lowercase letter: 'G'.
Skipping non-lowercase letter: 'E'.
Skipping non-lowercase letter: 'R'.
Skipping non-lowercase letter: ' '.
Skipping non-lowercase letter: ' '.
Skipping non-lowercase letter: ' '.
Skipping non-lowercase letter: 'T'.
Skipping non-lowercase letter: 'E'.
Skipping non-lowercase letter: 'X'.
Skipping non-lowercase letter: 'T'.
Skipping non-lowercase letter: '!'.
Skipping non-lowercase letter: ' '.
Skipping non-lowercase letter: '1'.
Skipping non-lowercase letter: '2'.
Skipping non-lowercase letter: '3'.
Skipping non-lowercase letter: '.'.
Here are the results in numeric form:
a: 1
b: 0
c: 1
d: 0
e: 0
f: 1
g: 0
h: 2
i: 2
j: 0
k: 1
l: 0
m: 0
n: 1
o: 1
p: 0
q: 0
r: 0
s: 2
t: 1
u: 1
v: 0
w: 0
x: 0
y: 0
z: 0
Here are the results as a histogram:
a:|----|
b:||
c:|----|
d:||
e:||
f:|----|
g:||
h:|--------|
i:|--------|
j:||
k:|----|
l:||
m:||
n:|----|
o:|----|
p:||
q:||
r:||
s:|--------|
t:|----|
u:|----|
v:||
w:||
x:||
y:||
z:||
Notice that there is an error message for each character that was skipped. (The first three error messages about spaces are because those are the only invalid characters until we encounter the 'L'. We count the 't', 'h', 'i', and 's', then skip a space, then count an 'i' and an 's', then skip a space, etc.)
You can assume that the user enters fewer than TEXT_MAX_LENGTH
characters.
Notice that these histograms are slightly different than those in the previous section. They are generated using the same function (described below), but look different. This is an example of using a helper function to solve more than one problem. (Think of the way that you use printf()
to print a wide variety of things.) These histograms are 55 symbols long and use the '-'
symbol. In addition, each row of symbols is preceded and followed by a pipe symbol ('|'
). Rows with zero symbols are simply drawn as two consecutive pipe symbols, since there is the leading pipe, no symbols, and then the following pipe.
A transcript of letterFrequency
is available here.
Histograms
This section contains further details about the histogram format. Histograms are made up of rows, and each row is made up of a label followed by a line of characters. The function createHistogramRow()
is used to generate the line of characters; it does not generate the labels.
/* Creates a histogram row in the char[] 'row'.
* Starting at element 0, we fill 'row' with n copies of 'symbol', followed by a '\0'.
* n is the number of symbols used to represent 'value' out of 'maxValue' scaled to 'maxRowLength'.
* (For more details, see the assignment handout.) 'row' must be at least 'maxRowLength' + 1
* characters long. Returns the number of symbols placed in 'row', excluding the '\0'. */
int createHistogramRow(char row[], char symbol, int value, int maxValue, int maxRowLength);
Each row is made up of n symbols. In order to determine n, we need to scale value so that it is between 0 and maxRowLength, since if we tried to plot value directly, our rows would be too long (e.g., the dataset we obtained from dealing 10 million hands of cards had hundreds of thousands of occurrences for some of the scores, and if we plotted value using one symbol for each occurrence, we would need to use hundreds of thousands of characters for each row). We therefore calculate the fraction value / maxValue, and then multiply it by maxRowLength to figure out how many symbols should be in the row. So if value is equal to maxValue, their ratio would be 1, and we would store 1 * maxRowLength = maxRowLength symbols into row. If value is 0, then the ratio would be 0, and we would store 0 * maxRowLength = 0 symbols. If value is half of maxValue, then the ratio is 0.5, and we would store 0.5 * maxRowLength symbols. The number of symbols should be rounded to the nearest integer, since we don't have any way of representing fractions of symbols (e.g., we can't draw half of an *).
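As a worked example using the transcripts above: a score of 11 occurred 947406 times out of 10000000 hands, so 947406 / 10000000 * 70 is about 6.63, which rounds to 7 symbols; in "toronto", 'o' occurred 3 times out of 7 letters, so 3 / 7 * 55 is about 23.57, which rounds to 24 symbols. The sketch below is one way the calculation could be written, not the required implementation. It assumes that the division is done in floating point (integer division would truncate the ratio to 0) and that rounding is done by adding 0.5 before truncating, which is valid here because the result is never negative.

```c
/* A sketch matching the prototype in histogram.h. Fills 'row' with n copies
 * of 'symbol' followed by a '\0', and returns n. */
int createHistogramRow(char row[], char symbol, int value, int maxValue, int maxRowLength)
{
    /* Scale to 0..maxRowLength, then round to the nearest integer. */
    int n = (int)((double)value / maxValue * maxRowLength + 0.5);
    char *p = row;
    for (int i = 0; i < n; i++)
    {
        *p = symbol;
        p++;
    }
    *p = '\0';   /* terminate the row so that it can be printed as a string */
    return n;
}
```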
You can assume that value is <= maxValue. value must be greater than or equal to 0, and maxValue must be greater than 0.
Note: createHistogramRow()
does not print anything; it fills in row
so that row
can be printed by something else at a future date.
As indicated in the previous sections of the handout, the data portions of the histograms (i.e., excluding the labels) for cardGame
should be a maximum of 70 symbols wide, and should use the '*'
symbol. The histograms for letterFrequency
should be a maximum of 55 characters wide, and should use the -
symbol.
We have provided you with the histogram.h
header file, which should not be modified, and a template for histogram.c
, which you need to complete.
Starter Code
There are four files with starter code for this assignment. Starter code is available for cardGame.c
and letterFrequency.c
. There is also starter code for the histogram functions, histogram.h
and histogram.c
. Finally, a Makefile
is available here, and is discussed in more detail in the following section of the handout.
As indicated above (and as indicated on previous assignments), you should not make any changes to the starter code other than in the locations explicitly indicated (e.g., either a pair of Begin Changes Here and End Changes Here comments, or a return value indicated by a // REPLACE THIS LINE). Any submissions that make changes elsewhere may receive a deduction.
More importantly, the starter code contains some features (the #ifndef
lines) which are present to help us mark your code. They allow us to disable your main()
function and use your functions with a main()
function provided by the auto-marker. Do not modify these lines of code, since doing so may break the auto-marker for your submission. Any submissions that do so may receive a mark of 0, since the auto-marker won't be able to test your program.
If you get tired of seeing the same hands over and over again in cardGame
, you can change the seed for the PRNG (Pseudorandom Number Generator) by changing the 1
to something else in the line #define PRNG_SEED 1
. You could also change the line to #define PRNG_SEED time(NULL)
if you want the seed to vary each time you run your program, but that might make debugging more difficult (since you would never get the same results twice). However, you should not change the #ifndef
or #endif
lines immediately preceding or following the #define
, nor should you change the srand(PRNG_SEED)
call in main()
. (Those lines allow us to set the seed when we mark your code; we need to be able to do that since we need to know what results to expect from your program in order to auto-mark it.)
Compiling and Makefiles
This is our first assignment that consists of more than one (related) file. This means that compiling our programs is a bit more complicated. As a quick refresher of what we've discussed in class, when dealing with a multi-file program, there are two necessary components: the header (.h
) file, which contains function prototypes, and the implementation (.c
) file, which contains the C code that implements those functions. When you want to use functions that are implemented in a different file from your main()
, you first need to compile those functions into the machine code that is appropriate for your CPU. Then, you need to link (i.e., bundle) that machine code with the machine code for your main()
function, and create an executable program. When we were using a library that came with C like math.h
, we needed to do the second step (i.e., adding the -lm
flag to gcc
to tell it to look for the math.h
machine code), but the first step was already done for us by the people who wrote gcc
and your operating system (i.e., when gcc
is installed, the machine code for each of the built-in libraries is compiled and stored in a central place, since recompiling them every time we used a standard library would be wasteful). With our own functions, we need to do both steps.
The command:
$ gcc -Wall -std=c99 -c histogram.c
creates a file histogram.o
that contains the machine code for the function(s) in histogram.c
. The -c
flag says "take this .c
file and generate the machine code for it". The .o
stands for an "object file", which is the C term for a file with machine code in it. The command:
$ gcc -Wall -std=c99 -o cardGame histogram.o cardGame.c
compiles the functions in cardGame.c
into machine code, links in the machine code found in histogram.o
, and then creates an executable called cardGame
.
An automated tool called make
can be used to automate and simplify this process. For this assignment we have provided you with a Makefile
(a make
program); details are available here. Note: make
is not part of the curriculum of this course, and you are not expected to understand how make
works, nor will you be tested on it. The Makefile
is there to simplify things for you, but you are not required to use it or understand exactly how it works.
A note about IDEs (Integrated Development Environments, like Xcode, Visual Studio, or Code::Blocks)
You can ignore this section if you are not using an IDE.
If you are using one of these systems, you won't be able to add both .c
starter code files to the same project, since they both have a main()
function, and a program can't have more than one active main()
in it. This is a problem, since you want both files to share a single copy of histogram.c
and histogram.h
so that you don't accidentally end up with two slightly different versions of the histogram files. There are various ways to deal with this, but the details will vary depending on the IDE, so they are outside the scope of this document. However, one option that should work in any IDE is to add a #define
statement to each of the .c
files to temporarily disable that .c
file's main()
. (This makes use of the same mechanism we have in place to do some of the auto-marking.) Adding:
#define CARDGAME_MAIN_DISABLED true
or
#define LETTERFREQUENCY_MAIN_DISABLED true
to the top of the appropriate file will disable that file's main(), and allow the files to coexist in the same project. If you do this, make sure you delete that line before submitting the file. If you submit a file with that line in place, you may receive a mark of 0 on the auto-marker, since your program won't compile properly. Changing the line to #define CARDGAME_MAIN_DISABLED false is not acceptable; the line must be deleted. A file with this line in place will fail to compile (with the Makefile) on ECF, so you should detect this problem when you test your code on ECF. (You are testing your code on ECF, right?)
What to do
Begin by reading over the starter code, and thinking about what the structure of the code needs to be. Note: the starter code contains several instances of the commands #ifndef SOMETHING and #endif. We will talk about these a bit later in the course, but for now you can just ignore those lines. It's also worth noting another #defined variable in the files:
/* Should we print out extra debugging information? */
/* MUST be set to false prior to submission. */
#define DEBUGGING false
This is a debugging technique that lets you add helpful print statements to your code, and then turn them on or off depending on whether you are trying to debug something. For example, you could add in a loop that printed out your deck of cards after you've created them to make sure that they are initialized correctly. Once you're sure they are, you want to remove those print statements, since the final output of the program shouldn't have that output. However, if you delete them, and then later decide that you want to debug something that relates to them, you'll need to add back in the same code that you previously deleted. If instead you do something like this:
if (DEBUGGING)
{
// Loop goes here
}
then that code is still in place. By changing a single variable (DEBUGGING
), you can enable or disable all of the extra print statements. Note: please make sure that you've turned DEBUGGING
off prior to submission, since any additional debugging output would be marked wrong by the auto-marker.
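As a concrete version of the deck-printing example mentioned above, here is a minimal sketch. The function debugPrintDeck() is hypothetical, and the sketch includes stdbool.h itself so that false is defined; the starter code may already take care of that.

```c
#include <stdbool.h>
#include <stdio.h>

/* Should we print out extra debugging information? */
#define DEBUGGING false

/* Prints the 'size' values in 'deck', but only when DEBUGGING is true.
 * Returns the number of values printed (0 when debugging is off). */
int debugPrintDeck(const int *deck, int size)
{
    int printed = 0;
    if (DEBUGGING)
    {
        for (const int *p = deck; p < deck + size; p++)
        {
            printf("deck value %d: %d\n", (int)(p - deck), *p);
            printed++;
        }
    }
    return printed;
}
```

With DEBUGGING set to false the function prints nothing, so calls to it can stay in the program without affecting the final output.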
Step 1
Begin by writing code to initialize the deck of cards. There is already an array of int
s set up (deck
) to hold the card values, so all you need to do is write code that stores NUM_SUITS
copies of each card value in the array. The card values should be stored in suit order, so deck
should have A, 2, 3, 4, 5, 6, 7, 8, 9, 10, J, Q, K, A, 2, 3, 4, etc.
Hint: make sure you verify that your deck is initialized correctly before you go any further, since bugs here may cause problems later on. (As a general rule, you should always make sure that you test each part of your program before moving on to the next part, but this is particularly true when dealing with setting up fundamental data structures.) For instance, as indicated above, you might want to print out the array's contents, but you should make sure that this code is inside of an if (DEBUGGING)
block so that it is not printed in your final submission.
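One way to lay the values out in that order is sketched below. NUM_SUITS is named in this handout, but its value of 4 is inferred from the 52-card deck; NUM_RANKS, DECK_SIZE, and initializeDeck() are hypothetical names, and the starter code may use different ones.

```c
#define NUM_SUITS 4
#define NUM_RANKS 13                        /* hypothetical name */
#define DECK_SIZE (NUM_SUITS * NUM_RANKS)   /* hypothetical name */

/* Fills 'deck' (DECK_SIZE ints) with the values for A, 2, ..., 10, J, Q, K,
 * repeated once per suit. */
void initializeDeck(int *deck)
{
    int *next = deck;   /* the next slot to fill */
    for (int suit = 0; suit < NUM_SUITS; suit++)
    {
        for (int rank = 1; rank <= NUM_RANKS; rank++)
        {
            *next = (rank > 10) ? 10 : rank;   /* J, Q, and K are worth 10 */
            next++;
        }
    }
}
```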
Steps 2 and 3
Next, write code that prompts the user for the number of hands, and then the number of cards per hand. Remember that each of these prompts needs to validate that the user input is greater than 0.
Step 4
Next, calculate the largest single card value in the deck. This can be done in a single line of code. Then, calculate the largest possible score per hand.
Step 5
Notice that we have an array scoreOccurrences
that will keep track of the number of times we see each score. In other words, the value in element i
in that array corresponds to the number of times we have seen the score i
. Also notice that this array is initialized to be largestHand + 1
elements long, not largestHand
. Why is that?
Because scoreOccurrences
is a C99 Variable Length Array, we can't use an initializer with it, so we need to initialize it manually. (There is no advantage to manually initializing the array; if possible, we'd prefer to use an initializer, and just say int scoreOccurrences[largestHand + 1] = {0}
. The fact that we can't is simply a limitation of C.) Write a loop that sets every element of scoreOccurrences
to 0
.
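A pointer-based version of such a loop might look like the following sketch (zeroCounts() is a hypothetical helper; in the starter code the loop would more likely sit directly in main()).

```c
/* Sets all 'length' elements starting at 'counts' to 0. */
void zeroCounts(int *counts, int length)
{
    for (int *p = counts; p < counts + length; p++)
    {
        *p = 0;
    }
}
```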
Step 6
Now, we will run our simulation. Write a loop that runs for numHands
iterations. The loop should declare a variable to keep track of the hand total, and then draw numCardsPerHand
cards. For each draw, you'll need to get a pseudorandom index, look up the value for that card, and add it to the hand's total. After drawing numCardsPerHand
, you'll need to store the result in the appropriate spot in scoreOccurrences
. You might want to print out additional debugging information along the way. (Hint: the comments in this part of the starter code are indented according to the loop structure.)
When you first test your code, you will notice that all you draw are Aces. This is because we haven't yet written getRandomIndex()
, and the current stub implementation always returns 0. We will correct this in our next step.
Step 7
Write getRandomIndex()
. This can be done in a single line of code. Remember that you are supposed to return a valid index of cards
; make sure that you think about the maximum and minimum values that you should be returning. (It is easy to be off-by-one here, so make sure that you've really thought about it.)
Step 8
Write a loop that prints out the numeric data in scoreOccurrences
. The body of this loop is a single line of code. Remember that, as mentioned in class, if you subtract pointers from each other and try to treat that value as an int
, you might get a compiler warning about it. This is one of the few compiler warnings (the only one in fact) that you can ignore for now; we'll talk about it in more detail later in the course.
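To illustrate the pointer subtraction mentioned above: when a pointer p walks through an array, p minus the array's starting address is the index of the element that p points at. The sketch below (printNumericTable() is a hypothetical name) uses that difference as the row label; the explicit cast to int is one way of making the conversion visible, and is an assumption rather than something the starter code requires.

```c
#include <stdio.h>

/* Prints one "label: count" row per element of 'counts', and returns
 * the number of rows printed. */
int printNumericTable(const int *counts, int length)
{
    int rows = 0;
    for (const int *p = counts; p < counts + length; p++)
    {
        /* p - counts is the index of the element that p points at. */
        printf("%2d: %d\n", (int)(p - counts), *p);
        rows++;
    }
    return rows;
}
```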
At this point, your program should prompt the user for the simulation conditions, run the simulation, and print out a numeric table of results. We are now ready to print the histograms, so we will turn our attention to createHistogramRow()
.
Step 9
Go to histogram.c
, and fill in the single function. Notice that the top of the file includes histogram.h
.
Remember that createHistogramRow()
does not create the array that it fills. That array is created by the function that is calling createHistogramRow()
, and createHistogramRow()
just fills it in. How large does that array need to be?
Also remember that createHistogramRow()
doesn't print anything out, although you may want to add some additional debugging statements that do.
If you want to test your createHistogramRow()
separately, you can write another program that calls the function. If you did this, you would just include histogram.h
, and then link in histogram.o
in the same way that the two existing programs do. You could also write a main()
function in histogram.c
, and then compile and run that single file. However, if you do this, make sure that you delete the main()
function before trying to use histogram.o
with any other file; if you neglect to do so, your program won't compile, since there will be two copies of main()
.
Step 10
Now that you have createHistogramRow()
working, we will return to cardGame.c
, and write the code to print out the histogram. Write another loop that goes through each element of scoreOccurrences
, creates an array of char
s to hold the histogram row, calls createHistogramRow()
with the appropriate parameters to fill in that array, and prints out the label and the histogram row. (Hint: the histogram row is an array of characters that has a \0
at the end. There is an easy way to print one of those.)
At this point, cardGame
is done! We will now turn our attention to letterFrequency
, which is quite similar to the code that we have already written.
Step 11
First, take a look at the framework that is in place in letterFrequency.c
, and the helper function that we have. In this program, we will need to store a list of occurrences in exactly the same way we did in the previous program, but this time we don't have quite as natural a mapping as we did before. In cardGame
, we had a score that could be directly mapped to a scoreOccurrences
index (e.g., the occurrences of a score of 2 were stored in index 2 of the array). In letterFrequency
, our items are lowercase letters, which do not have quite as direct a mapping. (We could in theory use the ASCII values of the characters (so 'a'
would be index 97), but that technique is more fragile and can run into some subtle complications.) We solve this by using a mapping function, letterToIndex()
, which takes a character and returns an int
that we will use as an array index (i.e., it maps from a char
to an int
). Our list of valid characters is stored in LETTERS
; the function returns a special index (LETTER_NOT_FOUND
) if the specified character is not present in that list.
First, print out a prompt to the user, and allocate a buffer (of char
s) to hold the user's input. Since we want to be able to accommodate TEXT_MAX_LENGTH
characters, how large does the buffer need to be? Next, read in a sequence of text from the user. Remember that you need to use fgets()
instead of gets()
.
Step 12
Next, create an array to hold the occurrences of each letter, and initialize each element to 0
. Since we're not using a C99 Variable Length Array, you can use an initializer to do this.
Process each character in the buffer. Read the documentation for fgets()
to make sure that you know when to stop. For each character, you need to convert it to an index, and then check if that index is LETTER_NOT_FOUND
. If it is, you need to print an error message and not count that character. If it's not, then you need to count the occurrence of the character. (Hint: counting the occurrence of the character may involve more than one operation and/or variables.)
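On the question of when to stop: fgets() always terminates what it stores with a '\0', and it also stores the newline that ended the line if there was room for it. A traversal therefore has to stop at whichever of the two comes first. The sketch below (countInputCharacters() is a hypothetical helper) shows only that stopping condition, not the counting itself.

```c
/* Returns the number of characters in 'buffer' before the first '\n' or '\0'. */
int countInputCharacters(const char *buffer)
{
    const char *p = buffer;
    while (*p != '\0' && *p != '\n')
    {
        p++;
    }
    return (int)(p - buffer);
}
```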
Step 13
At this point, you may notice that all of your characters are being counted as 'a'
, since the stub version of letterToIndex()
always returns 0
. Fill in this function so that it operates correctly. You will need to traverse LETTERS
to look for a match.
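A traversal of this kind could be sketched as follows. The contents of LETTERS and the value of LETTER_NOT_FOUND shown here are assumptions made so that the example is self-contained; use the definitions that are in the starter code.

```c
#define LETTER_NOT_FOUND (-1)                          /* assumed value */
const char LETTERS[] = "abcdefghijklmnopqrstuvwxyz";   /* assumed contents */

/* Returns the position of 'letter' in LETTERS, or LETTER_NOT_FOUND
 * if it does not appear there. */
int letterToIndex(char letter)
{
    for (const char *p = LETTERS; *p != '\0'; p++)
    {
        if (*p == letter)
        {
            return (int)(p - LETTERS);
        }
    }
    return LETTER_NOT_FOUND;
}
```

Because the loop stops at the '\0' that ends LETTERS, any character that is not in the list (an uppercase letter, a space, a digit) falls through to the final return.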
Step 14
Finally, print out the numeric and histogram summaries. This code is very similar to that in cardGame
, although the arguments to createHistogramRow()
will be different. Note that you are using exactly the same code (i.e., the same histogram.o
file) to generate the histograms.
That's it! Make sure you test your programs thoroughly.
Tester
A simple tester program that checks the format of your output, but not your calculations, is now available. See the Announcements and Updates section for details.
Submission
Submit the files cardGame.c, letterFrequency.c, and histogram.c via MarkUs before the due date. Late submissions will not be accepted.