CSCI 261 - Programming Concepts - Summer 2019

A6 - Word & Letter Counts

This assignment is due by Thursday, June 6, 2019, 11:59 PM.
As with all assignments, this must be an individual effort and cannot be pair programmed. Any debugging assistance must be provided in accordance with the course collaboration policy.
Do not forget to complete the following labs with this set: L6A, L6B, L6C, L6D, L6E, L6F.



In this homework, we will focus on arrays, vectors, strings, and structs.


Overview



For this assignment, you will ask the user to enter as many sentences as they wish. We will then count the unique words they entered as well as the letter frequencies.


Program from what YOU KNOW



Did you know that most programs aren't well-designed? They really aren't. Often, they're put together by geographically diverse people with different styles, knowledge, programming ability and design taste. But, the programs work. It's a freaky miracle, but they do work.

When given a task in your engineering field that could be solved with software, we do not expect you to be expert programmers (unless your field is software engineering). However, we do expect you to be able to assemble the things you know how to code into cohesive, meaningful, useful programs.

You know how to receive input from the user. You know how to declare and work with arrays. You know that C++ can be your friend or your enemy. You know how to conduct repetitive tasks with loops. You know about data types.

Try to rely on these things that you know, before over-complicating your own solution with things that you don't yet know. This doesn't mean you shouldn't explore and learn more. You should! Always! But, start with what you know first, and build from there.


ABCUF



Always Be Using Functions. Let that be a rule of thumb for the remainder of the semester. When working on your programs, try to reach an end result where main doesn't do much "low level stuff" but rather leverages functions in order to do what your program needs to do.

For example, if you were to write a pizza-making program, don't write one super-long, hard-to-read, scare-your-date-away implementation of main. Use functions:

int main() {
    CreateShoppingList();
    BuyIngredients();
    GatherIngredientsInKitchen();
    PourYourselfGlassOfFineChianti();
    MakeDough();
    MakeSauce();
    //...
    CookPizza();
    return 0;
}


The Specifics Part I: User Input



We want to prompt the user to enter as many sentences as they would like. Inform the user to enter the sentence "I'm done!" to signify they have finished entering sentences. What should we do with each sentence the user enters? Read on.


The Specifics Part II: Counting



For each sentence the user enters, we want to keep track of all the unique words they have entered for the life of the program and the number of times each word has appeared. To do this, we will need to split the sentence into individual words and remove any punctuation. We will be ignoring the case of each word. Therefore THE, ThE, the, and tHE are all the same word.
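A minimal sketch of the splitting step follows. It assumes words are separated by whitespace, that "punctuation" means every character other than A-Z and a-z (so digits and apostrophes are dropped too), and that words are stored in uppercase as in the example output. The function name splitIntoWords is hypothetical.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Splits a sentence on whitespace, keeps only the letters of each token,
// and converts them to uppercase so that "the" and "tHE" compare equal.
std::vector<std::string> splitIntoWords(const std::string& sentence) {
    std::vector<std::string> words;
    std::istringstream tokens(sentence);
    std::string token;
    while (tokens >> token) {
        std::string cleaned;
        for (char c : token) {
            if (c >= 'a' && c <= 'z') {
                cleaned += static_cast<char>(c - 'a' + 'A');  // to uppercase
            } else if (c >= 'A' && c <= 'Z') {
                cleaned += c;
            }
            // Anything else (punctuation, digits) is skipped.
        }
        if (!cleaned.empty()) {
            words.push_back(cleaned);
        }
    }
    return words;
}
```

Under these assumptions, a token made up only of punctuation produces no word at all, which is why the empty check is there.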

You must first create a struct named WordCount to represent a word and the number of occurrences. It must have two members, a count stored as an int and a word stored as a string. Since this struct definition will be used in multiple files, you will need to create this in its own header file and then include this header in any other file that requires the use of the WordCount struct.
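As a sketch, the header might look like the following. The file name WordCount.h and the include-guard macro are assumptions; the struct name and its two members come from the description above.

```cpp
// WordCount.h (file name is an assumption)
#ifndef WORD_COUNT_H
#define WORD_COUNT_H

#include <string>

// One unique word and the number of times it has been seen.
struct WordCount {
    std::string word;
    int count;
};

#endif  // WORD_COUNT_H
```

The include guard lets several source files include the header without the struct being defined twice in the same translation unit.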

Next we'll need to have a vector of WordCount to store all of our words and their counts. If this is the first time you are seeing a word then you need to insert it into your vector. Otherwise, if you have seen this word before then you will need to increment the count. (Hint: you'll need to use a searching algorithm on your vector)
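The insert-or-increment step could be sketched with a linear search, as below. Linear search is one choice among several, not necessarily the one intended by the hint. The function name recordWord is hypothetical, and the struct is repeated here only so the sketch stands alone; in the project it would come from your header.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Repeated here so the sketch is self-contained.
struct WordCount {
    std::string word;
    int count;
};

// Linear search: if `word` is already stored, bump its count;
// otherwise append it with a count of 1.
void recordWord(std::vector<WordCount>& counts, const std::string& word) {
    for (std::size_t i = 0; i < counts.size(); ++i) {
        if (counts[i].word == word) {
            ++counts[i].count;
            return;
        }
    }
    counts.push_back(WordCount{word, 1});
}
```

This assumes the word has already been stripped of punctuation and converted to a single case before it is passed in.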


The Specifics Part III: Statistics



After the user has signaled they are done entering sentences, you will then need to sort the vector alphabetically by words. (Hint: you'll need to use a sorting algorithm on your vector). Print out all the words and their counts using the following format (substituting the actual words and counts):

# 1 AWORD   :  3
# 2 WORDS2  : 14
...
#21 WORDS21 :  1
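A rough sketch of the sort and the aligned printing is shown below. It uses selection sort as one example of a sorting algorithm, and it assumes the column widths are derived from the data (number of entries, longest word, largest count). All function names are hypothetical, and the struct is repeated so the sketch stands alone.

```cpp
#include <cstddef>
#include <iomanip>
#include <ostream>
#include <string>
#include <utility>
#include <vector>

struct WordCount {
    std::string word;
    int count;
};

// Selection sort: repeatedly move the alphabetically smallest remaining
// word to the front of the unsorted part of the vector.
void sortByWord(std::vector<WordCount>& counts) {
    for (std::size_t i = 0; i + 1 < counts.size(); ++i) {
        std::size_t smallest = i;
        for (std::size_t j = i + 1; j < counts.size(); ++j) {
            if (counts[j].word < counts[smallest].word) smallest = j;
        }
        std::swap(counts[i], counts[smallest]);
    }
}

// Number of decimal digits needed to print `value`.
int digitCount(std::size_t value) {
    int digits = 1;
    while (value >= 10) {
        value /= 10;
        ++digits;
    }
    return digits;
}

// Prints one "#<index> <word> : <count>" line per entry, with the index
// and count right-aligned and the word left-aligned.
void printWordCounts(std::ostream& out, const std::vector<WordCount>& counts) {
    std::size_t longestWord = 0;
    int largestCount = 0;
    for (const WordCount& wc : counts) {
        if (wc.word.size() > longestWord) longestWord = wc.word.size();
        if (wc.count > largestCount) largestCount = wc.count;
    }
    const int indexWidth = digitCount(counts.size());
    const int countWidth = digitCount(static_cast<std::size_t>(largestCount));
    for (std::size_t i = 0; i < counts.size(); ++i) {
        out << '#' << std::right << std::setw(indexWidth) << (i + 1) << ' '
            << std::left << std::setw(static_cast<int>(longestWord))
            << counts[i].word << " : "
            << std::right << std::setw(countWidth) << counts[i].count << '\n';
    }
}
```

Because the words are stored in a single case, the plain string comparison in sortByWord gives alphabetical order. Calling printWordCounts(std::cout, counts) after sorting writes the table to the screen.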

Note how the data is aligned and the words are in alphabetical order. Finally, print out the least frequent word with its count along with the most frequent word and its count. (Hint: you'll need to use the minMax algorithm.)

Most Frequent:  WORDS2  (14)
Least Frequent: WORDS21 ( 1)
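One way to find both extremes in a single pass is sketched below. It assumes the vector is non-empty and already sorted alphabetically, so the strict comparisons keep the alphabetically first word when counts tie (this matches CANAL being reported in the example run). The name findMinMax and the use of reference parameters for the results are assumptions.

```cpp
#include <cstddef>
#include <string>
#include <vector>

struct WordCount {
    std::string word;
    int count;
};

// Single pass over a non-empty vector that records the positions of the
// smallest and largest counts. Strict < and > keep the earliest entry
// when several entries share the same count.
void findMinMax(const std::vector<WordCount>& counts,
                std::size_t& minIndex, std::size_t& maxIndex) {
    minIndex = 0;
    maxIndex = 0;
    for (std::size_t i = 1; i < counts.size(); ++i) {
        if (counts[i].count < counts[minIndex].count) minIndex = i;
        if (counts[i].count > counts[maxIndex].count) maxIndex = i;
    }
}
```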

We used a vector to store the words and in that context it was an appropriate choice. We are next interested in the frequency of all the letters that appear in the user's input. For this use case, an array is more appropriate (think about why). We want to be sure we are treating uppercase and lowercase letters the same in our counting. Once the counts for each letter have been stored in an array (what is the appropriate type and size of the array?), print out the letter and frequency in the following format (using 3 decimal places). Also, print the most and least frequent letter with their counts. In the event that two letters have the same frequency, report the letter that comes first alphabetically.

A: 30.123%
B:  0.532%
C: 10.001%
...
Z:  5.330%
Most Frequent:  Q (1000)
Least Frequent: E (0)
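The letter counting could be sketched as follows, assuming one int slot per letter A-Z, that only the 26 English letters are counted, and that each percentage is the letter's count divided by the total number of letters counted. The function and constant names are hypothetical.

```cpp
#include <iomanip>
#include <ostream>
#include <string>

const int ALPHABET_SIZE = 26;

// Adds the letters found in `text` to `counts` (one slot per letter A-Z),
// treating uppercase and lowercase as the same letter.
void countLetters(const std::string& text, int counts[]) {
    for (char c : text) {
        if (c >= 'a' && c <= 'z') {
            ++counts[c - 'a'];
        } else if (c >= 'A' && c <= 'Z') {
            ++counts[c - 'A'];
        }
    }
}

// Index of the largest count; the strict > keeps the alphabetically
// first letter when two letters tie.
int indexOfMostFrequent(const int counts[]) {
    int best = 0;
    for (int i = 1; i < ALPHABET_SIZE; ++i) {
        if (counts[i] > counts[best]) best = i;
    }
    return best;
}

// Prints each letter with its share of all counted letters,
// using 3 decimal places in a field 6 characters wide.
void printLetterFrequencies(std::ostream& out, const int counts[]) {
    int total = 0;
    for (int i = 0; i < ALPHABET_SIZE; ++i) total += counts[i];
    out << std::fixed << std::setprecision(3);
    for (int i = 0; i < ALPHABET_SIZE; ++i) {
        double percent = (total > 0) ? 100.0 * counts[i] / total : 0.0;
        out << static_cast<char>('A' + i) << ": "
            << std::setw(6) << percent << "%\n";
    }
}
```

The array must start at all zeros, for example int counts[ALPHABET_SIZE] = {0};, before the first call to countLetters. The least frequent letter can be found the same way as the most frequent one, with the comparison reversed.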


Example Run



Below is an example of the program running:

Enter as many sentences as you like and enter "I'm done!" when completed:
The quick brown fox jumps over the lazy dog.
A man, a plan, a canal, Panama.
The quick brown fox jumps over the lazy dog.
This is quite fun.
THE THe ThE The tHE tHe thE the
I'm done!

Word Counts
-----------
# 1 A       3
# 2 BROWN   2
# 3 CANAL   1
# 4 DOG     2
# 5 FOX     2
# 6 FUN     1
# 7 IS      1
# 8 JUMPS   2
# 9 LAZY    2
#10 MAN     1
#11 OVER    2
#12 PANAMA  1
#13 PLAN    1
#14 QUICK   2
#15 QUITE   1
#16 THE    12
#17 THIS    1

Most Frequent:  THE   (12)
Least Frequent: CANAL (1)

Letter Counts
-------------
A:  9.302%
B:  1.550%
C:  2.326%
D:  1.550%
E: 11.628%
F:  2.326%
G:  1.550%
H: 10.078%
I:  3.876%
J:  1.550%
K:  1.550%
L:  3.101%
M:  3.101%
N:  5.426%
O:  6.202%
P:  3.101%
Q:  2.326%
R:  3.101%
S:  3.101%
T: 10.853%
U:  4.651%
V:  1.550%
W:  1.550%
X:  1.550%
Y:  1.550%
Z:  1.550%

Most Frequent:  E (15)
Least Frequent: B (2)



Functional Requirements





Hints





Grading Rubric


Your submission will be graded according to the following rubric.

Points  Requirement Description
 2      All code submitted properly.
12      All labs completed and submitted: L6A, L6B, L6C, L6D, L6E, L6F
 4      Output format matches specifications from example.
 1      Output matches example test case.
 2      Output matches private test cases.
 5      Words and letters are properly sorted ignoring case.
 5      Arrays, vectors, string, structs used appropriately.
 2      Functional requirements above met.
 2      (1) Comments used
        (2) Coding style followed
        (3) Appropriate variable names, constants, and data types used
        (4) Instructions followed
35      Total Points



Submission


Always, always, ALWAYS update the header comments at the top of your main.cpp file. And if you ever get stuck, remember that there is LOTS of help available. The following instructions are copied from How to Submit Homework.


It is critical that you follow these steps when submitting homework.

If you do not follow these instructions, your assignment will receive a major deduction. Why all the fuss? Because we have several hundred of these assignments to grade, and we use computer tools to automate as much of the process as possible. If you deviate from these instructions, our grading tools will not work. And that makes us very unhappy. And when we're unhappy, we give penalties. Thus, make us happy.


Submission Instructions



Here are step-by-step instructions for submitting your homework properly:
  1. File and folder names are extremely important in this process. Please double-check carefully, to ensure things are named correctly.
    1. The top-level folder of your project must be named Set6
    2. Inside Set6, create 7 sub-folders that are required for this Set. The name of each sub-folder is defined in that Set (e.g. L6A, L6B, L6C, L6D, L6E, L6F, and A6).
    3. Copy your main.cpp, any additional header & source files, plus the CMakeLists.txt file into the subdirectories of Set6 (steps 1-2), zip this Set6 folder (steps 3-4), and then submit the zipped file (steps 5-11) to Canvas.
    4. For example, when you zip/submit Set6, there will be 7 sub-folders called L6A, L6B, L6C, L6D, L6E, L6F, and A6 inside the Set6 folder, and each of these sub-folders will have a file called main.cpp, any additional header & source files, plus the CMakeLists.txt file.

  2. Using Windows Explorer (not to be confused with Internet Explorer), find the file named "main.cpp" located inside the folder for the particular lab or homework assignment you will submit.

    STOP: Are you really sure you are viewing the correct assignment's folder?

  3. Now, for A6, right click on the main.cpp to copy the file. Then, return to the Set6/A6 folder and right click to paste the file. In other words, put a copy of your homework's main.cpp source code into the Set6/A6 folder. Repeat this for each additional header & source file you have with this assignment, plus CMakeLists.txt.

    Follow the same steps for L6A, to put a copy of your lab's main.cpp into the Set6/L6A folder. Repeat this process for Set6/L6B, Set6/L6C, Set6/L6D, Set6/L6E, Set6/L6F.

    STOP: Are you sure your Set6 folder now has all your code to submit?

  4. Now, right-click on the "Set6" folder.
    1. In the pop-up menu that opens, move the mouse to "Send to..." and expand the sub-menu.
    2. In the sub-menu that opens, select "Compressed (zipped) folder".

    STOP: Are you really sure you are zipping a Set6 folder with sub-folders that each contain a main.cpp file?

  5. After the previous step, you should now see a "Set6.zip" file.

  6. Now visit the Canvas page for this course and click the "Assignments" button in the sidebar.

  7. Find Set6, click on it, find the "Submit Assignment" area, and then click the "Choose File" button.

  8. Find the "Set6.zip" file created earlier and click the "Open" button.

    STOP: Are you really sure you are selecting the right homework assignment? Are you double-sure?

  9. WAIT! There's one more super-important step. Click on the blue "Submit Assignment" button to submit your homework.

  10. No, really, make sure you click the "Submit Assignment" button to actually submit your homework. Clicking the "Choose File" button in the previous step kind of makes it feel like you're done, but you must click the Submit button as well! And you must allow the file time to upload before you turn off your computer!

  11. Canvas should say "Submitted!". Click "Submission Details" and you can download the zip file you just submitted. In other words, verify you submitted what you think you submitted!
In summary, you must zip the "Set6" folder and only the "Set6" folder, this zip folder must have several sub-folders, you must name all these folders correctly, you must submit the correct zip file for this homework, and you must click the "Submit Assignment" button. Not doing these steps is like bringing your homework to class but forgetting to hand it in. No concessions will be made for incorrectly submitted work. If you incorrectly submit your homework, we will not be able to give you full credit. And that makes us unhappy.

