CS-1030 Software Design 2
Lab 3: File Text Search application

Objectives

Assignment

This is a two-part, two-week lab.

Following the design indicated in the UML class diagram below, write a console-based application that searches a text file for all occurrences of a given string and, for each occurrence, indicates whether the string was an entire word (i.e. surrounded by whitespace) or just part of a word. Further classify the "container" word as alpha-only, numeric-only, alphanumeric-only, or "general" (meaning it also contains non-alphanumeric characters like -, *, &, etc.). Finally, report the time required to complete the search and produce the output.

The application should prompt the user for the name of the file to be searched and the substring to be searched for within the file. After acquiring this information from the user, the application should report the result of the search, listing the 1-based column and 1-based line number of every occurrence of the substring found within the file.
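As a starting point for the column reporting described above, here is a minimal sketch of locating every occurrence of a substring within a single line. The function name findColumns is hypothetical and not part of the required design. It also assumes overlapping matches should each be reported, which this handout does not specify.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: returns the 1-based column of every occurrence
// of `needle` in `line`. Overlapping matches are reported (an assumption).
std::vector<std::size_t> findColumns(const std::string& line,
                                     const std::string& needle)
{
    std::vector<std::size_t> columns;
    if (needle.empty()) {
        return columns;                   // an empty search string matches nothing here
    }
    std::size_t pos = line.find(needle);  // find() returns a 0-based index or npos
    while (pos != std::string::npos) {
        columns.push_back(pos + 1);       // convert to the 1-based column to report
        pos = line.find(needle, pos + 1); // resume one character past this match
    }
    return columns;
}
```

For the line "thinking about that," and the search string "th", this sketch yields columns 1 and 16, which agrees with the sample output shown in this handout.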

Hints: Investigate <string>, <fstream>, <ctime>, <cctype>, <iostream>, and <iomanip>. Study the string::find() method and the clock() function in particular.
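One possible way to use clock() for the timing report is sketched below. The helper names elapsedSeconds and formatElapsed are hypothetical, and the message wording simply mirrors the sample output in this handout. Note that clock() is specified to report processor time used by the program, which may differ from wall-clock time.

```cpp
#include <ctime>
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: converts two clock() readings into elapsed seconds.
double elapsedSeconds(std::clock_t start, std::clock_t stop)
{
    return static_cast<double>(stop - start) / CLOCKS_PER_SEC;
}

// Hypothetical helper: formats the elapsed time with two decimal places.
std::string formatElapsed(double seconds)
{
    std::ostringstream out;
    out << "Search completed in " << std::fixed << std::setprecision(2)
        << seconds << " seconds.";
    return out.str();
}
```

In the application, you would take one clock() reading just before the search starts and a second one after the last result has been printed, then pass both to a helper like elapsedSeconds.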

One possible example of the user interface is as shown below. Yours does not have to look identical, but the output should appear neatly formatted and understandable.

Welcome to the CS1030 Search Program.
Enter the name of the file you want to search: example.txt
Enter the string you want the search to locate: th
Working...

Results of search for "th":

1: Line 1, Column 1 as part of the alphabetic word "thinking" :
"...thinking about that,..."

2: Line 1, Column 16 as part of the non-alphanumeric word "that," :
"...thinking about that,..."

3: Line 2, Column 3 as part of the alphabetic word "thought" :
"...I thought there would be..."

4: Line 2, Column 11 as part of the alphabetic word "there" :
"...I thought there would be..."

5: Line 3, Column 4 as part of the alphabetic word "further" :
"...further opportunities..."

6: Line 5, Column 12 as part of the alphabetic word "depth" :
"...in more depth..."


Search completed in 0.03 seconds.

 

If the substring was not found anywhere within the file, the application should output

The string "asdf" you specified was not found in example.txt.

The UML diagram illustrating the skeletal structure of your application is shown below:

Implementation Details

Follow the "blueprint" given by the UML diagram; you must implement the public aspects of the classes and functions as shown.

The application contains three global functions: main(), getUserInput(), and printOutput(). These should be implemented in a file called TextSearchApp.cpp. Together, these global functions prompt the user for input, create a FileTextSearcher object, call the methods on the object to do the search and return the results, and print the results to the console.

The FileTextSearcher class should be implemented in FileTextSearcher.cpp and implement the following public member methods:

FileTextSearcher(std::string file, std::string searchString) - constructor; first parameter is a string argument containing the full path to the file to be searched. Second parameter is a string argument containing the string to be searched for. Initializes any private member variables you may need. Since constructors cannot return any type of value indicating success or failure, this constructor throws an exception if the file does not exist, cannot be opened, or if searchString is empty. The exception thrown is just a std::string, so it's easy to catch from your main function.

SearchResult getNextSearchResult() - searches the specified text file. The first call begins at the start of the file; you call this method repeatedly, and each call creates and returns a SearchResult object containing the next occurrence of searchString. When the end of the file is reached, an empty SearchResult object is returned.

~FileTextSearcher() - destructor; closes the file stream used to read the file.
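The constructor's error reporting described above might be sketched as follows. SearcherSketch and tryConstruct are hypothetical names used only for illustration, and the error message wording is an assumption; your FileTextSearcher will need additional members to support the actual search.

```cpp
#include <fstream>
#include <string>

// Hypothetical stand-in for the validation part of the constructor.
class SearcherSketch
{
public:
    SearcherSketch(std::string file, std::string searchString)
        : m_in(file.c_str()), m_searchString(searchString)
    {
        if (m_searchString.empty()) {
            throw std::string("The search string must not be empty.");
        }
        if (!m_in.is_open()) {
            throw std::string("Could not open file: ") + file;
        }
    }

private:
    std::ifstream m_in;          // closed automatically when the object is destroyed
    std::string m_searchString;  // the text to look for
};

// Hypothetical caller: returns the error message, or "" on success.
std::string tryConstruct(const std::string& file, const std::string& searchString)
{
    try {
        SearcherSketch searcher(file, searchString);
        return "";
    } catch (const std::string& message) {  // catch by const reference
        return message;
    }
}
```

Catching the std::string by const reference avoids copying the message. In your main function, the catch block is a natural place to print the message and end the program.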

FileTextSearcher never writes anything to the console; it is only a "search engine" and leaves IO responsibility to the caller - which in this case is your main application (consisting of the three global functions).
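To illustrate the division of labor described above, the following sketch shows a search engine that keeps its position between calls and a caller that keeps asking for results until an empty one comes back. The names Hit, StreamSearcher, and countMatches are hypothetical simplifications, not the required classes. The sketch reads from any std::istream rather than a file and assumes a non-empty search string.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical, simplified result; lineText is empty when nothing more was found.
struct Hit
{
    int line;
    int column;
    std::string lineText;
};

// Hypothetical stand-in for the search engine; it never prints anything.
class StreamSearcher
{
public:
    StreamSearcher(std::istream& in, const std::string& needle)
        : m_in(in), m_needle(needle), m_lineNumber(0), m_offset(0) {}

    // Returns the next occurrence, resuming where the previous call stopped.
    Hit next()
    {
        for (;;) {
            if (m_lineNumber > 0) {
                const std::size_t pos = m_current.find(m_needle, m_offset);
                if (pos != std::string::npos) {
                    m_offset = pos + 1;  // the next call resumes just past this match
                    return Hit{m_lineNumber, static_cast<int>(pos) + 1, m_current};
                }
            }
            if (!std::getline(m_in, m_current)) {
                return Hit{0, 0, ""};    // end of input: the "empty" result
            }
            ++m_lineNumber;
            m_offset = 0;
        }
    }

private:
    std::istream& m_in;        // source of lines
    std::string m_needle;      // text to look for
    std::string m_current;     // line currently being searched
    int m_lineNumber;          // 1-based number of m_current (0 before the first read)
    std::size_t m_offset;      // 0-based index at which the next find() starts
};

// Hypothetical caller loop: all counting (or printing) happens outside the searcher.
int countMatches(const std::string& text, const std::string& needle)
{
    std::istringstream in(text);
    StreamSearcher searcher(in, needle);
    int count = 0;
    for (Hit hit = searcher.next(); !hit.lineText.empty(); hit = searcher.next()) {
        ++count;
    }
    return count;
}
```

The key design point is that the searcher remembers the current line, its line number, and the offset of the last match between calls, so the caller never has to track any of that.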

 

The SearchResult class should be implemented in SearchResult.cpp and implement the following public member methods:

SearchResult( int nLine, int nCol, string lineWhereFound ) - constructor; used to create a SearchResult object containing the result of a search. If lineWhereFound is empty, the isEmpty method returns true.

bool isEmpty() - returns true if the object is empty (negative search result), false if the object contains a positive search result.

int getLineNumber() - returns the 1-based line number where the search result was located.

int getColumnNumber() - returns the 1-based column number where the search result was located.

string getLineText() - returns the line containing the search result.

bool isAlphabetic() - returns true if the search result was found within an alphabetic token.

bool isNumeric() - returns true if the search result was found within a numeric token.

bool isAlphanumeric() - returns true if the search result was found within an alphanumeric token.

~SearchResult() - destructor; cleans up anything as appropriate.
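The token classification behind isAlphabetic, isNumeric, and isAlphanumeric could be approached along the lines of the sketch below, which uses the character tests from <cctype>. The names containerWord, WordKind, and classify are hypothetical. The sketch also assumes the four categories are mutually exclusive (so "abc" is alphabetic but not alphanumeric), which this handout does not state explicitly.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: returns the whitespace-delimited word of `line`
// containing the 0-based index `pos`. Assumes pos < line.size().
std::string containerWord(const std::string& line, std::size_t pos)
{
    std::size_t begin = pos;
    while (begin > 0 &&
           !std::isspace(static_cast<unsigned char>(line[begin - 1]))) {
        --begin;  // walk left to the start of the word
    }
    std::size_t end = pos;
    while (end < line.size() &&
           !std::isspace(static_cast<unsigned char>(line[end]))) {
        ++end;    // walk right to one past the end of the word
    }
    return line.substr(begin, end - begin);
}

// Hypothetical categories, treated as mutually exclusive (an assumption).
enum WordKind { Alphabetic, Numeric, Alphanumeric, General };

WordKind classify(const std::string& word)
{
    if (word.empty()) {
        return General;
    }
    bool allAlpha = true;
    bool allDigit = true;
    bool allAlnum = true;
    for (char ch : word) {
        // The <cctype> functions require a value representable as unsigned char.
        const unsigned char c = static_cast<unsigned char>(ch);
        if (!std::isalpha(c)) allAlpha = false;
        if (!std::isdigit(c)) allDigit = false;
        if (!std::isalnum(c)) allAlnum = false;
    }
    if (allAlpha) return Alphabetic;
    if (allDigit) return Numeric;
    if (allAlnum) return Alphanumeric;
    return General;
}
```

With a helper like containerWord, the "entire word" test from the assignment reduces to checking whether the container word is equal to the search string.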

Lab Reports

This is a two-week lab, with the final lab report due at 11:00pm the day before Lab 4 begins. However, there is an initial lab report due at 11:00pm the day before the second Lab 3 session. The content of each lab report is described below.

After 1 week, you must submit completed UML diagrams, created with Enterprise Architect - a single EA project file will contain both required diagrams. The first must show an accurate, detailed static class diagram (ALL member variables and ALL methods for each class), as well as a detailed sequence diagram of the search operation (for a single search), showing the call sequence from one object (or function, as the case may be) to another. Start with this EA project file as a basis for your work.

In addition, you must have a preliminary implementation of your classes - that is, the header files containing the class declarations must be completed, with both member variable and member function declarations in place. These declarations must be thoroughly commented, particularly with respect to the member variables you decide to employ. You should also insert comments on the operation of your class methods (e.g. how they generally work; what library functions they use). Look at Internal documentation as a guideline for comments. Also, if you decide you need additional private member functions (methods), you should thoroughly comment what they do as well. You must give some serious thought to how your application and classes will work in order to get this done correctly.

At the end of week 2 (by 11:00pm the day before Lab 4 begins), you must submit your final materials, including an updated UML diagram, your completed program, and a lab report using this Word document as a template. Any changes to your UML diagrams or class declarations must be clearly explained and justified in your lab report.

You must demonstrate your program to me during Lab 4.

Submit your assignment following these instructions:
  1. Navigate to your VS .NET project directory (see this note on configuring the workspace).
  2. You should have a subdirectory under your VS .NET project directory with the name of the project you used for this lab. 
  3. Upload your submission through WebCT (assignment "Lab 3"), selecting only the .cpp, .h, .sln, .vcproj, and .doc files from this directory. Upload each file separately; do not zip files.
  4. Enter the overall time you spent on this lab into the FAST system (for weeks 3 and 4).
Be sure to keep copies of all your files, in case something gets lost.

Your lab grade will be determined by the following factors:

Program quality
Report quality