Commit 59173d3

Update README.md
1 parent bd20e11 commit 59173d3

1 file changed: +57 -0 lines changed

README.md

Lines changed: 57 additions & 0 deletions
@@ -1,2 +1,59 @@
# datamining-project
This is a project for FGCU's knowledge discovery and data mining course.
It focuses on spam email classification.

Part 1 of the project is a from-scratch Java 8 implementation of a spam email classifier,
using a Naive Bayes classifier and a K-nearest-neighbors classifier.

Part 2 of the project is a pair of experiments in scikit-learn with various other algorithms.

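The Java sources for Part 1 are not reproduced in this README. As a rough illustration of the Naive Bayes approach named above, here is a minimal Python sketch of a word-count classifier with add-one smoothing. The function names (`train_naive_bayes`, `predict_spam`), the pre-tokenized input, and the choice of smoothing are assumptions made for this example; the project's own implementation may differ.

```python
import math
from collections import Counter


def train_naive_bayes(docs):
    """docs: list of (tokens, is_spam) pairs. Returns the counts used for prediction.

    Hypothetical helper for illustration; assumes both classes appear in docs.
    """
    word_counts = {True: Counter(), False: Counter()}
    doc_counts = Counter()
    for tokens, is_spam in docs:
        doc_counts[is_spam] += 1
        word_counts[is_spam].update(tokens)
    vocab = set(word_counts[True]) | set(word_counts[False])
    return word_counts, doc_counts, vocab


def predict_spam(model, tokens):
    """Return True if the spam class has the higher log-probability score."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    scores = {}
    for label in (True, False):
        # Log prior for the class.
        score = math.log(doc_counts[label] / total_docs)
        # Denominator for add-one (Laplace) smoothed word likelihoods.
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in tokens:
            if tok in vocab:  # words never seen in training are skipped
                score += math.log((word_counts[label][tok] + 1) / denom)
        scores[label] = score
    return scores[True] > scores[False]
```

Working in log space avoids numeric underflow when many small word probabilities are multiplied together.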
## Run instructions:

### Ensure that the dataset meets the following criteria:
- All files must be .txt files
- All files must be stored alone in their directory with no other files
- All files must be divided between two subdirectories, `test` and `training`
- All spam emails are expected to have a filename beginning with "sp"

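The criteria above can be checked before running either part. The following Python sketch is one way to do that under stated assumptions: `check_dataset` is a hypothetical helper and not part of the project, and it assumes the `test` and `training` subdirectories sit directly under the data path.

```python
from pathlib import Path


def check_dataset(data_path):
    """Return {split: [(path, is_spam), ...]} or raise ValueError on a layout problem.

    Hypothetical helper for illustration; it is not part of the project.
    """
    root = Path(data_path)
    splits = {}
    for split in ("training", "test"):
        folder = root / split
        if not folder.is_dir():
            raise ValueError(f"missing subdirectory: {folder}")
        labelled = []
        for entry in sorted(folder.iterdir()):
            # Criteria: only .txt files, with nothing else in the directory.
            if not entry.is_file() or entry.suffix != ".txt":
                raise ValueError(f"unexpected entry: {entry}")
            # Criterion: spam filenames begin with "sp".
            labelled.append((entry, entry.name.startswith("sp")))
        splits[split] = labelled
    return splits
```

Calling `check_dataset("path_to_data")` either returns the labelled file lists for both splits or stops at the first entry that breaks a rule.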
### Usage:

Part 1:

```
usage: Classify
    data_path
    algorithm_name  (Either "knn" or "naivebayes")
    [k]             The K value to use - only used if algorithm is "knn"
```

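To make the argument rules for Part 1 concrete, here is a small Python sketch that mirrors them with `argparse`. It is not the Java program's actual parsing code, and `parse_classify_args` is a hypothetical name. What happens when "knn" is chosen without a K value is not documented in the usage text, so the sketch simply leaves `k` unset in that case.

```python
import argparse


def parse_classify_args(argv):
    """Parse arguments in the shape documented for Classify (illustrative only)."""
    parser = argparse.ArgumentParser(prog="Classify")
    parser.add_argument("data_path")
    parser.add_argument("algorithm_name", choices=["knn", "naivebayes"])
    # Optional positional: the K value, only meaningful for "knn".
    parser.add_argument("k", nargs="?", type=int)
    args = parser.parse_args(argv)
    if args.algorithm_name != "knn":
        args.k = None  # k is ignored for any algorithm other than "knn"
    return args
```

For example, `parse_classify_args(["path_to_data", "knn", "5"])` yields `k == 5`, while the same call with "naivebayes" discards the K value.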
Part 2:

```
usage: python experiments.py

There are no arguments - the program is merely a script which conducts experiments with various values.
```

### Run instructions:

Part 1:

1. Ensure that you have Java 8 installed (compiling requires a JDK, not only the JRE).
2. Compile the program using Gradle (from the project's part1 subdirectory):
   `./gradlew build`
   The generated Java class files will then be in the /build/classes/java/main directory.
   (If this fails, the program can also be compiled like a standard Java program.)
3. Run the program as `java Classify arguments`.
   The arguments are described in the Usage section above.

Part 2:

1. Ensure that you have Python 3.7 or greater installed.
   a. If you have _both_ Python 2 and Python 3 installed, you will need to ensure you run the program using `python3`.
2. Ensure that you have pip installed.
   a. If you have _both_ Python 2 and Python 3 installed, you will need to run pip commands using `pip3`.
3. Install scikit-learn: `pip install scikit-learn`
4. Install numpy: `pip install numpy`
5. Run the program as: `python experiments.py path_to_data`
