
ownx/email_classifier

Email Classification Project

This is my project for classifying emails by type (Type 2, Type 3, and Type 4). I tried two different approaches:

  1. Chained approach - where each model feeds into the next one
  2. Hierarchical approach - where models are organized in a tree structure

My Project Files

The main folders are:

  • config - configuration settings
  • data - for loading emails and cleaning them
  • models - contains my classifier code
  • utils - helper functions I wrote

The main file to run is main.py.

Stuff you need to install

You need these packages:

  • scikit-learn (for ML algorithms)
  • numpy
  • pandas (not using it much yet)
  • nltk (for text processing)

I think Python 3.7 or newer should work fine.

How to run it

Just run this command in the terminal:

python main.py

How it works

Chained Approach

In this approach:

  • First I predict if it's Type 2 or not
  • Then I use that result to help predict if it's Type 3
  • Finally I use both previous results to predict Type 4

The idea is that the types may be related to each other, so an earlier prediction can give the later models extra information.
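Here is a minimal sketch of the chained idea using scikit-learn. It is not the code in models; the function names (fit_chain, predict_chain), the TF-IDF features, the logistic regression classifier, and the sample emails and labels are all assumptions made up for illustration. It also assumes the true labels are appended as extra features during training and the predicted labels during prediction.

```python
import numpy as np
from scipy.sparse import hstack
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import OneHotEncoder


def fit_chain(texts, labels_per_level):
    """Fit one classifier per level; each later level also sees the earlier labels."""
    vectorizer = TfidfVectorizer()
    features = vectorizer.fit_transform(texts)
    models, encoders = [], []
    for labels in labels_per_level:
        models.append(LogisticRegression(max_iter=1000).fit(features, labels))
        # Assumption: append the TRUE labels (one-hot) as extra columns for the next level.
        encoder = OneHotEncoder(handle_unknown="ignore")
        extra = encoder.fit_transform(np.asarray(labels).reshape(-1, 1))
        encoders.append(encoder)
        features = hstack([features, extra]).tocsr()
    return vectorizer, models, encoders


def predict_chain(chain, texts):
    """Predict level by level, feeding each prediction into the next model."""
    vectorizer, models, encoders = chain
    features = vectorizer.transform(texts)
    predictions = []
    for model, encoder in zip(models, encoders):
        predicted = model.predict(features)
        predictions.append(predicted)
        # At prediction time the PREDICTED labels take the place of the true ones.
        extra = encoder.transform(predicted.reshape(-1, 1))
        features = hstack([features, extra]).tocsr()
    return predictions


# Hypothetical toy data, only to show the call shape.
texts = [
    "refund my order please",
    "cannot log in to my account",
    "refund for broken item",
    "password reset not working",
] * 3
type2 = ["billing", "access", "billing", "access"] * 3
type3 = ["refund", "login", "refund", "password"] * 3
type4 = ["order", "account", "item", "reset"] * 3

chain = fit_chain(texts, [type2, type3, type4])
pred2, pred3, pred4 = predict_chain(chain, ["please refund my item", "login fails"])
```

A mistake in the Type 2 prediction carries forward into Type 3 and Type 4, which is the main trade-off of chaining this way.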

Hierarchical Approach

In this approach:

  • First figure out Type 2
  • Depending on Type 2 result, use a specific model for Type 3
  • Then use both results to pick the right model for Type 4
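The steps above can be sketched roughly like this. Again, this is an illustration and not the code in models: the names fit_node, fit_tree, and predict_tree, the TF-IDF features, the logistic regression classifier, and the toy data are assumptions. One extra assumption is that a branch with only a single label falls back to a constant predictor (scikit-learn's DummyClassifier), since a classifier cannot be trained on one class.

```python
import numpy as np
from sklearn.dummy import DummyClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression


def fit_node(features, labels):
    """Fit the model for one tree node; use a constant predictor if only one label exists."""
    if len(set(labels)) < 2:
        return DummyClassifier(strategy="most_frequent").fit(features, labels)
    return LogisticRegression(max_iter=1000).fit(features, labels)


def fit_tree(texts, type2, type3, type4):
    """Train one Type 2 model, one Type 3 model per Type 2 value,
    and one Type 4 model per (Type 2, Type 3) pair."""
    vectorizer = TfidfVectorizer()
    features = vectorizer.fit_transform(texts)
    type2, type3, type4 = (np.asarray(labels) for labels in (type2, type3, type4))
    root = fit_node(features, type2)
    level3, level4 = {}, {}
    for t2 in np.unique(type2):
        rows2 = np.flatnonzero(type2 == t2)
        level3[t2] = fit_node(features[rows2], type3[rows2])
        for t3 in np.unique(type3[rows2]):
            rows3 = np.flatnonzero((type2 == t2) & (type3 == t3))
            level4[(t2, t3)] = fit_node(features[rows3], type4[rows3])
    return vectorizer, root, level3, level4


def predict_tree(tree, texts):
    """Walk down the tree: the Type 2 result picks the Type 3 model, and so on."""
    vectorizer, root, level3, level4 = tree
    features = vectorizer.transform(texts)
    results = []
    for i, t2 in enumerate(root.predict(features)):
        row = features[i]
        t3 = level3[t2].predict(row)[0]
        t4 = level4[(t2, t3)].predict(row)[0]
        results.append((t2, t3, t4))
    return results


# Hypothetical toy data, only to show the call shape.
texts = [
    "refund my order please",
    "cannot log in to my account",
    "refund for broken item",
    "password reset not working",
] * 3
type2 = ["billing", "access", "billing", "access"] * 3
type3 = ["refund", "login", "refund", "password"] * 3
type4 = ["order", "account", "item", "reset"] * 3

tree = fit_tree(texts, type2, type3, type4)
results = predict_tree(tree, ["please refund my item", "login fails"])
```

Because each lower model only sees the emails of its own branch, it can only predict labels that occurred under that branch in training, but it also has less data to learn from.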

I'm still working on improving the accuracy. The hierarchical one is more complex but might work better for some email types.
