
Bus Lines Classification using DTW and LCS as similarity measures


giorgospan/Bus-Lines-Classification


Bus Lines Categorization

This is the second assignment of the "Data Mining" course (spring 2018).

Requirements:

Dataset format:

[Image not available]

  • Part 1: The purpose of this part is to get familiar with the gmplot Python library by visualizing 5 different bus lines (i.e., journeyPatternIDs).

  • Part 2: For every bus line (i.e., trajectory) in test_set_a1.csv, we need to find its 5 nearest neighbors* in the train_set.csv file. We use Dynamic Time Warping (DTW) as the similarity measure between two trajectories.

  • Part 3: In this part we do the same as in Part 2, except that we use Longest Common Subsequence (LCS) as the similarity measure.

  • Part 4: The main task of this part is to predict the bus line that each trajectory in test_set_a2.csv belongs to. For this purpose we build a standard k-nearest-neighbors (KNN) classifier.
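
The two similarity measures used in Parts 2 and 3 can be sketched as follows. This is a minimal illustration under stated assumptions, not the repository's actual code: the function names, the haversine point distance, and the 0.2 km matching threshold for LCS are all hypothetical, since the README does not say which point distance or threshold is used. Trajectories are assumed to be lists of (longitude, latitude) pairs.

```python
import math


def haversine_km(p, q):
    """Great-circle distance in km between two (lon, lat) points (assumed metric)."""
    lon1, lat1, lon2, lat2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def dtw_distance(a, b, dist=haversine_km):
    """Dynamic Time Warping cost between sequences a and b (lower = more similar)."""
    inf = float("inf")
    # prev[j] holds the DTW cost of aligning the first i-1 points of a
    # with the first j points of b; row 0 is the empty-prefix boundary.
    prev = [0.0] + [inf] * len(b)
    for i in range(1, len(a) + 1):
        curr = [inf] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            cost = dist(a[i - 1], b[j - 1])
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[len(b)]


def points_match(p, q, eps_km=0.2):
    """Two points 'match' if they lie within eps_km of each other (assumed threshold)."""
    return haversine_km(p, q) <= eps_km


def lcs_length(a, b, match=points_match):
    """Length of the longest common subsequence of a and b (higher = more similar)."""
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if match(a[i - 1], b[j - 1]):
                curr[j] = prev[j - 1] + 1
            else:
                curr[j] = max(prev[j], curr[j - 1])
        prev = curr
    return prev[len(b)]
```

Note the opposite orientation of the two measures: DTW returns a cost, so the nearest neighbors are the training trajectories with the smallest values, while LCS returns a match count, so the nearest neighbors are those with the largest values. Both functions keep only two rows of the dynamic-programming table, which keeps memory proportional to the length of the second trajectory.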

The computations are parallelized with Python's multiprocessing module.
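
One way such parallelism might look is sketched below, since each test trajectory can be compared against the training set independently of the others. The README does not describe how the work is actually split, so everything here is an assumption: `squared_gap` is a hypothetical stand-in for an expensive comparison such as DTW or LCS, and `best_match` and `match_all` are illustrative names.

```python
from functools import partial
from multiprocessing import Pool


def squared_gap(query, reference):
    """Hypothetical placeholder for an expensive comparison (e.g. DTW)."""
    return (query - reference) ** 2


def best_match(query, references):
    """Index of the reference that is closest to query."""
    return min(range(len(references)),
               key=lambda i: squared_gap(query, references[i]))


def match_all(queries, references, processes=1):
    """Find the best match for every query, optionally across several processes."""
    # partial() of a module-level function is picklable, so it can be sent to workers.
    worker = partial(best_match, references=references)
    if processes == 1:
        # Serial fallback, handy for debugging.
        return [worker(q) for q in queries]
    with Pool(processes=processes) as pool:
        # One task per query; results come back in the order of the queries.
        return pool.map(worker, queries)
```

When running with `processes` greater than 1 from a script, the call to `match_all` should sit under an `if __name__ == "__main__":` guard, because on platforms that start workers by spawning a fresh interpreter the main module is imported again in each worker.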

* By "neighbors" we mean the 5 trajectories most similar to the one currently being tested.
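
The neighbor search and the KNN prediction of Part 4 can be sketched as below. This is a hedged illustration rather than the repository's classifier: `k_nearest`, `predict_label`, and `toy_distance` are hypothetical names, the training set is assumed to be a list of (journeyPatternID, trajectory) pairs, majority voting is assumed as the decision rule, and any distance function where lower means more similar (such as DTW) can be passed in.

```python
from collections import Counter


def k_nearest(query, train, distance, k=5):
    """Return the k (distance, label) pairs closest to query, nearest first.

    train is a list of (label, trajectory) pairs.
    """
    scored = [(distance(query, trajectory), label) for label, trajectory in train]
    scored.sort(key=lambda pair: pair[0])  # stable sort: ties keep training order
    return scored[:k]


def predict_label(query, train, distance, k=5):
    """Majority vote among the k nearest neighbors."""
    neighbors = k_nearest(query, train, distance, k)
    votes = Counter(label for _, label in neighbors)
    # On a tied vote, most_common() returns the label seen first,
    # i.e. the one belonging to the nearer neighbor.
    return votes.most_common(1)[0][0]


def toy_distance(a, b):
    """Hypothetical stand-in for DTW: sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


train = [("A", [0, 0]), ("A", [0, 1]), ("B", [5, 5]),
         ("B", [5, 6]), ("A", [1, 1]), ("B", [9, 9])]
print(predict_label([0, 0], train, toy_distance, k=3))
```

For a similarity measure where higher means more similar, such as an LCS length, the same functions can be reused by passing a distance that negates the similarity.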
