In this exploration we analyze the free-text summary written for each incident. Instead of relying on keyword searches or word clouds to decide which of the 650 reports to read, we visualize the reports in bulk. Rather than simply counting words, we identify the words that are most unusual in each report, then use a clustering algorithm to group similar reports. For this analysis we only link reports whose similarity is 50% or higher.
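The approach described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the exact pipeline used for the analysis: it assumes that "unusual" words are scored with TF-IDF, that similarity is cosine similarity, and that clusters are the connected groups of reports linked at or above the threshold. The function names and the five sample reports are made up for the example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def tfidf_vectors(docs):
    """Weight each word by how unusual it is across the reports (TF-IDF).

    Assumption: the original analysis does not name its weighting scheme;
    TF-IDF is used here as one common choice.
    """
    tokenized = [tokenize(d) for d in docs]
    n = len(docs)
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))
    vectors = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        # A word found in every report gets weight log(1) = 0.
        vectors.append({w: c * math.log(n / doc_freq[w]) for w, c in term_freq.items()})
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse word-weight vectors."""
    dot = sum(v * b.get(w, 0.0) for w, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def cluster_reports(docs, threshold=0.5):
    """Link reports whose similarity is >= threshold and return the linked groups."""
    vectors = tfidf_vectors(docs)
    parent = list(range(len(docs)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(docs)):
        for j in range(i + 1, len(docs)):
            if cosine(vectors[i], vectors[j]) >= threshold:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(len(docs)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Made-up sample summaries, for illustration only.
reports = [
    "hoax bomb threat reported at metro station",
    "hoax bomb threat reported at metro station, no casualties",
    "armed attack on police checkpoint near highway",
    "armed attack on police checkpoint near border",
    "flooding closed the harbor",
]

clusters = cluster_reports(reports, threshold=0.5)
print(clusters)  # [[0, 1], [2, 3], [4]]
```

Raising the threshold makes the links stricter: at 0.9 none of these sample reports are linked, so each one ends up in its own group.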
Clustering the incident reports this way shows that most of the similar incidents reported in metro stations and administration buildings were hoaxes or false reports that led to no casualties.