Chapter 7 Conclusion
Our analysis gives us a deeper understanding of the relationships between dog bites and breed, borough, time, gender, spayed/neutered status, and age. These findings could help the management department and dog owners exercise more targeted control of, and care for, dogs.
Our analysis of breeds found that “Pit Bull” accounts for the highest proportion of dog bites. To determine whether this reflects the breed’s larger population among dogs in NYC or a natural tendency toward aggression, we compared the bite data with the NYC Dog Licensing Dataset, and we think the reason may be the aggression of Pit Bulls rather than their population size. Following this approach of combining our data with the population composition of dogs, we analyzed the relationships between dog bites and other characteristics, with several revealing findings. Queens has the largest number of dog bites, which we think may be due to inadequate management or related policies in this borough. The aggression of dogs may be related to temperature: the higher the temperature, the more dog bites occur. Dogs of unknown gender are more likely to bite people, and spayed/neutered status has little influence on the number of dog bites for either male or female dogs. We also think that male dogs may have a greater natural tendency toward aggression than female dogs. Younger dogs (i.e., 1-5 years old) are more likely to bite people.
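The population-adjusted comparison described above can be illustrated with a minimal Python sketch. It assumes that both datasets have been reduced to plain lists of breed names. The records below are invented toy values, not figures from the Dog Bite Data or the NYC Dog Licensing Dataset, and the function name is hypothetical.

```python
from collections import Counter

# Hypothetical toy records (NOT real data): one entry per bite report
# and one entry per licensed dog, each holding only the breed name.
bite_breeds = ["Pit Bull", "Pit Bull", "Pit Bull", "Shih Tzu", "Chihuahua"]
licensed_breeds = ["Pit Bull"] * 4 + ["Shih Tzu"] * 10 + ["Chihuahua"] * 6


def bites_per_licensed_dog(bites, licenses):
    """Return {breed: bite count / licensed count} for every licensed breed."""
    bite_counts = Counter(bites)
    license_counts = Counter(licenses)
    # Counter returns 0 for breeds with no recorded bites.
    return {breed: bite_counts[breed] / n for breed, n in license_counts.items()}


rates = bites_per_licensed_dog(bite_breeds, licensed_breeds)

# List breeds from the highest to the lowest bite rate.
for breed, rate in sorted(rates.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{breed}: {rate:.2f} bites per licensed dog")
```

A breed that leads in raw bite counts only because it is common would show an unremarkable rate here, while a breed whose rate stays high after dividing by its licensed population is over-represented among bites. The same division could be applied to borough, gender, or age group in place of breed.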
These findings could remind dog owners to pay closer attention to their dogs when walking them under conditions that our analysis associates with more dog bites. For example, if you walk a young male Pit Bull in Queens on a hot summer day, you should keep it under closer control, since our analysis suggests that a dog is more aggressive under these conditions. Our findings could also benefit the management department. For example, it could investigate the reasons for the high proportion of dog bites in Queens and improve the management or policies accordingly.
Our analysis has several limitations. First, the data are not entirely satisfactory. For example, if we could match each dog in the Dog Bite Data to its license in the NYC Dog Licensing Data, we would have data on both “dogs that bite people” and “dogs that do not bite people”. We could then train a classifier that, given the breed, age, gender, spayed/neutered status, borough, and ZIP code of a dog, predicts the probability that the dog attacks people. Second, the analysis itself is limited. For example, we did not find many relationships among the different characteristics of dogs that bite people. We also stopped analyzing dog bites across boroughs in more depth because we are not familiar with the specific management or policies in these boroughs.
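To make the matching idea concrete, the following is a minimal Python sketch of how labeled data could be built if a shared key existed. The `license_id` field is purely hypothetical (no such key links the two datasets today), and all records are invented. Instead of a trained classifier, the sketch uses a much simpler stand-in: a Laplace-smoothed bite frequency per group.

```python
from collections import defaultdict

# Hypothetical licensed dogs; "license_id" is an assumed key that the
# real datasets do not share. All values are invented for illustration.
licenses = [
    {"license_id": 1, "breed": "Pit Bull", "gender": "M"},
    {"license_id": 2, "breed": "Pit Bull", "gender": "M"},
    {"license_id": 3, "breed": "Pit Bull", "gender": "F"},
    {"license_id": 4, "breed": "Shih Tzu", "gender": "F"},
    {"license_id": 5, "breed": "Shih Tzu", "gender": "F"},
    {"license_id": 6, "breed": "Shih Tzu", "gender": "M"},
]
# Hypothetical set of license ids that appear in the bite reports.
bite_license_ids = {1, 2}


def label_dogs(dogs, bite_ids):
    """Attach a boolean 'bite' label to every licensed dog."""
    return [dict(dog, bite=dog["license_id"] in bite_ids) for dog in dogs]


def bite_probability(labeled, keys, alpha=1.0):
    """Laplace-smoothed estimate of P(bite | group), grouped by `keys`."""
    counts = defaultdict(lambda: [0, 0])  # group -> [bites, total dogs]
    for dog in labeled:
        group = tuple(dog[k] for k in keys)
        counts[group][0] += dog["bite"]  # True counts as 1, False as 0
        counts[group][1] += 1
    return {g: (b + alpha) / (n + 2 * alpha) for g, (b, n) in counts.items()}


labeled = label_dogs(licenses, bite_license_ids)
probs = bite_probability(labeled, keys=("breed",))
for group, p in probs.items():
    print(group, round(p, 2))
```

Once such labels exist, the per-group estimate could be replaced by a proper classifier over breed, age, gender, spayed/neutered status, borough, and ZIP code. The smoothing term `alpha` only keeps small groups from producing probabilities of exactly 0 or 1.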
In the future, we will look for more useful data sources to help us further analyze the characteristics of dog bite events. If we can find a data source for licensed dogs that includes a “Bite or Not” column, we will try to train the classifier mentioned above to predict dog bite events. We also realize that we lack proficiency in many data analysis and visualization tools, so we could not always choose the best tool for a given situation. We will work to gain a more thorough understanding of these tools.