The similarities analysis of location fishing information through 2 step clustering

Similarity analysis of fishing operation information by sea block through two-step cluster analysis

  • Cho, Yong-Jun (National Federation of Fisheries Cooperatives, Fisheries Economic Institute)
  • 조용준 (수협중앙회 수산경제연구원)
  • Published : 2009.05.31

Abstract

This paper proposes a way to make use of the Fishing Operation Information (FOI) of the National Federation of Fisheries Cooperatives (NFFC), based on a usability analysis, and derives similarity information by sea block by classifying the characteristics of fishing patterns by location. Although the catch recorded in FOI amounts to only about 33% of the catch in the National Fishery Production Statistics (NFPS), the usability analysis comparing FOI with NFPS showed that the two sources follow very similar patterns and are closely correlated, so FOI data are useful for understanding fishing operation patterns by location. On this basis, two-step cluster analysis by large marine zone was used to determine the optimal clusters for catch, number of fishing days, and number of fishing vessels, and the fishing patterns were classified accordingly.
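The usability comparison described above rests on two quantities: how much of the official catch FOI covers, and how closely the two series move together. A minimal sketch of that comparison follows. The monthly figures and the function names `coverage_ratio` and `pearson` are hypothetical, chosen only for illustration; they are not data or code from the paper.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def coverage_ratio(foi, nfps):
    """Share of the official total catch that the FOI records capture."""
    return sum(foi) / sum(nfps)


# Hypothetical monthly catches in tonnes; NOT data from the paper.
nfps_catch = [300.0, 360.0, 450.0, 540.0, 480.0, 390.0]
foi_catch = [100.0, 118.0, 150.0, 181.0, 158.0, 130.0]

ratio = coverage_ratio(foi_catch, nfps_catch)
r = pearson(foi_catch, nfps_catch)
print(f"coverage = {ratio:.2f}, correlation = {r:.3f}")
```

A low coverage ratio combined with a high correlation is the situation the abstract describes: the smaller source understates totals but still tracks the same pattern.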

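The NFFC's fishing vessel operation information has the advantage of containing location-specific fishing information that the official national statistics lack. Location-specific fishing information can be used to estimate fishery damage compensation, resource values, and similar quantities for a given area, so its value as national statistical data is very high; its weakness is reduced reliability, because fishers are reluctant to disclose information about themselves. This study aimed to propose, through a usability analysis, a way to make use of the NFFC's fishing vessel operation information, and to derive similarity information by sea block by classifying the characteristics of fishing patterns by location. The analysis showed that although the NFFC information covers only about 33% of the catch reported in the government production statistics, its patterns and correlation with those statistics are close, so it is useful for identifying patterns by location. On this basis, two-step cluster analysis by large sea block was used to determine the optimal clusters for catch, number of fishing days, and number of fishing vessels separately, and these results were combined to classify the patterns into 8 clusters.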
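The general idea of a two-step cluster analysis can be sketched as follows: a single sequential pass first compresses the records into small pre-clusters, and an agglomerative step then merges those pre-clusters into the final groups. This is a simplified illustration under stated assumptions, not the procedure used in the study: standard two-step implementations typically build a cluster feature tree in the first step and choose the number of clusters with an information criterion, whereas this sketch uses a plain distance threshold and a fixed `k`. The sea-block values, the threshold, and the function names `precluster` and `merge_to_k` are all hypothetical.

```python
from math import dist


def precluster(points, threshold):
    """Step 1: one sequential pass. A point joins the nearest pre-cluster
    whose centroid lies within `threshold`; otherwise it starts a new one.
    Each pre-cluster is stored as [count, coordinate_sums]."""
    clusters = []
    for p in points:
        best, best_d = None, threshold
        for c in clusters:
            centroid = [s / c[0] for s in c[1]]
            d = dist(p, centroid)
            if d <= best_d:
                best, best_d = c, d
        if best is None:
            clusters.append([1, list(p)])
        else:
            best[0] += 1
            best[1] = [s + x for s, x in zip(best[1], p)]
    return clusters


def merge_to_k(clusters, k):
    """Step 2: agglomerative merging. Repeatedly merge the two pre-clusters
    with the closest centroids until k remain; return the final centroids."""
    clusters = [[n, list(s)] for n, s in clusters]  # work on a copy
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            ci = [s / clusters[i][0] for s in clusters[i][1]]
            for j in range(i + 1, len(clusters)):
                cj = [s / clusters[j][0] for s in clusters[j][1]]
                d = dist(ci, cj)
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        n_j, s_j = clusters.pop(j)  # j > i, so index i stays valid
        clusters[i][0] += n_j
        clusters[i][1] = [a + b for a, b in zip(clusters[i][1], s_j)]
    return [[s / n for s in sums] for n, sums in clusters]


# Hypothetical standardized (catch, fishing days, vessels) per sea block;
# NOT data from the paper.
blocks = [
    (0.1, 0.2, 0.1), (0.2, 0.1, 0.2), (0.15, 0.15, 0.1),
    (2.0, 2.1, 1.9), (2.1, 1.9, 2.0), (1.9, 2.0, 2.1),
    (4.0, 0.2, 0.1), (4.1, 0.1, 0.2),
]

pre = precluster(blocks, threshold=0.2)
centroids = merge_to_k(pre, k=3)
print(len(pre), "pre-clusters merged into", len(centroids), "clusters")
```

Storing each pre-cluster as a count and coordinate sums keeps the first pass cheap, since a centroid can be updated without revisiting earlier records; that is the property that makes the two-step approach attractive for larger data sets.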
