Integrated Search | Korea Science

Re-SSS: Rebalancing Imbalanced Data Using Safe Sample Screening

Shi, Hongbo;Chen, Xin;Guo, Min
- Journal of Information Processing Systems / Vol. 17, No. 1 / pp. 89-106 / 2021
Different samples can have different effects on learning support vector machine (SVM) classifiers. To rebalance an imbalanced dataset, it is reasonable to remove non-informative samples and add informative ones before training. Safe sample screening can identify some of the non-informative samples while retaining the informative ones. This study developed a resampling algorithm, Rebalancing imbalanced data using Safe Sample Screening (Re-SSS), which consists of two parts: selecting Informative Samples (Re-SSS-IS) and rebalancing via a Weighted SMOTE (Re-SSS-WSMOTE). Re-SSS-IS selects informative samples from the majority class and determines a suitable regularization parameter for the SVM, while Re-SSS-WSMOTE generates informative minority samples. Both Re-SSS-IS and Re-SSS-WSMOTE are based on safe sample screening. The experimental results show that Re-SSS can effectively improve classification performance on imbalanced classification problems.
https://doi.org/10.3745/JIPS.01.0065
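The abstract does not give the details of Re-SSS-WSMOTE, so the following is only a minimal sketch of generic SMOTE-style interpolation with weighted seed selection, not the authors' algorithm. The function name `weighted_smote` and the `weights` argument (a stand-in for some per-sample informativeness score) are assumptions made for illustration.

```python
import random

def weighted_smote(minority, weights, n_new, k=2, seed=0):
    """Create n_new synthetic minority points by SMOTE-style interpolation.

    Seed points are drawn in proportion to `weights`, a hypothetical
    informativeness score per minority sample (an assumption, not the
    weighting used in the paper).
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        # Pick a seed sample; higher weight means it is picked more often.
        i = rng.choices(range(len(minority)), weights=weights, k=1)[0]
        x = minority[i]
        # Find the k nearest other minority samples (squared Euclidean distance).
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(x, p)))
        neighbour = rng.choice(others[:k])
        # Interpolate a random fraction of the way from x towards the neighbour.
        t = rng.random()
        synthetic.append(tuple(a + t * (b - a) for a, b in zip(x, neighbour)))
    return synthetic
```

Giving a sample weight zero means it is never used as a seed, although it can still serve as a neighbour of other seeds.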

Effects of Informative Censoring in the Proportional Hazards Model

정대현;홍승만;원동유
- 한국신뢰성학회지:신뢰성응용연구 / Vol. 2, No. 2 / pp. 121-133 / 2002
This paper concerns informative censoring and some of the difficulties it creates in the analysis of survival data. When analyzing censored data, misclassifying informative censoring as random censoring is often unavoidable. It is therefore worthwhile to investigate how neglecting informative censoring affects estimation of the parameters of the proportional hazards model. The proposed model includes a primary failure, which can be censored informatively or randomly, and a follow-up failure, which may be censored randomly. Simulation shows that the loss is about 30% with regard to the confidence interval if informative censoring is neglected.
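For orientation, the standard form of the proportional hazards model and the usual meaning of non-informative (random) censoring can be written as below. The symbols are assumptions chosen for illustration, not the paper's notation: $h_0$ is the baseline hazard, $\beta$ the regression parameters, $x$ the covariates, $T$ the failure time, and $C$ the censoring time.

```latex
h(t \mid x) = h_0(t)\,\exp\!\left(\beta^{\top} x\right), \qquad
\text{non-informative censoring: } T \perp C \mid x
```

Censoring is informative when this conditional independence fails, which is the situation the abstract describes.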

Communication Effects of Print Ad Having Pictorial Typography

이광숙;곽보선
- 한국인쇄학회지 / Vol. 30, No. 2 / pp. 13-22 / 2012
This research analyzes the communication effects of print ads that use pictorial typography. Over five days, from April 22 to 26, 2012, 150 questionnaires were distributed to respondents in Daejeon City, and 148 were collected. Frequency analysis, factor analysis, and Cronbach's alpha for reliability analysis were carried out with SPSS 12.0, and regression analysis was used to test the hypotheses. The four factors 'informative, beneficial, creative, reliable' were partially significant for attitude towards print ads with pictorial typography: 'creative' and 'reliable' were insignificant, while 'informative' and 'beneficial' were significant. The variable with the greatest influence on attitude towards the advertising was 'informative.' The same four factors were also partially significant for brand attitude: 'beneficial' and 'creative' were insignificant, while 'informative' and 'reliable' were significant. The variable with the greatest influence on brand attitude was 'reliable.' Therefore, to enhance the communication effect of print ads with pictorial typography, 'informative' and 'reliable' are the most significant variables.
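The reliability statistic mentioned above, Cronbach's alpha, is the standard formula $\alpha = \frac{k}{k-1}\left(1 - \frac{\sum_j s_j^2}{s_{\text{total}}^2}\right)$ for $k$ questionnaire items. The following is a minimal sketch of that formula, assuming complete responses. The function name and data layout are illustrative assumptions, and it is not a reproduction of the SPSS procedure used in the study.

```python
from statistics import variance

def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents' item scores.

    rows: one list per respondent, each holding that respondent's score on
    every item (all lists the same length, no missing values assumed).
    """
    k = len(rows[0])
    if k < 2:
        raise ValueError("Cronbach's alpha needs at least two items")
    # Sample variance of each item across respondents.
    item_vars = [variance([r[j] for r in rows]) for j in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)
```

Items that move together perfectly give an alpha of 1, while items with zero covariance give an alpha of 0.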

Note on the estimation of informative predictor subspace and projective-resampling informative predictor subspace

유재근
- 응용통계연구 / Vol. 35, No. 5 / pp. 657-666 / 2022
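The informative predictor subspace is useful for estimating the central subspace when the assumptions required by the usual sufficient dimension reduction methods are not satisfied. Recently, Ko and Yoo (2022) used the projective-resampling approach of Li et al. (2008) to newly define, for multivariate regression, a projective-resampling informative predictor subspace rather than the informative predictor subspace. This subspace is contained in the existing informative predictor subspace but contains the central subspace. This paper proposes a method for directly estimating the informative predictor subspace in multivariate regression and compares it with the method of Ko and Yoo (2022), both theoretically and through simulation. According to the simulations, the Ko-Yoo method estimates the central subspace more accurately than the method proposed here, and it is more efficient in that its estimates vary less.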
https://doi.org/10.5351/KJAS.2022.35.5.657

Application of Principal Component Analysis Prior to Cluster Analysis in the Concept of Informative Variables

Chae, Seong-San
- Communications for Statistical Applications and Methods / Vol. 10, No. 3 / pp. 1057-1068 / 2003
Results of using principal component analysis prior to cluster analysis are compared with results from applying an agglomerative clustering algorithm alone. In some situations, using principal components prior to cluster analysis improves the retrieval ability of the agglomerative clustering algorithm. On the other hand, the loss in retrieval ability for agglomerative clustering algorithms decreases as the number of informative variables increases, where informative variables are those that carry distinct (or necessary) information compared to other variables.
https://doi.org/10.5351/CKSS.2003.10.3.1057

ON THE LEAST INFORMATIVE DISTRIBUTIONS UNDER THE RESTRICTIONS OF SMOOTHNESS

Lee, Jae-Won;Park, Sung-Wook;Nikita Vil'checvskiy;Georgiy Shevlyakov
- 대한수학회지 / Vol. 35, No. 3 / pp. 755-764 / 1998
The least informative distributions minimizing Fisher information for location are obtained in the classes of continuously differentiable and piece-wise continuously differentiable densities with the additional restrictions on their values at the median and mode of population in the point and interval forms. The structure of these optimal solutions depends both on the assumptions of smoothness and form of characterizing restrictions of the class of distributions: in the class of continuously differentiable densities, the least informative distributions are finite and have the cosine-type form, and, in the class of piece-wise continuously differentiable densities, the least informative densities have exponential-type tails, the Laplace density in particular. The dependence of optimal solutions on the assumptions of symmetry is also analyzed.

Mean estimation of small areas using penalized spline mixed-model under informative sampling

Chytrasari, Angela N.R.;Kartiko, Sri Haryatmi;Danardono, Danardono
- Communications for Statistical Applications and Methods / Vol. 27, No. 3 / pp. 349-363 / 2020
Penalized spline is a suitable nonparametric approach for estimating the mean model in small areas. However, published applications of the approach under informative sampling are uncommon. We propose a semiparametric mixed model using penalized splines under informative sampling to estimate small-area means. The response variable is explained in terms of a mean model, an informative sample effect, an area random effect, and a unit error. We approximate the mean model by a penalized spline and use a penalized spline function of the inclusion probability to account for the informative sample effect. We determine the best and unbiased estimators for the model coefficients and derive the restricted maximum likelihood estimators for the variance components. A simulation study shows a decrease in the average absolute bias produced by the proposed model. The root mean square error also decreased, except in some quadratic cases. Using linear versus quadratic penalized splines for the function of the inclusion probability makes no significant difference to the distribution of the root mean square error, except for a few smaller samples.
https://doi.org/10.29220/CSAM.2020.27.3.349

The Realities of Management in the Informative Contents Industry

김경일;이용환
- 디지털콘텐츠학회 논문지 / Vol. 8, No. 2 / pp. 157-163 / 2007
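In step with the evolution of digital technology, the informative contents industry shows, on average, a stable capital structure, strong profitability indicators, high growth rates, and high productivity, yet almost every indicator shows a gradually worsening trend. The analysis attributes this to intensifying competition, as supply is expanding faster than the market demand generated by the industry's development, and to relatively easy market entry. Sustained development of the industry requires opening new markets and developing the new technologies to support them, and for this, policy-level financial support and tax-policy support are urgently needed.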

Modelling the Informative Dropouts with QoL

이기훈
- 한국신뢰성학회지:신뢰성응용연구 / Vol. 6, No. 3 / pp. 225-237 / 2006
This paper proposes a method of modelling informative dropouts with QoL (quality of life) in survival analysis. QoL is an index that measures the health-related quality of life of a patient who has received treatment for a disease. Dropouts are prevalent in longitudinal studies. They commonly depend on the patient's QoL, for example through severe disease or death, and are then called informative dropouts. Modelling the dropout mechanism can yield more accurate inference in survival analysis. A likelihood method is proposed to estimate the survival parameter and test the patterns of dropouts.

A Study on One Factorial Longitudinal Data Analysis with Informative Drop-out

Lee, Ki-Hoon
- Journal of the Korean Data and Information Science Society / Vol. 17, No. 4 / pp. 1053-1065 / 2006
This paper proposes a method for one-way layouts of longitudinal data with informative drop-out. When dropouts are informative, that is, correlated with unobserved data and/or previously observed data, simple imputation methods such as 'last observation carried forward' (LOCF) can introduce bias into the testing models. A maximum likelihood procedure combined with a logit model for the drop-out process is proposed to test treatment effects in one-factor designs and is compared with the LOCF method in two examples.
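As a reminder of what the LOCF baseline does, here is a minimal sketch under the assumption that a subject's measurements are stored in visit order with `None` marking visits missed after drop-out. The function name `locf` and this data layout are illustrative only and do not come from the paper.

```python
def locf(values):
    """Last observation carried forward for one subject's visit sequence.

    values: measurements in visit order, with None for missing visits.
    Missing entries are replaced by the most recent observed value; leading
    missing entries stay None because there is nothing to carry forward.
    """
    filled = []
    last = None
    for v in values:
        if v is not None:
            last = v  # remember the latest observed measurement
        filled.append(last)
    return filled
```

Because every post-drop-out visit simply repeats the last observed value, any trend the subject would have followed after dropping out is lost, which is how informative drop-out can lead to bias.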

Search results: 758 items (processing time: 0.03 s)
