• Title/Summary/Keyword: Local polynomial fit


GOODNESS-OF-FIT TEST USING LOCAL MAXIMUM LIKELIHOOD POLYNOMIAL ESTIMATOR FOR SPARSE MULTINOMIAL DATA

  • Baek, Jang-Sun
    • Journal of the Korean Statistical Society / v.33 no.3 / pp.313-321 / 2004
  • We consider the problem of testing cell probabilities in sparse multinomial data. Aerts et al. (2000) presented $T = \sum_{i=1}^{k}[p_i^* - E(p_i^*)]^2$ as a test statistic with the local least squares polynomial estimator $p_i^*$, and derived its asymptotic distribution. The local least squares estimator may produce negative estimates of cell probabilities. The local maximum likelihood polynomial estimator $\hat{p}_i$, however, guarantees positive estimates and has the same asymptotic performance as the local least squares estimator (Baek and Park, 2003). When the cell probabilities differ widely in size, giving the squared difference between the estimator and the hypothesized probability the same weight in every cell, as their test statistic does, is not an adequate measure of overall goodness of fit. We instead consider a Pearson-type goodness-of-fit statistic, $T_1 = \sum_{i=1}^{k}[p_i^* - E(p_i^*)]^2/p_i$, and show that it follows an asymptotic normal distribution. We also investigate the asymptotic normality of the corresponding statistic $T_2 = \sum_{i=1}^{k}[\hat{p}_i - E(\hat{p}_i)]^2/p_i$, based on the local maximum likelihood estimator, when the minimum expected cell frequency is very small.
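
A minimal sketch, in Python, of the Pearson-type statistic described above. The local (maximum likelihood) polynomial smoothing of the cell probabilities is not reproduced; the function simply assumes that smoothed estimates, their expectations under the null, and the hypothesized cell probabilities are supplied, and all numbers in the toy usage are hypothetical.

```python
import numpy as np

def pearson_type_statistic(p_hat, p_hat_null_mean, p_null):
    """Pearson-type goodness-of-fit statistic
    T1 = sum_i (p_hat_i - E[p_hat_i])^2 / p_i,
    where p_hat holds smoothed cell-probability estimates (e.g. from a
    local polynomial fit), p_hat_null_mean their expectations under the
    null hypothesis, and p_null the hypothesized cell probabilities used
    as weights."""
    p_hat = np.asarray(p_hat, dtype=float)
    p_hat_null_mean = np.asarray(p_hat_null_mean, dtype=float)
    p_null = np.asarray(p_null, dtype=float)
    return np.sum((p_hat - p_hat_null_mean) ** 2 / p_null)

# Toy usage: raw relative frequencies stand in for smoothed estimates, and
# the null expectation is taken to equal the uniform null itself.
counts = np.array([3, 0, 1, 2, 0, 4, 1, 0, 2, 3])
p_hat = counts / counts.sum()
p_null = np.full(counts.size, 1.0 / counts.size)
print(pearson_type_statistic(p_hat, p_null, p_null))
```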

Testing of a discontinuity point in the log-variance function based on likelihood

  • Huh, Jib
    • Journal of the Korean Data and Information Science Society / v.20 no.1 / pp.1-9 / 2009
  • We consider a regression model whose variance function has a discontinuity (change) point at an unknown location. Yu and Jones (2004) proposed a local polynomial fit of the log-variance function, which frees the estimation from the positivity constraint on the variance. Using this local polynomial fit, Huh (2008) estimated the discontinuity point of the log-variance function. We propose a test for the existence of a discontinuity point in the log-variance function based on the jump size estimated as in Huh (2008). The proposed method rests on the asymptotic distribution of the estimated jump size. Numerical studies demonstrate the performance of the method.
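
The test above is built on the asymptotic distribution of the estimated jump size. The sketch below is only the generic form of that idea, assuming a jump-size estimate and its standard error are available from one-sided fits around the candidate point; it is not the paper's exact statistic or variance formula, and the numbers are hypothetical.

```python
from scipy.stats import norm

def jump_test(jump_hat, se_hat, alpha=0.05):
    """Two-sided test for the existence of a discontinuity: reject H0
    (no jump) when the standardized estimated jump size exceeds the
    normal critical value."""
    z = abs(jump_hat) / se_hat
    p_value = 2.0 * (1.0 - norm.cdf(z))
    return z, p_value, p_value < alpha

# Hypothetical estimates purely for illustration.
print(jump_test(jump_hat=0.8, se_hat=0.25))
```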


A Nonparametric Goodness-of-Fit Test for Sparse Multinomial Data

  • Baek, Jang-Sun
    • Journal of the Korean Data and Information Science Society / v.14 no.2 / pp.303-311 / 2003
  • We consider the problem of testing cell probabilities in sparse multinomial data. Aerts et al. (2000) presented $T_1=\sum\limits_{i=1}^k(\hat{p}_i-p_i)^2$ as a test statistic with the local polynomial estimator $\hat{p}_i$, and showed its asymptotic distribution. When the cell probabilities differ widely in size, giving the squared difference between the estimator and the hypothesized probability the same weight in every cell, as their test statistic does, is not an adequate measure of overall goodness of fit. We instead consider a Pearson-type goodness-of-fit statistic, $T=\sum\limits_{i=1}^k(\hat{p}_i-p_i)^2/p_i$, and show that it follows an asymptotic normal distribution.
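
To illustrate the point about cells of very different sizes, the toy comparison below (hypothetical probabilities, plain NumPy) contrasts the per-cell contributions of the unweighted statistic with those of the Pearson-weighted one: without the 1/p_i weight the sum is dominated by the absolute deviations in the large cells, while the Pearson weighting brings out proportionally large deviations in the small cells.

```python
import numpy as np

p_null = np.array([0.60, 0.30, 0.06, 0.03, 0.01])   # hypothesized cell probabilities
p_hat  = np.array([0.63, 0.27, 0.06, 0.01, 0.03])   # toy smoothed estimates

unweighted = (p_hat - p_null) ** 2            # contributions to sum (p_hat - p)^2
weighted   = (p_hat - p_null) ** 2 / p_null   # contributions to the Pearson-type sum

print("unweighted contributions:", np.round(unweighted, 5))
print("Pearson contributions:   ", np.round(weighted, 5))
```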


Imputation of Medical Data Using Subspace Condition Order Degree Polynomials

  • Silachan, Klaokanlaya;Tantatsanawong, Panjai
    • Journal of Information Processing Systems / v.10 no.3 / pp.395-411 / 2014
  • Temporal medical data are often collected during patient treatments that require personal analysis. Each observation recorded in temporal medical data is associated with measurements and treatment times. A major problem in the analysis of temporal medical data is missing values, caused, for example, by patients dropping out of a study before completion. The imputation of missing data is therefore an important pre-processing step and can provide useful information before the data are mined. For each patient and each variable, this imputation replaces a missing value with one drawn from an estimated distribution of that variable. In this paper, we propose a new method, called Newton's finite divided difference polynomial interpolation with condition order degree, for dealing with missing values in temporal medical data related to obesity. We compared the new imputation method with three existing subspace estimation techniques: the k-nearest neighbor, local least squares, and natural cubic spline approaches. The performance of each approach was then evaluated using the normalized root mean square error and tests of statistical significance. The experimental results demonstrate that the proposed method provides the best fit, with the smallest error, and is more accurate than the other methods.
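
The paper's method builds on Newton's finite divided-difference interpolation with an additional "condition order degree" rule for choosing the interpolation points, which is not reproduced here. The sketch below is plain Newton divided-difference interpolation used to impute a single missing temporal value; the measurement values are hypothetical.

```python
import numpy as np

def newton_divided_difference(x, y):
    """Divided-difference coefficients of Newton's interpolating
    polynomial through the points (x_i, y_i)."""
    x = np.asarray(x, dtype=float)
    coef = np.asarray(y, dtype=float).copy()
    n = len(x)
    for j in range(1, n):
        coef[j:] = (coef[j:] - coef[j - 1:-1]) / (x[j:] - x[:n - j])
    return coef

def newton_eval(coef, x_nodes, t):
    """Evaluate the Newton-form polynomial at t (Horner-like scheme)."""
    result = coef[-1]
    for k in range(len(coef) - 2, -1, -1):
        result = result * (t - x_nodes[k]) + coef[k]
    return result

# Impute a missing measurement at day 3 from the surrounding observations.
days    = np.array([0.0, 1.0, 2.0, 4.0, 5.0])
weights = np.array([81.2, 81.0, 80.7, 80.1, 79.8])   # hypothetical values
coef = newton_divided_difference(days, weights)
print(newton_eval(coef, days, 3.0))
```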

Estimation of the number of discontinuity points based on likelihood

  • Huh, Jib
    • Journal of the Korean Data and Information Science Society / v.21 no.1 / pp.51-59 / 2010
  • When the regression function in a generalized linear model has a discontinuity point, Huh (2009) estimated its location and jump size using a log-likelihood weighted by one-sided kernel functions. In this paper, we consider estimating the unknown number of discontinuity points in the regression function. The proposed algorithm is based on testing for the existence of a discontinuity point, using the asymptotic distribution of the estimated jump size derived in Huh (2009). The finite-sample performance is illustrated with a simulated example.
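
A rough sketch of the counting idea, under stated assumptions: apply a standardized jump-size test at each candidate location and report how many candidates are declared discontinuity points. The Bonferroni adjustment across candidates is this sketch's own choice, not necessarily the paper's, and the jump-size and standard-error values are hypothetical.

```python
import numpy as np
from scipy.stats import norm

def count_discontinuities(candidates, jump_hats, se_hats, alpha=0.05):
    """Count candidate locations whose standardized estimated jump size
    is significant (Bonferroni-adjusted across the candidates)."""
    z = np.abs(np.asarray(jump_hats, dtype=float)) / np.asarray(se_hats, dtype=float)
    crit = norm.ppf(1.0 - alpha / (2.0 * len(candidates)))
    keep = z > crit
    return int(keep.sum()), [c for c, k in zip(candidates, keep) if k]

# Hypothetical jump-size estimates at three candidate change points.
print(count_discontinuities([0.3, 0.5, 0.8], [0.05, 0.9, 0.7], [0.20, 0.25, 0.22]))
```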

Managing Deadline-constrained Bag-of-Tasks Jobs on Hybrid Clouds with Closest Deadline First Scheduling

  • Wang, Bo;Song, Ying;Sun, Yuzhong;Liu, Jun
    • KSII Transactions on Internet and Information Systems (TIIS) / v.10 no.7 / pp.2952-2971 / 2016
  • Outsourcing jobs to a public cloud is a cost-effective way to satisfy peak resource demand when the local cloud has insufficient resources. In this paper, we study the management of deadline-constrained bag-of-tasks jobs on hybrid clouds. We present a binary nonlinear programming (BNP) problem that models hybrid cloud management, minimizing the rent cost of the public cloud while completing the jobs within their respective deadlines. To solve this BNP problem in polynomial time, we propose a heuristic algorithm. The main idea is to assign the task closest to its deadline to the current core until that core cannot finish any remaining task within its deadline. When no core is available, the algorithm adds an available physical machine (PM) with the most capacity or rents a new virtual machine (VM) with the highest cost-performance ratio. Because task assignment may leave a workload imbalance between or among the cores of a PM/VM, we also propose a task reassignment algorithm to balance them. Extensive experimental results show that our heuristic algorithm saves 16.2%-76% of the rent cost and improves resource utilization by 47.3%-182.8% while satisfying deadline constraints, compared with the first-fit-decreasing algorithm, and that our task reassignment algorithm improves the makespan of tasks by up to 47.6%.
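
A toy sketch of the core assignment idea on a fixed set of identical local cores, with the PM-adding, VM-renting, and reassignment steps omitted: tasks are considered closest-deadline-first, the current core keeps taking the earliest-deadline task it can still finish in time, and tasks no core can finish on time are returned (in the full algorithm these would trigger adding a PM or renting a VM). The task list and the identical-core assumption are illustrative.

```python
def closest_deadline_first(tasks, core_count):
    """Assign (runtime, deadline) tasks to identical cores, filling the
    current core with the closest-deadline task it can still finish in
    time before moving to the next core.  Returns the per-core schedules
    and the tasks no existing core could finish by their deadlines."""
    pending = sorted(tasks, key=lambda t: t[1])     # closest deadline first
    schedules = [[] for _ in range(core_count)]
    core, finish = 0, 0.0
    while pending and core < core_count:
        # Earliest-deadline pending task this core can still finish in time.
        idx = next((i for i, (r, d) in enumerate(pending) if finish + r <= d), None)
        if idx is None:                             # current core is "full"
            core, finish = core + 1, 0.0
            continue
        runtime, deadline = pending.pop(idx)
        schedules[core].append((runtime, deadline))
        finish += runtime
    return schedules, pending                       # leftover = unplaced tasks

# Hypothetical bag of tasks as (runtime, deadline) pairs, on two cores.
schedules, overflow = closest_deadline_first(
    [(2, 4), (1, 3), (3, 10), (2, 6), (4, 7)], core_count=2)
print(schedules)   # per-core assignments
print(overflow)    # would be outsourced to a new PM/VM in the full scheme
```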

GLOBAL Hɪ PROPERTIES OF GALAXIES VIA SUPER-PROFILE ANALYSIS

  • Kim, Minsu;Oh, Se-Heon
    • Journal of The Korean Astronomical Society / v.55 no.5 / pp.149-172 / 2022
  • We present a new method for constructing the Hɪ super-profile of a galaxy based on profile decomposition analysis. The velocity profiles of an Hɪ data cube are decomposed into an optimal number of Gaussian components, aligned in velocity with respect to their centroid velocities, and co-added. This is compared with the previous approach, in which no prior profile decomposition is made for the velocity profiles being stacked. The S/N-improved super-profile is useful for deriving a galaxy's global Hɪ properties, such as velocity dispersion and mass, from observations that do not provide sufficient surface-brightness sensitivity for the galaxy. As a practical test, we apply the new method to 64 high-resolution Hɪ data cubes of nearby galaxies in the local Universe taken from THINGS and LITTLE THINGS. In addition, we construct two further Hɪ super-profiles of the sample galaxies using, respectively, only the symmetric velocity profiles and all velocity profiles of the cubes, with centroid velocities determined from Hermite h3 polynomial fitting. We find that the Hɪ super-profiles constructed with the new method have narrower cores and broader wings than the other two super-profiles, mainly because the previous methods either bias the central velocities of asymmetric velocity profiles or remove asymmetric velocity profiles altogether. We discuss how the shapes (𝜎n/𝜎b, An/Ab, and An/Atot) of the new Hɪ super-profiles, measured from a double Gaussian fit, correlate with the star formation rates of the sample galaxies, and compare them with those of the other two super-profiles.
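
A minimal illustration of the align-and-stack step and the double-Gaussian shape measurement, under simplifying assumptions: the centroid velocities are taken as known (the paper derives them from a full Gaussian profile decomposition, or from Hermite h3 fits for the comparison super-profiles), all profiles share one velocity axis, and the input profiles are synthetic.

```python
import numpy as np
from scipy.optimize import curve_fit

def stack_profiles(vel, profiles, centroids):
    """Shift each velocity profile by its centroid velocity and co-add:
    the basic align-and-stack step behind an HI super-profile."""
    stacked = np.zeros_like(vel)
    for prof, vc in zip(profiles, centroids):
        stacked += np.interp(vel, vel - vc, prof, left=0.0, right=0.0)
    return stacked

def double_gauss(v, a_n, s_n, a_b, s_b):
    """Narrow plus broad Gaussian, both centred at zero after alignment."""
    return (a_n * np.exp(-0.5 * (v / s_n) ** 2)
            + a_b * np.exp(-0.5 * (v / s_b) ** 2))

# Synthetic profiles: noisy Gaussians with random centroid offsets.
rng = np.random.default_rng(0)
vel = np.linspace(-100.0, 100.0, 201)
centroids = rng.normal(0.0, 20.0, size=50)
profiles = np.array([np.exp(-0.5 * ((vel - vc) / 8.0) ** 2)
                     + rng.normal(0.0, 0.05, vel.size) for vc in centroids])

sp = stack_profiles(vel, profiles, centroids)
popt, _ = curve_fit(double_gauss, vel, sp,
                    p0=[sp.max(), 5.0, 0.2 * sp.max(), 20.0], maxfev=10000)
a_n, s_n, a_b, s_b = popt
print("sigma_n/sigma_b =", s_n / s_b, " A_n/A_b =", a_n / a_b)
```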