Methods Analyzing Categorical Data

Award Information
Agency:
Department of Health and Human Services
Branch:
n/a
Amount:
$375,000.00
Award Year:
1997
Program:
SBIR
Phase:
Phase II
Contract:
1 R43 CA64112-1,
Award Id:
24887
Agency Tracking Number:
24887
Solicitation Year:
n/a
Solicitation Topic Code:
n/a
Solicitation Number:
n/a
Small Business Information
Cytel Software Corp. (Currently CYTEL SOFTWARE CORPORATION)
675 Massachusetts Avenue, Cambridge, MA, 02139
Hubzone Owned:
N
Minority Owned:
N
Woman Owned:
N
Duns:
n/a
Principal Investigator:
Cyrus Mehta
(617) 661-2011
Business Contact:
n/a
Research Institution:
n/a
Abstract
Binary logistic regression and its extensions to unordered polytomous response, ordered polytomous response, and Poisson response are among the most popular mathematical models for the analysis of categorical data, with widespread applicability in the biomedical sciences. The usual method of inference for such models is unconditional maximum likelihood. For large, well-balanced data sets, or for data with only a few parameters, this approach is satisfactory. However, unconditional maximum likelihood estimation can produce inconsistent point estimates, inaccurate p-values, and inaccurate confidence intervals for small or imbalanced data sets, and for data sets with a large number of parameters relative to the number of observations. Sometimes the method fails entirely because no estimates can be found that maximize the unconditional likelihood function. A methodologically sound alternative approach that has none of the above drawbacks is the exact conditional approach. Here one estimates the parameters of interest by computing the exact permutation distributions of their sufficient statistics, conditional on the observed values of the sufficient statistics for the remaining "nuisance" parameters. The major stumbling block to exact permutational inference has always been the heavy computational burden it imposes. Despite the availability of fast numerical algorithms for the exact computations, there are numerous instances where a data set is too large to be analyzed by the exact methods, yet too sparse or imbalanced for the maximum likelihood approach to be reliable. What is needed is a reliable Monte Carlo alternative to the exact conditional approach that can bridge the gap between the exact and asymptotic methods of inference. The problem is technically hard because conventional Monte Carlo methods lead to massive rejection of samples that do not satisfy the constraints of the conditional distribution. We will build a network sampling approach to the Monte Carlo problem that we believe is a major breakthrough for this difficult but important problem.
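
The conditioning idea in the abstract can be illustrated with a minimal sketch. It assumes the simplest case: binary logistic regression with an intercept (the nuisance parameter) and one covariate x, tested under the null hypothesis that the covariate's coefficient is zero. In that case the sufficient statistic for the intercept is S = sum of y, the statistic of interest is T = sum of x*y, and the conditional distribution of T given S is a permutation distribution. The function names below are hypothetical, the one-sided p-value is an illustrative choice, and this is not the network sampling algorithm proposed in the award; the last function only shows why naive Monte Carlo wastes draws that miss the constraint.

```python
from fractions import Fraction
from itertools import combinations
import random


def exact_conditional_dist(x, s):
    """Null distribution of T = sum(x_i * y_i) given S = sum(y_i) = s.

    Under a zero coefficient, every binary response vector with exactly
    s ones is equally likely, so we enumerate all of them.
    """
    counts = {}
    for ones in combinations(range(len(x)), s):
        t = sum(x[i] for i in ones)
        counts[t] = counts.get(t, 0) + 1
    total = sum(counts.values())
    return {t: Fraction(c, total) for t, c in counts.items()}


def exact_p_value(x, y):
    """One-sided exact conditional p-value P(T >= t_obs | S = s_obs)."""
    s_obs = sum(y)
    t_obs = sum(xi * yi for xi, yi in zip(x, y))
    dist = exact_conditional_dist(x, s_obs)
    return sum(p for t, p in dist.items() if t >= t_obs)


def naive_acceptance_rate(n, s, draws, rng):
    """Fraction of unconstrained Bernoulli draws that happen to satisfy S = s.

    Draws that miss the constraint are rejected; this is the waste that
    the abstract attributes to conventional Monte Carlo methods.
    """
    p = s / n
    accepted = 0
    for _ in range(draws):
        y = [1 if rng.random() < p else 0 for _ in range(n)]
        if sum(y) == s:
            accepted += 1
    return accepted / draws


if __name__ == "__main__":
    x = [0, 0, 0, 0, 1, 1, 1, 1]  # hypothetical covariate values
    y = [0, 0, 0, 1, 1, 1, 1, 1]  # hypothetical binary responses
    print("exact conditional p-value:", exact_p_value(x, y))
    print("naive acceptance rate:", naive_acceptance_rate(8, 5, 2000, random.Random(0)))
```

Full enumeration grows combinatorially with the number of observations, which is the computational burden the abstract mentions. With a single conditioning statistic the naive sampler still accepts a fair share of draws, but each additional nuisance statistic that must be matched can be expected to shrink the acceptance rate further.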

* information listed above is at the time of submission.
