*** Note: Full papers are available in Adobe Acrobat format. You must view them through the Adobe Acrobat Reader. ***
Michael V. Mannino and Yong J. Kim
This paper extends previous research on congested service facilities to general service distributions, a significant extension given that exponential distributions poorly model jobs in networked computing environments. Building on the framework first presented in Mendelson and Whang (1990), we present fundamental theorems for non-priority, nonpreemptive priority, and preemptive-resume M/G/1 queues. For non-priority and nonpreemptive M/G/1 queues, these theorems establish optimal assignment rules and incentive-compatible pricing schemes. For preemptive-resume M/G/1 queues, we prove that the total delay cost is less than that of the nonpreemptive M/G/1 queue if service times are heterogeneous with decreasing failure rates. This result supports the traditional delay-cost-to-service-time assignment rule as a good heuristic for preemptive-resume M/G/1 queues.
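The traditional assignment rule mentioned above is commonly formalized as the c/mu rule; the following sketch (an illustration, not code from the paper) orders job classes by delay cost rate per unit of expected service time:

```python
def delay_cost_priority_order(job_classes):
    """Order job classes by delay cost rate per unit of expected service time
    (the classical c/mu rule): classes with a higher ratio get higher priority.
    job_classes: list of (name, delay_cost_rate, mean_service_time) tuples."""
    return sorted(job_classes, key=lambda j: j[1] / j[2], reverse=True)

# Class "b" has the highest cost-to-service-time ratio (4.0), so it is served first.
classes = [("a", 1.0, 2.0), ("b", 4.0, 1.0), ("c", 3.0, 3.0)]
print([name for name, _, _ in delay_cost_priority_order(classes)])  # → ['b', 'c', 'a']
```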
Status: submitted to Operations Research, November 2001.
Michael V. Mannino and Yong J. Kim
Notably absent in the study of congested service facilities is the transition from a non-priority to a priority system. Building on the framework first presented in Mendelson (1985), we study this transition from the perspective of both the system manager and the system user. Using an M/G/1 priority queueing discipline with a fixed-traffic assumption, we present three fundamental theorems about the transition: (i) private costs of individual users decrease after the transition, (ii) total delay costs decrease after the transition, and (iii) total revenue decreases after the transition. If the system traffic is not fixed, we prove that the transition cannot increase the private costs of all users. Because the analytical results do not ensure a good transition, we develop an optimization model that incorporates transition costs, along with an efficient heuristic to solve the model. Simulation results demonstrate that (1) the initial post-transition solutions are typically Pareto-improving and (2) for non-Pareto-improving solutions, the heuristic quickly generates high-quality solutions that cannot easily be improved by additional search.
Status: submitted to Management Science, August 2001.
Michael V. Mannino and Vijay S. Mookerjee
We derive bounds on the probability of a goal node given a set of acquired input nodes. The bounds apply to decomposable networks, a class of Bayesian Networks encompassing causal trees and causal polytrees. The difficulty of computing the bounds depends on the characteristics of the decomposable network. For directly connected networks with binary goal nodes, tight bounds can be computed in polynomial time. For other kinds of decomposable networks, the derivation of tight bounds requires solving an integer program with a non-linear objective function, a computationally intractable problem in the worst case. We provide a relaxation technique that computes looser bounds in polynomial time for more complex decomposable networks. We briefly describe an application of the probability bounds to a record linkage problem.
Status: IEEE Transactions on Knowledge and Data Engineering, in press, September 2001.
MEAN-RISK TRADEOFFS IN INDUCTIVE EXPERT SYSTEMS
Vijay S. Mookerjee and Michael V. Mannino
Notably absent in previous research on inductive expert systems is the study of mean-risk tradeoffs. Such tradeoffs may be significant when there are asymmetries, such as unequal classification costs, and uncertainties in classification and information acquisition costs. The objective of this research is to develop models to evaluate mean-risk tradeoffs in value-based inductive approaches. We develop a combined mean-risk measure and incorporate it into the Risk Based induction algorithm. The mean-risk measure has desirable theoretical properties (consistency and separability) and is supported by empirical results on decision making under risk. Simulation results using the Risk Based algorithm demonstrate: (i) an order-of-magnitude performance difference between mean-based and risk-based algorithms and (ii) an increase in the performance difference between these algorithms with risk aversion, uncertainty, and asymmetry.
Status: Information Systems Research, 11, 2 (June 2000), 137-158.
Michael V. Mannino and Vijay S. Mookerjee
Sequential decision models are an important element of expert system optimization when the cost or time to collect inputs is significant and inputs are not known until the system operates. Many expert systems in business, engineering, and medicine have benefited from sequential decision technology. In this survey, we unify the disparate literature on sequential decision models to improve comprehensibility and accessibility. We separate formulation of sequential decision models from solution techniques. For model formulation, we classify sequential decision models by objective (cost minimization versus value maximization), knowledge source (rules, data, belief network, etc.), and optimized form (decision tree, path, input order). A wide variety of sequential decision models are discussed in this taxonomy. For solution techniques, we demonstrate how search methods and heuristics are influenced by economic objective, knowledge source, and optimized form. We discuss open research problems to stimulate additional research and development.
Status: IEEE Transactions on Knowledge and Data Engineering 9, 5 (September-October 1997), 675-687.
Vijay S. Mookerjee and Michael V. Mannino
Inductive expert systems typically operate with imperfect or noisy input attributes. We study design differences in inductive expert systems arising from implicit versus explicit handling of input noise. Most previous approaches use an implicit approach wherein inductive expert systems are constructed using input data of quality comparable to that of the problems the system will be called upon to solve. We develop an explicit algorithm (ID3ecp) that uses a clean (without input errors) training set and an explicit measure of the input noise level, and compare it to a traditional implicit algorithm, ID3p (the ID3 algorithm with the pessimistic pruning procedure). The novel feature of the explicit algorithm is that it injects noise in a controlled rather than random manner in order to reduce the performance variance due to noise. We show analytically that the implicit algorithm has the same expected partitioning behavior as the explicit algorithm. In contrast, however, the partitioning behavior of the explicit algorithm is shown to be more stable (i.e., lower variance) than that of the implicit algorithm. To extend the analysis to the predictive performance of the algorithms, a set of simulation experiments is described in which the average performance and coefficient of variation of performance of both algorithms are studied on real and artificial data sets. The experimental results confirm the analytical results and demonstrate substantial differences in stability of performance between the algorithms, especially as the noise level increases.
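The controlled-versus-random distinction can be illustrated with a minimal sketch for binary attributes. The exact injection mechanism of ID3ecp is not specified here, so this fixed-fraction scheme is an assumption for illustration only:

```python
import random

def inject_noise_random(values, noise_level, rng):
    """Random injection: flip each binary value independently with probability
    noise_level, so the realized number of flips varies between runs."""
    return [1 - v if rng.random() < noise_level else v for v in values]

def inject_noise_controlled(values, noise_level, rng):
    """Controlled injection: flip exactly round(n * noise_level) values, so the
    realized noise level has no sampling variance across runs."""
    flip = set(rng.sample(range(len(values)), round(noise_level * len(values))))
    return [1 - v if i in flip else v for i, v in enumerate(values)]
```

With a 100-element input and noise level 0.2, the controlled variant always flips exactly 20 values, while the random variant flips 20 only on average.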
Status: published in Information Systems Research 6, 4 (December 1995), 328-356.
Michael V. Mannino and Vijay S. Mookerjee
We study the sequential information acquisition problem for rule-based expert systems as follows: find the information acquisition strategy that minimizes the expected cost to operate the system while maintaining the same output decisions. This problem arises for rule-based expert systems when the cost or time to collect inputs is significant and the inputs are not known until the system operates.
We develop several "optimistic" heuristics to generate information acquisition strategies and study their properties. The heuristics provide a range of choices concerning precision (i.e., how optimistic) versus computational effort. The heuristics are embedded into an informed search algorithm (based on AO*) that produces the optimal strategy and into a greedy search algorithm. The search strategies are designed for situations in which the rules can overlap and the cost of collecting an input may depend on the set of inputs previously collected. We study the properties of these approaches and simulate their performance on a variety of synthetic expert systems. Our results indicate that two of the heuristics are very precise, leading to near-optimal results for greedy search and moderate search effort for optimal search.
Status: published in INFORMS Journal on Computing 11, 3 (Summer 1999), 278-291.
Michael V. Mannino and Murlidhar V. Koushik
We consider the inverse problem in classification systems, described as follows. Given a set of prototype cases representing a set of categories, a similarity function, and a new case classified in some category, find the cost-minimizing changes to the attribute values such that the case is reclassified as a member of a (different) preferred category. The problem is "inverse" because the usual mapping is from a case to its unknown category. The increased application of classification systems in business suggests that this inverse problem can be of significant benefit to decision makers as a form of sensitivity analysis.
Analytic approaches to this inverse problem are difficult to formulate, as the constraints are either not available or prohibitively expensive to determine. To investigate this inverse problem, we develop several genetic algorithms and study their performance as problem difficulty increases. We develop a real genetic algorithm with feasibility control, a traditional binary genetic algorithm, and a steepest-ascent hill-climbing algorithm. In a series of simulation experiments, we compare the performance of these algorithms to the optimal solution as the problem difficulty increases (more attributes and classes). In addition, we analyze certain algorithm effects (level of feasibility control, operator design, and fitness function) to determine the best approach. Our results indicate the viability of the real genetic algorithm and the importance of feasibility control as the problem difficulty increases.
Status: Decision Support Systems 29, 3 (October 2000), 283-300.
Michael V. Mannino, Sukumar Rathnam, In Jun Choi, and Veronica Tseng
We present a formal approach for modeling complex commands characterized by heavy overloading of function, large numbers of parameters, dependencies among parameters, subtle side effects, and lack of abstraction. Complex commands arise in a variety of business settings such as requesting a brokerage order, enrolling in a course, and specifying a product order. In addition, complex commands are also prevalent where specification of commands is strictly separated from multiple, independent implementations as in open software standards.
Our approach is based on an inheritance structure known as a command lattice. Like other forms of inheritance, command lattices support incremental definition and abbreviation of specifications. Because a complete command lattice can have a large number of specifications, we develop another structure known as a minimal command tree, in which a command lattice is derived from a much smaller number of independent specifications. To map from a minimal command tree to a command lattice, we present algorithms that materialize an arbitrary node of a command lattice and compactly generate the behavior of a command lattice. To demonstrate the potential of command lattices, we have implemented a set of tools that provide convenient specification and powerful reasoning capabilities. Our tool collection includes the Command Specification Language, which supports a precise and rich specification of the structural and behavioral properties of commands; the incremental definition tool, which ensures consistency of command lattices; the browsing tool, which displays a command's inheritance structure; the type checker, which ensures structural consistency of commands in expressions; and the target system tracer, which simulates a sequence of command executions. We discuss our experiences applying the tools to IBM's Distributed Data Management, a large-scale specification of data access on remote and heterogeneous IBM systems.
Status: Information Systems Research 5, 3 (September 1994), 249-274.
Vijay S. Mookerjee and Michael V. Mannino
Retrieval of a set of cases similar to a new case is a problem common to a number of machine learning approaches, such as nearest neighbor algorithms, conceptual clustering, and case-based reasoning. A limitation of most case retrieval algorithms is their lack of attention to information acquisition costs. When information acquisition costs are considered, cost reduction is hampered by the practice of separating concept formation and retrieval strategy formation.
To demonstrate the above claim, we examine two approaches. The first approach separates concept formation and retrieval strategy formation. To form a retrieval strategy in this approach, we develop the CRlc (case retrieval loss criterion) algorithm, which selects attributes in ascending order of expected loss. The second approach jointly optimizes concept formation and retrieval strategy formation using a cost-based variant of the ID3 algorithm (ID3c). ID3c builds a decision tree wherein attributes are selected using entropy reduction per unit information acquisition cost.
Experiments with four data sets are described in which algorithm, attribute cost coefficient of variation, and matching threshold are factors. The experimental results demonstrate that (i) jointly optimizing concept formation and retrieval strategy formation has substantial benefits, and (ii) using cost considerations can significantly reduce information acquisition costs, even if concept formation and retrieval strategy formation are separated.
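ID3c's selection criterion, entropy reduction per unit information acquisition cost, can be sketched as follows; the data layout and function names are illustrative assumptions, not taken from the paper:

```python
import math

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((labels.count(c) / n) * math.log2(labels.count(c) / n)
                for c in set(labels))

def gain_per_cost(rows, labels, attr, cost):
    """Information gain from splitting on attribute index `attr`, divided by
    that attribute's acquisition cost -- an ID3c-style selection score."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attr], []).append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return (entropy(labels) - remainder) / cost
```

A tree builder would evaluate this score for every remaining attribute at each node and split on the highest-scoring one, so an informative but expensive attribute can lose to a slightly less informative but cheap one.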
Status: Information Systems Research 8, 1 (March 1997), 51-68.
Sa Neung Hong and Michael V. Mannino
Abstract
A formal semantics is an essential element of language design because it supports comparison of language designs, provides implementation guidelines, and enables properties of a language to be proved. However, in the field of model management, there has been a lack of attention to the formal semantics of modeling languages. In this paper, the philosophy and formal semantics of the Unified Modeling Language (LU) are presented. The LU supports integrated modeling environments with a large number of models from diverse domains and management science paradigms. We discuss the philosophy of the LU, in which logical deduction about empirical worlds is combined with efficient computations about mathematical worlds. We argue that measurement theory, stressing homomorphic mappings from empirical to mathematical worlds, is an ideal foundation for integrated modeling environments. A denotational semantics is given for the LU, including the semantic domain, meta functions, and semantic equations. The LU is unique because measurement theory plays a salient role in its underlying denotational semantics. In particular, the semantic domain of the LU includes explicit homomorphic mappings from empirical to mathematical worlds, and the semantic equations define conditions, based on homomorphisms, that must be satisfied by valid models.
Status: Decision Support Systems, 13 (1995), 263-293.
Sa Neung Hong, Michael V. Mannino, and Betsy Greenberg
Abstract
The philosophy and features of the Unified Modeling Language (LU) are presented with emphasis on the support of large, diverse model bases. We argue that measurement theory stressing homomorphic mappings from empirical to mathematical systems is an ideal foundation for integrated modeling environments. Homomorphic mappings are an integral part of the semantics of the LU and motivate a number of language features that support large, diverse model bases. The LU provides a separate but uniform representation of domain worlds and mathematical systems known as model types. Domain worlds are composed of empirical objects, relations, and functions organized into classes and attributes, while model types are defined by standard queries and a rich collection of assumptions. To emphasize the importance of homomorphic mappings, unification constraints combine common patterns of domain and mathematical knowledge in model templates. Reusability and incremental refinement are supported by inheritance based on partial order relations for classes, attributes, assumptions, and model types. Together, the semantic foundation and language features provide a practical and formal knowledge representation for large, diverse model bases.
Status: Decision Support Systems, 10 (1993), 319-340.