Clustering is particularly important to gain access to new ideas and tacit knowledge, especially in young industries.
For example, the Gellman (1976, 1982) database identified SMEs as contributing 2.45 times more innovations per employee than do large firms.
In a clustering strategy, firms take advantage of linkages with other enterprises afforded by geographic proximity.
Data constraints on studying the extent of knowledge spillovers and their link to the geography of innovative activity can be overcome using proxies such as patenting activity and patent citations.
Therefore, comparability of the data in this table is not fully guaranteed.
and Han Zhang, 1999, Small Business in the Digital Economy: Digital Company of the Future, paper presented at the conference Understanding the Digital Economy: Data, Tools, and Research, Washington, D.C., 25-26 May 1999.
Berman, Eli, John Bound and Stephen Machin, 1997, Implications of Skill-Biased Technological Change: International Evidence, Working Paper 6166, National Bureau of Economic Research (NBER), Cambridge, MA.
Bessant, J., 1999, The Rise and Fall of Supernet:
OECD. OECD, 1999, Cluster Analysis and Cluster-based Policy in OECD Countries, Paris: OECD.
Porter, M. (1990), The Competitive Advantage of Nations, New York:
Prevezer, Martha, 1997, 'The Dynamics of Industrial Clustering in Biotechnology', Small Business Economics, 9(3), 255-271.
We conduct our analysis on a primary data set of 120 SMEs in the Cibaduyut footwear-manufacturing cluster, Indonesia.
Research Design and Data Collection
We collected the data in 2012 based on an extensive survey in this cluster,
the official database of company addresses is at best incomplete. We combed through every area in Cibaduyut
The resulting data set presents a near-complete representation of firms in this cluster.
Measures and Validation
Innovative performance. Innovation is traditionally understood to mean the introduction of new goods, the use of new materials, the development of new methods of production, the opening of new markets,
We derived multi-item variables using factor analysis, testing for their reliability and validity. We confirmed the reliability of these indicators by computing the Cronbach alpha coefficient.
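The passage does not reproduce the computation; as a minimal sketch (the 120-respondent sample, the 4-item scale and the simulated Likert responses below are illustrative assumptions, not the study's data), Cronbach's alpha can be computed directly from an item-response matrix:

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for an (n_respondents, n_items) response matrix:
    alpha = k/(k-1) * (1 - sum of item variances / variance of total score)."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)      # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)  # variance of the summed scale
    return k / (k - 1) * (1 - item_vars.sum() / total_var)

# Hypothetical 5-point Likert responses for a 4-item scale, 120 respondents
rng = np.random.default_rng(0)
base = rng.integers(1, 6, size=(120, 1))
items = np.clip(base + rng.integers(-1, 2, size=(120, 4)), 1, 5)
print(f"Cronbach's alpha: {cronbach_alpha(items):.2f}")  # ~0.7+ is acceptable
```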
Diagnostic tests indicated that the data are close to normal. We employed a hierarchical regression analysis, with alternative models with and without interaction terms.
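A hedged sketch of such a hierarchical regression with statsmodels follows; all variable names (clustering, absorptive_capacity, firm_size) are illustrative stand-ins, not the study's actual measures:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical firm-level data; all variable names are illustrative stand-ins
rng = np.random.default_rng(1)
df = pd.DataFrame({
    "innovation": rng.normal(size=120),
    "clustering": rng.normal(size=120),
    "absorptive_capacity": rng.normal(size=120),
    "firm_size": rng.normal(size=120),  # control variable
})

# Step 1: controls and main effects only
base = smf.ols("innovation ~ firm_size + clustering + absorptive_capacity", df).fit()
# Step 2: add the interaction; '*' expands to main effects plus their product
full = smf.ols("innovation ~ firm_size + clustering * absorptive_capacity", df).fit()
print(f"R2 base = {base.rsquared:.3f}, R2 with interaction = {full.rsquared:.3f}")
```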
Our analysis, based on primary data collected through interviews and questionnaires, provided mixed support for our hypotheses.
First, the data we used, although original and derived from field research, are cross-sectional. This has prevented us from examining the effect of changes in firm behavior over time on innovative performance.
Collecting longitudinal data in the emerging-economy context is particularly challenging given the lack of government-level initiatives to this end.
Investigation of clustering impact on SMEs' innovation in Indonesia. Paper presented at the 2nd International Conference on International Business (ICIB), University of Macedonia.
Firm clustering and innovation: Determinants and effects. Papers in Regional Science, 80, 337-356. Pérez-Luño, A., Wiklund, J.,
Clustering and Industrialization: Introduction. World Development, 27(9), 1503-1514. Schoales, J. 2006. Alpha Clusters:
Innovation and Clustering in the Globalised International Economy. Urban Studies, 41(5/6), 1095-1112.
It will provide a brief overview of the CNIPMMR study, pointing out data about Romanian SMEs' innovation activities and the use of information technology in such enterprises.
2010).
2. Data used
The data used for this article were collected and compiled by CNIPMMR (Consiliul National al Întreprinderilor Private Mici si Mijlocii din România, the National Council of Small and Medium Sized Private Enterprises),
the data regarding the nature of innovation activities in SMEs show that most innovation efforts were concentrated on the creation of new products (37.21% in 2012);
Own adaptation based on CNIPMMR data (2011-2013).
3.2 Innovation investments
In terms of the share of investments allocated to innovation out of total enterprise investments, almost half of SMEs (44.93%)
processes in the Romanian SMEs' vision are data security (48.44%), fast access to enterprise data from anywhere and at any time (38.21%) and regulatory compliance (35.84%).
[Chart residue; recoverable benefit categories: data security; access to company data anytime, anywhere; regulatory compliance; ease of team-working; better internal control; removal of redundant data insertion; ease of use due to Romanian-language interface; free and quick solution upgrades; monthly subscription fee for usage; better performance through internal business processes; detailed reports of departments' activities.]
data security, data access from anywhere and at any time, regulatory compliance, team-working possibilities and better internal control.
whether their "clustering" has fostered a more collaborative culture of learning and knowledge exchange. While in technology parks there is a relatively high level of collaboration with universities
Contribution of research organisations: Build (RETA and OTRI) an integrated database of faculty research and consulting skills to match the existing survey of innovation needs of small firms.
OECD Regional Database. The socioeconomic context: Andalusia is the southernmost region of peninsular Spain and has traditionally lagged behind the rest of the country on most economic variables.
Guidelines for Collecting and Interpreting Innovation Data, 3rd edition. Paris: OECD. Available at www.oecd.org/sti/oslomanual. Osterman, P. 1999.
Only a small part of these relationships is captured in the formal data gathered by the university technology transfer offices.
They also note that policymakers who rely on the formal data collected by university technology transfer offices see at best the "tip of the iceberg" in terms of the true dimensions of university-industry collaboration that exist (Ramos-Vielba et al.
Create a database of faculty's skills and match it with the innovative needs of local firms.
The detailed surveys of both university research teams and innovative firms conducted by the team at IESA-CSIC reveal that there are already a substantial number of university researchers
One mechanism to accomplish this might take the form of creating an integrated database of faculty research
The IESA-CSIC surveys might even provide the preliminary basis for constructing such a database
Once they have begun to use the database to link researchers up with firms in need of their expertise
as RETA could use the process of building both the database of expert skills in the universities
Based on the 2006 Global Entrepreneurship Monitor (GEM) data, the density of enterprises was approaching the national level.
the most recent data available indicate that the larger firm sectors, including medium-sized firms,
accounting for around 35 percent of total exports according to the most recent data, with 19 percent of this total represented by unprocessed agricultural products and 16 percent by processed food and drink products (Junta de Andalucía, 2007).
RETA expresses the more widely held belief that clustering of high-technology firms, described as Andalusia's "closeness" model, is the most effective means of offering support to fast-growing and technologically dynamic SMEs.
whether their "clustering" has fostered a more collaborative culture of learning and knowledge exchange. A recent study that explored the type
For example, the policy of encouraging clustering of SMES in technology parks and industrial estates is informed by recent research that highlights the importance of encouraging proximity between firms in the pursuit of innovation.
encouraging the physical clustering and co-presence of firms (as we have shown) is not, in itself,
In this respect, RETA could play an important role by building up, together with OTRI, a faculty skills database that could be matched with the existing dataset of "innovative needs" of Andalusian firms.
the underlying rationale being that clustering of technology-intensive firms enhances their growth and expansion.
whether "clustering" has fostered a more collaborative culture of learning and knowledge exchange. Indeed, few firms appear to develop collaborations with other firms co-located in the same park.
and could indeed be matched with another database collecting the skills of university faculty members so as to ease knowledge transfer between HEIs and firms, including those of small size.
and sectoral mix of cluster strategies.
Business clustering has brought significant advantages for smaller firms, especially because of knowledge spillovers from one firm to another or from institutions to firms.
Contribution of research organisations: Build (RETA and OTRI) an integrated database of faculty research and consulting skills to match the existing survey of innovation needs of small firms.
a contingency approach; 3. Research Design; 3.1. Sample and data collection; 3.2. Techniques for controlling Common Method Biases;
direct and indirect causal effects; 2.4. Size as a moderator term; 3. Research Design; 3.1. Sample and data collection;
from export activities toward innovativeness; 3. Research Design; 3.1. Database; 3.2. Variables; 4. Analysis and Results
I would also like to thank Dr. Yancy Vaillant for his contribution with the GEM database.
Data from the GEM database; ordinal regression and logit regressions. Key findings: there is a positive effect of EO on SME profitability;
impact on firm profitability.
3. Research Design
3.1. Sample and data collection
The companies included in this study were selected based upon three criteria:
The data were collected in two distinct stages. First, we used a questionnaire adapted from the model used in different studies (e.g.
the use of personal information collected at the same level of authority within each organization reduces the variability of the data (Nasrallah and Qawasmeh, 2009).
Firms that did not respond to the initial request for data were contacted a second time via telephone one month after the initial contact,
for which complete accounting data were available for the investigated years. The survey was carried out in the winter of 2009.
The second step of data collection was performed through companies' publications and annual reports, making annual updates to the database of firms
which answered the questionnaire. The financial-statement data were obtained from the SABI database for 2007-2009. Finally, to ensure the absence of bias in the data,
we evaluated non-response bias (a sample of 121 firms which did not respond to the questionnaire
was compared with reference to ROA and number of employees). The results revealed no significant differences between the two groups
(p > .10) in terms of age, number of employees, (footnote: the Iberian System of Balance Analysis, SABI, is an online database with detailed financial information about Spanish and Portuguese companies)
a single factor will emerge from the factor analysis or the majority of the covariance will be concentrated in one of the factors (Podsakoff et al.,
We applied exploratory factor analysis to assess dimensionality and validity. A KMO statistic of 0.94 and Bartlett's sphericity test (p < .01) support the validity of applying factor analysis and allow us to check
we carried out a confirmatory factor analysis highlighting the existence of a multidimensional construct (see the path diagram for this construct as well as,
and few studies have used longitudinal data to analyze the phenomenon. Concerning the EO-firm growth relationship,
Zahra and Covin (1995) collected data from three different samples over a seven-year period to assess the longitudinal impact of EO on growth revenue.
For example, Wiklund (1999), using data from Swedish small firms, has shown that there is a positive relationship between EO and performance (reflecting growth and financial performance),
Using data from Norway, Madsen (2007) also concluded that the sustained and increased EO level was associated positively with high performance (employment growth
Yamada and Eshima (2009), using longitudinal (two years) data from 300 small technology-based Japanese firms,
data collection, control of response bias and common method biases are repeated.
3.1. Sample and data collection
To test the relationship between EO, network resources and firm growth,
data were collected from a sample of Spanish SMEs. Survey: all companies included in this study
develop manufacturing activities, can be classified as SMEs, and have been active in the business for at least the last five years.
The data were collected in two distinct stages. First, we applied a questionnaire which has been adapted and designed to collect the necessary information,
the use of personal information collected with the same level of authority within each organization reduces the variability of the data (Nasrallah and Qawasmeh,
Firms that did not respond to the initial request for data were contacted a second time via telephone one month after the initial contact,
for which data were available in the investigated years. The survey was carried out in the winter of 2009.
The second step of data collection was performed through companies' publications and annual reports, making annual updates to the database of firms
which answered the questionnaire. The financial-statement data were obtained from the SABI 2007-2009 database. To ensure the absence of bias in the data,
we evaluated non-response bias (a sample of 121 firms which did not respond to the questionnaire
was compared with reference to ROA and number of employees). The results revealed no significant difference between the two groups.
We applied exploratory factor analysis to assess dimensionality and validity. A KMO statistic of 0.94 and Bartlett's sphericity test (p < 0.01) support the validity of applying factor analysis and allow us to check
we carried out a confirmatory factor analysis highlighting the existence of a multidimensional construct (see the path diagram for this construct,
we carried out an exploratory factor analysis to verify whether we could treat the information as a single construct.
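As a sketch of this workflow (using the factor_analyzer package; the item names and simulated data are illustrative assumptions, not the study's variables), the KMO and Bartlett diagnostics and an exploratory extraction might look like:

```python
import numpy as np
import pandas as pd
from factor_analyzer import FactorAnalyzer
from factor_analyzer.factor_analyzer import (calculate_bartlett_sphericity,
                                             calculate_kmo)

# Hypothetical indicator matrix driven by a single latent dimension
rng = np.random.default_rng(2)
latent = rng.normal(size=(250, 1))
X = pd.DataFrame(latent + 0.5 * rng.normal(size=(250, 6)),
                 columns=[f"item{i}" for i in range(1, 7)])

chi2, p = calculate_bartlett_sphericity(X)  # H0: items are uncorrelated
kmo_items, kmo_overall = calculate_kmo(X)   # sampling adequacy, want >= 0.6
print(f"Bartlett chi2 = {chi2:.1f} (p = {p:.3g}), overall KMO = {kmo_overall:.2f}")

fa = FactorAnalyzer(n_factors=1, rotation=None)  # unrotated single-factor EFA
fa.fit(X)
print(np.round(fa.loadings_, 2))                 # loading of each item
```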
SEM can be understood as a combination of confirmatory factor analysis (CFA) and multiple regression (Schreiber et al. 2006).
The Chi-square statistic measures the distance between the original data matrix and the matrix estimated by the model,
Moreover, the GFI (0.869) and the adjusted GFI (0.818) indicate how well our data fit the proposed theoretical model.
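As an illustration of the CFA-plus-regression combination and the fit indices just discussed, the following is a minimal sketch with the semopy package; the single latent EO factor, its three indicators and the simulated data are illustrative assumptions, not the thesis's model:

```python
import numpy as np
import pandas as pd
import semopy

# Hypothetical data: one latent EO factor with three noisy indicators,
# plus a growth outcome that depends on the latent factor.
rng = np.random.default_rng(3)
eo = rng.normal(size=300)
df = pd.DataFrame({f"eo{i}": eo + rng.normal(scale=0.7, size=300) for i in (1, 2, 3)})
df["growth"] = 0.5 * eo + rng.normal(size=300)

# Measurement model (the CFA part) plus a structural regression
desc = "EO =~ eo1 + eo2 + eo3\ngrowth ~ EO"
model = semopy.Model(desc)
model.fit(df)
print(semopy.calc_stats(model).T)  # chi-square, CFI, GFI, AGFI, RMSEA, etc.
```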
the first objective in using these SME data was to test whether network usage affects EO development in these Spanish firms.
and describes the main data sources; Section 4 presents the estimation results, and Section 5 discusses them.
For example, Caldera (2010), using data compiled from the Encuesta sobre Estrategias Empresariales (ESEE) in Spain,
In turn, using Spanish manufacturing data, López Rodríguez and García Rodríguez (2005), stated that product innovations,
Export propensity positively affects the firm's innovativeness.
3. Research design
3.1. Database
The sample used in this essay was taken from the Spanish Global Entrepreneurship Monitor (GEM) by considering the adult population survey for the years 2007
and it provides the required fundamental knowledge by assembling relevant harmonized data on an annual basis (See Reynolds et al.,
2005). This database contains various entrepreneurial measures that are constructed on a survey basis. In our research,
Considering the available information in the GEM database, we used it in two different steps:
by using data from two years, we have provided some evidence from cross-sectional analyses of 2007 and 2008.
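A sketch of such a cross-sectional logit specification with statsmodels follows; the GEM-style variable names and simulated data are illustrative assumptions:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical GEM-style cross-section; variable names are illustrative
rng = np.random.default_rng(4)
df = pd.DataFrame({"export_propensity": rng.integers(0, 2, 2000),
                   "firm_age": rng.integers(1, 40, 2000)})
lin = -1.0 + 0.8 * df["export_propensity"] - 0.01 * df["firm_age"]
df["innovativeness"] = (rng.random(2000) < 1 / (1 + np.exp(-lin))).astype(int)

# Cross-sectional logit: does export propensity raise the odds of innovating?
model = smf.logit("innovativeness ~ export_propensity + firm_age", df).fit()
print(np.exp(model.params))  # odds ratios; > 1 means a positive effect
```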
The ordinal and logit regression models do not present problems of multicollinearity; in fact, the correlation between innovation in products or services and technological innovation was expected,
Future research could examine, using panel data, the prediction that a firm's innovativeness enhances its probability of exporting
Evidence from exogenous innovation impulses and obstacles using German micro data. Oxford Economic Papers, 58, 317-350. Lages, L. F., Silva, G. and Styles, C. 2009.
data collection design and implementation 1998-2003. Small Business Economics, 24, 205-231. Rhee, J., Park, T. and Lee, D. H. 2010.
Reporting structural equation modeling and confirmatory factor analysis results: a review. The Journal of Educational Research, 99, 323-337.
evidence from GEM data. Small Business Economics, 24, 335-350. Yamada, K. and Eshima, Y. 2009.
Entrepreneurship Theory and Practice, 35, 293-317.
APPENDIX
Appendix 1. Confirmatory factor analysis, EO construct: model fit, recommended level versus CFA level (CFI, ...)
The 2015 Gartner CIO Survey gathered data from 2,810 CIO respondents in 84 countries and all major industries, representing approximately $12.1 trillion in revenue/public-sector budgets
For this report, we analyzed this data and supplemented it with interviews of 11 CIOs
Survey coverage: 2,810 CIO respondents from 84 countries, representing $12.1 trillion in revenue/budgets and $397 billion in IT spending.
social and big data are already central to business thinking, and the next set of digital technologies, trends, opportunities and threats is creating yet another competitive frontier.
and capabilities. Digital business success requires starting with a digital information and technology mindset, and working backward. Measurement is short-term and input-centric,
businesses need forward-looking predictive analytics combined with data-led experimentation (see figure below). Information and technology flip 3:
From: backward-looking reporting; passive analysis of data; structured information; separate analytics. To: forward-looking predictive analytics; active experimentation informed by data; new types of information,
3.1.1 Principal Components Analysis; 3.1.2 Factor analysis; 3.1.3 Cronbach Coefficient Alpha
3.2 Grouping information on countries; 3.2.1 Cluster analysis; 3.2.2 Factorial k-means analysis; 3.3 Conclusions
4. Imputation of missing data; 4.1 Single imputation; 4.1.1 Unconditional mean imputation; 4.1.2 Regression imputation; 4.1.3 Expected maximization imputation; 4.2 Multiple imputation
5. Normalisation of data
6.1.1 Principal components analysis and factor analysis; 6.1.2 Data envelopment analysis and Benefit of the doubt (Benefit of the doubt approach); 6.1.3 Regression approach
7. Uncertainty and sensitivity analysis; 7.1 Set up of the analysis; 7.1.1 Output variables of interest; 7.1.2 General framework for the analysis; 7.1.3 Inclusion/exclusion of individual sub-indicators; 7.1.4 Data quality; 7.1.5 Normalisation; 7.1.6 Uncertainty analysis; 7.1.7 Sensitivity analysis using variance-based techniques
We deal with the problem of missing data and with the techniques used to bring indicators of very different natures into a common unit.
and aggregating indicators into a composite and test the robustness of the composite using uncertainty and sensitivity analysis.
whereby a lot of work in data collection and editing is wasted or hidden behind a single number of dubious significance.
or imprecise assessment, and use uncertainty and sensitivity analysis to gain useful insights during the process of building composite indicators, including a contribution to defining the indicators' quality and an appraisal of the reliability of countries' rankings.
multivariate analysis, imputation of missing data and normalization techniques aim at supplying a sound and defensible dataset.
and sensitivity analysis to increase transparency and make policy inference more defensible. Section 8 shows how different visualization strategies of the same composite indicator can convey different policy messages.
Factor analysis and Reliability/Item Analysis (e.g. the Cronbach coefficient alpha) can be used to group the information on the indicators.
Cluster analysis can be applied to group the information on constituencies (e.g. countries) in terms of their similarity with respect to the different sub-indicators.
(d) a method for selecting groups of countries to impute missing data with a view to decreasing the variance of the imputed values.
Clearly the ability of a composite to represent multidimensional concepts largely depends on the quality and accuracy of its components.
Missing data are present in almost all composite indicators and they can be missing either in a random or in a nonrandom fashion.
There is often no basis to judge whether data are missing at random or systematically, whilst most of the methods of imputation require a missing-at-random mechanism.
Three generic approaches for dealing with missing data can be distinguished, i.e. case deletion, single imputation or multiple imputation.
The other two approaches see the missing data as part of the analysis and therefore try to impute values through either Single Imputation (e.g.
and the use of 'expensive to collect' data that would otherwise be discarded. In the words of Dempster and Rubin (1983):
because it can lull the user into the pleasurable state of believing that the data are complete after all,
and imputed data have substantial bias. Whenever indicators in a dataset are incommensurate with each other,
The normalization method should take into account the data properties and the objectives of the composite indicator.
whether hard or soft data are available, whether exceptional behaviour needs to be rewarded/penalised, whether information on absolute levels matters,
partially, to correct for data quality problems in such extreme cases. The functional transformation is applied to the raw data to represent the significance of marginal changes in its level.
Different weights may be assigned to indicators to reflect their economic significance (collection costs, coverage, reliability and economic reason), statistical adequacy, cyclical conformity, speed of available data, etc.
such as weighting schemes based on statistical models (e.g. factor analysis, data envelopment analysis, unobserved components models), or on participatory methods (e.g. budget allocation, analytic hierarchy processes).
Weights may also reflect the statistical quality of the data, thus higher weight could be assigned to statistically reliable data (data with low percentages of missing values, large coverage, sound values).
In this case the concern is to reward only sub-indicators that are easy to measure and readily available, punishing the information that is more problematic to identify and measure.
Uncertainty analysis and sensitivity analysis is a powerful combination of techniques to gain useful insights during the process of composite indicators building,
selection of data, data quality, data editing (e.g. imputation), data normalisation, weighting scheme, weights' values and aggregation method.
A combination of uncertainty and sensitivity analysis can help to gauge the robustness of the composite indicator
Sensitivity analysis (SA) studies how much each individual source of uncertainty contributes to the output variance. In the field of building composite indicators, UA is adopted more often than SA (Jamison and Sandbu, 2001;
The composite indicator is no longer a magic number corresponding to crisp data treatment, weighting set or aggregation method,
The iterative use of uncertainty and sensitivity analysis during the development of a composite indicator can contribute to its well-structuring
along with the data, the weights and the documentation of the methodology. Given that composite indicators can be decomposed
or disaggregated so as to introduce alternative data, weighting and normalisation approaches, the components of composites should be available electronically so as to allow users to change variables, weights,
etc. and to replicate sensitivity tests.
2.1 Requirements for quality control
As mentioned above, the concept of quality of composite indicators is not only a function of the quality of the underlying data (in terms of relevance, accuracy, credibility, etc.)
Table 2.1. The Pedigree Matrix for Statistical Information (columns: Grade; Definitions & Standards; Data-collection & Analysis; Institutional Culture; Review. The Grade 4 row reads: Negotiation; Task-force; Dialogue; ...)
Factor analysis and Reliability/Item Analysis can be used complementarily to explore whether the different dimensions of the phenomenon are balanced well-from a statistical viewpoint-in the composite indicator.
The use of cluster analysis to group countries in terms of similarity between different sub-indicators can serve as:
(h) a method for selecting groups of countries to impute missing data with a view to decreasing the variance of the imputed values.
Cluster analysis could, thereafter, be useful in different sections of this document. The notation that we will adopt throughout this document is the following: $x_{qc}^t$ denotes the value of sub-indicator $q$ for country $c$ at time $t$.
say P<Q principal components that preserve a high amount of the cumulative variance of the original data.
because it means that the principal components are measuring different statistical dimensions in the data.
When the objective of the analysis is to present a huge data set using a few variables, then in applying PCA there is the hope that some degree of economy can be achieved.
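A minimal sketch of this economy-of-description idea with scikit-learn, retaining the smallest number P of components that preserves a chosen share of the variance (the simulated data and the 80% threshold are illustrative assumptions):

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Illustrative sketch: 23 "countries" by Q = 8 standardised sub-indicators
rng = np.random.default_rng(5)
X = StandardScaler().fit_transform(rng.normal(size=(23, 8)))

pca = PCA().fit(X)
cumvar = np.cumsum(pca.explained_variance_ratio_)
P = int(np.searchsorted(cumvar, 0.80)) + 1  # smallest P preserving 80% of variance
print(f"retain P = {P} of Q = 8 components, cumulative variance = {cumvar[P - 1]:.2f}")
```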
Bootstrap refers to the process of randomly resampling the original data set to generate new data sets.
whether the TAI data set for the 23 countries can be viewed as a "random" sample of the entire population, as required by the bootstrap procedures (Efron 1987;
Several points can be made regarding the issues of randomness and representativeness of the data. First, it is often difficult to obtain complete information for a data set in the social sciences because, unlike the natural sciences,
controlled experiments are not always possible, as in the case here. As Efron and Tibshirani (1993) state:
A third point on data quality is that a certain amount of measurement error is likely to exist.
While such measurement error can only be controlled at the data collection stage rather than at the analytical stage, it is argued that the data represent the best estimates currently available (United Nations, 2001, p. 46).
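As a sketch of the bootstrap procedure applied to PCA eigenvalues, resampling countries with replacement (the data dimensions mimic the 23-country TAI example; the values are simulated):

```python
import numpy as np

# Sketch of bootstrapped PCA eigenvalues; shapes mimic the 23-country TAI set
rng = np.random.default_rng(6)
X = rng.normal(size=(23, 8))
X = (X - X.mean(axis=0)) / X.std(axis=0)

def eigenvalues(data: np.ndarray) -> np.ndarray:
    """Eigenvalues of the correlation matrix, sorted in decreasing order."""
    return np.sort(np.linalg.eigvalsh(np.corrcoef(data, rowvar=False)))[::-1]

# Resample countries (rows) with replacement and recompute the eigenvalues
boot = np.array([eigenvalues(X[rng.integers(0, len(X), size=len(X))])
                 for _ in range(1000)])
lo, hi = np.percentile(boot, [5, 95], axis=0)  # 90% band per eigenvalue
print(np.round(eigenvalues(X), 2))             # deterministic eigenvalues
print(np.round(lo, 2), np.round(hi, 2), sep="\n")
```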
Figure 3. 1 (right) demonstrates graphically the relationship between the eigenvalues from the deterministic PCA,
and how the interpretation of the components might be improved are addressed without further ado in the following section on factor analysis.
3.1.2 Factor analysis
Factor analysis (FA) has similar aims to PCA.
Principal components factor analysis is the variant most often preferred in the development of composite indicators (see Section 6), e.g.
it is unlikely that they share common factors. 2. Identify the number of factors that are necessary to represent the data
Assumptions in Principal Components Analysis and Factor Analysis: 1. A sufficient number of cases. The question of how many cases (or countries) are necessary to do PCA/FA has no scientific answer
Although social scientists may be attracted to factor analysis as a way of exploring data whose structure is unknown,
which variables are associated most with the outlier cases. 4. Assumption of interval data. Kim and Mueller (1978b
pp. 74-75) note that ordinal data may be used if it is thought that the assignment of ordinal categories to the data does not seriously distort the underlying metric scaling.
Likewise, these authors allow the use of dichotomous data if the underlying metric correlations between the variables are thought to be moderate (.7) or lower. The result of using ordinal data is that the factors may be much harder to interpret.
Note that categorical variables with similar splits will necessarily tend to correlate with each other, regardless of their content (see Gorsuch, 1983).
Principal components factor analysis (PFA), which is the most common variant of FA, is a linear procedure.
the more important it is to screen data for linearity. 6. Multivariate normality of data is required for related significance tests.
Note, however, that a variant of factor analysis, maximum likelihood factor analysis, does assume multivariate normality. The smaller the sample size, the more important it is to screen data for normality.
Moreover, as factor analysis is based on correlation (or sometimes covariance), both correlation and covariance will be attenuated when variables come from different underlying distributions (e.g., a normal vs. a bimodal variable will correlate less than 1.0 even when both series are perfectly co-ordered).
7. Underlying dimensions shared by clusters of sub-indicators are assumed. If this assumption is not met, the "garbage in, garbage out" problem arises:
Factor analysis cannot create valid dimensions (factors) if none exist in the input data. In such cases, factors generated by the factor analysis algorithm will not be comprehensible.
Likewise, the inclusion of multiple definitionally-similar sub-indicators representing essentially the same data will lead to tautological results. 8. Strong intercorrelations are not required mathematically,
but applying factor analysis to a correlation matrix with only low intercorrelations will require for solution nearly as many factors as there are original variables,
thereby defeating the data reduction purposes of factor analysis. On the other hand, too high inter-correlations may indicate a multi-collinearity problem
and collinear terms should be combined or otherwise eliminated prior to factor analysis. (a) The Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy is a statistic for comparing the magnitudes of the observed correlation coefficients to the magnitudes of the partial correlation coefficients.
The concept is that the partial correlations should not be very large if one is to expect distinct factors to emerge from factor analysis (see Hutcheson and Sofroniou, 1999, p. 224).
A KMO statistic is computed for each individual sub-indicator, and their sum is the KMO overall statistic.
KMO varies from 0 to 1.0. The overall KMO should be 0.60 or higher to proceed with factor analysis (Kaiser and Rice, 1974),
though realistically it should exceed 0. 80 if the results of the principal components analysis are to be reliable.
If not, it is recommended to drop the sub-indicators with the lowest individual KMO statistic values,
but a common cut-off criterion for suggesting that there is a multicollinearity problem. Some researchers use the more lenient cutoff VIF value of 5.0. (c) Bartlett's test of sphericity is used to test the null hypothesis that the sub-indicators in a correlation matrix are uncorrelated.
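A sketch of the VIF check with statsmodels follows (the data are simulated; one column is built to be nearly collinear so that the diagnostic fires):

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

# Simulated sub-indicators; 'e' is built to be nearly collinear with 'a'
rng = np.random.default_rng(7)
df = pd.DataFrame(rng.normal(size=(100, 4)), columns=list("abcd"))
df["e"] = 0.9 * df["a"] + rng.normal(scale=0.3, size=100)

X = sm.add_constant(df)
for i, col in enumerate(X.columns):
    if col != "const":
        vif = variance_inflation_factor(X.values, i)
        print(f"VIF({col}) = {vif:.1f}")  # flag values above 4 (or the lenient 5)
```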
Disadvantages: sensitive to modifications in the basic data (data revisions and updates, e.g. new countries); sensitive to the presence of outliers, which may introduce a spurious variability in the data; sensitive to small-sample problems, which are particularly relevant when the focus is limited to a set of countries; minimises the contribution of sub-indicators which do not move with the other sub-indicators.
Note also, that the factor analysis in the previous section had indicated ENROLMENT as the sub-indicator that shares the least amount of common variance with the other sub-indicators.
Although both factor analysis and the Cronbach coefficient alpha are based on correlations among sub-indicators, their conceptual framework is different. Table 3.6. Cronbach coefficient alpha results for the 23 countries after deleting one sub-indicator (standardised values) at a time. (Column: deleted sub-indicator.)
2004); Success of software process implementation.
3.2 Grouping information on countries
3.2.1 Cluster analysis
Cluster analysis (CLA) is the name given to a collection of algorithms used to classify objects
Cluster analysis has been applied in a wide variety of research problems, from medicine and psychiatry to archeology.
cluster analysis is of great utility. CLA techniques can be hierarchical (for example, tree clustering),
or nonhierarchical when the number of clusters is decided ex ante (for example the k-means clustering).
To do so, the clustering techniques attempt to ensure that members have more in common with their own group than with other groups, through minimization of internal variation while maximizing variation between groups.
(useful if the data are categorical in nature). Having decided how to measure similarity (the distance measure),
the next step is to choose the clustering algorithm, i.e. the rules which govern how distances are measured between clusters.
and hence different classifications may be obtained for the same data, even using the same distance measure.
Single linkage (nearest neighbour): the distance between two clusters is determined by the distance between the two closest elements in the different clusters.
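A sketch of hierarchical (tree) clustering with SciPy using single linkage, cutting the tree into a fixed number of groups (the data are simulated to mimic the 23-country setting):

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

# Illustrative tree clustering of 23 "countries" on 8 standardised indicators
rng = np.random.default_rng(8)
X = rng.normal(size=(23, 8))

Z = linkage(X, method="single", metric="euclidean")  # nearest-neighbour linkage
labels = fcluster(Z, t=10, criterion="maxclust")     # cut the tree into 10 groups
print(labels)
# Z[:, 2] holds the fusion distances; a jump in that column suggests where to cut
```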
which indicates that the data are best represented by ten clusters: Finland alone; Sweden and the USA; the group of countries located between the Netherlands and Hungary; then, each alone, Canada, Singapore, Australia, New Zealand, Korea, Norway, Japan.
Figure 3.2. Country clusters for the sub-indicators of technology achievement (standardised data).
Figure 3.3. Linkage distance versus fusion step in the hierarchical clustering for the technology achievement example.
A nonhierarchical method of clustering, different from the joining (or tree) clustering shown above, is k-means clustering (Hartigan, 1975). This method is useful when the aim is to divide the sample into k clusters of greatest possible distinction.
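A sketch of such a k-means split with scikit-learn (the data are simulated and k = 4 is an assumed choice, not the case study's result):

```python
import numpy as np
from sklearn.cluster import KMeans

# Illustrative k-means split of 23 "countries"; k = 4 is an assumed choice
rng = np.random.default_rng(9)
X = rng.normal(size=(23, 8))

km = KMeans(n_clusters=4, n_init=10, random_state=0).fit(X)
for g in range(4):
    print(f"group {g + 1}:", np.where(km.labels_ == g)[0])  # members per cluster
```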
Table 3.8. K-means clustering for the 23 countries in the technology achievement case study (columns: Group 1, leaders; Group 2, potential leaders; Group 3, dynamic adopters;
k-means clustering (standardised data). Finally, expectation maximization (EM) clustering extends the simple k-means clustering in two ways:
so as to maximize the overall likelihood of the data, given the final clusters (Binder, 1981). 2. Unlike k-means,
EM can be applied both to continuous and categorical data. Ordinary significance tests are not valid for testing differences between clusters.
Principal component analysis or Factor analysis) that summarize the common information in the data set by detecting non-observable dimensions.
On the other hand, the relationships within a set of objects (e.g. countries) are often explored by fitting discrete classification models such as partitions, n-trees or hierarchies, via nonparametric clustering techniques.
or when it is believed that some of these do not contribute much to identifying the clustering structure in the data set,
frequently carrying out a PCA and then applying a clustering algorithm on the object scores on the first few components.
because PCA or FA may identify dimensions that do not necessarily contribute much to perceiving the clustering structure in the data and that,
Various alternative methods combining cluster analysis and the search for a low-dimensional representation have been proposed, and focus on multidimensional scaling or unfolding analysis (e g.,
A method that combines k-means cluster analysis with aspects of factor analysis and PCA is presented by Vichi and Kiers (2001).
A discrete clustering model together with a continuous factorial one are fitted simultaneously to two-way data,
the data reduction and synthesis, simultaneously in the direction of objects and variables. Originally applied to short-term macroeconomic data,
factorial k-means analysis has a fast alternating least-squares algorithm that extends its application to large data sets.
The methodology can therefore be recommended as an alternative to the widely used tandem analysis.
3.3 Conclusions
Application of multivariate statistics,
including Factor analysis, Coefficient Cronbach Alpha, Cluster analysis, is something of an art, and it is certainly not as objective as most statistical methods.
Available software packages (e.g. STATISTICA, SAS, SPSS) allow for different variations of these techniques. The different variations of each technique can be expected to give somewhat different results
then it must take its place as one of the important steps during the development of composite indicators.
4. Imputation of missing data
Missing data are present in almost all the case studies of composite indicators.
Data can be missing either in a random or in a nonrandom fashion. They can be missing at random because of malfunctioning equipment, weather issues, lack of personnel,
but there is no particular reason to consider that the collected data are substantially different from the data that could not be collected.
On the other hand, data are often missing in a nonrandom fashion. For example, if studying school performance as a function of social interactions in the home, it is reasonable to expect that data from students in particular types of home environments would be more likely to be missing than data from students in other types of environments.
More formally, the missing patterns could be: MCAR (Missing Completely At Random): missing values do not depend on the variable of interest or any other observed variable in the data set.
For example the missing values in variable income would be of MCAR type if (i) people who do not report their income have on average,
but they are conditional on some other variables in the data set. For example the missing values in income would be MAR
if the probability of missing data on income depends on marital status but, within each category of marital status,
One of the problems with missing data is that there is no statistical test for NMAR and often no basis upon which to judge whether data are missing at random or systematically, whilst most of the methods that impute (i.e. fill in) missing values require an MCAR or at least an MAR mechanism.
Three generic approaches for dealing with missing data can be distinguished, i.e. case deletion, single imputation or multiple imputation.
The other two approaches see the missing data as part of the analysis and therefore try to impute values through either Single Imputation (e.g.
and the use of 'expensive to collect' data that would otherwise be discarded. The main disadvantage of imputation is that it can allow data to influence the type of imputation.
In the words of Dempster and Rubin (1983): the idea of imputation is both seductive and dangerous,
because it can lull the user into the pleasurable state of believing that the data are complete after all,
and imputed data have substantial bias. The uncertainty in the imputed data should be reflected by variance estimates.
This allows taking into account the effects of imputation in the course of the analysis. However
The literature on the analysis of missing data is extensive and in rapid development. Therefore
The predictive distribution must be created by employing the observed data. There are, in general, two approaches to generate this predictive distribution:
the danger of this type of modelling of missing data is that the resulting data set may be treated as complete
fill in blank cells with individual data drawn from similar responding units; e.g. missing values for individual income may be replaced with the income of another respondent with similar characteristics (age, sex, race, place of residence, family relationships, job, etc.).
and the time to converge depends on the proportion of missing data and the flatness of the likelihood function.
Another common method (called imputing means within adjustment cells) is to classify the data for the sub-indicator with some missing values in classes
thus the inference based on the entire dataset (including the imputed data) does not fully account for imputation uncertainty.
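A sketch of imputing means within adjustment cells with pandas (the class variable and the values are illustrative assumptions):

```python
import numpy as np
import pandas as pd

# Missing values receive the mean of their class (an assumed grouping column),
# not the global mean, reducing bias when missingness varies across classes.
df = pd.DataFrame({"income_group": ["low", "low", "high", "high", "high"],
                   "indicator": [2.0, np.nan, 5.0, 6.0, np.nan]})

df["imputed"] = df.groupby("income_group")["indicator"].transform(
    lambda s: s.fillna(s.mean()))
print(df)  # the 'low' gap gets 2.0, the 'high' gap gets 5.5
```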
For nominal variables, frequency statistics such as the mode, or hot- and cold-deck imputation methods, might be more appropriate.
4.1.3 Expected maximization imputation
Suppose that X denotes the data.
In likelihood-based estimation the data are assumed to be generated by a model described by a probability function.
The probability function captures the relationship between the data set and the parameter $\theta$ of the data model.
If the observed variables are dummies for a categorical variable, then the predictions (4.2) are respondent means within classes defined by the variable.
While the data set is known, it makes sense to reverse the argument and look for the probability of observing a certain $\theta$ given the data set X:
this is the likelihood function. Therefore, given X, the likelihood function $L(\theta|X)$ is any function of $\theta$ proportional to $f(X|\theta)$.
Assuming that missing data are MAR or MCAR, the EM algorithm consists of two components: the expectation (E) and maximization (M) steps.
just as if there were no missing data (thus missing values are replaced by estimated values, i.e. initial conditions in the first round of maximization).
In the E step the missing data are estimated by their expectations given the observed data and current estimated parameter values.
which, for complex patterns of incomplete data, can be a very complicated function of $\theta$. As a result these algorithms often require algebraic manipulations and complex programming.
but careful computation is needed. (For NMAR mechanisms one needs to make assumptions on the missing-data mechanism.) In the M step, the parameters in $\theta$ are re-estimated using maximum likelihood applied to the observed data augmented by the estimates of the unobserved data (coming from the previous round).
Effectively, this process maximizes, in each cycle, the expectation of the complete data log likelihood.
The advantage of the EM is its broadness (it can be used for a broad range of problems, e.g. variance component estimation or factor analysis),
To test this, different initial starting values can be used.
4.2 Multiple imputation
Multiple imputation (MI) is a general approach that does not require a specification of a parametrized likelihood for all data.
The idea of MI is depicted in Figure 4.1. The imputation of missing data is performed with a random process that reflects uncertainty.
Figure 4.1. The logic of multiple imputation: a data set with missing values yields N completed sets (Set 1, Set 2, ..., Set N), each analysed separately (Result 1, ..., Result N), after which the results are combined.
It assumes that data are drawn from a multivariate Normal distribution and requires MAR or MCAR assumptions.
The theory of MCMC is most easily understood using Bayesian methodology (see Figure 4.2). Let us denote the observed data as $X_{obs}$ and the complete dataset as $X = (X_{obs}, X_{mis})$,
we shall estimate it from the data, yielding $\hat{\theta}$, and use the distribution $f(X_{mis} | X_{obs}, \hat{\theta})$.
and covariance matrix from the data that do not have missing values, and use them to estimate the prior distribution. Imputation step: simulate values for the missing data items by randomly selecting a value from the available distribution of values. Posterior step: re-estimate the mean vector and covariance matrix, and iterate until the distribution is stationary (i.e. the mean vector and covariance matrix are unchanged as we iterate); then use the imputation from the final iteration to form a data set without missing values. (Figure 4.2. Functioning of MCMC.)
whose distribution depends on the data. So the first step for its estimation is to obtain the posterior distribution of $\theta$ from the data.
Usually this posterior is approximated by a normal distribution. After formulating the posterior distribution of $\theta$, the following imputation algorithm can be used:
(Footnote: the missing data generating process may depend on additional parameters $\phi$; if $\phi$ and $\theta$ are independent, the process is called ignorable and the analyst can concentrate on modelling the missing data given the observed data and $\theta$. If they are not independent, we have a non-ignorable missing data generating process, which cannot be solved adequately without making assumptions on the functional form of the interdependency. Rearranged from K. Chantala and C. Suchindran,
http://www.cpc.unc.edu/services/computer/presentations/mi presentation2.pdf.) Use the completed data X and the model to estimate the parameter of interest (e.g. the mean) $\beta$
In conclusion, Multiple Imputation method imputes several values (N) for each missing value (from the predictive distribution of the missing data),
The N versions of completed data sets are analyzed by standard complete-data methods and the results are combined using simple rules to yield single combined estimates (e.g.,
p-values, that formally incorporate missing data uncertainty. The pooling of the results of the analyses performed on the multiply imputed data sets,
implies that the resulting point estimates are averaged over the N completed sample points, and the resulting standard errors and p-values are adjusted according to the variance of the corresponding N completed sample point estimates.
Thus, the 'between imputation variance' provides a measure of the extra inferential uncertainty due to missing data.
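A sketch of this combining step (Rubin's rules) for N completed data sets; the estimates and variances below are made-up numbers:

```python
import numpy as np

# Rubin's combining rules for N = 5 completed data sets (made-up numbers)
est = np.array([10.1, 9.8, 10.4, 10.0, 9.9])    # point estimate per data set
var = np.array([0.20, 0.22, 0.19, 0.21, 0.20])  # its sampling variance per set
N = len(est)

pooled = est.mean()          # combined point estimate
W = var.mean()               # average within-imputation variance
B = est.var(ddof=1)          # between-imputation variance
T = W + (1 + 1 / N) * B      # total variance, inflated for missingness
share = (1 + 1 / N) * B / T  # extra uncertainty attributable to imputation
print(f"estimate = {pooled:.2f}, s.e. = {np.sqrt(T):.3f}, missing-data share = {share:.1%}")
```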
5. Normalisation of data
The indicators selected for aggregation convey at this stage quantitative information of different kinds.
and their robustness to possible outliers in the data. Different normalization methods will supply different results for the composite indicator.
Another transformation often used to reduce the skewness of (positive) data varying across many orders of magnitude is the logarithmic transformation: $I_{qc}^t = \log(x_{qc}^t)$.
yet s/he has to beware that the normalized data will surely be affected by the log transformation.
Therefore, data have to be processed via specific treatment. An example is offered in the Environmental Sustainability Index
when data for a new time point become available. This implies an adjustment of the analysis period T,
In such cases, to maintain comparability between the existing and the new data, the composite indicator would have to be recalculated for the existing data.
5.2.4 Distance to a reference country
This method takes the ratio of the indicator $x_{qc}^t$ for a generic country $c$ at time $t$ to the sub-indicator $x_{q\bar{c}}^{t_0}$ for the reference country $\bar{c}$ at the initial time $t_0$: $I_{qc}^t = x_{qc}^t / x_{q\bar{c}}^{t_0}$.
if there is little variation within the original scores, the percentile banding forces the categorization on the data, irrespective of the distribution of the underlying data.
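A sketch of several of the normalisation methods discussed in this section, applied to one illustrative sub-indicator (the values and the choice of reference country are assumptions):

```python
import numpy as np
import pandas as pd

# One raw sub-indicator for five "countries" (illustrative values)
x = pd.Series([3.2, 5.1, 4.4, 9.7, 6.0], index=list("ABCDE"))

methods = pd.DataFrame({
    "rank":     x.rank(ascending=False),              # keeps ordinal info only
    "z-score":  (x - x.mean()) / x.std(ddof=1),       # mean 0, std 1
    "rescaled": (x - x.min()) / (x.max() - x.min()),  # min-max onto [0, 1]
    "dist_ref": x / x["A"],                           # ratio to reference country A
    "log10":    np.log10(x),                          # compresses wide ranges
})
print(methods.round(2))
```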
2003), an OECD report describing the construction of summary indicators from a large OECD database of economic and administrative product market regulations and employment protection legislation.
The summary indicators are obtained by means of factor analysis, in which each component of the regulatory framework is weighted according to its contribution to the overall variance in the data.
Data have been gathered mainly from Member countries' responses to the OECD Regulatory Indicators Questionnaire, which include both qualitative and quantitative information.
Qualitative information is coded by assigning a numerical value to each of its possible modalities (e g. ranging from a negative to an affirmative answer)
while the quantitative information (such as data on ownership shares or notice periods for individual dismissals) is subdivided into classes.
Examples of the above transformations are shown in Table 5.6 using the TAI data. The data are sensitive to the choice of the transformation
and this might cause problems in terms of loss of the 52 interval level of the information, sensitivity to outliers, arbitrary choice of categorical scores and sensitivity to weighting.
Table 5.6. Normalisation techniques using the TAI data (sub-indicator: mean years of school, age 15 and above; columns: rank, z-score, rescaling, distance to reference country, log10, percentile, categorical).
coverage, reliability and economic reason), statistical adequacy, cyclical conformity, speed of available data, etc. In this section a number of techniques are presented, ranging from weighting schemes based on statistical models (such as factor analysis, data envelopment analysis, unobserved components models)
to participatory methods (e.g. budget allocation or analytic hierarchy processes). Weights usually have an important impact on the value of the composite
Weights may also reflect the statistical quality of the data, thus higher weight could be assigned to statistically reliable data (data with low percentages of missing values, large coverage, sound values).
In this case the concern is to reward only base indicators that are easy to measure and readily available, punishing the information that is more problematic to identify and measure.
6.1.1 Principal component analysis and factor analysis
Principal component analysis (PCA) and more specifically factor analysis (FA) (Section 3) group together sub-indicators that are collinear to form a composite indicator capable of capturing as much of the common information of those sub-indicators as possible.
but it is rather based on the statistical dimensions of the data. According to PCA/FA, weighting only intervenes to correct for the overlapping information of two or more correlated indicators,
Methodology: The first step in FA is to check the correlation structure of the data: if the correlation between the indicators is low, then it is unlikely that they share common factors.
smaller than the number of sub-indicators, representing the data. Summarizing briefly what has been explained in Section 3,
For a factor analysis only a subset of principal components are retained (let's say m), the ones that account for the largest amount of the variance.
Rotation is a standard step in factor analysis; it changes the factor loadings and hence the interpretation of the factors, leaving unchanged the analytical solutions obtained ex-ante and ex-post the rotation.
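A simplified sketch of loading-based weights follows; it skips the rotation step and simply weights each sub-indicator by its largest squared loading, rescaled to sum to one (the data are simulated, and this is a compressed stand-in for the full procedure, not a faithful reproduction of it):

```python
import numpy as np
from sklearn.decomposition import PCA

# Simulated standardised data: 23 "countries" by 8 sub-indicators
rng = np.random.default_rng(10)
X = rng.normal(size=(23, 8))
X = (X - X.mean(axis=0)) / X.std(axis=0)

pca = PCA(n_components=2).fit(X)
# Loadings: correlations between sub-indicators and the retained components
loadings = pca.components_.T * np.sqrt(pca.explained_variance_)
raw = (loadings ** 2).max(axis=1)  # each indicator's best squared loading
weights = raw / raw.sum()          # rescale so the weights sum to one
print(np.round(weights, 3))
```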
Disadvantages: sensitive to modifications of the basic data (data revisions and updates, e.g. new observations and new countries, may change the set of weights, i.e. the estimated loadings, used in the composite); sensitive to the presence of outliers, which may introduce spurious variability in the data; sensitive to small-sample problems and data shortage, which may make the statistical identification or the economic interpretation difficult (in general a ratio of data to unknown parameters of 3:1 is required for a stable solution); minimizes the contribution of indicators which do not move with other indicators; sensitive to the factor extraction and rotation methods.
Examples of use: Indicators of product market regulation (Nicoletti et al., OECD, 2000); Internal Market Index (EC-DG MARKT, 2001b); Business Climate Indicator (EC-DG ECFIN, 2000); General Indicator of S&T (NISTEP, 1995); Success of software process improvement (Emam et al., 1998). (To preserve comparability, final weights could be rescaled to sum up to one.)
6.1.2 Data envelopment analysis and Benefit of the doubt
Data envelopment analysis (DEA) employs linear programming tools (popular in Operations Research) to retrieve an efficiency frontier
Figure 6.1. Performance frontier determined with Data Envelopment Analysis (axes: Indicator 1 and Indicator 2; points a, b, c, d, d'). Rearranged from Mahlberg and Obersteiner (2001).
In the extreme case of perfect collinearity among regressors the model will not even be identified. It is argued further that
It requires a large amount of data to produce estimates with known statistical properties. Examples of use: Composite Economic Sentiment Indicator (ESIN), http://europa.eu.int/comm/economy_finance; National Innovation Capacity Index (Porter and Stern, 1999
The observed data consist of a cluster of $q = 1, \dots, Q(c)$ indicators, each measuring an aspect of $ph(c)$. Let $c = 1, \dots$
since it would imply separating the correlation due to the collinearity of indicators from the correlation of error terms
However, since not all countries have data on all sub-indicators, the denominator of $w(c,q)$
The likelihood function of the observed data is maximized with respect to the unknown parameters $\alpha(q)$'s, $\beta(q)$'s,
Reliability and robustness of results depend on the availability of enough data. With highly correlated sub-indicators there could be identification problems.
AHP allows for the application of data, experience, insight, and intuition in a logical and thorough way within a hierarchy as a whole.
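A sketch of the eigenvector computation at the heart of AHP: the pairwise comparison matrix below is an illustrative assumption, the weights are the normalised principal eigenvector, and the consistency index serves as a plausibility check:

```python
import numpy as np

# Assumed pairwise comparison matrix: A[i, j] = how strongly indicator i
# is preferred to indicator j (reciprocal by construction).
A = np.array([[1.0, 3.0, 5.0],
              [1/3, 1.0, 2.0],
              [1/5, 1/2, 1.0]])

eigvals, eigvecs = np.linalg.eig(A)
k = np.argmax(eigvals.real)
w = np.abs(eigvecs[:, k].real)
w /= w.sum()                          # weights: normalised principal eigenvector

n = A.shape[0]
CI = (eigvals.real[k] - n) / (n - 1)  # consistency index; small is consistent
print(f"weights = {np.round(w, 3)}, CI = {CI:.3f}")
```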
Analytic Hierarchy Process. Advantages: the method can be used for both qualitative and quantitative data; the method increases the transparency of the composite. Disadvantages: the method requires a high number of pairwise comparisons
Although this methodology uses statistical analysis to treat data, it operates with people (experts, politicians, citizens) who are asked to choose which set of sub-indicators they prefer,
The role of the variability in the weights and their influence on the value of the composite will be the object of the section on sensitivity analysis (Section 7). Table 6.6. Weights for the sub-indicators obtained using 4 different methods: equal weighting (EW), factor analysis (FA), budget allocation (BAL), and analytic hierarchy process (AHP). (Columns: Patents, Royalties, Internet, Tech exports, Telephones, Electricity, Schooling, University; the EW row begins 0.13, 0.13, 0.13, ...)
(or geometric) aggregations, or non-linear aggregations like multi-criteria or cluster analysis (the latter is explained in Section 3). This section reviews the most significant ones.
6.2.1 Additive methods
The simplest
(Data are not normalized; normalization does not change the result of the multicriteria method whenever it does not change the ordinal information of the data matrix.)
$$S_{jk} = \sum_{q=1}^{Q} \left( w_q(Pr_{jk}) + \tfrac{1}{2}\, w_q(In_{jk}) \right) \qquad (6.15)$$
where $w_q(Pr_{jk})$ and $w_q(In_{jk})$ are the weights of sub-indicators presenting a preference and an indifference relation respectively.
only if data are all expressed on a partially comparable interval scale (i.e. temperature in Celsius or Fahrenheit) of type $x \mapsto \alpha_i + \beta x$ (with $\alpha_i$ varying across sub-indicators).
Non-comparable data measured on a ratio scale (i.e. kilograms and pounds), $x \mapsto \alpha_i x$ where $\alpha_i > 0$ ($\alpha_i$ varying across sub-indicators), can only be aggregated meaningfully by using geometric functions.
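A small numeric sketch of why the choice matters: with a linear (additive) rule a weak dimension is fully compensated by strong ones, while a geometric rule penalises it (the values are illustrative):

```python
import numpy as np

# Three normalised sub-indicators, one very weak (illustrative values)
I = np.array([0.9, 0.9, 0.1])
w = np.array([1 / 3, 1 / 3, 1 / 3])  # equal weights summing to one

linear = np.sum(w * I)        # additive rule: the weak dimension is compensated
geometric = np.prod(I ** w)   # geometric rule: the weak dimension is penalised
print(f"linear = {linear:.2f}, geometric = {geometric:.2f}")  # 0.63 vs 0.43
```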
and factor analysis is usually employed as a supplementary method with a view to examining thoroughly the relationships among the sub-indicators.
because it lets the data decide on the weighting issue, and it is sensitive to national priorities.
compatibility between aggregation and weighting methods. Compensability of aggregations is widely studied in fuzzy set theory; for example Zimmermann and Zysno (1983) use the geometric operator
and not importance coefficients.
7. Uncertainty and sensitivity analysis
The reader will recall from the introduction that composite indicators may send misleading,
A combination of uncertainty and sensitivity analysis can help to gauge the robustness of the composite indicator,
i. selection of sub-indicators, ii. data quality, iii. data editing, iv. data normalisation, v. weighting scheme, vi. weights' values, vii. composite
Uncertainty Analysis (UA) and Sensitivity Analysis (SA). UA focuses on how uncertainty in the input factors propagates through the structure of the composite indicator
i. inclusion/exclusion of sub-indicators, ii. modelling of data error, e.g. based on available information on variance estimation, iii. alternative editing schemes,
e.g. multiple imputation, described in Section 4, iv. using alternative data normalisation schemes, such as rescaling, standardisation,
Also modelling of the data error, point (ii) above, will not be included as in the case of TAI no standard error estimate is available for the sub-indicators.
$Rank(CI_c)$ will be an output of interest studied in our uncertainty and sensitivity analysis. Additionally, the average shift in countries' ranks will be explored.
and sensitivity analysis (both in the first and second TAI analysis), targeting the questions raised in the introduction on the quality of the composite indicator.
the relative sub-indicator q will be almost neglected for that run.
7.1.4 Data quality
This is not considered here, as discussed above.
7.1.5 Normalisation
As described in Section II-5, several methods are available
The first input factor $X_1$ triggers the editing scheme: 1 = use bivariate correlation to impute missing data; 2 = assign zero to the missing datum. The second input factor $X_2$ is the trigger to select the normalisation
We anticipate here that a scatter-plot based sensitivity analysis will allow us to track which indicator when excluded affects the output the most.
(either for the BAL or AHP schemes) are assigned to the data. Clearly the selection of the expert has no bearing
such as the variance and higher-order moments, can be estimated with an arbitrary level of precision that is related to the size of the simulation N.
7.1.7 Sensitivity analysis using variance-based techniques
A necessary step
when designing a sensitivity analysis is to identify the output variables of interest. Ideally these should be relevant to the issue tackled by the model,
In the following, we shall apply sensitivity analysis to the output variables $Rank(CI_c)$ and $\bar{R}_S$, for their bearing on the quality assessment of our composite indicator.
2000a; EPA, 2004), robust, model-free techniques for sensitivity analysis should be used for non-linear models.
Variance-based techniques for sensitivity analysis are model-free and display additional properties convenient for the present analysis:
and to explain; they allow for a sensitivity analysis whereby uncertain input factors are treated in groups instead of individually; they can be justified in terms of rigorous settings for sensitivity analysis,
as we shall discuss later in this section. How do we compute a variance-based sensitivity measure for a given input factor $X_i$?
The usefulness of $S_i$ and $S_{Ti}$, also for the case of non-independent input factors, is linked to their interpretation in terms of settings for sensitivity analysis.
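As a concrete answer to the question just posed, here is a minimal Monte Carlo sketch (our illustration, using the common estimators associated with the Saltelli (2002) sampling scheme, not the reference implementation) of the first-order index $S_i$ and total effect index $S_{Ti}$ for a toy model standing in for the composite indicator:

```python
import numpy as np

def f(X):                                  # hypothetical toy model
    return X[:, 0] + 2.0 * X[:, 1] * X[:, 2]

rng = np.random.default_rng(2)
n, k = 100_000, 3
A = rng.uniform(size=(n, k))               # two independent sample matrices
B = rng.uniform(size=(n, k))
yA, yB = f(A), f(B)
var = np.var(np.concatenate([yA, yB]))     # unconditional variance V(Y)

for i in range(k):
    ABi = A.copy()
    ABi[:, i] = B[:, i]                    # A with column i taken from B
    yABi = f(ABi)
    Si  = np.mean(yB * (yABi - yA)) / var          # first-order: V(E(Y|Xi))/V(Y)
    STi = 0.5 * np.mean((yA - yABi) ** 2) / var    # total effect (Jansen estimator)
    print(f"X{i+1}: S_i = {Si:.3f}, S_Ti = {STi:.3f}")
```

For this toy model the interaction between $X_2$ and $X_3$ shows up as $S_{Ti} > S_i$ for those two factors, while $S_{T1} \approx S_1$; the gap between the two indices is the standard diagnostic for non-additive behaviour of the kind reported for TAI below.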
i.e. by censoring all countries with missing data. As a result, only 34 countries could in theory be analysed.
as this is the first country with missing data, and it was preferred to analyse the set of countries whose rank was not altered by the omission of missing records.
we show in Figure 7.2 a sensitivity analysis based on the first-order indices calculated using the method of Sobol' (1993) in its improved version due to Saltelli (2002).
Figure 7.2. Sensitivity analysis results based on the first-order indices: contributions of expert selection, weighting scheme, aggregation system, exclusion/inclusion of sub-indicators and normalisation (plus a non-additive residual) to the variance of country rank.
Figure 7.3. Sensitivity analysis results based on the total effect indices, for the same input factors.
The sensitivity analysis results for the average shift in ranking output variable (Equation 7.2) are shown in Table 7.2. Interactions are now between expert selection and weighting,
there is not much hope that a robust index will emerge, not even with the best provision of uncertainty and sensitivity analysis.
Data retrieved on 4 October 2004. A number of lines are usually superimposed in the same chart to allow comparisons between countries.
an assessment of progress can be made by comparing the latest data with the position at a number of baselines.
moving in a direction away from meeting the objective; insufficient or no comparable data.

8.5 Rankings

A quick and easy way to display country performance is to use rankings.
However, its graphical features can be helpful for presentational purposes. www.nationmaster.com is a massive central data source on the internet with a handy way to graphically compare nations.
NationMaster is a vast compilation of data from such sources as the CIA World Factbook, United Nations, World Health Organization, World Bank, World Resources Institute, UNESCO,
Data selection. The quality of composite indicators also depends on the quality of the underlying indicators.
Imputation of missing data. The idea of imputation is both seductive and dangerous. Several imputation methods are available,
and the use of 'expensive to collect' data that would otherwise be discarded. The main disadvantage of imputation is that the results are affected by the imputation algorithm used.
Robustness and sensitivity. The iterative use of uncertainty and sensitivity analysis during the development of a composite indicator can contribute to its sound structuring.
Uncertainty and sensitivity analysis are the suggested tools for coping with uncertainty and ambiguity in a more transparent and defensible fashion.
The Hague. 2. Anderberg, M. R. (1973), Cluster Analysis for Applications, New York: Academic Press, Inc. 3. Arrow, K. J. (1963), Social Choice and Individual Values, 2nd edition, New York: Wiley. 4. Arrow, K. J.,
Binder, D. A. (1978), "Bayesian Cluster Analysis," Biometrika, 65, 31-38. 7. Boscarino, J. A., Figley, C. R.,
Principal components analysis and exploratory and confirmatory factor analysis. In Grimm and Yarnold, Reading and understanding multivariate analysis.
In Sensitivity Analysis (eds A. Saltelli, K. Chan, M. Scott), pp. 167-197. New York: John Wiley & Sons. 12.
and Seiford, L. M. (1995), Data Envelopment Analysis: Theory, Methodology and Applications. Boston: Kluwer. 13.
Cherchye, L. (2001), Using data envelopment analysis to assess macroeconomic policy performance, Applied Economics, 33, 407-416. 14.
Dempster, A. P. and Rubin, D. B. (1983), Introduction, pp. 3-10, in Incomplete Data in Sample Surveys (vol. 2:
Everitt, B. S. (1979), "Unresolved Problems in Cluster Analysis," Biometrics, 35, 169-181. 37. Fabrigar, L. R., Wegener, D. T., MacCallum, R. C.,
Evaluating the use of exploratory factor analysis in psychological research. Psychological Methods, 4: 272-299. 38. Fagerberg, J. (2001), Europe at the crossroads:
Funtowicz, S. O., Munda, G. and Paruccini, M. (1990), The aggregation of environmental data using multicriteria methods, Environmetrics, Vol. 1 (4), pp. 353-36. 43.
Factor analysis. Hillsdale, NJ: Lawrence Erlbaum. Orig. ed. 1974. 47. Gough, C., Castells, N. and Funtowicz, S. (1998), Integrated Assessment:
Hartigan, J. A. (1975), Clustering Algorithms, New York: John Wiley & Sons, Inc. 53. Harvey, A. (1989), Forecasting, Structural Time Series Models and the Kalman Filter.
A step-by-step approach to using the SAS system for factor analysis and structural equation modeling. Cary, NC:
An Empirical Analysis Based on Survey Data for Swiss Manufacturing, Research Policy, 25, 633-45. 58. Hollenstein, H. (2003),
A Cluster Analysis Based on Firm-level Data, Research Policy, 32 (5), 845-863. 59. Homma, T. and Saltelli, A. (1996), Importance measures in global sensitivity analysis of model output,
Reliability Engineering and System Safety, 52 (1), 1-17. 60. Hutcheson, G. and Sofroniou, N. (1999),
Introduction to factor analysis: What it is and how to do it. Thousand Oaks, CA: Sage Publications, Quantitative Applications in the Social Sciences Series, No. 13. 63.
Factor analysis: Statistical methods and practical issues. Thousand Oaks, CA: Sage Publications, Quantitative Applications in the Social Sciences Series, No. 14. 64.
Covers confirmatory factor analysis using SEM techniques. See esp. Ch. 7. 77. Koedijk, K. and Kremers, J. (1996), Market opening, regulation and growth in Europe, Economic Policy (0) 23, October. 78.
Factor analysis as a statistical method. London: Butterworth and Co. 81. Levine, M. S. (1977), Canonical analysis and factor comparison.
and Schenker, N. (1994), Missing Data, in Handbook for Statistical Modeling in the Social and Behavioral Sciences (G. Arminger, C. C. Clogg,
Little, R. J. A. (1997), Biostatistical Analysis with Missing Data, in Encyclopedia of Biostatistics (P. Armitage and T. Colton, eds.
Little, R. J. A. and Rubin, D. B. (2002), Statistical Analysis with Missing Data, Wiley Interscience, J. Wiley & Sons, Hoboken, New Jersey. 85.
Mahlberg, B. and Obersteiner, M. (2001), Re-measuring the HDI by Data Envelopment Analysis, Interim Report IR-01-069, International Institute for Applied Systems Analysis, Laxenburg.
Massart, D. L. and Kaufman, L. (1983), The Interpretation of Analytical Chemical Data by the Use of Cluster Analysis, New York:
Milligan, G. W. and Cooper, M. C. (1985), "An Examination of Procedures for Determining the Number of Clusters in a Data Set," Psychometrika, 50, 159-179. 93.
Making sense of factor analysis: The use of factor analysis for instrument development in health care research. Thousand Oaks, CA:
Sage Publications. 109. Pré Consultants (2000), The Eco-indicator 99: A damage-oriented method for life cycle impact assessment. http://www.pre.nl/eco-indicator99/ei99-reports.htm 110.
Saltelli, A., Chan, K. and Scott, M. (2000a), Sensitivity Analysis, Probability and Statistics Series, New York: John Wiley & Sons. 123.
Saltelli, A., Tarantola, S. and Campolongo, F. (2000b), Sensitivity analysis as an ingredient of modelling. Statistical Science, 15, 377-395. 125.
Saltelli, A., Tarantola, S., Campolongo, F. and Ratto, M. (2004), Sensitivity Analysis in Practice: A Guide to Assessing Scientific Models.
Software for sensitivity analysis is available at http://www.jrc.cec.eu.int/uasa/prj-sa-soft.asp. 126.
Sobol', I. M. (1993), Sensitivity analysis for nonlinear mathematical models. Mathematical Modelling & Computational Experiment 1, 407-414. 130.
Spath, H. (1980), Cluster Analysis Algorithms, Chichester, England: Ellis Horwood. 132. SPRG (2001), Report of the Scientific Peer Review Group on Health Systems Performance Assessment,
Tarantola, S., Jesinghaus, J. and Puolamaa, M. (2000), Global sensitivity analysis: a quality assurance tool in environmental policy modelling.
In Sensitivity Analysis (eds A. Saltelli, K. Chan, M. Scott), pp. 385-397. New York: John Wiley & Sons. 136.
and Kiers, H. (2001), Factorial k-means analysis for two-way data, Computational Statistics and Data Analysis, 37 (1), 49-64. 142.
Cited with regard to preference for PFA over PCA in confirmatory factor analysis in SEM. 144. World Economic Forum (2002), Environmental Sustainability Index, http://www.ciesin.org/indicators/ESI/index.html. 145.
Zimmermann, H. J. and Zysno, P. (1983), Decisions and evaluations by hierarchical aggregation of information, Fuzzy Sets and Systems, 10, pp. 243-260.

APPENDIX

TAI is made of a relatively small
However, the original data set contains a large number of missing values, mainly due to missing data in Patents and Royalties.