Maria Angela Ferrario1, Zoltán Bajmócy2,3, Will Simm1, Stephen Forshaw1. 1 Lancaster University, School of Computing and Communications, Lancaster, UK. 2 University of Szeged, Faculty of Economics, Research Centre, Szeged, Hungary. 3 Community-based Research for Sustainability Association (CRS), Szeged, Hungary. Corresponding author email:
which seeks to promote positive social change by building innovative software solutions with a social conscience [7]. Initiatives such as Games for Change1,
The study of, reflection on, and policy intervention in this tension are at the core of Amartya Sen's Capability Approach (CA) [13]. CA is an approach to human well-being that places human freedoms at its core.
How can we move beyond 'gadgets' and investigate the long-term societal and ethical implications of personalised health technology?
we are investigating the role of low-cost, distributed and independent manufacturing (e.g. 3D printing) in personalised digital health
. & Whittle, J. (2014) Software engineering for 'social good': integrating action research, participatory design, and agile development.
In Companion Proceedings of the 36th International Conference on Software Engineering (pp. 520-523). ACM. 8. Galimberti, U. (2000) I miti del nostro tempo. Feltrinelli Editore. 9. González, A. H., Aristizábal, A. B.,
European small organisations are not ready to use the Internet more intensively as a business tool;
small organisations seem to need favourable conditions to accelerate the diffusion of the Internet and the adoption of ICT, and thus to avoid a digital divide between larger and smaller enterprises and among geographical areas.
The two digital divides. At the Lisbon summit in March 2000, the European Union representatives set the goal of becoming the world's most dynamic and competitive knowledge-based economy by 2010, with the need to promote an 'Information Society
for All' and to address the issues of the digital divide in the adoption of Internet and e-business use.
The statistical evidence points to two main digital divides on e-business issues within European Member States:
The regional digital divide, arising from the different rates of progress in e-business development within the EU,
generally perceived as lying between the Nordic/Western and the Southern European Member States. While Nordic and some Western European countries are sophisticated, fast
The digital divide by company size, arising from the significant 'gaps' between SMEs and larger enterprises in the more advanced forms of electronic commerce, and particularly in terms of e-business integration and associated skills.
and ICT usage by European enterprises survey of 2001.1 The effect of the two digital divides is cumulative
and e-commerce in all sectors of the economy. 2 Benchmarking national and regional e-business policies for SMEs, Final Benchmarking Report, 12 June 2002.
Internet, Web) are the results of Research and Technological Development (RTD) programmes, and the number of such publicly funded programmes has grown.
which directly involve hundreds of SMEs throughout Europe together with many catalysts: local or regional organisations that work with SMEs to facilitate the change process.
procuring hardware and/or software tools (installation, training, and subsequent reorganisation), continuous maintenance, servicing costs and telecommunications charges.
However, getting the right ICT equipment is only part of the equation. SMEs often have very limited resources for experimentation;
they can rarely afford to make expensive mistakes; therefore, uncertainty about the viability of the initial investment and the rising cost of maintenance services may reduce their willingness to undertake the necessary investments.
Economic failures are an intrinsic element of a fast-changing environment like the Internet. Small organisations are reluctant to invest in ICT, preferring to concentrate their investments in their core business. 2. THE DIGITAL SYSTEMS EVOLUTION
AND THE ADOPTION PHASES. Status of digital adoption for small organisations. E-business is often described as the small organisations' gateway to global business and markets,
However, although Internet use figures differ among Member States and sectors, there is generally a positive correlation between the size of an enterprise
and its Internet use for business, i.e. the smaller the company, the less it uses ICT.7
The initial adoption phases. The adoption of Internet-based technologies for e-business is a continuous process, with sequential steps of evolution.
(1) e-mail, (2) web presence, (3) e-commerce, (4) e-business, (5) networked organisations, (6) digital business ecosystems.
In the early stages, the Internet was used as a new instrument of commercial communication. First phase: e-mail (early adopters started in 1986).
The first adoption step was based on the use of the Internet for exchanging e-mails and messages.
Second phase: web presence (from 1993). The second phase saw the proliferation of electronic presence, usually through a static website.
In practice those websites, lost in cyberspace, were not visited by the target clients, and their unavoidable dispersion limited their effectiveness in cyberspace;
this was partially solved by the establishment of vertical, thematic or regional e-marketplace portals and efficient search engines. On average, across the European Union (EU),
only 67% of SMEs have access to the Internet. In some Member States, this is even less than the Internet penetration rate among households.
Of those that are connected, the majority uses the Internet only for information purposes. Only 44% of them have their own website,
and the difference between large enterprises and SMEs and between regions is significant: 80% of large enterprises have their own website,
compared with 6% of Spanish SMEs and 9% of Italian SMEs, but 67% of Finnish and 65% of German SMEs. (SMEs are defined here as enterprises with between 10 and 249 employees; Eurostat considers enterprises with more than 249 employees to be large enterprises.)
Third phase: e-commerce (from 1996). When the technology finally allowed the use of the Internet to perform economic transactions online between enterprises and consumers (B2C), among enterprises and suppliers (B2B),
or internally within the same enterprise, e-commerce started, allowing enterprises to carry out purchases, sales, electronic auctions and e-payments.
Even in the most advanced Member States, only a minority of SMEs uses the Internet for commercial transactions;
especially compared with the US, these figures are alarming signs that European SMEs are not yet fully committed to the Internet.
The OECD estimates that the value of Internet transactions doubles every 12-18 months. European SMES therefore risk missing important economic opportunities.
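To put the estimate in perspective, a doubling time of $T$ months corresponds to an annual growth factor of

$$g = 2^{12/T},$$

i.e. between $2^{12/18} \approx 1.6$ and $2^{12/12} = 2$ per year for the quoted 12-18 month range (a worked restatement of the OECD figure, not an additional estimate).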
Fourth phase: e-business (from 1999). Internet technology has gone far beyond being a mere means of electronic transactions, becoming a foundation for applications linked to core business systems,
extending the use of the Internet from simple commerce to all operations of the business and inventing new operating processes as well:
from marketing to sales, from customer relationships to logistics and operations management, from education to training and knowledge management. Examples of Internet-based applications and infrastructures include:
systems for e-commerce, e-procurement, supply chain management, customer relationship management, enterprise resource planning, logistics, planning, knowledge management, business intelligence and e-training.
Examples of innovative working processes are customer call centers, intranets that link business partners, and data warehouses that improve customer relationships.
E-business opportunities are taken up mainly by large organisations, whilst individual small organisations face well-known barriers.
(In 2001, 6% of EU enterprises used the Internet for electronic delivery and 7% for e-payments;
only 3% of EU enterprises had used the Internet for e-commerce for more than two years; Eurostat, 20.2.2002, ibidem.)
The major obstacles could be overcome by having a software infrastructure with services at acceptable costs;
this includes outsourcing non-core operations, changes in processes and systems, and attention to legal and audit considerations.
The game, then, to be played today is managing organizational genetics together with a new ecology of information technology.
This process is advanced in the insurance, distribution, media and telecom sectors.
The ecosystems are, in fact, characterised by intelligent software components and services, knowledge transfer, interactive training frameworks and the integration of business processes and e-governance models.
The last step in the adoption of Internet-based technologies for business is one where the business services and the software components are supported by a pervasive software environment,
and software components and services developed for that area of business will appear. These components are based on a set of specific sectorial requirements,
databases and the know-how of millions of individuals, is the ultimate source of all economic life. 15 Organizations,
Harnessing the Power of Business Webs, Harvard Business School Press, ISBN 1578511933, May 2000; James Moore, Death of Competition:
which could be software components, applications, services, knowledge, business models, training modules, contractual frameworks, laws. These digital species,
[Table: analogy between natural and digital ecosystems - basic protocols and network infrastructure (TCP/IP, XML, ebXML) alongside WTC regulations; 'organs' such as software components and business models, open source models and operating systems; simple species (grass, worms) through complex species (tiger) alongside small organisations, universities and chambers of commerce; simple services such as basic e-services (accounting, payment and groupware systems).]
They might include systems for electronic payment, for certification and trust, enterprise resource planning, customer relationship management and e-procurement.
and software components and services developed for that area of business appear. These components are based on a set of specific sectorial requirements and include:
generic software components and applications adapted for the specific sector (e.g. adaptations of customer relationship management systems, user profiling systems);
newly developed or imported sector-specific software components (e.g. reservation systems or yield management systems for the tourism sector);
descriptions of the semantics of data, services and processes for that business sector; sector-specific education and training modules; knowledge bases;
The technological infrastructure, the components and the services live within a set of interconnected computer nodes distributed across geographical areas.
In this landscape of virtual distributed communities, the active participation of open source developer communities is a measure of the success of the initiative.
software sharing; common development of open source software; an open and distributed common infrastructure; use of digital business ecosystems. Complexity of regulations. Actions:
The Realization of the Living, 1980, D. Reidel Publishing Company, Dordrecht, Holland; Terry Winograd and Fernando Flores, Understanding Computers and Cognition: A New Foundation for Design, 1986, Ablex
an open source model adopting multiple business models; for the digital species of the specialised ecosystems: encouraging the maximum coexistence and diversity of models and licences, supporting as far as possible equal opportunities for service/solution publishing and fair competition;
and the data formats are open and do not depend on a single provider, to guarantee independence from hardware and software platforms,
the highest interoperability and the possibility to reuse pre-existing information and services. Open source basic infrastructure. To guarantee that the ecosystem attracts a critical mass of developers of services,
and therefore of users, it is critical to guarantee the evolution and continuity of services over time within an open infrastructure,
which can be guaranteed thanks to the availability of the source code. The components of the basic infrastructure. The basic infrastructure of the common ecosystem environment is composed of the infrastructure network and of architectural modules,
Models for sector-specific ecosystems. The use of an open source infrastructure, the convergence on open standards and open systems, and the strong support for interoperability (if necessary through the creation of compatible free software);
any service or component (open source or proprietary) could be substituted as soon as a more adequate one appears in the ecosystem,
provides the digital support for the economic development of small organisations and fosters private entrepreneurship in the sector of production of software components and services.
whilst in the others the two digital divides will increase: they will be further disadvantaged with respect to the large enterprises
The European Council held in Lisbon on 23-24 March 2000 recognised an urgent need for Europe to quickly exploit the opportunities of the new economy and in particular the Internet.
coverage of the territory; number of applications and services present; diffusion and availability of the infrastructure.
Stimulus for small and local ICT software and service providers. The ecosystem stimulates innovation:
the smallest software producer can compete on equal terms with the most powerful corporations. Competitiveness and innovation are thereby increased,
generating a supply of software with better usability, in a model of continuous improvement.
at the local level, the technicians who provide support for proprietary software produced by multinational companies do not have the knowledge or the possibility of high-level development.
The possibility of developing software components and solutions creates more technically qualified employment and a framework of competence
, London WC1E 7HX, UK. Email addresses: marina.ranga@stanford.edu; henryetz@stanford.edu. Corresponding author:
Fax: 650-725-2166. Abstract. This paper introduces the concept of Triple Helix systems as an analytical construct that systematizes the key features of university-industry-government (Triple Helix) interactions into an 'innovation
The activities of the Triple Helix actors are measured in terms of probabilistic entropy, which, when negative, suggests a self-organizing dynamic that may temporarily be stabilized in the overlay of communications among the carrying agencies (e.g.
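For reference, the entropy measure referred to here is, in the Triple Helix literature (e.g. the 'Configurational Information' paper listed among the references below), usually the mutual information among the three institutional dimensions u (university), i (industry) and g (government); the formulation below is that standard definition, restated here for clarity rather than quoted from this abstract:

$$T_{uig} = H_u + H_i + H_g - H_{ui} - H_{ug} - H_{ig} + H_{uig}, \qquad H_x = -\sum p_x \log_2 p_x$$

A negative value of $T_{uig}$ signals a reduction of uncertainty at the system level, which is what is interpreted above as a self-organizing dynamic.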
manifested through increasing communication and interconnectivity between people and institutions, mobility of people and financial capital, delocalisation and globalisation of production sites, labour and social relationships, etc.
which have become the core of a thriving regional innovation network including incubators, science parks, business centres, venture capitalists, spin-off companies and international R&D-intensive companies,
The Consensus Space has a broad coverage of the governance concept, including government and non-government actors who interact continuously to exchange resources
like computer networking, winnowed from a larger collection (Miller, 1997). Yet another situation is when one space becomes the basis for the enhancement of the others,
http://www.adimoserver.se/adimo4/(S(kokri4qeowj3nyvt0tkb1s45))/site/kista/web/default.aspx?p=1546&t=h401&l=en. 6. RELEVANCE OF TRIPLE HELIX SYSTEMS FOR KNOWLEDGE-BASED REGIONAL INNOVATION STRATEGIES. Regional innovation policies have traditionally focused on the promotion of localized learning processes
The large-scale research programmes in data mining funded by the Defence Advanced Research Projects Agency (DARPA) at Stanford and a few other universities provided the context for the development of the Google search algorithm that soon became the basis
the making of the personal computer. McGraw-Hill, New York. Galende, J., Suarez, I. 1999. A resource-based analysis of the factors determining a firm's R&D activities.
The Changing Nature of Organizations, Work and Workplace (downloaded on 8 April from http://www.wbdg.org/resources/chngorgwork.php).
The evolution of communication systems. International Journal of Systems Research and Information Science 6, 219-230. Leydesdorff, L. 1996.
Configurational Information as Potentially Negative Entropy: The Triple Helix Model. Entropy 10, 391-410. Leydesdorff, L., Etzkowitz, H. 1996.
Emergence of a Triple Helix of University-Industry-Government Relations. Science and Public Policy 23, 279-86.
Conflict and the Web of Group Affiliations. Translated and edited by Kurt Wolff. Free Press, Glencoe, IL. Slaughter, S., Leslie, L. 1997.
Types of innovation, sources of information and performance in entrepreneurial SMEs. Miika Varis and Hannu Littunen, Department of Health and Social Management
Research limitations/implications: As the analysis was based on self-reported data provided by the entrepreneurs of SMEs,
as well as on the complex web of interactions and on the institutional environment guiding and facilitating the actions and interactions of economic agents.
Section 3 introduces the data and the research methodology used in this study. In section 4 the results of statistical analysis are presented.
the firm's ability to attract a highly qualified labor force will also become one of its core competencies (Bougrain and Haudeville, 2002).
[Table: generally available information sources rated on a five-point Likert scale (1 = insignificant to 5 = very significant): exhibitions and fairs, the Internet, media, professional literature, educational meetings, entrepreneur friends, participation in development projects.]
Some typical examples are the Internet and other media, commercial exhibitions and fairs, scientific and professional literature, trade journals, educational events, and so forth.
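As a concrete illustration of how such source-importance ratings can be related to innovation outcomes, the sketch below fits a logistic regression of a binary innovation variable on the average rating of the generally available sources; the file name, column names and the choice of logistic regression are assumptions made for illustration, not the paper's reported specification.

```python
# Minimal sketch (not the authors' code): relating Likert-scale ratings of
# generally available information sources to a binary innovation outcome.
import pandas as pd
import statsmodels.api as sm

SOURCE_ITEMS = [
    "fairs", "internet", "media", "professional_literature",
    "educational_meetings", "entrepreneur_friends", "development_projects",
]

df = pd.read_csv("sme_survey.csv")            # hypothetical survey file
# Average importance (1-5 Likert) across the generally available sources.
df["general_sources"] = df[SOURCE_ITEMS].mean(axis=1)

# product_innovation: 1 if the firm introduced a new or radically improved
# product in 2002-2006, else 0 (hypothetical coding).
X = sm.add_constant(df[["general_sources"]])
model = sm.Logit(df["product_innovation"], X).fit()
print(model.summary())
```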
Firms located in peripheral and rural areas especially are often forced to rely on the generally available information sources due to the lack of relevant local network partners and the inadequacy of public support instruments (cf.
The entrepreneur's assessment of the importance of different generally accessible information sources (fairs, media, internet, etc.
As Nås and Leppälahti (1997) remark, a notorious problem with longitudinal statistical analyses such as enterprise panels is attrition,
which leads to missing data and possibly biased results. When reviewing the existing literature on the innovation-performance relationship more broadly than is possible to depict here,
The entrepreneurs' assessment of the impact of innovations on their firms' growth and profitability is used to measure this association (see Table I). 3. Data
and research methodology. 3.1 Sample and data. The primary data for this study were gathered in 2006 via a postal questionnaire among the SMEs located in the Northern Savo region in Eastern Finland, approximately 400
As a sample frame for constructing the database, we used the register of SMEs in the region offered by Suomen Asiakastieto.
In this register, the latest financial statement data of 95,000 Finnish firms and groups are held on one CD,
and the firms invited into the sample were contacted by letter or telephone. 264 questionnaires were returned,
whether the firms had introduced or implemented completely new or radically improved innovations during the four-year period (2002-2006) prior to the data collection,
with cross-sectional data we are unable to prove the existence of a causal relationship or its direction.
However, as our data do not allow a more detailed investigation of this issue, the propositions presented above should be treated with caution,
Regarding our findings, in the case of product innovations, a positive relationship was found between the use of different freely available external information sources (exhibitions, fairs, internet, media, etc.
As our analysis was based on self-reported data provided by the owner-managers of SMEs, we have to rely on the judgment of the entrepreneur regarding the newness of the innovation.
Second, on the basis of our data, we are unable to state whether the external information source used,
Fifth, in this study the data were gathered from single informants, the owner-managers of the firms, only.
Science, Technology and Industry Outlook, OECD, Paris. OECD (2005), The Measurement of Scientific and Technological Activities:
Guidelines for Collecting and Interpreting Innovation Data: Oslo Manual, 3rd ed., OECD, Paris. Pavitt, K. (2005), Innovation processes, in Fagerberg, J., Mowery, D.C. and Nelson, R.R. (Eds), The Oxford
evidence from the Vienna software sector, Economic Geography, Vol. 85 No. 4, pp. 443-62. Tödtling, F. and Kaufmann, A. (1999), Innovation systems in regions of Europe: a comparative perspective, European Planning Studies, Vol. 7 No. 6, pp. 699-717.
Martijn Visser (CWTS), Don Westerheijden (CHEPS)*, Erik van Wijk (CWTS), Michel Zitt (OST). International expert panel: Nian Cai Liu (Shanghai
and business studies and should have a sufficient geographical coverage (inside and outside the EU) and a sufficient coverage of institutions with different missions. In undertaking the project the consortium was greatly assisted by four groups that it worked closely with:
an Advisory Board constituted by the European Commission as the project initiator, which included not only representatives of the Directorate General:
The international panel was consulted at key decision-making moments in the project. Crucially, given the user-driven nature of the new transparency instrument designed within the project,
teaching and learning, research, knowledge transfer, international orientation, and regional engagement. On the basis of data gathered on these indicators across the five performance dimensions,
However, difficulties with the availability and comparability of information mean that it would be unlikely to achieve extensive coverage levels across the globe in the short-term.
or twenty times that number and extending its field coverage from three to around fifteen major disciplinary fields,
Some modifications need to be made to a number of indicators and to the data gathering instruments based on the experience of the pilot study.
and underlying database to produce authoritative expert institutional and field based rankings for particular groups of comparable institutions on dimensions particularly relevant to their activity profiles.
On the positive side they urge decision-makers to think bigger and set the bar higher,1 (1 http://en.wikipedia.org/wiki/Three_points_for_a_win)
what sources of data are used and by whom? We concluded from our review that different rankings
It seems that availability of quantitative data has precedence over their validity and reliability.
Increasingly also, web tools of rankings begin to include some degree of interactivity and choice for end users.
The problem of field and regional biases in publication and citation data: many rankings use bibliometric data, ignoring that the available international publication
and citation databases mainly cover peer-reviewed journal articles, while that type of scientific communication is prevalent only in a narrow set of disciplines (most natural sciences, some fields in medicine) but not in many others (engineering, other fields in medicine and natural sciences, humanities
and social sciences). The problem of unspecified and volatile methodologies: in many cases, users cannot obtain the information necessary to understand how rankings have been made;
Recent reports on rankings such as the report of the Assessment of University-Based Research Expert Group (AUBR Expert Group, 2009) which defined a number of principles for sustainable collection of research data,
leading to a matrix of data that could be used in different constellations to respond to different scenarios (information needs).
data sources or lessons learned about data and data collection. The results of this part of the exercise will be reflected in the next chapters.
and the selection of data sources depends on the interest of research and the purpose of the measurement.
and how those are linked to the data they gather and display. The global rankings that we studied limit their interest to several hundred preselected universities,
staff and students (5%), industry income per staff (2.5%), international faculty (5%), international students (5%). Website: http://ranking.heeact.edu.tw
A major reason why the current global rankings focus on research data is that this is the only type of data readily available internationally.
Use of statistics from existing databases. National databases on higher education and research institutions cover different information based on differing national definitions of items and are
therefore not easily used in cross-national comparisons. International databases such as those of UNESCO, OECD and the EU show those comparability problems
but moreover they are focused on the national level and are therefore not useful for institutional
or field comparisons. 3 International databases with information at the institutional level or lower aggregation levels are currently available for specific subfields:
Regarding research output and impact, there are worldwide databases on journal publications and citations (the well-known Thomson Reuters and Scopus databases).
These databases, after thorough checking and adaptation, are used in the research-based global rankings. Their strengths and weaknesses were mentioned above.
Patent databases have not been used until now for global rankings. Self-reported data collected by higher education and research institutions participating in a ranking.
This source is used regularly, though not in all global rankings, due to the lack of externally available and verified statistics (Thibaud, 2009).
Self-reported data ought to be validated or verified externally; several methods to that end are available.
but are less suited for gathering factual data. Student satisfaction, and to a lesser extent satisfaction of other stakeholders, is used in national rankings,
Manipulation of opinion-type data has surfaced in surveys for ranking and is hard to uncover
i.e. data available in national public sources are entered into the questionnaires sent to higher education institutions for data collection,
and give them the opportunity to verify the 'pre-filled' data as well. (The beginnings of European data collection, as in the EUMIDA project, may help to overcome this problem for the European region in years to come.) The U-Map test with 'pre-filling' from national data sources in Norway proved successful
and resulted in a substantial decrease of the burden of gathering data at the level of higher education institutions. 1.4 Impacts of current rankings. According to many commentators,
impacts of rankings on the sector are rather negative: they encourage wasteful use of resources,
institutional data and choice of publication language (English) and channels (journals counted in the international bibliometric databases),
which may have negligible or even harmful effects on their performance in core activities. Most of the effects discussed above are rather negative for students, institutions and the higher education sector.
The decision about an adequate number of 'performance categories' has to be taken with regard to the number of institutions included in a ranking and the distribution of data.
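As an illustration of how the number of performance categories interacts with the data distribution, the following sketch (a hypothetical procedure, not U-Multirank's published method) assigns institutions to a fixed number of groups using quantile boundaries of one indicator.

```python
import statistics

# Toy indicator values for a handful of institutions (invented numbers)
scores = {"Inst A": 0.62, "Inst B": 1.35, "Inst C": 0.18,
          "Inst D": 0.95, "Inst E": 1.80}

n_groups = 3
# Quantile boundaries splitting the observed distribution into n_groups bins
cutoffs = statistics.quantiles(scores.values(), n=n_groups)

def category(value: float) -> int:
    """Return 1 for the top group and n_groups for the bottom group."""
    exceeded = sum(value > c for c in cutoffs)   # cutoffs the value exceeds
    return n_groups - exceeded

for name, value in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{name}: group {category(value)}")
```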
Rankings have to use multiple databases to bring in different perspectives on institutional performance. As far as possible, available data sources should be used,
but currently their availability is limited. To create multidimensional rankings, gathering additional data from the institutions is necessary.
Therefore, the quality of the data collection process is crucial. In addition rankings should be self-reflexive with regard to potential unintended consequences and undesirable/perverse effects.
Involvement of stakeholders in the process of designing a ranking tool and selecting indicators is crucial to keep feedback loops short,
The basic methodology, the ranking procedures, the data used (including information about survey samples) and the definitions of indicators have to be public for all users.
An important factor in the argument against rankings and league tables is the fact that often their selection of indicators is guided primarily by the (easy) availability of data rather than by relevance.
This is particularly an issue with survey data (e.g. among students, alumni, staff) used in rankings.
In surveys and with regard to self-reported institutional data, the operationalizing of indicators and formulation of questions requires close attention in particular in international rankings,
Hence the indicators and underlying data/measure must be comparable between institutions; they have to measure the same quality in different institutions.
In addition to the general issue of comparability of data across institutions, international rankings have to deal with issues of international comparability.
Indicators, data elements and underlying questions have to be defined and formulated in a way that takes such contextual variations into account, for example
in order to harmonise data on academic staff (excluding doctoral students). Feasibility. The objective of U-Multirank is to design a multidimensional global ranking tool that is feasible in practice.
we now briefly describe the way we have methodologically worked out the principle of being user-driven (see section 2.2). We propose an interactive web-based approach,
This will result in users creating their own specific and different rankings, according to their needs and wishes, from the entire database.
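A minimal sketch of this user-driven idea is given below; the data model, field names and function are invented for illustration and do not describe the actual U-Multirank implementation. The user picks the indicators they care about, and a personalized ranking table is produced per indicator rather than as a single composite league table.

```python
# Hypothetical sketch of user-driven ranking: institutions are ordered by each
# selected indicator separately rather than by one composite score.
institutions = [
    {"name": "Inst A", "graduation_rate": 0.81, "citation_rate": 1.2, "intl_students": 0.15},
    {"name": "Inst B", "graduation_rate": 0.74, "citation_rate": 1.6, "intl_students": 0.32},
    {"name": "Inst C", "graduation_rate": 0.88, "citation_rate": 0.9, "intl_students": 0.05},
]

def personalized_ranking(data, selected_indicators):
    """Return, per selected indicator, the institutions sorted best-first."""
    return {
        ind: sorted(data, key=lambda row: row[ind], reverse=True)
        for ind in selected_indicators
    }

tables = personalized_ranking(institutions, ["citation_rate", "intl_students"])
for indicator, rows in tables.items():
    print(indicator, [row["name"] for row in rows])
```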
A detailed description of the methodology used in this classification can be found on the U-Map website (http://www.u-map.eu/methodology_doc/) and in the final report of the U-Map project,
(4 http://www.ireg-observatory.org/index.php?option=com_content&task=view&id=41&itemid=48) providing users the option to create tailor-made approaches,
The other important components of the construction process for U-Multirank are the databases and the data collection tools that allow us to actually 'fill' the indicators.
These will be discussed further in chapter 4 as we explain the design of U multirank in more detail.
In chapters 5 and 6 we report on the U multirank pilot study during which we analysed the data quality
The first step in the indicator selection process was a comprehensive inventory of potential indicators from the literature and from existing rankings and databases.
which we presented information on the availability of data, the perceived reliability of the indicators,
[Figure: indicator selection process - literature review, review of existing rankings, review of existing databases, first selection, stakeholder consultation, expert advice, second selection, pre-test, revision, final selection.]
The measurement of the indicator is the same regardless of who collects the data or when the measure is repeated.
The data sources and the data to build the indicator are reliable. Comparability: The indicators allow comparisons from one situation/system/location to another;
so that data are comparable. Feasibility: The required data to construct the indicator is either available in existing databases and/or in higher education and research institutions,
or can be collected with acceptable effort. Based on the various stakeholders' and experts' assessments of the indicators as well as on our analyses using the four additional criteria,
the feasibility of the data collection instruments (i.e. the questionnaires used to collect the data) as well as the clarity of the definitions of the required data elements.
The outcome of the pre-test was then used as further input for the wider pilot in which the actual data were collected to quantify the indicators for U-Multirank at both the institutional and the field level.
3.3.1 Teaching and learning. Education is the core activity in most higher education and research institutions.
they have different foci, use different data, different performance indicators and different 'algorithms' to arrive at judgments.
The qualifications frameworks currently being developed in the Bologna process and in the EU may come to play a harmonising role with regard to educational standards in Europe,
(including expenditure on teaching-related overhead) as a percentage of total expenditure. Data available; an input indicator.
Data collection and availability problematic. 4 Relative rate of graduate (un)employment: the rate of unemployment of graduates 18 months after graduation as a percentage of the national rate of unemployment.
Data availability poses a problem. 5 Time to degree: average time to degree as a percentage of the official length of the program (bachelor and master). Reflects effectiveness of the teaching process.
Availability of data may be a problem; depends on the kind of programs. Field-based ranking (definition and comments): 6 Student-staff ratio: the number of students per FTE academic staff. Fairly generally available.
existence of an external advisory board (including employers). Problems with regard to availability of data. 13 Inclusion of work experience into the program: rating based on duration (weeks/credits) and modality
(compulsory or recommended). Data easily available. 14 Computer facilities: internet access: index including hardware; internet access, including WLAN; (field-specific) software; access to computer support. Data easily available.
15 Student gender balance: number of female students as a percentage of total enrolment. Indicates social equity (a balanced situation is considered preferable).
Generally available, but an indicator of social context, not of educational quality. Student satisfaction indicators: indicators reflecting students' appreciation of several items related to the teaching & learning process.
Student satisfaction is of high conceptual validity. It can be made available in a comparative manner through a survey.
Availability of teachers/professors (e.g. during office hours, via email); informal advice and coaching; feedback on homework, assignments, examinations;
University webpage: quality of information for students on the website. Index of several items including general information on institution and admissions, information about the program, information about classes/lectures;
In addition, data availability proved unsatisfactory for this indicator and comparability issues negatively affect its reliability.
research performance measurement frequently takes place through bibliometric data. Data on publication texts and citations are readily available for building bibliometric indicators (see Table 3-2). This is much less the case for data on research awards and data underlying impact indicators.
In addition to performance measures, sometimes input-related proxies such as the volume of research staff and research income are in use to describe the research taking place in a particular institution or unit.
Compared to such input indicators, bibliometric indicators may be more valid measures for the output or productivity of research teams and institutions.
One may mention audio visual recordings, computer software and databases, technical drawings, designs or working models, major works in production or exhibition and/or award-winning
Expert Group on Assessment of University-Based Research (2010) Apart from using existing bibliometric databases,
While this may improve data coverage, such self-reported accounts may not be standardized or reliable, because respondents may interpret the definitions differently.
Data mostly available. Recommended by Expert Group on University-based Research. Difficult to separate teaching
In some countries, competitive public funding may be difficult to separate from other public funding. 3 Research publication output: frequency count of research publications with at least one author address referring to the selected institution (within Web of Science). Broadly accepted.
Data largely available. Widely used in research rankings (Shanghai, Leiden ranking, HEEACT). Different disciplinary customs cause distortion.
Data availability may be weak. 5 Interdisciplinary research activities: share of research publications authored by multiple units from the same institution (based on self-reported data). Research
These data refer to database years. Publishing in top-ranked, high impact journals reflects quality of research.
Data largely available. Books and proceedings are not considered. Never been used before in any international classification
Data suffers from lack of agreed definitions and lack of availability. Quantities difficult to aggregate. 9 Number of international awards
Data suffers from lack of agreed definitions and lack of availability. Quantities difficult to aggregate.
research contracts may run over several years. 11 Research publication output: frequency count of (Web of Science) research publications with at least one author address referring to the selected institutional unit (relative to FTE academic
However, data availability is posing some challenges here. Research publications other than peer-reviewed journal publications are included,
While such data is available, it is limited only to national authors. During the indicator selection process the relevance of the indicator was questioned,
even though data availability and definitions may sometimes pose a challenge. Therefore it was decided to keep them in the list of indicators for U multirank's institutional ranking.
After pre-testing the indicators it became clear that there are some data availability issues in terms of clarity of definitions (for instance FTE staff) and the cost of collecting particular indicators.
A test of the indicators (and the underlying data elements) in the broader pilot study (see chapters 5 and 6),
They are more or less 'ready to use', such as machinery, software, new materials or modified organisms. This is often called 'technology'.
and the near absence of (internationally comparable) data (see chapter 4),16 it proved extremely difficult to do so.
EC Framework programs) plus direct industry income as a proportion of total income. Signals KT success. Some data do exist
ISI databases available. Used in CWTS University-Industry Research Cooperation Scoreboard. 16 See also the brief section on the EUMIDA project,
One of EUMIDA's findings is that data on technology transfer activity and patenting is difficult to collect in a standardized way (using uniform definitions, etc.)
Data are available from secondary (identical) data sources. 5 Size of Technology Transfer Office Number of employees (FTE) at Technology Transfer Office related to the number of FTE
Data are mostly directly available. KT function may be dispersed across the HEI. Not regarded as a core indicator by EGKTM. 6 CPD courses offered: number of CPD courses offered per academic staff (FTE). Captures outreach to the professions. Relatively new indicator.
Data available from secondary sources (PATSTAT). 8 Number of spin-offs: the number of spin-offs created over the last three years per academic staff (FTE). EGKTM regards spin-offs as a core indicator.
Data available from secondary sources. Clear definition and demarcation criteria needed. Does not reveal market value of spin-offs.
Field-based Ranking Definition Comments 9 Academic staff with work experience outside higher education Percentage of academic staff with work experience outside higher education within the last 10
Data difficult to collect. 10 Annual income from licensing The annual income from licensing agreements as a percentage of total income Licensing reflects exploiting of IP.
Data available from secondary (identical) data sources. Patents with an academic inventor but another institutional applicant (s) not taken into account.
and from the pre-test it became clear that the data are difficult to collect. Therefore this indicator was not kept in the list for the pilot.
For many of the indicators data are available in the institutional databases. Hardly any of such data can be found in national or international databases.
The various manifestations and results of internationalization are captured through the list of indicators shown in Table 3-5. The table includes some comments made during the consultation process that led to the selection of the indicators.
Data availability good. Relevant indicator. Used quite frequently. Sensitive to the relative 'size' of the national language. 2 International academic staff: foreign academic staff members (headcount) as a percentage of the total number of academic staff members (headcount).
Availability of data problematic. 4 International joint research publications Relative number of research publications that list one or more author affiliate addresses in another country relative to research staff
Data available in international data bases but bias towards certain disciplines and languages. 5 Number of joint degree programs The number of students in joint degree programs with foreign university (including integrated period at foreign university) as a percentage of total
Data available. Indicator not often used. Field-based Ranking Definition Comments 6 Incoming and outgoing students Incoming exchange students as a percentage of total number of students and the number of students going abroad as a percentage of total
Data available. 7 International graduate employment rate The number of graduates employed abroad or in an international organization as a percentage of the total number of graduates employed Indicates the student preparedness on the international labor market.
Data not readily available. No clear international standards for measuring. 8 International academic staff Percentage of international academic staff in total number of (regular) academic staff See above institutional ranking 9 International
Data are available. Stakeholders question relevance. 10 Student satisfaction: Internationalization of programs Index including the attractiveness of the university's exchange programs, the attractiveness of the partner universities, the sufficiency of the number of exchange places;
Data available but sensitive to location (distance to border) of HEI. Stakeholders consider the indicator important. 13 Student satisfaction:
composite indicators depend on the availability of each data element. It should be pointed out here that one of the indicators is a student satisfaction indicator:
and although data are available, stakeholders consider this indicator not very important. Moreover, the validity is questionable as the size of the international office, as a facilitating service, is a very distant proxy indicator.
http://classifications.carnegiefoundation.org/details/community_engagement.php acknowledge regional engagement activities? Are there visible structures that function to assist with region-based teaching and learning?
No national data on graduate destinations. 19 http://epp.eurostat.ec.europa.eu/portal/page/portal/region_cities/regional_statistics/nuts_classification 20 http://www.oecd
Availability of data problematic. 3 Regional joint research publications Number of research publications that list one or more author-affiliate addresses in the same NUTS2 or NUTS3 region,
Data available (Web of Science), but professional (laymen's) publications not covered. 4 Research contracts with regional business The number of research projects with regional firms,
Definition of internship problematic and data not readily available. Disciplinary bias. Field-based Ranking Definition Comments 6 Degree theses in cooperation with regional enterprises Number of degree theses in cooperation with regional enterprises as a percentage of total number
Data not readily available. Indicator hardly ever used. 9 Student internships in local/regional enterprises Number of internships of students in regional enterprises (as percentage of total students See above institutional ranking,
Limited availability of data. Lack of internationally accepted definition of summer school courses. 77 During the process of selection of indicators the list of indicators underwent a number of revisions.
While data may be found in international patent databases, the indicator is not often used and stakeholders did not particularly favor it.
but data constraints prevent us from the use of such an indicator. Public lectures that are open to an external
The above discussion makes it clear that regional engagement is a dimension that poses many problems with regard to availability of performance-oriented indicators and their underlying data.
In the next chapter we will discuss more extensively the data gathering instruments that are available. In chapters 5
4 U-Multirank: databases and data collection tools. 4.1 Introduction. In this chapter we will describe the databases
and data collection instruments used in constructing U multirank. The first part is an overview of existing databases mainly on bibliometrics and patents.
The second presents an explanation of the questionnaires and survey tools used for collecting data from the institutions (the self-reported data) at the institutional
and department levels and from students. 4.2 Databases. 4.2.1 Existing databases. One of the activities in the U-Multirank project was to review existing rankings
and explore their underlying databases. If existing databases can be relied on for quantifying the U multirank indicators this would be helpful in reducing the overall burden for institutions in handling the U-Multirank data requests.
However, from the overview of classifications and rankings presented in chapter 1 (section 1. 3) it is clear that international databases holding information at institution level
or at lower aggregation levels are currently available only for particular aspects of the dimensions Research and Knowledge Transfer.
For other aspects and dimensions, U multirank will have to rely on self-reported data. Regarding research output and impact, there are worldwide databases on journal publications and citations.
For knowledge transfer, the database of patents compiled by the European Patent office is available. In the next two subsections
available bibliometric and patent databases will be discussed. To further assess the availability of data covering individual higher education and research institutions,
the results of the EUMIDA project were also taken into account.21 The EUMIDA project (see:
www. eumida. org) seeks to develop the foundations of a coherent data infrastructure (and database) at the level of individual higher education institutions.
Section 4. 2. 4 presents an overview of availability based on the outcomes of the EUMIDA project.
Our analysis of data availability was completed with a brief online consultation of the group of international experts connected to U-Multirank (see section 4.2.5). The international experts were asked to give their assessment of the
situation with respect to data availability in some of the non-EU countries included in U-Multirank. (21 The U-Multirank project was granted access to the preliminary outcomes of the EUMIDA project in order to learn about data availability in the countries covered by EUMIDA.) 4.2.2 Bibliometric databases. There are a number of international databases
which can serve as a source of information on the research output of a higher education and research institution (or one of its departments).
An institution's quantity of research-based publications (per capita) reflects its research output and can also be seen as a measure of scientific merit or quality.
In particular, if its publications are highly cited within the international scientific communities this may characterize an institution as high-impact and high-quality.
The production of publications by a higher education and research institute not only reflects research activities in the sense of original scientific research,
but usually also the presence of underlying capacity and capabilities for engaging in sustainable levels of scientific research. 22 The research profile of a higher education
and research institution can be specified further by taking into account its engagement in various types of research collaboration.
For this one can look at joint research publications involving international, regional and private sector partners.
The subset of jointly authored publications is a testimony of successful research cooperation. Data on numbers and citations of research publications are covered relatively well in existing databases.
Quantitative measurements and statistics based on information drawn from bibliographic records of publications are usually called 'bibliometric data'.
These data concern the quantity of scientific publications by an author or organisation and the number of citations (references) these publications have received from other research publications.
There is a wide range of research publications available for characterizing the research profile and research performance of an institution by means of bibliometric data:
lab reports, journal articles, edited books, monographs, etc. The bibliometric methodologies applied in international comparative settings such as U multirank usually draw their information from publications that are released in scientific and technical journals.
This part of the research literature is covered ('indexed') by a number of international databases. In most cases the journals indexed are internationally peer-reviewed,
which means that they adhere to international quality standards. U-Multirank therefore makes use of international bibliometric databases to compile some of its research performance indicators
and a number of research-related indicators belonging to the dimensions of Internationalisation, Knowledge Transfer and Regional Engagement. (22 This is why research publication volume is part of the U-Map indicators that reflect the activity profile of an institution.) Two of the most well-known databases that are available for carrying out
bibliometric analyses are the Web of Science and Scopus. 23 Both are commercial databases that provide global coverage of the research literature
and both are easily accessible. The Web of Science database is maintained by ISI, the Institute for Scientific Information,
which was taken over by Thomson Reuters a few years ago. The Web of Science currently covers about 1 million new research papers per year
published in over 10,000 international and regional journals and book series in the natural sciences, social sciences,
and arts and humanities. According to the Web of Science website, 3,000 of these journals account for about 75% of published articles
and over 90% of cited articles.24 The Web of Science claims to cover the highest impact journals worldwide,
including Open Access journals and over 110,000 conference proceedings. The Scopus database was launched in 2004 by the publishing house Elsevier.
It claims to be the largest abstract and citation database containing both peer-reviewed research literature and web sources.
It contains bibliometric information covering some 17,500 peer-reviewed journals (including 1,800 Open Access journals) from more than 5,000 international publishers.
Moreover it holds information from 400 trade publications and 300 book series, as well as data about conference papers from proceedings and journals.
To compile the publications-related indicators in the U multirank pilot study, bibliometric data was derived from the October 2010 edition of the Web of Science bibliographical database.
An upgraded 'bibliometric version' of the database is housed and operated by CWTS (one of the CHERPA Network partners) under a full license from Thomson Reuters. This dedicated version includes the 'standardized institutional names' of higher education
and research institutes that have been checked ('cleaned') and harmonized in order to ensure that as many as possible of the Web of Science-indexed publications are assigned to the correct institution.
This data processing of address information is done at the aggregate level of the entire 'main' organization (not for sub-units such as departments or faculties).
All the selected institutions in the U multirank pilot study produced at least one Web of Science-indexed research publication during the years 1980-2010.
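The idea behind this address cleaning can be illustrated with a small sketch; the variant dictionary and the substring-matching rule below are invented for illustration and are not the CWTS procedure.

```python
from typing import Optional

# Illustrative sketch only: map raw author address strings to a standardized
# institution name so that publications are counted against the correct
# "main" organization.
NAME_VARIANTS = {
    "univ szeged": "University of Szeged",
    "szeged univ": "University of Szeged",
    "lancaster univ": "Lancaster University",
}

def harmonize(address: str) -> Optional[str]:
    """Return the standardized institution name found in a raw address, if any."""
    addr = address.lower()
    for variant, canonical in NAME_VARIANTS.items():
        if variant in addr:
            return canonical
    return None  # unmatched addresses would need manual cleaning

print(harmonize("Dept Comp Sci, Lancaster Univ, Lancaster LA1 4YW, England"))
```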
The Web of Science, being both an international and multidisciplinary database, has its pros and cons. The bulk of the research publications are issued in peer-reviewed international scientific and technical journals,
which mainly refer to discovery-oriented 'basic' research of the kind that is conducted at universities and research institutes.
There are relatively few conference proceedings in the Web of Science, and no books or monographs whatsoever; hence, publications referring to 'applied research'
or 'strategic research' are underrepresented. It has a relatively poor coverage of non-English language publications.
23 Yet another database is Google Scholar. This is a service based on the automatic recording by Google's search engine of citations to any author's publications (of whatever type) included in other publications appearing on the worldwide web.
24 See: http://thomsonreuters.com/products_services/science/science_products/a-z/web_of_science/
The coverage of publication output is quite good in the medical sciences, life sciences and natural sciences, but relatively poor in many of the applied sciences and social sciences and particularly within the humanities.
The alternative source of bibliographical information, Elsevier's Scopus database, is likely to provide an extended coverage of the global research literature in those underrepresented fields of science.
For the following six indicators selected for inclusion in the U multirank pilot test (see chapter 6), one can derive data from the CWTS/Thomson Reuters Web of Science database:
1. total publication output
2. university-industry joint publications
3. international joint publications
4. field-normalized citation rate
5. share of the world's most highly cited publications
6. regional joint publications
We note that this set includes four new performance indicators (#2, #3, #5, #6) that were constructed specially for U multirank and that have never been used before in any international classification or ranking.
4.2.3 Patent databases
As part of the indicators in the Knowledge Transfer dimension, U multirank selected the number of patent applications for
Data for the co-patenting and patents indicators may be derived from patent databases. For U multirank, patent data were retrieved from the European Patent Office (EPO).
Its Worldwide Patent Statistical Database (version October 2009)25, also known as PATSTAT, is designed and published on behalf of the OECD Taskforce on Patent Statistics.
Other members of this taskforce include the World Intellectual Property Organisation (WIPO), the Japanese Patent Office (JPO), the United States Patent and Trademark Office (USPTO), the US National Science Foundation (NSF),
and the European Commission represented by Eurostat and by DG Research.
25 This version is held by the K.U. Leuven (Catholic University Leuven).
The PATSTAT patent database is designed especially to assist in advanced statistical analysis of patent data.
It contains patent data from over 80 countries, adding up to 70 million records (63 million patent applications and 7 million granted patents).
The patent data are sourced from offices worldwide, including of course the most important and largest ones such as the EPO, the USPTO, the JPO and the WIPO.
4.2.4 Data availability according to EUMIDA
Like the U multirank project, the EUMIDA project (see http://www.eumida.org) collects data on individual higher education and research institutions. It also examines whether a data collection effort can be undertaken by EUROSTAT in the foreseeable future. EUMIDA covers 29 countries (the 27 EU member states plus two additional countries:
Switzerland and Norway) and investigates the data available from national databases in as far as these are held/maintained by national statistical institutes, ministries or other organizations.
The EUMIDA project has demonstrated that a regular data collection by national statistical authorities is feasible across (almost) all EU member states.
The EUMIDA and U multirank project teams agreed to share information on issues such as definitions of data elements
and data sources, given that the two projects share a great deal of data (indicators). The overlap lies mainly in the area of data related to the inputs (or activities) of higher education and research institutions.
A great deal of this input-related information is used in the construction of the indicators in U-Map.
The EUMIDA data elements are therefore much more similar to the U-Map indicators, since U-Map aims to build activity profiles for individual institutions whereas U multirank constructs performance profiles.
The findings of EUMIDA point to the fact that, for the more research-intensive higher education institutions, data for the dimensions of Education and Research are covered relatively well, although data on graduate careers and employability are sketchy. Some data on scientific publications is available for most countries. However, overall, performance-related data is less widely available than input-related data items.
26 A patent family is a set of patents taken in various countries to protect a single invention (see www.uspto.gov).
The role of national statistical institutes is quite limited here, and the underlying methodology is not yet consistent enough to allow for international comparability of data.
Table 4-1 below shows the U multirank data elements that are covered in EUMIDA and whether information on these data elements may be found in national databases (statistical offices, ministries, rectors' associations, etc.).
The table shows that EUMIDA primarily focuses on the Teaching & Learning and Research dimensions,
with some additional aspects relating to the Knowledge Transfer dimension; EUMIDA never had the intention to cover all dimensions of an institution's activity (or its performance).
The table illustrates that information on only a few U multirank data elements is available from national databases and,
moreover, that what data exists is available only in a small minority of European countries. This implies,
once again, that the majority of data elements will have to be collected directly from the institutions themselves.
Table 4-1: Data elements shared between EUMIDA and U multirank: their coverage in national databases
Dimension | EUMIDA and U multirank data element | European countries where the data element is available in national databases
Teaching & Learning | relative rate of graduate unemployment | CZ, FI, NO, SK, ES
Research | expenditure on research | AT*, BE, CY, CZ*, DK, EE, FI, GR*, HU, IT, LV*, LT*, LU, MT*, NO, PL*, RO*, SI*, ES, SE, CH
(*) indicates: there are confidentiality issues (e.g. national statistical offices may not be prepared to make data public without consulting individual HEIs)
(p) indicates: data are only partially available (e.g. only for public HEIs, or only for (some) research universities)
The list of EUMIDA countries with abbreviations:
4.2.5 Expert view on data availability in non-European countries
The Expert Board of the U multirank project was consulted to assess, for their six countries27 (all from outside Europe), the availability of data, i.e. whether data was available in national databases and/or in the institutions themselves. Table 4-2 shows that the Teaching and Learning dimension scores best in terms of data availability.
The dimensions Research and Knowledge Transfer have far less data available on the national level,
but this is compensated for by the data available at the institution level. The same holds true, to a lesser extent, for the dimension International Orientation, where little data is available in national databases.
The Regional Engagement dimension is the most problematic in terms of data availability. Here, data will have to be collected from the individual institutions.
27 Argentina, Australia, Canada, Saudi Arabia, South Africa and the US.
Table 4-2: Availability of U multirank data elements in countries' national databases according to experts in 6 countries (Argentina/AR, Australia/AU, Canada/CA, Saudi Arabia/SA, South Africa/ZA, United States/US)
Dimension | U multirank data element | Countries where the data element is available in national databases | Countries where the data element is available in institutional databases
Teaching & Learning | expenditure on teaching | AR, US, ZA | AR, AU, SA, ZA
Teaching & Learning | time to degree | AR, CA, US, ZA | AR, AU, CA, SA, ZA
Teaching & Learning | graduation rate | AR, CA, US, ZA | AR, AU, SA, ZA
Teaching & Learning | relative rate of graduate unemployment | AU, CA, US |
Research | expenditure on research | AR, AU, ZA | AR, AU, SA, US, ZA
Research | number of post-doc positions | CA, US, ZA |
In the Research dimension, Expenditure on Research and Research Publication Output data are represented best in national databases.
For the other data elements, however, information is not really available in national databases. According to the experts consulted, more data can probably be found in institutional databases.
However, if that is the case, there is always a risk that different institutions may use different definitions.
Even if information is available in databases (national, institutional or other), our experts stressed that it is not always easy to obtain that information (for instance in the case of data relating to the dimension Regional Engagement).
To obtain a better idea of data availability, we carried out a special pre-test (see section 4.3.3).
4.3 Data collection instruments
Due to the lack of adequate data sets,
the U multirank project had to rely largely on self-reported data (both at the institutional
and field-based levels), collected directly from the higher education and research institutions. The main instruments to collect data from the institutions were four online questionnaires:
three for the institutions and one for students. The four surveys are: the U-Map questionnaire, the institutional questionnaire, the field-based questionnaire and the student survey.
In designing the questionnaires, emphasis was placed on the way in which questions were formulated. It is important that they can only be interpreted in one way.
4.3.1 Self-reported institutional data
4.3.1.1 U-Map questionnaire
As explained, the U-Map questionnaire is an instrument for identifying similar subsets of higher education institutions within the U multirank sample.
Data is collected in seven main categories: general information (name and contact; public/private character and age of institution);
staff data (fte and headcount; international staff); income (total income; income by type of activity);
4.3.1.2 Institutional questionnaire
In the U multirank institutional questionnaire, data is collected on the performance of the institution. Like the U-Map questionnaire, this questionnaire is structured along the lines of different data types to allow for a more rapid data collection by the institution's respondents.
The questionnaire is therefore divided into the following categories: general information (name and contact; public/private character and age of institution);
coverage; research & knowledge transfer: publications; patents; concerts and exhibitions; start-ups. As the institutional questionnaire and the U-Map questionnaire partly share the same data elements,
institutions were advised to first complete the U-Map questionnaire. Data elements from U-Map are transferred automatically to the U multirank questionnaire using atransfer tool'.
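A minimal sketch of what such a transfer tool does is given below. The shared field names are hypothetical; the real instruments define their own data elements.

# Illustrative sketch of a 'transfer tool' that pre-fills a U multirank
# institutional questionnaire with answers already given in the U-Map
# questionnaire. Field names below are hypothetical examples.
SHARED_FIELDS = ["institution_name", "public_private", "year_founded", "total_income"]

def prefill(umap_answers: dict, umr_answers: dict) -> dict:
    """Copy shared data elements from U-Map into the U multirank questionnaire,
    without overwriting anything the respondent has already entered."""
    merged = dict(umr_answers)
    for field in SHARED_FIELDS:
        if field in umap_answers and not merged.get(field):
            merged[field] = umap_answers[field]
    return merged

umap = {"institution_name": "Example University", "public_private": "public", "year_founded": 1904}
umr = {"total_income": 250_000_000}
print(prefill(umap, umr))

Pre-filling in this way avoids asking institutions twice for the same data element, which is the main point of linking the two questionnaires.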
The academic year 2008/2009 was selected as the default reference year.
4.3.1.3 Field-based questionnaire
The field-based questionnaire includes information on individual faculties/departments.
Like the institutional questionnaire, the field-based questionnaire is structured along the different types of data requested to reduce the administrative burden for respondents.
Data was collected for the reference period 2009/2010; for data which are expected to be subject to annual fluctuations,
data for three subsequent years was collected to calculate three-year averages.
28 See Appendix 12 for the institutional questionnaire.
The following categories are distinguished:
4.3.2 Student survey
Students were invited to complete the survey either by mail or email rather than having them complete the survey in the classroom. The survey asks for the students' basic demographic data and information on their programme. The main focus of the survey is on the assessment of teaching.
4.3.3 Pretesting the instruments
A first version of the three new data collection instruments (the institutional questionnaire, the field-based questionnaire and the student questionnaire) was pre-tested with a small group of institutions.
The U multirank questionnaires were tested in terms of cultural/linguistic understanding, clarity of definitions of data elements and feasibility of data collection.
Instead of asking them to provide all the data at relatively short notice, these institutions were contacted to offer their feedback on the clarity of the questions and on the availability of data.
According to the pre-test results, the general format and structure of the institutional questionnaire seemed to be clear and user-friendly.
Secondly, several indicators presented difficulties to respondents because the required data was not collected centrally by the institution.
Problems emerge, however, with some output-related data elements such as graduate employment, where data is often not collected at the institutional level.
Interdisciplinarity of programs proved to be another problematic indicator, where problems emerged due to the definition of the concept and the absence of the required data.
Research. Most data items in this dimension did not lead to problems. In fact, some of the key indicators are extracted from international bibliometric databases anyway
and did not need data provision from the institutions. As expected, some difficulties emerged for 'art-related outputs'.
Sharper definitions were called for here. Knowledge Transfer and Regional Engagement. Compared to Teaching and Research,
these two dimensions are less prevalent in existing national and institutional databases and therefore presented some data availability problems.
This was the case for 'graduates working in the region' and 'student internships in regional enterprises'.
Comprehensive information on start-up firms and professional development courses was not always available for institutions as a whole.
The pre-test did reveal a need for clearer definitions for some data elements. Pre-test results also indicated that some data elements,
although highly relevant and valid, could not feasibly be collected because institutions did not have such data.
With respect to this issue the project team, with the help of the Advisory Board, had a critical look at the problematic indicators.
Problems with regard to the availability of data were reported mainly on issues of academic staff (e.g. fte data, international staff), links to business (in education/internships and research) and the use of credits (ECTS).
In order to come to a meaningful and comprehensive set of indicators at the conclusion of the U multirank pilot study we had to aim for a broad data collection to cover a broad range of indicators.
One will have to deal with the issue of institutions providing 'estimated' values instead of data from existing data sets.
This enabled us to get an impression of the precision of the data. For the student questionnaire the conclusion was that there is no need for changes in the design.
Comments received showed that the questionnaire is seen as a useful instrument.
4.3.4 Supporting instruments
In order to assure that a comparable data set was obtained, several supporting instruments were developed.
This is particularly important as institutions from diverse national settings are an important source for data collection.
The following supporting instruments were provided to offer more clarity to the respondents during the process of data collection:
A glossary of indicators for the four surveys was published on the U multirank website. Throughout the data collection process the glossary was updated regularly.
A 'frequently asked questions' (FAQ) section, next to a 'Helpdesk' function, was launched on the website.
This allowed questions to be asked concerning the questionnaires and for contact with the U multirank team on other matters.
Protocols describing data collection and handling were developed to explain to the institutions in detail how the different steps were laid out, from the start through to the finalising of the data collection.
A technical specifications protocol for U multirank was developed, introducing additional functions in the questionnaire to ensure that a smooth data collection could take place:
the option to download the questionnaire in PDF format, the option to transfer data from the U-Map to the U multirank institutional questionnaire,
and the option to have multiple users access the questionnaire at the same time. We updated the U multirank website regularly
and provided information about the steps/time schedules for data collection. All institutions had clear communication partners from the U multirank team.
4.4 A concluding perspective
This chapter, providing a quick survey of existing databases,
underlines that there are very few international databases/sources where data can be found for our type of rankings.
The only sources that are available are international databases holding bibliometric and patent data. This implies that
in particular for a ranking that aims to sketch a multidimensional picture of an institution at the institutional and disciplinary field levels,
one will have to rely to a large extent on data collected by means of questionnaires sent to representatives of institutions, their students and possibly their graduates.
One could even go beyond these stakeholder groups and include employers and other clients of higher education and research institutions.
The way the data are collected then becomes a critical issue, where compromises have to be struck between comprehensiveness and feasibility.
Different questionnaires will have to be sent to the different data providers: institutions, representatives of departments in the institution and students.
The intelligent use of technology (internet, visualisation techniques, supporting tools) is also important. The language of the questionnaire is another crucial element for ensuring a good response.
As rankings order their objects in terms of their scores on quantitative indicators they require uniform definitions of the underlying data elements.
Given different national customs and definitions of indicators, there are limits to the comparability of data. Respondents should be able to add explanations
and comments to the data they submit through the questionnaires. In a few cases, one may have to allow respondents to provide estimates for some of the answers
if data is otherwise unavailable or too costly to collect. Checking the answers can be done based on internal consistency checks,
comparing data to that of other institutions, or making use of data from other sources, but this clearly also has its limits.
What this chapter has made clear is that the questionnaires and surveys need to be tested first on a small scale before embarking on a bigger survey.
Taking into account the experiences from other similar ranking/data collection projects and making use of the advice of external experts
and national correspondents in the testing and further execution of the survey is another element that needs to be part of the data collection strategy.
5 Testing U multirank: pilot sample and data collection
5.1 Introduction
Now that we have presented the design
and construction process for U multirank, we will describe the feasibility testing of this multidimensional ranking tool.
This test took place in a pilot study specifically undertaken to analyse the actual feasibility of U multirank on a global scale.
In this chapter we will describe the process of recruiting the sample of pilot institutions and the data collection in the pilot study: the collection of both self-reported institutional data
and data from international databases.
5.2 The global sample
A major task of the feasibility study was the selection of institutions to be included in the pilot study.
The selection of the 150 pilot institutions (as specified in the project outline) needed to be informed by two major criteria:
including a group of institutions that reflects as much institutional diversity as possible; and making sure that the sample was regionally and nationally balanced.
In addition we needed to ensure sufficient overlap between the institutional ranking and the field-based rankings in business studies and two fields of engineering.
As has been indicated in chapter 2 of this report, one of the basic ideas of U multirank is the link to U-Map.
U-Map is an effective tool to identify institutional activity profiles of institutions similar enough to compare them in rankings.
Yet at this stage of its development U-Map includes only a limited number of provisional institutional profiles
which makes it insufficiently applicable for the selection of the sample of pilot institutions for the U multirank feasibility test.
Since U-Map cannot yet offer sets of comparable institutional profiles we needed to find another way to create a sample with a sufficient level of diversity of institutional profiles.
We do not (and cannot) claim that we have designed a sample that is representative of the full diversity of higher education in the world (particularly as there is no adequate description of this diversity)
but we have succeeded in including a wide variety of institutional types in our sample. Potential pilot institutions to be invited for the sample were identified in a number of ways:
The existing set of higher education institutions in the U-Map database was included. This offered a clear indication of a broad variety of institutional profiles. Some universities applied through the U multirank website to participate in the feasibility study.
Their broad profiles were checked as far as possible against the U-Map dimensions in order to be able to describe their profiles.
Finally, 115 institutions submitted data as part of the pilot study.
Table 5-1: Regional distribution of participating institutions
Columns: Region and Country | Initial proposal for number of institutions (July 2010) | Institutions in the final pilot selection (February 2011) | Institutions that confirmed participation (April 2011) | Institutions which delivered U multirank institutional data (April 2011) | Institutions which delivered U multirank institutional data and U-Map data
I. EU 27 (population in millions)
Austria (8m) | 2 | 2 | 5 | 5 | 4
Belgium (10m) | 3 | 3 | 5 | 3 | 3
Bulgaria (8m) | 2 | 3 | 3 | 3 | 3
Netherlands (16m) | 3 | 7 | 3 | 3 | 3
Poland (38m) | 6 | 12 | 7 | 7 | 6
Other Asia | 5 | 2
The Philippines | 1 | 1 | 1
Taiwan | 1 | 1 | 0
Vietnam | 2
770 students provided data via the online questionnaire. After data cleaning we were able to include 5,901 student responses in the analysis:
45% in business studies, 23% in mechanical engineering and 32% in electrical engineering.
5.3 Data collection
The data collection for the pilot study took place via two different processes:
the collection of self-reported data from the institutions involved in the study (including the student survey) and the collection of data on these same institutions from existing international databases on publications/citations and patents.
In the following sections we discuss these data collection processes.
5.3.1 Institutional self-reported data
5.3.1.1 The process
The process of data collection from the organizations was organised in a sequence of steps
(see Figure 5-1). First we asked the institutions, after official confirmation of participation, to fill in a contact form.
The data collection entailed the following instruments: the U-Map questionnaire to identify institutional profiles, and the U multirank institutional and field-based questionnaires for the rankings.
Figure 5-1: U multirank data collection process
The institutions were given seven weeks to collect the data, with deadlines set according to the dates on which the institutions confirmed their participation.
After the deadlines for data submission had passed, we checked on the questionnaires submitted by the institutions.
These different steps allowed us to actively follow the data collection process and to assist institutions as needed.
An important element in terms of quality assurance of the data was a feedback loop built into the process.
After the institutions had submitted their questionnaires their data was checked and we provided comments and questions.
Institutions were then able to check their data, correct inconsistencies and add missing information. Organising a survey among students on a global scale was one of the major challenges in U multirank.
The data collection through the student survey was organized by the participating institutions. They were asked to send invitation letters to their students,
either by regular mail or by email. We prepared a standard letter to students explaining the purpose of the survey/project.
Institutions were able to download a package including the letter and a list of passwords (for email invitation) and a form letter (for printed mail invitations).
Of these, 5,901 could be included in the analysis.
5.3.1.2 Follow-up survey
After the completion of the data collection process we asked those institutions that submitted data to share their experience of the process.
One particular issue was the burden of data delivery in the various surveys. As can be seen in Table 5-2, this burden differed substantially between the pilot institutions.
Table 5-2: Self-reported time needed to deliver data (fte staff days)
Data collection tool | N | Minimum | Maximum | Mean
Institutional questionnaire | 26 | 1.0 | 30
The European institutions spent significantly less time on delivering the data than the institutions from outside Europe.
Table 5-3: Self-reported time needed to deliver data (fte staff days): European vs. non-European institutions
Data collection tool | Europe (Mean, N) | Non-Europe (Mean, N)
Institutional questionnaire | 6.2, 15 | 8.3, 10
Field questionnaire Business studies | 2.5, 10 | 7.3, 7
Field questionnaire Electrical engineering | 3.5, 8 | 7.0
Figure 5-2 shows that the data collection process and procedures were judged positively by the pilot institutions.
Figure 5-2: Assessment of data procedures and communication
Other questions in the follow-up survey referred to the efficiency of data collection and the clarity of the questionnaires.
In general the efficiency of data collection was reported to be good by the pilot institutions; critical comments indicated some confusion about the relationship between the U-Map and U multirank institutional questionnaires.
Figure 5-3: Assessment of data collection process
Some institutions were critical about the clarity of questions. Comments show that this criticism refers mainly to issues concerning staff data (e.g. the concept of full-time equivalents)
and to aspects of research and knowledge transfer (e g. international networks, international prizes, cultural awards and prizes).
In the follow-up survey we also asked about major problems in delivering the data. Most pilot institutions reported no major problems with regard to student,
graduate and staff data. If they had problems these were mostly with research and third mission data (knowledge transfer,
regional engagement) (see Figure 5-4).
Figure 5-4: Availability of data
5.3.1.3 Data cleaning
As was indicated earlier, due to the lack of relevant and useful data sets we had to rely largely on self-reported data (both at the institutional
and the field-based level). This inevitably raises the question of the control and verification of data.
Based on the experiences from U-Map and from the CHE ranking we applied a number of mechanisms
and procedures to verify data. Verification refers to the identification and correction of errors due to:
simple data errors and potential manipulation of data. In order to reduce the number of errors due to misunderstanding of definitions,
a glossary, an FAQ section and a Helpdesk function were launched on the website. Furthermore, we shared the U-Map protocol and the U multirank technical specification email (see appendices 10 and 11) with the institutions to ensure that a smooth data collection could take place.
If, despite these tools, questions of definition still occurred, all universities had clear communication partners in the U multirank team.
The main part of the verification process consisted of the data cleaning procedures after receiving the data.
Where problems could not be resolved, the particular data were not included in the pilot data analysis. The main data cleaning
procedures carried out on the data provided by the institutions are described below.
The institutional questionnaires. For the institutional questionnaires we performed the following checks:
A check on outliers in the data elements: the raw data (the answers provided by the institutions) were analysed first regarding outliers.
If a score was extremely high or low (compared to the scores of the other institutions on that data element),
the data element was flagged for further analysis. A check on outliers in indicator scores:
the scores on the indicators were calculated using the raw data and the formulas. If a score was extremely high or low,
the data element was flagged for further analysis. A check for missing values: the data elements where data were missing
or not available were flagged. Comments regarding reasons for missing data were studied and the missing values were compared to data from other institutions from the same country.
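A minimal sketch of this kind of outlier and missing-value flagging is given below. The z-score rule, the threshold and the figures are hypothetical; in the pilot study the flagged cases were examined by the project team rather than by a fixed formula.

import statistics

def flag_data_element(values: dict, z_threshold: float = 2.0) -> dict:
    """values maps institution id -> reported value (or None if missing).
    Returns institution id -> flag ('missing', 'outlier' or 'ok')."""
    reported = [v for v in values.values() if v is not None]
    mean = statistics.mean(reported)
    stdev = statistics.pstdev(reported)
    flags = {}
    for inst, value in values.items():
        if value is None:
            flags[inst] = "missing"
        elif stdev > 0 and abs(value - mean) / stdev > z_threshold:
            flags[inst] = "outlier"   # flagged for further (manual) analysis
        else:
            flags[inst] = "ok"
    return flags

# Hypothetical example: expenditure on research as a percentage of total expenditure
print(flag_data_element({"A": 22.0, "B": 25.5, "C": None, "D": 97.0, "E": 24.0, "F": 23.5, "G": 26.0}))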
These three checks were first performed for the entire data set. In addition, more detailed checks were performed within a country or region.
The focus of these more detailed checks was on: Reference years: a basic check on the consistency of the reference years.
Comments: the comments were used as a source of information for missing values and for potential threats to the validity due to deviant interpretations.
If an outlier was identified, the website of the institution was checked to see whether we could find information regarding the relevant data element.
The same procedure was followed when information was missing. If the website did not provide the information,
other publicly available data sources were identified and studied to find out whether the outlier was due to inadequate interpretation
and data provision regarding the question/data element or to a particular characteristic of the institution.
The departmental questionnaires. For the departmental questionnaires the following checks took place: feedback cycles during the data collection process.
After the first deadline we reviewed the data delivered thus far and inserted questions into the questionnaire
which was sent again to the institutions. Analyses of outliers: for each indicator outliers were identified and analysed in more detail.
The data provided were studied over time and specific changes in trends were analysed.
The student survey. For the student survey, after data checks we omitted the following cases from the gross student sample (see the filtering sketch below):
missing data on the students' institution; missing data on their field of study (business studies, mechanical engineering, electrical engineering); students enrolled in programs other than bachelor/short national first degree programs
and master/long national first degree programs; and students who had spent little time on the questionnaire and had not responded adequately.
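The sketch below illustrates these filters. The field names, program labels and the minimum completion time are hypothetical; the report does not specify the exact cut-offs used.

ALLOWED_PROGRAMS = {"bachelor", "master"}          # short/long national first degrees
ALLOWED_FIELDS = {"business studies", "mechanical engineering", "electrical engineering"}
MIN_SECONDS = 180                                   # 'too little time spent' threshold (assumed)

def keep_response(r: dict) -> bool:
    """Return True if a student response survives all four checks."""
    return (
        bool(r.get("institution"))                  # institution must be known
        and r.get("field") in ALLOWED_FIELDS        # field of study must be known
        and r.get("program_type") in ALLOWED_PROGRAMS
        and r.get("seconds_spent", 0) >= MIN_SECONDS
    )

responses = [
    {"institution": "Example University", "field": "business studies", "program_type": "bachelor", "seconds_spent": 600},
    {"institution": None, "field": "business studies", "program_type": "bachelor", "seconds_spent": 600},
    {"institution": "Example University", "field": "business studies", "program_type": "phd", "seconds_spent": 600},
]
cleaned = [r for r in responses if keep_response(r)]
print(len(cleaned), "of", len(responses), "responses kept")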
As a result of these checks, the data of about 800 student questionnaires were omitted from the sample.
5.3.2 International databases
The data collection regarding the bibliometric and patent indicators took place by studying the relevant international databases
and extracting from these databases the information to be applied to the institutions and fields in the sample.
5.3.2.1 Bibliometric data
As indicated in chapter 4,
we analysed the October 2010 edition of the Web of Science database (WoS) to compile the bibliometric data of the institutions involved in the sample.
A crucial aspect of this analysis was the identification of the sets of publications produced by one and the same institution,
which is then labelled by a single 'standardised' name tag. The institutions were delimited according to the set of WoS-indexed publications that contain an author affiliation address explicitly referring to that institution.
Statistics were produced only for institutions that are sufficiently represented in the WoS database, either in the entire WoS or in the preselected WoS fields of science.
Statistical information on 500 universities worldwide is freely available at the CWTS website: www.socialsciences.leiden.edu/cwts/products-services/scoreboard.html
4) Regional joint research publications: frequency count of publications with at least one author address referring to the selected main organization.
The bibliometric data in the pilot version of the U multirank database refer to one measurement per indicator.
In the case of indicators #1-#4 (see section 4.2.2), the most recently available publication year was selected for producing statistical data:
the statistics are in the form of frequency data or frequency categories (frequency ranges). Also, in the case of indicators #2, #3
and #4, the data were expressed as the share of co-publications within total publication output.
The citation impact data require a citation window stretching back into the recent past in order to collect a sufficiently large number of citations.
The publication count data are all based on a 'whole counting' method, where a publication is attributed in full to each main organization listed in the author addresses.
The annual statistics refer to publication years (rather than database years). The computation routine for the field-normalized citation rate indicator involved collecting citations to each publication according to a variable citation window,
where each publication is tracked within the constraints of the preset time period. For instance, within the time period 2005-2009, all publications from 2005 are tracked for five years, up to and including 2009.
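The sketch below illustrates a variable citation window and one simple way of field-normalizing citation counts. The citation figures, field baselines and normalization formula are hypothetical simplifications; the actual CWTS routine differs in detail.

WINDOW_END = 2009   # citations are counted from the publication year up to 2009

def citations_in_window(pub: dict) -> int:
    """Count citations received between the publication year and WINDOW_END."""
    return sum(n for year, n in pub["citations_per_year"].items()
               if pub["year"] <= year <= WINDOW_END)

def field_normalized_rate(pubs: list, field_baseline: dict) -> float:
    """Ratio of observed citations to the citations expected for publications
    of the same field and publication year (one simple normalization choice)."""
    observed = sum(citations_in_window(p) for p in pubs)
    expected = sum(field_baseline[(p["field"], p["year"])] for p in pubs)
    return observed / expected

pubs = [
    {"year": 2005, "field": "physics", "citations_per_year": {2005: 1, 2006: 3, 2007: 4, 2008: 2, 2009: 2}},
    {"year": 2008, "field": "physics", "citations_per_year": {2008: 0, 2009: 1}},
]
# Hypothetical world-average citation counts per (field, publication year)
baseline = {("physics", 2005): 10.0, ("physics", 2008): 2.0}
print(round(field_normalized_rate(pubs, baseline), 2))  # observed 13 / expected 12 -> 1.08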
These data refer to database years. The research publications in the three fields of our pilot study (business studies, mechanical engineering and electrical engineering) were in many cases relatively low in number.
Hence, in these cases the available bibliometric data were insufficient to create valid and reliable information for the bibliometric performance indicators,
especially when the data is drawn from the WoS database for just a single (recent) publication year.
In such cases the options were to remove the institution from all indicators that involve bibliometric data, to include bibliometric information only for the overall profile across all fields of science, or to apply minimum publication volumes,
with the annual field-specific thresholds set at 10 to 15 publications.
5.3.2.2 Patent data
As indicated in chapter 4 (section 4.2.3
), for our analysis of patents we collected data from the October 2009 version of the international PATSTAT database.
In this database the institutions participating in the sample were identified and studied in order to extract the institutional-level patent data.
The extraction covers patents from the three largest patent offices worldwide: the European Patent Office (EPO), the US Patent and Trademark Office (USPTO) and the World Intellectual Property Organization (WIPO).
The extraction of institutional-level patent data is based on identification of the institute in the applicant field of the PATSTAT database (see Appendix 7),
i.e. without an external 'bottom-up' verification of the extracted data by one or more representatives of each organization,
although the above-discussed harmonization steps imply high levels of accuracy and coverage (see also Magerman, 2009).
Using inventor information for extracting institution-level data is impossible, as patent documents contain no (systematic) information on the institutional affiliation of individual inventors.
On the contrary, the data provided and discussed in the study by Lissoni et al. (2008) show that the extent of academic scientists' contribution to national patenting (for instance in France) is considerably larger than what is captured at the institutional level.
As such, when interpreting institution-level patent data such as the ones provided in this study, one should at all times bear in mind the relatively sizable volume of university-invented patents that is not retrieved by the institution-level search strategy, and the institutional and national variation in the size of this bias.
Field-level patent indicators could not be produced due to a lack of concordance with the field classification that is present in the patent database.
We will first present the feasibility of the use of the various indicators presented in chapter 3. Next we will discuss the feasibility of the data collection procedures including the quality of the data sources.
the measurement of the indicator is the same regardless of who collects the data or when; comparability; and availability:
the required data are available or can be collected with an acceptable level of effort. Using these criteria the indicators were 'preselected' as the base for the pilot test.
A score of 'B' indicates that there may be some problems regarding relevance, concept/construct validity, face validity, robustness (consisting of reliability and comparability) or availability (of data),
but in most cases data on the indicators can be collected and interpreted. A score of 'C' indicates that there are serious problems in collecting data on the indicator.
The (post-pilot) feasibility score is based on three criteria: data availability: the relative actual existence of the data needed to build the indicator.
If information on an indicator or its underlying data elements is missing for a relatively large number of cases,
the data availability is assumed to be low. Conceptual clarity: the relative consistency across individual questionnaires regarding the understanding of the indicator.
If, in the information collected during the pilot study, there is a relatively large and/or diversified set of comments on the indicator in the various questionnaires,
the conceptual clarity is assumed to be low. Data consistency: the relative consistency of the actual answers in individual questionnaires with the data needs of the indicator.
If, in the information collected during the pilot study, there is a relatively large level of inconsistency in the information provided in the individual questionnaires,
the data consistency is assumed to be low. Indicators which were rated 'A' or 'B' during the (pre-pilot) preliminary rating but
which received a 'C' in terms of the (post-pilot) feasibility score were reconsidered with regard to their inclusion in the final list of indicators.
Some of these indicators were regarded as highly relevant despite the problematic score, and therefore efforts to enhance the data situation will be proposed; these indicators are kept 'in'.
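The sketch below illustrates how the three feasibility criteria and the keep-in/out decision could be combined. The thresholds and figures are hypothetical; in the pilot the assessment was made by the project team, not by a fixed formula.

def feasibility(missing_share: float, comment_share: float, inconsistency_share: float) -> dict:
    """Derive the three post-pilot feasibility criteria from the questionnaires
    (assumed rule: a criterion is 'low' if more than 25% of cases are problematic)."""
    return {
        "data availability": "low" if missing_share > 0.25 else "acceptable",
        "conceptual clarity": "low" if comment_share > 0.25 else "acceptable",
        "data consistency": "low" if inconsistency_share > 0.25 else "acceptable",
    }

def recommendation(preliminary_rating: str, scores: dict, highly_relevant: bool) -> str:
    """A/B-rated indicators with a problematic feasibility score are reconsidered;
    highly relevant ones are kept 'in' with efforts to improve the data situation."""
    problematic = "low" in scores.values()
    if preliminary_rating in ("A", "B") and problematic:
        return "in (improve data situation)" if highly_relevant else "out"
    return "in"

scores = feasibility(missing_share=0.4, comment_share=0.1, inconsistency_share=0.05)
print(scores, "->", recommendation("B", scores, highly_relevant=True))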
Feasibility score Data availability Conceptual clarity Data consistency Recommendation Graduation Rate A b Time to Degree B b Relative Rate of Graduate (Un) employment
Of those institutions that did provide data on the breakdown, a number indicated that the estimates were rather crude.
the indicators that have been built using the information from departmental questionnaires and the indicators related to student satisfaction data. Table 6-2:
rating Feasibility score Data availability Conceptual clarity Data consistency Recommendation Student/staff ratio A a Graduation rate A b Qualification of academic staff
In addition, both institutional and national data, to which some institutions could refer, use different time periods in measuring employment status (e.g. six, 12 or 18 months after graduation).
The comparability of data is seriously hampered by these different time periods. In accordance with the institutional ranking, the indicator was regarded
The indicator 'inclusion of work experience' is a composite indicator using a number of data elements (e.g. internships, teachers' professional experience outside HE) on employability issues;
if one of the data elements is missing, the score for the indicator cannot be calculated. Table 6-3:
rating Feasibility score Data availability Conceptual clarity Data consistency Recommendation Organization of programme A a Inclusion of work experience A a Evaluation of teaching A a
Social climate A a Quality of courses A a Support by teacher A a Computer facilities A a Overall judgment A a Libraries B A Laboratories B A There are no major
Feasibility score Data availability Conceptual clarity Data consistency Recommendation Percentage of expenditure on research A b Field-normalized citation rate*A a Post-docs per fte
and prizes won B c Out Highly cited research publications*B A Interdisciplinary research activities B A*Data source:
The comments on the 'post-doc positions' indicator mainly regarded the clarity of definition and the lack of proper data.
The large number of missing data and comments regarding the art-related output was no surprise.
Stakeholders, in particular representatives of art schools, stressed the relevance of this indicator despite the poor data situation.
Efforts should be made to enhance the data situation on cultural research outputs of higher education institutions. This cannot be done by producers of rankings alone;
initiatives should also come from providers of (bibliometric) databases as well as stakeholder associations in the sector.
Feasibility score Data availability Conceptual clarity Data consistency Recommendation External research income A a Total publication output*A a Student satisfaction:
*Data source: bibliometric analysis. Observations from the pilot test: On the field level, the proposed indicators do not encounter any major feasibility problems.
In general, the data delivered by faculties/departments revealed some problems in clarity of definition of staff data.
Here a clearer yet concise explanation (including an example) should be used in future data collection.
The data on post-doc positions proved to be more problematic in business studies than in engineering.
Robustness Availability Preliminary rating Feasibility score Data availability Conceptual clarity Data consistency Recommendation Percentage of income from third party funding A c In Incentives for knowledge
B b Technology transfer office staff per fte academic staff B b Co-patenting**B A*Data source:
making it difficult to compare the data. Table 6-7: Field-based ranking indicators:
Robustness Availability Preliminary rating Feasibility score Data availability Conceptual clarity Data consistency Recommendation University-industry joint research publications*A a Academic
Out Number of licensing agreements B C Out. *Data source: bibliometric analysis; ** patent analysis. Observations from the pilot test:
The only indicator with an 'A' rating, indicating a high degree of feasibility, comes from bibliometric analysis. Availability of data on 'joint research contracts with the private sector' is a major problem,
The indicators based on data from patent databases are feasible only for institutional ranking due to discrepancies in the definition and delineation of fields in the databases.
Only a small number of institutions could deliver data on licensing. There was an agreement among stakeholders
Robustness Availability Preliminary rating Feasibility score Data availability Conceptual clarity Data consistency Recommendation Percentage of programs in foreign language A a International joint research
-seeking students New indicator B Percentage students coming in on exchanges New indicator A Percentage students sent out on exchanges New indicator A*Data source:
Robustness Availability Preliminary rating Feasibility score Data availability Conceptual clarity Data consistency Recommendation Percentage of international students A a Incoming and outgoing students
*B A International research grants B b International doctorate graduation rate B A*Data source: Bibliometric analysis Observations from the pilot test:
Not all institutions have clear data on outgoing students. In some cases only those students participating in institutional or broader formal programs (e.g.
Availability of data was relatively low regarding the student satisfaction indicator, as only a few students had already participated in a stay abroad.
The indicator 'international orientation of programs' is a composite indicator referring to several data elements;
feasibility is limited by missing cases for some of the data elements. Some institutions could not identify external research funds from international funding organizations.
Robustness Availability Preliminary rating Feasibility score Data availability Conceptual clarity Data consistency Recommendation Percentage of income from regional sources A c In Percentage of graduates
working in the region B c In Research contracts with regional partners B b Regional joint research publications*B A Percentage of students in internships in local enterprises B c In*Data source:
The low level of data consistency showed that there is a wide variety of region definitions used by institutions.
Both in institutional and in field-based data collection, information on regional labor market entry of graduates could not be delivered by most institutions.
validity Robustness Availability Preliminary rating Feasibility score Data availability Conceptual clarity Data consistency Recommendation Graduates working in the region B c In Regional participation in continuing
Out Regional joint research publications* New indicator A. *Data source: bibliometric analysis. Observations from the pilot test:
Less than half of the pilot institutions could deliver data on regional participation in continuing education programs (and only one fifth in mechanical engineering programs).
There is probably no way to improve the data situation in the short term. While far from good, the data situation on student internships in local enterprises and degree theses in cooperation with local enterprises turned out to be less problematic in business studies than in the engineering fields.
Both internships and degree theses enable the expertise and knowledge of local higher education institutions to be utilized in a regional context, in particular in small-and medium-sized enterprises.
and in many non-metropolitan regions they play an important role in the recruitment of higher education graduates.
6.3 Feasibility of data collection
As explained in section 5.3, data collection during the pilot
study was carried out via self-reporting from the institutions and analysis of international bibliometric and patent databases.
6.3.1 Self-reported institutional data
For the collection of self-reported institutional data we made use of several questionnaires:
the U-Map questionnaire (to identify institutional profiles), the U multirank institutional questionnaire and the U multirank field-based questionnaire. We supported this data collection with extensive data cleaning processes,
in order to further assess the feasibility of the data collection. In general the organization and procedures of the self-reported institutional data collection were evaluated as largely positive or at least 'neutral' by the institutions.
Very few institutions were really dissatisfied with the processes. The collection of data by online questionnaires worked well
and the coordination of all data collection via a central contact person in participating institutions also proved successful. We made the following key observations regarding the process of collecting self-reported institutional data:
The parallel institutional data collection for U-Map and U multirank caused some confusion. Although a tool was implemented to pre-fill data from U-Map into U multirank,
some confusion remained concerning the link between the two instruments. In order to test some variants, institutional and field-based questionnaires were implemented with different features (e.g. the definition of international staff).
This procedure helped us to judge the relative feasibility of concepts and procedures. The glossary of indicators and data elements proved helpful in achieving a high degree of consistency in the data delivered by the institutions.
Yet the definitions and explanations of some elements (e.g. staff categories including fte, the delineation of regions) could be improved, bearing in mind that there is an apparent trade-off between adequate explanation and what respondents are able to take in
while supplying their data. The effort to include a feedback cycle both in institutional and field-based data collection (with questions
and comments on the data already submitted) was greatly appreciated by the institutions. Although it implied a major investment of time by the project team,
this procedure proved to be very efficient and helped significantly to increase the quality and consistency of the data.
In some countries the U multirank student survey conflicted with existing national surveys, which in some cases are highly relevant for institutions.
Our major conclusion regarding the feasibility of the self-reported institutional data is that data availability is an issue in a number of cases. Often the problem lies not with the data as such
but with the administrative processes related to data collection in some institutions. It may be assumed that when institutions increase their efforts regarding data collection
and data quality this problem will be mitigated.
6.3.2 Student survey data
One of the major challenges regarding the feasibility of our global student survey is
whether the subjective evaluation of their own institution by students can be compared globally or whether there are differences in the levels of expectations or respondent behavior.
The pilot experience suggests that this is the case, and thus that data collection through a global-level student survey is sufficiently feasible.
6.3.3 Bibliometric and patent data
The collection of bibliometric and patent data turned out to be largely unproblematic.
In bibliometric analysis, the sets of publications produced by a specific institution (or a subunit of it) have to be identified in international bibliographic databases.
In our approach the institution is detected automatically, by lexical queries on the author's affiliation field (the address field) of the publications in the databases, i.e. by a query on keywords.
As a consequence, the completeness of the selected bibliometric data cannot be fully guaranteed. To assess the feasibility of our bibliometric data collection we studied the potential effects of a bottom-up verification process via a special case study of six French universities.
The aim of the case study was to shed light on how a bottom-up verification approach might collect relevant data that would
otherwise be missed. The case study showed that in some cases a substantial number of publications might have been missed.
Nevertheless, the feasibility of the bibliometric data collection in the pilot study can be judged to be high.
Data were easily identified and analyzed, although a warning against placing too much dependence on the completeness of the data remains in place.
With respect to the collection of patent data (via PATSTAT) there are two important caveats. First, as mentioned before,
we were only able to identify our sample institutions in the database. Subunits for field analyses could not be found.
This implies that patents for which the intellectual property rights are assigned to companies, governmental funding agencies or individual scientists are not retrieved.
A second important caveat when extracting institutional-level patent data is that organizations register their patents under many different names and spelling variations.
PATSTAT data are no exception: applicant names are often misspelled, and their spelling varies from one patent to another.
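A minimal sketch of matching misspelled applicant names to a standardized institution name is shown below, using only the standard library. The names and the 0.85 cut-off are hypothetical; real harmonization of PATSTAT applicant names is considerably more elaborate.

import difflib
import re

STANDARD_NAMES = ["Example University", "Example Institute of Technology"]

def normalize(name: str) -> str:
    """Lower-case, expand common abbreviations and strip stray characters."""
    name = name.lower().replace("univ.", "university").replace("inst.", "institute")
    return re.sub(r"[^a-z ]", " ", name).strip()

def match_applicant(raw_name: str, cutoff: float = 0.85):
    """Return the best-matching standardized name, or None if nothing is close enough."""
    normalized_standards = [normalize(s) for s in STANDARD_NAMES]
    candidates = difflib.get_close_matches(normalize(raw_name), normalized_standards, n=1, cutoff=cutoff)
    if not candidates:
        return None
    return STANDARD_NAMES[normalized_standards.index(candidates[0])]

for applicant in ["EXAMPLE UNIVERSTY", "Exmple Inst. of Technology", "Acme Corp"]:
    print(applicant, "->", match_applicant(applicant))

Grouping spelling variants in this way is what allows patent counts to be attributed to a single 'main' organization despite inconsistent applicant records.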
Also with respect to the collection of patent data, the conclusion regarding its feasibility is positive. However, it should be noted that patent data analysis could only be undertaken at the institutional
and not the field level.
6.4 Feasibility of up-scaling
The pilot test included a limited number of institutions and only two fields.
Is it possible to extend U multirank to comprehensive global coverage, and how easy would it be to add additional fields?
The pilot study suggests that a global multidimensional ranking is unlikely to prove feasible in the sense of achieving extensive coverage levels across the globe in the short term.
The prospects for widespread European coverage are encouraging. A substantial number of institutions both from EU and non-EU European countries participated in the projects.
The fact that a number of institutions confirmed their participation but at the end of the day did not submit data suggests that data (non-)availability was a common theme.
Experience from the CHE ranking and other field-based rankings shows that there is a core set of indicators that is relevant and meaningful for (virtually) all fields.
Similarly, some disciplines may see dimensions such as knowledge transfer or regional engagement as less relevant to their core activities.
In particular, this applies to those areas and fields which so far have largely been neglected in international rankings due to the lack of adequate data.
Combining U-Map and U multirank: our user-driven interactive web tool will imply both steps too,
including provision for guiding users through the data and a visual framework to display the result data.
In U multirank the presentation of data allows for both a comparative overview of indicators across institutions
and a detailed view of institutional profiles. The ideas presented below are inspired mainly by the U-Map visualisations and the presentation of results in the CHE ranking.
U multirank produces indicators and results on different levels of aggregation, leading to a hierarchical data model:
data at the level of institutions (results of focused institutional rankings); data at the level of departments (results of field-based rankings); and data at the level of programs (results of field-based rankings).
The presentation format for ranking results should be consistent across the three levels while still accommodating the particular data structures on those levels.
The table displays the ranking groups (in different colours) representing the relative scores on the indicators.
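One simple, purely illustrative way to derive such colour-coded groups is sketched below; the tercile rule and example scores are assumptions and do not reproduce the actual U multirank grouping procedure.

def ranking_groups(scores: dict) -> dict:
    """Split institutions into 'top', 'middle' and 'bottom' groups by terciles of their indicator score."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    n = len(ranked)
    groups = {}
    for i, inst in enumerate(ranked):
        if i < n / 3:
            groups[inst] = "top (green)"
        elif i < 2 * n / 3:
            groups[inst] = "middle (yellow)"
        else:
            groups[inst] = "bottom (red)"
    return groups

scores = {"A": 1.4, "B": 0.9, "C": 1.1, "D": 0.6, "E": 1.0, "F": 0.8}
print(ranking_groups(scores))

Group-based presentation avoids suggesting spurious precision in league-table positions while still showing where an institution stands on each indicator.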
Rankings of higher education systems are either based on genuine data on higher education systems, e.g. the University Systems Ranking published by the Lisbon Council31,
or produced by simply aggregating institutional data to the system level (e.g. the QS National System Strength Ranking).
The definition of the indicators, the processes of data collection and the discussion on modes of presentation have been based on intensive stakeholder consultation.
In addition, access to and navigation through the web tool will be made highly user-driven by specific 'entrances' for different groups of users (e.g. students, researchers/academic staff, institutional administrators, employers) offering specific information.
In accordance with EU policies on e-accessibility32, barriers to access to the U multirank results and data will be removed as much as possible.
However, translation of the web tool and the underlying data is a substantial cost factor.
The web tool will also provide an explanation of the methodology (e.g. the grouping approach), a description of underlying data sources (e.g. self-reported institutional data, surveys, bibliometric data,
patent data) and a clear definition and explanation of indicators (including an explanation of their relevance).
How do users choose to navigate through the web tool? What indicators are selected most frequently in personalized rankings?
Tracking of user behaviour can be built systematically into the implementation of the web tool, and by doing so such questions can be answered.
The link between the two projects has been created by guaranteeing the use of U-Map data for the selection of comparable (and therefore 'rankable') institutions. 8.2 Scope:
We would argue that U multirank should aim to achieve a relatively wide coverage of European higher education institutions as quickly as possible during the next project phase, since in Europe the feasibility has been demonstrated.
When this strategy leads to a substantial database within the next two years, recruitment could be reinforced.
The frequency of data collection is always a compromise between obtaining the most up-to-date information
and the workload that data-gathering imposes on the institutions. For the institutional ranking, data collection would probably take place via a full update, for instance every two or three years.
We suggest a rolling system for the field-based ranking. There is no definitive answer to the question of how many fields there are in international higher education.
If the rankings were updated on a three-year rolling schedule (covering roughly five fields per year), this would allow coverage of 15 fields.
At that stage a better informed decision about the feasibility of extending the coverage of the rankings to further fields could be taken.
This implies that data updates would not lead to the publication of a static ranking but would only feed into the database, allowing the user to rank on the basis of the most current information.
33 The Frascati manual has a similar structure.
With U-Multirank it is also possible to create so-called 'authoritative' ranking lists from the database.
For instance, an international public organization might be interested in using the database to promote a ranking of the international, research-intensive universities in order to compare a sample of comparable universities worldwide.
This might be an important means of generating revenue from database-derived products. On the other hand, in the first phase of implementation, U-Multirank should be perceived by all potential users as relevant for their individual needs.
The emphasis in the first phase, however, is on establishing the flexible web tool.
8.4 The need for international data systems
U-Multirank is not an isolated system.
The quality of the results of any transparency tool depends to a large extent on the availability of relevant and appropriate data.
There is a strong need for a European data system, with institution and field data, preferably with clear relationships to other data systems in the world (such as IPEDS). Such a system should build on and continue the data collection in a follow-up of the recently finalized EUMIDA project.
The development of the European database resulting from EUMIDA should take into account the basic data needs of U-Multirank.
This would allow the pre-filling of institutional questionnaires with available data and would substantially reduce the workload for the institutions.
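What pre-filling could look like in practice is sketched below; the questionnaire items and the structure of the EUMIDA-like record are illustrative assumptions rather than the actual data definitions. Only the items missing from the existing database would be left for the institution to complete.

```python
# Minimal sketch of pre-filling an institutional questionnaire from an existing
# European database record; keys and values are illustrative assumptions.
QUESTIONNAIRE_ITEMS = [
    "students_total", "staff_fte", "degrees_awarded", "research_expenditure",
    "international_students",
]

def prefill(questionnaire_items, db_record):
    """Copy available values from the database; mark the rest for manual entry."""
    filled, missing = {}, []
    for item in questionnaire_items:
        if item in db_record and db_record[item] is not None:
            filled[item] = db_record[item]
        else:
            missing.append(item)
    return filled, missing

eumida_like_record = {"students_total": 24300, "staff_fte": 1875.5}  # placeholder data
filled, missing = prefill(QUESTIONNAIRE_ITEMS, eumida_like_record)
print("Pre-filled:", filled)
print("Still to be provided by the institution:", missing)
```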
Some specific recommendations regarding the further development of the EUMIDA database can be made. First, there are some elements that need further attention, such as staff data (the proper and unified definition of full-time equivalents and the specification of staff categories such as 'professor' are important issues for the comparability of data) and data related to students and graduates. EUMIDA could contribute to improving the data situation regarding employment-oriented outcome indicators.
An open question is how far EUMIDA is able to go into field-specific data; for the moment pre-filling from this source seems to be more realistic for the institution-level data than for field-based data.
A second aspect of integrated international data systems is the link between U-Multirank and national ranking systems. U-Multirank implies a need for an international database of ranking data, consisting of indicators, which could be used in a flexible online tool to create personalized rankings according to users' preferences. This database is a crucial starting point to identify
and rank comparable universities. Developing a European data system and connecting it to similar systems worldwide will strongly increase the potential for multidimensional global mapping and ranking.
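The following minimal sketch illustrates how personalized rankings could be generated from such a database; the indicator names, the U-Map-style profile filter and the simple three-group scheme are illustrative assumptions, not the U-Multirank methodology itself.

```python
import statistics

# Illustrative data: each institution has a U-Map-like profile and indicator scores.
INSTITUTIONS = [
    {"name": "A", "profile": "research-intensive", "citation_rate": 1.4, "graduation_rate": 0.81},
    {"name": "B", "profile": "research-intensive", "citation_rate": 1.1, "graduation_rate": 0.77},
    {"name": "C", "profile": "teaching-oriented",  "citation_rate": 0.6, "graduation_rate": 0.90},
    {"name": "D", "profile": "research-intensive", "citation_rate": 0.9, "graduation_rate": 0.70},
]

def personalized_ranking(institutions, profile, indicator):
    """Select comparable institutions by profile, then assign top/middle/bottom groups."""
    comparable = [i for i in institutions if i["profile"] == profile]
    scores = [i[indicator] for i in comparable]
    mean, sd = statistics.mean(scores), statistics.pstdev(scores)
    groups = {}
    for inst in comparable:
        score = inst[indicator]
        if score > mean + 0.5 * sd:
            groups[inst["name"]] = "top group"
        elif score < mean - 0.5 * sd:
            groups[inst["name"]] = "bottom group"
        else:
            groups[inst["name"]] = "middle group"
    return groups

print(personalized_ranking(INSTITUTIONS, "research-intensive", "citation_rate"))
```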
Despite this clear need for cross-national/European/global data there will be a continued demand for information about national/regional higher education systems, in particular with regard to undergraduate higher education.
Furthermore, a national ranking could also be used as a base for an international database and international rankings. More and more countries could come in step by step with a core set of indicators used in each ranking, at the same time providing a core set of joint indicators that can be used for European and global rankings, thus creating an increasing set of data systems to be combined into a joint database. How should the top-down and the bottom-up approach be combined?
In the 'bottom-up' approach national rankings could feed their data into the international database; the U-Multirank unit would be able to pre-fill the data collection instruments and would have to fill the gaps to attain European or worldwide coverage. At the same time, activities based on the top-down approach might help to make the system known and to develop trust and credibility. Top-down rankings would also become less expensive to implement if they could use existing national data and data collection infrastructures. Also, gaining sponsorship for the system could sometimes be easier starting from the national level.
Finalisation of the various U-Multirank instruments
1. Full development of the database and web tool. The database and web tool have to be populated with data and tested, and have to start running. The prototypes of the instruments will demonstrate the outcomes and benefits of U-Multirank.
2. Setting of standards and norms and further development of underdeveloped dimensions and indicators.
A core set of indicators should be defined, definitions of data concepts should be fixed, standardized elements of data collection tools should be developed.
In the feasibility study we found indicators and dimensions where the data collection was difficult,
but they have high relevance and we discovered sufficient potential to develop adequate concepts and data collection methods.
These parts of the ranking model should be developed further.
3. Update of data collection tools/questionnaires according to the revision and further development of indicators and the experiences from the U-Multirank project.
Depending on the further development of indicators and their operationalization, the data collection instruments have to be adapted.
A major issue is to design the questionnaires in a way that reduces administrative burden for the institutions as far as possible.
Development of pre-filling in EU+ countries
4. Further development of pre-filling. In the first round of U-Multirank pre-filling proved difficult; pre-filling from existing data sources into the international U-Multirank database should be realized.
Roll-out of U-Multirank across EU+ countries
5. Invitation of EU+ higher education institutions and data collection.
Within the next two years all identifiable European higher education institutions should be invited to participate in the institutional as well as in the three selected field-based rankings.
The objective would be to achieve full coverage of institutional profiles and have a sufficient number of comparable institutions.
This could be guaranteed by the smoothness of data collection and the services delivered to participants in the ranking process.
But user-friendliness also concerns the design of the web tool, taking into account the differing information needs of the various user groups. A user-friendly tool needs various levels of information provision, understandable language, clarity of symbols and explanations, assisted navigation through the web tool, and feedback loops providing information to users.
Elements of a new project phase (work package: products, deadline):
- Database and web tool: functioning database; functioning web tool prototype (06/2012)
- Standards and norms: description of standards and norms; final data model (06/2012)
- Finalized collection tools: collection tools (06/2012)
- Pre-filling: planning paper on pre-filling
- Data collection, data analysis and publication: 06/2012, 09/2012, 03/2013, 06/2013
- Specific focused rankings: two rankings conceptualized; one benchmarking exercise (12/2013, 12/2012)
The ranking must be run by a professional organization with expertise in large-scale data analysis and in transparency tools.
Therefore, efficiency in data collection is important. This criterion also refers to an efficient link between national rankings and the international ranking tool.
In addition, efficiency refers to the coordination of different European initiatives to create international databases (such as E3M and EUMIDA), particularly when the organizational structures do not lead to data monopolies. Service orientation: a key element of U-Multirank is the flexible, stakeholder-oriented, user-driven approach.
This approach also concerns potential partners, e.g. media companies (interested in publishing rankings), consulting companies in the higher education context and data providers (such as the producers of bibliometric databases). The workload for higher education institutions is also critical: if HEIs experience high workloads with data collection they expect free products in return and are not willing to pay for basic data analysis.
Doubts about commitment to the social values of the European Higher Education Area (e.g. no free access for student users?).
The cost factors are first of all related to the necessary activities involved in the production of ranking data:
- methodological development and updates
- communication activities
- implementation of (technical) infrastructure
- development of a database
- provision of tools for data collection
- data collection (again including communication)
- data analysis (including self-collected data as well as analysis based on existing data sets, e.g. bibliometric analysis)
- data publication (including development and maintenance of an interactive web tool)
- basic information services for users
- internal organization
The level of costs depends on several factors:
- the number of countries and institutions covered: this determines the volume of data that has to be processed and the communication effort;
- the number of countries/institutions which deliver data for free through a bottom-up system (this avoids costs);
- the number of fields involved: to limit costs a ranking cannot cover all fields of sufficient size;
- the surveys needed to cover all indicators outlined in the data models of U-Multirank, such as student and graduate surveys, and the use of databases charged with license fees, e.g. bibliometric and patent data;
- the frequency of the updating of ranking data: a multidimensional ranking with data from the institutions will not be updated every year; the best timespan has to take into account the trade-off between obtaining up-to-date information, the workload for the institutions and the costs of updating data for the operative unit.
For the different steps we could identify the relevant cost factors, some fixed, some variable:
- IT; indicators/databases used (e.g. license costs)
- development of a database: staff; basic IT costs
- provision of tools for data collection: staff; basic IT costs (incl. online survey systems and databases)
- data analysis: staff; number of countries and institutions covered; range of indicators and databases; license fees of databases (e.g. bibliometric)
- publication: staff; basic IT costs; features of the web tool to present results
- information services for users: staff; basic IT costs; number of countries and institutions covered; range of indicators and databases; scope of information services
- internal organization
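Purely as an illustration of how these fixed and variable factors combine (all figures below are placeholder assumptions, not estimates from this study), the total cost of operating the ranking can be thought of as a fixed component per step plus components that scale with the number of institutions, fields and licensed databases.

```python
# Illustrative cost model only: the fixed/variable split mirrors the text, but every
# number is a placeholder assumption, not a figure from the feasibility study.
def total_cost(fixed_per_step, per_institution, per_field, license_fees,
               n_institutions, n_fields):
    fixed = sum(fixed_per_step.values())                    # staff, basic IT, ...
    variable = (per_institution * n_institutions            # data collection/analysis scale
                + per_field * n_fields                      # field-based rankings
                + sum(license_fees.values()))               # e.g. bibliometric/patent databases
    return fixed + variable

fixed_per_step = {"database": 1.0, "collection_tools": 1.0, "publication": 1.0}  # placeholder units
license_fees = {"bibliometric": 0.5, "patent": 0.2}                              # placeholder units
print(total_cost(fixed_per_step, per_institution=0.01, per_field=0.3,
                 license_fees=license_fees, n_institutions=700, n_fields=15))
```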
(2) two junior staff members with experience in statistics, empirical research, large-scale data collection and IT; (3) secretarial support.
Information technology support: further development and implementation of the on-line tool and related software development. Marketing and communication:
the design and development of information packages on ranking and the dissemination of the outcomes as well as the staff time needed to do this.
The European Commission could see the delivery of a web tool free of charge to students as its responsibility. To ensure students' free access to U-Multirank data, the EC could, also in the long run, provide direct funding to cover the user charges that would otherwise apply.
such as special analyses of ranking data, to cross-subsidize the instruments.
g) Financial contributions from media partners publishing the results.
h) Non-financial contributions from third parties, such as free data provision.
i) Free provision of data from national mapping and ranking systems (bottom-up approach).
The funding scenarios differ in terms of who covers the basic fixed costs and who will pay the variable costs for ranking/data collection. The scenarios try to develop a medium-term perspective.
Funding scenario 1 (cost factor: cost sharing):
- Basic fixed costs: 100% principals
- Rankings: 50% principal; 50% media partners, data providers
These could be media partners or data providers who benefit from being positioned in the ranking field. The principal's funding share could also include a contribution from the EC.
To keep the web tool free of charges, especially for students, an equivalent to the charges could be paid by the EC.
% media partners, data providers, publishing companies; 30% selling of products, user charges. The different scenarios could be seen as extreme cases, each of them focusing strongly on one or some of the potential funding sources.
Free data provision, especially from nationally financed and run projects or from other existing data sources, will lower the cost of data collection.
The more that national statistics offices harmonize data collection, the lower the costs will be. Charges to the users of the U-Multirank web tool would seriously undermine the aim of creating more transparency in European higher education, by excluding students for example;
but there is a possibility of some cross-subsidization from selling more sophisticated products such as data support to institutional benchmarking processes, special information services for employers, etc.
The EC could pay for student user charges. Project-based funding for special projects, for instance new methodological developments or rankings of a particular type of institution, offers an interesting possibility with chances of cross-subsidization.
Market revenues could come from commercial elements of the web tool (advertising, apps). As soon as it is possible to publish authoritative rankings, publishers/media partners could contribute to the costs. A major additional source of income would be to charge institutional subscription fees
(which could also be paid by governments or foundations on the institutions' behalf). For U-Map this seems to be a viable solution, but the question remains: what kind of benefit do institutions have to receive in order for it to outweigh the costs of data gathering plus subscription fees?
EC, foundations, other sponsors) with a combination of a variety of market sources contributing to cost coverage, plus some cost reductions through efficiency gains.
8.9 A concluding perspective
U-Multirank should establish its European database and should enlarge this database internationally, targeting the institutions required to reach sufficient coverage for all relevant profiles.
The nature of the ranking has to remain global and should not merely serve European interests.
Data from U-Multirank, U-Map, national field-based rankings and national statistics should be integrated and coordinated. Despite the focus on the flexible web tool, concepts for authoritative rankings, either for the public or for associations of higher education institutions, should be developed because of their market potential.