HE Yutong, TIAN Di, GAO Ranran, FAN Runlong, YAO Li, CHEN Pengfei
1 College of Instrumentation & Electrical Engineering, Jilin University, Jilin 130021, China
2 Institute of Geology, Chinese Academy of Geological Sciences, Beijing 100037, China
3 College of Earth Sciences, Jilin University, Jilin 130061, China
4 The 41st Research Institute of CETC, Qingdao 266555, China
Abstract: This paper proposes a useful web-based system for the management and sharing of electron probe micro-analysis (EPMA) data in geology. A new web-based architecture that integrates the management and sharing functions is developed and implemented. Earth scientists can utilize this system to not only manage their data, but also easily communicate and share it with other researchers. Data query methods provide the core functionality of the proposed management and sharing modules. The modules in this system have been developed using cloud GIS technologies, which help achieve real-time spatial area retrieval on a map. The system has been tested by approximately 263 users at Jilin University and Beijing SHRIMP Center. A survey was conducted among these users to estimate the usability of the primary functions of the system, and the assessment result is summarized and presented.
Key words: electron probe micro-analysis (EPMA); data management; data sharing; web-based architecture; data query; GIS
Electron probe micro-analysis (EPMA) is a well-known and practical method for nondestructive microchemical analysis of solid materials. The spatial scale of analysis and the ability to create detailed images of a sample make it possible to analyze geological materials in situ and to resolve complex chemical variation within single phases. Commonly used for microscopic analysis, it has long been considered one of the most important tools in geology[1-6]. As a result, a large amount of EPMA geoanalytical data has been created. Thus far, these data have typically been managed by individual researchers, who store them on local computers using the on-board file system; the data files are kept in folders named after relatively arbitrary attributes such as project names, sampling data, and sample characteristics. As the amount of data grows over time, searching for the required data becomes increasingly time-consuming. Efficient data management is therefore a basic requirement for earth scientists during their research. Furthermore, data comparison and referencing are common in geological research; researchers search previously published literature for similar data to compare with their own results or to guide their research. Data sharing and communication is therefore another crucial requirement. A data management and sharing system could provide a platform for earth scientists to manage and share their data effectively and improve work efficiency. In addition, such a system can protect data integrity and allow similar data to be stored in a unified format to provide supplementary data for other professional geological databases[7].
Considerable improvements in computer science and informatics have resulted in the development of diverse data management and sharing systems; these systems have been adopted in various fields, including geology, for the effective and efficient management and exchange of data[8-10]. However, a system that is specific to EPMA or any other instrumental analytical method has not been reported. Most existing systems are organized around a professional geological area, such as a rock database, geochemical database, or geochronological database[11]. These systems contain only certain kinds of data created through EPMA; thus, they are not suitable for daily management and sharing of EPMA geoanalytical data. In addition, most systems provide only one of the two functionalities, i.e., either data management or data sharing. For example, DataView, DeepDive, and A Scientific Data Management System (ISDMS)[12-16] are data management systems, whereas Geochemistry of Rocks of the Oceans and Continents (GEOROC), The North American Volcanic and Intrusive Rock Database (NAVDAT), and Integrated Data Management for Sediment Geochemistry (SedDB)[17-22] are data sharing systems. Because data formats and functions may differ between databases, data transfer and migration between them is considerably complicated. In addition, most management systems are developed for local operation and serve users internally within a unit, such as users in a local network; some are standalone applications installed on a single computer. Thus, these systems impose strict limitations on operation time, location, and hardware.
To address the abovementioned limitations, we propose a system that is specialized for the management and sharing of EPMA geoanalytical data. The proposed system provides data management functions via the internet, making it accessible to users from different institutions or laboratories for easy and effective management of their EPMA geoanalytical data. In addition, the system integrates the management and sharing functions, making it possible for users to share and manage data simultaneously. To realize this system, a web-based architecture including a management module and a sharing module is designed and implemented. Data querying is also included because it is important for both management and sharing; the data querying feature is developed using cloud GIS technologies. The proposed system has been tested at Jilin University and the Beijing SHRIMP Center. The performance of the system is assessed by registered users, and the results are summarized and presented in this paper.
The architecture of our proposed EPMA web-based management and sharing system (EPMA-WMSS) is shown in Fig. 1. It consists of three parts, i.e., a database structure, a data management module, and a data sharing module. The database structure is used to maintain and store data. The management module helps geological researchers manage their data on a daily basis. Effective data management requires three capabilities: importing data into the system, searching the data for the required information, and downloading the acquired data for further processing. Hence, the management module includes three functions, i.e., data upload, data query, and data download. The objective of the sharing module is to facilitate data transfer among geological researchers. In conventional methods, users who want to share their data must fill in a template and send it to database administrators or managers, who then check whether the data are available and accessible and import the data into a local database. As the sharing module in EPMA-WMSS is integrated with the management module, all data already exist in the database. Thus, users only need to decide which data should be shared with other users. Another important feature required in such systems is an interface to query data from other users. Therefore, three functions are designed and developed for the sharing module, i.e., data open, data query, and data download. Using the data open function, data owners can change the access rights to their private data and make it public, or open source, which makes it searchable by other researchers. Through the querying interfaces in the sharing module, all users who have registered in EPMA-WMSS can search and download open source data.
Fig. 1 System architecture of EPMA-WMSS
The system is developed based on the abovementioned system architecture. The Java programming language is used to create a cross-platform web application, and the Tomcat web server is selected to provide rapid and safe communication with clients. To preserve the connection between sample information and analytical information, a relational database structure is selected for the development of EPMA-WMSS. Table 1 lists the characteristics of commonly used implementations of relational database systems[23]. Based on the comparison of different relational database management system (RDBMS) implementations, MySQL is selected to implement the relational database structure for EPMA-WMSS. In addition, commonly used frontend technologies, namely, HTML5, CSS3, and JavaScript, are used to create user-friendly, dynamic, interactive pages and user interfaces. To make the operation of the system more convenient and effective, several plug-ins are also applied. Figure 2 shows the homepage of EPMA-WMSS.
Table 1 Characteristics of common RDBMS implementations
The analytical data from EPMA devices are typically exported in the form of an Excel file. This analytical file contains the information that geological researchers need to manage. In addition, the sample information, including the sampling location, sample descriptions, and sample lithofacies, among others, is important for understanding the obtained analytical data and for the effective study of geological phenomena. For example, the sample location information indicates the characteristics of the geological environment and geological features, and the lithofacies data guide the direction of interpretation. Furthermore, the project and user information are key components for the implementation of EPMA-WMSS: user information is essential to indicate the owner of the data records, and project information helps geological researchers query their own data. Hence, a database structure is designed based on EPMA data characteristics and the requirements of geological researchers. Figure 3 shows the relational data model for the database in our proposed system. The structure includes four tables, namely, the sample table, analytical table, user table, and project table. In addition, to reduce storage space, a file system is adopted to store file contents, and the path of each file is stored in the analytical information table. The different tables are connected through foreign keys (highlighted in grey). The primary and foreign keys of the tables are indicated with bold, black and red, italic fonts, respectively. Further details about the concept of foreign keys can be obtained from other publications[29-30]. Data dictionaries are used to explain the database structure[31]. Table 2 lists the data dictionary of the main table, i.e., the sample table, as an example.
Fig. 2 Homepage of EPMA-WMSS
Fig. 3 Database structure of EPMA-WMSS (the attribute Id in each table is the primary key and its value is assigned in sequence using the auto increment setting; the attributes in italic font indicate foreign keys)
Data dictionary
Table name: Sample description
Description: Description of sample
Primary key: Id
Index: Projectid, Open_flag, Userid

Field | Datatype | Length/set | Default | Allow null | Comment
Name | VARCHAR | 100 | null | | Rock name to be measured by EPMA
Structure | VARCHAR | 50 | null | | The rock structure name
Texture | VARCHAR | 50 | null | | The rock texture name
Alteration | VARCHAR | 20 | null | | The rock alteration name in the field
Main mineral | VARCHAR | 100 | null | | The host mineral name of the rock
Accessory mineral | VARCHAR | 50 | null | | The other mineral names of a rock
Degree | VARCHAR | 20 | null | | The alteration degree of the rock
Project_id | INT | 10 | null | | Identification of the project to which this record belongs
Open_flag | INT | 10 | 0 | ? | Marking the records public or private
Userid | INT | 10 | 0 | ? | Marking the data owner
Remark | VARCHAR | 100 | null | | The expanding fields
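The four-table structure described above can be illustrated with a short sketch. This is not the authors' schema: it uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs without a server. The sample columns follow the data dictionary, whereas the columns of the user, project, and analytical tables (username, name, sample_id, file_path), the index layout, and the function name build_database are assumptions made only for illustration.

```python
import sqlite3

# sqlite3 stands in for the MySQL database named in the paper (assumption made
# so the sketch is runnable without a database server).
SCHEMA = """
CREATE TABLE user (
    id       INTEGER PRIMARY KEY AUTOINCREMENT,
    username TEXT NOT NULL UNIQUE                   -- hypothetical column
);
CREATE TABLE project (
    id     INTEGER PRIMARY KEY AUTOINCREMENT,
    name   TEXT NOT NULL,                           -- hypothetical column
    userid INTEGER NOT NULL REFERENCES user(id)     -- hypothetical column
);
CREATE TABLE sample (                               -- fields from the data dictionary
    id                INTEGER PRIMARY KEY AUTOINCREMENT,
    name              VARCHAR(100),
    structure         VARCHAR(50),
    texture           VARCHAR(50),
    alteration        VARCHAR(20),
    main_mineral      VARCHAR(100),
    accessory_mineral VARCHAR(50),
    degree            VARCHAR(20),
    project_id        INTEGER REFERENCES project(id),
    open_flag         INTEGER DEFAULT 0,
    userid            INTEGER DEFAULT 0 REFERENCES user(id),
    remark            VARCHAR(100)
);
-- The data dictionary lists three indexed fields; a single composite index
-- is an assumption, since the index layout is not stated.
CREATE INDEX idx_sample_lookup ON sample (project_id, open_flag, userid);
CREATE TABLE analytical (
    id        INTEGER PRIMARY KEY AUTOINCREMENT,
    sample_id INTEGER NOT NULL REFERENCES sample(id),  -- hypothetical column
    file_path TEXT NOT NULL   -- file contents stay on disk; only the path is stored
);
"""


def build_database(path: str = ":memory:") -> sqlite3.Connection:
    """Create the four tables and return an open connection."""
    conn = sqlite3.connect(path)
    # SQLite enforces foreign keys only when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn
```

With foreign keys enforced, an analytical record cannot point at a sample that does not exist, which is the connection between sample information and analytical information that the relational structure is meant to guarantee.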
As time passes, the analytical data acquired by geological researchers increase continuously; among these data, EPMA analytical data are frequently used in geological studies. With such a large pool of data, storing and querying involve high storage overhead and long query times. The proposed data management module can be used by geological researchers to store and query their own data easily. Each registered user has their own storage space to manage their data; the data stored in the module are private and can only be accessed by the data owner. We have developed three functions to achieve the abovementioned functionalities, namely user management, data upload, and data query. For the user management function, each user who wants to manage data using EPMA-WMSS must register on the system. Registered users are provided with their own storage space and can access it using a username and password. The password is hashed using the MD5 algorithm[32]. The upload operation is performed in two steps: first, a sample information form is filled in; second, an analytical result file is selected. The sample information form consists of five items and is simple to complete. In addition, a plug-in named Input-File is used for uploading files to the database. With this plug-in, users can select and upload multiple files simultaneously on the file upload page, which saves considerable time. Figure 4 shows the file upload page.
As previously mentioned, the data query method is the primary function of the management and sharing modules. Cloud GIS technologies[22] were adopted, and intuitive real-time spatial retrieval methods, namely GIS query and region GIS query, were created[33-34]. The GIS query page is created automatically by the system. Once the user opens the query page, an online satellite map is created and presented automatically. For the region query page, the map is presented with a drawing tool.
The GIS application programming interface (API) is a new feature of Baidu Map (http://lbsyun.baidu.com/) and can be used easily from the JavaScript language; the API is open and free. Figures 5-6 show the two methods of data query. In Fig. 5, the locations of the samples are marked with five-pointed stars on a map. This map gives the user a clear view of where and how many samples have been studied in the past. Each user sees only their own samples in their own space. When a five-pointed star is clicked, detailed information including position and sample description is shown in a pop-up box, and the analytical data file is listed at the bottom of the map with a download icon. In contrast to GIS query, the region GIS query method enables users to retrieve multiple samples simultaneously according to a dynamic polygon area on the map. A drawing tool is provided in the top-right corner of the map. By clicking the polygon icon, a user can access the drawing tool and draw points on the map. A line is generated to connect each pair of adjacent points, and finally a closed polygon is created that covers an area on the map. The accuracy of the area can be increased by drawing more points. The samples within this polygon area are presented on the map, and the associated analytical files of these samples are listed in a table at the bottom of the map.
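The region GIS query amounts to testing which sample locations fall inside the user-drawn polygon. The paper does not state how this containment test is performed (it may be handled by the map API or on the server), so the following is only a minimal sketch of one common technique, the ray-casting test, under the assumption that longitude and latitude can be treated as planar coordinates over a small region. The function names and the sample dictionary keys (name, lon, lat) are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float]  # (longitude, latitude), treated as planar x/y


def point_in_polygon(point: Point, polygon: Sequence[Point]) -> bool:
    """Ray casting: count how many polygon edges a rightward ray crosses."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]  # wrap around to close the polygon
        # Only edges that straddle the horizontal line through the point count.
        if (y1 > y) != (y2 > y):
            # x-coordinate at which this edge crosses that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside  # each crossing flips inside/outside
    return inside


def region_query(samples: List[Dict], polygon: Sequence[Point]) -> List[Dict]:
    """Return the samples whose location lies inside the drawn polygon."""
    return [s for s in samples if point_in_polygon((s["lon"], s["lat"]), polygon)]
```

For example, with a square drawn from (0, 0) to (10, 10), a sample at (2, 3) is returned and a sample at (12, 3) is not. Adding more vertices to the polygon refines the boundary without changing the test itself.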
Fig. 4 Sample information upload page
Fig. 5 GIS query page
Fig. 6 Region GIS query page
The data sharing module provides an open platform for geological researchers to exchange data with each other. In the sharing module, users can change the access rights of their own private data to public, so that the data can be searched by other researchers. They can also query data from other researchers. To realize these functionalities, three functions are developed: data open, data query, and data download. The data open function is an interface through which earth scientists can set their private data as a public data source. When users enter the data open page, all their own sample items stored in the system are shown in a tabular format. The users select one or multiple sample rows and click the open button; these data can then be accessed by other earth scientists through the sharing interface. The table is designed and implemented using a full-featured plug-in named DataTables. With this plug-in, the rows in the table can be sorted easily by clicking a column header. The sharing module contains interfaces for users to query the open data. These interfaces use the same query methods as the management module; the difference is that users can access only their own data in the management module but all open data in the sharing module. The data download function provides three formats, namely PDF, Excel, and comma-separated values (CSV), for exporting data subsets, which researchers can use for comparison.
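The difference between the two query scopes, the data open step, and the CSV export can be sketched as follows. This is an illustration under stated assumptions rather than the system's actual code: sqlite3 again stands in for MySQL, the sample table is reduced to four columns, and the convention that open_flag = 1 means public and 0 means private is assumed (the data dictionary gives only the default value 0). All function names are hypothetical.

```python
import csv
import io
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Build a tiny in-memory sample table with two owners (demo data only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sample (id INTEGER PRIMARY KEY, name TEXT, "
        "userid INTEGER, open_flag INTEGER DEFAULT 0)"
    )
    conn.executemany(
        "INSERT INTO sample (id, name, userid) VALUES (?, ?, ?)",
        [(1, "granite", 1), (2, "basalt", 1), (3, "gneiss", 2)],
    )
    return conn


def open_samples(conn, userid, sample_ids):
    """Data open: mark the selected rows public, but only if the caller owns them."""
    conn.executemany(
        "UPDATE sample SET open_flag = 1 WHERE id = ? AND userid = ?",
        [(sid, userid) for sid in sample_ids],
    )


def query_management(conn, userid):
    """Management module scope: a user sees only their own records."""
    return conn.execute(
        "SELECT id, name, userid FROM sample WHERE userid = ? ORDER BY id", (userid,)
    ).fetchall()


def query_sharing(conn):
    """Sharing module scope: every registered user sees all open records."""
    return conn.execute(
        "SELECT id, name, userid FROM sample WHERE open_flag = 1 ORDER BY id"
    ).fetchall()


def export_csv(rows, header=("id", "name", "userid")) -> str:
    """Serialize a query result as CSV text for download."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buf.getvalue()
```

In this sketch, a call such as open_samples(conn, 1, [1, 3]) opens only sample 1, because sample 3 belongs to another user; the ownership check in the UPDATE statement is what keeps one user from publishing another user's private data.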
EPMA-WMSS has been tested at the College of Earth Sciences, Jilin University, and the Beijing SHRIMP Center, Institute of Geology, Chinese Academy of Geological Sciences, since 2016 as a tool for managing EPMA geoanalytical data. Approximately 263 researchers have used EPMA-WMSS in their daily research. Almost 6 453 sample items and 19 305 associated items of analytical data have been stored in EPMA-WMSS. Of these, the open sample data comprise 3 573 sample items and 7 346 items of analytical data. The earth scientists who use the system were asked to answer questionnaires for the usability evaluation of the system. The results show that the majority of users find EPMA-WMSS to be a practical and effective tool for managing and sharing data. Figure 7 presents a summary of the usability evaluation of EPMA-WMSS based on the survey. Different colors represent the different levels of usability assigned to the features of EPMA-WMSS. The result indicates that the integration of the management and sharing functions and the query methods were significant in helping earth scientists improve work efficiency in daily research.
Fig. 7 Usability evaluation results of EPMA-WMSS
In this study, a new system is designed for the management and sharing of EPMA geoanalytical data. A web-based architecture of EPMA-WMSS that integrates management and sharing modules is proposed. This architecture saves earth scientists the time otherwise spent exchanging data between different systems and considerably improves the efficiency of daily work. The architecture can be used as a prototype for the construction of management and sharing systems in several other areas in the future. In addition, query methods are designed based on cloud GIS technologies; these methods complement conventional query methods and are more suitable and customizable for earth scientists to query data. The methods can be used in several other geological databases or information systems.
At present, EPMA-WMSS is implemented and tested in an internal network, and only designated users can use it via the internet. In the future, the system will be deployed on a server from which geological researchers all over the world can manage their data. In addition, as EPMA is just one of the instrumental analytical methods used in geology, future work should be dedicated to applying the system to other important analytical methods such as ICPMS, XRF, and SHRIMP. We hope that this system can be implemented with a data center that helps earth scientists in the management and exchange of various kinds of analytical data.
Journal of Donghua University (English Edition), 2018, Issue 4