Hsinchun Chen

Professor, Management Information Systems
Regents Professor
Member of the Graduate Faculty
Professor, BIO5 Institute
Contact
(520) 621-4153

Research Interest

Dr. Chen's areas of expertise include:
Security informatics, security big data; smart and connected health, health analytics; data, text, web mining.
Digital library, intelligent information retrieval, automatic categorization and classification, machine learning for IR, large-scale information analysis and visualization.
Internet resource discovery, digital libraries, IR for large-scale scientific and business databases, customized IR, multilingual IR.
Knowledge-based systems design, knowledge discovery in databases, hypertext systems, machine learning, neural networks computing, genetic algorithms, simulated annealing.
Cognitive modeling, human-computer interactions, IR behaviors, human problem-solving process.

Publications

Reid, E. F., & Chen, H. (2007). Mapping the contemporary terrorism research domain. International Journal of Human-Computer Studies, 65(1), 42-56.

Abstract:

A systematic view of terrorism research to reveal the intellectual structure of the field and empirically discern the distinct set of core researchers, institutional affiliations, publications, and conceptual areas can help us gain a deeper understanding of approaches to terrorism. This paper responds to this need by using an integrated knowledge-mapping framework that we developed to identify the core researchers and knowledge creation approaches in terrorism. The framework uses three types of analysis: (a) basic analysis of scientific output using citation, bibliometric, and social network analyses, (b) content map analysis of large corpora of literature, and (c) co-citation analysis to analyse linkages among pairs of researchers. We applied domain visualization techniques such as content map analysis, block-modeling, and co-citation analysis to the literature and author citation data from the years 1965 to 2003. The data were gathered from ten databases such as the ISI Web of Science. The results reveal: (1) the names of the top 42 core terrorism researchers (e.g., Brian Jenkins, Bruce Hoffman, and Paul Wilkinson) as well as their institutional affiliations; (2) their influential publications; (3) clusters of terrorism researchers who work in similar areas; and (4) that the research focus has shifted from terrorism as a low-intensity conflict to a strategic threat to world powers with increased focus on Osama Bin Laden. © 2006 Elsevier Ltd. All rights reserved.

Lin, Y., Chen, H., Brown, R. A., Li, S., & Yang, H. (2014). Time-to-Event Predictive Modeling for Chronic Conditions Using Electronic Health Records. IEEE Intelligent Systems, 29(3), 14-20.
Zhou, Y., Qin, J., Reid, E., Lai, G., & Chen, H. (2005). Studying the presence of terrorism on the web: A knowledge portal approach. Proceedings of the ACM/IEEE Joint Conference on Digital Libraries, 402-.
Suakkaphong, N., Zhang, Z., & Chen, H. (2011). Disease named entity recognition using semisupervised learning and conditional random fields. Journal of the American Society for Information Science and Technology, 62(4), 727-737.

Abstract:

Information extraction is an important text-mining task that aims at extracting prespecified types of information from large text collections and making them available in structured representations such as databases. In the biomedical domain, information extraction can be applied to help biologists make the most use of their digital-literature archives. Currently, there are large amounts of biomedical literature that contain rich information about biomedical substances. Extracting such knowledge requires a good named entity recognition technique. In this article, we combine conditional random fields (CRFs), a state-of-the-art sequence-labeling algorithm, with two semisupervised learning techniques, bootstrapping and feature sampling, to recognize disease names from biomedical literature. Two data-processing strategies for each technique also were analyzed: one sequentially processing unlabeled data partitions and another one processing unlabeled data partitions in a round-robin fashion. The experimental results showed the advantage of semisupervised learning techniques given limited labeled training data. Specifically, CRFs with bootstrapping implemented in sequential fashion outperformed strictly supervised CRFs for disease name recognition. The project was supported by NIH/NLM Grant R33 LM07299-01, 2002-2005. © 2011 ASIS&T.

McQuaid, M. J., Ong, T., Chen, H., & Nunamaker Jr., J. F. (1999). Multidimensional scaling for group memory visualization. Decision Support Systems, 27(1), 163-176.

Abstract:

We describe an attempt to overcome information overload through information visualization - in a particular domain, group memory. A brief review of information visualization is followed by a brief description of our methodology. We discuss our system, which uses multidimensional scaling (MDS) to visualize relationships between documents, and which we tested on 60 subjects, mostly students. We found three important (and statistically significant) differences between task performance on an MDS-generated display and on a randomly generated display. With some qualifications, we conclude that MDS speeds up and improves the quality of manual classification of documents and that the MDS display agrees with subject perceptions of which documents are similar and should be displayed together.