Hsinchun Chen

Professor, Management Information Systems
Regents Professor
Member of the Graduate Faculty
Professor, BIO5 Institute
Contact
(520) 621-4153

Research Interest

Dr Chen's areas of expertise include:
Security informatics, security big data; smart and connected health, health analytics; data, text, web mining.
Digital library, intelligent information retrieval, automatic categorization and classification, machine learning for IR, large-scale information analysis and visualization.
Internet resource discovery, digital libraries, IR for large-scale scientific and business databases, customized IR, multilingual IR.
Knowledge-based systems design, knowledge discovery in databases, hypertext systems, machine learning, neural networks computing, genetic algorithms, simulated annealing.
Cognitive modeling, human-computer interactions, IR behaviors, human problem-solving process.

Publications

Zhou, Y., Qin, J., Lai, G., Reid, E., & Chen, H. (2006). Exploring the dark side of the Web: Collection and analysis of U.S. extremist online forums. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 3975 LNCS, 621-626.

Abstract:

Contents in extremist online forums are invaluable data sources for extremism research. In this study, we propose a systematic Web mining approach to collecting and monitoring extremist forums. Our proposed approach identifies extremist forums from various resources and addresses practical issues faced by researchers and experts in the extremist forum collection process. Such collection provides a foundation for quantitative forum analysis. Using the proposed approach, we created a collection of 110 U.S. domestic extremist forums containing more than 640,000 documents. The collection building results demonstrate the effectiveness and feasibility of our approach. Furthermore, the extremist forum collection we created could serve as an invaluable data source to enable a better understanding of the extremism movements. © Springer-Verlag Berlin Heidelberg 2006.

Kaza, S., & Chen, H. (2009). Effect of inventor status on intra-organizational innovation evolution. Proceedings of the 42nd Annual Hawaii International Conference on System Sciences, HICSS.

Abstract:

Innovation is one of the primary characteristics that separate successful from unsuccessful organizations. Organizations have a choice in selecting knowledge that is recombined to produce new innovations. The selection of knowledge is influenced by the status of inventors in an organization's internal knowledge network. In this study, we model knowledge flow within an organization and contend that it exhibits unique characteristics not incorporated in most social network measures. Using the model, we also propose a new measure based on random walks and team identification and use it to examine innovation selection in a large organization. Using empirical methods, we find that inventor status determined by the new measure had a significant positive relationship with the likelihood that the inventor's knowledge would be selected for recombination. We believe that the new measure, in addition to modeling knowledge flow in a scientific collaboration network, helps better understand how innovation evolves within organizations. © 2009 IEEE.

McDonald, D. M., Chen, H., & Schumaker, R. P. (2005). Transforming open-source documents to terror networks: The Arizona TerrorNet. AAAI Spring Symposium - Technical Report, SS-05-01, 62-69.

Abstract:

Homeland security researchers and analysts more than ever must process large volumes of textual information. Information extraction techniques have been proposed to help alleviate the burden of information overload. Information extraction techniques, however, require retraining and/or knowledge re-engineering when document types vary as in the homeland security domain. Also, while effectively reducing the volume of the information, information extraction techniques do not point researchers to unanticipated interesting relationships identified within the text. We present the Arizona TerrorNet, a system that utilizes less specified information extraction rules to extract less choreographed relationships between known terrorists. Extracted relations are combined in a network and visualized using a network visualizer. We processed 200 unseen documents using the TerrorNet which extracted over 500 relationships between known terrorists. An Al Qaeda network expert made a preliminary inspection of the network and confirmed many of the network links.

Fu, T., Abbasi, A., & Chen, H. (2007). Interaction coherence analysis for dark web forums. ISI 2007: 2007 IEEE Intelligence and Security Informatics, 343-350.

Abstract:

Interaction coherence analysis (ICA) attempts to accurately identify and construct interaction networks by using various features and techniques. It is useful for identifying user roles, users' social and information value, and the social network structure of Dark Web communities. In this study, we applied interaction coherence analysis to Dark Web forums using the Hybrid Interaction Coherence (HIC) algorithm. Our algorithm utilizes both system features, such as header information and quotations, and linguistic features, such as direct address and lexical relation. Furthermore, several similarity-based methods, for example Vector Space Model, Dice equation, and sliding window, are used to address various types of noise. Two experiments were conducted to compare our HIC algorithm with a traditional linkage-based method, a similarity-based method, and a simplified HIC method that does not address noise issues. The results demonstrate the effectiveness of our HIC algorithm for identifying interactions in Dark Web forums. © 2007 IEEE.

Chen, H. (1994). Collaborative systems: solving the vocabulary problem. Computer, 27(5), 58-66.

Abstract:

Vocabulary differences have created difficulties for on-line information retrieval systems and are even more of a problem in computer-supported cooperative work (CSCW), where collaborators with different backgrounds engage in the exchange of ideas and information. Our research group at the University of Arizona has investigated two questions related to the vocabulary problem in CSCW. First, what are the nature and characteristics of the vocabulary problem in collaboration, and are they different from those observed in information retrieval or in human-computer interactions research? Second, how can computer technologies and information systems be designed to help alleviate the vocabulary problem and foster seamless collaboration? We examine the vocabulary problem in CSCW and suggest a robust algorithmic solution to the problem.