Hsinchun Chen

Professor, Management Information Systems
Regents Professor
Member of the Graduate Faculty
Professor, BIO5 Institute
Primary Department
Contact
(520) 621-4153

Research Interest

Dr. Chen's areas of expertise include: Security informatics, security big data; smart and connected health, health analytics; data, text, web mining. Digital library, intelligent information retrieval, automatic categorization and classification, machine learning for IR, large-scale information analysis and visualization. Internet resource discovery, digital libraries, IR for large-scale scientific and business databases, customized IR, multilingual IR. Knowledge-based systems design, knowledge discovery in databases, hypertext systems, machine learning, neural networks computing, genetic algorithms, simulated annealing. Cognitive modeling, human-computer interactions, IR behaviors, human problem-solving process.

Publications

Zhang, P., Sun, J., & Chen, H. (2005). Frame-based argumentation for group decision task generation and identification. Decision Support Systems, 39(4), 643-659.

Abstract:

One of the most important stages of group decision-making is the generation and identification of decision tasks. In this paper, we define a decision task with five elements: decision makers, decision executors, decision objectives, decision problems and decision constraints. Based on this distinction, we present a conceptual model for generation and identification of group decision tasks in an organization. In addition, we describe a prototype of a group argumentation support system (GASS) that applies frame-based information structure in electronic brainstorming (EBS) and argumentation to support group decision task generation and identification. Using four group performance indicators, the prototype was evaluated in a lab experiment to determine its effectiveness and efficiency. © 2004 Elsevier B.V. All rights reserved.

Zheng, R., Li, J., Chen, H., & Huang, Z. (2006). A framework for authorship identification of online messages: Writing-style features and classification techniques. Journal of the American Society for Information Science and Technology, 57(3), 378-393.

Abstract:

With the rapid proliferation of Internet technologies and applications, misuse of online messages for inappropriate or illegal purposes has become a major concern for society. The anonymous nature of online-message distribution makes identity tracing a critical problem. We developed a framework for authorship identification of online messages to address the identity-tracing problem. In this framework, four types of writing-style features (lexical, syntactic, structural, and content-specific features) are extracted and inductive learning algorithms are used to build feature-based classification models to identify authorship of online messages. To examine this framework, we conducted experiments on English and Chinese online-newsgroup messages. We compared the discriminating power of the four types of features and of three classification techniques: decision trees, backpropagation neural networks, and support vector machines. The experimental results showed that the proposed approach was able to identify authors of online messages with satisfactory accuracy of 70 to 95%. All four types of message features contributed to discriminating authors of online messages. Support vector machines outperformed the other two classification techniques in our experiments. The high performance we achieved for both the English and Chinese datasets showed the potential of this approach in a multiple-language context.
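
As a rough illustration of the kind of writing-style features the abstract describes, the following Python sketch computes a handful of lexical, syntactic, and structural measures for a single message. It is not the authors' feature set: the function name `style_features`, the tiny `FUNCTION_WORDS` list, and the specific measures are assumptions chosen for brevity, and content-specific features are omitted.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny function-word list (an assumption for this sketch).
FUNCTION_WORDS = ("the", "of", "and", "to", "in")


def style_features(message: str) -> dict:
    """Return a few toy writing-style features for one message."""
    words = re.findall(r"[A-Za-z']+", message.lower())
    sentences = [s for s in re.split(r"[.!?]+", message) if s.strip()]
    paragraphs = [p for p in message.split("\n\n") if p.strip()]
    counts = Counter(words)
    n_words = len(words) or 1  # avoid division by zero on empty input

    features = {
        # Lexical: word-level statistics.
        "avg_word_length": sum(len(w) for w in words) / n_words,
        "vocabulary_richness": len(counts) / n_words,
        # Syntactic: punctuation usage (function words are added below).
        "punctuation_per_word": len(re.findall(r"[,.;:!?]", message)) / n_words,
        # Structural: layout of the message.
        "num_paragraphs": len(paragraphs),
        "avg_sentence_length": len(words) / (len(sentences) or 1),
    }
    for fw in FUNCTION_WORDS:
        features[f"freq_{fw}"] = counts[fw] / n_words
    return features
```

A feature dictionary like this could be turned into a fixed-order vector and passed to any classifier; the abstract reports decision trees, backpropagation neural networks, and support vector machines.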

Shieh, I., Chen, S., Lee, D., Lee, S., & Chen, H. (2008). Welcome message from conference co-chairs. IEEE International Conference on Intelligence and Security Informatics, 2008, IEEE ISI 2008, ix-x.

Fu, T., & Chen, H. (2008). Analysis of cyberactivism: A case study of online free Tibet activities. IEEE International Conference on Intelligence and Security Informatics, 2008, IEEE ISI 2008, 1-6.

Abstract:

Cyberactivism refers to the use of the Internet to advocate vigorous or intentional actions to bring about social or political change. Cyberactivism analysis aims to improve the understanding of cyber activists and their online communities. In this paper, we present a case study of online Free Tibet activities. For web site analysis, we use the inlink and outlink information of five selected seed URLs to construct the network of Free Tibet web sites. The network shows the close relationships between our five seed sites. Centrality measures reveal that tibet.org is probably an information hub site in the network. Further content analysis tells us that common hub site words are most popular in tibet.org whereas dalailama.com focuses mostly on religious words. For forum analysis, descriptive statistics such as the number of posts each month and the post distribution of forum users illustrate that the two large forums FreeTibetAndYou and RFAnews-Tibbs have experienced significant reduction in activities in recent years and that a small percentage of their users contribute the majority of posts. Important phrases of several long threads and active forum users are identified by using mutual information and TF-IDF scores. Such topical analyses help us understand the topics discussed in the forums and the ideas and interest of those forum users. Finally, social network analyses of the forum users are conducted to reflect their interactions and the social structure of their online communities. ©2008 IEEE.
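
The abstract mentions TF-IDF scores for surfacing important phrases in forum threads. A minimal sketch of one common TF-IDF variant is given below, assuming each thread has already been tokenized into a list of terms. The function name `tf_idf` and the exact weighting (relative term frequency times the natural log of inverse document frequency) are assumptions; the paper may use a different formulation.

```python
import math
from collections import Counter


def tf_idf(threads):
    """Score terms per thread.

    threads: list of token lists, one per forum thread.
    Returns one {term: score} dict per thread.
    """
    n_threads = len(threads)

    # Document frequency: number of threads containing each term.
    doc_freq = Counter()
    for tokens in threads:
        doc_freq.update(set(tokens))

    scores = []
    for tokens in threads:
        term_freq = Counter(tokens)
        total = len(tokens) or 1  # guard against empty threads
        scores.append({
            term: (count / total) * math.log(n_threads / doc_freq[term])
            for term, count in term_freq.items()
        })
    return scores
```

Under this weighting, a term that appears in every thread scores zero, so the highest-scoring terms in a thread are those that are frequent there but rare elsewhere.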

Abbasi, A., & Chen, H. (2008). Writeprints: A stylometric approach to identity-level identification and similarity detection in cyberspace. ACM Transactions on Information Systems, 26(2).

Abstract:

One of the problems often associated with online anonymity is that it hinders social accountability, as substantiated by the high levels of cybercrime. Although identity cues are scarce in cyberspace, individuals often leave behind textual identity traces. In this study we proposed the use of stylometric analysis techniques to help identify individuals based on writing style. We incorporated a rich set of stylistic features, including lexical, syntactic, structural, content-specific, and idiosyncratic attributes. We also developed the Writeprints technique for identification and similarity detection of anonymous identities. Writeprints is a Karhunen-Loeve transforms-based technique that uses a sliding window and pattern disruption algorithm with individual author-level feature sets. The Writeprints technique and extended feature set were evaluated on a testbed encompassing four online datasets spanning different domains: email, instant messaging, feedback comments, and program code. Writeprints outperformed benchmark techniques, including SVM, Ensemble SVM, PCA, and standard Karhunen-Loeve transforms, on the identification and similarity detection tasks with accuracy as high as 94% when differentiating between 100 authors. The extended feature set also significantly outperformed a baseline set of features commonly used in previous research. Furthermore, individual-author-level feature sets generally outperformed use of a single group of attributes. © 2008 ACM.