Hsinchun Chen
Publications
Abstract:
Content in extremist online forums is an invaluable data source for extremism research. In this study, we propose a systematic Web mining approach to collecting and monitoring extremist forums. Our proposed approach identifies extremist forums from various resources and addresses practical issues faced by researchers and experts in the extremist forum collection process. Such a collection provides a foundation for quantitative forum analysis. Using the proposed approach, we created a collection of 110 U.S. domestic extremist forums containing more than 640,000 documents. The collection building results demonstrate the effectiveness and feasibility of our approach. Furthermore, the extremist forum collection we created could serve as an invaluable data source to enable a better understanding of extremist movements. © Springer-Verlag Berlin Heidelberg 2006.
Abstract:
Innovation is one of the primary characteristics that separate successful from unsuccessful organizations. Organizations have a choice in selecting knowledge that is recombined to produce new innovations. The selection of knowledge is influenced by the status of inventors in an organization's internal knowledge network. In this study, we model knowledge flow within an organization and contend that it exhibits unique characteristics not incorporated in most social network measures. Using the model, we also propose a new measure based on random walks and team identification and use it to examine innovation selection in a large organization. Using empirical methods, we find that inventor status determined by the new measure had a significant positive relationship with the likelihood that the inventor's knowledge would be selected for recombination. We believe that the new measure, in addition to modeling knowledge flow in a scientific collaboration network, helps us better understand how innovation evolves within organizations. © 2009 IEEE.
Abstract:
Homeland security researchers and analysts more than ever must process large volumes of textual information. Information extraction techniques have been proposed to help alleviate the burden of information overload. Information extraction techniques, however, require retraining and/or knowledge re-engineering when document types vary as in the homeland security domain. Also, while effectively reducing the volume of the information, information extraction techniques do not point researchers to unanticipated interesting relationships identified within the text. We present the Arizona TerrorNet, a system that utilizes less specified information extraction rules to extract less choreographed relationships between known terrorists. Extracted relations are combined in a network and visualized using a network visualizer. We processed 200 unseen documents using the TerrorNet which extracted over 500 relationships between known terrorists. An Al Qaeda network expert made a preliminary inspection of the network and confirmed many of the network links.
Abstract:
Interaction coherence analysis (ICA) attempts to accurately identify and construct interaction networks by using various features and techniques. It is useful for identifying user roles, users' social and information value, and the social network structure of Dark Web communities. In this study, we applied interaction coherence analysis to Dark Web forums using the Hybrid Interaction Coherence (HIC) algorithm. Our algorithm utilizes both system features, such as header information and quotations, and linguistic features, such as direct address and lexical relation. Furthermore, several similarity-based methods, for example the Vector Space Model, the Dice equation, and a sliding window, are used to address various types of noise. Two experiments were conducted to compare our HIC algorithm with a traditional linkage-based method, a similarity-based method, and a simplified HIC method that does not address noise issues. The results demonstrate the effectiveness of our HIC algorithm for identifying interactions in Dark Web forums. © 2007 IEEE.
Abstract:
Vocabulary differences have created difficulties for on-line information retrieval systems and are even more of a problem in computer-supported cooperative work (CSCW), where collaborators with different backgrounds engage in the exchange of ideas and information. Our research group at the University of Arizona has investigated two questions related to the vocabulary problem in CSCW. First, what are the nature and characteristics of the vocabulary problem in collaboration, and are they different from those observed in information retrieval or in human-computer interaction research? Second, how can computer technologies and information systems be designed to help alleviate the vocabulary problem and foster seamless collaboration? We examine the vocabulary problem in CSCW and suggest a robust algorithmic solution to the problem.