Hsinchun Chen

Professor, Management Information Systems
Regents Professor
Member of the Graduate Faculty
Professor, BIO5 Institute
Contact
(520) 621-4153

Research Interests

Dr. Chen's areas of expertise include:

- Security informatics and security big data; smart and connected health, health analytics; data, text, and web mining.
- Digital libraries, intelligent information retrieval, automatic categorization and classification, machine learning for IR, and large-scale information analysis and visualization.
- Internet resource discovery, digital libraries, IR for large-scale scientific and business databases, customized IR, and multilingual IR.
- Knowledge-based systems design, knowledge discovery in databases, hypertext systems, machine learning, neural network computing, genetic algorithms, and simulated annealing.
- Cognitive modeling, human-computer interaction, IR behaviors, and human problem-solving processes.

Publications

Zhang, Y., Yu, X., Dang, Y., & Chen, H. (2010). An integrated framework for avatar data collection from the virtual world. IEEE Intelligent Systems, 25(6), 17-23.

Abstract:

To mine the rich social media data produced in virtual worlds, an integrated framework combines bot- and spider-based approaches to collect avatar behavioral and profile data. © 2010 IEEE.
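For orientation, here is a minimal Python sketch of the general idea of combining spider-collected profile data with bot-observed behavioral events keyed by avatar; the record structure, field names, and merge logic are illustrative assumptions, not the authors' framework.

```python
# Minimal sketch (not the authors' implementation) of merging spider-collected
# avatar profiles with bot-observed in-world behavior; all names are illustrative.
from dataclasses import dataclass, field


@dataclass
class AvatarRecord:
    avatar_id: str
    profile: dict = field(default_factory=dict)      # from spider (profile pages)
    behaviors: list = field(default_factory=list)    # from bot (in-world observations)


def merge_sources(spider_profiles, bot_observations):
    """Combine profile data and behavioral logs keyed by avatar id."""
    records = {}
    for avatar_id, profile in spider_profiles.items():
        records[avatar_id] = AvatarRecord(avatar_id, profile=profile)
    for avatar_id, event in bot_observations:
        records.setdefault(avatar_id, AvatarRecord(avatar_id)).behaviors.append(event)
    return records


if __name__ == "__main__":
    profiles = {"avatar_42": {"name": "Foo", "groups": ["builders"]}}
    events = [("avatar_42", {"action": "chat", "region": "sandbox"}),
              ("avatar_99", {"action": "teleport", "region": "plaza"})]
    for rec in merge_sources(profiles, events).values():
        print(rec)
```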

Liu, X., & Chen, H. (2015). A research framework for pharmacovigilance in health social media: Identification and evaluation of patient adverse drug event reports. Journal of Biomedical Informatics, 58, 268-279.
Kaza, S., Wang, Y., & Chen, H. (2006). Suspect vehicle identification for border safety with modified mutual information. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 3975 LNCS, 308-318.

Abstract:

The Department of Homeland Security monitors vehicles entering and leaving the country at land ports of entry. Some vehicles are targeted to search for drugs and other contraband. Customs and Border Protection agents believe that vehicles involved in illegal activity operate in groups. If the criminal links of one vehicle are known then their border crossing patterns can be used to identify other partner vehicles. We perform this association analysis by using mutual information (MI) to identify pairs of vehicles that are potentially involved in criminal activity. Domain experts also suggest that criminal vehicles may cross at certain times of the day to evade inspection. We propose to modify the mutual information formulation to include this heuristic by using cross-jurisdictional criminal data from border-area jurisdictions. We find that the modified MI with time heuristics performs better than classical MI in identifying potentially criminal vehicles. © Springer-Verlag Berlin Heidelberg 2006.
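As a rough illustration of the approach the abstract describes, the sketch below scores vehicle pairs with a pointwise mutual-information-style measure over shared crossing windows and up-weights co-crossings at late-night hours; the window representation, weighting scheme, and cutoff hours are assumptions for illustration, not the paper's formulation.

```python
# Hedged sketch of pairwise mutual information over border-crossing records,
# with an optional time-of-day weight; illustrative only, not the paper's
# exact modified-MI formulation.
import math
from collections import Counter
from itertools import combinations


def suspicious_weight(hour, boost=2.0):
    """Toy heuristic (assumption): up-weight co-crossings in late-night hours."""
    return boost if hour < 5 or hour >= 22 else 1.0


def pairwise_mi(crossings, use_time_heuristic=True):
    """crossings: list of (window_id, hour, set_of_vehicle_ids) tuples."""
    n = len(crossings)
    single = Counter()   # how often each vehicle appears
    joint = Counter()    # weighted co-occurrence counts per vehicle pair
    for _, hour, vehicles in crossings:
        w = suspicious_weight(hour) if use_time_heuristic else 1.0
        for v in vehicles:
            single[v] += 1
        for a, b in combinations(sorted(vehicles), 2):
            joint[(a, b)] += w
    scores = {}
    for (a, b), c_ab in joint.items():
        p_ab = c_ab / n
        p_a, p_b = single[a] / n, single[b] / n
        scores[(a, b)] = p_ab * math.log(p_ab / (p_a * p_b))
    return scores


# Example: three crossing windows; V1 and V2 repeatedly cross together at night.
data = [(1, 23, {"V1", "V2"}), (2, 3, {"V1", "V2", "V3"}), (3, 14, {"V3", "V4"})]
print(pairwise_mi(data))
```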

McDonald, D., & Chen, H. (2002). Using sentence-selection heuristics to rank text segments in TXTRACTOR. Proceedings of the ACM International Conference on Digital Libraries, 28-35.

Abstract:

TXTRACTOR is a tool that uses established sentence-selection heuristics to rank text segments, producing summaries that contain a user-defined number of sentences. The purpose of identifying text segments is to maximize topic diversity, which is an adaptation of the Maximal Marginal Relevance criterion used by Carbonell and Goldstein [5]. Sentence-selection heuristics are then used to rank the segments. We hypothesize that ranking text segments via traditional sentence-selection heuristics produces a balanced summary with more useful information than one produced by using segmentation alone. The proposed summary is created in a three-step process: 1) sentence evaluation, 2) segment identification, and 3) segment ranking. As the required length of the summary changes, low-ranking segments can then be dropped from (or higher-ranking segments added to) the summary. We compared the output of TXTRACTOR to the output of a segmentation tool based on the TextTiling algorithm to validate the approach.
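The three-step flow (sentence evaluation, segment identification, segment ranking) can be illustrated with a short Python sketch; the scoring heuristics below are simplified stand-ins, and segment boundaries are assumed to be given, so this is not TXTRACTOR's implementation.

```python
# Illustrative sketch of the three-step summarization flow described above;
# the cue-word, length, and position heuristics are toy assumptions.
CUE_WORDS = {"significant", "propose", "conclude", "results"}


def sentence_score(sentence, position, total):
    """Toy sentence-selection heuristics: cue words, length, and position."""
    words = sentence.lower().split()
    cue = sum(w.strip(".,") in CUE_WORDS for w in words)
    position_bonus = 1.0 if position == 0 or position == total - 1 else 0.0
    return cue + 0.1 * len(words) + position_bonus


def summarize(segments, max_sentences):
    """segments: list of lists of sentences (segment identification assumed done)."""
    # Step 1-2: score each sentence and aggregate per segment.
    ranked = []
    for seg in segments:
        scores = [sentence_score(s, i, len(seg)) for i, s in enumerate(seg)]
        ranked.append((sum(scores) / len(scores), seg))
    # Step 3: rank segments and add them until the sentence budget is reached.
    ranked.sort(key=lambda pair: pair[0], reverse=True)
    summary = []
    for _, seg in ranked:
        if len(summary) + len(seg) <= max_sentences:
            summary.extend(seg)
    return summary


doc = [["We propose a new summarizer.", "It ranks text segments."],
       ["Results show significant gains.", "We conclude with future work."]]
print(summarize(doc, max_sentences=2))
```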

Abbasi, A., & Chen, H. (2009). A comparison of tools for detecting fake websites. Computer, 42(10), 78-86.

Abstract:

As fake website developers become more innovative, so too must the tools used to protect Internet users. A proposed system combines a support vector machine classifier and a rich feature set derived from website text, linkage, and images to better detect fraudulent sites. © 2009 IEEE.
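A minimal sketch of the classifier setup the abstract describes, an SVM trained on a combined feature vector, using scikit-learn; the three placeholder features standing in for text, linkage, and image cues are assumptions, not the paper's actual feature set.

```python
# Hedged sketch of an SVM over a combined feature vector; the features and
# example values are placeholders, not the paper's text/linkage/image features.
import numpy as np
from sklearn.svm import SVC

# Each row: [text_feature, linkage_feature, image_feature] (illustrative values).
X = np.array([
    [0.9, 0.1, 0.8],   # suspicious wording, few inbound links, copied logos
    [0.2, 0.7, 0.1],   # normal wording, well linked, original images
    [0.8, 0.2, 0.9],
    [0.1, 0.8, 0.2],
])
y = np.array([1, 0, 1, 0])  # 1 = fake website, 0 = legitimate

clf = SVC(kernel="rbf", C=1.0).fit(X, y)
print(clf.predict([[0.85, 0.15, 0.7]]))  # classify a new site's feature vector
```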