Projects

Entity Embedding in Medical Text Data

Cross-disciplinary Scholars in Science and Technology (CSST) project, Scalable Analytics Institute, University of California, Los Angeles, 2017

  • Conducted independent research on entity representation learning to support a medical case report query system, supervised by Prof. Yizhou Sun and Prof. Wei Wang
  • Constructed graphs from annotated domain-specific entities in clinical case reports and applied node2vec, an unsupervised graph embedding method, to learn graph-based representations of the case reports
  • Visualized the results by projecting the 128-dimensional embeddings of 2,491 entities into 3D space using PCA
  • Evaluated the embeddings by computing cosine similarity between medical terms and their corresponding abbreviations; on this measure the learned embeddings outperformed Google’s pre-trained word2vec embeddings
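
The abbreviation-based evaluation can be illustrated with a minimal sketch. It assumes each entity maps to a dense vector and that the score is the average cosine similarity over (term, abbreviation) pairs; the function names, the 3-dimensional toy vectors, and the example pairs are hypothetical and are not taken from the project.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def mean_pair_similarity(embeddings, pairs):
    """Average cosine similarity over (term, abbreviation) pairs."""
    scores = [cosine_similarity(embeddings[t], embeddings[a]) for t, a in pairs]
    return sum(scores) / len(scores)


# Hypothetical toy data: 3-dimensional vectors stand in for the
# 128-dimensional embeddings described above.
toy_embeddings = {
    "myocardial infarction": [0.9, 0.1, 0.2],
    "MI": [0.8, 0.2, 0.1],
    "computed tomography": [0.1, 0.9, 0.3],
    "CT": [0.2, 0.8, 0.4],
}
toy_pairs = [("myocardial infarction", "MI"), ("computed tomography", "CT")]

score = mean_pair_similarity(toy_embeddings, toy_pairs)
print(f"mean term/abbreviation similarity: {score:.3f}")
```

Running the same function over two embedding tables (for example, graph-based and word2vec vectors) with the same pair list would give directly comparable scores; a higher mean suggests that terms and their abbreviations sit closer together in that space.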

Disease Evolution Analysis System

Design and Development of Information System Course Project, Wuhan University, 2017

  • Built a clinical decision support system that presents disease evolution patterns and drug usage details
  • Processed data from the MIMIC-III clinical database with Apache Spark and stored the results in MySQL
  • Used a hidden Markov model to model disease progression from sequences of clinical measurements and constructed a disease evolution network, visualized with Gephi
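
As a rough illustration of how a hidden Markov model can turn a sequence of clinical measurements into disease stages, the sketch below runs Viterbi decoding on a toy model and counts stage-to-stage transitions as weighted edges. Everything here is an assumption made for illustration: the two stages, the discretized "normal"/"elevated" readings, all probabilities, and the edge-counting step are hypothetical, and the project description does not state which decoding procedure or network construction was actually used.

```python
import math
from collections import Counter


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most likely hidden-state sequence for the observations."""
    if not observations:
        return []

    def log(p):
        # Log space avoids numerical underflow on long sequences.
        return math.log(p) if p > 0 else float("-inf")

    # best[t][s]: best log-probability of any path ending in state s at time t
    best = [{s: log(start_p[s]) + log(emit_p[s][observations[0]]) for s in states}]
    back = [{}]
    for t in range(1, len(observations)):
        best.append({})
        back.append({})
        for s in states:
            prev, score = max(
                ((p, best[t - 1][p] + log(trans_p[p][s])) for p in states),
                key=lambda item: item[1],
            )
            best[t][s] = score + log(emit_p[s][observations[t]])
            back[t][s] = prev

    # Follow the back-pointers from the best final state.
    path = [max(states, key=lambda s: best[-1][s])]
    for t in range(len(observations) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path


# Hypothetical two-stage model with made-up probabilities.
states = ["mild", "severe"]
start_p = {"mild": 0.8, "severe": 0.2}
trans_p = {
    "mild": {"mild": 0.7, "severe": 0.3},
    "severe": {"mild": 0.1, "severe": 0.9},
}
emit_p = {
    "mild": {"normal": 0.8, "elevated": 0.2},
    "severe": {"normal": 0.2, "elevated": 0.8},
}

readings = ["normal", "normal", "elevated", "elevated"]
stages = viterbi(readings, states, start_p, trans_p, emit_p)

# One hypothetical way to derive network edges: count consecutive stage pairs.
edges = Counter(zip(stages, stages[1:]))
print(stages)
print(dict(edges))
```

Edge counts of this kind, aggregated over many patients, could be exported as a weighted edge list for a graph tool such as Gephi.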