Publications

Active Learning with Maximum Margin Sparse Gaussian Processes

Published in International Conference on Artificial Intelligence and Statistics (AISTATS), 2021

We present a maximum-margin sparse Gaussian Process (MM-SGP) for active learning (AL) of classification models for multi-class problems. The proposed model makes novel extensions to a GP by integrating maximum-margin constraints into its learning process, aiming to further improve its predictive power while keeping its inherent capability for uncertainty quantification. The MM constraints ensure a small “effective size” of the model, which allows MM-SGP to provide good predictive performance by using limited “active” data samples, a critical property for AL. Furthermore, as a Gaussian process model, MM-SGP will output both the predicted class distribution and the predictive variance, both of which are essential for defining a sampling function effective to improve the decision boundaries of a large number of classes simultaneously. Finally, the sparse nature of MM-SGP ensures that it can be efficiently trained by solving a low-rank convex dual problem. Experimental results on both synthetic and real-world datasets show the effectiveness and efficiency of the proposed AL model.

Recommended citation: Shi, Weishi, and Qi Yu. "Active Learning with Maximum Margin Sparse Gaussian Processes." International Conference on Artificial Intelligence and Statistics. PMLR, 2021. http://proceedings.mlr.press/v130/shi21a/shi21a.pdf

A Gaussian Process-Bayesian Bernoulli Mixture Model for Multi-Label Active Learning

Published in Neural Information Processing Systems (NIPS), 2021

We propose a novel integrated Gaussian Process-Bayesian Bernoulli Mixture model (GP-B2M) and a principled sampling function for multi-label classification active learning. The proposed method can accurately quantify a data sample's overall contribution to a correlated label space and choose the most informative samples for cost-effective annotation.

Recommended citation: Shi, Weishi, et al. "A Gaussian Process-Bayesian Bernoulli Mixture Model for Multi-Label Active Learning." Coming soon.

Multifaceted Uncertainty Estimation for Label-Efficient Deep Learning

Published in Neural Information Processing Systems (NIPS), 2020

We present a novel multi-source uncertainty prediction approach that enables deep learning (DL) models to be actively trained with much less labeled data. By leveraging the second-order uncertainty representation provided by subjective logic (SL), we conduct evidence-based theoretical analysis and formally decompose the predicted entropy over multiple classes into two distinct sources of uncertainty: vacuity and dissonance, caused by lack of evidence and conflict of strong evidence, respectively. The evidence-based entropy decomposition provides deeper insights into the nature of uncertainty, which can help effectively explore a large and high-dimensional unlabeled data space. We develop a novel loss function that augments DL-based evidence prediction with uncertainty anchor sample identification. The accurately estimated multiple sources of uncertainty are systematically integrated and dynamically balanced using a data sampling function for label-efficient active deep learning (ADL). Experiments conducted over both synthetic and real data and comparison with competitive AL methods demonstrate the effectiveness of the proposed ADL model.

Recommended citation: Shi, Weishi, et al. "Multifaceted uncertainty estimation for label-efficient deep learning." Advances in Neural Information Processing Systems 33 (2020). https://zxj32.github.io/data/NIPS2020_ADL_papre.pdf

A Bayesian learning model for design-phase service mashup popularity prediction

Published in Expert Systems with Applications, 2020

Using web services as building blocks to develop software applications, i.e., service mashups, not only reuses software development efforts to minimize development cost, but also leverages user groups and marketing efforts of those services to attract users and improve profits. This has significantly encouraged the development of a large number of service mashups in various domains. However, using existing services, even popular ones, does not guarantee the success of a mashup. In fact, a large portion of existing mashups fail to attract a good number of users, making the mashup development effort less effective. Design-phase popularity prediction can help avoid unpromising mashup developments by providing early-on insight into the potential popularity of a mashup. In this paper, we investigate the factors that can affect the popularity of a mashup through a comprehensive analysis of one of the largest mashup repositories (i.e., ProgrammableWeb). We further propose a novel Bayesian approach that offers early-on insight to developers into the potential popularity of a mashup using design-phase features only. Besides identifying those relevant features, the Bayesian learning model can provide a confidence level for each prediction. This provides useful guidance to developers for successful mashup development. Experimental results demonstrate that the proposed approach achieves high prediction accuracy and outperforms competitive models.

Recommended citation: Alshangiti, Moayad, et al. "A Bayesian learning model for design-phase service mashup popularity prediction." Expert Systems with Applications 149 (2020): 113231. https://www.sciencedirect.com/science/article/pii/S0957417420300579

Integrating Multi-level Tag Recommendation with External Knowledge Bases for Automatic Question Answering

Published in ACM Transactions on Internet Technology (TOIT), 2019

We focus on using natural language unstructured textual Knowledge Bases (KBs) to answer questions from community-based Question-and-Answer (Q&A) websites. We propose a novel framework that integrates multi-level tag recommendation with external KBs to retrieve the most relevant KB articles to answer user-posted questions. Different from many existing efforts that primarily rely on the Q&A sites’ own historical data (e.g., user answers), retrieving answers from authoritative external KBs (e.g., online programming documentation repositories) has the potential to provide rich information to help users better understand the problem, acquire the knowledge, and hence avoid asking similar questions in the future. The proposed multi-level tag recommendation best leverages the rich tag information by first categorizing tags into different semantic levels based on their usage frequencies. A post-tag co-clustering model, augmented by a two-step tag recommender, is used to predict tags at different levels for a given user-posted question. A KB article retrieval component leverages the recommended multi-level tags to select the appropriate KBs and search/rank the matching articles thereof. We conduct extensive experiments using real-world data from a Q&A site and multiple external KBs to demonstrate the effectiveness of the proposed question-answering framework.

Recommended citation: Lima, Eduardo, et al. "Integrating multi-level tag recommendation with external knowledge bases for automatic question answering." ACM Transactions on Internet Technology (TOIT) 19.3 (2019): 1-22. https://dl.acm.org/doi/pdf/10.1145/3319528

Integrating Bayesian and Discriminative Sparse Kernel Machines for Multi-class Active Learning

Published in Neural Information Processing Systems (NIPS), 2019

We propose a novel active learning (AL) model that integrates Bayesian and discriminative kernel machines for fast and accurate multi-class data sampling. By joining a sparse Bayesian model and a maximum margin machine under a unified kernel machine committee (KMC), the proposed model is able to identify a small number of data samples that best represent the overall data space while accurately capturing the decision boundaries. The integration is conducted using the maximum entropy discrimination framework, resulting in a joint objective function that contains generalized entropy as a regularizer. Such a property allows the proposed AL model to choose data samples that more effectively handle non-separable classification problems. Parameter learning is achieved through a principled optimization framework that leverages convex duality and sparse structure of KMC to efficiently optimize the joint objective function. Key model parameters are used to design a novel sampling function to choose data samples that can simultaneously improve multiple decision boundaries, making it an effective sampler for problems with a large number of classes. Experiments conducted over both synthetic and real data and comparison with competitive AL methods demonstrate the effectiveness of the proposed model.

Recommended citation: Shi, Weishi, and Qi Yu. "Integrating Bayesian and discriminative sparse kernel machines for multi-class active learning." Advances in Neural Information Processing Systems (2019). https://par.nsf.gov/servlets/purl/10164667

An Efficient Many-Class Active Learning Framework for Knowledge-Rich Domains

Published in IEEE International Conference on Data Mining (ICDM), 2018

The high cost for labeling data instances is a key bottleneck for training effective supervised learning models. This is especially the case in domains such as medicine and bioinformatics, where expert knowledge is required for understanding and extracting the underlying semantics of data. Active learning provides a means to reduce human labeling efforts by identifying the most informative data instances. In this paper, we propose a cost-effective active learning framework to further lessen human efforts, especially in knowledge-rich domains where a large number of classes may be subject to scrutiny during decision making. In particular, this framework employs a novel many-class sampling model, MC-S, for data sample selection. MC-S is further augmented with convex hull-based sampling to achieve faster convergence of active learning. Evaluation studies conducted over multiple real-world datasets with many classes demonstrate that the proposed framework significantly reduces the overall labeling efforts through fast convergence and early stop of active learning.

Recommended citation: Shi, Weishi, and Qi Yu. "An Efficient Many-Class Active Learning Framework for Knowledge-Rich Domains." 2018 IEEE International Conference on Data Mining (ICDM). IEEE, 2018. https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=8594973

From Novice to Expert Narratives of Dermatological Disease

Published in IEEE International Conference on Pervasive Computing and Communications Workshops, 2018

Medical diagnosis requires extensive knowledge of pathological characteristics and the clinical training necessary to master specific domains. In the domain of dermatology, these properties include the size, shape, distribution, color, and location of symptoms. We have built and trained a tag recommender with expert narratives of dermatological disease, then explored the effectiveness of this system on novice narratives. The system mitigates the prerequisite of domain knowledge, empowering a novice to enhance their medical descriptions or even reach an accurate diagnosis. After collecting novice narratives, we explored word alignment to provide a mapping between expert and novice vocabulary allowing novice input to be augmented with expert terminology. Ultimately, we found that our system is an effective educational tool, which could be improved by word alignment and other techniques.

Recommended citation: Obot, Nse, et al. "From novice to expert narratives of dermatological disease." 2018 IEEE International Conference on Pervasive Computing and Communications Workshops (PerCom Workshops). IEEE, 2018. https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=8480162

Correlation-Aware Multi-Label Active Learning for Web Service Tag Recommendation

Published in IEEE International Conference on Web Services (ICWS), 2017

Tag recommendation has gained significant popularity for annotating various web-based resources including web services. Compared with other approaches, tag recommendation based on supervised learning models usually leads to good accuracy. However, a high-quality training data set is needed, which demands manual tagging efforts from domain experts. While we could leverage the tags of existing web services assigned by their developers, the quality of these tags may not be good enough to build accurate classifiers for tag recommendation. In this paper, a novel multi-label active learning approach is proposed for web service tag recommendation. The proposed approach is able to identify a small number of the most informative web services to be tagged by domain experts. We further minimize the domain expert efforts by learning and leveraging the correlations among tags to improve the active learning process. We conduct a comprehensive experimental study on a real-world data set, and the results demonstrate the effectiveness of our approach.

Recommended citation: Shi, Weishi, Xumin Liu, and Qi Yu. "Correlation-aware multi-label active learning for web service tag recommendation." 2017 IEEE International Conference on Web Services (ICWS). IEEE, 2017. https://ieeexplore.ieee.org/document/8029766

Statistical Learning of Domain-Specific Quality-of-Service Features from User Reviews

Published in ACM Transactions on Internet Technology (TOIT), 2017

With the fast increase of online services of all kinds, users start to care more about the Quality of Service (QoS) that a service provider can offer besides the functionalities of the services. As a result, QoS-based service selection and recommendation have received significant attention since the mid-2000s. However, existing approaches primarily consider a small number of standard QoS parameters, most of which relate to the response time, fee, availability of services, and so on. As online services start to diversify significantly over different domains, this small set of QoS parameters will not be able to capture the different quality aspects that users truly care about over different domains. Most existing approaches for QoS data collection depend on the information from service providers, which makes them sensitive to the trustworthiness of the providers. Some service monitoring mechanisms collect QoS data through actual service invocations but may be affected by actual hardware/software configurations. In either case, domain-specific QoS data that capture what users truly care about have not been successfully collected or analyzed by existing works in service computing. To address this demanding issue, we develop a statistical learning approach to extract domain-specific QoS features from user-provided service reviews. In particular, we aim to classify user reviews based on their sentiment orientations into either a positive or negative category. Meanwhile, statistical feature selection is performed to identify statistically nontrivial terms from review text, which can serve as candidate QoS features. We also develop a topic model-based approach that automatically groups relevant terms and returns the term groups to users, where each term group corresponds to one high-level quality aspect of services. We have conducted extensive experiments on three real-world datasets to demonstrate the effectiveness of our approach.

Recommended citation: Liu, Xumin, et al. "Statistical learning of domain-specific quality-of-service features from user reviews." ACM Transactions on Internet Technology (TOIT) 17.2 (2017): 1-24. https://dl.acm.org/doi/abs/10.1145/3053381

Weishi Shi
