Shan-Hung Wu 吳尚鴻

Professor, CS, NTHU

Email: shwu [AT] cs.nthu.edu.tw
Office: Delta 603, No. 101, Sec. 2, Kuang-Fu Rd., Hsinchu, Taiwan 30013

Biosketch

I am a professor in the Department of Computer Science at National Tsing Hua University (NTHU), Taiwan. Since 2016, I have served as the division director for the Division of Academic Information System at the Computer & Communication Center of NTHU. My research interests include:

I received the Ph.D. degree in Electrical Engineering from the National Taiwan University, Taiwan (Sep 2005 - Feb 2009). Before joining NTHU in 2010, I was a senior research scientist at Telcordia Technologies Inc. (formerly Bellcore) during 2004 and 2010.

I am also a part-time programmer. I like to write code with my students, and we have developed some interesting software. I received the Google Faculty Award in 2020.

Publications

See DBLP for the complete list.

The remarkable performance achieved by Deep Neural Networks (DNNs) in many applications is followed by the rising concern about data privacy and security. Since DNNs usually require large datasets to train, many practitioners scrape data from external sources such as the Internet. However, an external data owner may not be willing to let this happen, causing legal or ethical issues. In this paper, we study the generalization attacks against DNNs, where an attacker aims to slightly modify training data in order to spoil the training process such that a trained network lacks generalizability. These attacks can be performed by data owners and protect data from unexpected use. However, there is currently no efficient generalization attack against DNNs due to the complexity of a bilevel optimization involved. We propose Neural Tangent Generalization Attacks (NTGA) that, to the best of our knowledge, is the first work enabling clean-label, black-box generalization attack against DNNs. We conduct extensive experiments, and the empirical results demonstrate the effectiveness of NTGA.

Deep neural networks are shown to be susceptible to both adversarial attacks and backdoor attacks. Although many defenses against an individual type of the above attacks have been proposed, the interactions between the vulnerabilities of a network to both types of attacks have not been carefully investigated yet. In this paper, we conduct experiments to study whether adversarial robustness and backdoor robustness can affect each other and find a trade-off—by increasing the robustness of a network to adversarial examples, the network becomes more vulnerable to backdoor attacks. We then investigate the cause and show how such a trade-off can be exploited for either good or bad purposes. Our findings suggest that future research on defense should take both adversarial and backdoor attacks into account when designing algorithms or robustness measures to avoid pitfalls and a false sense of security.

Deep neural networks are shown to be vulnerable to adversarial attacks. This motivates robust learning techniques, such as the adversarial training, whose goal is to learn a network that is robust against adversarial attacks. However, the sample complexity of robust learning can be significantly larger than that of “standard” learning. In this paper, we propose improving the adversarial robustness of a network by leveraging the potentially large test data seen at runtime. We devise a new defense method, called runtime masking and cleansing (RMC), that adapts the network at runtime before making a prediction to dynamically mask network gradients and cleanse the model of the non-robust features inevitably learned during the training process due to the size limit of the training set. We conduct experiments on real-world datasets and the results demonstrate the effectiveness of RMC empirically.

The Convolutional Neural Networks (CNNs) have laid the foundation for many techniques in various applications. Despite achieving remarkable performance in some tasks, the 3D viewpoint generalizability of CNNs is still far behind humans visual capabilities. Although recent efforts, such as the Capsule Networks, have been made to address this issue, these new models are either hard to train and/or incompatible with existing CNN-based techniques specialized for different applications. Observing that humans use binocular vision to understand the world, we study in this paper whether the 3D viewpoint generalizability of CNNs can be achieved via a binocular vision. We propose CNN^{2}, a CNN that takes two images as input, which resembles the process of an object being viewed from the left eye and the right eye. CNN^{2} uses novel augmentation, pooling, and convolutional layers to learn a sense of three-dimensionality in a recursive manner. Empirical evaluation shows that CNN^{2} has improved viewpoint generalizability compared to vanilla CNNs. Furthermore, CNN^{2} is easy to implement and train, and is compatible with existing CNN-based specialized techniques for different applications.

We study the problem of detecting critical structures using a graph embedding model. Existing graph embedding models lack the ability to precisely detect critical structures that are specific to a task at the global scale. In this paper, we propose a novel graph embedding model, called the Ego-CNNs, that detects precise critical structures efficiently. An Ego-CNN can be jointly trained with a task model and help explain/discover knowledge for the task. We conduct extensive experiments and the results show that Ego-CNNs (1) can lead to comparable task performance as the state-of- the-art graph embedding models, (2) works nicely with CNN visualization techniques to illustrate the detected structures, and (3) is efficient and can incorporate with scale-free priors, which commonly occurs in social network datasets, to further improve the training efficiency.

Semi-supervised clustering algorithms have been proposed to identify data clusters that align with user perceived ones via the aid of side information such as seeds or pairwise constrains. However, traditional side information is mostly at the instance level and subject to the sampling bias, where non-randomly sampled instances in the supervision can mislead the algorithms to wrong clusters. In this paper, we propose learning from the feature-level supervision. We show that this kind of supervision can be easily obtained in the form of perception vectors in many applications. Then we present novel algorithms, called Perception Embedded (PE) clustering, that exploit the perception vectors as well as traditional side information to find clusters perceived by the user. Extensive experiments are conducted on real datasets and the results demonstrate the effectiveness of PE empirically.

The Cross Domain Collaborative Filtering (CDCF) exploits the rating matrices from multiple domains to make better recommendations. Existing CDCF methods adopt the sub-structure sharing technique that can only transfer linearly correlated knowledge between domains. In this paper, we propose the notion of Hyper-Structure Transfer (HST) that requires the rating matrices to be explained by the projections of some more complex structure, called the hyper-structure, shared by all domains, and thus allows the non-linearly correlated knowledge between domains to be identified and transferred. Extensive experiments are conducted and the results demonstrate the effectiveness of our HST models empirically.

We study the target node prediction problem: given two social networks, identify those nodes/users from one network (called the source network) who are likely to join another (called the target network, with nodes called target nodes). Although this problem can be solved using existing techniques in the field of cross domain classification, we observe that in many real-world situations the cross-domain classifiers perform sub-optimally due to the heterogeneity between source and target networks that prevents the knowledge from being transferred. In this paper, we propose learning the consistent behavior of common users to help the knowledge transfer. We first present the Consistent Incidence Co-Factorization (CICF) for identifying the consistent users, i.e., common users that behave consistently across networks. Then we introduce the Domain-UnBiased (DUB) classifiers that transfer knowledge only through those consistent users. Extensive experiments are conducted and the results show that our proposal copes with heterogeneity and improves prediction accuracy.

Deterministic database systems have been shown to significantly improve the availability and scalability of a distributed database system deployed on a shared-nothing architecture across WAN while ensuring strong consistency. However, their scalability and performance advantages highly depend on the quality of data partitioning due to the reduced flexibility in transaction processing. Although a deterministic database system can employ workload driven data (re-)partitioning and live data migration algorithms to partition data, we found that the effectiveness of these algorithms is limited in complex real-world environments due to the unpredictability of machine workloads. In this paper, we present Hermes, a deterministic database system prototype that, for the first time, does not rely on sophisticated data partitioning to achieve high scalability and performance. Hermes employs a novel transaction routing mechanism that jointly optimizes the balance of machine workloads, data (re-)partitioning, and live data migration by looking into the queued transactions to be executed in the near future. We conducted extensive experiments which show that Hermes is able to yield 29% to 137% increase in transaction throughput as compared to the state-of-the-art systems under complex real-world workloads.

Recent deterministic database systems have achieved high scalability and high availability in distributed environments given OLTP workloads. However, modern OLTP applications usually have changing workloads or access patterns, so how to make the resource provisioning elastic to the changing workloads becomes an important design goal for a deterministic database system. Live migration, which moves the specified data from a source machine to a destination node while continuously serving the incoming transactions, is a key technique required for the elasticity. In this paper, we present MgCrab, a live migration technique for a deterministic database system, that leverages the determinism to maintain the consistency of data on the source and destination nodes at very low cost during a migration period. We implement MgCrab on an open-source database system. Extensive experiments were conducted and the results demonstrate the effectiveness of MgCrab.

Deterministic database systems, a type of NewSQL database systems, have been shown to yield high throughput on a cluster of commodity machines while ensuring the strong consistency between replicas, provided that the data can be well-partitioned on these machines. However, data partitioning can be suboptimal for many reasons in real-world applications. In this paper, we present T-Part, a transaction execution engine that partitions transactions in a deterministic database system to deal with the unforeseeable workloads or workloads whose data are hard to partition. By modeling the dependency between transactions as a T-graph and continuously partitioning that graph, T-Part allows each transaction to know which later transactions on other machines will read its writes so that it can push forward the writes to those later transactions immediately after committing. This forward-pushing reduces the chance that the later transactions stall due to the unavailability of remote data. We implement a prototype for T-Part. Extensive experiments are conducted and the results demonstrate the effectiveness of T-Part.

The K-Nearest Neighbors (KNN) query has been of significant interest in many studies and has become one of the most important spatial queries in mobile sensor networks. Applications of KNN queries may include vehicle navigation, wildlife social discovery, and squad/platoon searching on the battlefields. Current approaches to KNN search in mobile sensor networks require a certain kind of indexing support. This index could be either a centralized spatial index or an in-network data structure that is distributed over the sensor nodes. Creation and maintenance of these index structures, to reflect the network dynamics due to sensor node mobility, may result in long query response time and low battery efficiency, thus limiting their practical use. In this paper, we propose a maintenance-free itinerary-based approach called Density-aware Itinerary KNN query processing (DIKNN). The DIKNN divides the search area into multiple cone-shape areas centered at the query point. It then performs a query dissemination and response collection itinerary in each of the cone-shape areas in parallel. The design of the DIKNN scheme takes into account several challenging issues such as the trade-off between degree of parallelism and network interference on query response time, and the dynamic adjustment of the search radius (in terms of number of hops) according to spatial irregularity or mobility of sensor nodes. To optimize the performance of DIKNN, a detailed analytical model is derived that automatically determines the most suitable degree of parallelism under various network conditions. This model is validated by extensive simulations. The simulation results show that DIKNN yields substantially better performance and scalability over previous work, both as k increases and as the sensor node mobility increases. It outperforms the second runner with up to a 50 percent saving in energy consumption and up to a 40 percent reduction in query response time, while rendering the same level of query result accuracy.

Current approaches to k nearest neighbor (KNN) search in mobile sensor networks require certain kind of indexing support. This index could be either a centralized spatial index or an in-network data structure that is distributed over the sensor nodes. Creation and maintenance of these index structures, to reflect the network dynamics due to sensor node mobility, may result in long query response time and low battery efficiency, thus limiting their practical use. In this paper, we propose a maintenance-free, itinerary-based approach called density-aware itinerary KNN query processing (DIKNN). The DIKNN divides the search area into multiple cone-shape areas centered at the query point. It then performs a query dissemination and response collection itinerary in each of the cone-shape areas in parallel. The design of the DIKNN scheme also takes into account challenging issues such as the the dynamic adjustment of the search radius (in terms of number of hops) according to spatial irregularity or mobility of sensor nodes. The simulation results show that DIKNN yields substantially better performance and scalability over previous work, both as k increases and as the sensor node mobility increases. It outperforms the second runner with up to 50% saving in energy consumption and up to 40% reduction in query response time, while rendering the same level of query result accuracy.

The demand for autonomous vehicles (AVs) is rapidly growing these years. As AVs have a potential to free drivers' cognitive resources from driving to other tasks, reading is one of the common activities users conduct in travel multitasking. Nevertheless, ways to supporting reading in AVs have been little explored. To fill this gap, we explored the design of an in-vehicle reader on a windshield in AVs along three dimensions: dynamics, position, and text segmentation. We conducted two in-lab within-subject experiments to examine the eight kinds of in-car reading modalities that represented the combinations of the three dimensions in terms of drivers' reaction time and reading comprehension. Our results show a case where an adaptive positioning would be particularly beneficial for supporting reading in AVs. And our general suggestion is to use a static reading zone presented off-road and in sentences because it leads to faster reaction and better reading comprehension.

Object detection based on pre-trained deep neural networks (DNNs) has achieved impressive performance and enabled many applications. However, DNN-based object detectors are shown to be vulnerable to physical adversarial attacks. Despite that recent efforts have been made to defend against these attacks, they either use strong assumptions or become less effective with pre-trained object detectors. In this paper, we propose adversarial pixel masking (APM), a defense against physical attacks, which is designed specifically for pre-trained object detectors. APM does not require any assumptions beyond the “patch-like” nature of a physical attack and can work with different pre-trained object detectors of different architectures and weights, making it a practical solution in many applications. We conduct extensive experiments, and the empirical results show that APM can significantly improve model robustness without degrading clean performance.

This paper proposes a novel audio CAPTCHA system that requires a user to respond immediately after hearing a short and easy-to-remember cue in its mixture with background music. Potential attacking paths based on cross correlation (CC) and sound event detection (SED) are implemented to test the security of the system. Then, two defending measures based on phase-modification of the cue and audio watermarking are established against CC and SED-based attacks, respectively. Human subjects were recruited to test the system and the results indicated that the subjects can pass the proposed audio CAPTCHA >90% of the times. In contrast, the proposed defending measures suppress the passing rate of both attacks to 8.2% and 0.4%, respectively.

We study the problem of region-semantics preserving (RSP) image synthesis. Given a reference image and a region specification R, our goal is to train a model that is able to generate realistic and diverse images, each preserving the same semantics as that of the reference image within the region R. This problem is challenging because the model needs to (1) understand and preserve the marginal semantics of the reference region; i.e., the semantics excluding that of any subregion; and (2) maintain the compatibility of any synthesized region with the marginal semantics of the reference region. In this paper, we propose a novel model, called the fast region-semantics preserver (Fast-RSPer), for the RSP image synthesis problem. The Fast-RSPer uses a pre-trained GAN generator and a pre-trained deep feature extractor to generate images without undergoing a dedicated training phase. This makes it particularly useful for the interactive applications. We conduct extensive experiments using the real-world datasets and the results show that Fast-PSPer can synthesize realistic, diverse RSP images efficiently.

With the growing popularity of Social Networking Services (SNSs), increasing amounts of sensitive information are stored online and linked to SNS accounts. The obvious value of SNS accounts gives rise to the identity fraud problem-unauthorized, stealthy use of SNS accounts. For example, anxious parents may use their children's SNS accounts to spy on the children's social interaction; or husbands/wives may check their spouses' SNS accounts if they suspect infidelity. Stealthy identity fraud could happen to anyone and seriously invade the privacy of account owners. However, there is no known defense against such behavior when an attacker, possibly an acquaintance of the victim, gets access to the victim's computing devices. In this paper, we propose to extend the use of continuous authentication to detect the in situ identity fraud incidents, which occurs when the attackers use the same accounts, the same devices, and IP addresses as the victims. Using Facebook as a case study, we show that it is possible to detect such incidents by analyzing SNS users' browsing behavior. Our experiment results demonstrate that the approach can achieve higher than 80% detection accuracy within 2 min, and over 90% after 7 min of observation time.

Cognitive radios (CRs) are proposed to alleviate the huge need for radio spectrum. There are two known steps for a pair of CRs to start communication: the rendezvous and data-channel negotiation. Despite that the rendezvous can be achieved by some well-studied techniques such as the channel hopping, the strategies for data-channel negotiation receive much less attention and their impact on data transmission performance remains unclear. In this paper, we study existing data-channel negotiation schemes for channel-hopping CRs and observe that 1) for short data transmission, they incur a huge overhead, called notification delay, that severely limits the throughput; and 2) for long data transmission, they lead to large overhead, called interruption delay, in handling the PU interruption, which makes performance unstable. By carefully re-examining the steps toward low-overhead and stable data transmission, we argue that a key step, called self-channel selection, is missing and should precede the rendezvous and data channel selection steps. In this step, each CR selects, in a distributed manner, only a small amount of the most stable channels to be used in the later steps. To realize the self-channel selection, we introduce a Randomly-Started Stability-Descent (RSSD) selection algorithm. Expensive simulations are conducted and the results demonstrate the effectiveness of RSSD in reducing 1) the notification delay; 2) the chance of PU interruption; and 3) the interruption delay if PU interruption occurs, which overall improve the performance and quality of data transmission.

Quorum-based power-saving (QPS) protocols have been proposed for ad hoc networks (e.g., IEEE 802.11 ad hoc mode) to increase energy efficiency and prolong the operational time of mobile stations. These protocols assign to each station a cycle pattern that specifies when the station should wake up (to transmit/receive data) and sleep (to save battery power). In all existing QPS protocols, the cycle length is either identical for all stations or is restricted to certain numbers (e.g., squares or primes). These restrictions on cycle length severely limit the practical use of QPS protocols as each individual station may want to select a cycle length that is best suited for its own need (in terms of remaining battery power, tolerable packet delay, and drop ratio). In this paper, we propose the notion of hyper quorum system (HQS)—a generalization of QPS that allows for arbitrary cycle lengths. We describe algorithms to generate two different classes of HQS given any set of arbitrary cycle lengths as input. We also describe how to find the optimal cycle length for a station to maximize energy efficiency, subject to certain performance constraints. We then present analytical and simulation results that show the benefits of HQS-based power-saving protocols over the existing QPS protocols. The HQS protocols yield up to 41% improvement in energy efficiency under heavy traffic loads while eliminating more than 90% delay drops under light traffic loads.

Cognitive radio (CR) is intended to meet the exponentially growing demand for spectrum by allowing for opportunistic utilization of idle legacy channels. Rendezvous, where two radios complete handshaking in an idle channel, is a key step for "stranger" (unknown to each other) CRs to start communication. However, none of existing algorithms guarantee rendezvous for heterogeneous or stranger CRs with different spectrum-sensing capabilities, in spite of the fact that (i) a wide variety of mobile devices are equipped with heterogeneous radios and (ii) there are numerous applications requiring efficient rendezvous for heterogeneous radios/CRs. In this paper, we propose a new channel hopping algorithm, called Heterogeneous Hopping (HH), that guarantees rendezvous without assuming existence of a universal channel set that can be sensed by all radios. HH is realized with a two-layer design that harmonizes the fixed-short-cycle and parity-alignment techniques we propose here, in order to guide CRs to rendezvous in two complementary situations resulting from the different capabilities of mobile wireless devices. To best of our knowledge, HH is the first channel-hopping scheme that guarantees rendezvous between heterogeneous radios. Our in-depth evaluation has shown HH to be significantly faster than simple extensions of existing schemes. Moreover, the latter cannot guarantee successful rendezvous, either.

In this paper, we propose to use a continuous authentication approach to detect the in-situ identity fraud incidents, which occur when the attackers use the same devices and IP addresses as the victims. Using Facebook as a case study, we show that it is possible to detect such incidents by analyzing SNS users’ browsing behavior. Our experiment results demonstrate that the approach can achieve reasonable accuracy given a few minutes of observation time.

Professional Activities

Chairs

PC Members

Reviewers

NuerIPS (NIPS), ICML, AAAI, IJCAI, KDD, CIKM, ICDM, SIGMOD, PVLDB, ICDE, INFOCOM, SODA, ICDCS, MM, to name a few

Talks

Awards

DataLab

I work with many smart and creative students. Our lab locates at Delta 723 and 724. Please feel free to stop by and chat with us if you want to know more about our projects and research life.

To members:    Join this Group

Courses

To prospective students: Loads of my courses are generally very heavy. So, please make sure you reserve enough time before enrolling in them. If you have any questions or suggestions, please contact TAs first before dropping me an email.

This class introduces the concepts and practices of deep learning. The course consists of three parts. In the first part, we give a quick introduction of classical machine learning and review some key concepts required to understand deep learning. In the second part, we discuss how deep learning differs from classical machine learning and explain why it is effective in dealing with complex problems such as the image and natural language processing. Various CNN and RNN models will be covered. In the third part, we introduce the deep reinforcement learning and its applications.

This course also gives coding labs. We will use Python 3 as the main programming language throughout the course. Some popular machine learning libraries such as Scikit-learn and Tensorflow will be used and explained in detials.

This course provides an overview of the current database management systems in the cloud, and explains how they are different from traditional database systems. The goal is to get students familiar with some well-known implementations like NoSQL database systems, deterministic database systems etc., and more importantly, to help students understand the design tradeoffs when configuring/building their own database systems given a particular set of target applications (tenants) in mind.

This course presents hands-on labs in the software aspect for students to be familiar with software design and development processes. The class follows a mainstream of building a useful mobile app under selected topics. We introduce some important software design concepts and development tools.

Past Courses

Software

I like to turn ideas into real things. At my leisure, I write code with my students to transform our research results into fun and easy-to-use software that benefits the general audience.

ElaSQL

ElaSQL is a distributed relational database system prototype that aims to offer high scalability, high availability, and elasticity to online transaction processing (OLTP) applications.

VanillaDB

VanillaDB is a collection of simple-to-read, fast, and extensible database system components aiming to lower the barrier of new-system prototyping and learning the database internals.

Flora (graduated)

Flora is an app that uses behavioral psychology to help people stay off their phones and focus on what's more important in real life. This project has graduated and become a startup company.

Patents

  1. Shan-Hung Wu, Shao-Kan Pi, Thing-Yu Cheng, "Method and System for One-Time Connection," U.S. Patent App. No. US-14/710,599 (assignee: NTHU, pending)
  2. Shan-Hung Wu, Meng-Kai Liao, Shao-Kan Pi, Yu-Shan Lin, "Deterministic Database System and Data Transferring Method Thereof," U.S. Patent App. No. US-14/693,903 (assignee: NTHU)
  3. Shan-Hung Wu, Cheng-Yu Hsu, You-Jhih Wong, "Method and Electronic Device for Rating Outfit," U.S. Patent App. No. US-14/601,256 (assignee: NTHU, pending)
  4. Shan-Hung Wu, Meng-Ren Chen, "Method, Server, and Apparatus for Lecture Feedback," U.S. Patent App. No. US-14/583,202 (assignee: NTHU, pending)
  5. Shan-Hung Wu, Shao-Kan Pi, Ting-Yu Cheng, "Electronic Apparatus and Soft Locking Method Thereof," U.S. Patent App. No. US-14/576,171 (assignee: NTHU, pending)
  6. Shan-Hung Wu, Chung-Min Chen, and Ming-Syan Chen, "An Asymmetric and Asynchronous Energy Conservation Protocol for Vehicular Networks," U.S. Patent, No. 8,428,514 (assignee: Telcordia Technologies Inc.)