Shan-Hung Wu 吳尚鴻

Associate Professor, CS, NTHU
Email: shwu [AT]
Phone/Fax: +886-3-5742961, Office: Delta 603
Address: No. 101, Sec. 2, Kuang-Fu Rd., Hsinchu, Taiwan 30013



I am currently an associate professor at the Department of Computer Science, National Tsing Hua University (NTHU), Taiwan. Since 2016, I have also served as Division Director for the Division of Academic Information System, Computer & Communication Center of NTHU. My research interests include:

  • Machine Learning and Artificial Intelligence;
  • NewSQL Database Systems and Big Data Management;
  • App and Web Intelligence.

I received the Ph.D. degree in Electrical Engineering from the National Taiwan University, Taiwan (2005/09 - 2009/02). Before joining NTHU in 2010, I was a senior research scientist at Telcordia Technologies Inc. (formerly Bellcore) from 2004 to 2010.

I am also a programmer. At my leisure, I write some interesting software.


See DBLP for the boring list of my publications.

Machine Learning

  • Ruo-Chun Tzeng and Shan-Hung Wu, "Distributed, Egocentric Representations of Graphs for Detecting Critical Structures," in Proc. of the 36th Int'l Conf. on Machine Learning (ICML), June 2019

    We study the problem of detecting critical structures using a graph embedding model. Existing graph embedding models lack the ability to precisely detect critical structures that are specific to a task at the global scale. In this paper, we propose a novel graph embedding model, called the Ego-CNNs, that detects precise critical structures efficiently. An Ego-CNN can be jointly trained with a task model and help explain/discover knowledge for the task. We conduct extensive experiments and the results show that Ego-CNNs (1) can lead to comparable task performance as the state-of-the-art graph embedding models, (2) works nicely with CNN visualization techniques to illustrate the detected structures, and (3) is efficient and can incorporate with scale-free priors, which commonly occurs in social network datasets, to further improve the training efficiency.

    PDF Sup. Code

  • Ting-Yu Cheng, Kuan-Hua Lin, Xinyang Gong, Kang-Jun Liu, and Shan-Hung Wu, "Learning User Perceived Clusters with Feature-Level Supervision," in Advances In Neural Information Processing Systems (NIPS), December 2016

    Semi-supervised clustering algorithms have been proposed to identify data clusters that align with user perceived ones via the aid of side information such as seeds or pairwise constraints. However, traditional side information is mostly at the instance level and subject to the sampling bias, where non-randomly sampled instances in the supervision can mislead the algorithms to wrong clusters. In this paper, we propose learning from the feature-level supervision. We show that this kind of supervision can be easily obtained in the form of perception vectors in many applications. Then we present novel algorithms, called Perception Embedded (PE) clustering, that exploit the perception vectors as well as traditional side information to find clusters perceived by the user. Extensive experiments are conducted on real datasets and the results demonstrate the effectiveness of PE empirically.

    PDF Sup. Dataset

  • Yan-Fu Liu, Cheng-Yu Hsu, and Shan-Hung Wu, "Non-Linear Cross-Domain Collaborative Filtering via Hyper-Structure Transfer," in Proc. of the 32nd Int'l Conf. on Machine Learning (ICML), July 2015

    The Cross Domain Collaborative Filtering (CDCF) exploits the rating matrices from multiple domains to make better recommendations. Existing CDCF methods adopt the sub-structure sharing technique that can only transfer linearly correlated knowledge between domains. In this paper, we propose the notion of Hyper-Structure Transfer (HST) that requires the rating matrices to be explained by the projections of some more complex structure, called the hyper-structure, shared by all domains, and thus allows the non-linearly correlated knowledge between domains to be identified and transferred. Extensive experiments are conducted and the results demonstrate the effectiveness of our HST models empirically.

    PDF Sup.

  • Shan-Hung Wu, Hao-Heng Chien, Kuan-Hua Lin, and Philip S. Yu, "Learning the Consistent Behavior of Common Users for Target Node Prediction across Social Networks," in Proc. of the 31st Int'l Conf. on Machine Learning (ICML), June 2014

    In this work, we study the target node prediction problem: given two social networks, identify those nodes/users from one network (called the source network) who are likely to join another (called the target network, with nodes called target nodes). Although this problem can be solved using existing techniques in the field of cross domain classification, we observe that in many real-world situations the cross-domain classifiers perform sub-optimally due to the heterogeneity between source and target networks that prevents the knowledge from being transferred. In this paper, we propose learning the consistent behavior of common users to help the knowledge transfer. We first present the Consistent Incidence Co-Factorization (CICF) for identifying the consistent users, i.e., common users that behave consistently across networks. Then we introduce the Domain-UnBiased (DUB) classifiers that transfer knowledge only through those consistent users. Extensive experiments are conducted and the results show that our proposal copes with heterogeneity and improves prediction accuracy.

    PDF Sup.

Big Data Management and NewSQL Database Systems

  • Yu-Shan Lin, Shao-Kan Pi, Meng-Kai Liao, Ching Tsai, Aaron Elmore, and Shan-Hung Wu, "MgCrab: Transaction Crabbing for Live Migration in Deterministic Database Systems," in Proc. of the VLDB Endowment, Vol. 12, No. 5, Jan 2019

    Recent deterministic database systems have achieved high scalability and high availability in distributed environments given OLTP workloads. However, modern OLTP applications usually have changing workloads or access patterns, so how to make the resource provisioning elastic to the changing workloads becomes an important design goal for a deterministic database system. Live migration, which moves the specified data from a source machine to a destination node while continuously serving the incoming transactions, is a key technique required for the elasticity. In this paper, we present MgCrab, a live migration technique for a deterministic database system, that leverages the determinism to maintain the consistency of data on the source and destination nodes at very low cost during a migration period. We implement MgCrab on an open-source database system. Extensive experiments were conducted and the results demonstrate the effectiveness of MgCrab.


  • Shan-Hung Wu, Tsai-Yu Feng, Meng-Kai Liao, Shao-Kan Pi, and Yu-Shan Lin, "T-Part: Partitioning of Transactions for Forward-Pushing in Deterministic Database Systems," in Proc. of the 2016 ACM Int'l Conf. on Management of Data (SIGMOD), June 2016

    Deterministic database systems, a type of NewSQL database systems, have been shown to yield high throughput on a cluster of commodity machines while ensuring the strong consistency between replicas, provided that the data can be well-partitioned on these machines. However, data partitioning can be suboptimal for many reasons in real-world applications. In this paper, we present T-Part, a transaction execution engine that partitions transactions in a deterministic database system to deal with the unforeseeable workloads or workloads whose data are hard to partition. By modeling the dependency between transactions as a T-graph and continuously partitioning that graph, T-Part allows each transaction to know which later transactions on other machines will read its writes so that it can push forward the writes to those later transactions immediately after committing. This forward-pushing reduces the chance that the later transactions stall due to the unavailability of remote data. We implement a prototype for T-Part. Extensive experiments are conducted and the results demonstrate the effectiveness of T-Part.


App and Web Intelligence

  • Kang-Jun Liu, Tsu-Jui Fu, and Shan-Hung Wu, "Region-Semantics Preserving Image Synthesis," in Asian Conference on Computer Vision (ACCV), December 2018

    We study the problem of region-semantics preserving (RSP) image synthesis. Given a reference image and a region specification R, our goal is to train a model that is able to generate realistic and diverse images, each preserving the same semantics as that of the reference image within the region R. This problem is challenging because the model needs to (1) understand and preserve the marginal semantics of the reference region; i.e., the semantics excluding that of any subregion; and (2) maintain the compatibility of any synthesized region with the marginal semantics of the reference region. In this paper, we propose a novel model, called the fast region-semantics preserver (Fast-RSPer), for the RSP image synthesis problem. The Fast-RSPer uses a pre-trained GAN generator and a pre-trained deep feature extractor to generate images without undergoing a dedicated training phase. This makes it particularly useful for the interactive applications. We conduct extensive experiments using the real-world datasets and the results show that Fast-RSPer can synthesize realistic, diverse RSP images efficiently.


  • Shan-Hung Wu, Man-Ju Chou, Chun-Hsiung Tseng, Yuh-Jye Lee, Kuan-Ta Chen, "Detecting In-Situ Identity Fraud on Social Network Services: A Case Study on Facebook," in IEEE Systems Journal, 11(4), December 2017

    With the growing popularity of Social Networking Services (SNSs), increasing amounts of sensitive information are stored online and linked to SNS accounts. The obvious value of SNS accounts gives rise to the identity fraud problem: unauthorized, stealthy use of SNS accounts. For example, anxious parents may use their children's SNS accounts to spy on the children's social interaction; or husbands/wives may check their spouses' SNS accounts if they suspect infidelity. Stealthy identity fraud could happen to anyone and seriously invade the privacy of account owners. However, there is no known defense against such behavior when an attacker, possibly an acquaintance of the victim, gets access to the victim's computing devices. In this paper, we propose to extend the use of continuous authentication to detect the in situ identity fraud incidents, which occur when the attackers use the same accounts, the same devices, and IP addresses as the victims. Using Facebook as a case study, we show that it is possible to detect such incidents by analyzing SNS users' browsing behavior. Our experiment results demonstrate that the approach can achieve higher than 80% detection accuracy within 2 min, and over 90% after 7 min of observation time.


  • Ching-Chan Wu, Shan-Hung Wu, Wen-Tsuen Chen, "On Low-Overhead and Stable Data Transmission between Channel-Hopping Cognitive Radios," in IEEE Trans. on Mobile Computing (TMC), 16(9), September 1 2017

    Cognitive radios (CRs) are proposed to alleviate the huge need for radio spectrum. There are two known steps for a pair of CRs to start communication: the rendezvous and data-channel negotiation. Despite that the rendezvous can be achieved by some well-studied techniques such as the channel hopping, the strategies for data-channel negotiation receive much less attention and their impact on data transmission performance remains unclear. In this paper, we study existing data-channel negotiation schemes for channel-hopping CRs and observe that 1) for short data transmission, they incur a huge overhead, called notification delay, that severely limits the throughput; and 2) for long data transmission, they lead to large overhead, called interruption delay, in handling the PU interruption, which makes performance unstable. By carefully re-examining the steps toward low-overhead and stable data transmission, we argue that a key step, called self-channel selection, is missing and should precede the rendezvous and data channel selection steps. In this step, each CR selects, in a distributed manner, only a small amount of the most stable channels to be used in the later steps. To realize the self-channel selection, we introduce a Randomly-Started Stability-Descent (RSSD) selection algorithm. Extensive simulations are conducted and the results demonstrate the effectiveness of RSSD in reducing 1) the notification delay; 2) the chance of PU interruption; and 3) the interruption delay if PU interruption occurs, which overall improve the performance and quality of data transmission.


Professional Activities


  • PC member of the 28th Int'l Joint Conference on Artificial Intelligence (IJCAI), August 2019
  • PC member of the 33rd AAAI Conference on Artificial Intelligence (AAAI), February 2019
  • PC member of the 27th International Joint Conference on Artificial Intelligence (IJCAI), July 2018
  • PC member of the 23rd European Conference on Artificial Intelligence (ECAI), July 2018
  • Program Chair of the 21st International Conference on Extending Database Technology (EDBT) Demo Track, March 2018
  • PC member of the 2017 ACM Int'l Conf. on Information and Knowledge Management (CIKM)
  • PC member of the 2017 ACM Int'l Conf. on Management of Data (SIGMOD) Demo
  • Program Chair of the 2016 Conference on Technologies and Applications of Artificial Intelligence (TAAI), November 2016


  • "Deep Transfer Learning: State-of-the-Arts and Pending Problems," in NTU Math Department, 2018

  • "Modern Database System Design for Applications in the Cloud," in NTU CS Department, 2018

  • "Big Data behind Small Apps," in AI Forum 2016, AEARU-CSWT 2016

    The marketplace for mobile apps is very crowded and competitive today. Many developers take the "start from small" strategy to gradually acquire the early adopters from a niche and find the product-market-fit. However, this strategy leads to some general misconception that gaining popularity requires the founders to have profound experience in marketing and/or resources.

    In this talk, I share my personal experience in running some apps using a scientific approach based on the data collected by the apps. Technically, we found that some of the data analysis tasks are more challenging than the well-known "V's" barriers of big data analytics. Commercially, we show that innovative execution/marketing strategies aided by specialized machine learning techniques are a key for driving unfair business advantage.


  • "Asymmetric Support Vector Machines: A Tutorial," in AI Forum 2010

    This is a tutorial on my work published in KDD'08. Many practical applications of classification require the classifier to produce a very low false-positive rate. Although the Support Vector Machine (SVM) has been widely applied to these applications due to its superiority in handling high dimensional data, there has been relatively little effort, other than setting a threshold or changing the costs of slacks, to ensure the low false-positive rate. In this paper, we propose the notion of Asymmetric Support Vector Machine (ASVM) that takes into account the false-positives and the user tolerance in its objective. Such a new objective formulation allows us to raise the confidence in predicting the positives, and therefore obtain a lower chance of false-positives. We study the effects of the parameters in ASVM objective and address some implementation issues related to the Sequential Minimal Optimization (SMO) to cope with large-scale data. An extensive simulation is conducted and shows that ASVM is able to yield either noticeable improvement in performance or reduction in training time as compared to the previous arts.



  • New Faculty Research Award, NTHU, 2015
  • Outstanding Research Award, EECS, NTHU, 2014
  • Outstanding Teaching Award, EECS, NTHU, 2013
  • IBM Ph.D. Fellowship Award, 2008 (70/575 worldwide)


I am lucky to work with many smart and creative students. Our lab is located at Delta 723 and 724. People are nice there, although busy sometimes. Please feel free to stop by and chat with us if you want to know more about our research life and projects.

To members: please join this group.


To prospective students: most of my courses give heavy loads. Please make sure you reserve enough time before taking them. If you have any questions or suggestions about my courses, please feel free to send me an email.

  • Deep Learning

    This class introduces the concepts and practices of deep learning. The course consists of three parts. In the first part, we give a quick introduction to classical machine learning and review some key concepts required to understand deep learning. In the second part, we discuss how deep learning differs from classical machine learning and explain why it is effective in dealing with complex problems such as image and natural language processing. Various CNN and RNN models will be covered. In the third part, we introduce deep reinforcement learning and its applications.

    This course also gives coding labs. We will use Python 3 as the main programming language throughout the course. Some popular machine learning libraries such as Scikit-learn and TensorFlow will be used and explained in detail.

    Fall 2019

  • Database Systems

    This course provides an overview of the current database management systems in the cloud, and explains how they are different from traditional database systems. The goal is to get students familiar with some well-known implementations like NoSQL databases, Google BigTable, Google MegaStore, and Google Spanner etc., and more importantly, to help students make better decisions on the design tradeoffs when configuring/building their own database systems given a particular set of target applications (tenants) in mind.

    Spring 2019

  • Modern Web & App Programming

    This course presents hands-on labs for students to be familiar with the web and app development process, techniques, and tools. Students are asked to build real, useful applications (websites and/or mobile apps) accessible to the public.

    The classes are divided into three parts. First, we give a primer on common web technologies such as HTTP, HTML, CSS, and JavaScript. We cover different programming paradigms, including OOP and functional programming. Handy tools such as Git are covered to get students familiar with project-based and team-based development. In the second part, we introduce modern web development techniques such as React, Redux, Node.js, and Firebase and some reusable templates from Bootstrap and Reactstrap that speed up the development process. Last, we extend our horizon to the mobile development landscape by introducing React Native. We also give case studies on how to leverage Machine Learning and Data Mining algorithms to convert raw user data into app intelligence.

    Spring 2019

  • Machine Learning (Advanced)

    This is an advanced course on machine learning. We will introduce deep generative models and more engineering aspects of machine learning. In addition, we will cover reinforcement learning and how it can leverage game theory and deep learning to interact with environments better.

    Detailed syllabus will be announced later.

    Past: Spring 2011-2012, Fall 2013-2015

  • App Entrepreneurship and Implementation

    Past: Spring 2013-2014


I like to turn ideas into real things. At my leisure, I write code with my students to transform our research results into fun and easy-to-use software that benefits the general audience:


ElaSQL is a distributed relational database system prototype that aims to offer high scalability, high availability, and elasticity to the on-line transaction processing (OLTP) applications.


VanillaDB is a collection of simple-to-read, fast, and extensible database system components aiming to lower the barrier of new-system prototyping and/or learning the database internals.


Flora is an app that helps you stay focused on a task with your friends. It's a companion to the Forest app initiated by a former DataLab member Shao-Kan Pi.


  1. Shan-Hung Wu, Shao-Kan Pi, Ting-Yu Cheng, "Method and System for One-Time Connection," U.S. Patent App. No. US-14/710,599 (assignee: NTHU, pending)
  2. Shan-Hung Wu, Meng-Kai Liao, Shao-Kan Pi, Yu-Shan Lin, "Deterministic Database System and Data Transferring Method Thereof," U.S. Patent App. No. US-14/693,903 (assignee: NTHU, pending)
  3. Shan-Hung Wu, Cheng-Yu Hsu, You-Jhih Wong, "Method and Electronic Device for Rating Outfit," U.S. Patent App. No. US-14/601,256 (assignee: NTHU, pending)
  4. Shan-Hung Wu, Meng-Ren Chen, "Method, Server, and Apparatus for Lecture Feedback," U.S. Patent App. No. US-14/583,202 (assignee: NTHU, pending)
  5. Shan-Hung Wu, Shao-Kan Pi, Ting-Yu Cheng, "Electronic Apparatus and Soft Locking Method Thereof," U.S. Patent App. No. US-14/576,171 (assignee: NTHU, pending)
  6. Shan-Hung Wu, Chung-Min Chen, and Ming-Syan Chen, "An Asymmetric and Asynchronous Energy Conservation Protocol for Vehicular Networks," U.S. Patent, No. 8,428,514 (assignee: Telcordia Technologies Inc.)