My lab is recruiting graduate students who are passionate about:
Prospective students are welcome to chat with me to better understand our topics and projects. Please first send me an email with your resume (including your transcript and past projects) to arrange a meeting time.
Our ACCV'18 Demo Video is out.
The website for the course CS565600 Deep Learning is now online.
I am currently an associate professor in the Department of Computer Science, National Tsing Hua University (NTHU), Taiwan. Since 2016, I have also served as Division Director for the Division of Academic Information System, Computer & Communication Center of NTHU. My research interests include:
I received my Ph.D. degree in Electrical Engineering from National Taiwan University, Taiwan (2005/09 - 2009/02). Before joining NTHU in 2010, I was a senior research scientist at Telcordia Technologies Inc. (formerly Bellcore) between 2004 and 2010.
I am also a programmer. At my leisure, I write some interesting software.
See DBLP for the boring list of my publications.
Kang-Jun Liu, Tsu-Jui Fu, and Shan-Hung Wu, "Region-Semantics Preserving Image Synthesis," in Asian Conference on Computer Vision (ACCV), December 2018
We study the problem of region-semantics preserving (RSP) image synthesis. Given a reference image and a region specification R, our goal is to train a model that is able to generate realistic and diverse images, each preserving the same semantics as that of the reference image within the region R. This problem is challenging because the model needs to (1) understand and preserve the marginal semantics of the reference region, i.e., the semantics excluding that of any subregion; and (2) maintain the compatibility of any synthesized region with the marginal semantics of the reference region. In this paper, we propose a novel model, called the fast region-semantics preserver (Fast-RSPer), for the RSP image synthesis problem. The Fast-RSPer uses a pre-trained GAN generator and a pre-trained deep feature extractor to generate images without undergoing a dedicated training phase. This makes it particularly useful for interactive applications. We conduct extensive experiments using real-world datasets and the results show that Fast-RSPer can synthesize realistic, diverse RSP images efficiently.
Ting-Yu Cheng, Kuan-Hua Lin, Xinyang Gong, Kang-Jun Liu, and Shan-Hung Wu, "Learning User Perceived Clusters with Feature-Level Supervision," in Advances In Neural Information Processing Systems (NIPS), December 2016
* NIPS is a top conference in the field of Machine Learning.
Semi-supervised clustering algorithms have been proposed to identify data clusters that align with user perceived ones with the aid of side information such as seeds or pairwise constraints. However, traditional side information is mostly at the instance level and subject to sampling bias, where non-randomly sampled instances in the supervision can mislead the algorithms to wrong clusters. In this paper, we propose learning from feature-level supervision. We show that this kind of supervision can be easily obtained in the form of perception vectors in many applications. Then we present novel algorithms, called Perception Embedded (PE) clustering, that exploit the perception vectors as well as traditional side information to find clusters perceived by the user. Extensive experiments are conducted on real datasets and the results demonstrate the effectiveness of PE empirically.
Yan-Fu Liu, Cheng-Yu Hsu, and Shan-Hung Wu, "Non-Linear Cross-Domain Collaborative Filtering via Hyper-Structure Transfer," in Proc. of the 32nd Int'l Conf. on Machine Learning (ICML), July 2015
* ICML is a top conference in the field of Machine Learning.
Cross-Domain Collaborative Filtering (CDCF) exploits the rating matrices from multiple domains to make better recommendations. Existing CDCF methods adopt the sub-structure sharing technique, which can only transfer linearly correlated knowledge between domains. In this paper, we propose the notion of Hyper-Structure Transfer (HST), which requires the rating matrices to be explained by the projections of some more complex structure, called the hyper-structure, shared by all domains, and thus allows non-linearly correlated knowledge between domains to be identified and transferred. Extensive experiments are conducted and the results demonstrate the effectiveness of our HST models empirically.
Shan-Hung Wu, Hao-Heng Chien, Kuan-Hua Lin, and Philip S. Yu, "Learning the Consistent Behavior of Common Users for Target Node Prediction across Social Networks," in Proc. of the 31st Int'l Conf. on Machine Learning (ICML), June 2014
In this work, we study the target node prediction problem: given two social networks, identify those nodes/users from one network (called the source network) who are likely to join another (called the target network, with nodes called target nodes). Although this problem can be solved using existing techniques in the field of cross-domain classification, we observe that in many real-world situations the cross-domain classifiers perform sub-optimally due to the heterogeneity between source and target networks that prevents the knowledge from being transferred. In this paper, we propose learning the consistent behavior of common users to help the knowledge transfer. We first present the Consistent Incidence Co-Factorization (CICF) for identifying the consistent users, i.e., common users that behave consistently across networks. Then we introduce the Domain-UnBiased (DUB) classifiers that transfer knowledge only through those consistent users. Extensive experiments are conducted and the results show that our proposal copes with heterogeneity and improves prediction accuracy.
Shan-Hung Wu, Tsai-Yu Feng, Meng-Kai Liao, Shao-Kan Pi, and Yu-Shan Lin, "T-Part: Partitioning of Transactions for Forward-Pushing in Deterministic Database Systems," in Proc. of the 2016 ACM Int'l Conf. on Management of Data (SIGMOD), June 2016
* ACM SIGMOD is a top conference in the field of Database Systems and Big Data Management.
Deterministic database systems, a type of NewSQL database system, have been shown to yield high throughput on a cluster of commodity machines while ensuring strong consistency between replicas, provided that the data can be well-partitioned on these machines. However, data partitioning can be suboptimal for many reasons in real-world applications. In this paper, we present T-Part, a transaction execution engine that partitions transactions in a deterministic database system to deal with unforeseeable workloads or workloads whose data are hard to partition. By modeling the dependency between transactions as a T-graph and continuously partitioning that graph, T-Part allows each transaction to know which later transactions on other machines will read its writes so that it can push forward the writes to those later transactions immediately after committing. This forward-pushing reduces the chance that the later transactions stall due to the unavailability of remote data. We implement a prototype for T-Part. Extensive experiments are conducted and the results demonstrate the effectiveness of T-Part.
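The forward-pushing idea in the abstract above can be illustrated with a small sketch. This is not the T-Part engine: it assumes the machine assignment of each transaction is already given (T-Part derives it by partitioning the T-graph), and the function name `forward_push_targets` and the tuple layout are hypothetical choices made only for this illustration. It shows how, given a fixed total order, each writer can learn which later transactions on other machines will read its writes.

```python
from collections import defaultdict

def forward_push_targets(txns):
    """Hypothetical helper, not part of T-Part.

    txns: list of (txn_id, machine, read_set, write_set) in the agreed
    total order. Returns {writer_id: {(reader_id, key), ...}} covering only
    readers that run on a different machine than the writer.
    """
    last_writer = {}            # key -> (txn_id, machine) of the latest write
    targets = defaultdict(set)
    for txn_id, machine, reads, writes in txns:
        # A read depends on the most recent earlier write of the same key.
        for key in reads:
            if key in last_writer:
                w_id, w_machine = last_writer[key]
                if w_machine != machine:
                    # Remote reader: the writer should push this key forward.
                    targets[w_id].add((txn_id, key))
        # Record this transaction's writes after handling its own reads.
        for key in writes:
            last_writer[key] = (txn_id, machine)
    return dict(targets)

# Assumed toy workload: two machines "A" and "B", keys "x" and "y".
example = [
    ("T1", "A", {"x"}, {"x"}),
    ("T2", "B", {"x"}, {"y"}),
    ("T3", "A", {"y"}, {"x"}),
    ("T4", "B", {"x", "y"}, set()),
]
pushes = forward_push_targets(example)
```

In this toy workload, T4 reads `y` from T2 on the same machine, so no push is recorded for that pair; only cross-machine reads produce push targets.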
Shan-Hung Wu, Man-Ju Chou, Chun-Hsiung Tseng, Yuh-Jye Lee, Kuan-Ta Chen, "Detecting In-Situ Identity Fraud on Social Network Services: A Case Study on Facebook," in IEEE Systems Journal, 11(4), December 2017
With the growing popularity of Social Networking Services (SNSs), increasing amounts of sensitive information are stored online and linked to SNS accounts. The obvious value of SNS accounts gives rise to the identity fraud problem: the unauthorized, stealthy use of SNS accounts. For example, anxious parents may use their children's SNS accounts to spy on the children's social interaction; or husbands/wives may check their spouses' SNS accounts if they suspect infidelity. Stealthy identity fraud could happen to anyone and seriously invade the privacy of account owners. However, there is no known defense against such behavior when an attacker, possibly an acquaintance of the victim, gets access to the victim's computing devices. In this paper, we propose to extend the use of continuous authentication to detect in situ identity fraud incidents, which occur when the attackers use the same accounts, the same devices, and IP addresses as the victims. Using Facebook as a case study, we show that it is possible to detect such incidents by analyzing SNS users' browsing behavior. Our experimental results demonstrate that the approach can achieve higher than 80% detection accuracy within 2 min, and over 90% after 7 min of observation time.
Ching-Chan Wu, Shan-Hung Wu, Wen-Tsuen Chen, "On Low-Overhead and Stable Data Transmission between Channel-Hopping Cognitive Radios," in IEEE Trans. on Mobile Computing (TMC), 16(9), September 1 2017
Cognitive radios (CRs) are proposed to alleviate the huge need for radio spectrum. There are two known steps for a pair of CRs to start communication: the rendezvous and data-channel negotiation. Although the rendezvous can be achieved by some well-studied techniques such as channel hopping, the strategies for data-channel negotiation receive much less attention and their impact on data transmission performance remains unclear. In this paper, we study existing data-channel negotiation schemes for channel-hopping CRs and observe that 1) for short data transmission, they incur a huge overhead, called notification delay, that severely limits the throughput; and 2) for long data transmission, they lead to large overhead, called interruption delay, in handling the primary user (PU) interruption, which makes performance unstable. By carefully re-examining the steps toward low-overhead and stable data transmission, we argue that a key step, called self-channel selection, is missing and should precede the rendezvous and data channel selection steps. In this step, each CR selects, in a distributed manner, only a small number of the most stable channels to be used in the later steps. To realize the self-channel selection, we introduce a Randomly-Started Stability-Descent (RSSD) selection algorithm. Extensive simulations are conducted and the results demonstrate the effectiveness of RSSD in reducing 1) the notification delay; 2) the chance of PU interruption; and 3) the interruption delay if PU interruption occurs, which overall improve the performance and quality of data transmission.
"Big Data behind Small Apps," in AI Forum 2016, AEARU-CSWT 2016
The marketplace for mobile apps is very crowded and competitive today. Many developers take the "start from small" strategy to gradually acquire the early adopters from a niche and find the product-market-fit. However, this strategy leads to some general misconception that gaining popularity requires the founders to have profound experience in marketing and/or resources.
In this talk, I share my personal experience in running some apps using a scientific approach grounded in the data collected by the apps. Technically, we found that some of the data analysis tasks are more challenging than the well-known "V's" barriers of big data analytics. Commercially, we show that innovative execution/marketing strategies aided by specialized machine learning techniques are key to driving an unfair business advantage.
"Asymmetric Support Vector Machines: A Tutorial," in AI Forum 2010
This is a tutorial on my work published in KDD'08. Many practical applications of classification require the classifier to produce a very low false-positive rate. Although the Support Vector Machine (SVM) has been widely applied to these applications due to its superiority in handling high-dimensional data, there has been relatively little effort, other than setting a threshold or changing the costs of slacks, to ensure a low false-positive rate. In this paper, we propose the notion of Asymmetric Support Vector Machine (ASVM) that takes into account the false-positives and the user tolerance in its objective. Such a new objective formulation allows us to raise the confidence in predicting the positives, and therefore obtain a lower chance of false-positives. We study the effects of the parameters in the ASVM objective and address some implementation issues related to Sequential Minimal Optimization (SMO) to cope with large-scale data. An extensive simulation is conducted and shows that ASVM is able to yield either noticeable improvement in performance or reduction in training time as compared to prior art.
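For context, the simple baseline mentioned above ("setting a threshold") can be sketched in a few lines. This is not ASVM; it only shows how a decision threshold might be chosen on held-out negative examples so that the false-positive rate stays within a user tolerance. The function name `threshold_for_fpr`, the score values, and the tolerance are all assumptions made for illustration; the scores stand in for the outputs of any trained classifier.

```python
import numpy as np

def threshold_for_fpr(neg_scores, max_fpr):
    """Hypothetical helper: return a threshold t such that the fraction of
    negative scores strictly greater than t is at most max_fpr."""
    s = np.sort(np.asarray(neg_scores, dtype=float))   # ascending order
    n = len(s)
    allowed = int(np.floor(max_fpr * n))   # negatives allowed above t
    if allowed >= n:
        return -np.inf                     # every negative may be flagged
    # Place t at the (allowed + 1)-th largest negative score, so at most
    # `allowed` negatives exceed it strictly.
    return s[n - allowed - 1]

# Assumed decision scores of held-out negative examples.
neg_scores = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.3, 0.8, 1.2]
t = threshold_for_fpr(neg_scores, max_fpr=0.25)

# Predict positive only when the score exceeds the threshold.
false_positive_rate = float(np.mean(np.asarray(neg_scores) > t))
```

Raising the threshold this way lowers the false-positive rate but leaves the learned decision boundary unchanged, which is the limitation the ASVM objective is meant to address.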
I am very lucky to be able to work with many smart and creative students. Our lab is located at Delta 723 and 724. People are nice there, although busy sometimes. Please feel free to stop by and chat with us if you want to know more about our research life and projects.
To members: please join this group.
Most of my courses carry heavy workloads. Please make sure you reserve enough time before taking them. If you have any questions or suggestions about my courses, please feel free to send me an email.
This class introduces the concepts and practices of deep learning. The course consists of three parts. In the first part, we give a quick introduction to classical machine learning and review some key concepts required to understand deep learning. In the second part, we discuss how deep learning differs from classical machine learning and explain why it is effective in dealing with complex problems such as image and natural language processing. Various CNN and RNN models will be covered. In the third part, we introduce deep reinforcement learning and its applications.
This course also includes coding labs. We will use Python 3 as the main programming language throughout the course. Some popular machine learning libraries such as Scikit-learn and TensorFlow will be used and explained in detail.
This course provides an overview of the current database management systems in the cloud, and explains how they differ from traditional database systems. The goal is to familiarize students with some well-known implementations such as NoSQL databases, Google BigTable, Google MegaStore, and Google Spanner, and more importantly, to help students make better decisions on the design tradeoffs when configuring/building their own database systems with a particular set of target applications (tenants) in mind.
Modern Web & App Programming
This course presents hands-on labs to familiarize students with the web and app development process, techniques, and tools. Students are asked to build real, useful applications (websites and/or mobile apps) accessible to the public.
Machine Learning (Advanced)
This is an advanced course on machine learning. We will introduce deep generative models and more engineering aspects of machine learning. In addition, we will cover reinforcement learning and how it can leverage game theory and deep learning to interact with environments better.
A detailed syllabus will be announced later.
Past: Spring 2011-2012, Fall 2013-2015
App Entrepreneurship and Implementation
Past: Spring 2013-2014
I like to turn ideas into real things. At my leisure, I write code with my students to transform our research results into fun and easy-to-use software that benefits the general audience:
ElaSQL is a distributed relational database system prototype that aims to offer high scalability, high availability, and elasticity to the on-line transaction processing (OLTP) applications.
VanillaDB is a collection of simple-to-read, fast, and extensible database system components aiming to lower the barrier of new-system prototyping and/or learning the database internals.