IEEE International Conference on Data Mining

    The 1st International Workshop on Cross-disciplinary Data Exchange and Collaboration (CDEC)

    November 17th, 2018 in conjunction with IEEE ICDM 2018, Singapore

    Schedule

    Workshop Day: 17th November, 2018

    Room: Leo 2 (Convention Center at Resorts World Sentosa)

    7:45 OPENING
    8:00 Welcome and Opening Remarks

    [8:00-8:15]

    Teruaki Hayashi

    Cross-disciplinary Data Exchange and Collaboration in the Era of Data 3.0

    8:15 Platforms for Cross-disciplinary Data Exchange and Collaboration

    [8:15-8:35]

    Daiji Iwasa

    Preliminary Case Study on Data Utilization and Collaboration on the Web

    Abstract

    Recently, data holders have been collecting various kinds of data thanks to improvements in Internet of Things (IoT) technologies. At the same time, data analysts can examine even large datasets thanks to the spread of analysis methods and tools and to strong computing power. The critical point for accelerating data utilization is communication among data stakeholders: data analysts should consider the purpose for which data holders collect data. However, communication among stakeholders is hard when some of them are not familiar with data. Innovators Marketplace on Data Jackets (IMDJ) [5] is a workshop method that tackles this problem. In an IMDJ workshop, participants state their requirements and create scenarios for meeting these requirements based on Data Jackets. A Data Jacket (DJ) [4] is a framework for describing structured information about data in natural language, which enables those who are not familiar with data to take part in data-based discussion. In this paper, we introduce a platform called Web-IMDJ for conducting IMDJ workshops on the web. Web-IMDJ not only reduces the burden of running workshops but also enables remote participation. By conducting a workshop on Web-IMDJ as a case study, we found that the number of ideas is comparable to that of previous IMDJ workshops and that Web-IMDJ is superior in its capacity for participants.

    8:35

    [8:35-8:55]

    Yusuke Ejiri, Sasaki Hiromichi, Ikeda Eiji

    Realization of Data Exchange and Utilization Society by Blockchain and Data Jacket

    Abstract

    Based on the recent rapid progress of big data analysis and AI technology, it is expected that large amounts of data and the latest analysis technologies will be combined with people’s creativity, accelerating co-creation across industries to create innovative services and products. To realize this, it is essential to build a system for securely sharing data across industries, utilizing it, and creating value continuously. In this document, we introduce Fujitsu’s solution development toward a society of data exchange and utilization. In particular, we focus on the “consortium” structure, in which people are connected by mutual trust; this is a key factor in reducing the various risks and the sense of insecurity around handling data, and thereby in accelerating data utilization.

    8:55

    [8:55-9:15]

    Haiqian Gu, Jie Wang, Ziwen Wang, Bojin Zhuang, Wenhao Bian, Fei Su

    Cross-platform Modeling of Users’ Behavior on Social Media

    Abstract

    With the booming development and popularity of mobile applications, different verticals accumulate abundant data on user information and social behavior, which are spontaneous, genuine and diversified. However, each platform portrays users in only certain aspects, which makes it difficult to combine those internet footprints. In our research, we propose a modeling approach to analyze users’ online behavior across different social media platforms. Structured and unstructured data of the same users shared by NetEase Music and Sina Weibo were collected for cross-platform analysis of correlations between music preference and other user characteristics. Based on music tags of genre and mood, five genre clusters and four mood clusters were formed by clustering users’ collected song lists with the K-means method. Moreover, with the help of Weibo user data, correlations between music preference (i.e. genre, mood) and Big Five personalities (BFPs) and basic information (e.g. gender, resident region, tags) were comprehensively studied, building up full-scale user portraits with finer grain. Our findings indicate that people’s music preference could be linked with their real social activities. For instance, people living in mountainous areas generally prefer folk music, while those in urban areas like pop music more. Interestingly, dog lovers could love sad music more than cat lovers. Moreover, our proposed cross-platform modeling approach could be adapted to other verticals, providing an automatic online way to profile users more precisely and comprehensively.
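
The clustering step mentioned in this abstract can be illustrated with a minimal sketch. It assumes each user is represented by the normalized frequency of genre tags in their song list and uses a plain Lloyd's-algorithm K-means seeded with the first k users. The tag vocabulary, the user data, and the function names are hypothetical; the abstract does not specify the authors' features or implementation.

```python
from collections import Counter


def tag_vector(song_tags, vocabulary):
    """Turn one user's list of song tags into a normalized frequency vector."""
    counts = Counter(song_tags)
    total = sum(counts[t] for t in vocabulary) or 1
    return [counts[t] / total for t in vocabulary]


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=20):
    """Plain Lloyd's algorithm; the first k points seed the centroids."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid for every point.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical genre vocabulary and user song lists (illustrative only).
GENRES = ["folk", "pop", "rock"]
users = {
    "u1": ["folk", "folk", "pop"],
    "u2": ["pop", "pop", "pop", "rock"],
    "u3": ["folk", "folk", "folk"],
    "u4": ["pop", "pop", "rock"],
}
vectors = [tag_vector(tags, GENRES) for tags in users.values()]
labels, centroids = kmeans(vectors, k=2)
```

In this toy data the folk-heavy users (u1, u3) and the pop-heavy users (u2, u4) end up in separate clusters. A real study would likely use a library implementation with several random restarts rather than fixed seeding.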

    9:15 Discussion
    9:25 Special Speech

    [9:25-9:50]

    Hiroshi Mano (CEO, EverySense, Inc.)

    Overview of Data Trading Market and Operator

    Abstract

    Today, various things are connected via the Internet through IoT, and big data are observed and generated by them. Utilizing these big data through advanced AI-based analysis is widely expected to become a basis for new social growth. Currently, many companies and institutions possess various data, but efforts to distribute these data across industry, academia and government and to cooperate are not sufficient.
    In this session, I will explain the outline and requirements of the data trading market, which is intended to promote data distribution, together with an actual implementation. Furthermore, in Japan, the DTA (Data Trading Alliance), which promotes the standardization of concrete systems and cooperative schemes for data distribution through the cooperation of industry, academia and government, was established last year. I will therefore also introduce an overview of the DTA as its founder.

    Biography

    Hiroshi Mano (Ph.D.) is the CEO of EverySense, Inc. and Founder and Secretary General of the Data Trading Alliance (DTA). EverySense provides data exchange services mediating between sensor data providers and sensor data requesters. DTA is one of the largest data trading alliances, composed of 119 companies in Japan (as of October 2018).
    He established Root Inc. in 1993, developed digital wireless communication devices, and proposed a total network solution for the convergence of analog and digital technologies. In addition, he has participated in numerous public and private councils and R&D initiatives for WLAN-based high-speed mobile communications system development, technology enabling and commercialization, wireless adaptation and local information networking. He has served as a chair for the IEEE 802.11 TGai WG for international standardization since 2010, and received the Japan Communication Minister's Award 2017 for Information and Communication Technology Prize for his standardization efforts. In 2014, he established EverySense, Inc. in Silicon Valley in the U.S. EverySense developed an IoT information trading platform and acquired a national patent for it in Japan. He is Founder and Secretary General of the Data Trading Alliance (DTA), an industry-academic-government alliance formed with the cooperation of the Japan Cabinet Office, the Japan Ministry of Internal Affairs and Communications, and the Japan Ministry of Economy. He has been deeply involved, in Japan and overseas, in standardization and rule proposals in the fields of wireless communications, the Internet, data trading, etc., and contributed to the Big Data strategy proposal at the G7 ICT Ministerial Meeting in Turin in 2017.

    9:50 Discussion
    10:00 Coffee Break
    10:20 Case Studies on Cross-disciplinary Data Analysis

    [10:20-10:40]

    Masanori Hirano, Hiroki Sakaji, Shoko Kimura, Kiyoshi Izumi, Hiroyasu Matsushima, Shintaro Nagao, Atsuo Kato

    Selection of Related Stocks using Financial Text Mining

    Abstract

    We propose a method to select and rank stocks related to a given theme. The proposed method has two stages: obtaining related words, and selecting related stocks based on the obtained related words. First, starting from the given theme word, the method selects words with high similarity using an ensemble of word2vec models. We then modify the similarity based on the results of word matches in information from companies, including investor relations documents and homepages. Second, the top-10 similar words are matched against the company data, and we extract sentences related to the given theme from the data of each company. We then calculate each company's final similarity by summing the modified similarities of the related words in the extracted sentences. Finally, we select the top-n related stocks based on this final similarity. Targeting Japanese documents, companies, and stocks, we achieved 0.49 accuracy (precision, recall, and F1-value), which is better than random selection. In addition, by comparing the results obtained using a completely different theme, we verified that the proposed method works correctly and can filter related stocks effectively.
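
The second stage described in this abstract (matching related words to company text, summing their similarities, and keeping the top-n companies) might be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the similarity values stand in for the word2vec-ensemble scores after modification, the company names and sentences are hypothetical, and whitespace tokenization is a simplification that would not suit the Japanese documents the paper targets.

```python
def rank_companies(word_similarity, company_sentences, top_n=2):
    """Score each company by summing the similarity of every related word
    found in its sentences, then return the top_n (company, score) pairs."""
    scores = {}
    for company, sentences in company_sentences.items():
        score = 0.0
        for sentence in sentences:
            # Assumption: naive whitespace tokenization of each sentence.
            tokens = set(sentence.lower().split())
            score += sum(
                sim for word, sim in word_similarity.items() if word in tokens
            )
        # Companies with no matching related word are treated as unrelated.
        if score > 0:
            scores[company] = score
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:top_n]


# Hypothetical similarities of related words to the theme "battery".
word_similarity = {"battery": 1.0, "lithium": 0.8, "electric": 0.6}

# Hypothetical company text, e.g. drawn from investor relations documents.
company_sentences = {
    "CompanyA": ["we produce lithium battery cells", "our electric vehicle unit grew"],
    "CompanyB": ["we sell office furniture"],
    "CompanyC": ["electric power distribution services"],
}

ranking = rank_companies(word_similarity, company_sentences)
```

Here CompanyA ranks first because three related words appear in its text, CompanyC follows with one match, and CompanyB is dropped because none of the related words occur.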

    10:40

    [10:40-11:00]

    Pei Zhou, Muhao Chen, Kai-Wei Chang, Carlo Zaniolo

    Quantification and Analysis of Scientific Language Variation Across Research Fields

    Abstract

    Quantifying differences in terminologies from various academic domains has been a longstanding problem yet to be solved. We propose a computational approach for analyzing linguistic variation among scientific research fields by capturing the semantic change of terms based on a neural language model. The model is trained on a large collection of literature in five computer science research fields, for which we obtain field-specific vector representations for key terms, and global vector representations for other words. Several quantitative approaches are introduced to identify the terms whose semantics have drastically changed, or remain unchanged across different research fields. We also propose a metric to quantify the overall linguistic variation of research fields. After quantitative evaluation on human-annotated data and qualitative comparison with other methods, we show that our model can improve cross-disciplinary data collaboration by identifying terms that potentially induce confusion during interdisciplinary studies.

    11:00 Intelligent Systems for Cross-disciplinary Data Exchange

    [11:00-11:20]

    Akinori Abe, Yuki Hayashi

    How to determine the necessary knowledge in the conflicting situation?

    Abstract

    Innovators Marketplace on Data Jackets (IMDJ) is also called an Innovation Game. The Innovation Game can be seen as a game in which a new product is obtained by combining various techniques, materials and previous products. During an IMDJ session, participants generate hypotheses to explain the presented requirements. Usually they refer to existing techniques or consult the network to obtain the necessary techniques. In the IMDJ setting, referring to existing techniques is not very difficult, because the system provides an ample database. It is, however, rather difficult to determine which techniques are necessary for the requirements, because the IMDJ database is not fully open to the public. In that case, participants must guess or estimate what is inside the database. Previously, I discussed how to generate missing knowledge; with the proposed method, it is possible to generate missing knowledge. This is a critical issue for performance in IMDJ. However, a more serious problem exists: sometimes it is necessary to consider inconsistent (conflicting) requirements. In such a situation, it is necessary to satisfy both requirements as far as possible. In this paper, I will discuss hypothesis generation in such an inconsistent situation. For this situation, I previously introduced paraconsistent logic to deal with contradictions.

    11:20

    [11:20-11:40]

    Masafumi Ifuku, Noriyuki Kushiro, Yusuke Aoyama

    Requirements Definition with Extended Goal Graph

    Abstract

    Significant requirements are often discovered during discussion of tradeoffs and conflicts between stakeholders in requirements meetings. A method for handling tradeoffs and conflicts would therefore be a breakthrough in acquiring significant requirements that are difficult for requirements analysts to elicit. In this paper, the Extended Goal Graph (EGG) is proposed as a method for handling tradeoffs and conflicts by providing traceability between requirements analysis and system design. We developed the EGG system to support the requirements definition process with the EGG. The system was applied to a requirements definition meeting among medical doctors and potential patients for selecting the proper examinations required for diagnosing disease when shadows were found on the lungs.

    11:40 Discussion
    11:50 Closing Remarks

    [11:50-12:00]

    Prof. Yukio Ohsawa

    From MoDAT Series (2013-2017) toward CDEC

    12:00 CLOSING
    NOTE
    [Social Tour and Gathering]

    We are planning a social tour in the afternoon to visit a distinguished person at one of the successful internet companies in Singapore. We are also going to have an exchange party in Singapore city in the evening, so please join us. The details will be announced via e-mail or this Web page.


    [Awards]

    These awards celebrate the most inspiring and effective presentations and papers, delivered by impactful, confident and engaging speakers. The details will be announced via e-mail or this Web page.

    Contact

    Dr. Teruaki Hayashi (co-chair)

    Email: hayashi -at- sys.t.u-tokyo.ac.jp

    Copyright © Ohsawa Lab All Rights Reserved.