IEEE International Conference on Big Data

    The 4th International Workshop on Cross-disciplinary Data Exchange and Collaboration (CDEC2021)

    –System Modeling and Data Origination for Social Implementation–

    December 15th-18th, 2021, in conjunction with IEEE BigData 2021, Orlando, FL, USA (online conference)


    Schedule

    Special Session Day: 15th December, 2021 (9AM - 1PM, USA Eastern Standard Time)

    OPENING
    9:00

    [9:00-9:05]

    Teruaki Hayashi

    Welcome and Opening Remarks

    9:05

    [9:05-9:25]

    Teruaki Hayashi, Takumi Shimizu, Yoshiaki Fukami, Hiroki Sakaji, and Hiroyasu Matsushima

    Growing Process of Communities on Data Platforms: Case Analysis of a COVID-19 Dataset

    Abstract

    In recent years, expectations have grown for creating new businesses and improving the value of existing services by exchanging data across different fields. Data stored in-house within organizations have become a new source of innovation. While there is a strong need to create value from data, determining the value of data is not easy, because a wide range of factors must be considered, such as data pricing, acquisition cost, usage value, and update frequency. In this study, we observe communication, such as the sharing of know-how in data exchange and analysis, and discuss how a community grows on a data platform. For the experiment, we focused on the data community during the COVID-19 crisis and used a unique dataset from Kaggle, a data platform that hosts data analysis competitions. The results suggest that user actions differ between discussions of the dataset and discussions of the analysis. Moreover, providing topics, encouraging user participation, and stimulating activity in the early stages after a dataset is released are essential for forming a data community. We argue that actions on data analyses, such as comments and votes, are also crucial for fostering a common understanding of data value.

    9:25

    [9:25-9:45]

    Koki Arai

    Data Distribution and Competition Law Issues

    Abstract

    This study summarizes the issues in promoting data distribution, how large digital platform operators conduct themselves in the market, and the actual regulations and initiatives directed at them. In particular, it briefly examines the current state of litigation under US antitrust law and the issues involved. The study discusses the following five points. First, it highlights the delineation of the relevant market, which is the entry point of competition analysis. Second, it examines the existence of monopoly power, which requires an analysis of dominant firms. Third, it analyzes conduct that forms, maintains, and strengthens market power. Fourth, it analyzes consumer harm. Fifth, it discusses whether antitrust actions are needed to solve the problems and considers remedies.

    9:45

    [9:45-10:05]

    Yoshiaki Fukami, Takumi Shimizu, and Hiroyasu Matsushima

    The Impact of Decentralized Identity Architecture on Data Exchange

    Abstract

    Digital identity is indispensable for the promotion of data exchange and the diffusion of digital government. The development and standardization of Decentralized Identifiers (DIDs) based on blockchain technology are underway. Through a comparison between existing centralized IDs and decentralized IDs, we examine the impact of decentralized identity architecture on data exchange.

    10:05

    [10:05-10:25]

    Naoki Watanabe

    A Numerical Study with Experimental Data on Risk-Averse Subcontractors in Procurement Auctions with Subcontract Bids

    Abstract

    Prime contractors often solicit estimates for part of their work from potential subcontractors who can perform that work on their behalf at lower cost, before submitting their own estimates in procurement auctions. Using a simple model that captures this aspect of procurement auctions, this note identifies, through a numerical study with experimental data, an important factor for obtaining clearer results in experimental sessions. The unclear results observed in previously conducted sessions appear to be due to extreme patterns in the distribution of risk aversion rates among the participants. Future experiments should therefore be designed to control for participants' extreme risk aversion rates.

    10:25 Coffee Break
    10:35

    [10:35-11:15]

    Invited Talk

    Hotaka Yonezawa
    Smile Spirits (Representative)

    How to use data & video analysis in sports

    Abstract

    Data analysis and video analysis have long been used in sports. In recent years in particular, data analysis using sensor technology has been used to help prevent injuries and adjust practice menus. Although the technology has evolved and equipment prices have dropped compared with 10 years ago, it has not become widespread and is currently available only to top athletes, such as professionals and national team members. In this session, I will explain with examples how data analysis and video analysis are used in sports. How can we make them available to more athletes and sports players? And how should athletes ideally make use of data and video analysis? I would like to take time to think about these questions.

    Biography

    From 1998 to 2008, he competed around the world as a moguls skier. At the same time, he studied sports coaching using video analysis and served as a private moguls coach for the Korean national team. He currently applies this research to coaching athletes in sports such as soccer, basketball, fencing, and boccia using video analysis. In 2016, drawing on sports data and video analysis, he developed "winning habit coaching" to sharpen athletes' senses while drawing out their independence.
    In recent years, his coaching has also produced world champions and contributed to medal wins at the Tokyo Olympics and Paralympics. In addition, he has been deputy representative director of the Japan E-Coaching Association and has been training analysts and analysis coaches as a professional performance analyst since 2007. He has also served for many years as a part-time lecturer at Takushoku University and Kokushikan University, teaching video analysis coaching in lectures and practical training and contributing to the development of many teachers and coaches.

    11:15

    [11:15-11:35]

    Rei Taguchi, Hikaru Watanabe, Masanori Hirano, Masahiro Suzuki, Hiroki Sakaji, Kiyoshi Izumi, and Kenji Hiramatsu

    Market Trend Analysis Using Polarity Index Generated from Analyst Reports

    Abstract

    This study examines whether analysts’ sentiment toward individual stocks is useful in predicting macroeconomic indexes. To do so, natural language processing is used to create polarity indexes from analyst reports. The created polarity indexes were then analyzed together with various macroeconomic indexes using a vector autoregressive (VAR) model. The analysis confirmed that the polarity indexes do have an impact on indexes such as prices, exchange rates, and government bonds.
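
    For readers unfamiliar with the model named in this abstract, the general textbook form of a vector autoregressive model of order p is given below. The symbols are assumptions chosen for illustration and are not taken from the paper: y_t would stack the polarity index and the macroeconomic indexes at time t, c is a constant vector, the A_i are coefficient matrices, and the last term is an error vector.

```latex
\[
  \mathbf{y}_t = \mathbf{c} + \sum_{i=1}^{p} A_i \,\mathbf{y}_{t-i} + \boldsymbol{\varepsilon}_t
\]
```

    In this form, an off-diagonal entry of A_i measures how the lagged value of one series (for example, a polarity index) relates to the current value of another (for example, an exchange rate). The lag order and variable set actually used in the paper are not stated in the abstract.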

    11:35

    [11:35-11:55]

    Akinori Abe and Kotaro Fukushima

    How can we collect satisfactory answers in art appreciation experiment?

    Abstract

    We have been studying how people appreciate art. For this study, we conducted several experiments and obtained several results. However, we also encountered some problems. One is that, for free descriptions, it is rather difficult to obtain answers of sufficient number or length from participants, or it takes a lot of time to collect full free descriptions. In a previous experiment, we addressed the question "how will the value and preference of art be changed according to offered factors?" We then conducted an experiment to determine which factor (information) changes viewers' sense of value and preference in art appreciation. In this paper, we analyze the answers from the previous experiment to find which type of information or question leads participants to write free descriptions of sufficient length. In addition, we discuss the problem in the context of IMDJ (Innovators Marketplace on Data Jackets).

    11:55

    [11:55-12:15]

    Hiroki Sakaji, Teruaki Hayashi, Yoshiaki Fukami, Takumi Shimizu, Hiroyasu Matsushima, and Kiyoshi Izumi

    Retrieving of Data Similarity using Metadata on a Data Analysis Competition Platform

    Abstract

    In recent years, rather than keeping data and analysis skills in-house, there has been much interest in releasing data analysis knowledge widely on the web. A data exchange platform is a type of digital platform on which stakeholders, e.g., data owners, users, and analysts, exchange data. However, the datasets handled on such platforms are independently acquired and stored by the data providers for their own purposes. These datasets are not designed to be coordinated or combined, and there is currently little information available for discussing their systematic organization and combination. In this study, we focus on metadata, that is, summary information about data, and examine the similarity of data on a data exchange platform using natural language processing. In our experiments, we use metadata from the data exchange platform Kaggle. To compare the similarity of the data, our method uses word2vec and BERT as vectorization methods to convert data descriptions into vectors. It then measures the distance between vectors by calculating the cosine similarity of each pair. From the experimental results, we found that Kaggle has the same characteristics as other data exchange platforms. Additionally, the results indicate that the natural language processing-based method is useful for extracting similar data pairs.
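
    The pairwise comparison described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `embed` function below is a toy bag-of-words stand-in for word2vec or BERT vectors, and the function names and dataset descriptions are hypothetical. Only the cosine similarity step follows the abstract directly.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for word2vec/BERT (assumption): a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine_similarity(u, v):
    """Cosine similarity between two sparse vectors stored as term -> weight maps."""
    dot = sum(u[term] * v[term] for term in u.keys() & v.keys())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # an empty description is treated as similar to nothing
    return dot / (norm_u * norm_v)


def most_similar_pairs(descriptions, top_k=3):
    """Score every pair of dataset descriptions and return the top_k most similar."""
    vectors = {name: embed(text) for name, text in descriptions.items()}
    names = sorted(vectors)
    scored = [
        (cosine_similarity(vectors[a], vectors[b]), a, b)
        for i, a in enumerate(names)
        for b in names[i + 1:]
    ]
    scored.sort(key=lambda item: -item[0])  # highest similarity first
    return scored[:top_k]


if __name__ == "__main__":
    # Hypothetical metadata descriptions, invented for illustration only.
    metadata = {
        "dataset_a": "covid cases by country",
        "dataset_b": "covid cases by region",
        "dataset_c": "stock prices daily",
    }
    for score, left, right in most_similar_pairs(metadata):
        print(f"{left} ~ {right}: {score:.2f}")
```

    Swapping `embed` for a function that returns dense vectors from a trained model would leave the pairing logic unchanged, provided `cosine_similarity` is adapted to dense input.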

    12:15

    [12:15-12:35]

    Noriyuki Kushiro and Yusuke Ogata

    Supporting Test Case Design on Reasoning Scheme with Natural Language Processing Technique

    Abstract

    The purpose of this study is to support test case design by introducing natural language processing techniques and reasoning schemes into the design process. Sentences containing a logical relationship of implication (key sentences) are extracted from a specification document using natural language processing techniques. Supplementary explanatory sentences (subordinate sentences), which complement the parameters appearing in a key sentence, are placed in Toulmin’s reasoning scheme based on their “similarity” to the key sentence. As a result, we succeeded in automatically generating test cases for product testing and in supporting the extraction of parameters for each test case.

    12:35

    [12:35-12:40]

    Prof. Yukio Ohsawa

    General Comment and Closing Remarks

    12:40 CLOSING

    Contact

    Dr. Teruaki Hayashi (co-chair)

    Email: hayashi -at- sys.t.u-tokyo.ac.jp


    Copyright © Ohsawa Lab All Rights Reserved.