Waseda Frontline Research Vol. 23: Information theory and data science (Part 2 of 3)

Toshiyasu Matsushima
Professor, Faculty of Science and Engineering; Director, Center for Data Science
Field of expertise: Information theory

Part 2: Investigating what information is

Information theory, which mathematically elucidates what information is, can be said to be the most fundamental and essential of all the various information-related fields of research. Professor Toshiyasu Matsushima of the Department of Applied Mathematics, School of Fundamental Science and Engineering, who specializes in information theory, explains the history, significance, and appeal of information theory. In this part, Professor Matsushima will explain the scientific aspect of information theory. The Center for Research Strategy also asked him about the Waseda University Center for Data Science, established in 2017, where Professor Matsushima serves as director. (Date of interview: September 6, 2018)

Searching for what information is, just as physicists search for what matter is

Last time, I talked about information theory in terms of data compression, which is useful for practical applications such as the Internet. This could be regarded as the engineering aspect of information theory. On the other hand, I am also engaged in research on what might be called the scientific side, which seeks to answer the question of what information is. Researchers who approach both aspects may be quite rare.

As we delve into the various issues involved in transmitting and processing information, the outline of the information being handled comes into view, and its nature can gradually be understood. Last time, I talked about Claude Shannon, the founder of information theory, and how he sought the limit to which data can be compressed and still be restored. I believe that Shannon's true intention was to show the essence of information in that way.
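The compression limit mentioned here can be illustrated with a small calculation. The following is a minimal sketch, not taken from the interview: it assumes a memoryless source whose symbol probabilities equal the observed frequencies, and computes the Shannon entropy, which under that assumption is a lower bound on the average number of bits per symbol for any lossless code. The function name and the sample message are hypothetical.

```python
import math
from collections import Counter


def entropy_bits_per_symbol(text: str) -> float:
    """Empirical Shannon entropy of the symbol frequencies in `text`, in bits."""
    counts = Counter(text)
    total = len(text)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical example message, chosen only for illustration.
message = "aaaabbcd"
h = entropy_bits_per_symbol(message)
print(f"entropy: {h:.2f} bits/symbol")
print(f"lower bound for the whole message: {h * len(message):.1f} bits")
```

For this message the entropy is 1.75 bits per symbol, so about 14 bits for all eight symbols, compared with 16 bits for a fixed 2-bit code over the four distinct symbols.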

By seeking various information-related limits, we explore the nature of the information being used. In the same way that a physicist explores the question of what matter is, we are exploring what information is. This is the motivation behind studying information theory.

Professor Matsushima talks about the scientific aspect of information theory.

Supporting artificial intelligence and machine learning with mathematical theory

In terms of the scientific aspects of information theory, research on artificial intelligence and machine learning can also, in a broad sense, be considered part of information theory, because these fields also process information. Recently, the term artificial intelligence has come to refer almost exclusively to deep learning, and machine learning is associated more with applications and programming. However, the underlying theory is still mathematics.

My doctoral dissertation was titled “Research on knowledge information processing based on information theory,” and it was about artificial intelligence and machine learning. I thought about how we could explain human thinking mathematically. As such, one of my areas of expertise is the mathematical theory of machine learning.

Rather than just feeding data into a computer and being satisfied if the predictions are reasonably accurate for a given purpose, we clarify questions such as how mathematically optimal predictions can be made and where the mathematical limit of accuracy lies. There is, of course, an engineering aspect to following the process by which a machine looks at a drawing or a photograph taken by a person and comes to understand what it shows. On the other hand, it becomes a scientific question when we explore that process in terms of how humans think and capture information.

An example of a model in learning theory (machine learning). This model deals with supervised learning. The purpose of learning theory is to realize, on a computer, a learning machine with learning ability similar to that of humans. (Source: Matsushima Laboratory, partially modified)

People gather information and try to predict stock prices and traffic volume. However, according to Shannon's way of thinking, predictions cannot be more accurate than the amount of information contained in the collected data allows. Nevertheless, people are now feeding data into deep learning systems and admiring the output, believing that the predictions are very accurate. This may be analogous to being content near the base of a mountain, even though what people are aiming for, the limit of the highest prediction accuracy, is at the summit.

For artificial intelligence prediction, it is also necessary to mathematically clarify where the theoretical limit of accuracy is. Knowing the limit will make clear how accurate a prediction can be, where the problem is, and why the limit cannot be reached.
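One standard way to make such a limit concrete is the Bayes-optimal accuracy from statistical decision theory. The sketch below is an illustration under strong assumptions, not the method described in the interview: it assumes the joint distribution of an observed feature x and a label y is fully known, and the probabilities, labels, and function name are all hypothetical. Under that assumption, no predictor of y from x can be correct more often than the value it returns.

```python
# Hypothetical joint distribution P(x, y): x is an observed feature,
# y is the label to predict. The numbers are made up for illustration.
joint = {
    ("sunny", "up"): 0.30,
    ("sunny", "down"): 0.10,
    ("rainy", "up"): 0.25,
    ("rainy", "down"): 0.35,
}


def bayes_optimal_accuracy(joint_probs: dict) -> float:
    """Highest accuracy any predictor of y from x can reach under joint_probs."""
    best_per_x = {}
    for (x, _y), p in joint_probs.items():
        # For each x, the best rule predicts the most probable y.
        best_per_x[x] = max(best_per_x.get(x, 0.0), p)
    return sum(best_per_x.values())


limit = bayes_optimal_accuracy(joint)
print(f"accuracy limit: {limit:.2f}")
```

With these made-up numbers the limit is 0.65: a model that reaches 60% accuracy is already close to the summit, and the remaining shortfall can only be reduced by collecting more informative features, not by a better learning algorithm.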

Just as information theory became the foundation of information and communication technology, such research is indispensable to the development of machine learning and artificial intelligence as their basic mathematical theory.

Abstract thinking in mathematics

This may be surprising, but our everyday research is basically conducted with paper and pencil, not computers. It is said that ancient Greek philosophers and mathematicians thought about things while taking walks, and that Einstein did his thinking while lying in a boat swaying on a lake. I do a lot of thinking while walking. In the past, there were times on my way home when I suddenly realized I had walked well past my house.

I want students, especially those in science and engineering, to proactively exercise abstract thinking in mathematics. Usually, when trying to solve a given problem, you think about things such as whether existing ideas could be used. On the other hand, in many engineering fields, we first try to express the same problem mathematically. Then, it is possible to mathematically deduce the limit and how close you can get to it. After that, an algorithm that runs on a computer, for example, is created to achieve this. I would like students to know that there is a method of thinking abstractly before solving the problem. Rather than just acquiring knowledge, I want students to learn to think.

Data science becoming more important

In Part 3, I will talk with Professor Manabu Kobayashi of the Center for Data Science. This center, where I serve as director, was established on campus in 2017. Going forward, it is inevitable that research in various fields will become data-driven. Like Isaac Newton, who formulated his theory of universal gravitation when an apple fell from a tree, researchers used to form hypotheses by observing things close at hand and then confirm them through experiments. Nowadays, however, data on individuals lets us grasp what is going on not only around us but throughout the world. We are in an age where we can make hypotheses and acquire knowledge by analyzing collected data.

Conventional intellectual activity and data-driven intellectual activity. (Source: Waseda University Center for Data Science, partially modified)

Yet even if a researcher is trying to establish a new theory with data, as in positive economics, and wants to incorporate data science as an important part of their research, it is not easy to learn data science on short notice. Therefore, having subject-matter experts and data science experts work together to accelerate and advance research could be a model for data-driven research.

By bringing together researchers and experts from various fields, data scientists with various abilities, and collected data, we can expect to find clues to solving global problems that could not be found with previous research approaches.

The Center for Data Science holds symposiums and other events. In April 2018, a symposium called “Together with Shigenobu Okuma, who established the statistical system, become a data nation and solve global problems” was held in the Okuma Auditorium.

The Japanese name of the center contains two characters meaning education. I would like us to move forward with research and education as the two wheels of one vehicle. At Waseda University, I aim to develop talented people who can use data to make hypotheses and test them in their own field of expertise. Every year, 10,000 students enter Waseda University, and these students will go on to take leading roles in politics, economics, law, sports, and various other fields. If those students acquire the ability to use data to analyze and draw conclusions, 10,000 new people with that ability will join society every year. I believe that this will have the power to change Japan and the world.

In Part 3, we will speak with Professor Manabu Kobayashi, a full-time staff member of the Center for Data Science.


Profile

Toshiyasu Matsushima
Toshiyasu Matsushima graduated from the Graduate School of Science and Engineering, Waseda University, with a Ph.D. in Management System Engineering in 1991. He worked at NEC Corp. and held positions as lecturer at Yokohama College of Commerce, associate professor at the Department of Industrial Management, and professor at the Department of Industrial and Management Systems Engineering, School of Engineering, Waseda University, before being appointed professor at the Department of Applied Mathematics, School of Fundamental Science and Engineering, in 2007.

Professor Matsushima concurrently serves as director of the Waseda University Center for Data Science, which was established in December 2017. His research field is information theory and its applications, and his research topics are theoretical, covering various entropies; machine learning using statistical information; statistical processing; communications; information security; optimality such as in control; theoretical research on performance limits; and the design of optimal algorithms and their performance evaluation.

He was a visiting research fellow at the Department of Electrical Engineering, University of Hawaii; visiting faculty member at the Department of Statistics, University of California, Berkeley; chairman of the IEICE Engineering Sciences Society; chairman of the Special Committee on Information Theory Research, IEICE Engineering Sciences Society; deputy chairperson of the Society of Information Theory and its Applications; and administrator of the Society for Quality Control Management. Professor Matsushima also served as a member of the editorial committees for papers published by the Japanese Society for Artificial Intelligence; the Institute of Electronics, Information and Communication Engineers; and the Society for Quality Control Management. Additionally, he is the manager of Waseda University’s rugby club.

For further details, visit the Matsushima Laboratory website at http://www.matsu.mgmt.waseda.ac.jp/?la=en

Global Research Center（GRC）Waseda University
