Vol.17
Expanding possible uses of computers from
both the researchers' and the users' perspective
Hayato Yamana
Professor, Department of Computer Science,
School of Science and Engineering
From the incunabula of the personal computer
■to the incunabula of the Internet: Expanding research themes as times change
  Computer mania in the early school days

Even when I was in elementary school, I was fascinated by computers and was involved in some computer related activities such as creating simple programs for board computers, which are programmed by directly inputting hexadecimal numbers. By the time I became a junior high student, I had created some programs myself using a pocket computer, now called a scientific electronic computer.
  When I entered high school, personal computers were finally on the market, so I persuaded my parents to buy me one. I was a ham radio fan at that time, so I created a computer program to instantly analyze the identification, the so-called callsign, of each station, and entered a contest to see who could communicate with the most people without overlapping. Since I wanted to be a professional in a field related to electricity, I entered a science and technology university without hesitation. However, at that time there was no department or course specializing in computers or information, so I majored in electronic communications.
  During my undergraduate and graduate courses, I did research on parallel computing, which was then a booming sector of hardware architecture. By the time I had gone on to my doctoral studies, I started to feel that it was not enough to develop hardware technology: even though high capacity machines existed, it was the software directing the work of those machines that was of utmost importance. I wanted to do software research in order to optimize the capacity of the machines, so I shifted my research theme to compilers, which translate programming language into machine language.
  After completing my doctorate, I worked at the Electrotechnical Laboratory (ETL) of the former Ministry of International Trade and Industry (MITI); that place is now called the National Institute of Advanced Industrial Science and Technology (AIST). At the beginning, I thought that a researcher's job was to conduct research and write papers. However, in my third year at the laboratory I changed my way of thinking after being seconded for a year to the (former) MITI's Machinery and Information Industries Bureau in Kasumigaseki, the government arena of Tokyo. I was in charge of managing government projects in coordination with industry. A key question there was how government funded research could feed back to the community, so I began to think that research must have some practical application.
 The biggest change in my thinking about research came when I started to look at research from the end users' perspective. Policy and government projects should be accountable to the people. Although such awareness has been gradually rising among research organizations lately, there is still a deep-rooted tendency for professionals not to explain their work in a way that lay people can understand. I could not have come to such a realization if I had not left academia to do practical work in Kasumigaseki.
  I was not supposed to be doing technical research in Kasumigaseki, but I started to investigate web search technology with my next research project in mind. In 1996 I established a study group involving people in Japan who were creating web search engines; we gathered to discuss future technology needs. This was the beginning of my interest in web related research.

■New horizon: the world of web searches

  The very first web search engine in Japan was Senrigan (Clairvoyance) created by Waseda University researchers. This engine, later managed by ASCII Corporation, survived for a while. Other engines were ODIN, created by Tokyo University researchers, and Mondo, from Kyoto University labs. These engines were followed by a number from private and foreign companies. The human network I was involved in then continues to this day: we created a portal at the dawn of the Internet, and we are connected to the human network that supports today's main search portal.
  When I returned to my laboratory from Kasumigaseki, I started intensive research about the web. That was the time when the number of public users of the Internet began to increase. Netscape, a popular browser software, was created in the USA, so I created patches for the Japanese version of Netscape. During the same period I published a know-how booklet about the web, "www-clients-for-mac."
  As I mentioned earlier, the prototype web search engines were created at universities; there is a history of scholars and researchers voluntarily starting advanced development of such prototype software and applications and contributing to the development of technology by offering their results free of charge. Such activity might not be pure research but it does have a high public profile.
  I stayed for three more years at the Electrotechnical Laboratory, and the last year was in the planning division, where one could see the overall direction of the whole laboratory. I benefited greatly from that overview of many kinds of research and learned the joy of connecting different kinds of research: researchers tend not to know much about research other than their own projects. Unlike companies, in which employees are in a vertical hierarchy, research organizations such as laboratories and universities are more horizontal, focusing on the will and direction of individual researchers.
 
Web search evolves into analysis:
■Everybody is a web use expert

 In 2000 I took a teaching position at Waseda University. My main research fields are (1) computer architecture, (2) information search and data mining, (3) bioinformatics, and (4) brain computing. I also work tangentially in a wide range of fields including the development of applications for web users, and the fusion area of bio and artificial intelligence.
  Among these research interests, my most central research is on information search and data mining. Prof. Yoichi Muraoka of Waseda and I are working together on the Ministry of Education, Culture, Sports, Science and Technology's (MEXT) 'e-Society Project', a colossal project aiming to collect the world's largest volume of web pages and analyze them. In the year 2005, Google was searching 8 billion web pages, while our project was working with 12 billion. There is no other data collection that large (see Figure 1 and note *1).
  *1: In Oct. 2006, Google's web index was estimated to contain over 35 billion pages. However, our index, at 14.5 billion pages as of Oct. 2006, is still the largest research collection.


Figure 1: The number of web pages collected by the 'e-Society Project'
 

  However, we are not just collecting data, but also developing algorithms for updating web pages efficiently. For example, in web search sites like Google, information that people use is usually found within the top rank of the list, no more than 1000 sites, so it is enough just to update the top rank of the list frequently. However, the top ranks of information are not always adequate for searchers' needs, and it is necessary to update the whole list from time to time, especially to support searches for information related to crime or hazards.
  Nevertheless, updating every page takes labor and time if the update starts from scratch. Efficient updates require the ability to view and collect the necessary information without accessing every single page. For example, when deeper pages, whose URLs are long, are updated, the shallower pages also tend to be updated, so technology that exploits such relations should be developed. At the moment, we have achieved the highest collection efficiency in the world: collecting only 26% of web pages, yet covering 76% of all updated pages.
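
  As a minimal sketch of the idea in the paragraph above (not the e-Society Project's actual algorithm, which this article does not describe), the following Python code treats an observed update to a deep page as a hint that the shallower pages above it on the same host may also have changed, and then measures what fraction of pages was fetched against what fraction of updated pages was covered. The function names, the example.com URLs and the simple path-based notion of "shallower" are all assumptions made for illustration.

```python
from urllib.parse import urlsplit


def ancestors(url):
    """Yield the shallower URLs above `url` on the same host, nearest first.

    Assumption: "shallower" simply means a shorter path prefix.
    """
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    for i in range(len(segments) - 1, -1, -1):
        path = "/" + "/".join(segments[:i])
        if i > 0:
            path += "/"
        yield f"{parts.scheme}://{parts.netloc}{path}"


def recrawl_candidates(updated_deep_urls, known_urls):
    """Return known shallower pages worth refetching, without duplicates."""
    known = set(known_urls)
    seen = set()
    candidates = []
    for url in updated_deep_urls:
        for parent in ancestors(url):
            if parent in known and parent not in seen:
                seen.add(parent)
                candidates.append(parent)
    return candidates


def coverage(recrawled, actually_updated, total_pages):
    """Return (fraction of all pages fetched, fraction of updated pages hit)."""
    fetched_fraction = len(set(recrawled)) / total_pages
    hit = len(set(recrawled) & set(actually_updated))
    covered_fraction = hit / len(actually_updated)
    return fetched_fraction, covered_fraction


if __name__ == "__main__":
    known = [
        "http://example.com/",
        "http://example.com/a/",
        "http://example.com/x/",
        "http://example.com/y/",
    ]
    picked = recrawl_candidates(["http://example.com/a/b/c.html"], known)
    print(picked)
    print(coverage(picked, ["http://example.com/a/", "http://example.com/x/"], len(known)))
```

  The two numbers returned by coverage correspond to the kind of figures quoted above (pages fetched versus updated pages covered); a real crawler would estimate such relations from observed update histories rather than from URL structure alone.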
  Moreover, we are also conducting research on how to utilize such large bodies of web information. Currently I am conducting analysis on link relations among web pages, for example, technology to pick out important links for particular users. Such technology can be also useful for finding crime-related patterns such as transactions of personal data and on-line payments.
  Analysis of links and transactions does not reflect the content of web pages, so if some technology is created to analyze the content of web pages, it will be useful for checking such things as whether bad comments are being made about a company, or for collecting users' opinions from blogs for use as a marketing strategy database. At the moment, the main function of search engines is still to offer searches of collected web pages, but we are aiming to extend that potential to completely new services by adding web page analysis.
  Today anybody can use free automatic translation services by entering text into portal web sites. However, the accuracy of such free services is still poor. Experienced users use search sites such as Google to check idioms and phrases in order to improve translation quality. I am now working with professional translators to develop an algorithm to automate such checking.
  The individual technologies behind current search services have matured, and experienced net searchers have developed their own techniques for taking command of those technologies for various purposes. However, not everybody can use the web so skillfully. A complete service that automates such complicated tasks will be very convenient.

The world of the 'Knowledge Grid,'
■ where computers link various knowledge automatically

  I am also conducting research on compilers (in the field of computer architecture), my specialty since graduate school. Prof. Hironori Kasahara of Waseda University and I participated in the 'Advanced Parallelizing Compiler (APC) Technology Project,' a three-year national project with the objective of doubling compiler functionality, and we exceeded our objective (See Figure 2: a web page).


Figure 2: Home page of Advanced Parallelizing Compiler Technology Project
http://www.cs.waseda.ac.jp/eng/project/index.html  

  Among my other recent research is bioinformatics, which is a fusion of information science, bio-information science and life science. The latest data mining technology developed at the e-Society Project can be used to analyze a large volume of data for genes and proteins, so I established the Waseda Research Institute of Information Technological Biology in April 2005 in coordination with the National Institute of Advanced Industrial Science and Technology (AIST).
  I am also involved in applied research using artificial intelligence. It is no longer impossible to record all human acts in digital form. Of course this does not mean recording the actions of every human being around the world, but it is possible for people and computers to share an articulate environment in some limited way.
  For instance, when hearing a conversation, 'Mr. A had a match with Mr. B,' people can understand that Mr. A played tennis with Mr. B, but computers usually cannot infer that context. However, it is possible for computers to understand to some extent if we input some common background knowledge into the computer. It will not be impossible to have a person say to a computer, 'Please find it,' and the computer will be able to infer what 'it' is from background knowledge and context.
  It is also possible for machines to collect valuable data efficiently if they find meaningful patterns in the data and learn them. Such technology can be applied to the extraction of knowledge from large volumes of data. Once a certain pattern is found by data mining, artificial intelligence technology can be applied to enhance the automatic data collection. Such technology is called the 'knowledge grid,' which refers to the use of the image of knowledge lying on a grid to support automatic data collection.
  It is not easy to pull together the wide range of research fields I have mentioned into one unique systematic technology, but I would like to set them in the direction of mutual development. For example, one future target is to develop linkages between applied research and foundation research, such as the development of a high capacity processor (in computer architecture) for foundation research in data mining.

Students learning from teachers and from each other

  After I moved from the laboratory to the university, in addition to conducting research, educating students and giving them motivation became very important for me. The key to motivating students is finding interesting themes for them. Since there are a variety of research themes, I let students choose their own theme according to their interests. If I do not know so much about the theme, I study it with them. Also, I sometimes have the students exchange ideas about each other's research results. I believe that this kind of group work is important. If one has a chance to join ongoing work on a new theme, one can enter into the new area of study much more easily and efficiently.
  Recalling my own experience, I never thought I would become a researcher on the web or data mining. Regarding my students' eventual employment, there are a variety of job possibilities, from research institutes to manufacturing, from banks to portal sites. However, no matter what direction they choose, there is no such thing as a lifetime direction. I often tell my students, 'You should do whatever you are told and it will surely feed back to you sometime, somewhere.'
  Recently I translated some English documents introducing trends in search technology and spam in the USA. This led me to join METI's 'Sectional Committee for Vision and Technology of Intellectual Access to Information for Next Generation Intelligent Information Access'. It was a good opportunity to get to know the latest trends in the field. I believe that the breadth of my research is a result of feedback from the wide range of people I met and of activities outside my area of pure research.


Figure 3: Waseda Research Institute of Information Technological Biology
http://www.it-bio.waseda.ac.jp/  

Profile●Hayato Yamana

Professor, Department of Computer Science,
School of Science and Engineering

Born in 1964 in Yamaguchi Prefecture, Prof. Yamana received his Doctor of Engineering degree at Waseda University in 1993. He began his career at the Electrotechnical Laboratory (ETL) of the former Ministry of International Trade and Industry (MITI), and was seconded to MITI's Machinery and Information Industries Bureau for a year in 1996. He was subsequently appointed Associate Professor of Computer Science at Waseda University in 2000, and has been a professor in that department (as well as visiting professor at the National Institute of Informatics) since 2005. Professor Yamana has received a number of awards, including the IPSJ (Information Processing Society of Japan) Yamashita Memorial Research Award in 1995, the IPSJ Best Author Award in 2002 and the ITE (Institute of Image Information and Television Engineers) Best Author Award in 2003. He has written, co-written and translated a number of books including Google Hacks (translation supervisor), Google Pocket Guide (translation), Com Series: An Introduction to Super Parallel Computers (co-author), How to Search the World Wide Web-A Guide to Search Engines (co-author) and Objects that Evolve the Internet (author).
Homepage:http://www.yama.info.waseda.ac.jp/~yamana/



Last revised: October 23, 2006 For inquiry:Research Promotion Division koho-rps@list.waseda.jp
Copyright(c) 2006 All rights reserved. Research Promotion Division, Waseda University