{"id":5511,"date":"2025-10-09T11:11:22","date_gmt":"2025-10-09T02:11:22","guid":{"rendered":"https:\/\/www.waseda.jp\/fcom\/riba\/?p=5511"},"modified":"2025-10-09T11:13:29","modified_gmt":"2025-10-09T02:13:29","slug":"2025%e5%b9%b410%e6%9c%8827%e6%97%a5%e6%9c%88-%e3%81%ab%e7%94%a3%e7%a0%94%e8%ac%9b%e6%bc%94%e4%bc%9a%e3%80%8c-making-sense-of-ai-generated-speech-challenges-and-opportunities-for-phonetics-and-speec","status":"publish","type":"post","link":"https:\/\/www.waseda.jp\/fcom\/riba\/news\/5511","title":{"rendered":"The Research Institute of Business Administration (RIBA) lecture \u201cMaking sense of AI-generated speech: challenges and opportunities for phonetics and speech science research\u201d will be held on Monday, October 27, 2025."},"content":{"rendered":"<h3 style=\"text-align: left;\">\u201cMaking sense of AI-generated speech: challenges and opportunities for phonetics and speech science research\u201d<\/h3>\n<div class=\"table-wrapper\"><table class=\"table table-colored-tbhd\" style=\"width: 100%; height: 311px;\" width=\"100%\">\n<tbody>\n<tr style=\"height: 24px;\">\n<th style=\"width: 19.397%; height: 24px;\" width=\"20%\">Date and time<\/th>\n<td style=\"width: 80.163%; height: 24px;\">Monday, October 27, 2025, 16:00\u201317:15<\/td>\n<\/tr>\n<tr style=\"height: 24px;\">\n<th style=\"width: 19.397%; height: 24px;\" width=\"20%\">Format<\/th>\n<td style=\"width: 80.163%; height: 24px;\">\u2460 In person: please come to Conference Room 3, 4th floor, Building 11.<br \/>\n\u2461 Zoom: the participation URL will be sent in the automatic reply email once your registration is complete.<\/td>\n<\/tr>\n<tr style=\"height: 24px;\">\n<th style=\"width: 19.397%; height: 24px;\" width=\"20%\">Audience<\/th>\n<td style=\"width: 80.163%; height: 
24px;\">Students, faculty and staff, and the general public<\/td>\n<\/tr>\n<tr style=\"height: 24px;\">\n<th style=\"width: 19.397%; height: 24px;\" width=\"20%\">Speaker<\/th>\n<td style=\"width: 80.163%; height: 24px;\">James Tanner<br \/>\n(Research Associate, University of Glasgow) <a href=\"https:\/\/waseda.box.com\/s\/ewre42iszgni3qj47povzbskc2v6t0sl\">https:\/\/waseda.box.com\/s\/ewre42iszgni3qj47povzbskc2v6t0sl<\/a><\/td>\n<\/tr>\n<tr style=\"height: 143px;\">\n<th style=\"width: 19.397%; height: 143px;\">Abstract<\/th>\n<td style=\"width: 80.163%; height: 143px;\">Recent years have seen a substantial increase in the use and adoption of Artificial Intelligence (AI) across many spheres of daily life. For speech, AI models can produce fluent, natural speech in the voice of a given speaker from only limited input data (&#8216;voice cloning&#8217;), raising a wide range of risks for security, privacy, and personal identity. Despite these risks, there remains little scientific understanding of the properties of AI-generated speech, including how AI speech models learn linguistic information from the speech signal and the extent to which AI-generated speech patterns similarly or differently to human speech. Decades of research within linguistics and phonetics, however, have uncovered how speakers manipulate patterns of acoustic variability to signal linguistic structure, social indexicality, and speaker-specific properties, and so provide a unique perspective with which to explain the behaviours of AI speech models.<br \/>\nIn this talk, I will explore how methodologies from phonetics and speech science research help provide a window into understanding the properties of AI-generated speech, and how these properties may systematically differ from human speech. By examining patterns of acoustic variability in AI-generated English and Japanese stops in both the same language (e.g., 
Japanese-Japanese) and the opposite language of the target speaker (e.g., Japanese-English), it is found that AI-generated speech both differs from the expected human speech patterns and exhibits a preference for the language-specific phonetic implementation of stops when synthesising a speaker into the opposite language. These initial findings demonstrate the potential of acoustic-phonetic approaches to the study of AI-generated speech and point the way towards phonetically motivated and interpretable methods for AI speech detection.<\/td>\n<\/tr>\n<tr style=\"height: 24px;\">\n<th style=\"width: 19.397%; height: 24px;\">Organizer<\/th>\n<td style=\"width: 80.163%; height: 24px;\">\u7be0\u539f\u3000\u9756\u660e (Associate Professor, Faculty of Commerce, Waseda University)<\/td>\n<\/tr>\n<tr style=\"height: 72px;\">\n<th style=\"width: 19.397%; height: 24px;\" width=\"20%\">How to register<\/th>\n<td style=\"width: 80.163%; height: 24px;\">Please register <a href=\"https:\/\/my.waseda.jp\/application\/noauth\/application-detail-noauth?param=URSjf_1ldjyWnWsb4Or0Og\" target=\"_blank\" rel=\"noopener\">here<\/a>. *Deadline: Thursday, October 23, 17:00<\/td>\n<\/tr>\n<tr style=\"height: 24px;\">\n<th style=\"width: 19.397%; height: 24px;\">Co-hosted by<\/th>\n<td style=\"width: 80.163%; height: 24px;\">School of Commerce, Waseda University; Research Institute of Business Administration; \u68ee\u7530\u5f70 Subcommittee; Institute of Language and Speech Science, Comprehensive Research Organization, Waseda University<\/td>\n<\/tr>\n<tr>\n<th style=\"width: 19.397%;\">Supported by<\/th>\n<td style=\"width: 
80.163%;\">The Phonetic Society of Japan<\/td>\n<\/tr>\n<\/tbody>\n<\/table><\/div>\n<p>&nbsp;<\/p>\n","protected":false},"excerpt":{"rendered":"<p>\u201cMaking sense of AI-generated speech: challenges and opportunities for phonetics and speech science research\u201d  [&hellip;]<\/p>\n","protected":false},"author":3,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[1],"tags":[24],"class_list":["post-5511","post","type-post","status-publish","format-standard","hentry","category-news","tag-events"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.waseda.jp\/fcom\/riba\/wp-json\/wp\/v2\/posts\/5511","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.waseda.jp\/fcom\/riba\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.waseda.jp\/fcom\/riba\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.waseda.jp\/fcom\/riba\/wp-json\/wp\/v2\/users\/3"}],"replies":[{"embeddable":true,"href":"https:\/\/www.waseda.jp\/fcom\/riba\/wp-json\/wp\/v2\/comments?post=5511"}],"version-history":[{"count":2,"href":"https:\/\/www.waseda.jp\/fcom\/riba\/wp-json\/wp\/v2\/posts\/5511\/revisions"}],"predecessor-version":[{"id":5513,"href":"https:\/\/www.waseda.jp\/fcom\/riba\/wp-json\/wp\/v2\/posts\/5511\/revisions\/5513"}],"wp:attachment":[{"href":"https:\/\/www.waseda.jp\/fcom\/riba\/wp-json\/wp\/v2\/media?parent=5511"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.waseda.jp\/fcom\/riba\/wp-json\/wp\/v2\/categories?post=5511"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.waseda.jp\/fcom\/riba\/wp-json\/wp\/v2\/tags?post=5511"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}