{"id":357,"date":"2020-10-18T10:00:50","date_gmt":"2020-10-18T01:00:50","guid":{"rendered":"https:\/\/staka.jp\/wordpress\/?p=357"},"modified":"2020-10-18T10:00:50","modified_gmt":"2020-10-18T01:00:50","slug":"%e6%a9%9f%e6%a2%b0%e7%bf%bb%e8%a8%b3%e3%81%a8%e8%a8%b3%e6%8a%9c%e3%81%91%e3%81%a8constituency-parsing","status":"publish","type":"post","link":"https:\/\/staka.jp\/wordpress\/?p=357","title":{"rendered":"Machine Translation, Translation Omissions, and Constituency Parsing"},"content":{"rendered":"\n<p>I updated the demo site for my translation engine (<a href=\"https:\/\/devneko.jp\/demo\/\">https:\/\/devneko.jp\/demo\/<\/a>). The main additions are:<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li>Support for long inputs of up to 3,000 characters<\/li><li>A more advanced omission-prevention mode<\/li><li>A score displayed for each translation result<\/li><\/ul>\n\n\n\n<p>Long-input support simply removes the character limit and splits the text with NLTK's sent_tokenize[1]. The score display and the omission-prevention mode took somewhat more work, as described below.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Omission-prevention mode<\/h3>\n\n\n\n<p>Deep 
Learning-based machine translation suffers from a phenomenon called translation omission: part of the English source that should be translated is silently left out. The output reads fluently but is missing information. It can happen even with major engines such as Google Translate and DeepL, and <strong>(naturally)</strong> it happens often with an engine developed by an individual.<\/p>\n\n\n\n<p>As an example, consider translating the following English sentence.<\/p>\n\n\n\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\"><p>Natural language processing (NLP) is a subfield of linguistics, computer science, and artificial intelligence concerned with the interactions between computers and human language, in particular how to program computers to process and analyze large amounts of natural language data. 
<\/p><cite>https:\/\/en.wikipedia.org\/wiki\/Natural_language_processing, 11 October 2020, at 18:45 (UTC) revision, quoted from Wikipedia<\/cite><\/blockquote>\n\n\n\n<p>My current translation engine translates the sentence above as \u300c\u81ea\u7136\u8a00\u8a9e\u51e6\u7406(nlp)\u306f\u3001\u30b3\u30f3\u30d4\u30e5\u30fc\u30bf\u3068\u4eba\u9593\u306e\u8a00\u8a9e\u9593\u306e\u30a4\u30f3\u30bf\u30e9\u30af\u30b7\u30e7\u30f3\u306b\u95a2\u3059\u308b\u8a00\u8a9e\u5b66\u3001\u30b3\u30f3\u30d4\u30e5\u30fc\u30bf\u79d1\u5b66\u3001\u4eba\u5de5\u77e5\u80fd\u306e\u30b5\u30d6\u30d5\u30a3\u30fc\u30eb\u30c9\u3067\u3042\u308b\u3002\u300d (roughly: &#8220;Natural language processing (nlp) is a subfield of linguistics, computer science, and artificial intelligence concerned with the interaction between computers and human language.&#8221;), so everything after &#8220;in particular&#8221; is dropped[2].<\/p>\n\n\n\n<p>Omissions can have various causes, but they occur more readily in long sentences. The omission-prevention mode therefore runs constituency parsing[3], splits the sentence into blocks that are likely to make sense on their own, and applies the translation engine to each block. The resulting blocks are shown at the very bottom of the demo site. In this example, the sentence to be translated was split into the following two blocks:<\/p>\n\n\n\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\"><p>Natural language processing ( NLP ) is a subfield of linguistics, computer science, and artificial intelligence concerned with the interactions between computers and human language, <\/p><cite> https:\/\/en.wikipedia.org\/wiki\/Natural_language_processing, 11 October 2020, 
at 18:45 (UTC) revision, quoted from Wikipedia <\/cite><\/blockquote>\n\n\n\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\"><p><span style=\"font-size: inherit;\">in particular how to program computers to process and analyze large amounts of natural language data.<\/span><\/p><cite> https:\/\/en.wikipedia.org\/wiki\/Natural_language_processing, 11 October 2020, at 18:45 (UTC) revision, quoted from Wikipedia <\/cite><\/blockquote>\n\n\n\n<p>With these two blocks, the omission-prevention mode translated the English sentence above as \u300c\u81ea\u7136\u8a00\u8a9e\u51e6\u7406(nlp)\u306f\u3001\u30b3\u30f3\u30d4\u30e5\u30fc\u30bf\u3068\u4eba\u9593\u306e\u8a00\u8a9e\u9593\u306e\u76f8\u4e92\u4f5c\u7528\u306b\u95a2\u3059\u308b\u8a00\u8a9e\u5b66\u3001\u30b3\u30f3\u30d4\u30e5\u30fc\u30bf\u79d1\u5b66\u3001\u4eba\u5de5\u77e5\u80fd\u306e\u30b5\u30d6\u30d5\u30a3\u30fc\u30eb\u30c9\u3067\u3042\u308b\u3002 \u7279\u306b\u3001\u30b3\u30f3\u30d4\u30e5\u30fc\u30bf\u304c\u5927\u91cf\u306e\u81ea\u7136\u8a00\u8a9e\u30c7\u30fc\u30bf\u3092\u51e6\u7406\u304a\u3088\u3073\u5206\u6790\u3059\u308b\u305f\u3081\u306e\u30d7\u30ed\u30b0\u30e9\u30e0\u65b9\u6cd5\u3002\u300d (roughly: &#8220;... In particular, a programming method for computers to process and analyze large amounts of natural language data.&#8221;). The meaning is conveyed better, but fluency suffers. 
The omission-prevention mode as implemented merely splits the sentence and translates the pieces, and the current machine translation engine cannot take context into account either. Its translation quality is therefore lower than that of normal mode. <\/p>\n\n\n\n<p>The demo site compares, sentence by sentence, the results of two normal translations[4] and two omission-prevention translations, and adopts the best one (the scoring method is described below).<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Score display<\/h3>\n\n\n\n<p>On the demo site, a score is attached to each pair of an English sentence and its translation. The score indicates how good the translation is and ranges over 0.0 &#8211; 
1.0. A score of about 0.7 or higher usually means a reasonable translation, while 0.5 or lower usually means something has gone wrong. At 0.3 or lower in particular, an omission has almost certainly occurred.<\/p>\n\n\n\n<p>The score is computed as &#8220;\u2460 sentence similarity&#8221; \u00d7 &#8220;\u2461 word\/morpheme count similarity&#8221;. \u2460 is the cosine similarity of Universal Sentence Encoder[5] embeddings. I also tried LaBSE[6], but for this task the benefit was small relative to the increase in memory and computation time[7]. \u2461 measures how close the ratio between the number of words in the English sentence and the number of morphemes in the Japanese sentence is to the average over the parallel data (0.85). MeCab was used for morphological analysis.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Thoughts and other notes<\/h3>\n\n\n\n<p>The demo site's processing flow is as follows. I feel it covers most of what needs to be handled when using a machine translation engine.<\/p>\n\n\n\n<ol 
class=\"wp-block-list\"><li>Consecutive line breaks are treated as a boundary between separate texts, and the processing blocks are split there. (A single line break in the middle is treated as a continuation of the sentence. This handles the line-break patterns common in arXiv papers and PPT slides.)<\/li><li>The text in each processing block is split into sentences with NLTK's sent_tokenize.<\/li><li>Constituency parsing is run on each sentence, and the sentence is split into chunks of a certain length that are expected to make sense on their own.<\/li><li>The sentence lists produced in steps 2 and 3 are translated into Japanese by the machine translation engine. Translation is done with two engines that use different hyperparameters.<\/li><li>For each English sentence to be translated, the score (USE cosine similarity \u00d7 closeness of the word\/morpheme count ratio to the average) is computed for the four Japanese translations (with or without step 3 \u00d7 the two results of step 4), and the best one is adopted.<\/li><\/ol>\n\n\n\n<p>The processing is fairly complex, but because it makes full use of OSS software and models, 
the amount of code is not that large. I intend to publish this processing on GitHub or similar at some point.<\/p>\n\n\n\n<p>(Since I still have some enthusiasm left for now,) I plan to strengthen the translation engine itself next.<\/p>\n\n\n\n<p>At this point, in addition to the data used last time, I have finished creating about 2 million parallel pairs. Roughly another 500,000 pairs can probably be added, so the data volume should grow to about 1.5 times. The data is also getting large enough that lowercasing everything should no longer be necessary, so I would like to build deep learning models under varying conditions and compare them[8].<\/p>\n\n\n\n<p>I also have a fair amount of data from which context can be computed (data that retains information about the source document of each parallel pair), so I would also like to build a machine translation engine that takes context parameters as input.<\/p>\n\n\n\n<p>I plan to release the 
resulting model under a license along the lines of CC BY SA, with translating English NLP datasets into Japanese as the intended use. I also plan to build in support[9] for cases where annotation structure needs to be preserved, though these days I keep thinking that I do not have much time.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Footnotes<\/h3>\n\n\n\n<p>[1] <a href=\"https:\/\/www.nltk.org\/api\/nltk.tokenize.html?highlight=sent_tokenize#nltk.tokenize.sent_tokenize\">https:\/\/www.nltk.org\/api\/nltk.tokenize.html?highlight=sent_tokenize#nltk.tokenize.sent_tokenize<\/a> <br \/>[2] The major translation engines handle this correctly. Impressive, as expected.<br \/>[3] This time I used AllenNLP's <a href=\"https:\/\/demo.allennlp.org\/constituency-parsing\">https:\/\/demo.allennlp.org\/constituency-parsing<\/a>.<br \/>[4] Translation is run twice with different hyperparameters. Generating multiple candidates and choosing among them is also a common setup, but that is not done here.<br \/>[5] <a href=\"https:\/\/tfhub.dev\/google\/universal-sentence-encoder\/4\">https:\/\/tfhub.dev\/google\/universal-sentence-encoder\/4 
<\/a> This model also helped me when creating the parallel data.<br \/>[6] <a href=\"https:\/\/tfhub.dev\/google\/LaBSE\/1\">https:\/\/tfhub.dev\/google\/LaBSE\/1<\/a> A BERT-based model that appears to be close to the newest and best-performing option for multilingual text embedding.<br \/>[7] In terms of how sensible the similarity values are, LaBSE is slightly better than USE, but its computation time is tens of times longer (more than 50 times) and memory usage also increases. Running it on the VPS used for the demo site was not practical.<br \/>[8] The AWS bill looks like it will be enormous... I would really like to do Back Translation as well.<br \/>[9] I would at least like a feature that preserves tag structure from the English sentence to the Japanese sentence. The plan is to treat tags as special symbols when building the tokenizer (SentencePiece), 
add sentences that handle tags correctly to the parallel pairs, and train on them. This is hard to achieve without modifying the translation engine itself, and I think doing the same with the major translation engines would not be easy.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>I updated the demo site for my translation engine (https:\/\/devneko.jp\/demo\/). The main additions are: Support for long inputs of up to 3,000 characters A more advanced omission-prevention mode A score displayed for each translation result Long-input support &hellip; <a href=\"https:\/\/staka.jp\/wordpress\/?p=357\" class=\"more-link\"><span class=\"screen-reader-text\">&#8220;Machine Translation, Translation Omissions, and Constituency Parsing&#8221; 
<\/span>Continue reading<\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[2,11,13],"tags":[],"class_list":["post-357","post","type-post","status-publish","format-standard","hentry","category-ai","category-11","category-13"],"_links":{"self":[{"href":"https:\/\/staka.jp\/wordpress\/index.php?rest_route=\/wp\/v2\/posts\/357","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/staka.jp\/wordpress\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/staka.jp\/wordpress\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/staka.jp\/wordpress\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/staka.jp\/wordpress\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=357"}],"version-history":[{"count":0,"href":"https:\/\/staka.jp\/wordpress\/index.php?rest_route=\/wp\/v2\/posts\/357\/revisions"}],"wp:attachment":[{"href":"https:\/\/staka.jp\/wordpress\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=357"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/staka.jp\/wordpress\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=357"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/staka.jp\/wordpress\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=357"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}