艾德·金特 張雅暉/譯
The AI-powered chatbot ChatGPT is taking the Internet by storm with its impressive language capabilities, helping to draw up legal contracts as well as write fiction. But it turns out that the underlying technology could also help spot the early signs of Alzheimer’s disease, potentially making it possible to diagnose the debilitating condition sooner.
人工智能聊天機器人ChatGPT正憑借其驚人的語言能力風靡互聯網，它可以幫助起草法律合同，也能幫忙寫小說。但事實證明，這項基礎技術還能幫助發現阿爾茨海默病的早期症狀，從而更快確診這種令人衰弱的病症。
Catching Alzheimer’s early can significantly improve treatment options and give patients time to make lifestyle changes that could slow progression. Diagnosing the disease typically requires brain imaging or lengthy cognitive evaluations though, which can be both expensive and time-consuming and therefore unsuitable for widespread screening, says Hualou Liang, a professor of biomedical engineering at Drexel University in Philadelphia.
盡早發現阿爾茨海默病可以極大提高治療方案的選擇空間，并給患者時間去改變生活方式，進而延緩病情發展。費城德雷塞爾大學生物醫學工程的梁化樓教授說，診斷這種疾病通常需要做腦部成像或長期的認知評估，可能昂貴且耗時，因此不適用于廣泛篩查。
A promising avenue for early detection of Alzheimer’s is automated speech analysis. One of the most common and noticeable symptoms of the disease is problems with language, such as grammatical mistakes, pausing, repetition, or forgetting the meaning of words, says Liang. This has led to growing interest in using machine learning to spot early signs of the disease in the way people talk.
自動語音分析是早期檢測阿爾茨海默病的一個途徑，很有發展前景。梁教授說，這種疾病最常見和最明顯的症狀之一就是語言出現問題，比如語法錯誤、停頓、重復或忘記語詞含義。因此，運用機器學習來檢測人們說話方式中隱現的疾病早期跡象已經引起日益廣泛的關注。
Normally this relies on purpose-built models, but Liang and his colleagues wanted to see if they could repurpose the technology behind ChatGPT, OpenAI’s large language model GPT-3, to spot the telltale signs of Alzheimer’s. They discovered it could discriminate between transcripts of speech from Alzheimer’s patients and healthy volunteers well enough to predict the disease with 80 percent accuracy, which represents state-of-the-art performance.
通常情況下，機器學習要依靠專門構建的模型，但梁教授和他的同事們想嘗試看看能否重新調整ChatGPT的底層技術（OpenAI的大語言模型GPT-3），用來檢測阿爾茨海默病的警示跡象。他們發現，GPT-3可以很好地區分阿爾茨海默病患者和健康實驗志愿者的語音轉錄文本，預測該病的準確率達到80%，這展現了其最先進的性能。
“These large language models like GPT-3 are so powerful they can pick up these kinds of subtle differences,” says Liang. “If the subject has some kind of issue [involving] Alzheimer’s, and that’s already reflected in the language, the hope is that we can use machine learning to pick up these kinds of signals that allow us to do early diagnostics.”
“像GPT-3這樣的大語言模型非常強大，足以捕捉到那些細微差異。”梁教授說，“如果研究對象有某種（涉及到）阿爾茨海默病的問題，且這種問題已經反映在語言之中，我們就有望能夠利用機器學習來捕捉到這些信號，從而得以進行早期診斷。”
The researchers tested their approach on a collection of 237 audio recordings taken from healthy volunteers and Alzheimer’s patients, which were converted to text using a pre-trained speech recognition model. To enlist the help of GPT-3, the researchers made use of one of its less well-known capabilities. Its API makes it possible to feed a chunk of text into the model and get it to spit out what is known as an “embedding,” a numerical representation of a piece of text that encodes its meaning and can be used to assess its similarity to other text.
研究人員以收集到的237份健康志愿者和阿爾茨海默病患者的錄音作為樣本，檢驗了他們的方法，這些錄音由預先訓練好的語音識別模型轉換成文本。研究人員利用GPT-3不太起眼的一個功能來尋得幫助。GPT-3的API可以先將大段文本輸入至模型，然后使其輸出所謂的“嵌入”，即一段文本的數值表示，它對文本含義進行編碼，可用于評估其與其他文本的相似性。
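To make the idea of comparing embeddings concrete, here is a minimal sketch of one common similarity measure, cosine similarity. The article does not say which measure the researchers or the GPT-3 API use, so the choice is an assumption, and the three short vectors are made-up stand-ins for real embeddings, which have far more dimensions.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Hypothetical 4-dimensional "embeddings" (illustrative values only,
# not produced by any real model).
transcript_a = [0.9, 0.1, 0.3, 0.0]
transcript_b = [0.8, 0.2, 0.4, 0.1]
transcript_c = [0.0, 0.9, 0.0, 0.7]

# A value closer to 1 means the vectors point in more similar directions.
print(cosine_similarity(transcript_a, transcript_b))
print(cosine_similarity(transcript_a, transcript_c))
```

In this toy data, the first two vectors point in nearly the same direction and score higher than the first and third, which is the sense in which embeddings let texts be compared numerically.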
While most machine learning models deal with word embeddings, one of the novel features of GPT-3, says Liang, is that it’s powerful enough to produce embeddings for entire paragraphs. And because of the model’s vast size and the huge amount of data used to train it, it is able to produce very rich representations of the text.
梁教授說，大多數機器學習模型都可以詞嵌入，但GPT-3有一個新性能，強大到可以生成整個段落的嵌入。憑借巨大的模型規模和海量訓練數據，它能夠生成非常豐富的文本表達。
The researchers used this capability to create embeddings for all of the transcripts from both Alzheimer’s patients and healthy individuals. They then took a selection of these embeddings, combined with labels to say which group they came from, and used them to train machine-learning classifiers to distinguish between the two groups. When tested on unseen transcripts, the best classifier achieved an accuracy of 80.3 percent, as reported in a paper in PLOS Digital Health.
研究人員利用該性能為阿爾茨海默病患者和健康個體的所有語音轉錄文本創建了嵌入。之后，他們對這些嵌入進行了篩選，加標簽明示分組，并用它們訓練機器學習分類器來區分這兩類人群。正如《科學公共圖書館·數字健康》上的一篇論文所稱，在對未見過的轉錄文本進行測試時，最優分類器達到了80.3%的準確率。
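The train-then-test workflow described above can be sketched in a few lines. This is an illustration only: the article does not name the classifiers the team used, so a simple nearest-centroid classifier stands in for them, and the "embeddings" are synthetic random vectors rather than GPT-3 output. All function names, labels, and sizes here are hypothetical.

```python
import random


def make_embeddings(rng, center, count, dims):
    """Generate synthetic vectors scattered around a given center value."""
    return [[rng.gauss(center, 1.0) for _ in range(dims)] for _ in range(count)]


def centroid(vectors):
    """Average a list of equal-length vectors component by component."""
    dims = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dims)]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train(examples):
    """Fit a nearest-centroid model: one mean vector per label."""
    by_label = {}
    for vector, label in examples:
        by_label.setdefault(label, []).append(vector)
    return {label: centroid(vectors) for label, vectors in by_label.items()}


def predict(model, vector):
    """Return the label whose centroid is closest to the vector."""
    return min(model, key=lambda label: squared_distance(model[label], vector))


def accuracy(model, examples):
    correct = sum(1 for vector, label in examples if predict(model, vector) == label)
    return correct / len(examples)


# Synthetic, well-separated data standing in for labeled transcript embeddings.
rng = random.Random(0)
data = [(v, "patient") for v in make_embeddings(rng, 2.0, 50, 8)]
data += [(v, "control") for v in make_embeddings(rng, -2.0, 50, 8)]
rng.shuffle(data)

# Train on one portion and score on transcripts the model has not seen.
train_set, test_set = data[:80], data[80:]
model = train(train_set)
held_out_accuracy = accuracy(model, test_set)
print(f"held-out accuracy: {held_out_accuracy:.3f}")
```

The key point is the split: accuracy is measured only on examples kept out of training, which is what "tested on unseen transcripts" refers to. The synthetic groups here are far easier to separate than real speech data, so the score printed says nothing about the study's results.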
That was significantly better than the 74.6 percent the researchers achieved when they applied a more conventional approach to the speech data, which relies on acoustic features that have to be painstakingly identified by experts. They also compared their technique to several cutting-edge machine-learning approaches that use large language models too but include an extra step in which the model is laboriously fine-tuned using some of the transcripts from the training data. They matched the performance of the top model and outperformed the other two.
這明顯優于研究人員采用更傳統方法處理語音數據所達到的74.6%的準確率，而傳統方法必須靠專家費力識別聲學特征。他們還將自己的技術與另外幾種尖端的機器學習方法進行了比較，這些方法也使用大型語言模型，但卻多了一個步驟，即使用訓練數據的一些轉錄文本對模型進行勞力費神的微調。該技術的表現與其中最頂級的模型不相上下，贏過了另外兩種。
Interestingly, when the researchers tried fine-tuning, the GPT-3 model’s performance actually dropped. This might seem counter-intuitive, but Liang points out that this is probably due to the mismatch in size between the vast amount of data used to train GPT-3 and the small amount of domain-specific training data available for fine-tuning.
有趣的是，研究人員嘗試微調后，GPT-3模型的性能反而下降了。這看似有悖常理，但梁教授指出，這可能是用于訓練GPT-3的大量數據和可用于微調的特定領域少量訓練數據間的大小不匹配所致。
While the team does achieve state-of-the-art results, Frank Rudzicz, an associate professor of computer science at the University of Toronto, says relying on privately owned models to carry out this kind of research does raise some problems. “Part of the reason these closed APIs are limiting is that we also can’t inspect or deeply modify the internals of those models or do a more complete set of experiments that would help elucidate potential sources of error that we need to avoid or correct,” he says.
雖然該團隊的確取得了一些最先進的成果，但多倫多大學計算機科學副教授弗蘭克·魯基奇表示，依賴私有模型進行此類研究確實會帶來一些問題。“這些封閉的API存在局限的部分原因是，我們不能檢查或深入修改這些模型的內部構建，也不能執行一套更為完整的實驗來幫助闡明需要避免或糾正的潛在錯誤源。”他如是分析。
Liang is also open about the limitations of the approach. The model is nowhere near accurate enough to properly diagnose Alzheimer’s, he says, and any real-world deployment of this kind of technology would be as an initial screening step designed to direct people toward a specialist for a full medical evaluation. As with many AI-based approaches, it’s also hard to know exactly what the model is picking up on when it detects Alzheimer’s, which may be a problem for medical staff. “The doctor, very naturally, would ask why you get these results,” says Liang. “They want to know what feature is really important.”
梁教授對該方法的局限性也開誠布公。他說，該模型目前還遠不足以精確診斷出阿爾茨海默病，這種技術的任何實際應用將僅限于作為最初的篩查手段，旨在引導人們向專家尋求全面的醫學評估。同許多基于人工智能的方法一樣，很難準確知道該模型在檢測出阿爾茨海默病時捕捉到了什么，這對醫療人員來說可能是個問題。“醫生自然而然會問你這些結果是怎么得來的。”梁教授說，“他們想知道什么特征是真正重要的。”
Nonetheless, Liang thinks the approach holds considerable promise, and he and his colleagues are planning to build an app that can be used at home or in a doctor’s office to simplify screening of the disease.
盡管如此，梁教授認為這一方法前景相當好，他和同事正計劃開發一款可以在家里或醫生診室使用的應用程序，以簡化阿爾茨海默病的篩查過程。
（譯者單位：對外經濟貿易大學）