LANGUAGE models have become a key factor when it comes to creating the most thorough and accurate artificial intelligence possible. The new model developed by Microsoft and Nvidia is said to feature about 530 billion parameters and to be capable of exceptional accuracy, especially in reading comprehension and complex sentence formation.
Nvidia and Microsoft's Megatron-Turing Natural Language Generation model (MT-NLG) marks a new record for a language model. According to the tech firms, their model is the most powerful to date.
Thanks to its 530 billion parameters, it is able to outperform OpenAI's GPT-3 as well as Google's BERT. Specialized in natural language, it can understand texts, reason and make deductions to form complete and precise sentences.
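Language models are built around a statistical approach. While many methods exist, the n-gram model is one of the simplest ways to illustrate the idea. MT-NLG itself is a far larger neural network based on the transformer architecture, but the underlying principle of predicting words from probabilities is the same.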
During the learning phase, the model analyzes a large quantity of text to estimate the probability that a word will 'fit' correctly in a sentence.
The probability of a word sequence is the product of the probabilities of each word given the words that come before it. By using these probabilities, a model can produce fluent, grammatical-sounding sentences.
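To make the idea concrete, here is a minimal sketch of a bigram model (an n-gram model with n = 2) in Python. It is an illustration only, not the method used to build MT-NLG: the three-sentence corpus is invented, and the names bigram_probability and sentence_probability are hypothetical helpers chosen for this example. Each word's probability is estimated from raw counts, and a sentence's probability is the product of those estimates.

```python
from collections import Counter

# Toy corpus, invented purely for illustration.
corpus = [
    "the model reads the text",
    "the model writes the text",
    "the model reads the book",
]

context_counts = Counter()  # how often each word appears as a "previous word"
bigram_counts = Counter()   # how often each (previous, word) pair appears

for sentence in corpus:
    words = ["<s>"] + sentence.split()  # "<s>" marks the start of a sentence
    context_counts.update(words[:-1])
    bigram_counts.update(zip(words, words[1:]))


def bigram_probability(previous, word):
    """Estimate P(word | previous) from raw counts, with no smoothing."""
    if context_counts[previous] == 0:
        return 0.0
    return bigram_counts[(previous, word)] / context_counts[previous]


def sentence_probability(sentence):
    """Multiply the conditional probabilities of each word in the sentence."""
    words = ["<s>"] + sentence.split()
    probability = 1.0
    for previous, word in zip(words, words[1:]):
        probability *= bigram_probability(previous, word)
    return probability


seen = sentence_probability("the model reads the text")
unseen = sentence_probability("the model writes the book")
scrambled = sentence_probability("text the model")
```

In this toy setting, a sentence from the corpus scores highest, a new but plausible sentence ("the model writes the book") still receives a non-zero probability, and a scrambled word order scores zero because no sentence in the corpus starts with "text".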
Biased algorithms still an issue
With 530 billion parameters, the MT-NLG model is particularly sophisticated. In the field of machine learning, parameters are the values a model learns during training, and their number is often used as a rough measure of a model's size and capacity.
It has been repeatedly shown that models with a large number of parameters tend to perform better, producing more accurate, nuanced language when trained on large datasets.
These models are capable of summarizing books and texts and even writing poems.
To train MT-NLG, Microsoft and Nvidia created their own dataset of about 270 billion "tokens" from English-language websites.
In natural language processing, "tokens" are the smaller chunks, such as words or pieces of words, into which text is broken up so that a model can process it.
The websites included academic sources such as arXiv and PubMed, educational websites such as Wikipedia and GitHub, as well as news articles and even messages on social networks.
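As a rough illustration of what a "token" is, the following Python sketch splits a sentence into word and punctuation tokens. It is a simplification under stated assumptions: the function name simple_tokenize and the sample sentence are made up for this example, and large models generally rely on subword tokenizers rather than a plain word split like this one.

```python
import re


def simple_tokenize(text):
    """Split text into lowercase word tokens and single punctuation tokens."""
    # \w+ matches a run of letters or digits; [^\w\s] matches one punctuation mark.
    return re.findall(r"\w+|[^\w\s]", text.lower())


tokens = simple_tokenize("Language models read text, one token at a time.")
print(tokens)
print(len(tokens), "tokens")
```

Counting tokens this way over an entire collection of web pages is what produces figures such as the 270 billion quoted above, although the exact count depends on the tokenizer used.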
As always with language models, the main problem with widespread, public use is bias in the algorithms.
The data used to train machine learning algorithms contain human stereotypes embedded in the texts.
Gender, racial, physical and religious biases are widely present in these models. And it is particularly difficult to remove these problems.
For Microsoft and Nvidia, this is one of the main challenges with such a model. Both companies say that the use of MT-NLG "must ensure that proper measures are put in place to mitigate and minimize potential harm to users."
Before these revolutionary models can be used to their full potential, this issue needs to be tackled, and for the moment it seems far from resolved.
ETX Studio
Wed Oct 13 2021
Language patterns reach record highs, but questions remain. - ETX Studio