
How 'BLOOM' Wants To Democratize AI By Freeing It From Big Tech Companies

For most of its life, AI has been controlled by big technology companies.

Examples range from OpenAI's GPT-3, an autoregressive language model, to Google's LaMDA, which introduced a "breakthrough conversation technology" that lets people hold open-ended conversations on a wide range of topics.

While the general public can certainly use the technologies from Big Tech companies, the large language models that power those companies' AIs are mostly kept behind the scenes, hidden from the public.

BLOOM wants to change that.

The large language model wants to free AI from Big Tech companies' control, by open sourcing the technology.

Tech companies keep their models closed largely for financial reasons.

Large language models are extremely expensive to create, and require even more money to run and maintain.

GPT-3, for example, cost OpenAI millions of dollars to train.

Inevitably, tech companies want to protect such large investments. And in a world where many other tech companies are also developing their own AIs, everyone wants a competitive advantage over the others.

Therefore, tech companies may not want to open source their large language models, and with few exceptions have rarely disclosed detailed information about their models' inner workings.

Furthermore, this approach also makes it hard to hold them accountable.

BLOOM, in contrast, is not bound by those constraints.

The secrecy and exclusivity are what the researchers working on BLOOM hope to change.

BLOOM provides the technology entirely for free.

While BLOOM makes its large language model freely available to the public, that doesn't mean it is an underdog.

As a matter of fact, according to a blog post, BLOOM's large language model has a massive 176 billion parameters.

This makes BLOOM slightly larger than GPT-3.
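
To give a rough sense of what 176 billion parameters means in practice, here is a minimal sketch that compares the two parameter counts and estimates the memory needed just to hold the weights. The 175 billion figure for GPT-3 and the choice of 2 bytes per parameter (16-bit storage) are assumptions made for illustration and do not come from the blog post; the function names are hypothetical.

```python
# Rough scale comparison. Only BLOOM_PARAMS comes from the article.
BLOOM_PARAMS = 176_000_000_000  # from the article
GPT3_PARAMS = 175_000_000_000   # assumption: commonly cited GPT-3 size


def weight_memory_gb(num_params: int, bytes_per_param: int = 2) -> float:
    """Memory needed to store the weights alone, in gigabytes (1 GB = 10**9 bytes).

    bytes_per_param=2 assumes 16-bit storage; this is an assumption,
    and it ignores activations, optimizer state, and any other overhead.
    """
    return num_params * bytes_per_param / 1e9


def relative_size(params_a: int, params_b: int) -> float:
    """How many times larger model A is than model B, by parameter count."""
    return params_a / params_b


if __name__ == "__main__":
    print(f"BLOOM weights at 16-bit: {weight_memory_gb(BLOOM_PARAMS):.0f} GB")
    print(f"BLOOM vs GPT-3 (assumed): {relative_size(BLOOM_PARAMS, GPT3_PARAMS):.4f}x")
```

Under these assumptions the weights alone take up hundreds of gigabytes, which helps explain why so few teams have been able to train or even run models of this size.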

BLOOM was created by BigScience, a research project that launched in early 2021. The initiative is bootstrapped and led by AI startup Hugging Face.

"Large ML models have changed the world of AI research over the last two years but the huge compute cost necessary to train them resulted in very few teams actually having the ability to train and research them," said Thomas Wolf, the BigScience co-lead and Hugging Face co-founder.

The team of more than 1,000 researchers from more than 60 countries and 250 institutions developed BLOOM to promote inclusion and responsibility in large language models.

The team trained the model on the Jean Zay supercomputer in Paris, France.

"We adopted a data-first approach to make sure the training corpus was aligned with our values," said Christopher Akiki, a BigScience researcher based at Leipzig University.

"The multidisciplinary and international makeup of BigScience enabled us to critically reflect on every step of the process from multiple vantage points: ethical, legal, environmental, linguistic, and technical. That meant we were able to mitigate ethical concerns without compromising on performance or scale."

What's more, BLOOM is also multilingual, which is very unlike Google’s LaMDA and OpenAI’s GPT-3.

The model can generate text in 46 natural languages and dialects and 13 programming languages. For many of them, it’s the first-ever language model with over 100 billion parameters.

At a time when Big Tech companies and Western internet companies dominate with English-centric products, BLOOM's approach is highly unusual.

"GPT-3 is monolingual and BLOOM was designed from the start to be multilingual so it was trained on several languages, and also to incorporate a significant amount of programming language data," said Teven Le Scao, a research engineer at Hugging Face.

"Although it was never trained on any of those specific tasks, Bloom can be asked to produce summaries or translations of text, output code from instructions, and follow prompts to perform original tasks such as writing recipes, extracting information from a news article, or composing sentences using a newly-defined invented word … Bloom’s performance will continue to improve as the workshop continues to experiment and advance on top of BLOOM," said the researchers ahead of the release.

In a world where large language models power products for tasks such as translating languages, BLOOM could help democratize the technology and have a deep impact on society.

BLOOM gives even more researchers the chance to explore both the benefits and the risks of AI.

"BLOOM is a demonstration that the most powerful AI models can be trained and released by the broader research community with accountability and in an actual open way, in contrast to the typical secrecy of industrial AI research labs," said Teven Le Scao, co-lead of BLOOM’s training, in a statement.

Using its distribution model, anyone can publicly view the meeting notes, discussions, and code behind the model.

Anyone can use BLOOM as long as they agree to the system’s Responsible AI License.

However, training with it is a different matter, and the compute cost is not covered.

According to BigScience, though, working with the BLOOM model is relatively affordable, and should cost researchers less than $40 per hour on a cloud provider.
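
To put the hourly figure in perspective, here is a minimal sketch that turns it into a budget estimate. It treats $40 per hour as an upper bound taken from the statement above; the number of hours is a made-up input for illustration, and the function name is hypothetical rather than part of any BigScience tooling.

```python
# Back-of-the-envelope budget estimate based on the quoted hourly ceiling.
MAX_HOURLY_RATE_USD = 40.0  # upper bound quoted in the article


def estimate_cost_usd(hours: float, hourly_rate: float = MAX_HOURLY_RATE_USD) -> float:
    """Return the estimated spend for a given number of cloud hours.

    Because the article gives the rate as a ceiling, the result is an
    upper bound on cost, not a precise quote.
    """
    if hours < 0:
        raise ValueError("hours must be non-negative")
    return hours * hourly_rate


if __name__ == "__main__":
    # Hypothetical workload: an experiment that keeps the model running for 8 hours.
    print(f"Up to ${estimate_cost_usd(8):.2f} for 8 hours")
```

Actual costs would depend on the cloud provider, the hardware chosen, and how long the model is kept running.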

While BLOOM may not be able to compete with those that were built by OpenAI or Google or others, BLOOM can at least provide a way for the public to scrutinize them.

Open sourcing the technology means that everyone has equal access to the technology.

Published: 14/07/2022


This post first appeared on Eyerys | Eyes For Solution.
